[jira] [Created] (DRILL-7037) Apache Drill Crashes when a 50mb json string is queried via the REST API provided
Ayush Sharma created DRILL-7037:
-----------------------------------

Summary: Apache Drill Crashes when a 50mb json string is queried via the REST API provided
Key: DRILL-7037
URL: https://issues.apache.org/jira/browse/DRILL-7037
Project: Apache Drill
Issue Type: Bug
Components: Client - HTTP
Affects Versions: 1.14.0
Environment: Windows 10, 24GB RAM, 8 cores. Used REST API calls to query Drill.
Reporter: Ayush Sharma
Attachments: scheduler.txt

Apache Drill crashes with an OutOfMemoryException (24GB RAM) when a REST API call supplies a 50MB JSON string in the query parameter. The REST API also crashes for a 10MB query (16GB RAM) but works with a 5MB query.

This is a blocker for us and needs immediate remediation. We are also not aware of any sys.options that might bring the heap size down drastically or that are currently making it go up.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
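To make the reported sizes easier to reproduce, here is a minimal Python sketch that builds a request body for Drill's REST query endpoint and measures it. The report does not show the actual query, so the convert_from statement, the helper names and the localhost URL are assumptions made for illustration only.

```python
import json
import urllib.request

# Assumption: a local Drillbit with the web server on its default port.
DRILL_QUERY_URL = "http://localhost:8047/query.json"


def make_inline_json_query(target_bytes: int) -> str:
    """Build a hypothetical SQL statement embedding about target_bytes of JSON."""
    document = json.dumps({"payload": "x" * target_bytes})
    escaped = document.replace("'", "''")  # escape single quotes for the SQL literal
    return f"SELECT convert_from('{escaped}', 'JSON') AS doc FROM (VALUES(1))"


def build_query_request(sql: str) -> bytes:
    """Serialize a SQL statement into the JSON body expected by /query.json."""
    return json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")


def body_size_mb(body: bytes) -> float:
    """Size of the request body in megabytes (1 MB = 1024 * 1024 bytes)."""
    return len(body) / (1024 * 1024)


def submit(body: bytes, url: str = DRILL_QUERY_URL) -> bytes:
    """POST the body to Drill; only call this against a running Drillbit."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling build_query_request(make_inline_json_query(n)) for n around 5MB, 10MB and 50MB gives bodies in the size range described in the report; submit() can then be used to check at which size a given heap configuration fails.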
[jira] [Created] (DRILL-7036) Improve UI for alert and error messages
Kunal Khatua created DRILL-7036:
-----------------------------------

Summary: Improve UI for alert and error messages
Key: DRILL-7036
URL: https://issues.apache.org/jira/browse/DRILL-7036
Project: Apache Drill
Issue Type: Improvement
Components: Web Server
Affects Versions: 1.15.0
Reporter: Kunal Khatua
Assignee: Kunal Khatua
Fix For: 1.16.0

Currently, the WebUI has a rather inconsistent user experience when it comes to dealing with errors and exceptions. This Jira proposes standardizing that to a cleaner interface by leveraging Bootstrap's modals and panels to publish the messages in a presentable format.
[jira] [Updated] (DRILL-6992) Support column histogram statistics
[ https://issues.apache.org/jira/browse/DRILL-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6992:
---------------------------------
    Fix Version/s: (was: 1.16.0)

> Support column histogram statistics
> -----------------------------------
>
> Key: DRILL-6992
> URL: https://issues.apache.org/jira/browse/DRILL-6992
> Project: Apache Drill
> Issue Type: New Feature
> Components: Query Planning Optimization
> Affects Versions: 1.15.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Priority: Major
>
> As a follow-up to [DRILL-1328|https://issues.apache.org/jira/browse/DRILL-1328] which is adding NDV (num distinct values) support and creating the framework for statistics, we also need Histograms. These are needed for range predicates selectivity estimation as well as equality predicates when there is non-uniform distribution of data.
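The point about non-uniform data can be illustrated with a small Python sketch. It assumes an equi-depth histogram, which the issue does not specify, and all function names are hypothetical; this is not Drill's implementation.

```python
from bisect import bisect_right
from typing import List, Sequence


def build_equi_depth_bounds(values: Sequence[float], num_buckets: int) -> List[float]:
    """Return num_buckets + 1 boundaries so each bucket holds about the same number of rows."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[(i * last) // num_buckets] for i in range(num_buckets + 1)]


def estimate_selectivity_le(bounds: Sequence[float], x: float) -> float:
    """Estimate the fraction of rows with value <= x from the bucket boundaries."""
    num_buckets = len(bounds) - 1
    if x < bounds[0]:
        return 0.0
    if x >= bounds[-1]:
        return 1.0
    # Index of the last boundary that is <= x; this is the bucket containing x.
    i = bisect_right(bounds, x) - 1
    lo, hi = bounds[i], bounds[i + 1]
    # Interpolate linearly inside the bucket (assumes uniformity within one bucket only).
    within = (x - lo) / (hi - lo) if hi > lo else 1.0
    return (i + within) / num_buckets


def uniform_selectivity_le(values: Sequence[float], x: float) -> float:
    """Baseline estimate using only min and max, assuming a uniform distribution."""
    lo, hi = min(values), max(values)
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))
```

For a skewed column such as 90 rows with value 1 followed by the values 2 through 11, the histogram estimate for "col <= 1" is close to the true 90%, while the min / max baseline estimates 0%.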
[jira] [Updated] (DRILL-6992) Support column histogram statistics
[ https://issues.apache.org/jira/browse/DRILL-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6992:
---------------------------------
    Fix Version/s: 1.16.0

> Support column histogram statistics
>
> Key: DRILL-6992
> URL: https://issues.apache.org/jira/browse/DRILL-6992
> Fix For: 1.16.0
[jira] [Updated] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-7035:
---------------------------------
    Fix Version/s: 1.16.0

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
> --------------------------------------------------------------------------------------------------
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - C++
> Affects Versions: 1.12.0
> Reporter: Rob Wu
> Assignee: Debraj Ray
> Priority: Major
> Fix For: 1.16.0
>
> [~debraj92] found that under some circumstances the SaslAuthenticatorImpl's sasl_dispose() function crashes at destruction. The incident seems to be random and occurs only when certain authentication and encryption combinations are used during connection.
>
> After digging a little deeper, I found that when a BOOST communication error occurs, shutdownSocket (which eventually triggers sasl_dispose()) can be called from several threads, resulting in a race condition when freeing the handle. This can be reproduced with the querysubmitter and is reproducible in 1.12.0 and later.
>
> [~debraj92] will be adding a patch to resolve this incident.
>
> {code:java}
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Handle Read from buffer 04E1D850
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Cancel deadline timer.
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: ERR_QRY_COMMERR. Boost Communication Error: End of file
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: +++ ENTER +++
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: +++ ENTER +++
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: --- EXIT ---
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Socket shutdown
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: --- EXIT ---
> {code}
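The race described above, where two threads both enter the dispose path for the same handle, can be sketched in Python as follows. This is a language-neutral illustration of one common remedy (checking and clearing the handle under a single lock), not the patch for this issue; the class and method names are hypothetical and do not come from the Drill C++ client.

```python
import threading


class SaslSession:
    """Hypothetical stand-in for an object that owns a native handle."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handle = object()   # placeholder for the native SASL handle
        self.dispose_calls = 0    # counts how often the handle was actually released

    def dispose(self) -> None:
        # Check and clear the handle under one lock so that two threads
        # cannot both observe a live handle and release it twice.
        with self._lock:
            if self._handle is None:
                return
            self._handle = None
            self.dispose_calls += 1


def shutdown_from_many_threads(session: SaslSession, thread_count: int = 8) -> int:
    """Call dispose() concurrently and return how many releases really happened."""
    threads = [threading.Thread(target=session.dispose) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return session.dispose_calls
```

However many threads reach dispose(), only the first one releases the handle; the others return without touching it.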
[jira] [Assigned] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker reassigned DRILL-7035:
------------------------------------
    Assignee: Debraj Ray

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
[jira] [Assigned] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Wu reassigned DRILL-7035:
-----------------------------
    Assignee: (was: Rob Wu)

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
[jira] [Assigned] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Wu reassigned DRILL-7035:
-----------------------------
    Assignee: Rob Wu

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
[jira] [Updated] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Wu updated DRILL-7035:
--------------------------
    Description:
[~debraj92] found that when under some circumstance the SaslAuthenticatorImpl's sasl_dispose() function will crash out at destruction. The incident seems to be random and only when certain authentication and encryption combinations are used during connection.

After digging a little deeper, I found that when BOOST communication error occurs, the shutdownSocket (eventually triggering sasl_dispose()) could be called from various threads resulting in a race condition of freeing the handle. This can be reproduced with the querysubmitter. This is reproducible since 1.12.0+.

[~debraj92] will be adding a patch to resolve this incident.

{code:java}
2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Handle Read from buffer 04E1D850
2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Cancel deadline timer.
2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: ERR_QRY_COMMERR. Boost Communication Error: End of file
2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: +++ ENTER +++
2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: +++ ENTER +++
2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: --- EXIT ---
2019-Feb-11 10:44:31 : TRACE : 2d74 : Socket shutdown
2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: --- EXIT ---
{code}

  was:
[~debraj92] found that when under some circumstance the SaslAuthenticatorImpl's sasl_dispose() function will crash out at destruction. The incident seems to be random and only when certain authentication and encryption combinations are used during connection.

After digging a little deeper, I found that when BOOST communication error occurs, the shutdownSocket (eventually triggering sasl_dispose()) could be called from various threads resulting in a race condition of freeing the handle. This can be reproduced with the querysubmitter. This is reproducible since 1.12.0+.

[~debraj92] will be adding a patch to resolve this incident.

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
[jira] [Updated] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
[ https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Wu updated DRILL-7035:
--------------------------
    Affects Version/s: 1.12.0
    Description:
[~debraj92] found that when under some circumstance the SaslAuthenticatorImpl's sasl_dispose() function will crash out at destruction. The incident seems to be random and only when certain authentication and encryption combinations are used during connection.

After digging a little deeper, I found that when BOOST communication error occurs, the shutdownSocket (eventually triggering sasl_dispose()) could be called from various threads resulting in a race condition of freeing the handle. This can be reproduced with the querysubmitter. This is reproducible since 1.12.0+.

[~debraj92] will be adding a patch to resolve this incident.

  was:
[~debraj92] found that when under some circumstance the SaslAuthenticatorImpl's sasl_dispose() function will crash out at destruction. The incident seems to be random and only when certain authentication and encryption combinations are used during connection.

After digging a little deeper, I found that when BOOST communication error occurs, the shutdownSocket (eventually triggering sasl_dispose()) could be called from various threads resulting in a race condition of freeing the handle. This can be reproduced with the querysubmitter.

[~debraj92] will be adding a patch to resolve this incident.

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
[jira] [Created] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
Rob Wu created DRILL-7035:
-----------------------------

Summary: Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error
Key: DRILL-7035
URL: https://issues.apache.org/jira/browse/DRILL-7035
Project: Apache Drill
Issue Type: Bug
Components: Client - C++
Reporter: Rob Wu

[~debraj92] found that when under some circumstance the SaslAuthenticatorImpl's sasl_dispose() function will crash out at destruction. The incident seems to be random and only when certain authentication and encryption combinations are used during connection.

After digging a little deeper, I found that when BOOST communication error occurs, the shutdownSocket (eventually triggering sasl_dispose()) could be called from various threads resulting in a race condition of freeing the handle. This can be reproduced with the querysubmitter.

[~debraj92] will be adding a patch to resolve this incident.
[jira] [Closed] (DRILL-6744) Support filter push down for varchar / decimal data types
[ https://issues.apache.org/jira/browse/DRILL-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anton Gozhiy closed DRILL-6744.
-------------------------------

Verified with Drill version 1.16.0-SNAPSHOT (commit 3bec197bce73ed7aa2ae3fabf457c408aa7aff87)

> Support filter push down for varchar / decimal data types
> ---------------------------------------------------------
>
> Key: DRILL-6744
> URL: https://issues.apache.org/jira/browse/DRILL-6744
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.14.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Labels: doc-complete, ready-to-commit
> Fix For: 1.15.0
>
> Since Drill now uses Apache Parquet 1.10.0, in which the issue with incorrectly stored varchar / decimal min / max statistics is resolved, we should add support for varchar / decimal filter push down. Only files created with parquet lib 1.9.1 (1.10.0) and later will be subject to push down. If the user knows that previously created files have correct min / max statistics (i.e. the user knows for certain that the data in binary columns is ASCII, not UTF-8), then parquet.strings.signed-min-max.enabled can be set to true to enable filter push down.
>
> *Description*
>
> _Note: Drill is using Parquet 1.10.0 library since 1.13.0 version._
>
> *Varchar Partition Pruning*
>
> Varchar pruning will work for files generated both before and after Parquet 1.10.0, since partition pruning requires the min and max values to be the same, and there are no issues with incorrectly stored statistics for binary data when the min and max values are equal. Partition pruning using Drill metadata files will also work, no matter when the metadata file was created (before or after Drill 1.15.0).
>
> Partition pruning won't work for files where the partition is null due to PARQUET-1341; the issue will be fixed in Parquet 1.11.0.
>
> *Varchar Filter Push Down*
>
> Varchar filter push down will work for parquet files created with Parquet 1.10.0 and later.
>
> There are two ways to enable push down for files generated with earlier Parquet versions, when the user knows for certain that the binary data is ASCII (not UTF-8):
>
> 1. Set the configuration {{enableStringsSignedMinMax}} to true (false by default) for the parquet format plugin:
> {noformat}
> "parquet" : {
>   type: "parquet",
>   enableStringsSignedMinMax: true
> }
> {noformat}
> This applies to all parquet files of a given file plugin, including all workspaces.
>
> 2. To enable / disable reading binary statistics for old parquet files per session, use the session option {{store.parquet.reader.strings_signed_min_max}}. By default, it has an empty string value. Setting this option takes priority over the config in the parquet format plugin. The option allows three values: 'true', 'false', '' (empty string).
>
> _Note: store.parquet.reader.strings_signed_min_max can also be set at system level, in which case it applies to all parquet files in the system._
>
> The same config / session option also allows reading binary statistics from Drill metadata files generated prior to Drill 1.15.0. If a Drill metadata file was created prior to Drill 1.15.0 but for parquet files created with Parquet library 1.10.0 and later, the user has to enable the config / session option or regenerate the Drill metadata file with Drill 1.15.0 or later, because the metadata file does not tell us whether the statistics are stored correctly (earlier Drill versions read and wrote binary statistics by default but did not use them).
>
> When creating a Drill metadata file with Drill 1.15.0 and later for old parquet files, the user should mind the config / session option. If strings_signed_min_max is enabled, Drill will store binary statistics in the Drill metadata file, and since the metadata file was created with Drill 1.15.0 or later, Drill will read them back regardless of the option (assuming that if statistics are present in the Drill metadata file, they are correct).
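The precedence between the session option and the format plugin config can be sketched in Python as below. This is only an illustration of the rule as described in the issue; the function name is hypothetical, and treating the empty string as "fall back to the plugin config" is an assumption drawn from the statement that a set option takes priority.

```python
def resolve_strings_signed_min_max(session_value: str, plugin_enabled: bool) -> bool:
    """Decide whether binary min / max statistics of old parquet files are trusted.

    session_value models store.parquet.reader.strings_signed_min_max
    ('true', 'false' or ''); plugin_enabled models enableStringsSignedMinMax.
    """
    if session_value == "":
        # Assumption: an unset (empty) option defers to the format plugin config.
        return plugin_enabled
    if session_value in ("true", "false"):
        # An explicitly set option takes priority over the plugin config.
        return session_value == "true"
    raise ValueError("allowed values are 'true', 'false' and '' (empty string)")
```

Under this reading, a session value of 'false' switches the behaviour off even when the plugin config enables it, and an empty value leaves the plugin config in charge.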
> If the user mistakenly enabled strings_signed_min_max, they need to disable it and regenerate the Drill metadata file. The same holds in the opposite direction: if the user created the metadata file while strings_signed_min_max was disabled, no min / max values for binary statistics will be written and thus read back, even if strings_signed_min_max is enabled when the metadata is read.
>
> *Decimal Partition Pruning*
>
> Decimal values can be represented in four logical types: int_32, int_64, fixed_len_byte_array and binary.
>
> Partition pruning will work for all logical types for old and new decimal files, i.e. files created before or after Parquet 1.10.0. Partition pruning won't work for files with a null partition due to PARQUET-1341 which will be fixed in Parquet