[jira] [Commented] (DRILL-6231) Fix memory allocation for repeated list vector
[ https://issues.apache.org/jira/browse/DRILL-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403220#comment-16403220 ] ASF GitHub Bot commented on DRILL-6231: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1171#discussion_r175244575 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java --- @@ -395,11 +395,24 @@ private void allocateMap(AbstractMapVector map, int recordCount) { } } +private void allocateRepeatedList(RepeatedListVector vector, int recordCount) { + vector.allocateOffsetsNew(recordCount); + recordCount *= getCardinality(); + ColumnSize child = children.get(vector.getField().getName()); + child.allocateVector(vector.getDataVector(), recordCount); --- End diff -- One interesting feature of this vector is that the child can be null during reading for some time. That is, in JSON, we may see that the field is `foo: [[]]`, but don't know the inner type yet. So, for safety, allocate the inner vector only if `vector.getDataVector()` is non-null. Also note that a repeated list can be of any dimension. So, the inner vector can be another repeated list of lesser dimension. The code here handles that case. But, does the sizer itself handle nested repeated lists? Do we have a unit test for a 2D and 3D list? Never had to do these before because only JSON can produce such structures and we don't seem to exercise most operators with complex JSON structures. We probably should. 
> Fix memory allocation for repeated list vector > -- > > Key: DRILL-6231 > URL: https://issues.apache.org/jira/browse/DRILL-6231 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Affects Versions: 1.13.0 >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > Fix For: 1.14.0 > > > Vector allocation in record batch sizer can be enhanced to allocate memory > for repeated list vector more accurately rather than using default functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
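The null check suggested in the review comment above can be sketched roughly as follows. This is a minimal illustration, not the code in the pull request: `Vec`, `offsetRows`, `valueRows` and the single `cardinality` argument are hypothetical stand-ins for Drill's vector classes and the sizer's per-column cardinality. It shows the two points raised in the comment: the inner vector is allocated only when it exists, and the recursion covers repeated lists of any dimension.

```java
/** Hypothetical stand-ins for Drill's vectors; none of this is the real Drill API. */
final class RepeatedListAllocSketch {

  /** A leaf vector, or a repeated list wrapping an optional child vector. */
  static final class Vec {
    final boolean isList;
    final Vec child;   // null for a leaf, or for a list whose inner type is not known yet
    int offsetRows;    // rows of offsets allocated (lists only)
    int valueRows;     // rows of values allocated (leaves only)

    Vec(boolean isList, Vec child) {
      this.isList = isList;
      this.child = child;
    }
  }

  /** Allocates a vector and, recursively, any inner vectors that already exist. */
  static void allocate(Vec vector, int recordCount, int cardinality) {
    if (!vector.isList) {
      vector.valueRows = recordCount;
      return;
    }
    vector.offsetRows = recordCount;
    if (vector.child == null) {
      return; // e.g. JSON `foo: [[]]`: inner type unknown, nothing to allocate yet
    }
    // Assumption: each list entry holds `cardinality` inner entries on average.
    allocate(vector.child, recordCount * cardinality, cardinality);
  }
}
```

With a 2D list (a list of lists of leaf values), 10 records and an assumed cardinality of 3, the outer list gets 10 offset rows, the inner list 30, and the leaf 90 value rows; a list with no child yet is allocated without error.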
[jira] [Created] (DRILL-6262) IndexOutOfBoundException in RecordBatchSize for empty variableWidthVector
Sorabh Hamirwasia created DRILL-6262: Summary: IndexOutOfBoundException in RecordBatchSize for empty variableWidthVector Key: DRILL-6262 URL: https://issues.apache.org/jira/browse/DRILL-6262 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Sorabh Hamirwasia Assignee: Sorabh Hamirwasia Fix For: 1.14.0 ColumnSize inside RecordBatchSizer throws IndexOutOfBoundsException while computing totalDataSize for a VariableWidthVector when the underlying vector is empty and has no allocated memory. This happens because totalDataSize is read from the offsetVector at index n, where n is the total number of records in the vector. When the vector is empty, n=0 and the offsetVector's DrillBuf is empty as well, so retrieving the value at index 0 from the offsetVector throws the exception.
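The failure mode described above can be sketched with a plain `int[]` standing in for the offset vector. This is only an illustration under that assumption; `totalDataSize` here is a hypothetical helper, not the method in RecordBatchSizer, and the guard shown is one possible fix rather than necessarily the one adopted.

```java
/** Illustrative model of reading the data size of a variable-width vector. */
final class OffsetSizeSketch {

  /**
   * Returns the total data size in bytes. The offsets array is expected to hold
   * valueCount + 1 entries, so the last entry is the end of the data.
   */
  static int totalDataSize(int[] offsets, int valueCount) {
    // Reading offsets[valueCount] unconditionally fails for an empty,
    // unallocated vector: valueCount is 0 and offsets has no entries at all.
    if (valueCount == 0 || offsets.length == 0) {
      return 0;
    }
    return offsets[valueCount];
  }
}
```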
[jira] [Commented] (DRILL-6008) Unable to shutdown Drillbit using short domain name
[ https://issues.apache.org/jira/browse/DRILL-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403110#comment-16403110 ] Venkata Jyothsna Donapati commented on DRILL-6008: -- [~arina] With the DRILL-6044 changes, the POST request (the shutdown request) is sent to the host taken from the URL, so this issue will not occur. Moreover, I tried shutting down using the short host name from Postman (a REST API client) and it works fine (this works only when SSL and auth are disabled). All I did was add XX.XX.XX.XXX cv1 cv1.lab to /etc/hosts on my local machine, from which I use Postman to shut down (POST to http://cv1:8047/gracefulShutdown). > Unable to shutdown Drillbit using short domain name > --- > > Key: DRILL-6008 > URL: https://issues.apache.org/jira/browse/DRILL-6008 > Project: Apache Drill > Issue Type: Bug >Reporter: Arina Ielchiieva >Assignee: Venkata Jyothsna Donapati >Priority: Major > Attachments: fqdn.JPG, method_is_not_allowed.JPG, > response_of_undefined.JPG > > > Could not shut down the drillbit on a cluster where the host name was used as the drillbit's > address (fqdn.JPG). Pressing shutdown resulted in > (response_of_undefined.JPG). I tried using the IP address, also with no luck > (method_is_not_allowed.JPG). > I could shut down the drillbit in embedded mode, but then I saw the following > errors (local_shutdown.JPG): it looks like the Web UI was trying to get the drillbit > status though it was down.
[jira] [Commented] (DRILL-6215) Use prepared statement instead of Statement in JdbcRecordReader class
[ https://issues.apache.org/jira/browse/DRILL-6215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403090#comment-16403090 ] ASF GitHub Bot commented on DRILL-6215: --- Github user kfaraaz commented on the issue: https://github.com/apache/drill/pull/1159 I don't know about the other file, I didn't add it. Let me check. Thanks, Khurram > Use prepared statement instead of Statement in JdbcRecordReader class > - > > Key: DRILL-6215 > URL: https://issues.apache.org/jira/browse/DRILL-6215 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.12.0 >Reporter: Khurram Faraaz >Priority: Major > > Use prepared statement instead of Statement in JdbcRecordReader class, which > is more efficient and less vulnerable to SQL injection attacks. > Apache Drill 1.13.0-SNAPSHOT, commit : > 9073aed67d89e8b2188870d6c812706085c9c41b > Findbugs reports the below bug and suggests that we use prepared statement > instead of Statement. > {noformat} > In class org.apache.drill.exec.store.jdbc.JdbcRecordReader > In method > org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup(OperatorContext, > OutputMutator) > At JdbcRecordReader.java:[line 170] > org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup(OperatorContext, > OutputMutator) passes a nonconstant String to an execute method on an SQL > statement > The method invokes the execute method on an SQL statement with a String that > seems to be dynamically generated. > Consider using a prepared statement instead. > It is more efficient and less vulnerable to SQL injection attacks. > {noformat} > LOC - > https://github.com/apache/drill/blob/a9ea4ec1c5645ddab4b7aef9ac060ff5f109b696/contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcRecordReader.java#L170 > {noformat} > To run with findbugs: > mvn clean install -Pfindbugs -DskipTests > Findbugs will write the output to findbugsXml.html in the target directory of > each module. 
> For example the java-exec module report is located at: > ./exec/java-exec/target/findbugs/findbugsXml.html > Use > find . -name "findbugsXml.html" > to locate the files. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-3640: - Labels: doc-impacting ready-to-commit (was: ready-to-commit) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: Improvement > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua >Priority: Major > Labels: doc-impacting, ready-to-commit > Fix For: 1.13.0 > > > It would be nice to have this implemented. Runaway queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152)
[jira] [Commented] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402565#comment-16402565 ] Gautam Kumar Parai commented on DRILL-6260: --- [~hanu.ncr] did not realize you had reassigned it to yourself. I analyzed it a bit - Drill throws the error when it `visits` the Calcite logical op tree and finds a LogicalAggregate which has SqlSingleValueAggFunction. However, we may need to change the visitor to keep going down the tree to find one which is not. The code is in PreProcessLogicalRel.java:visit(LogicalAggregate aggregate). Hope this helps! > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query > -- > > Key: DRILL-6260 > URL: https://issues.apache.org/jira/browse/DRILL-6260 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.13.0, 1.14.0 > Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 > git Commit Message: Update version to 1.14.0-SNAPSHOT >Reporter: Abhishek Girish >Assignee: Hanumath Rao Maduri >Priority: Major > > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > cast(max(T2.a) as varchar) FROM `t2.json` T2); > Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression > See Apache Drill JIRA: DRILL-1937 > {code} > Slightly different variants of the query work fine. 
> {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(cast(T2.a as varchar)) FROM `t2.json` T2); > 00-00 Screen > 00-01 Project(b=[$0]) > 00-02 Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04 Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07 Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET > "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) > 00-09 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(T2.a) FROM `t2.json` T2); > 00-00Screen > 00-01 Project(b=[$0]) > 00-02Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]) > {code} > File contents: > {code} > # cat t1.json > {"a":1, "b":"V"} > {"a":2, "b":"W"} > {"a":3, "b":"X"} > {"a":4, "b":"Y"} > {"a":5, "b":"Z"} > # cat t2.json > {"a":1, "b":"A"} > {"a":2, "b":"B"} > {"a":3, "b":"C"} > {"a":4, "b":"D"} > {"a":5, "b":"E"} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6261) logging "Waiting for X queries to complete before shutting down" even before shutdown request is triggered
Venkata Jyothsna Donapati created DRILL-6261: Summary: logging "Waiting for X queries to complete before shutting down" even before shutdown request is triggered Key: DRILL-6261 URL: https://issues.apache.org/jira/browse/DRILL-6261 Project: Apache Drill Issue Type: Bug Reporter: Venkata Jyothsna Donapati After https://issues.apache.org/jira/browse/DRILL-5922 changes "Waiting for X queries to complete before shutting down" is logged every time a query runs instead of it being logged after a shutdown request is triggered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-6245) Clicking on anything redirects to main login page
[ https://issues.apache.org/jira/browse/DRILL-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker reassigned DRILL-6245: Assignee: Venkata Jyothsna Donapati > Clicking on anything redirects to main login page > - > > Key: DRILL-6245 > URL: https://issues.apache.org/jira/browse/DRILL-6245 > Project: Apache Drill > Issue Type: Bug >Reporter: Venkata Jyothsna Donapati >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > > When the Drill Web UI is accessed using https and then by http protocol, the > Web UI is always trying to redirect to main login page if anything is clicked > on index page. However, this works fine if the cookies are cleared. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6245) Clicking on anything redirects to main login page
[ https://issues.apache.org/jira/browse/DRILL-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6245: - Fix Version/s: 1.14.0 > Clicking on anything redirects to main login page > - > > Key: DRILL-6245 > URL: https://issues.apache.org/jira/browse/DRILL-6245 > Project: Apache Drill > Issue Type: Bug >Reporter: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > > When the Drill Web UI is accessed using https and then by http protocol, the > Web UI is always trying to redirect to main login page if anything is clicked > on index page. However, this works fine if the cookies are cleared. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4897) NumberFormatException in Drill SQL while casting to BIGINT when its actually a number
[ https://issues.apache.org/jira/browse/DRILL-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402463#comment-16402463 ] Karthikeyan Manivannan commented on DRILL-4897: --- This seems to be happening because of the "WHEN 0 THEN 0" in the query. I think the "THEN 0" causes PROJECT to assume that the result column is INT instead of BIGINT, and the query throws the exception when a number larger than what INT can hold is processed. The query runs fine if it is changed to "...WHEN 0 THEN 2147483648..." but fails when it is changed to "...WHEN 0 THEN 2147483647..." 0: jdbc:drill:zk=local> select CAST(case isnumeric(columns[0]) WHEN 0 THEN 2147483647 ELSE columns[0] END AS BIGINT) from dfs.`/Users/karthik/work/bugs/DRILL-4897/pw2.csv`; Error: SYSTEM ERROR: NumberFormatException: 2147483648 Fragment 0:0 [Error Id: d29ec48e-e659-41b4-a722-9c546ef8c9c9 on 172.30.8.179:31010] (state=,code=0) 0: jdbc:drill:zk=local> select CAST(case isnumeric(columns[0]) WHEN 0 THEN 2147483648 ELSE columns[0] END AS BIGINT) from dfs.`/Users/karthik/work/bugs/DRILL-4897/pw2.csv`; +-+ | EXPR$0 | +-+ | 1 | | 2 | ... ... | 2147483648 | | 4294967296 | +-+ The planner seems to be doing the same thing in both cases: Failed Case < Project(EXPR$0=[CAST(CASE(=(ISNUMERIC(ITEM($0, 0)), 0), 2147483647, ITEM($0, 0))):BIGINT]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 2.0, cumulative cost = {4.0 rows, 10.0 cpu, 0.0 io, 0.0 network, 0.0 memory} --- Successful > Project(EXPR$0=[CAST(CASE(=(ISNUMERIC(ITEM($0, 0)), 0), 2147483648, ITEM($0, 0))):BIGINT]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 2.0, cumulative cost = {4.0 rows, 10.0 cpu, 0.0 io, 0.0 network, 0.0 memory} So, I guess the problem is in the way the expression is handled in PROJECT. I will investigate this further. 
> NumberFormatException in Drill SQL while casting to BIGINT when its actually > a number > - > > Key: DRILL-4897 > URL: https://issues.apache.org/jira/browse/DRILL-4897 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Reporter: Srihari Karanth >Assignee: Karthikeyan Manivannan >Priority: Blocker > > In the following SQL, Drill complains when trying to convert a number that is stored as > varchar >select cast (case IsNumeric(Delta_Radio_Delay) > when 0 then 0 else Delta_Radio_Delay end as BIGINT) > from datasource.`./sometable` > where Delta_Radio_Delay='4294967294'; > BIGINT should be able to hold very large numbers. I don't understand why it > throws the error below: > 0: jdbc:drill:> select cast (case IsNumeric(Delta_Radio_Delay) > when 0 then 0 else Delta_Radio_Delay end as BIGINT) > from datasource.`./sometable` > where Delta_Radio_Delay='4294967294'; > Error: SYSTEM ERROR: NumberFormatException: 4294967294 > Fragment 1:29 > [Error Id: a63bb113-271f-4d8b-8194-2c9728543200 on cluster-3:31010] > (state=,code=0) > How can I modify the SQL to fix this?
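The hypothesis in the comment above (the string is parsed as INT rather than BIGINT) is consistent with how 32-bit and 64-bit parsing behave in plain Java. The sketch below is only an analogy under that assumption; it uses the standard `Integer.parseInt` and `Long.parseLong` calls and does not reproduce Drill's cast functions. `IntVsBigintSketch` and its helpers are hypothetical names.

```java
/** Shows which of the values from the report fit in a 32-bit INT vs a 64-bit BIGINT. */
final class IntVsBigintSketch {

  /** True if the text parses as a 32-bit signed integer. */
  static boolean fitsInt(String text) {
    try {
      Integer.parseInt(text);
      return true;
    } catch (NumberFormatException e) {
      return false; // same exception type as in the reported error
    }
  }

  /** True if the text parses as a 64-bit signed integer. */
  static boolean fitsBigint(String text) {
    try {
      Long.parseLong(text);
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```

Here `fitsInt("2147483647")` is true while `fitsInt("2147483648")` and `fitsInt("4294967294")` are false, and `fitsBigint` accepts all three, which matches the boundary at which the query starts to fail.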
[jira] [Updated] (DRILL-6227) Graceful shutdown should fail if unable to kill drillbit during some timeout rather then trying indefinitely
[ https://issues.apache.org/jira/browse/DRILL-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6227: - Issue Type: Sub-task (was: Improvement) Parent: DRILL-6023 > Graceful shutdown should fail if unable to kill drillbit during some timeout > rather then trying indefinitely > > > Key: DRILL-6227 > URL: https://issues.apache.org/jira/browse/DRILL-6227 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.12.0 >Reporter: Arina Ielchiieva >Assignee: Venkata Jyothsna Donapati >Priority: Major > Fix For: 1.14.0 > > > In drillbit.sh, graceful shutdown calls drillbit stop with {{kill_drillbit}} > set to false. > {code} > kill_drillbit=false > stop_bit $kill_drillbit > {code} > It means that {{waitForProcessEnd}} will be called with the same value. > When {{waitForProcessEnd}} is called with {{kill_drillbit}} set to false, > this method keeps probing the drillbit process with {{kill -0}} (which only checks that the process still exists) until the process is gone. So if > the drillbit never exits, it may run forever. We need a timeout > after which {{waitForProcessEnd}} stops waiting and reports an > error.
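The requested behaviour (poll until the process ends, but give up after a timeout and report failure) can be sketched as below. drillbit.sh is a shell script, so this Java version only illustrates the control flow; `WaitWithTimeoutSketch`, the `isAlive` probe and the millisecond parameters are hypothetical and do not correspond to anything in the script.

```java
import java.util.function.BooleanSupplier;

/** Polls a liveness probe until the process is gone or a deadline passes. */
final class WaitWithTimeoutSketch {

  /**
   * Returns true if the process ended within the timeout, false otherwise.
   * The isAlive probe plays the role of `kill -0 $pid` in the shell script.
   */
  static boolean waitForProcessEnd(BooleanSupplier isAlive, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (isAlive.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out: the caller should report an error
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

A caller would treat a `false` result as a failed graceful shutdown instead of waiting indefinitely.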
[jira] [Updated] (DRILL-6024) Use grace period only in production servers
[ https://issues.apache.org/jira/browse/DRILL-6024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6024: - Issue Type: Sub-task (was: Improvement) Parent: DRILL-6023 > Use grace period only in production servers > --- > > Key: DRILL-6024 > URL: https://issues.apache.org/jira/browse/DRILL-6024 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.12.0 >Reporter: Arina Ielchiieva >Assignee: Venkata Jyothsna Donapati >Priority: Major > > DRILL-4286 introduces graceful shutdown. Currently it is turned > off by default (grace period is set to 0), since turning it on by default affects > non-production systems; for example, unit test run time increased 3x. > [~Paul.Rogers] proposed the following solution: > {quote} > In a production system, we do want the grace period; it is an essential part > of the graceful shutdown procedure. > However, if we are doing a non-graceful shutdown, the grace is unneeded. > Also, if the cluster contains only one node (as in most unit tests), there is > nothing to wait for, so the grace period is not needed. The same is true in > an embedded Drillbit for Sqlline. > So, can we provide a solution that handles these cases rather than simply > turning off the grace period always? > If using the local cluster coordinator, say, then no grace is needed. If > using ZK, but there is only one Drillbit, no grace is needed. (There is a > race condition, but may be OK.) > Or, if we detect we are embedded, no grace period. > Then, also, if we are doing a graceful shutdown, we need the grace. But, if > we are doing a "classic" shutdown, no grace is needed. > The result should be that the grace period is used only in production > servers, only when doing a graceful shutdown. > {quote}
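The rules in the quoted proposal can be written down as a single predicate. The sketch below is only one reading of that proposal; `GracePeriodSketch`, the `Coordinator` enum and the parameter names are hypothetical and not taken from Drill's code.

```java
/** Decides whether a shutdown should wait out the grace period. */
final class GracePeriodSketch {

  /** Hypothetical stand-in for the kind of cluster coordinator in use. */
  enum Coordinator { LOCAL, ZOOKEEPER }

  static boolean useGracePeriod(boolean gracefulShutdown, boolean embedded,
                                Coordinator coordinator, int drillbitCount) {
    if (!gracefulShutdown) {
      return false; // "classic" shutdown: no grace needed
    }
    if (embedded || coordinator == Coordinator.LOCAL) {
      return false; // embedded Drillbit or local coordinator: nothing to wait for
    }
    // With ZooKeeper, the grace period only matters when other Drillbits exist.
    return drillbitCount > 1;
  }
}
```

Under this reading the grace period applies only to a graceful shutdown of a non-embedded Drillbit in a multi-node ZooKeeper cluster; the race condition mentioned in the proposal (the Drillbit count changing during the check) is not addressed here.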
[jira] [Updated] (DRILL-6022) Improve js part for graceful shutdown
[ https://issues.apache.org/jira/browse/DRILL-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6022: - Issue Type: Sub-task (was: Improvement) Parent: DRILL-6023 > Improve js part for graceful shutdown > - > > Key: DRILL-6022 > URL: https://issues.apache.org/jira/browse/DRILL-6022 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.12.0 >Reporter: Arina Ielchiieva >Assignee: Venkata Jyothsna Donapati >Priority: Major > Fix For: 1.14.0 > > > DRILL-4286 introduces graceful shutdown but its js part needs improvement: > a. ajax calls do not handle errors, so when an error occurs it is just swallowed. > b. there are some unused and / or unnecessary variables. > c. shutdown functionality is disabled when the user is not an admin, but some > other ajax calls are still being executed, for example, port number, number > of queries, grace period. All of that can also be disabled when the user is > not an admin. > d. there are many ajax calls which can be factored out into a dedicated js file. > Other fixes: > a. all shutdown functionality resides in the DrillRoot class; it can be factored > out into a shutdown-specific class where all shutdown functionality will be > allowed only for admin on class level, currently we marked in on the level > (see DRILL-6019). > b. issue described in DRILL-6021.
[jira] [Resolved] (DRILL-2656) Add ability to specify options for clean shutdown of a Drillbit
[ https://issues.apache.org/jira/browse/DRILL-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker resolved DRILL-2656. -- Resolution: Duplicate Addressed by DRILL-4286 > Add ability to specify options for clean shutdown of a Drillbit > --- > > Key: DRILL-2656 > URL: https://issues.apache.org/jira/browse/DRILL-2656 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Flow >Affects Versions: 0.8.0 >Reporter: Chris Westin >Assignee: Venkata Jyothsna Donapati >Priority: Major > Fix For: Future > > > When we shut down a Drillbit, we should provide some options similar to those > available from Oracle's shutdown command (see > https://docs.oracle.com/cd/B28359_01/server.111/b28310/start003.htm#ADMIN11156) > . > At present, in order to avoid problems like DRILL-2654, we try to do a short > wait for executing queries, but that times out after 5 seconds, and doesn't > help with long-running queries. > Someone that is running a long query might be unhappy about losing work for > something that was near completion, so we can do better. > And, in order to avoid spurious cleanup problems and exceptions, we should > explicitly cancel any remaining queries before we do complete the shutdown. > As in the Oracle example, we might have shutdown immediate issue > cancellations to the running queries. A clean shutdown might not have a > timeout, or might allow the specification of a longer timeout, and even when > the timeout goes off, we should still cleanly cancel any remaining queries, > and wait for the cancellations to complete. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-4829) Configure the address to bind to
[ https://issues.apache.org/jira/browse/DRILL-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-4829: - Fix Version/s: 1.14.0 > Configure the address to bind to > > > Key: DRILL-4829 > URL: https://issues.apache.org/jira/browse/DRILL-4829 > Project: Apache Drill > Issue Type: Improvement >Reporter: Daniel Stockton >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > > 1.7 included the following patch to prevent Drillbits binding to the loopback > address: https://issues.apache.org/jira/browse/DRILL-4523 > "Drillbit is disallowed to bind to loopback address in distributed mode." > It would be better if this was configurable rather than rely on /etc/hosts, > since it's common for the hostname to resolve to loopback. > Would you accept a patch that adds this option to drill.override.conf? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-4829) Configure the address to bind to
[ https://issues.apache.org/jira/browse/DRILL-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-4829: - Fix Version/s: (was: 1.14.0) > Configure the address to bind to > > > Key: DRILL-4829 > URL: https://issues.apache.org/jira/browse/DRILL-4829 > Project: Apache Drill > Issue Type: Improvement >Reporter: Daniel Stockton >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > > 1.7 included the following patch to prevent Drillbits binding to the loopback > address: https://issues.apache.org/jira/browse/DRILL-4523 > "Drillbit is disallowed to bind to loopback address in distributed mode." > It would be better if this was configurable rather than rely on /etc/hosts, > since it's common for the hostname to resolve to loopback. > Would you accept a patch that adds this option to drill.override.conf? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-4829) Configure the address to bind to
[ https://issues.apache.org/jira/browse/DRILL-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker resolved DRILL-4829. -- Resolution: Duplicate Addressed by DRILL-6005 > Configure the address to bind to > > > Key: DRILL-4829 > URL: https://issues.apache.org/jira/browse/DRILL-4829 > Project: Apache Drill > Issue Type: Improvement >Reporter: Daniel Stockton >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > > 1.7 included the following patch to prevent Drillbits binding to the loopback > address: https://issues.apache.org/jira/browse/DRILL-4523 > "Drillbit is disallowed to bind to loopback address in distributed mode." > It would be better if this was configurable rather than rely on /etc/hosts, > since it's common for the hostname to resolve to loopback. > Would you accept a patch that adds this option to drill.override.conf? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6243) Alert box to confirm shutdown of drillbit after clicking shutdown button
[ https://issues.apache.org/jira/browse/DRILL-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6243: - Reviewer: Sorabh Hamirwasia > Alert box to confirm shutdown of drillbit after clicking shutdown button > - > > Key: DRILL-6243 > URL: https://issues.apache.org/jira/browse/DRILL-6243 > Project: Apache Drill > Issue Type: Improvement >Reporter: Venkata Jyothsna Donapati >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6243) Alert box to confirm shutdown of drillbit after clicking shutdown button
[ https://issues.apache.org/jira/browse/DRILL-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6243: - Fix Version/s: 1.14.0 > Alert box to confirm shutdown of drillbit after clicking shutdown button > - > > Key: DRILL-6243 > URL: https://issues.apache.org/jira/browse/DRILL-6243 > Project: Apache Drill > Issue Type: Improvement >Reporter: Venkata Jyothsna Donapati >Assignee: Venkata Jyothsna Donapati >Priority: Minor > Fix For: 1.14.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6039) drillbit.sh graceful_stop does not wait for fragments to complete before stopping the drillbit
[ https://issues.apache.org/jira/browse/DRILL-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402447#comment-16402447 ] Pritesh Maker commented on DRILL-6039: -- This should be tested after DRILL-6252 is addressed. > drillbit.sh graceful_stop does not wait for fragments to complete before > stopping the drillbit > -- > > Key: DRILL-6039 > URL: https://issues.apache.org/jira/browse/DRILL-6039 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.3.0 >Reporter: Krystal >Assignee: Venkata Jyothsna Donapati >Priority: Major > Fix For: 1.14.0 > > > git.commit.id.abbrev=eb0c403 > I have a 3-node cluster with drillbits running on each node. I kicked off a > long running query. In the middle of the query, I did a "./drillbit.sh > graceful_stop" on one of the non-foreman nodes. The node was stopped within a > few seconds and the query failed with error: > Error: SYSTEM ERROR: IOException: Filesystem closed > Fragment 4:15
[jira] [Updated] (DRILL-6008) Unable to shutdown Drillbit using short domain name
[ https://issues.apache.org/jira/browse/DRILL-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6008: - Summary: Unable to shutdown Drillbit using short domain name (was: Unable to shutdown Drillbit) > Unable to shutdown Drillbit using short domain name > --- > > Key: DRILL-6008 > URL: https://issues.apache.org/jira/browse/DRILL-6008 > Project: Apache Drill > Issue Type: Bug >Reporter: Arina Ielchiieva >Assignee: Venkata Jyothsna Donapati >Priority: Major > Attachments: fqdn.JPG, method_is_not_allowed.JPG, > response_of_undefined.JPG > > > Could not shut down the drillbit on a cluster where the host name was used as the drillbit's > address (fqdn.JPG). Pressing shutdown resulted in > (response_of_undefined.JPG). I tried using the IP address, also with no luck > (method_is_not_allowed.JPG). > I could shut down the drillbit in embedded mode, but then I saw the following > errors (local_shutdown.JPG): it looks like the Web UI was trying to get the drillbit > status though it was down.
[jira] [Assigned] (DRILL-143) Support CGROUPs resource management
[ https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua reassigned DRILL-143: -- Assignee: Kunal Khatua > Support CGROUPs resource management > --- > > Key: DRILL-143 > URL: https://issues.apache.org/jira/browse/DRILL-143 > Project: Apache Drill > Issue Type: New Feature >Reporter: Jacques Nadeau >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > For the purpose of playing nice on clusters that don't have YARN, we should > write up configuration and scripts to allow users to run Drill next to > existing workloads without sharing resources.
[jira] [Commented] (DRILL-143) Support CGROUPs resource management
[ https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402385#comment-16402385 ] Kunal Khatua commented on DRILL-143: Marking this as a feature for 1.14.0 since Drill-on-Yarn will be part of 1.13.0. However, this would be a generic feature for Drill honoring CGroups irrespective of whether the node is managed by YARN or not. > Support CGROUPs resource management > --- > > Key: DRILL-143 > URL: https://issues.apache.org/jira/browse/DRILL-143 > Project: Apache Drill > Issue Type: New Feature >Reporter: Jacques Nadeau >Priority: Major > Fix For: 1.14.0 > > > For the purpose of playing nice on clusters that don't have YARN, we should > write up configuration and scripts to allow users to run Drill next to > existing workloads without sharing resources.
[jira] [Resolved] (DRILL-3928) OutOfMemoryException should not be derived from FragmentSetupException
[ https://issues.apache.org/jira/browse/DRILL-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-3928. --- Resolution: Not A Problem OutOfMemoryException is not derived from FragmentSetupException > OutOfMemoryException should not be derived from FragmentSetupException > -- > > Key: DRILL-3928 > URL: https://issues.apache.org/jira/browse/DRILL-3928 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.2.0 >Reporter: Chris Westin >Assignee: Karthikeyan Manivannan >Priority: Major > > Discovered while working on DRILL-3927. > The client and server both use the same direct memory allocator code. But the > allocator's OutOfMemoryException is derived from FragmentSetupException > (which is derived from ForemanException). > Firstly, OOM situations don't only happen during setup. > Secondly, Fragment and Foreman classes shouldn't exist on the client side. > (This is causing unnecessary dependencies on the jdbc-all jar on server-only > code). > There's nothing special in those base classes that OutOfMemoryException > depends on. This looks like it was just a cheap way to avoid extra catch > clauses in Foreman and FragmentExecutor by catching the base classes only.
[jira] [Updated] (DRILL-143) Support CGROUPs resource management
[ https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Khatua updated DRILL-143: --- Fix Version/s: (was: Future) 1.14.0 > Support CGROUPs resource management > --- > > Key: DRILL-143 > URL: https://issues.apache.org/jira/browse/DRILL-143 > Project: Apache Drill > Issue Type: New Feature >Reporter: Jacques Nadeau >Priority: Major > Fix For: 1.14.0 > > > For the purpose of playing nice on clusters that don't have YARN, we should > write up configuration and scripts to allow users to run Drill next to > existing workloads without sharing resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402375#comment-16402375 ] ASF GitHub Bot commented on DRILL-6259: --- Github user priteshm commented on the issue: https://github.com/apache/drill/pull/1173 @parthchandra can you please review this change? > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types whose > underlying type is among the simple types supported for filter push down. For > instance, Drill currently does not support filter push down for varchars, > decimals, etc. Once Drill starts supporting them, that support will > apply to complex types automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > A query with the predicate {{where users.hobbies_ids[2] is null}} cannot be > pushed down because we are not able to determine the exact number of nulls in > array fields. > Consider {{[1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. > Statistics for the second case won't show any nulls, but when querying from > two files, in terms of data the third value in the array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
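The array caveat in the description can be made concrete with a minimal Java sketch. It is not Drill or Parquet code: the class and method names are hypothetical, the per-file statistic is modeled as a plain count of stored null values, and an index past the end of an array is assumed to read as null, as the description states.

```java
import java.util.Collections;
import java.util.List;

public class ArrayNullPushdownSketch {

  /** Hypothetical per-file statistic: number of null values physically stored in the column. */
  public static long storedNullCount(List<Integer[]> rows) {
    long nulls = 0;
    for (Integer[] row : rows) {
      for (Integer value : row) {
        if (value == null) {
          nulls++;
        }
      }
    }
    return nulls;
  }

  /** Naive pruning rule for "column IS NULL": skip the file when it stores no nulls. */
  public static boolean wouldPrune(List<Integer[]> rows) {
    return storedNullCount(rows) == 0;
  }

  /** What the query sees: reading past the end of a short array yields null. */
  public static Integer elementAt(Integer[] row, int index) {
    return index < row.length ? row[index] : null;
  }

  /** True when at least one row satisfies "array[index] IS NULL". */
  public static boolean fileHasMatch(List<Integer[]> rows, int index) {
    for (Integer[] row : rows) {
      if (elementAt(row, index) == null) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Integer[]> shortFile = Collections.singletonList(new Integer[] {1, 2});
    // The statistics say "no nulls", so the naive rule would skip this file...
    System.out.println("would prune: " + wouldPrune(shortFile));
    // ...yet the predicate on index 2 matches a row in it.
    System.out.println("has match:   " + fileHasMatch(shortFile, 2));
  }
}
```

For the file holding only [1, 2], the naive rule and the actual predicate disagree, which is why a null-count statistic alone cannot justify pruning for an indexed array predicate.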
[jira] [Updated] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6259: - Reviewer: Parth Chandra > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types whose > underlying type is among the simple types supported for filter push down. For > instance, Drill currently does not support filter push down for varchars, > decimals, etc. Once Drill starts supporting them, that support will > apply to complex types automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > A query with the predicate {{where users.hobbies_ids[2] is null}} cannot be > pushed down because we are not able to determine the exact number of nulls in > array fields. > Consider {{[1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. > Statistics for the second case won't show any nulls, but when querying from > two files, in terms of data the third value in the array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6252) Foreman node is going down when the non foreman node is stopped
[ https://issues.apache.org/jira/browse/DRILL-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Jyothsna Donapati updated DRILL-6252: - Attachment: foreman_drillbit.log > Foreman node is going down when the non foreman node is stopped > --- > > Key: DRILL-6252 > URL: https://issues.apache.org/jira/browse/DRILL-6252 > Project: Apache Drill > Issue Type: Bug >Reporter: Venkata Jyothsna Donapati >Assignee: Vlad Rozov >Priority: Major > Fix For: 1.14.0 > > Attachments: foreman_drillbit.log, nonforeman_drillbit.log > > > Two drillbits are running. I'm running a join query over parquet and tried to > stop the non-foreman node using drillbit.sh stop. The query fails with > *"Error: DATA_READ ERROR: Exception occurred while reading from disk".* The > non-foreman node goes down. The foreman node also goes down. When I looked at > the drillbit.log of both the foreman and the non-foreman, I found that there is a memory > leak: "Memory was leaked by query. Memory leaked: > (2097152)\nAllocator(op:2:0:0:HashPartitionSender) > 100/6291456/6832128/100 (res/actual/peak/limit)\n". The following are > the stack traces for the memory leaks > {noformat} > [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query. 
Memory leaked: (3145728) > Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 > (res/actual/peak/limit) > > > Fragment 2:1 > [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:297) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. > Memory leaked: (3145728) > Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 > (res/actual/peak/limit) > {noformat} > > Ping me for the logs and more information. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6252) Foreman node is going down when the non foreman node is stopped
[ https://issues.apache.org/jira/browse/DRILL-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Jyothsna Donapati updated DRILL-6252: - Attachment: nonforeman_drillbit.log
[jira] [Commented] (DRILL-6223) Drill fails on Schema changes
[ https://issues.apache.org/jira/browse/DRILL-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402255#comment-16402255 ] salim achouche commented on DRILL-6223: --- Parth, this PR is not Parquet-specific, as it deals with downstream operators having issues handling schema changes. Most of the time, the end result would be downstream operators trying to access stale data. > Drill fails on Schema changes > -- > > Key: DRILL-6223 > URL: https://issues.apache.org/jira/browse/DRILL-6223 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.10.0, 1.12.0 >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > A Drill query fails when selecting all columns from a complex nested data > file set (Parquet). The files differ in schema: > * The Parquet files exhibit differences both at the first level and within > nested data types > * A select * will not cause an exception, but using a limit clause will > * Note also that this issue seems to happen only when multiple Drillbit minor > fragments are involved (concurrency higher than one) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6223) Drill fails on Schema changes
[ https://issues.apache.org/jira/browse/DRILL-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402228#comment-16402228 ] ASF GitHub Bot commented on DRILL-6223: --- Github user parthchandra commented on the issue: https://github.com/apache/drill/pull/1170 I added a comment in the JIRA - [DRILL-6223](https://issues.apache.org/jira/browse/DRILL-6223?focusedCommentId=16402223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16402223) > Drill fails on Schema changes > -- > > Key: DRILL-6223 > URL: https://issues.apache.org/jira/browse/DRILL-6223 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.10.0, 1.12.0 >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > A Drill query fails when selecting all columns from a complex nested data > file set (Parquet). The files differ in schema: > * The Parquet files exhibit differences both at the first level and within > nested data types > * A select * will not cause an exception, but using a limit clause will > * Note also that this issue seems to happen only when multiple Drillbit minor > fragments are involved (concurrency higher than one) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6223) Drill fails on Schema changes
[ https://issues.apache.org/jira/browse/DRILL-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402223#comment-16402223 ] Parth Chandra commented on DRILL-6223: -- Schema change for Parquet files is not supported by the Parquet metadata cache. The Parquet metadata cache overwrites the schema if it changes (it does not merge), so the last one encountered is the schema selected. New columns added are OK, I think, but type changes are not. See [1]. I haven't looked at the PR, but you might want to test this out with the metadata cache enabled. [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java#L420 > Drill fails on Schema changes > -- > > Key: DRILL-6223 > URL: https://issues.apache.org/jira/browse/DRILL-6223 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.10.0, 1.12.0 >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > A Drill query fails when selecting all columns from a complex nested data > file set (Parquet). The files differ in schema: > * The Parquet files exhibit differences both at the first level and within > nested data types > * A select * will not cause an exception, but using a limit clause will > * Note also that this issue seems to happen only when multiple Drillbit minor > fragments are involved (concurrency higher than one) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
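The "overwrite, not merge" behaviour described in the comment above can be modeled with a short Java sketch. This is a deliberately simplified model, not the code in Metadata.java: a schema is assumed to be a plain map from column name to type name, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaOverwriteSketch {

  /** Folds per-file schemas into one; a later file's type replaces an earlier one. */
  public static Map<String, String> lastWins(List<Map<String, String>> fileSchemas) {
    Map<String, String> cached = new LinkedHashMap<>();
    for (Map<String, String> schema : fileSchemas) {
      cached.putAll(schema); // new columns are added, existing types are overwritten
    }
    return cached;
  }

  /** Returns true when some column appears with two different types across the files. */
  public static boolean hasTypeConflict(List<Map<String, String>> fileSchemas) {
    Map<String, String> seen = new LinkedHashMap<>();
    for (Map<String, String> schema : fileSchemas) {
      for (Map.Entry<String, String> column : schema.entrySet()) {
        String previous = seen.putIfAbsent(column.getKey(), column.getValue());
        if (previous != null && !previous.equals(column.getValue())) {
          return true;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> first = new LinkedHashMap<>();
    first.put("id", "INT32");
    Map<String, String> second = new LinkedHashMap<>();
    second.put("id", "INT64");    // type change: silently replaces INT32
    second.put("name", "BINARY"); // new column: harmless addition
    List<Map<String, String>> files = Arrays.asList(first, second);
    System.out.println("cached schema: " + lastWins(files));
    System.out.println("type conflict: " + hasTypeConflict(files));
  }
}
```

In this model a newly added column survives the fold unchanged, while a type change leaves the cache describing only the last file, which matches the distinction the comment draws between the two cases.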
[jira] [Assigned] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanumath Rao Maduri reassigned DRILL-6260: -- Assignee: Hanumath Rao Maduri > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query > -- > > Key: DRILL-6260 > URL: https://issues.apache.org/jira/browse/DRILL-6260 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.13.0, 1.14.0 > Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 > git Commit Message: Update version to 1.14.0-SNAPSHOT >Reporter: Abhishek Girish >Assignee: Hanumath Rao Maduri >Priority: Major > > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > cast(max(T2.a) as varchar) FROM `t2.json` T2); > Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression > See Apache Drill JIRA: DRILL-1937 > {code} > Slightly different variants of the query work fine. 
> {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(cast(T2.a as varchar)) FROM `t2.json` T2); > 00-00 Screen > 00-01 Project(b=[$0]) > 00-02 Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04 Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07 Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET > "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) > 00-09 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(T2.a) FROM `t2.json` T2); > 00-00Screen > 00-01 Project(b=[$0]) > 00-02Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]) > {code} > File contents: > {code} > # cat t1.json > {"a":1, "b":"V"} > {"a":2, "b":"W"} > {"a":3, "b":"X"} > {"a":4, "b":"Y"} > {"a":5, "b":"Z"} > # cat t2.json > {"a":1, "b":"A"} > {"a":2, "b":"B"} > {"a":3, "b":"C"} > {"a":4, "b":"D"} > {"a":5, "b":"E"} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-6260: --- Description: {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > cast(max(T2.a) as varchar) FROM `t2.json` T2); Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 {code} Slightly different variants of the query work fine. {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(cast(T2.a as varchar)) FROM `t2.json` T2); 00-00 Screen 00-01 Project(b=[$0]) 00-02 Project(b=[$1]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07 Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) 00-09 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(T2.a) FROM `t2.json` T2); 00-00Screen 00-01 Project(b=[$0]) 00-02Project(b=[$1]) 00-03 SelectionVectorRemover 00-04Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]) {code} File contents: {code} # cat t1.json {"a":1, "b":"V"} {"a":2, "b":"W"} {"a":3, 
"b":"X"} {"a":4, "b":"Y"} {"a":5, "b":"Z"} # cat t2.json {"a":1, "b":"A"} {"a":2, "b":"B"} {"a":3, "b":"C"} {"a":4, "b":"D"} {"a":5, "b":"E"} {code} was: {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > cast(max(T2.a) as varchar) FROM `t2.json` T2); Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 {code} Slightly different variants of the query work fine. {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(cast(T2.a as varchar)) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00 Screen 00-01 Project(b=[$0]) 00-02 Project(b=[$1]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07 Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) 00-09 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(T2.a) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00Screen 00-01 Project(b=[$0]) 00-02Project(b=[$1]) 00-03 SelectionVectorRemover 00-04Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]) {code} File contents: {code} # cat t1.json 
{"a":1, "b":"V"} {"a":2, "b":"W"} {"a":3, "b":"X"} {"a":4, "b":"Y"} {"a":5, "b":"Z"} # cat t2.json {"a":1, "b":"A"} {"a":2, "b":"B"} {"a":3, "b":"C"} {"a":4, "b":"D"} {"a":5, "b":"E"} {code} > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query >
[jira] [Updated] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-6260: --- Description: {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > cast(max(T2.a) as varchar) FROM `t2.json` T2); Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 {code} Slightly different variants of the query work fine. {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(cast(T2.a as varchar)) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00 Screen 00-01 Project(b=[$0]) 00-02 Project(b=[$1]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07 Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) 00-09 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(T2.a) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00Screen 00-01 Project(b=[$0]) 00-02Project(b=[$1]) 00-03 SelectionVectorRemover 00-04Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]) {code} File contents: 
{code} # cat t1.json {"a":1, "b":"V"} {"a":2, "b":"W"} {"a":3, "b":"X"} {"a":4, "b":"Y"} {"a":5, "b":"Z"} # cat t2.json {"a":1, "b":"A"} {"a":2, "b":"B"} {"a":3, "b":"C"} {"a":4, "b":"D"} {"a":5, "b":"E"} {code} was: {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > cast(max(T2.a) as varchar) FROM `t2.json` T2); Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 {code} Slightly different variants of the query work fine. {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(cast(T2.a as varchar)) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00 Screen 00-01 Project(b=[$0]) 00-02 Project(b=[$1]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07 Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) 00-09 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(T2.a) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00Screen 00-01 Project(b=[$0]) 00-02Project(b=[$1]) 00-03 SelectionVectorRemover 00-04Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], 
files=[maprfs:///tmp/t2.json]]]) {code} File contents: {code} # cat t1.json {"a":1, "b":"V"} {"a":2, "b":"W"} {"a":3, "b":"X"} {"a":4, "b":"Y"} {"a":5, "b":"Z"} # # cat t2.json {"a":1, "b":"A"} {"a":2, "b":"B"} {"a":3, "b":"C"} {"a":4, "b":"D"} {"a":5, "b":"E"} {code} > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query >
[jira] [Comment Edited] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402147#comment-16402147 ] Arina Ielchiieva edited comment on DRILL-6259 at 3/16/18 4:32 PM: -- Support for complex types but not for scalar complex types. Example of supported types in parquet schema: {noformat} message complex_users { required group user { required int32 id; optional int32 age; repeated int32 hobby_ids; optional boolean active; } } {noformat} This is a simple one; it can be nested as well. was (Author: arina): Support for complex types but not for scalar complex type. Example of supported types in parquet schema: {noformat} message complex_users { required group user { required int32 id; optional int32 age; repeated int32 hobby_ids; optional boolean active; } } {noformat} This is simple one, it can be nested as well. > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types whose > underlying type is among the simple types supported for filter push down. For > instance, Drill currently does not support filter push down for varchars, > decimals, etc. Once Drill starts supporting them, that support will > apply to complex types automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > A query with the predicate {{where users.hobbies_ids[2] is null}} cannot be > pushed down because we are not able to determine the exact number of nulls in > array fields. > Consider {{[1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. 
> Statistics for the second case won't show any nulls, but when querying from > two files, in terms of data the third value in the array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402147#comment-16402147 ] Arina Ielchiieva commented on DRILL-6259: - Support for complex types but not for scalar complex type. Example of supported types in parquet schema: {noformat} message complex_users { required group user { required int32 id; optional int32 age; repeated int32 hobby_ids; optional boolean active; } } {noformat} This is a simple one; it can be nested as well. > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types whose > underlying type is among the simple types supported for filter push down. For > instance, Drill currently does not support filter push down for varchars, > decimals, etc. Once Drill starts supporting them, that support will > apply to complex types automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > A query with the predicate {{where users.hobbies_ids[2] is null}} cannot be > pushed down because we are not able to determine the exact number of nulls in > array fields. > Consider {{[1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. > Statistics for the second case won't show any nulls, but when querying from > two files, in terms of data the third value in the array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-6260: --- Affects Version/s: 1.14.0 Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 git Commit message: Update version to 1.14.0-SNAPSHOT > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query > -- > > Key: DRILL-6260 > URL: https://issues.apache.org/jira/browse/DRILL-6260 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.13.0, 1.14.0 > Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 > git Commit message: Update version to 1.14.0-SNAPSHOT >Reporter: Abhishek Girish >Priority: Major > > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > cast(max(T2.a) as varchar) FROM `t2.json` T2); > Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression > See Apache Drill JIRA: DRILL-1937 > {code} > Slightly different variants of the query work fine. 
> {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(cast(T2.a as varchar)) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00 Screen > 00-01 Project(b=[$0]) > 00-02 Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04 Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07 Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET > "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) > 00-09 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(T2.a) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(b=[$0]) > 00-02Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]) > {code} > File contents: > {code} > # cat t1.json > {"a":1, "b":"V"} > {"a":2, "b":"W"} > {"a":3, "b":"X"} > {"a":4, "b":"Y"} > {"a":5, "b":"Z"} > # # cat t2.json > {"a":1, "b":"A"} > {"a":2, "b":"B"} > {"a":3, "b":"C"} > {"a":4, "b":"D"} > {"a":5, "b":"E"} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-6260: --- Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 git Commit Message: Update version to 1.14.0-SNAPSHOT was: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 git Commit message: Update version to 1.14.0-SNAPSHOT > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query > -- > > Key: DRILL-6260 > URL: https://issues.apache.org/jira/browse/DRILL-6260 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.13.0, 1.14.0 > Environment: git Commit ID: dd4a46a6c57425284a2b8c68676357f947e01988 > git Commit Message: Update version to 1.14.0-SNAPSHOT >Reporter: Abhishek Girish >Priority: Major > > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > cast(max(T2.a) as varchar) FROM `t2.json` T2); > Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression > See Apache Drill JIRA: DRILL-1937 > {code} > Slightly different variants of the query work fine. 
> {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(cast(T2.a as varchar)) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00 Screen > 00-01 Project(b=[$0]) > 00-02 Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04 Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07 Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET > "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) > 00-09 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(T2.a) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(b=[$0]) > 00-02Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]) > {code} > File contents: > {code} > # cat t1.json > {"a":1, "b":"V"} > {"a":2, "b":"W"} > {"a":3, "b":"X"} > {"a":4, "b":"Y"} > {"a":5, "b":"Z"} > # # cat t2.json > {"a":1, "b":"A"} > {"a":2, "b":"B"} > {"a":3, "b":"C"} > {"a":4, "b":"D"} > {"a":5, "b":"E"} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402120#comment-16402120 ]

Paul Rogers commented on DRILL-6259:

What is meant when we say "complex type"? Drill has multiple "complex" types:

* Arrays
* Nested tuples (AKA "maps")
* Arrays of nested tuples
* Multi-dimensional arrays (AKA "repeated lists")
* Heterogeneous values (AKA "unions")
* Heterogeneous lists (AKA "non-repeated lists")
* Combinations of the above (a repeated map that contains a union that contains a 2D list of maps)

Then there are the "complex" scalar types (complex because they are not simply bit values like an int or a float):

* Decimal
* Date/time
* Date
* Time
* Period

The write-up mentions arrays. Is this only for arrays? Also for maps? For map arrays? Please identify which complex types are now supported.

> Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types which > underneath type is among supported simple types for filter push down. For > instance, currently Drill does not support filter push down for varchars, > decimals etc. Though once Drill will start support, this support will be > applied for complex type automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to > push down because we are not able to determine exact number of nulls in > arrays fields. > {{Consider [1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. 
> Statistics for the second case won't show any nulls but when querying from > two files, in terms of data the third value in array is null.
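The array caveat in the description can be modelled in a few lines. The sketch below is illustrative only: the class and method names are hypothetical and are not taken from Drill's code base, and "statistics" is reduced to a count of nulls physically stored in a file. It shows that a file holding only `[1, 2]` reports zero stored nulls, yet the predicate `arr[2] is null` still matches its row, so the null count alone cannot be used to skip the file.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Illustrative model only; none of these names come from Drill's code base. */
final class ArrayNullPushDownSketch {

  /** Models `arr[index]`: an index outside the array evaluates to null. */
  static Integer elementAt(List<Integer> array, int index) {
    return (index >= 0 && index < array.size()) ? array.get(index) : null;
  }

  /** Models a per-file null count: only nulls physically stored in the arrays. */
  static long storedNullCount(List<List<Integer>> rows) {
    return rows.stream().flatMap(List::stream).filter(Objects::isNull).count();
  }

  /** Number of rows for which `arr[index] IS NULL` is actually true. */
  static long rowsWhereElementIsNull(List<List<Integer>> rows, int index) {
    return rows.stream().filter(row -> elementAt(row, index) == null).count();
  }

  public static void main(String[] args) {
    // One row per file, matching the [1, 2, 3] vs [1, 2] example in the issue.
    List<List<Integer>> fileA = Collections.singletonList(Arrays.asList(1, 2, 3));
    List<List<Integer>> fileB = Collections.singletonList(Arrays.asList(1, 2));

    // Neither file stores a null, so the null count suggests "nothing to match".
    System.out.println(storedNullCount(fileA) + " " + storedNullCount(fileB)); // 0 0

    // Yet `arr[2] IS NULL` is true for the row in fileB, whose array is too short.
    System.out.println(rowsWhereElementIsNull(fileA, 2) + " "
        + rowsWhereElementIsNull(fileB, 2)); // 0 1
  }
}
```

Under this model, a rule of the form "skip the file when its null count is zero" would wrongly drop the second file, which is consistent with the description excluding this predicate from push down.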
[jira] [Updated] (DRILL-6260) Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
[ https://issues.apache.org/jira/browse/DRILL-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-6260: --- Summary: Query fails with "ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query (was: Query fails with "UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query ) > Query fails with "ERROR: Non-scalar sub-query used in an expression" when it > contains a cast expression around a scalar sub-query > -- > > Key: DRILL-6260 > URL: https://issues.apache.org/jira/browse/DRILL-6260 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.13.0 >Reporter: Abhishek Girish >Priority: Major > > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > cast(max(T2.a) as varchar) FROM `t2.json` T2); > Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression > See Apache Drill JIRA: DRILL-1937 > {code} > Slightly different variants of the query work fine. 
> {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(cast(T2.a as varchar)) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00 Screen > 00-01 Project(b=[$0]) > 00-02 Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04 Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07 Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET > "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) > 00-09 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} > {code} > > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > > max(T2.a) FROM `t2.json` T2); > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(b=[$0]) > 00-02Project(b=[$1]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[=($0, $2)]) > 00-05 NestedLoopJoin(condition=[true], joinType=[left]) > 00-07Scan(table=[[si, tmp, t1.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, > columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) > 00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) > 00-08 Scan(table=[[si, tmp, t2.json]], > groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, > columns=[`a`], files=[maprfs:///tmp/t2.json]]]) > {code} > File contents: > {code} > # cat t1.json > {"a":1, "b":"V"} > {"a":2, "b":"W"} > {"a":3, "b":"X"} > {"a":4, "b":"Y"} > {"a":5, "b":"Z"} > # # cat t2.json > {"a":1, "b":"A"} > {"a":2, "b":"B"} > {"a":3, "b":"C"} > {"a":4, "b":"D"} > {"a":5, "b":"E"} > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6260) Query fails with "UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query
Abhishek Girish created DRILL-6260: -- Summary: Query fails with "UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression" when it contains a cast expression around a scalar sub-query Key: DRILL-6260 URL: https://issues.apache.org/jira/browse/DRILL-6260 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Affects Versions: 1.13.0 Reporter: Abhishek Girish {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > cast(max(T2.a) as varchar) FROM `t2.json` T2); Error: UNSUPPORTED_OPERATION ERROR: Non-scalar sub-query used in an expression See Apache Drill JIRA: DRILL-1937 {code} Slightly different variants of the query work fine. {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(cast(T2.a as varchar)) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00 Screen 00-01 Project(b=[$0]) 00-02 Project(b=[$1]) 00-03 SelectionVectorRemover 00-04 Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07 Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 00-06 StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Project($f0=[CAST($0):VARCHAR(65535) CHARACTER SET "UTF-16LE" COLLATE "UTF-16LE$en_US$primary"]) 00-09 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]){code} {code} > explain plan for SELECT T1.b FROM `t1.json` T1 WHERE T1.a = (SELECT > max(T2.a) FROM `t2.json` T2); +--+--+ | text | json | +--+--+ | 00-00Screen 00-01 Project(b=[$0]) 00-02Project(b=[$1]) 00-03 SelectionVectorRemover 00-04Filter(condition=[=($0, $2)]) 00-05 NestedLoopJoin(condition=[true], joinType=[left]) 00-07Scan(table=[[si, tmp, t1.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t1.json, numFiles=1, columns=[`a`, `b`], files=[maprfs:///tmp/t1.json]]]) 
00-06StreamAgg(group=[{}], EXPR$0=[MAX($0)]) 00-08 Scan(table=[[si, tmp, t2.json]], groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/t2.json, numFiles=1, columns=[`a`], files=[maprfs:///tmp/t2.json]]]) {code} File contents: {code} # cat t1.json {"a":1, "b":"V"} {"a":2, "b":"W"} {"a":3, "b":"X"} {"a":4, "b":"Y"} {"a":5, "b":"Z"} # # cat t2.json {"a":1, "b":"A"} {"a":2, "b":"B"} {"a":3, "b":"C"} {"a":4, "b":"D"} {"a":5, "b":"E"} {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
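A side note on the variants above, offered as an observation rather than something stated in the report: `cast(max(a) as varchar)` and `max(cast(a as varchar))` happen to agree on the sample files, but they are not equivalent in general, because a maximum taken over text compares values lexicographically. The sketch below models both expressions over plain Java lists. The class and method names are hypothetical, and it assumes the varchar comparison is a plain character-by-character ordering; an empty input (where SQL would return NULL) is not modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Illustrative model only; it does not call into Drill. */
final class CastAroundMaxSketch {

  /** Models cast(max(a) as varchar): numeric maximum first, then conversion to text. */
  static String castOfMax(List<Integer> values) {
    return Integer.toString(Collections.max(values));
  }

  /** Models max(cast(a as varchar)): conversion to text first, then string maximum. */
  static String maxOfCast(List<Integer> values) {
    return values.stream()
        .map(v -> Integer.toString(v))
        .max(Comparator.naturalOrder())
        .orElseThrow(() -> new IllegalArgumentException("no input rows"));
  }

  public static void main(String[] args) {
    // Values of column `a` in t2.json from the report: both forms give "5".
    List<Integer> sample = Arrays.asList(1, 2, 3, 4, 5);
    System.out.println(castOfMax(sample) + " " + maxOfCast(sample)); // 5 5

    // Hypothetical data with a two-digit value: "9" sorts after "10" as text.
    List<Integer> twoDigits = Arrays.asList(9, 10);
    System.out.println(castOfMax(twoDigits) + " " + maxOfCast(twoDigits)); // 10 9
  }
}
```

So the second query is a workaround only for data where the textual and numeric orderings coincide, as they do for the single-digit values in `t2.json`.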
[jira] [Updated] (DRILL-6252) Foreman node is going down when the non foreman node is stopped
[ https://issues.apache.org/jira/browse/DRILL-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Rozov updated DRILL-6252: -- Priority: Major (was: Critical) > Foreman node is going down when the non foreman node is stopped > --- > > Key: DRILL-6252 > URL: https://issues.apache.org/jira/browse/DRILL-6252 > Project: Apache Drill > Issue Type: Bug >Reporter: Venkata Jyothsna Donapati >Assignee: Vlad Rozov >Priority: Major > Fix For: 1.14.0 > > > Two drillbits are running. I'm running a join query over parquet and tried to > stop the non-foreman node using drillbit.sh stop. The query fails with > *"Error: DATA_READ ERROR: Exception occurred while reading from disk".* The > non-foreman node goes down. The foreman node also goes down. When I looked at > the drillbit.log of both foreman and non-foreman I found that there is memory > leak "Memory was leaked by query. Memory leaked: > (2097152)\nAllocator(op:2:0:0:HashPartitionSender) > 100/6291456/6832128/100 (res/actual/peak/limit)\n". Following are > the stack traces for memory leaks > {noformat} > [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query. 
Memory leaked: (3145728) > Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 > (res/actual/peak/limit) > > > Fragment 2:1 > [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:297) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266) > [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. > Memory leaked: (3145728) > Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 > (res/actual/peak/limit) > {noformat} > > Ping me for the logs and more information. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6199) Filter push down doesn't work with more than one nested subqueries
[ https://issues.apache.org/jira/browse/DRILL-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402106#comment-16402106 ] ASF GitHub Bot commented on DRILL-6199: --- Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1152 @HanumathRao thanks for the review. Applied code review comment. > Filter push down doesn't work with more than one nested subqueries > -- > > Key: DRILL-6199 > URL: https://issues.apache.org/jira/browse/DRILL-6199 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Anton Gozhiy >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > Attachments: DRILL_6118_data_source.csv > > > *Data set:* > The data is generated used the attached file: *DRILL_6118_data_source.csv* > Data gen commands: > {code:sql} > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d1` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0] in (1, 3); > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d2` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]=2; > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d3` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]>3; > {code} > *Steps:* > # Execute the following query: > {code:sql} > explain plan for select * from (select * from (select * from > dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders`)) where c1<3 > {code} > *Expected result:* > numFiles=2, numRowGroups=2, only files from the folders d1 and d2 should be > scanned. 
> *Actual result:* > Filter push down doesn't work: > numFiles=3, numRowGroups=3, scanning from all files
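The expected result (numFiles=2) can be reproduced with a small model of min/max pruning. This is a sketch under stated assumptions, not Drill's planner logic: it assumes pruning is driven by per-file minimum and maximum values of `c1`, the class and method names are hypothetical, and the upper bound used for folder d3 is invented because the contents of the attached CSV are not shown in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative model only; none of these names come from Drill's code base. */
final class MinMaxPruningSketch {

  /** Minimum and maximum of column c1 for the single file in one folder. */
  static final class FileStats {
    final String folder;
    final int minC1;
    final int maxC1;

    FileStats(String folder, int minC1, int maxC1) {
      this.folder = folder;
      this.minC1 = minC1;
      this.maxC1 = maxC1;
    }
  }

  /** A file can be skipped for `c1 < bound` when even its smallest value is >= bound. */
  static boolean canSkipForLessThan(FileStats stats, int bound) {
    return stats.minC1 >= bound;
  }

  /** Folders that still have to be scanned for the predicate `c1 < bound`. */
  static List<String> foldersToScan(List<FileStats> files, int bound) {
    List<String> result = new ArrayList<>();
    for (FileStats stats : files) {
      if (!canSkipForLessThan(stats, bound)) {
        result.add(stats.folder);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<FileStats> files = Arrays.asList(
        new FileStats("d1", 1, 3),   // created with: columns[0] in (1, 3)
        new FileStats("d2", 2, 2),   // created with: columns[0] = 2
        new FileStats("d3", 4, 5));  // created with: columns[0] > 3; the max of 5 is assumed

    // Predicate from the report: c1 < 3. Only d3 can be ruled out.
    System.out.println(foldersToScan(files, 3)); // [d1, d2]
  }
}
```

Folder d1 must still be scanned even though it contains the value 3, because its minimum of 1 satisfies the predicate; only d3 can be excluded, which gives the two files named in the expected result.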
[jira] [Commented] (DRILL-6199) Filter push down doesn't work with more than one nested subqueries
[ https://issues.apache.org/jira/browse/DRILL-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402103#comment-16402103 ] ASF GitHub Bot commented on DRILL-6199: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1152#discussion_r175137249 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestPushDownAndPruningWithItemStar.java --- @@ -180,4 +248,38 @@ public void testFilterPushDownMultipleConditions() throws Exception { .build(); } + @Test + public void testFilterPushDownWithSeveralNestedStarSubQueries() throws Exception { +String subQuery = String.format("select * from `%s`.`%s`", DFS_TMP_SCHEMA, TABLE_NAME); +String query = String.format("select * from (select * from (select * from (%s))) where o_orderdate = date '1992-01-01'", subQuery); + +String[] expectedPlan = {"numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=\\[`\\*\\*`, `o_orderdate`\\]"}; +String[] excludedPlan = {}; + +PlanTestBase.testPlanMatchingPatterns(query, expectedPlan, excludedPlan); + +testBuilder() +.sqlQuery(query) +.unOrdered() +.sqlBaselineQuery("select * from `%s`.`%s` where o_orderdate = date '1992-01-01'", DFS_TMP_SCHEMA, TABLE_NAME) +.build(); + } + + @Test + public void testFilterPushDownWithSeveralNestedStarSubQueriesWithAdditionalColumns() throws Exception { +String subQuery = String.format("select * from `%s`.`%s`", DFS_TMP_SCHEMA, TABLE_NAME); +String query = String.format("select * from (select * from (select *, o_orderdate from (%s))) where o_orderdate = date '1992-01-01'", subQuery); --- End diff -- Done. 
> Filter push down doesn't work with more than one nested subqueries > -- > > Key: DRILL-6199 > URL: https://issues.apache.org/jira/browse/DRILL-6199 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Anton Gozhiy >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > Attachments: DRILL_6118_data_source.csv > > > *Data set:* > The data is generated used the attached file: *DRILL_6118_data_source.csv* > Data gen commands: > {code:sql} > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d1` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0] in (1, 3); > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d2` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]=2; > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d3` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]>3; > {code} > *Steps:* > # Execute the following query: > {code:sql} > explain plan for select * from (select * from (select * from > dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders`)) where c1<3 > {code} > *Expected result:* > numFiles=2, numRowGroups=2, only files from the folders d1 and d2 should be > scanned. > *Actual result:* > Filter push down doesn't work: > numFiles=3, numRowGroups=3, scanning from all files -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6199) Filter push down doesn't work with more than one nested subqueries
[ https://issues.apache.org/jira/browse/DRILL-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402104#comment-16402104 ] ASF GitHub Bot commented on DRILL-6199: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1152#discussion_r175120182 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterItemStarReWriterRule.java --- @@ -54,83 +44,189 @@ import static org.apache.drill.exec.planner.logical.FieldsReWriterUtil.FieldsReWriter; /** - * Rule will transform filter -> project -> scan call with item star fields in filter - * into project -> filter -> project -> scan where item star fields are pushed into scan - * and replaced with actual field references. + * Rule will transform item star fields in filter and replaced with actual field references. * * This will help partition pruning and push down rules to detect fields that can be pruned or push downed. * Item star operator appears when sub-select or cte with star are used as source. 
*/ -public class DrillFilterItemStarReWriterRule extends RelOptRule { +public class DrillFilterItemStarReWriterRule { - public static final DrillFilterItemStarReWriterRule INSTANCE = new DrillFilterItemStarReWriterRule( - RelOptHelper.some(Filter.class, RelOptHelper.some(Project.class, RelOptHelper.any( TableScan.class))), - "DrillFilterItemStarReWriterRule"); + public static final DrillFilterItemStarReWriterRule.ProjectOnScan PROJECT_ON_SCAN = new ProjectOnScan( + RelOptHelper.some(DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class)), + "DrillFilterItemStarReWriterRule.ProjectOnScan"); - private DrillFilterItemStarReWriterRule(RelOptRuleOperand operand, String id) { -super(operand, id); - } + public static final DrillFilterItemStarReWriterRule.FilterOnScan FILTER_ON_SCAN = new FilterOnScan( + RelOptHelper.some(DrillFilterRel.class, RelOptHelper.any(DrillScanRel.class)), + "DrillFilterItemStarReWriterRule.FilterOnScan"); - @Override - public void onMatch(RelOptRuleCall call) { -Filter filterRel = call.rel(0); -Project projectRel = call.rel(1); -TableScan scanRel = call.rel(2); + public static final DrillFilterItemStarReWriterRule.FilterOnProject FILTER_ON_PROJECT = new FilterOnProject( + RelOptHelper.some(DrillFilterRel.class, RelOptHelper.some(DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))), + "DrillFilterItemStarReWriterRule.FilterOnProject"); -ItemStarFieldsVisitor itemStarFieldsVisitor = new ItemStarFieldsVisitor(filterRel.getRowType().getFieldNames()); -filterRel.getCondition().accept(itemStarFieldsVisitor); -// there are no item fields, no need to proceed further -if (!itemStarFieldsVisitor.hasItemStarFields()) { - return; + private static class ProjectOnScan extends RelOptRule { + +ProjectOnScan(RelOptRuleOperand operand, String id) { + super(operand, id); } -MapitemStarFields = itemStarFieldsVisitor.getItemStarFields(); +@Override +public boolean matches(RelOptRuleCall call) { + DrillScanRel scan = call.rel(1); + return 
scan.getGroupScan() instanceof ParquetGroupScan && super.matches(call); +} -// create new scan -RelNode newScan = constructNewScan(scanRel, itemStarFields.keySet()); +@Override +public void onMatch(RelOptRuleCall call) { + DrillProjectRel projectRel = call.rel(0); + DrillScanRel scanRel = call.rel(1); + + ItemStarFieldsVisitor itemStarFieldsVisitor = new ItemStarFieldsVisitor(scanRel.getRowType().getFieldNames()); + List projects = projectRel.getProjects(); + for (RexNode project : projects) { +project.accept(itemStarFieldsVisitor); + } -// combine original and new projects -List newProjects = new ArrayList<>(projectRel.getProjects()); + Map itemStarFields = itemStarFieldsVisitor.getItemStarFields(); -// prepare node mapper to replace item star calls with new input field references -Map fieldMapper = new HashMap<>(); + // if there are no item fields, no need to proceed further + if (itemStarFieldsVisitor.hasNoItemStarFields()) { --- End diff -- Sure, moved. > Filter push down doesn't work with more than one nested subqueries > -- > > Key: DRILL-6199 > URL: https://issues.apache.org/jira/browse/DRILL-6199 >
[jira] [Commented] (DRILL-6199) Filter push down doesn't work with more than one nested subqueries
[ https://issues.apache.org/jira/browse/DRILL-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402102#comment-16402102 ] ASF GitHub Bot commented on DRILL-6199: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1152#discussion_r175136589 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterItemStarReWriterRule.java --- @@ -54,83 +44,189 @@ import static org.apache.drill.exec.planner.logical.FieldsReWriterUtil.FieldsReWriter; /** - * Rule will transform filter -> project -> scan call with item star fields in filter - * into project -> filter -> project -> scan where item star fields are pushed into scan - * and replaced with actual field references. + * Rule will transform item star fields in filter and replaced with actual field references. * * This will help partition pruning and push down rules to detect fields that can be pruned or push downed. * Item star operator appears when sub-select or cte with star are used as source. 
*/ -public class DrillFilterItemStarReWriterRule extends RelOptRule { +public class DrillFilterItemStarReWriterRule { - public static final DrillFilterItemStarReWriterRule INSTANCE = new DrillFilterItemStarReWriterRule( - RelOptHelper.some(Filter.class, RelOptHelper.some(Project.class, RelOptHelper.any( TableScan.class))), - "DrillFilterItemStarReWriterRule"); + public static final DrillFilterItemStarReWriterRule.ProjectOnScan PROJECT_ON_SCAN = new ProjectOnScan( + RelOptHelper.some(DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class)), + "DrillFilterItemStarReWriterRule.ProjectOnScan"); - private DrillFilterItemStarReWriterRule(RelOptRuleOperand operand, String id) { -super(operand, id); - } + public static final DrillFilterItemStarReWriterRule.FilterOnScan FILTER_ON_SCAN = new FilterOnScan( + RelOptHelper.some(DrillFilterRel.class, RelOptHelper.any(DrillScanRel.class)), + "DrillFilterItemStarReWriterRule.FilterOnScan"); - @Override - public void onMatch(RelOptRuleCall call) { -Filter filterRel = call.rel(0); -Project projectRel = call.rel(1); -TableScan scanRel = call.rel(2); + public static final DrillFilterItemStarReWriterRule.FilterOnProject FILTER_ON_PROJECT = new FilterOnProject( --- End diff -- Done. 
> Filter push down doesn't work with more than one nested subqueries > -- > > Key: DRILL-6199 > URL: https://issues.apache.org/jira/browse/DRILL-6199 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Anton Gozhiy >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > Attachments: DRILL_6118_data_source.csv > > > *Data set:* > The data is generated used the attached file: *DRILL_6118_data_source.csv* > Data gen commands: > {code:sql} > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d1` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0] in (1, 3); > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d2` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]=2; > create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d3` (c1, c2, > c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] > c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` > where columns[0]>3; > {code} > *Steps:* > # Execute the following query: > {code:sql} > explain plan for select * from (select * from (select * from > dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders`)) where c1<3 > {code} > *Expected result:* > numFiles=2, numRowGroups=2, only files from the folders d1 and d2 should be > scanned. > *Actual result:* > Filter push down doesn't work: > numFiles=3, numRowGroups=3, scanning from all files -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402086#comment-16402086 ] ASF GitHub Bot commented on DRILL-6259: --- GitHub user arina-ielchiieva opened a pull request: https://github.com/apache/drill/pull/1173 DRILL-6259: Support parquet filter push down for complex types Details in [DRILL-6259](https://issues.apache.org/jira/browse/DRILL-6259). You can merge this pull request into a Git repository by running: $ git pull https://github.com/arina-ielchiieva/drill DRILL-6259 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1173.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1173 commit 7a694cedc76d76ce062b393ddd30002e8a6ba11a Author: Arina Ielchiieva Date: 2018-03-13T17:54:25Z DRILL-6259: Support parquet filter push down for complex types > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types which > underneath type is among supported simple types for filter push down. For > instance, currently Drill does not support filter push down for varchars, > decimals etc. Though once Drill will start support, this support will be > applied for complex type automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to > push down because we are not able to determine exact number of nulls in > arrays fields. 
> {{Consider [1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. > Statistics for the second case won't show any nulls but when querying from > two files, in terms of data the third value in array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6259: Description: Currently parquet filter push down is not working for complex types (including arrays). This Jira aims to implement filter push down for complex types which underneath type is among supported simple types for filter push down. For instance, currently Drill does not support filter push down for varchars, decimals etc. Though once Drill will start support, this support will be applied for complex type automatically. Complex fields will be pushed down the same way regular fields are, except for one case with arrays. Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to push down because we are not able to determine exact number of nulls in arrays fields. {{Consider [1, 2, 3]}} vs {{[1, 2]}} if}} these arrays are in different files. Statistics for the second case won't show any nulls but when querying from two files, in terms of data the third value in array is null. was: Currently parquet filter push down is not working for complex types (including arrays). This Jira aims to implement filter push down for complex types which underneath type is among supported simple types for filter push down. For instance, currently Drill does not support filter push down for varchars, decimals etc. Though once Drill will start support, this support will be applied for complex type automatically. Complex fields will be pushed down the same way regular fields are, except for one case with arrays. Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to push down because we are not able to determine exact number of nulls in arrays fields. {{Consider [1, 2, 3]}} vs {{[1, 2]. If}} these arrays are in different files. Statistics for the second case won't show any nulls but when querying from two files, in terms of data the third value in array is null. 
> Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types which > underneath type is among supported simple types for filter push down. For > instance, currently Drill does not support filter push down for varchars, > decimals etc. Though once Drill will start support, this support will be > applied for complex type automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to > push down because we are not able to determine exact number of nulls in > arrays fields. > {{Consider [1, 2, 3]}} vs {{[1, 2]}} if}} these arrays are in different > files. Statistics for the second case won't show any nulls but when querying > from two files, in terms of data the third value in array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6259: Description: Currently parquet filter push down is not working for complex types (including arrays). This Jira aims to implement filter push down for complex types which underneath type is among supported simple types for filter push down. For instance, currently Drill does not support filter push down for varchars, decimals etc. Though once Drill will start support, this support will be applied for complex type automatically. Complex fields will be pushed down the same way regular fields are, except for one case with arrays. Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to push down because we are not able to determine exact number of nulls in arrays fields. {{Consider [1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. Statistics for the second case won't show any nulls but when querying from two files, in terms of data the third value in array is null. was: Currently parquet filter push down is not working for complex types (including arrays). This Jira aims to implement filter push down for complex types which underneath type is among supported simple types for filter push down. For instance, currently Drill does not support filter push down for varchars, decimals etc. Though once Drill will start support, this support will be applied for complex type automatically. Complex fields will be pushed down the same way regular fields are, except for one case with arrays. Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to push down because we are not able to determine exact number of nulls in arrays fields. {{Consider [1, 2, 3]}} vs {{[1, 2]}} if}} these arrays are in different files. Statistics for the second case won't show any nulls but when querying from two files, in terms of data the third value in array is null. 
> Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types which > underneath type is among supported simple types for filter push down. For > instance, currently Drill does not support filter push down for varchars, > decimals etc. Though once Drill will start support, this support will be > applied for complex type automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to > push down because we are not able to determine exact number of nulls in > arrays fields. > {{Consider [1, 2, 3]}} vs {{[1, 2]}} if these arrays are in different files. > Statistics for the second case won't show any nulls but when querying from > two files, in terms of data the third value in array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6259) Support parquet filter push down for complex types
[ https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6259: Summary: Support parquet filter push down for complex types (was: Implement parquet filter push down for complex types) > Support parquet filter push down for complex types > -- > > Key: DRILL-6259 > URL: https://issues.apache.org/jira/browse/DRILL-6259 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Currently parquet filter push down is not working for complex types > (including arrays). > This Jira aims to implement filter push down for complex types which > underneath type is among supported simple types for filter push down. For > instance, currently Drill does not support filter push down for varchars, > decimals etc. Though once Drill will start support, this support will be > applied for complex type automatically. > Complex fields will be pushed down the same way regular fields are, except > for one case with arrays. > Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to > push down because we are not able to determine exact number of nulls in > arrays fields. > {{Consider [1, 2, 3]}} vs {{[1, 2]. If}} these arrays are in different files. > Statistics for the second case won't show any nulls but when querying from > two files, in terms of data the third value in array is null. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6242) Output format for nested date, time, timestamp values in an object hierarchy
[ https://issues.apache.org/jira/browse/DRILL-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402061#comment-16402061 ] Jiang Wu commented on DRILL-6242: - I can take a look at the changes required. Will update if this becomes too complicated for me to do. > Output format for nested date, time, timestamp values in an object hierarchy > > > Key: DRILL-6242 > URL: https://issues.apache.org/jira/browse/DRILL-6242 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.12.0 >Reporter: Jiang Wu >Priority: Major > > Some storage systems (MapR-DB, MongoDB, etc.) have hierarchical objects that > contain nested fields of date, time, and timestamp types. When a query returns > these objects, the output for the nested date, time, and timestamp values shows > the internal object (org.joda.time.DateTime) rather than the logical > data value. > For example, suppose that in MongoDB we have a single object that looks like > this: > {code:java} > > db.test.findOne(); > { > "_id" : ObjectId("5aa8487d470dd39a635a12f5"), > "name" : "orange", > "context" : { > "date" : ISODate("2018-03-13T21:52:54.940Z"), > "user" : "jack" > } > } > {code} > Then connect Drill to the above MongoDB storage, and run the following query > within Drill: > {code:java} > > select t.context.`date`, t.context from test t; > ++-+ > | EXPR$0 | context | > ++-+ > | 2018-03-13 | > {"date":{"dayOfYear":72,"year":2018,"dayOfMonth":13,"dayOfWeek":2,"era":1,"millisOfDay":78774940,"weekOfWeekyear":11,"weekyear":2018,"monthOfYear":3,"yearOfEra":2018,"yearOfCentury":18,"centuryOfEra":20,"millisOfSecond":940,"secondOfMinute":54,"secondOfDay":78774,"minuteOfHour":52,"minuteOfDay":1312,"hourOfDay":21,"zone":{"fixed":true,"id":"UTC"},"millis":1520977974940,"chronology":{"zone":{"fixed":true,"id":"UTC"}},"afterNow":false,"beforeNow":true,"equalNow":false},"user":"jack"} > | > {code} > We can see from the above output that when the date field is retrieved as a > 
top-level column, Drill outputs a logical date value. But when the same > field is within an object hierarchy, Drill outputs the internal object used > to hold the date value. > The expected output is the same display whether the date field is shown > as a top-level column or is within an object hierarchy: > {code:java} > > select t.context.`date`, t.context from test t; > ++-+ > | EXPR$0 | context | > ++-+ > | 2018-03-13 | {"date":"2018-03-13","user":"jack"} | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
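The behaviour requested in DRILL-6242 can be pictured with a small sketch. This is not Drill's code and not the eventual fix: the class and method names are hypothetical, and it uses `java.time` values instead of the `org.joda.time.DateTime` objects mentioned in the issue so that it stays self-contained. It walks a nested map and replaces each date, time, or timestamp value with its logical ISO string, which is the shape of the expected output above.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Drill.
final class NestedTemporalFormatSketch {

  // Returns a copy of the value in which every nested temporal value
  // is replaced by its ISO-8601 string form.
  static Object toLogical(Object value) {
    if (value instanceof Map) {
      Map<String, Object> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        out.put(String.valueOf(e.getKey()), toLogical(e.getValue()));
      }
      return out;
    }
    if (value instanceof List) {
      List<Object> out = new ArrayList<>();
      for (Object item : (List<?>) value) {
        out.add(toLogical(item));
      }
      return out;
    }
    if (value instanceof LocalDate
        || value instanceof LocalTime
        || value instanceof LocalDateTime) {
      // toString() on these java.time types yields the ISO-8601 form.
      return value.toString();
    }
    return value; // strings, numbers, etc. pass through unchanged
  }

  public static void main(String[] args) {
    Map<String, Object> context = new LinkedHashMap<>();
    context.put("date", LocalDate.of(2018, 3, 13));
    context.put("user", "jack");
    System.out.println(toLogical(context));
  }
}
```

The point of the sketch is only that the conversion has to be applied recursively, at the place where a nested object is rendered, rather than only for top-level columns.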
[jira] [Created] (DRILL-6259) Implement parquet filter push down for complex types
Arina Ielchiieva created DRILL-6259: --- Summary: Implement parquet filter push down for complex types Key: DRILL-6259 URL: https://issues.apache.org/jira/browse/DRILL-6259 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.13.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.14.0 Currently parquet filter push down is not working for complex types (including arrays). This Jira aims to implement filter push down for complex types which underneath type is among supported simple types for filter push down. For instance, currently Drill does not support filter push down for varchars, decimals etc. Though once Drill will start support, this support will be applied for complex type automatically. Complex fields will be pushed down the same way regular fields are, except for one case with arrays. Query with predicate {{where users.hobbies_ids[2] is null}} won't be able to push down because we are not able to determine exact number of nulls in arrays fields. {{Consider [1, 2, 3]}} vs {{[1, 2]. If}} these arrays are in different files. Statistics for the second case won't show any nulls but when querying from two files, in terms of data the third value in array is null. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
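The array exception described above can be made concrete with a small sketch. It is an illustration under assumptions, not Drill's or Parquet's API: `storedNullCount` stands in for a per-file null-count statistic, and `anyRowHasNullAt` stands in for how the query evaluates `hobbies_ids[2] is null`, where an index past the end of an array reads as null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; none of these names exist in Drill or Parquet.
final class ArrayNullPushDownSketch {

  // Stand-in for a per-file statistic: how many explicit null elements are stored.
  static long storedNullCount(List<List<Integer>> rows) {
    return rows.stream().flatMap(List::stream).filter(Objects::isNull).count();
  }

  // Stand-in for query semantics of `arr[index] IS NULL`:
  // an element that is missing because the array is too short reads as null.
  static boolean anyRowHasNullAt(List<List<Integer>> rows, int index) {
    return rows.stream().anyMatch(r -> index >= r.size() || r.get(index) == null);
  }

  public static void main(String[] args) {
    List<List<Integer>> fileA = Arrays.asList(Arrays.asList(1, 2, 3));
    List<List<Integer>> fileB = Arrays.asList(Arrays.asList(1, 2));

    // Both files report zero stored nulls ...
    System.out.println(storedNullCount(fileA) + " " + storedNullCount(fileB));
    // ... yet the predicate on index 2 still matches a row in the second file.
    System.out.println(anyRowHasNullAt(fileA, 2) + " " + anyRowHasNullAt(fileB, 2));
  }
}
```

Pruning the second file because its statistics show no nulls would therefore drop a row that the predicate should return, which is why this one case is excluded from push down.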
[jira] [Assigned] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-6256: --- Assignee: Volodymyr Tkach (was: Arina Ielchiieva) > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. > Also change min required maven version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6256: Labels: ready-to-commit (was: ) > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. > Also change min required maven version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-6256: --- Assignee: Arina Ielchiieva (was: Volodymyr Tkach) > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Arina Ielchiieva >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. > Also change min required maven version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401968#comment-16401968 ] ASF GitHub Bot commented on DRILL-6256: --- Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1172 +1 > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. > Also change min required maven version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-6251) Queries from system tables are hang
[ https://issues.apache.org/jira/browse/DRILL-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka reassigned DRILL-6251: -- Assignee: Vitalii Diravka > Queries from system tables are hang > --- > > Key: DRILL-6251 > URL: https://issues.apache.org/jira/browse/DRILL-6251 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.13.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > > On a CentOS cluster, Drill hangs while querying sys tables after "use dfs;" > (embedded or distributed mode): > {code} > 0: jdbc:drill:> select * from sys.version; > +--+---+--++++ > | version | commit_id | > commit_message|commit_time | > build_email | build_time | > +--+---+--++++ > | 1.13.0 | 796fcf051b3553c4597abbdca5ca247b139734ba | > [maven-release-plugin] prepare release drill-1.13.0 | 13.03.2018 @ 11:39:14 > IST | par...@apache.org | 13.03.2018 @ 13:13:45 IST | > +--+---+--++++ > 1 row selected (3.784 seconds) > 0: jdbc:drill:> use dfs; > +---+--+ > | ok | summary | > +---+--+ > | true | Default schema changed to [dfs] | > +---+--+ > 1 row selected (0.328 seconds) > 0: jdbc:drill:> select * from sys.version; > Error: Statement canceled (state=,code=0) > 0: jdbc:drill:> > {code} > *Note*: there is no failure on a local Debian machine with Drill in embedded > mode. > The dfs plugin configs are default (with "connection": "file:///"; other file > systems work fine). > This failure is connected to DRILL-5089 and the Calcite rebase. > Related commits: > https://github.com/apache/drill/commit/450e67094eb6e9a6484d7f86c49b51c77a08d7b2 > https://github.com/apache/drill/commit/18a71a38f6bd1fd33d21d1c68fc23c5901b0080a > After analyzing with a remote debugger I found the following flow: > the "dfs" DynamicRootSchema is created, then a new "sys" one is created. > After Calcite validation, a "sys" SimpleCalciteSchema is created. 
But in > WorkspaceSchemaFactory#create the wrong WorkspaceConfig is left, and "/" is > combined with "sys". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6145) Implement Hive MapR-DB JSON handler.
[ https://issues.apache.org/jira/browse/DRILL-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401784#comment-16401784 ] ASF GitHub Bot commented on DRILL-6145: --- Github user vdiravka commented on the issue: https://github.com/apache/drill/pull/1158 @priteshm @priteshm I have created a Jira for above mentioned issue: [DRILL-6258](https://issues.apache.org/jira/browse/DRILL-6258) > Implement Hive MapR-DB JSON handler. > - > > Key: DRILL-6145 > URL: https://issues.apache.org/jira/browse/DRILL-6145 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - MapRDB >Affects Versions: 1.12.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Labels: doc-impacting > Fix For: 1.14.0 > > > Similar to "hive-hbase-storage-handler" to support querying MapR-DB Hive's > external tables it is necessary to add "hive-maprdb-json-handler". > Use case: > # Create a table MapR-DB JSON table: > {code} > _> mapr dbshell_ > _maprdb root:> create /tmp/table/json_ (make sure /tmp/table exists) > {code} > -- insert data > {code} > insert /tmp/table/json --value '\{"_id":"movie002" , "title":"Developers > on the Edge", "studio":"Command Line Studios"}' > insert /tmp/table/json --id movie003 --value '\{"title":"The Golden > Master", "studio":"All-Nighter"}' > {code} > # Create a Hive external table: > {code} > hive> CREATE EXTERNAL TABLE mapr_db_json_hive_tbl ( > > movie_id string, title string, studio string) > > STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' > > TBLPROPERTIES("maprdb.table.name" = > "/tmp/table/json","maprdb.column.id" = "movie_id"); > {code} > > # Use hive schema to query this table via Drill: > {code} > 0: jdbc:drill:> select * from hive.mapr_db_json_hive_tbl; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6258) Jar files aren't downloaded if dependency is present only in profile section
Vitalii Diravka created DRILL-6258: -- Summary: Jar files aren't downloaded if dependency is present only in profile section Key: DRILL-6258 URL: https://issues.apache.org/jira/browse/DRILL-6258 Project: Apache Drill Issue Type: Improvement Components: Tools, Build Test Affects Versions: 1.13.0 Reporter: Vitalii Diravka Fix For: Future Dependencies declared only in a profile section of a module's POM, where that module is present in the distribution POM, should be downloaded as jars when the appropriate profile is enabled, just like dependencies from the common section of the POM files. This would avoid creating extra dependency sections or additional modules. Currently, to add jar files for a specific profile, the dependency must also be added to the profile section of the distribution/pom file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-4493) Fixed issues in various POMs with MapR profile
[ https://issues.apache.org/jira/browse/DRILL-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka resolved DRILL-4493. Resolution: Fixed It was merged into master branch with commit id c047f04b507faec > Fixed issues in various POMs with MapR profile > -- > > Key: DRILL-4493 > URL: https://issues.apache.org/jira/browse/DRILL-4493 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build Test >Affects Versions: 1.6.0 >Reporter: Aditya Kishore >Assignee: Aditya Kishore >Priority: Major > Fix For: 1.6.0 > > > * Remove inclusion of some transitive dependencies from distribution pom. > * Remove maprfs/json artifacts from "mapr" profile in drill-java-exec pom. > * Set "hadoop-common"'s scope as test in jdbc pom (without this the jdbc-all > jar bloats to >60MB). > * Revert HBase version to 0.98.12-mapr-1602-m7-5.1.0. > * Exclude log4j and commons-logging from some HBase artifacts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-4493) Fixed issues in various POMs with MapR profile
[ https://issues.apache.org/jira/browse/DRILL-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka updated DRILL-4493: --- Fix Version/s: (was: Future) 1.6.0 > Fixed issues in various POMs with MapR profile > -- > > Key: DRILL-4493 > URL: https://issues.apache.org/jira/browse/DRILL-4493 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build Test >Affects Versions: 1.6.0 >Reporter: Aditya Kishore >Assignee: Aditya Kishore >Priority: Major > Fix For: 1.6.0 > > > * Remove inclusion of some transitive dependencies from distribution pom. > * Remove maprfs/json artifacts from "mapr" profile in drill-java-exec pom. > * Set "hadoop-common"'s scope as test in jdbc pom (without this the jdbc-all > jar bloats to >60MB). > * Revert HBase version to 0.98.12-mapr-1602-m7-5.1.0. > * Exclude log4j and commons-logging from some HBase artifacts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6145) Implement Hive MapR-DB JSON handler.
[ https://issues.apache.org/jira/browse/DRILL-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka updated DRILL-6145: --- Reviewer: Vlad Rozov (was: Sorabh Hamirwasia) > Implement Hive MapR-DB JSON handler. > - > > Key: DRILL-6145 > URL: https://issues.apache.org/jira/browse/DRILL-6145 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - MapRDB >Affects Versions: 1.12.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Labels: doc-impacting > Fix For: 1.14.0 > > > Similar to "hive-hbase-storage-handler" to support querying MapR-DB Hive's > external tables it is necessary to add "hive-maprdb-json-handler". > Use case: > # Create a table MapR-DB JSON table: > {code} > _> mapr dbshell_ > _maprdb root:> create /tmp/table/json_ (make sure /tmp/table exists) > {code} > -- insert data > {code} > insert /tmp/table/json --value '\{"_id":"movie002" , "title":"Developers > on the Edge", "studio":"Command Line Studios"}' > insert /tmp/table/json --id movie003 --value '\{"title":"The Golden > Master", "studio":"All-Nighter"}' > {code} > # Create a Hive external table: > {code} > hive> CREATE EXTERNAL TABLE mapr_db_json_hive_tbl ( > > movie_id string, title string, studio string) > > STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' > > TBLPROPERTIES("maprdb.table.name" = > "/tmp/table/json","maprdb.column.id" = "movie_id"); > {code} > > # Use hive schema to query this table via Drill: > {code} > 0: jdbc:drill:> select * from hive.mapr_db_json_hive_tbl; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6145) Implement Hive MapR-DB JSON handler.
[ https://issues.apache.org/jira/browse/DRILL-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka updated DRILL-6145: --- Issue Type: Improvement (was: Bug) > Implement Hive MapR-DB JSON handler. > - > > Key: DRILL-6145 > URL: https://issues.apache.org/jira/browse/DRILL-6145 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - MapRDB >Affects Versions: 1.12.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Labels: doc-impacting > Fix For: 1.14.0 > > > Similar to "hive-hbase-storage-handler" to support querying MapR-DB Hive's > external tables it is necessary to add "hive-maprdb-json-handler". > Use case: > # Create a table MapR-DB JSON table: > {code} > _> mapr dbshell_ > _maprdb root:> create /tmp/table/json_ (make sure /tmp/table exists) > {code} > -- insert data > {code} > insert /tmp/table/json --value '\{"_id":"movie002" , "title":"Developers > on the Edge", "studio":"Command Line Studios"}' > insert /tmp/table/json --id movie003 --value '\{"title":"The Golden > Master", "studio":"All-Nighter"}' > {code} > # Create a Hive external table: > {code} > hive> CREATE EXTERNAL TABLE mapr_db_json_hive_tbl ( > > movie_id string, title string, studio string) > > STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' > > TBLPROPERTIES("maprdb.table.name" = > "/tmp/table/json","maprdb.column.id" = "movie_id"); > {code} > > # Use hive schema to query this table via Drill: > {code} > 0: jdbc:drill:> select * from hive.mapr_db_json_hive_tbl; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6246) Build Failing in jdbc-all artifact
[ https://issues.apache.org/jira/browse/DRILL-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401723#comment-16401723 ] ASF GitHub Bot commented on DRILL-6246: --- Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1168 Classes from `avatica.metrics` are used in `JsonHandler`, `ProtobufHandler` and `LocalService`. If Drill does not use these classes, then I agree that we can exclude them from the `jdbc-all` jar. Regarding excluding `avatica/org/**`, it looks like the problem is in the Avatica pom files, since there are no dependencies on `org.apache.commons` and `org.apache.http`, but they are shaded into the jar. I created Jira CALCITE-2215 to fix this issue, but for now I think it's ok to exclude them. > Build Failing in jdbc-all artifact > -- > > Key: DRILL-6246 > URL: https://issues.apache.org/jira/browse/DRILL-6246 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Affects Versions: 1.13.0 >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > > * It was noticed that the build was failing because of the jdbc-all artifact > * The maximum compressed jar size was set to 32MB but we are currently creating a JAR a bit larger than 32MB > * I compared apache drill-1.10.0, drill-1.12.0, and drill-1.13.0 (on my MacOS) > * jdbc-all-1.10.0 jar size: 21MB > * jdbc-all-1.12.0 jar size: 27MB > * jdbc-all-1.13.0 jar size: 34MB (on Linux this size is roughly 32MB) > * I then compared jdbc-all-1.12.0 and jdbc-all-1.13.0 in more detail > * The bulk of the increase is attributed to the calcite artifact > * It used to be 2MB (uncompressed) and is now 22MB (uncompressed) > * It is likely an exclusion problem > * The jdbc-all-1.12.0 version has only two top packages, calcite/avatica/utils and 
calcite/avatica/remote > * The jdbc-all-1.13.0 version includes new packages (within calcite/avatica): metrics, proto, org/apache/, com/fasterxml, com/google > > I am planning to exclude these new sub-packages -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6256: Affects Version/s: 1.13.0 > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6256: Reviewer: Arina Ielchiieva Description: Since master branch uses jdk 8 we should remove all references to java 7. Also change min required maven version. was:Since master branch uses jdk 8 we should remove all references to java 7. > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. > Also change min required maven version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6256: Fix Version/s: 1.14.0 > Remove references to java 7 from readme and other files > --- > > Key: DRILL-6256 > URL: https://issues.apache.org/jira/browse/DRILL-6256 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Fix For: 1.14.0 > > > Since master branch uses jdk 8 we should remove all references to java 7. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-1491) Support for JDK 8
[ https://issues.apache.org/jira/browse/DRILL-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-1491: Labels: doc-impacting ready-to-commit (was: doc-impacting) > Support for JDK 8 > - > > Key: DRILL-1491 > URL: https://issues.apache.org/jira/browse/DRILL-1491 > Project: Apache Drill > Issue Type: Task > Components: Tools, Build Test >Reporter: Aditya Kishore >Assignee: Volodymyr Tkach >Priority: Blocker > Labels: doc-impacting, ready-to-commit > Fix For: 1.13.0 > > Attachments: DRILL-1491.1.patch.txt > > > This will be the umbrella JIRA used to track and fix issues with JDK 8 > support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-1491) Support for JDK 8
[ https://issues.apache.org/jira/browse/DRILL-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-1491: Labels: doc-impacting (was: ) > Support for JDK 8 > - > > Key: DRILL-1491 > URL: https://issues.apache.org/jira/browse/DRILL-1491 > Project: Apache Drill > Issue Type: Task > Components: Tools, Build Test >Reporter: Aditya Kishore >Assignee: Volodymyr Tkach >Priority: Blocker > Labels: doc-impacting, ready-to-commit > Fix For: 1.13.0 > > Attachments: DRILL-1491.1.patch.txt > > > This will be the umbrella JIRA used to track and fix issues with JDK 8 > support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (DRILL-6257) Sqlline start command with password appears in the sqlline.log
[ https://issues.apache.org/jira/browse/DRILL-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva closed DRILL-6257. --- Resolution: Duplicate Original issue https://issues.apache.org/jira/browse/DRILL-6250. > Sqlline start command with password appears in the sqlline.log > -- > > Key: DRILL-6257 > URL: https://issues.apache.org/jira/browse/DRILL-6257 > Project: Apache Drill > Issue Type: Bug >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > > *Prerequisites:* > *1.* Log level is set to "all" in the conf/logback.xml: > {code:xml} > > > > > {code} > *2.* PLAIN authentication mechanism is configured: > {code:java} > security.user.auth: { > enabled: true, > packages += "org.apache.drill.exec.rpc.user.security", > impl: "pam", > pam_profiles: [ "sudo", "login" ] > } > {code} > *Steps:* > *1.* Start the drillbits > *2.* Connect with sqlline: > {noformat} > /opt/mapr/drill/drill-1.13.0/bin/sqlline -u "jdbc:drill:zk=node1:5181;" -n > user1 -p 1234 > {noformat} > *3.* Check the sqlline logs: > {noformat} > tail -F log/sqlline.log|grep 1234 -a5 -b5 > {noformat} > *Expected result:* Logs shouldn't contain clear-text passwords > *Actual result:* The logs contain the sqlline start command with the password: > {noformat} > # system properties > 35333-"java" : { > 35352-# system properties > 35384:"command" : "sqlline.SqlLine -d > org.apache.drill.jdbc.Driver --maxWidth=1 --color=true -u > jdbc:drill:zk=node1:5181; -n user1 -p 1234", > 35535-# system properties > 35567-"launcher" : "SUN_STANDARD" > 35607-} > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6257) Sqlline start command with password appears in the sqlline.log
Volodymyr Tkach created DRILL-6257:
--
Summary: Sqlline start command with password appears in the sqlline.log
Key: DRILL-6257
URL: https://issues.apache.org/jira/browse/DRILL-6257
Project: Apache Drill
Issue Type: Bug
Reporter: Volodymyr Tkach
Assignee: Volodymyr Tkach

*Prerequisites:*
*1.* Log level is set to "all" in the conf/logback.xml:
{code:xml}
{code}
*2.* PLAIN authentication mechanism is configured:
{code:java}
security.user.auth: {
  enabled: true,
  packages += "org.apache.drill.exec.rpc.user.security",
  impl: "pam",
  pam_profiles: [ "sudo", "login" ]
}
{code}
*Steps:*
*1.* Start the drillbits
*2.* Connect using sqlline:
{noformat}
/opt/mapr/drill/drill-1.13.0/bin/sqlline -u "jdbc:drill:zk=node1:5181;" -n user1 -p 1234
{noformat}
*3.* Check the sqlline logs:
{noformat}
tail -F log/sqlline.log|grep 1234 -a5 -b5
{noformat}
*Expected result:* Logs shouldn't contain clear-text passwords
*Actual result:* The logs contain the sqlline start command with password:
{noformat}
# system properties
35333-"java" : {
35352-# system properties
35384:"command" : "sqlline.SqlLine -d org.apache.drill.jdbc.Driver --maxWidth=1 --color=true -u jdbc:drill:zk=node1:5181; -n user1 -p 1234",
35535-# system properties
35567-"launcher" : "SUN_STANDARD"
35607-}
{noformat}
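The expected result above (no clear-text passwords in the log) could be approached by masking the value that follows `-p` before the start command is written out. The following is only a minimal sketch of that idea, not the fix adopted in Drill or sqlline: the class name `CommandLineRedactor` and the method `redact` are hypothetical, and it assumes the command is available as a plain string with whitespace-separated arguments.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Drill or sqlline.
final class CommandLineRedactor {

  // Matches "-p <value>" where -p starts the string or follows whitespace,
  // so options such as --color or --maxWidth are left untouched.
  private static final Pattern PASSWORD_ARG = Pattern.compile("(^|\\s)-p\\s+\\S+");

  private CommandLineRedactor() {
  }

  // Returns the command with the argument after -p replaced by "****".
  static String redact(String command) {
    return PASSWORD_ARG.matcher(command).replaceAll("$1-p ****");
  }
}
```

Passing the command string from the report through `CommandLineRedactor.redact(...)` before logging would leave `-n user1 -p ****` at the end instead of the real password. A password containing whitespace, or one supplied through the JDBC URL, is not covered by this sketch.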
[jira] [Commented] (DRILL-6256) Remove references to java 7 from readme and other files
[ https://issues.apache.org/jira/browse/DRILL-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401487#comment-16401487 ]

ASF GitHub Bot commented on DRILL-6256:
---
GitHub user vladimirtkach opened a pull request:

    https://github.com/apache/drill/pull/1172

    DRILL-6256: Remove references to java 7 from readme and other files

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vladimirtkach/drill DRILL-6256

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1172.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1172

commit 8349436f24f22e44cf9c0dfe2bd453d8a9fd3137
Author: vladimir tkach
Date: 2018-03-16T05:58:42Z

    DRILL-6256: Remove references to java 7 from readme and other files

> Remove references to java 7 from readme and other files
> ---
>
> Key: DRILL-6256
> URL: https://issues.apache.org/jira/browse/DRILL-6256
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Volodymyr Tkach
> Assignee: Volodymyr Tkach
> Priority: Major
>
> Since master branch uses jdk 8 we should remove all references to java 7.