[jira] [Commented] (DRILL-4553) Joins using views are not returning results.
[ https://issues.apache.org/jira/browse/DRILL-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217296#comment-15217296 ]

Anton Fernando commented on DRILL-4553:
---

The query returns data if I do not have the filter on the first view (where username = upper(user)), but the whole point of this exercise is to secure the data by using data around who can view what inside of the JSON files.

> Joins using views are not returning results.
>
> Key: DRILL-4553
> URL: https://issues.apache.org/jira/browse/DRILL-4553
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.6.0
> Reporter: Anton Fernando
> Priority: Critical
>
> I have the following three views:
> create view view1 as select . from where username=user;
> create view view2 as select . from view1 as a, as b where a.col1 = b.col1;
> create view view3 as select . from view1 as a, as b where a.col1 = b.col1;
> A select * from each of these views works fine and returns the expected
> results. A self join on view2 and view3 also works fine. However when view2
> and view3 are joined on common keys there are no rows returned.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4553) Joins using views are not returning results.
[ https://issues.apache.org/jira/browse/DRILL-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217288#comment-15217288 ]

Anton Fernando commented on DRILL-4553:
---

This is over JSON and CSV. In this scenario the security metadata is in CSV and the first view is created over it. Views 2 and 3 are used to secure data in JSON with the security metadata in CSV. We are currently evaluating Drill to see if it is a good fit to analyze healthcare data and we have run into this issue. The explain plan for the query that is not returning data is as follows:

0: jdbc:drill:zk=localhost:2181> explain plan for select a.facilityidentifier, a.encounteridentifier from dischargedetail a, dischargephysn b where a.encounteridentifier=b.encounteridentifier and a.facilityidentifier=b.facilityidentifier;
+--+--+
| text | json |
+--+--+
| 00-00 Screen
00-01 Project(facilityidentifier=[$0], encounteridentifier=[$1])
00-02 Project(facilityidentifier=[$1], encounteridentifier=[$0])
00-03 Project(EncounterIdentifier=[$2], FacilityIdentifier=[$3], EncounterIdentifier0=[$0], FacilityIdentifier0=[$1])
00-04 HashJoin(condition=[AND(=($2, $0), =($3, $1))], joinType=[inner])
00-06 Project(EncounterIdentifier=[$0], FacilityIdentifier=[$1])
00-08 HashJoin(condition=[AND(=($1, $13), =($2, $14))], joinType=[inner])
00-11 Project(EncounterIdentifier=[$0], FacilityIdentifier=[$1], SettingOfCare=[$2], ITEM=[ITEM($3, 'MedicalProfessionalIdentifierRaw')], ITEM4=[ITEM($3, 'MedicalProfessionalRoleCodeRaw')], ITEM5=[ITEM($3, 'MedicalProfessionalRoleCode')], ITEM6=[ITEM($3, 'FirstNameRaw')], ITEM7=[ITEM($3, 'LastNameRaw')], ITEM8=[ITEM($3, 'MiddleNameRaw')], ITEM9=[ITEM($3, 'MedicalProfessionalPrimarySpecialtyRaw')], ITEM10=[ITEM($3, 'MedicalProfessionalSecondarySpecialtyRaw')], ITEM11=[ITEM($3, 'NationalProviderIdentifierRaw')], ITEM12=[ITEM($3, 'UniformProviderIdentifierRaw')])
00-14 Flatten(flattenField=[$3])
00-17 Project(EncounterIdentifier=[$0], FacilityIdentifier=[ITEM($1, 'FacilityIdentifier')], SettingOfCare=[$2], MedicalProfessionals=[$3])
00-21 Scan(groupscan=[EasyGroupScan [selectionRoot=hdfs://sandbox.hortonworks.com:8020/tmp/json, numFiles=3, columns=[`EncounterIdentifier`, `Facility`.`FacilityIdentifier`, `SettingOfCare`, `MedicalProfessionals`], files=[hdfs://sandbox.hortonworks.com:8020/tmp/json/403Encounters.json, hdfs://sandbox.hortonworks.com:8020/tmp/json/404Encounters.json, hdfs://sandbox.hortonworks.com:8020/tmp/json/405Encounters.json]]])
00-10 Project(FacilityIdentifier0=[$0], SettingOfCare0=[$1])
00-13 Project(FacilityIdentifier=[$1], SettingOfCare=[$2])
00-16 SelectionVectorRemover
00-20 Filter(condition=[=($0, UPPER(USER))])
00-24 Project(username=[ITEM($0, 0)], FacilityIdentifier=[ITEM($0, 1)], SettingOfCare=[ITEM($0, 2)])
00-26 Scan(groupscan=[EasyGroupScan [selectionRoot=hdfs://sandbox.hortonworks.com:8020/tmp/security, numFiles=1, columns=[`columns`[0], `columns`[1], `columns`[2]], files=[hdfs://sandbox.hortonworks.com:8020/tmp/security/lake_data_security.csv]]])
00-05 Project(EncounterIdentifier0=[$0], FacilityIdentifier0=[$1])
00-07 Project(EncounterIdentifier=[$1], FacilityIdentifier=[$2])
00-09 SelectionVectorRemover
00-12 Filter(condition=[=($2, $3)])
00-15 HashJoin(condition=[=($0, $4)], joinType=[inner])
00-19 Project(SettingOfCare=[$0], EncounterIdentifier=[$1], ITEM=[ITEM($2, 'FacilityIdentifier')])
00-23 Scan(groupscan=[EasyGroupScan [selectionRoot=hdfs://sandbox.hortonworks.com:8020/tmp/json, numFiles=3, columns=[`SettingOfCare`, `EncounterIdentifier`, `Facility`.`FacilityIdentifier`], files=[hdfs://sandbox.hortonworks.com:8020/tmp/json/403Encounters.json, hdfs://sandbox.hortonworks.com:8020/tmp/json/404Encounters.json, hdfs://sandbox.hortonworks.com:8020/tmp/json/405Encounters.json]]])
00-18 SelectionVectorRemover
00-22 Filter(condition=[=($0, UPPER(USER))])
00-25 Project(username=[ITEM($0, 0)], FacilityIdentifier=[ITEM($0, 1)])
00-27 Scan(groupscan=[EasyGroupScan [selectionRoot=hdfs://sandbox.hortonworks.com:8020/tmp/security, numFiles=1, columns=[`columns`[0], `columns`[1]], files=[hdfs://sandbox.hortonworks.com:8020/tmp/security/lake_data_security.csv]]])
| { "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ ], "queue" : 0, "resultMode" : "EXEC"
[jira] [Commented] (DRILL-4550) Add support more time units in extract function
[ https://issues.apache.org/jira/browse/DRILL-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217191#comment-15217191 ]

ASF GitHub Bot commented on DRILL-4550:
---

GitHub user vkorukanti opened a pull request:

    https://github.com/apache/drill/pull/453

    DRILL-4550: Add support more time units in extract function

    Calcite changes are pending in CALCITE-1177

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vkorukanti/drill DRILL-4550

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/453.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #453

commit ca223227cc44052b55e13a4b7525262ec4ec40f8
Author: vkorukanti
Date: 2016-03-30T00:08:57Z

    DRILL-4550: Add support more time units in extract function

> Add support more time units in extract function
> ---
>
> Key: DRILL-4550
> URL: https://issues.apache.org/jira/browse/DRILL-4550
> Project: Apache Drill
> Issue Type: Improvement
> Components: Functions - Drill
> Affects Versions: 1.6.0
> Reporter: Venki Korukanti
> Assignee: Venki Korukanti
> Fix For: 1.7.0
>
> Currently the {{extract}} function supports the following units: {{YEAR, MONTH, DAY,
> HOUR, MINUTE, SECOND}}. Add support for more units: {{CENTURY, DECADE, DOW,
> DOY, EPOCH, MILLENNIUM, QUARTER, WEEK}}.
> We also need changes in the SQL parser. Currently the parser only allows
> {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND}} as units.
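The additional units requested in DRILL-4550 can be illustrated with a small, self-contained sketch. This is not the code from the pull request: the class name `ExtractSketch`, the `Unit` enum, and the `extract` method are hypothetical, and the unit semantics (Sunday counted as day 0 for DOW, ISO-8601 week numbering for WEEK, centuries and millennia counted from year 1, UTC for EPOCH) are assumptions borrowed from common SQL dialects rather than confirmed Drill behavior.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.IsoFields;

// Hypothetical illustration only; not Drill's implementation.
final class ExtractSketch {

    // The units DRILL-4550 asks to add on top of YEAR..SECOND.
    enum Unit { CENTURY, DECADE, DOW, DOY, EPOCH, MILLENNIUM, QUARTER, WEEK }

    // Assumes a positive (CE) year; negative years would need floor division.
    static long extract(Unit unit, LocalDateTime ts) {
        int year = ts.getYear();
        switch (unit) {
            case CENTURY:
                return (year + 99) / 100;            // 2016 -> 21
            case DECADE:
                return year / 10;                    // 2016 -> 201
            case DOW:
                // java.time uses Monday=1..Sunday=7; map to Sunday=0..Saturday=6.
                return ts.getDayOfWeek().getValue() % 7;
            case DOY:
                return ts.getDayOfYear();
            case EPOCH:
                // Seconds since 1970-01-01T00:00:00, treating the value as UTC.
                return ts.toEpochSecond(ZoneOffset.UTC);
            case MILLENNIUM:
                return (year + 999) / 1000;          // 2016 -> 3
            case QUARTER:
                return ts.get(IsoFields.QUARTER_OF_YEAR);
            case WEEK:
                return ts.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }
}
```

For example, under these assumptions `ExtractSketch.extract(ExtractSketch.Unit.DOY, LocalDateTime.of(2016, 3, 30, 0, 8, 57))` yields the day of the year for the commit timestamp quoted in the message.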
[jira] [Created] (DRILL-4557) Make complex writer handle also scalars
Julien Le Dem created DRILL-4557:

Summary: Make complex writer handle also scalars
Key: DRILL-4557
URL: https://issues.apache.org/jira/browse/DRILL-4557
Project: Apache Drill
Issue Type: Improvement
Reporter: Julien Le Dem

Currently the complex writer can be used to write an array or a map, but not a scalar.
[jira] [Updated] (DRILL-4557) Make complex writer handle also scalars
[ https://issues.apache.org/jira/browse/DRILL-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated DRILL-4557:
-
    Issue Type: Sub-task (was: Improvement)
    Parent: DRILL-4538
[jira] [Created] (DRILL-4556) UDF with FieldReader parameter reading union type fails compilation
Julien Le Dem created DRILL-4556:

Summary: UDF with FieldReader parameter reading union type fails compilation
Key: DRILL-4556
URL: https://issues.apache.org/jira/browse/DRILL-4556
Project: Apache Drill
Issue Type: Bug
Reporter: Julien Le Dem

select foo(a) from mixed
where a is a union vector (say mixed is a JSON file where a is a string or an int).
Foo is a UDF that has one parameter defined as a FieldReader.
The operator compilation fails because the field is produced as a UnionHolder instead of a FieldReader.
[jira] [Updated] (DRILL-4556) UDF with FieldReader parameter reading union type fails compilation
[ https://issues.apache.org/jira/browse/DRILL-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated DRILL-4556:
-
    Issue Type: Sub-task (was: Bug)
    Parent: DRILL-4538
[jira] [Commented] (DRILL-3743) query hangs on sqlline once Drillbit on foreman node is killed
[ https://issues.apache.org/jira/browse/DRILL-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217180#comment-15217180 ]

Sudheesh Katkam commented on DRILL-3743:

The close listener is [removed too early|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/QueryResultHandler.java#L323]. The listener should be removed after the final query state is received (through #resultArrived).

> query hangs on sqlline once Drillbit on foreman node is killed
> --
>
> Key: DRILL-3743
> URL: https://issues.apache.org/jira/browse/DRILL-3743
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Assignee: Sudheesh Katkam
> Priority: Critical
> Fix For: Future
>
> sqlline/query hangs once Drillbit (on Foreman node) is killed. (kill -9 )
> query was issued from the Foreman node. The query returns many records, and
> it is a long running query.
> Steps to reproduce the problem.
> set planner.slice_target=1
> 1. clush -g khurram service mapr-warden stop
> 2. clush -g khurram service mapr-warden start
> 3. ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 200;
> 4. Immediately from another console do a jps and kill the Drillbit process
> (in this case foreman) while the query is being run on sqlline. You will
> notice that sqlline just hangs, we do not see any exceptions or errors being
> reported on sqlline prompt or in drillbit.log or drillbit.out
> I do see this Exception in sqlline.log on the node from where sqlline was
> started
> {code}
> 2015-09-04 18:45:12,069 [Client-1] INFO o.a.d.e.rpc.user.QueryResultHandler - User Error Occurred
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524) ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>     at org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-04 18:45:12,069 [Client-1] INFO o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524) ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>     at org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
>     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
[jira] [Commented] (DRILL-4531) Query with filter and aggregate hangs in planning phase
[ https://issues.apache.org/jira/browse/DRILL-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217169#comment-15217169 ]

Chun Chang commented on DRILL-4531:
---

added a test in automation. verified fix.

> Query with filter and aggregate hangs in planning phase
> ---
>
> Key: DRILL-4531
> URL: https://issues.apache.org/jira/browse/DRILL-4531
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Fix For: 1.7.0
>
> For the following query,
> {code}
> SELECT cust.custAddress,
>        lineitem.provider
> FROM (
>       SELECT cast(c_custkey AS bigint) AS custkey,
>              c_address AS custAddress
>       FROM cp.`tpch/customer.parquet` ) cust
> LEFT JOIN
>   (
>     SELECT DISTINCT l_linenumber,
>            CASE
>              WHEN l_partkey IN (1, 2) THEN 'Store1'
>              WHEN l_partkey IN (5, 6) THEN 'Store2'
>            END AS provider
>     FROM cp.`tpch/lineitem.parquet`
>     WHERE ( l_orderkey >=20160101 AND l_partkey <=20160301)
>       AND l_partkey IN (1,2, 5, 6) ) lineitem
> ON cust.custkey = lineitem.l_linenumber
> WHERE provider IS NOT NULL
> GROUP BY cust.custAddress,
>          lineitem.provider
> ORDER BY cust.custAddress,
>          lineitem.provider;
> {code}
> When run on today's master branch commit:
> 79a3c164c1df7a5d7a0b82574316b4a0b1c7593e, query just hangs there in the
> planning phase.
> Log shows that it stuck in Drill_Logical planning phase.
[jira] [Commented] (DRILL-4553) Joins using views are not returning results.
[ https://issues.apache.org/jira/browse/DRILL-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217163#comment-15217163 ]

Khurram Faraaz commented on DRILL-4553:
---

Is this over Parquet or JSON or CSV data ? Can you please share the query plan for the case where equi-join over two views returns no results ?
[jira] [Updated] (DRILL-4553) Joins using views are not returning results.
[ https://issues.apache.org/jira/browse/DRILL-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-4553:
--
    Priority: Critical (was: Major)
[jira] [Assigned] (DRILL-3743) query hangs on sqlline once Drillbit on foreman node is killed
[ https://issues.apache.org/jira/browse/DRILL-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheesh Katkam reassigned DRILL-3743:
--
    Assignee: Sudheesh Katkam
[jira] [Created] (DRILL-4555) JsonReader does not support nulls in lists
Julien Le Dem created DRILL-4555:

Summary: JsonReader does not support nulls in lists
Key: DRILL-4555
URL: https://issues.apache.org/jira/browse/DRILL-4555
Project: Apache Drill
Issue Type: Bug
Reporter: Julien Le Dem

{noformat}
case VALUE_NULL:
  throw UserException.unsupportedError()
    .message("Null values are not supported in lists by default. " +
      "Please set `store.json.all_text_mode` to true to read lists containing nulls. " +
      "Be advised that this will treat JSON null values as a string containing the word 'null'.")
    .build(logger);
{noformat}
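The behavior described by the error message in DRILL-4555 can be modeled with a minimal sketch. It is only a simplified stand-in for the reader: the class `NullInListSketch`, the method `readList`, and the `allTextMode` flag are hypothetical names, list elements are rendered as text in both modes purely to keep the example short, and the real JsonReader works on a streaming parser rather than on a `List`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the behavior quoted in DRILL-4555.
final class NullInListSketch {

    // Reads the elements of one JSON list that has already been parsed into Java values.
    static List<String> readList(List<Object> jsonValues, boolean allTextMode) {
        List<String> out = new ArrayList<>();
        for (Object value : jsonValues) {
            if (value == null) {
                if (!allTextMode) {
                    // Mirrors the default path: nulls inside lists are rejected.
                    throw new UnsupportedOperationException(
                        "Null values are not supported in lists by default.");
                }
                // In all-text mode a JSON null becomes the four-character string "null".
                out.add("null");
            } else {
                out.add(String.valueOf(value));
            }
        }
        return out;
    }
}
```

The sketch makes the trade-off in the message visible: with the flag on, a consumer can no longer tell a JSON null from the literal string "null".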
[jira] [Resolved] (DRILL-4472) Pushing Filter past Union All fails: DRILL-3257 regressed DRILL-2746 but unit test update break test goal
[ https://issues.apache.org/jira/browse/DRILL-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Hsuan-Yi Chu resolved DRILL-4472.
--
    Resolution: Fixed
    Fix Version/s: 1.7.0

It was resolved when DRILL-4476

> Pushing Filter past Union All fails: DRILL-3257 regressed DRILL-2746 but unit
> test update break test goal
> -
>
> Key: DRILL-4472
> URL: https://issues.apache.org/jira/browse/DRILL-4472
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Jacques Nadeau
> Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
> While reviewing DRILL-4467, I discovered this test.
> https://github.com/apache/drill/blame/master/exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java#L560
> As you can see, the test is checking that test name confirms that filter is
> pushed below union all. However, as you can see, the expected result in
> DRILL-3257 was updated to a plan which doesn't push the in clause below the
> filter. I'm disabling the test since 4467 happens to remove what becomes a
> trivial project. However, we really should fix the core problem (a regression
> of DRILL-2746).
[jira] [Commented] (DRILL-4551) Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate)
[ https://issues.apache.org/jira/browse/DRILL-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217083#comment-15217083 ]

ASF GitHub Bot commented on DRILL-4551:
---

Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/452#discussion_r57819554

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctionHelpers.java ---
    @@ -213,11 +213,39 @@ public static long getDate(DrillBuf buf, int start, int end){
         if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
           buf.checkBytes(start, end);
         }
    -    return memGetDate(buf.memoryAddress(), start, end);
    +    int[] dateFields = memGetDate(buf.memoryAddress(), start, end);
    +    return CHRONOLOGY.getDateTimeMillis(dateFields[0], dateFields[1], dateFields[2], 0);
       }

    +  /**
    +   * Takes a string value, specified as a buffer with a start and end and
    +   * returns true if the value can be read as a date.
    +   *
    +   * @param buf
    +   * @param start
    +   * @param end
    +   * @return true iff the string value can be read as a date
    +   */
    +  public static boolean isReadableAsDate(DrillBuf buf, int start, int end){
    +    if (BoundsChecking.BOUNDS_CHECKING_ENABLED) {
    +      buf.checkBytes(start, end);
    +    }
    +    int[] dateFields = memGetDate(buf.memoryAddress(), start, end);
    --- End diff --

    Can we call getDate() directly here, and wrap with a try/catch block? The code seems identical to getDate(), except for the try/catch block.

> Add some missing functions that are generated by Tableau (cot, regex_matches,
> split_part, isdate)
> -
>
> Key: DRILL-4551
> URL: https://issues.apache.org/jira/browse/DRILL-4551
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Jason Altekruse
> Assignee: Jason Altekruse
>
> Several of these functions do not appear to be standard SQL functions, but
> they are available in several other popular databases like SQL Server, Oracle
> and Postgres.
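The review suggestion above (have the check call getDate() and wrap it in a try/catch instead of duplicating the parsing) can be sketched in isolation. This is a stand-in under stated assumptions, not the Drill code: `DateCheckSketch` is a hypothetical class, it parses a `String` with `java.time.LocalDate` rather than a `DrillBuf` with `memGetDate`, and the exception type caught by the real helper is not shown in the diff.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for the helper pair discussed in the review comment.
final class DateCheckSketch {

    // Parses an ISO-8601 date (yyyy-MM-dd) and returns milliseconds since the epoch at UTC midnight.
    // Throws DateTimeParseException when the text is not a valid date.
    static long getDate(String text) {
        return LocalDate.parse(text.trim())
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    // Reuses getDate() so the parsing rules live in one place; only the error handling differs.
    static boolean isReadableAsDate(String text) {
        try {
            getDate(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

The design point is the one raised in the review: the boolean check accepts exactly the strings the converting function accepts, because it delegates to it.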
[jira] [Commented] (DRILL-4551) Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate)
[ https://issues.apache.org/jira/browse/DRILL-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217072#comment-15217072 ]

ASF GitHub Bot commented on DRILL-4551:
---

Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/452#discussion_r57818552

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/DateTypeFunctions.java ---
    @@ -40,6 +41,36 @@
     public class DateTypeFunctions {

    +/**
    + * Function to check if a varchar value can be cast to a date.
    + *
    + * At the time of writing this function, several other databases were checked
    + * for behavior compatibility. There was not a consensus between oracle and
    + * Sql server about the expected behavior of this function, and Postgres
    + * lacks it completely.
    + *
    + * Sql Server appears to have both a DATEFORMAT and language locale setting
    + * that can change the values accepted by this function. Oracle appears to
    + * support several formats, some of which are not mentioned in the Sql
    + * Server docs. With the lack of standardization, we decided to implement
    + * this function so that it would only consider date strings that would be
    + * accepted by the cast function as valid.
    + */
    +@SuppressWarnings("unused")
    +@FunctionTemplate(name = "isdate", scope = FunctionTemplate.FunctionScope.SIMPLE, nulls=NullHandling.NULL_IF_NULL,
    --- End diff --

    Have you checked isdate() returns null for null input in other system like oracle? I thought it would return either true or false.

> Add some missing functions that are generated by Tableau (cot, regex_matches,
> split_part, isdate)
> -
>
> Key: DRILL-4551
> URL: https://issues.apache.org/jira/browse/DRILL-4551
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Jason Altekruse
> Assignee: Jason Altekruse
>
> Several of these functions do not appear to be standard SQL functions, but
> they are available in several other popular databases like SQL Server, Oracle
> and Postgres.
[jira] [Updated] (DRILL-4531) Query with filter and aggregate hangs in planning phase
[ https://issues.apache.org/jira/browse/DRILL-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun Chang updated DRILL-4531:
--
    Reviewer: Chun Chang
[jira] [Created] (DRILL-4554) Data type mismatch for union all with timestamp and date
Krystal created DRILL-4554:
--

Summary: Data type mismatch for union all with timestamp and date
Key: DRILL-4554
URL: https://issues.apache.org/jira/browse/DRILL-4554
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Reporter: Krystal
Assignee: Sean Hsuan-Yi Chu

Calcite and Drill apply different implicit casts when a union all query contains timestamp and date columns on both the left-hand and right-hand sides but in a different order.

select col_tmstmp,col_date, col_boln from `prqUnAll_0_v` union all select col_date, col_tmstmp, col_boln from `prqUnAll_1_v`

limit 0: select * from (select col_tmstmp,col_date, col_boln from `prqUnAll_0_v` union all select col_date, col_tmstmp, col_boln from `prqUnAll_1_v`) t limit 0

limit 0: [col_tmstmp, col_date, col_boln]
regular: [col_tmstmp, col_date, col_boln]

limit 0: [DATE, DATE, BOOLEAN]
regular: [TIMESTAMP, TIMESTAMP, BOOLEAN]

limit 0: [columnNullable, columnNullable, columnNullable]
regular: [columnNullable, columnNullable, columnNullable]
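The mismatch reported in DRILL-4554 comes down to two code paths resolving the same pair of column types differently. A minimal sketch of an order-independent resolution follows. Everything in it is hypothetical (`UnionTypeSketch`, `SqlType`, `commonType`, `unionAllTypes`), and it assumes that widening DATE to TIMESTAMP, as in the "regular" result above, is the intended outcome; the issue itself does not state which of the two results is correct.

```java
// Hypothetical sketch of symmetric type resolution for UNION ALL columns.
final class UnionTypeSketch {

    enum SqlType { BOOLEAN, DATE, TIMESTAMP }

    // Returns the same answer for (left, right) and (right, left).
    static SqlType commonType(SqlType left, SqlType right) {
        if (left == right) {
            return left;
        }
        boolean dateAndTimestamp =
            (left == SqlType.DATE && right == SqlType.TIMESTAMP)
                || (left == SqlType.TIMESTAMP && right == SqlType.DATE);
        if (dateAndTimestamp) {
            // Assumption: TIMESTAMP is the wider of the two types.
            return SqlType.TIMESTAMP;
        }
        throw new IllegalArgumentException("No implicit cast between " + left + " and " + right);
    }

    // Resolves the output type of each column position across the two branches.
    static SqlType[] unionAllTypes(SqlType[] leftBranch, SqlType[] rightBranch) {
        if (leftBranch.length != rightBranch.length) {
            throw new IllegalArgumentException("Branches must have the same number of columns");
        }
        SqlType[] out = new SqlType[leftBranch.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = commonType(leftBranch[i], rightBranch[i]);
        }
        return out;
    }
}
```

If both the limit 0 path and the regular path used one shared rule of this kind, the two reported type lists would agree regardless of the column order in each branch.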
[jira] [Created] (DRILL-4553) Joins using views are not returning results.
Anton Fernando created DRILL-4553: - Summary: Joins using views are not returning results. Key: DRILL-4553 URL: https://issues.apache.org/jira/browse/DRILL-4553 Project: Apache Drill Issue Type: Bug Affects Versions: 1.6.0, 1.5.0 Reporter: Anton Fernando I have the following three views: create view view1 as select . from where username=user; create view view2 as select . from view1 as a, as b where a.col1 = b.col1; create view view3 as select . from view1 as a, as b where a.col1 = b.col1; A select * from each of these views works fine and returns the expected results. A self join on view2 and view3 also works fine. However when view2 and view3 are joined on common keys there are no rows returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4552) Treat decimal literals as Double when type inference is taking place
Sean Hsuan-Yi Chu created DRILL-4552: Summary: Treat decimal literals as Double when type inference is taking place Key: DRILL-4552 URL: https://issues.apache.org/jira/browse/DRILL-4552 Project: Apache Drill Issue Type: Improvement Components: Query Planning & Optimization Reporter: Sean Hsuan-Yi Chu Assignee: Sean Hsuan-Yi Chu In the SQL standard, decimal literals (e.g., 1.2, 2.5) are of decimal type. However, Drill currently always converts them to Double in DrillOptiq. Since they will be converted to Double at execution anyway, we can treat them as Double at inference time to help determine the return types. (The current behavior is "not to do any inference if the operand is Decimal type".) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4551) Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate)
[ https://issues.apache.org/jira/browse/DRILL-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216933#comment-15216933 ] ASF GitHub Bot commented on DRILL-4551: --- GitHub user jaltekruse opened a pull request: https://github.com/apache/drill/pull/452 DRILL-4551: Implement new functions (cot, regex_matches, split_part, … …isdate) You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaltekruse/incubator-drill 4551-new-functions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/452.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #452 commit 0166aab070aa7175175b4a35162fc2502ea3cb90 Author: Jason Altekruse Date: 2016-03-28T18:55:11Z DRILL-4551: Implement new functions (cot, regex_matches, split_part, isdate) > Add some missing functions that are generated by Tableau (cot, regex_matches, > split_part, isdate) > - > > Key: DRILL-4551 > URL: https://issues.apache.org/jira/browse/DRILL-4551 > Project: Apache Drill > Issue Type: Improvement >Reporter: Jason Altekruse >Assignee: Jason Altekruse > > Several of these functions do not appear to be standard SQL functions, but > they are available in several other popular databases like SQL Server, Oracle > and Postgres. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4551) Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate)
Jason Altekruse created DRILL-4551: -- Summary: Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate) Key: DRILL-4551 URL: https://issues.apache.org/jira/browse/DRILL-4551 Project: Apache Drill Issue Type: Improvement Reporter: Jason Altekruse Assignee: Jason Altekruse Several of these functions do not appear to be standard SQL functions, but they are available in several other popular databases like SQL Server, Oracle and Postgres. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216900#comment-15216900 ] John Omernik commented on DRILL-4543: - Paul, should we open another JIRA for the data port issue? I am guessing that you will want that for YARN, just as I want it for Mesos. The kicker is that I don't have the dev background or team to implement it. (Control port + 1 is going to be an issue if a node running a Drill-bit dynamically allocates its control port but control port + 1 isn't available.) Let me know if you want to open a JIRA or want me to. I > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) 
> There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216875#comment-15216875 ] Paul Rogers commented on DRILL-4543: Thanks for the clarification. Might be a good idea for the docs on the Apache Drill site to point to the HOCON docs so folks know about this system. For YARN, when testing gets far enough, we'll try out the system property override for the YARN-relevant properties. > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. 
> For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216669#comment-15216669 ] John Omernik commented on DRILL-4543: - Interesting. That would keep them all in drill-env, though, so that's good. The only downside is that you could be making changes just in drill-env while someone else, or a previous change in drill-override, creates a situation where drill-override is stomping on what is set in system properties/env, right? Or am I looking at that wrong? I guess that's such an edge case it shouldn't matter. I just like being overtly explicit. This way, if I look at drill-override later, I am always reminded that my ports are set in drill-env. It's a tiny issue, but one I can see being a personal preference thing. > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) 
Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216654#comment-15216654 ] Jacques Nadeau commented on DRILL-4543: --- In general, I would recommend that you set the value using system properties (this overrides the drill-override.conf file). If you want to use environment variables, I'd pass them with system properties in drill-env as opposed to the drill-override.conf file. But that is probably just personal preference. > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. 
So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216644#comment-15216644 ] John Omernik commented on DRILL-4543: - I think one difference is that with HOCON we'd have to explicitly set the drill-override to have the ENV variable as its value. I guess this shouldn't hurt. We'd just have to make two changes: define what we want as ENV variables, and then update the drill-override on each Drill-bit so the setting uses the ENV variable instead of a hard-coded port. From a simplicity point of view, I always saw the drill-override as a cluster-wide settings file, and it becomes harder to read/grok if there are lots of variables. On the other hand, it's not more difficult than doing things implicitly with defaults. Therefore, as long as Drill is OK with the different settings, all we have to do is define the environment variables we want to use and set them, while also updating the drill-override. Then yes, the only other thing is the data port: getting the ability to explicitly set that. > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. 
Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216617#comment-15216617 ] Jacques Nadeau commented on DRILL-4543: --- The configuration system already works this way (see the HOCON documentation). The only ask I see here from John is being able to configure the data port (rather than control + 1) > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. 
> For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
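The data port request discussed in this thread (an explicit data port instead of control port + 1) can be sketched as follows. This is only an illustration of the fallback rule described in the comments; the function and parameter names are hypothetical and do not correspond to Drill configuration keys.

```python
def resolve_data_port(control_port, configured_data_port=None):
    """Pick the data port for a Drill-bit.

    Uses the explicitly configured port when one is given; otherwise falls
    back to the control port + 1 behavior described in the thread.
    """
    if configured_data_port is not None:
        return configured_data_port
    return control_port + 1


# Port numbers below are arbitrary illustration values.
derived = resolve_data_port(5000)
explicit = resolve_data_port(5000, configured_data_port=6200)
```

With an explicit value, a scheduler such as Mesos or YARN could hand out any free port, so a busy control port + 1 would no longer block startup.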
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216552#comment-15216552 ] John Omernik commented on DRILL-4543: - Make sure the second part of my comment is also realized. I think if we don't fix that, both YARN and MESOS face challenges. Not sure if that should be a separate JIRA or not. John > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. 
> For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216522#comment-15216522 ] Paul Rogers commented on DRILL-4543: Great idea! Perhaps generalize this to a configuration stack: * System properties (-DpropName=value) * Env var: DRILL_PROP_NAME=value * Drill site config: "propName": "value" * Defaults The idea is that items higher in the order take precedence over items lower in the order. Mesos can override values with env vars. YARN can use either env vars or command-line args (-D...). Then, when the values are needed by other Drill-bits, this particular Drill-bit uses ZK to advertise its actual values as computed using the override rules. YARN (or Mesos) can adjust ports and resources, and other Drill-bits can learn of those customizations. > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. 
(Perhaps some are newer, some are older or > more heavily loaded.) Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
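The configuration stack proposed in the comment above (system properties, then env vars, then site config, then defaults) can be sketched as a simple ordered lookup. This is a minimal illustration, not Drill's configuration code. The camelCase to DRILL_UPPER_SNAKE mapping is an assumption extrapolated from the single propName / DRILL_PROP_NAME example in the comment, and all function names are hypothetical.

```python
import re


def env_var_name(prop_name):
    """Map a camelCase property name to a DRILL_-prefixed env var name.

    Hypothetical convention, inferred from propName -> DRILL_PROP_NAME.
    """
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", prop_name)
    return "DRILL_" + snake.upper()


def resolve(prop_name, system_props, env, site_config, defaults):
    """Return the effective value; earlier sources take precedence."""
    if prop_name in system_props:
        return system_props[prop_name]
    env_name = env_var_name(prop_name)
    if env_name in env:
        return env[env_name]
    if prop_name in site_config:
        return site_config[prop_name]
    return defaults[prop_name]


# Illustration values only.
defaults = {"userPort": "1000"}
site = {"userPort": "2000"}
from_env = resolve("userPort", {}, {"DRILL_USER_PORT": "3000"}, site, defaults)
from_sys = resolve("userPort", {"userPort": "4000"}, {"DRILL_USER_PORT": "3000"}, site, defaults)
```

The value returned by `resolve` is what a Drill-bit would then advertise in ZK under this proposal, so peers see the post-override value rather than the shared config file value.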
[jira] [Commented] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
[ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216514#comment-15216514 ] Paul Rogers commented on DRILL-4543: Thanks, Jacques. I see my confusion. When using the ZK client, I see the host name, but ports appear as "noise". They are encoded as a Protobuf block and are thus visible only to Drill code (or other code that knows how to decode the Protobuf format.) So, let me rephrase the port suggestion: store ports in an easily readable format such as plain text: drill://host-name:123:456:789 Or, if ZK allows it (can auto remove a subtree as well as a single node), as children of the znode: - host-name: my-host - user-port: 123 - data-port: 456 - ... - memory-mb: 128 - cores: 5 > Advertise Drill-bit ports, status, capabilities in ZooKeeper > > > Key: DRILL-4543 > URL: https://issues.apache.org/jira/browse/DRILL-4543 > Project: Apache Drill > Issue Type: Improvement > Components: Server >Reporter: Paul Rogers > Fix For: 2.0.0 > > > Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, > providing just the host name/IP Address of the Drill-bit. All other > information (ports, status, capabilities) are assumed to be the same across > all Drill-bits in the cluster as specified in the Drill config file. > Moving forward, as Drill becomes more sophisticated, Drill should advertise > the specifics of each Drill-bit so that one Drill bit can differ from another. > For example, when running on YARN, we need a way for Drill to gracefully shut > down. Advertising a status of Ready or Unavailable will help. Ready is the > normal state. Unavailable means the Drill-bit will finish in-flight queries, > but won't accept new ones. (The actual status is a separate enhancement.) > In a YARN cluster, Drill should take advantage of machines with more memory, > but live with machines with less. (Perhaps some are newer, some are older or > more heavily loaded.) 
Drill should use ZK to identify its available memory > and CPUs so that the planner can use them. (Use of the info is a separate > enhancement.) > There may be times when two drill bits run on a single machine. If so, they > must use separate ports. So, each Drill-bit should advertise its ports in ZK. > For backward compatibility, the information is optional; if not present, the > receiver should assume the information defaults to that in the config file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
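The plain-text endpoint format suggested in the comment above, together with the backward-compatibility rule from the issue description (missing values default to the config file), can be sketched like this. It is an illustration only: the thread does not say which port each position in `drill://host-name:123:456:789` holds, so the parser returns the ports as an ordered list, and the helper names are hypothetical.

```python
def parse_endpoint(text):
    """Split 'drill://host:port:port:...' into (host, [ports])."""
    prefix = "drill://"
    if not text.startswith(prefix):
        raise ValueError(f"not a drill:// endpoint: {text!r}")
    host, *ports = text[len(prefix):].split(":")
    return host, [int(port) for port in ports]


def effective_settings(advertised, config_defaults):
    """Overlay values advertised in ZK on top of the config-file defaults.

    Any key the Drill-bit did not advertise keeps its config-file value.
    """
    merged = dict(config_defaults)
    merged.update(advertised)
    return merged


host, ports = parse_endpoint("drill://host-name:123:456:789")

# Key names follow the znode children listed in the comment; values are arbitrary.
config_defaults = {"user-port": 1000, "data-port": 2000}
settings = effective_settings({"user-port": 123}, config_defaults)
```

Note that splitting on ":" would not work for a bare IPv6 host address, which is one reason the per-field znode children layout in the comment may be the more robust option.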
[jira] [Commented] (DRILL-1170) YARN support for Drill
[ https://issues.apache.org/jira/browse/DRILL-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216504#comment-15216504 ] Paul Rogers commented on DRILL-1170: We have considered Slider. Several factors nudged us in the direction of writing an AM directly on YARN: 1. Slider has much documentation, but it is incomplete and out-of-date in important places. 2. We could make up for the documentation by reading the source code. However, Slider is composed of a large amount of Python code. Our team are mostly Java developers. If we have to learn a bunch of code, we might as well learn YARN directly. 3. Drill needs certain features that Slider does not (yet) provide, such as monitoring ZooKeeper to track Drill-bit health, perhaps offering a connection proxy, etc. 4. Slider is a general-purpose tool with many cool features. As it turns out, many are not needed for Drill. This means that Slider introduces a bit of unnecessary complexity for Drill admins. 5. Slider adds its own level of configuration files on top of those that we'd need for Drill. Not a big issue, but it is just additional complexity for Drill admins to learn and manage. In balance, we like where Slider is going. Those Drill users who want to roll-their-own YARN integration should certainly give Slider a try as a short-term solution. This is particularly true for shops that already use Slider for other apps. On balance, however, Drill has a number of specialized needs that would seem to justify the cost of a custom AM. We will, of course, continue to revisit the issue as analysis proceeds. > YARN support for Drill > -- > > Key: DRILL-1170 > URL: https://issues.apache.org/jira/browse/DRILL-1170 > Project: Apache Drill > Issue Type: New Feature >Reporter: Neeraja >Assignee: Paul Rogers > Fix For: Future > > > This is a tracking item to make Drill work with YARN. > Below are a few requirements/needs to consider. 
> - Drill should run as a YARN-based application, side by side with other YARN > enabled applications (on the same nodes or different nodes). Both memory and CPU > resources of Drill should be controlled in this mechanism. > - As a YARN-enabled application, Drill resource consumption should be > adaptive to the load on the cluster. For example: when there is no load on > Drill, Drill should consume no resources on the cluster. As the load on > Drill increases, resources permitting, usage should grow proportionally. > - Low latency is a key requirement for Apache Drill along with support for > multiple users (concurrency in 100s-1000s). This should be supported when run > as a YARN application as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3178) csv reader should allow newlines inside quotes
[ https://issues.apache.org/jira/browse/DRILL-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216384#comment-15216384 ] Daniel Reznick commented on DRILL-3178: --- As drill is meant for working with data in place, having to pre-process files prior to use with drill is counter-productive. Drill should work hard to read data as is when possible, and as noted many other tools both read and write delimited content with newlines in quoted fields. > csv reader should allow newlines inside quotes > --- > > Key: DRILL-3178 > URL: https://issues.apache.org/jira/browse/DRILL-3178 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Text & CSV >Affects Versions: 1.0.0 > Environment: Ubuntu Trusty 14.04.2 LTS >Reporter: Neal McBurnett > Fix For: Future > > > When reading a csv file which contains newlines within quoted strings, e.g. > via > select * from dfs.`/tmp/q.csv`; > Drill 1.0 says: > Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException: > Error processing input: Cannot use newline character within quoted string > But many tools produce csv files with newlines in quoted strings. Drill > should be able to handle them. > Workaround: the csvquote program (https://github.com/dbro/csvquote) can > encode embedded commas and newlines, and even decode them later if desired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
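As a point of comparison for the behavior requested in DRILL-3178, here is a small sketch using Python's standard `csv` module, which accepts newlines inside quoted fields. It shows the parsing behavior the issue asks for; it says nothing about how Drill's text reader is implemented, and the sample data is made up for illustration.

```python
import csv
import io

# One header row and two records; the first record has a newline inside quotes.
data = 'id,comment\r\n1,"line one\nline two"\r\n2,plain\r\n'

# newline="" stops the text layer from translating line endings,
# so the csv reader sees the embedded newline exactly as written.
rows = list(csv.reader(io.StringIO(data, newline="")))

# A naive split on line boundaries breaks the quoted field in two.
naive_lines = data.splitlines()
```

The quote-aware reader yields three rows while the naive split yields four fragments, which is the gap between record boundaries and physical line boundaries that a CSV reader has to handle.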
[jira] [Commented] (DRILL-4544) Improve error messages for REFRESH TABLE METADATA command
[ https://issues.apache.org/jira/browse/DRILL-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216357#comment-15216357 ] ASF GitHub Bot commented on DRILL-4544: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/448#issuecomment-203005531 Let's open a follow-up bug to move this to Calcite and get it into Drill for now. > Improve error messages for REFRESH TABLE METADATA command > - > > Key: DRILL-4544 > URL: https://issues.apache.org/jira/browse/DRILL-4544 > Project: Apache Drill > Issue Type: Improvement > Components: Metadata >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Minor > Fix For: 1.7.0 > > > Improve the error messages thrown by the REFRESH TABLE METADATA command: > In the first case below, the error is that maprfs.abc doesn't exist. It should > throw an "Object not found" or "workspace not found" error. It is currently throwing an > unhelpful message: > 0: jdbc:drill:> refresh table metadata maprfs.abc.`my_table`; > + > ok | summary > + > false | Error: null > + > 1 row selected (0.355 seconds) > In the second case below, it says refresh table metadata is supported only > for single-directory based Parquet tables. But the command works for nested > multi-directory Parquet files. > 0: jdbc:drill:> refresh table metadata maprfs.vnaranammalpuram.`rfm_sales_vw`; > ---+ > ok | summary > ---+ > false | Table rfm_sales_vw does not support metadata refresh. Support is > currently limited to single-directory-based Parquet tables. > ---+ > 1 row selected (0.418 seconds) > 0: jdbc:drill:> -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3993) Rebase Drill on Calcite master branch
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3993: -- Summary: Rebase Drill on Calcite master branch (was: Rebase Drill on Calcite 1.7.0 release) > Rebase Drill on Calcite master branch > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam >Assignee: Jacques Nadeau > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term?
[jira] [Updated] (DRILL-4403) AssertionError: Internal error: Conversion to relational algebra failed to preserve datatypes
[ https://issues.apache.org/jira/browse/DRILL-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Harnyk updated DRILL-4403: Assignee: Taras Supyk (was: Serge Harnyk) > AssertionError: Internal error: Conversion to relational algebra failed to > preserve datatypes > -- > > Key: DRILL-4403 > URL: https://issues.apache.org/jira/browse/DRILL-4403 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.5.0 >Reporter: N Campbell >Assignee: Taras Supyk > > select rnum, c1, c2, c3, stddev_pop( c3 ) over(partition by c1) from > postgres.public.tolap > Error: SYSTEM ERROR: AssertionError: Internal error: Conversion to relational > algebra failed to preserve datatypes: > validated type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, INTEGER EXPR$4) NOT NULL > converted type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, DOUBLE EXPR$4) NOT NULL > rel: > LogicalProject(rnum=[$0], c1=[$1], c2=[$2], c3=[$3], > EXPR$4=[POWER(/(CastHigh(-(SUM(*(CastHigh($3), CastHigh($3))) OVER (PARTITION > BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), > /(*(SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING > AND UNBOUNDED FOLLOWING), SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE > BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)), COUNT(CastHigh($3)) > OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED > FOLLOWING, COUNT(CastHigh($3)) OVER (PARTITION BY $1 RANGE BETWEEN > UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)), 0.5)]) > LogicalTableScan(table=[[postgres, public, tolap]]) > [Error Id: 61be4aa1-6486-4118-a82b-86c22b551bb5 on centos1:31010] > 
(org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception > during fragment initialization: Internal error: Conversion to relational > algebra failed to preserve datatypes: > validated type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, INTEGER EXPR$4) NOT NULL > converted type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, DOUBLE EXPR$4) NOT NULL > rel: > LogicalProject(rnum=[$0], c1=[$1], c2=[$2], c3=[$3], > EXPR$4=[POWER(/(CastHigh(-(SUM(*(CastHigh($3), CastHigh($3))) OVER (PARTITION > BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), > /(*(SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING > AND UNBOUNDED FOLLOWING), SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE > BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)), COUNT(CastHigh($3)) > OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED > FOLLOWING, COUNT(CastHigh($3)) OVER (PARTITION BY $1 RANGE BETWEEN > UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)), 0.5)]) > LogicalTableScan(table=[[postgres, public, tolap]]) > org.apache.drill.exec.work.foreman.Foreman.run():261 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 > Caused By (java.lang.AssertionError) Internal error: Conversion to > relational algebra failed to preserve datatypes: > validated type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, INTEGER EXPR$4) NOT NULL > converted type: > RecordType(INTEGER NOT NULL rnum, CHAR(3) 
CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c1, CHAR(2) CHARACTER SET "ISO-8859-1" COLLATE > "ISO-8859-1$en_US$primary" c2, INTEGER c3, DOUBLE EXPR$4) NOT NULL > rel: > LogicalProject(rnum=[$0], c1=[$1], c2=[$2], c3=[$3], > EXPR$4=[POWER(/(CastHigh(-(SUM(*(CastHigh($3), CastHigh($3))) OVER (PARTITION > BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), > /(*(SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING > AND UNBOUNDED FOLLOWING), SUM(CastHigh($3)) OVER (PARTITION BY $1 RANGE > BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)), COUNT(CastHigh($3)) > OVER (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED > FOLLOWING,
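For readers decoding the rel expression quoted in DRILL-4403, the nested POWER, SUM and COUNT calls appear to spell out the usual one-pass population standard deviation. The following is a reading of the quoted plan, not a statement about Calcite's implementation; here x stands for the c3 column ($3) within one c1 partition and n for COUNT over that partition.

```latex
\[
\operatorname{stddev\_pop}(x)
  = \left( \frac{\sum_{i=1}^{n} x_i^{2} \;-\; \dfrac{\left(\sum_{i=1}^{n} x_i\right)^{2}}{n}}{n} \right)^{0.5}
\]
```

Raising to the power 0.5 generally yields a non-integer result, which seems consistent with the converted type reporting DOUBLE EXPR$4 while the validated type reports INTEGER EXPR$4.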
[jira] [Commented] (DRILL-4405) invalid Postgres SQL generated for CONCAT (literal, literal)
[ https://issues.apache.org/jira/browse/DRILL-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216175#comment-15216175 ] Serge Harnyk commented on DRILL-4405: - Doesn't reproduce in Drill 1.7. Solved in DRILL-4372 > invalid Postgres SQL generated for CONCAT (literal, literal) > - > > Key: DRILL-4405 > URL: https://issues.apache.org/jira/browse/DRILL-4405 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.5.0 >Reporter: N Campbell >Assignee: Serge Harnyk > > select concat( 'FF' , 'FF' ) from postgres.public.tversion > Error: DATA_READ ERROR: The JDBC storage plugin failed while trying setup the > SQL query. > sql SELECT CAST('' AS ANY) AS "EXPR$0" > FROM "public"."tversion" > plugin postgres > Fragment 0:0 > [Error Id: c3f24106-8d75-4a57-a638-ac5f0aca0769 on centos1:31010] > (org.postgresql.util.PSQLException) ERROR: syntax error at or near "ANY" > Position: 23 > org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse():2182 > org.postgresql.core.v3.QueryExecutorImpl.processResults():1911 > org.postgresql.core.v3.QueryExecutorImpl.execute():173 > org.postgresql.jdbc.PgStatement.execute():622 > org.postgresql.jdbc.PgStatement.executeWithFlags():458 > org.postgresql.jdbc.PgStatement.executeQuery():374 > org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 > org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 > org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup():177 > org.apache.drill.exec.physical.impl.ScanBatch.():108 > org.apache.drill.exec.physical.impl.ScanBatch.():136 > org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():40 > org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():33 > org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():147 > org.apache.drill.exec.physical.impl.ImplCreator.getChildren():170 > org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():101 > org.apache.drill.exec.physical.impl.ImplCreator.getExec():79 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():230 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1142 > java.util.concurrent.ThreadPoolExecutor$Worker.run():617 > java.lang.Thread.run():745 > SQLState: null > ErrorCode: 0
[jira] [Commented] (DRILL-4409) projecting literal will result in an empty resultset
[ https://issues.apache.org/jira/browse/DRILL-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216143#comment-15216143 ] ASF GitHub Bot commented on DRILL-4409: --- GitHub user Serge-Harnyk opened a pull request: https://github.com/apache/drill-site/pull/1 DRILL-4409 - Add notice about Postgres typing of literals All in comments here https://issues.apache.org/jira/browse/DRILL-4409 You can merge this pull request into a Git repository by running: $ git pull https://github.com/Serge-Harnyk/drill-site asf-site Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill-site/pull/1.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1 > projecting literal will result in an empty resultset > > > Key: DRILL-4409 > URL: https://issues.apache.org/jira/browse/DRILL-4409 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.5.0 >Reporter: N Campbell >Assignee: Serge Harnyk > > A query which projects a literal as shown against a Postgres table will > result in an empty result set being returned. > select 'BB' from postgres.public.tversion -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3878) Support XML Querying (selects/projections, no writing)
[ https://issues.apache.org/jira/browse/DRILL-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216103#comment-15216103 ] ASF GitHub Bot commented on DRILL-3878: --- GitHub user magpierre opened a pull request: https://github.com/apache/drill/pull/451 Drill 3878

Please review my fix for JIRA DRILL-3878, which provides XML support for Apache Drill. The fix utilizes the existing support for JSON by converting XML to JSON using a simple SAX parser built for the purpose. The parser tries to produce acceptable JSON documents that are then fed into the JSONRecordReader for further processing.

To add XML support to Apache Drill, please include the built package in the 3rdparty folder of the built Apache Drill environment, and start. Add: "xml": { "type": "xml", "extensions": [ "xml" ], "keepPrefix": true } to the type section in dfs (keepPrefix = false will remove the namespace from tags in Apache Drill, since namespaces can be named differently between documents and are not really part of the tag name).

The parser tries to be nice to Drill / JSON Reader by avoiding mixing types, arranging recurring values in arrays, and removing empty elements. This is in order to minimize the number of JSON errors due to the different nature of XML and Drill.

Convention in JSON: Attributes are named using the convention @ followed by the attribute name, and store simple values. All other objects are stored as objects with a #value field. This somewhat conforms with Apache Spark XML, but I need to store all values in objects in order to avoid as many map-of-different-type problems as possible.

Current limitations: DTD tags are currently not liked. Schema is not validated against XSDs.

Also: Since I am not a Drill developer, I might have broken all possible rules of syntax, format, layout, test frameworks, as well as how to submit pull requests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/magpierre/drill DRILL-3878
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/451.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #451

commit 844f34a16e75719535ff94c54d5337746ea18c20 Author: MPierre Date: 2015-11-05T14:42:06Z Initial commit XML support in Apache Drill
commit 592b3af06c2ff45198136577561f2ec1f7caaee0 Author: MPierre Date: 2015-11-05T21:21:42Z Fixed some minor outstanding bugs EasyRecordReader have a new field userName, and I forgot to change jsonProcessor to protected from private.
commit 8fad811edab43d3499b41bb66cb419248d11208f Author: MPierre Date: 2015-11-09T08:59:08Z Merge remote-tracking branch 'apache/master' into DRILL-3878
commit 38f4884fe9b8456c1cde5de44c1e54177301a974 Author: MPierre Date: 2016-03-16T11:33:15Z Syncing to latest release of drill
commit 909c5dec8bdb01bfe0ed358ebc64c959785738df Author: MPierre Date: 2016-03-16T11:34:10Z syncing to latest release of drill
commit 597d9657d613fa35df2c10dff23681545b13e531 Author: MPierre Date: 2016-03-18T08:55:51Z Cleaned up deliver Cleaned up the output generated by the SAX Parser, and removed all unnecessary code.
commit 0cfaa31ab9af89833417288a290d21d0ce88c4ac Author: MPierre Date: 2016-03-18T10:29:51Z Merge remote-tracking branch 'apache/master' into DRILL-3878
commit aaaff05eb921125ad64854c89c179292c4441fb7 Author: MPierre Date: 2016-03-24T13:05:53Z Adjusted output from Parser to fit Drill better I have adjusted the SAX parser to produce JSON that Drill likes. Among the things corrected is to remove empty objects from the tree built. And to consolidate repeating values in arrays.
commit ba19a356d850224c01b9e807183377b46cf7e545 Author: MPierre Date: 2016-03-24T13:10:57Z Fixed small typo
commit 8ba6705be42c7847d469611ab070b869e0c76d8c Author: MPierre Date: 2016-03-24T21:17:30Z Further enhancements of the output format to fit Drill
commit e2273f13b8e0136a33c1576c4667f16e23e1631c Author: MPierre Date: 2016-03-24T21:22:41Z Removed comment
commit c1b6ff8375a7e3c8161167d1a5f2b34ba165e750 Author: MPierre Date: 2016-03-29T12:48:53Z Merge remote-tracking branch 'apache/master' into DRILL-3878

> Support XML Querying (selects/projections, no writing)
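The JSON convention described in the DRILL-3878 pull request (attributes stored under @name, element text under #value, repeated elements gathered into arrays, empty elements dropped) can be sketched roughly as follows. This is an illustrative Python approximation built on the standard xml.sax module, not the Java parser from the pull request; the class XmlToDictHandler, the function xml_to_json and the sample document are hypothetical, and the keepPrefix option is not modeled.

```python
import json
import xml.sax


class XmlToDictHandler(xml.sax.ContentHandler):
    """Hypothetical handler; not the parser from the pull request."""

    def __init__(self):
        super().__init__()
        self.root = {}
        # One (object, text chunks) pair per currently open element.
        self.stack = [(self.root, [])]

    def startElement(self, name, attrs):
        # Attributes become "@name" keys holding simple values.
        node = {"@" + key: value for key, value in attrs.items()}
        self.stack.append((node, []))

    def characters(self, content):
        self.stack[-1][1].append(content)

    def endElement(self, name):
        node, chunks = self.stack.pop()
        text = "".join(chunks).strip()
        if text:
            node["#value"] = text  # element text lives in a "#value" field
        if not node:
            return  # drop empty elements
        parent = self.stack[-1][0]
        if name not in parent:
            parent[name] = node
        elif isinstance(parent[name], list):
            parent[name].append(node)
        else:
            # Second occurrence: consolidate repeats into an array.
            parent[name] = [parent[name], node]


def xml_to_json(xml_text):
    handler = XmlToDictHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return json.dumps(handler.root, sort_keys=True)


SAMPLE = '<order id="7"><item>pen</item><item>ink</item><note/></order>'
print(xml_to_json(SAMPLE))
```

Note that this sketch emits a single object for an element that occurs once and an array when it repeats, so the same tag can take different shapes across documents; the pull request says its parser tries to avoid such type mixing, and how it does so is not shown here.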
[jira] [Updated] (DRILL-4458) JDBC plugin case sensitive table names
[ https://issues.apache.org/jira/browse/DRILL-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Harnyk updated DRILL-4458: External issue ID: DRILL-3993 > JDBC plugin case sensitive table names > -- > > Key: DRILL-4458 > URL: https://issues.apache.org/jira/browse/DRILL-4458 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.5.0 > Environment: Drill embedded mode on OSX, connecting to MS SQLServer >Reporter: Paul Mogren >Assignee: Serge Harnyk >Priority: Minor > > I just tried Drill with MS SQL Server and I found that Drill treats table > names case-sensitively, contrary to > https://drill.apache.org/docs/lexical-structure/ which indicates that > table names are "case-insensitive unless enclosed in double quotation > marks”. This presents a problem for users and existing SQL scripts that > expect table names to be case-insensitive. > This works: select * from mysandbox.dbo.AD_Role > This does not work: select * from mysandbox.dbo.ad_role > Mailing list reference including stack trace: > http://mail-archives.apache.org/mod_mbox/drill-user/201603.mbox/%3ccajrw0otv8n5ybmvu6w_efe4npgenrdk5grmh9jtbxu9xnni...@mail.gmail.com%3e -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted
[ https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215507#comment-15215507 ] Deneche A. Hakim commented on DRILL-3714: - [~jacq...@dremio.com] assuming this is indeed the test case requested, here is my understanding: The fix proposed for this JIRA is helpful for any message waiting, in the CoordinationQueue, for an ACK. In the case of the UserClient, it only really waits for the handshake and the queryId for every submitted query; other than that, the CoordinationQueue should be empty most of the time. The fix I proposed here may help the UserClient fail quickly if it's waiting for a queryId, but other than that it will still hang for any query for which it hasn't yet received the terminal state (DRILL-3743). It's a problem worth fixing, but I think it should be done separately. > Query runs out of memory and remains in CANCELLATION_REQUESTED state until > drillbit is restarted > > > Key: DRILL-3714 > URL: https://issues.apache.org/jira/browse/DRILL-3714 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.2.0 >Reporter: Victoria Markman >Assignee: Jacques Nadeau >Priority: Critical > Fix For: 1.7.0 > > Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, > jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json > > > This is a variation of DRILL-3705 with the difference of drill behavior when > hitting OOM condition. > Query runs out of memory during execution and remains in > "CANCELLATION_REQUESTED" state until drillbit is bounced. > Client (sqlline in this case) never gets a response from the server. > Reproduction details: > Single node drillbit installation. 
> DRILL_MAX_DIRECT_MEMORY="8G" > DRILL_HEAP="4G" > Run this query on TPCDS SF100 data set > {code} > SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS > TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 > LIMIT 10; > {code} > drillbit.log > {code} > 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO > o.a.d.e.w.f.FragmentStatusReporter - > 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING > 2015-08-26 16:55:50,498 [BitServer-5] WARN > o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 > took longer than 500ms. Actual duration was 2569ms. > 2015-08-26 16:56:31,086 [BitServer-5] ERROR > o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. > Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server). > Closing connection. > io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct > buffer memory > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) > ~[netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) > [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > [netty-common-4.0.27.Final.jar:4.0.27.Final] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > Caused by: java.lang.OutOfMemoryError: Direct buffer memory > at