[jira] [Created] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
Boaz Ben-Zvi created DRILL-7405:
---------------------------------
             Summary: Build fails due to inaccessible apache-drill on S3 storage
                 Key: DRILL-7405
                 URL: https://issues.apache.org/jira/browse/DRILL-7405
             Project: Apache Drill
          Issue Type: Bug
          Components: Tools, Build & Test
    Affects Versions: 1.16.0
            Reporter: Boaz Ben-Zvi
            Assignee: Abhishek Girish

A new clean build (e.g., after deleting the ~/.m2 local repository) now fails due to:

Access denied to: http://apache-drill.s3.amazonaws.com (e.g., for the test data sf-0.01_tpc-h_parquet_typed.tgz)

A new publicly available storage place is needed, plus appropriate changes in Drill to reach these resources.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Resolved] (DRILL-7170) IllegalStateException: Record count not set for this vector container
[ https://issues.apache.org/jira/browse/DRILL-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boaz Ben-Zvi resolved DRILL-7170.
---------------------------------
    Reviewer: Sorabh Hamirwasia
  Resolution: Fixed

> IllegalStateException: Record count not set for this vector container
> ---------------------------------------------------------------------
>
>                 Key: DRILL-7170
>                 URL: https://issues.apache.org/jira/browse/DRILL-7170
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Sorabh Hamirwasia
>            Assignee: Boaz Ben-Zvi
>            Priority: Major
>             Fix For: 1.17.0
>
> {code:java}
> Query:
> /root/drillAutomation/master/framework/resources/Advanced/tpcds/tpcds_sf1/original/maprdb/json/query95.sql
> WITH ws_wh AS
> (
>   SELECT ws1.ws_order_number,
>          ws1.ws_warehouse_sk wh1,
>          ws2.ws_warehouse_sk wh2
>   FROM   web_sales ws1,
>          web_sales ws2
>   WHERE  ws1.ws_order_number = ws2.ws_order_number
>   AND    ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
> SELECT Count(DISTINCT ws_order_number) AS `order count`,
>        Sum(ws_ext_ship_cost)           AS `total shipping cost`,
>        Sum(ws_net_profit)              AS `total net profit`
> FROM   web_sales ws1,
>        date_dim,
>        customer_address,
>        web_site
> WHERE  d_date BETWEEN '2000-04-01' AND (Cast('2000-04-01' AS DATE) + INTERVAL '60' day)
> AND    ws1.ws_ship_date_sk = d_date_sk
> AND    ws1.ws_ship_addr_sk = ca_address_sk
> AND    ca_state = 'IN'
> AND    ws1.ws_web_site_sk = web_site_sk
> AND    web_company_name = 'pri'
> AND    ws1.ws_order_number IN (SELECT ws_order_number FROM ws_wh)
> AND    ws1.ws_order_number IN (SELECT wr_order_number
>                                FROM   web_returns, ws_wh
>                                WHERE  wr_order_number = ws_wh.ws_order_number)
> ORDER BY count(DISTINCT ws_order_number)
> LIMIT 100
>
> Exception:
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Record count not
> set for this vector container
> Fragment 2:3
> Please, refer to logs for more information.
> [Error Id: 4ed92fce-505b-40ba-ac0e-4a302c28df47 on drill87:31010]
>
> (java.lang.IllegalStateException) Record count not set for this vector container
>   org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState():459
>   org.apache.drill.exec.record.VectorContainer.getRecordCount():394
>   org.apache.drill.exec.record.RecordBatchSizer.<init>():720
>   org.apache.drill.exec.record.RecordBatchSizer.<init>():704
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():462
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():964
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():973
>   org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():601
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():1313
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():1105
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():525
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.record.AbstractRecordBatch.next():116
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.test.generated.HashAggregatorGen1068899.doWork():642
>   org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():296
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.record.AbstractRecordBatch.next():116
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():104
>   org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():94
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
>   org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
>   java.security.AccessController.doPrivileged():-2
>   javax.security.auth.Subject.doAs():422
>   org.apache.hadoop.security.UserGroupInformation.doAs():1669
>   org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
>   org.apache.drill.common.SelfCleaningRunnable.run():38
> {code}
[jira] [Created] (DRILL-7244) Run-time rowgroup pruning match() fails on casting a Long to an Integer
Boaz Ben-Zvi created DRILL-7244: --- Summary: Run-time rowgroup pruning match() fails on casting a Long to an Integer Key: DRILL-7244 URL: https://issues.apache.org/jira/browse/DRILL-7244 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Affects Versions: 1.17.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.17.0 See DRILL-7062, where a temporary workaround was created, skipping pruning (and logging) instead of this failure: After a Parquet table is refreshed with selected "interesting" columns, a query whose WHERE clause contains a condition on a "non interesting" INT64 column fails during run-time pruning (calling match()) with: {noformat} org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: ClassCastException: java.lang.Long cannot be cast to java.lang.Integer {noformat} A long term solution is to pass the whole (or the relevant part of the) schema to the runtime, instead of just passing the "interesting" columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7240) Run-time rowgroup pruning match() fails on casting a Long to an Integer
Boaz Ben-Zvi created DRILL-7240:
---------------------------------
             Summary: Run-time rowgroup pruning match() fails on casting a Long to an Integer
                 Key: DRILL-7240
                 URL: https://issues.apache.org/jira/browse/DRILL-7240
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.17.0
            Reporter: Boaz Ben-Zvi
            Assignee: Boaz Ben-Zvi
             Fix For: 1.17.0

After a Parquet table is refreshed with selected "interesting" columns, a query whose WHERE clause contains a condition on a "non interesting" INT64 column fails during run-time pruning (when calling match()) with:

{noformat}
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
{noformat}

Near-term fix suggestion: Catch the match() exception and do not prune (i.e., run-time pruning would be disabled in such cases).
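The near-term fix can be sketched as a guard around the match() evaluation. A minimal illustration only (hypothetical names; not the actual Parquet pruning code):

```java
// Sketch of the suggested behavior: if evaluating the row-group filter throws
// a ClassCastException (e.g., Long statistics cast to Integer), fall back to
// "keep the row group" instead of failing the query. Names are hypothetical.
public class PruneGuard {
    /** Returns true when the row group may match (i.e., must be kept). */
    public static boolean safeMatch(java.util.function.Supplier<Boolean> match) {
        try {
            return match.get();
        } catch (ClassCastException e) {
            // Be conservative: a failed evaluation means "may match", so do not prune.
            return true;
        }
    }
}
```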
[jira] [Created] (DRILL-7173) Analyze table may fail when prefer_plain_java is set to true on codegen for resetValues
Boaz Ben-Zvi created DRILL-7173:
---------------------------------
             Summary: Analyze table may fail when prefer_plain_java is set to true on codegen for resetValues
                 Key: DRILL-7173
                 URL: https://issues.apache.org/jira/browse/DRILL-7173
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Codegen
    Affects Versions: 1.15.0
         Environment: *prefer_plain_java: true*
            Reporter: Boaz Ben-Zvi
             Fix For: 1.17.0

The *prefer_plain_java* compile option is useful for debugging generated code (it can be set in drill-override.conf; the default value is false). When set to true, some "analyze table" calls generate code that fails to compile, due to the addition of a SchemaChangeException which is not declared in the Streaming Aggr template. For example:

{noformat}
apache drill (dfs.tmp)> create table lineitem3 as select * from cp.`tpch/lineitem.parquet`;
+----------+---------------------------+
| Fragment | Number of records written |
+----------+---------------------------+
| 0_0      | 60175                     |
+----------+---------------------------+
1 row selected (2.06 seconds)
apache drill (dfs.tmp)> analyze table lineitem3 compute statistics;
Error: SYSTEM ERROR: CompileException: File 'org.apache.drill.exec.compile.DrillJavaFileObject[StreamingAggregatorGen4.java]', Line 7869, Column 20: StreamingAggregatorGen4.java:7869: error: resetValues() in org.apache.drill.exec.test.generated.StreamingAggregatorGen4 cannot override resetValues() in org.apache.drill.exec.physical.impl.aggregate.StreamingAggTemplate
public boolean resetValues()
^
overridden method does not throw org.apache.drill.exec.exception.SchemaChangeException (compiler.err.override.meth.doesnt.throw)
{noformat}
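For reference, the Java language rule behind this CompileException can be reproduced in miniature; the classes below only mimic the template/generated pair and are not Drill's code:

```java
// An override may not add a checked exception that the overridden method does
// not declare (compiler.err.override.meth.doesnt.throw). These toy classes
// stand in for StreamingAggTemplate and the generated StreamingAggregatorGen.
public class OverrideThrows {
    public static class SchemaChangeException extends Exception {}

    public static class Template {
        // Declares no checked exceptions, like the Streaming Aggr template.
        public boolean resetValues() { return true; }
    }

    public static class Generated extends Template {
        @Override
        public boolean resetValues() {
            // Writing "throws SchemaChangeException" here would fail to
            // compile, exactly as in the error message above.
            return false;
        }
    }
}
```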
[jira] [Created] (DRILL-7069) Poor performance of transformBinaryInMetadataCache
Boaz Ben-Zvi created DRILL-7069:
---------------------------------
             Summary: Poor performance of transformBinaryInMetadataCache
                 Key: DRILL-7069
                 URL: https://issues.apache.org/jira/browse/DRILL-7069
             Project: Apache Drill
          Issue Type: Improvement
          Components: Metadata
    Affects Versions: 1.15.0
            Reporter: Boaz Ben-Zvi
            Assignee: Boaz Ben-Zvi
             Fix For: 1.16.0

The performance of the method *transformBinaryInMetadataCache* scales poorly as the table's numbers of underlying files, row-groups and columns grow. This method is invoked during the planning of every query that uses the table. A test on a table with 219 directories (each with 20 files), 1 row-group per file, and 94 columns measured about *1340 milliseconds*.

The main culprit is the version checks, which take place in *every iteration* (i.e., about 400K times in the example above) and involve the construction of 6 MetadataVersion objects (and possibly garbage collections). Removing the version checks from the loops improved this method's performance on the above test down to about *250 milliseconds*.
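The fix amounts to hoisting a loop-invariant check. A toy sketch of the idea (method and parameter names are made up for illustration, not the actual metadata code):

```java
import java.util.List;

// Sketch: compare the metadata versions ONCE, before the loop over
// files/row-groups/columns, instead of constructing version objects on
// every iteration as the old code did.
public class VersionCheckHoist {
    /** Counts entries needing the binary transform, under a hoisted version check. */
    public static int countTransformed(String cacheVersion, String firstSupported,
                                       List<String> columns) {
        // Hoisted: one comparison for the whole cache file, not one per column.
        final boolean needsTransform = cacheVersion.compareTo(firstSupported) >= 0;
        int n = 0;
        for (String ignored : columns) {
            if (needsTransform) {
                n++;  // stand-in for transforming the column's binary statistics
            }
        }
        return n;
    }
}
```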
[jira] [Created] (DRILL-7043) Enhance Merge-Join to support Full Outer Join
Boaz Ben-Zvi created DRILL-7043:
---------------------------------
             Summary: Enhance Merge-Join to support Full Outer Join
                 Key: DRILL-7043
                 URL: https://issues.apache.org/jira/browse/DRILL-7043
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Relational Operators, Query Planning & Optimization
    Affects Versions: 1.15.0
            Reporter: Boaz Ben-Zvi
            Assignee: Boaz Ben-Zvi

Currently the Merge Join operator cannot internally support a Right Outer Join (and thus a Full Outer Join; for ROJ alone, the planner rotates the inputs and specifies a Left Outer Join). The actual reason for not supporting ROJ is the current MJ implementation: when a match is found, it puts a mark on the right side and iterates down the right side, resetting back at the end (and moving on to the next left-side entry). This creates an ambiguity when the next left entry is bigger than the previous one: is the right entry unmatched (i.e., it needs to be returned), or was there a prior match (i.e., just advance to the next right entry)?

It seems that adding a relevant flag to the persisted state ({{status}}), plus some other code changes, would make the operator support a Right Outer Join as well (and thus a Full Outer Join). The planner needs an update as well: to suggest the MJ in case of a FOJ, and maybe not to rotate the inputs in some MJ cases. Currently trying a FOJ with MJ (i.e.
HJ disabled) produces the following "no plan found" from Calcite: {noformat} 0: jdbc:drill:zk=local> select * from temp t1 full outer join temp2 t2 on t1.d_date = t2.d_date; Error: SYSTEM ERROR: CannotPlanException: Node [rel#2804:Subset#8.PHYSICAL.SINGLETON([]).[]] could not be implemented; planner state: Root: rel#2804:Subset#8.PHYSICAL.SINGLETON([]).[] Original rel: DrillScreenRel(subset=[rel#2804:Subset#8.PHYSICAL.SINGLETON([]).[]]): rowcount = 6.0, cumulative cost = {0.6001 rows, 0.6001 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2802 DrillProjectRel(subset=[rel#2801:Subset#7.LOGICAL.ANY([]).[]], **=[$0], **0=[$2]): rowcount = 6.0, cumulative cost = {6.0 rows, 12.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2800 DrillJoinRel(subset=[rel#2799:Subset#6.LOGICAL.ANY([]).[]], condition=[=($1, $3)], joinType=[full]): rowcount = 6.0, cumulative cost = {10.0 rows, 104.0 cpu, 0.0 io, 0.0 network, 70.4 memory}, id = 2798 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
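The extra flag suggested above could look roughly like this toy sketch (all names are hypothetical; the real merge-join state machine is considerably more involved):

```java
// Sketch: remember whether the marked right-side run has already matched some
// left entry, so that when the next left key is bigger we can tell apart
// "emit unmatched right rows" (needed for ROJ/FOJ) from "advance past a
// previously matched run".
public class MergeState {
    public boolean rightRunMatched = false;  // the proposed new flag in `status`

    /** Called when the next left key is greater than the current right run. */
    public String onLeftAdvancedPastRun() {
        String action = rightRunMatched ? "ADVANCE_RIGHT" : "EMIT_UNMATCHED_RIGHT";
        rightRunMatched = false;  // reset for the next right-side run
        return action;
    }
}
```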
[jira] [Resolved] (DRILL-6914) Query with RuntimeFilter and SemiJoin fails with IllegalStateException: Memory was leaked by query
[ https://issues.apache.org/jira/browse/DRILL-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-6914. - Resolution: Fixed The interaction between the Hash-Join spill and the runtime filter was fixed in PR #1622. Testing with the latest code works OK (no memory leaks). > Query with RuntimeFilter and SemiJoin fails with IllegalStateException: > Memory was leaked by query > -- > > Key: DRILL-6914 > URL: https://issues.apache.org/jira/browse/DRILL-6914 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.15.0 >Reporter: Abhishek Ravi >Assignee: Boaz Ben-Zvi >Priority: Major > Fix For: 1.16.0 > > Attachments: 23cc1af3-0e8e-b2c9-a889-a96504988d6c.sys.drill, > 23cc1b7c-5b5c-d123-5e72-6d7d2719df39.sys.drill > > > Following query fails on TPC-H SF 100 dataset when > exec.hashjoin.enable.runtime_filter = true AND planner.enable_semijoin = true. > Note that the query does not fail if any one of them or both are disabled. > {code:sql} > set `exec.hashjoin.enable.runtime_filter` = true; > set `exec.hashjoin.runtime_filter.max.waiting.time` = 1; > set `planner.enable_broadcast_join` = false; > set `planner.enable_semijoin` = true; > select > count(*) as row_count > from > lineitem l1 > where > l1.l_shipdate IN ( > select > distinct(cast(l2.l_shipdate as date)) > from > lineitem l2); > reset `exec.hashjoin.enable.runtime_filter`; > reset `exec.hashjoin.runtime_filter.max.waiting.time`; > reset `planner.enable_broadcast_join`; > reset `planner.enable_semijoin`; > {code} > > {noformat} > Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. > Memory leaked: (134217728) > Allocator(frag:1:0) 800/134217728/172453568/70126322567 > (res/actual/peak/limit) > Fragment 1:0 > Please, refer to logs for more information. 
> [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010] > (state=,code=0) > java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked > by query. Memory leaked: (134217728) > Allocator(frag:1:0) 800/134217728/172453568/70126322567 > (res/actual/peak/limit) > Fragment 1:0 > Please, refer to logs for more information. > [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:536) > at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:640) > at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:151) > at sqlline.BufferedRows.(BufferedRows.java:37) > at sqlline.SqlLine.print(SqlLine.java:1716) > at sqlline.Commands.execute(Commands.java:949) > at sqlline.Commands.sql(Commands.java:882) > at sqlline.SqlLine.dispatch(SqlLine.java:725) > at sqlline.SqlLine.runCommands(SqlLine.java:1779) > at sqlline.Commands.run(Commands.java:1485) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38) > at sqlline.SqlLine.dispatch(SqlLine.java:722) > at sqlline.SqlLine.initArgs(SqlLine.java:458) > at sqlline.SqlLine.begin(SqlLine.java:514) > at sqlline.SqlLine.start(SqlLine.java:264) > at sqlline.SqlLine.main(SqlLine.java:195) > Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM > ERROR: IllegalStateException: Memory was leaked by query. 
Memory leaked: > (134217728) > Allocator(frag:1:0) 800/134217728/172453568/70126322567 > (res/actual/peak/limit) > Fragment 1:0 > Please, refer to logs for more information. > [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) > at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) > at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) > at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) > at >
[jira] [Created] (DRILL-7034) Window function over a malformed CSV file crashes the JVM
Boaz Ben-Zvi created DRILL-7034:
---------------------------------
             Summary: Window function over a malformed CSV file crashes the JVM
                 Key: DRILL-7034
                 URL: https://issues.apache.org/jira/browse/DRILL-7034
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.15.0
            Reporter: Boaz Ben-Zvi

The JVM crashes executing window functions over (an ordered) CSV file with a small format issue: an empty line.

To reproduce: take the following simple `a.csvh` file:
{noformat}
amount
10
11
{noformat}
and execute a simple window function like:
{code:sql}
select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
{code}
Then add an empty line between the `10` and the `11`:
{noformat}
amount
10

11
{noformat}
and try again:
{noformat}
0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
+---------+
| EXPR$0  |
+---------+
| 10      |
| 11      |
+---------+
2 rows selected (3.554 seconds)
0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0001064aeae7, pid=23450, tid=0x6103
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# J 6719% C2 org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.memcmp(JIIJII)I (188 bytes) @ 0x0001064aeae7 [0x0001064ae920+0x1c7]
#
# Core dump written. Default location: /cores/core or core.23450
#
# An error report file with more information is saved as:
# /Users/boazben-zvi/IdeaProjects/drill/hs_err_pid23450.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Abort trap: 6 (core dumped)
{noformat}
[jira] [Created] (DRILL-7015) Improve documentation for PARTITION BY
Boaz Ben-Zvi created DRILL-7015:
---------------------------------
             Summary: Improve documentation for PARTITION BY
                 Key: DRILL-7015
                 URL: https://issues.apache.org/jira/browse/DRILL-7015
             Project: Apache Drill
          Issue Type: Improvement
          Components: Documentation
    Affects Versions: 1.15.0
            Reporter: Boaz Ben-Zvi
            Assignee: Bridget Bevens
             Fix For: 1.16.0

The documentation for CREATE TABLE AS (CTAS) shows the syntax of the command without the optional PARTITION BY clause; that option is only mentioned later, under the usage notes.

*+_Suggestion_+*: Add this optional clause to the syntax (same as for CREATE TEMPORARY TABLE (CTTAS)), and mention that this option is only applicable when storing in Parquet.

Also, in the documentation for CREATE TEMPORARY TABLE (CTTAS), the comment says:
{panel}
An optional parameter that can *only* be used to create temporary tables with the Parquet data format.
{panel}
which can mistakenly be understood as "only for temporary tables".

*_+Suggestion+_*: Erase the "to create temporary tables" part (not needed, as it is implied by the context of this page).

*_+Last suggestion+_*: In the documentation for the PARTITION BY clause, add an example using the implicit column "filename" to demonstrate how the partitioning column puts each distinct value into a separate file. For example, add in the "Other Examples" section:
{noformat}
0: jdbc:drill:zk=local> select distinct r_regionkey, filename from mytable1;
+--------------+----------------+
| r_regionkey  |    filename    |
+--------------+----------------+
| 2            | 0_0_3.parquet  |
| 1            | 0_0_2.parquet  |
| 0            | 0_0_1.parquet  |
| 3            | 0_0_4.parquet  |
| 4            | 0_0_5.parquet  |
+--------------+----------------+
{noformat}
[jira] [Created] (DRILL-7013) Hash-Join and Hash-Aggr to handle incoming with selection vectors
Boaz Ben-Zvi created DRILL-7013: --- Summary: Hash-Join and Hash-Aggr to handle incoming with selection vectors Key: DRILL-7013 URL: https://issues.apache.org/jira/browse/DRILL-7013 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Query Planning Optimization Affects Versions: 1.15.0 Reporter: Boaz Ben-Zvi The Hash-Join and Hash-Aggr operators copy each incoming row separately. When the incoming data has a selection vector (e.g., outgoing from a Filter), a _SelectionVectorRemover_ is added before the Hash operator, as the latter cannot handle the selection vector. Thus every row is needlessly being copied twice! +Suggestion+: Enhance the Hash operators to handle potential incoming selection vectors, thus eliminating the need for the extra copy. The planner needs to be changed not to add that SelectionVectorRemover. For example: {code:sql} select * from cp.`tpch/lineitem.parquet` L, cp.`tpch/orders.parquet` O where O.o_custkey > 1498 and L.l_orderkey > 58999 and O.o_orderkey = L.l_orderkey {code} And the plan: {panel} 00-00 Screen : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): 00-01 ProjectAllowDup(**=[$0], **0=[$1]) : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): 00-02 Project(T44¦¦**=[$0], T45¦¦**=[$2]) : rowType = RecordType(DYNAMIC_STAR T44¦¦**, DYNAMIC_STAR T45¦¦**): 00-03 HashJoin(condition=[=($1, $4)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(DYNAMIC_STAR T44¦¦**, ANY l_orderkey, DYNAMIC_STAR T45¦¦**, ANY o_custkey, ANY o_orderkey): 00-05 *SelectionVectorRemover* : rowType = RecordType(DYNAMIC_STAR T44¦¦**, ANY l_orderkey): 00-07 Filter(condition=[>($1, 58999)]) : rowType = RecordType(DYNAMIC_STAR T44¦¦**, ANY l_orderkey): 00-09 Project(T44¦¦**=[$0], l_orderkey=[$1]) : rowType = RecordType(DYNAMIC_STAR T44¦¦**, ANY l_orderkey): 00-11 Scan(table=[[cp, tpch/lineitem.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]], 
00-04 *SelectionVectorRemover* : rowType = RecordType(DYNAMIC_STAR T45¦¦**, ANY o_custkey, ANY o_orderkey): 00-06 Filter(condition=[AND(>($1, 1498), >($2, 58999))]) : rowType = RecordType(DYNAMIC_STAR T45¦¦**, ANY o_custkey, ANY o_orderkey): 00-08 Project(T45¦¦**=[$0], o_custkey=[$1], o_orderkey=[$2]) : rowType = RecordType(DYNAMIC_STAR T45¦¦**, ANY o_custkey, ANY o_orderkey): 00-10 Scan(table=[[cp, tpch/orders.parquet]], {panel} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
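Reading through a selection vector instead of copying can be illustrated with plain arrays (a toy sketch; Drill's value vectors and SV2 classes are far more elaborate, and the names here are made up):

```java
// Sketch of the suggested enhancement: instead of a SelectionVectorRemover
// copying the surviving rows into a fresh batch, the hash operator reads the
// original values through one level of index indirection (the SV2).
public class SvRead {
    /** Gathers the selected values; one indirection per row, zero upstream copies. */
    public static int[] gather(int[] values, int[] selection) {
        int[] out = new int[selection.length];
        for (int i = 0; i < selection.length; i++) {
            out[i] = values[selection[i]];  // selection[i] is the surviving row index
        }
        return out;
    }
}
```

In the real operators no gather into `out` would be needed at all; the build/probe loops would simply index the incoming vectors through the selection indices.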
[jira] [Created] (DRILL-7012) Make SelectionVectorRemover project only the needed columns
Boaz Ben-Zvi created DRILL-7012:
---------------------------------
             Summary: Make SelectionVectorRemover project only the needed columns
                 Key: DRILL-7012
                 URL: https://issues.apache.org/jira/browse/DRILL-7012
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Relational Operators, Query Planning & Optimization
    Affects Versions: 1.15.0
            Reporter: Boaz Ben-Zvi

A SelectionVectorRemover is often used after a Filter, to copy only the rows that survived the filter into a newly allocated batch. In some cases the columns used by the filter are not needed downstream; currently these columns are needlessly allocated and copied, and later removed by a Project.

_+Suggested improvement+_: The planner can pass the information about these columns to the SelectionVectorRemover, which would then avoid the useless allocation and copy. The planner would also eliminate that Project from the plan.

Here is an example; the query:
{code:java}
select max(l_quantity) from cp.`tpch/lineitem.parquet` L where L.l_orderkey > 58999 and L.l_shipmode = 'TRUCK' group by l_linenumber ;
{code}
And the resulting plan (trimmed for readability), where "l_orderkey" and "l_shipmode" are removed by the Project:
{noformat}
00-00 Screen : rowType = RecordType(ANY EXPR$0):
00-01   Project(EXPR$0=[$0]) : rowType = RecordType(ANY EXPR$0):
00-02     Project(EXPR$0=[$1]) : rowType = RecordType(ANY EXPR$0):
00-03       HashAgg(group=[{0}], EXPR$0=[MAX($1)]) : rowType = RecordType(ANY l_linenumber, ANY EXPR$0):
00-04         *Project*(l_linenumber=[$2], l_quantity=[$3]) : rowType = RecordType(ANY l_linenumber, ANY l_quantity):
00-05           *SelectionVectorRemover* : rowType = RecordType(ANY *l_orderkey*, ANY *l_shipmode*, ANY l_linenumber, ANY l_quantity):
00-06             *Filter*(condition=[AND(>($0, 58999), =($1, 'TRUCK'))]) : rowType = RecordType(ANY l_orderkey, ANY l_shipmode, ANY l_linenumber, ANY l_quantity):
00-07               Scan(table=[[cp, tpch/lineitem.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/lineitem.parquet]],
selectionRoot=classpath:/tpch/lineitem.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`l_orderkey`, `l_shipmode`, `l_linenumber`, `l_quantity`]]]) : rowType = RecordType(ANY l_orderkey, ANY l_shipmode, ANY l_linenumber, ANY l_quantity): {noformat} The implementation will not be simple, as the relevant code (e.g., GenericSV2Copier) has no idea of specific columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
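The proposed column-aware copy can be sketched with plain arrays standing in for value vectors (hypothetical names; this is not the GenericSV2Copier API, which has no notion of specific columns):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch: the copier receives the set of columns needed downstream and skips
// the filter-only columns entirely (no allocation, no copy, no later Project).
public class ColumnPruningCopier {
    public static Map<String, int[]> copy(Map<String, int[]> batch, int[] sv2,
                                          Set<String> needed) {
        Map<String, int[]> out = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> col : batch.entrySet()) {
            if (!needed.contains(col.getKey())) {
                continue;  // filter-only column (e.g., l_orderkey): skipped entirely
            }
            int[] dst = new int[sv2.length];
            for (int i = 0; i < sv2.length; i++) {
                dst[i] = col.getValue()[sv2[i]];  // copy only the surviving rows
            }
            out.put(col.getKey(), dst);
        }
        return out;
    }
}
```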
[jira] [Created] (DRILL-6915) Unit test mysql-test-data.sql in contrib/jdbc-storage-plugin fails on newer MacOS
Boaz Ben-Zvi created DRILL-6915: --- Summary: Unit test mysql-test-data.sql in contrib/jdbc-storage-plugin fails on newer MacOS Key: DRILL-6915 URL: https://issues.apache.org/jira/browse/DRILL-6915 Project: Apache Drill Issue Type: Bug Components: Storage - JDBC Affects Versions: 1.14.0 Environment: MacOS, either High Sierra (10.13) or Mojave (10.14). Reporter: Boaz Ben-Zvi The newer MacOS file systems (10.13 and above) are case-insensitive by default. This leads to the following unit test failure: {code:java} ~/drill > mvn clean install -rf :drill-jdbc-storage [INFO] Scanning for projects... [INFO] [INFO] Detecting the operating system and CPU architecture [INFO] [INFO] os.detected.name: osx [INFO] os.detected.arch: x86_64 [INFO] os.detected.version: 10.14 . [INFO] [INFO] Building contrib/jdbc-storage-plugin 1.15.0-SNAPSHOT [INFO] . [INFO] >> 2018-12-19 15:11:32 7136 [Warning] Setting lower_case_table_names=2 because file system for __drill/contrib/storage-jdbc/target/mysql-data/data/ is case insensitive . [ERROR] Failed to execute: create table CASESENSITIVETABLE ( a BLOB, b BLOB ) [INFO] [INFO] Reactor Summary: [INFO] [INFO] contrib/jdbc-storage-plugin FAILURE [01:30 min] ... [ERROR] Failed to execute goal org.codehaus.mojo:sql-maven-plugin:1.5:execute (create-tables) on project drill-jdbc-storage: Table 'casesensitivetable' already exists -> [Help 1]{code} in the test file *mysql-test-data.sql*, where +both+ tables *caseSensitiveTable* and *CASESENSITIVETABLE* are created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6888) Nested classes in HashAggTemplate break the plain Java for debugging codegen
Boaz Ben-Zvi created DRILL-6888:
---------------------------------
             Summary: Nested classes in HashAggTemplate break the plain Java for debugging codegen
                 Key: DRILL-6888
                 URL: https://issues.apache.org/jira/browse/DRILL-6888
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Boaz Ben-Zvi
            Assignee: Boaz Ben-Zvi

The *prefer_plain_java* compile option is useful for debugging generated code. DRILL-6719 ("separate spilling logic for Hash Agg") introduced two nested classes into the HashAggTemplate class. However, those nested classes cause the prefer_plain_java compile option to fail when compiling the generated code, like:
{code:java}
Error: SYSTEM ERROR: CompileException: File '/tmp/janino5709636998794673307.java', Line 36, Column 35: No applicable constructor/method found for actual parameters "org.apache.drill.exec.test.generated.HashAggregatorGen11$HashAggSpilledPartition"; candidates are: "protected org.apache.drill.exec.physical.impl.aggregate.HashAggTemplate$BatchHolder org.apache.drill.exec.physical.impl.aggregate.HashAggTemplate.injectMembers(org.apache.drill.exec.physical.impl.aggregate.HashAggTemplate$BatchHolder)"
{code}
+The proposed fix+: Move those nested classes outside HashAggTemplate.
[jira] [Created] (DRILL-6881) Hash-Table insert and probe: Compare hash values before keys
Boaz Ben-Zvi created DRILL-6881:
---------------------------------
             Summary: Hash-Table insert and probe: Compare hash values before keys
                 Key: DRILL-6881
                 URL: https://issues.apache.org/jira/browse/DRILL-6881
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Boaz Ben-Zvi
            Assignee: Boaz Ben-Zvi
             Fix For: 1.16.0

When checking for the existence of a key in the hash table (during _put_ or _probe_ operations), the value of that key is compared (using generated code) with each potential match key (same bucket). This comparison is slightly expensive (e.g., long keys, multi-column keys, checking null conditions, NaN, etc.). Instead, if the hash values of the two keys are compared first (at practically zero cost), the costly key comparison can be avoided whenever the hash values don't match. This code change is trivial, and given that the relevant Hash-Table code is *hot code*, even minute improvements could add up.
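The idea can be shown in a few lines (a sketch only, not the generated key-comparison code; keys are modeled as Strings for illustration):

```java
import java.util.Objects;

// Sketch: compare the cheap 32-bit hash values first; only when they match,
// fall through to the expensive full key comparison.
public class HashFirstCompare {
    public static boolean keysMatch(int hash1, int hash2, String key1, String key2) {
        if (hash1 != hash2) {
            return false;  // near-zero cost: different hashes imply different keys
        }
        return Objects.equals(key1, key2);  // costly comparison only on a hash match
    }
}
```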
[jira] [Created] (DRILL-6880) Hash-Join: Many null keys on the build side form a long linked chain in the Hash Table
Boaz Ben-Zvi created DRILL-6880:
---------------------------------
             Summary: Hash-Join: Many null keys on the build side form a long linked chain in the Hash Table
                 Key: DRILL-6880
                 URL: https://issues.apache.org/jira/browse/DRILL-6880
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Boaz Ben-Zvi
             Fix For: 1.16.0

When building the Hash Table for the Hash-Join, each new key is matched against the existing keys in its bucket by calling the generated method `isKeyMatchInternalBuild`, which compares the two. However, when both keys are null, the method returns *false* (meaning not-equal, i.e., it is treated as a new key), so the new key is added to the list following the old key. When a third null key is found, it is matched against the prior two and added as well, and so on. This way many null values perform on the order of N^2 / 2 comparisons.

Suggested improvement: The generated code should return a third result, meaning "two null keys". Then, in the case of Inner or Left joins, all the duplicate nulls can be discarded.

Below is a simple example; note the time difference between the non-null and the all-nulls tables (instrumentation also showed that for nulls, the method above was called 1249975000 times!!)
{code:java}
0: jdbc:drill:zk=local> use dfs.tmp;
0: jdbc:drill:zk=local> create table test as (select cast(null as int) mycol from dfs.`/data/test128M.tbl` limit 5);
0: jdbc:drill:zk=local> create table test1 as (select cast(1 as int) mycol1 from dfs.`/data/test128M.tbl` limit 6);
0: jdbc:drill:zk=local> create table test2 as (select cast(2 as int) mycol2 from dfs.`/data/test128M.tbl` limit 5);
0: jdbc:drill:zk=local> select count(*) from test1 join test2 on test1.mycol1 = test2.mycol2;
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (0.443 seconds)
0: jdbc:drill:zk=local> create table test1 as (select cast(1 as int) mycol1 from dfs.`/data/test128M.tbl` limit 6);
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 6                          |
+-----------+----------------------------+
1 row selected (0.517 seconds)
0: jdbc:drill:zk=local> select count(*) from test1 join test on test1.mycol1 = test.mycol;
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (140.098 seconds)
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
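A minimal sketch of the suggested three-way comparison, in standalone Java (the `KeyMatch` enum and the `build` loop are hypothetical illustrations, not Drill's generated code): returning a distinct BOTH_NULL result lets the build side discard duplicate nulls instead of chaining them.

```java
import java.util.ArrayList;
import java.util.List;

public class NullKeySketch {
    enum KeyMatch { MATCH, NO_MATCH, BOTH_NULL }

    // Three-way comparison: two nulls are reported explicitly, not as NO_MATCH.
    static KeyMatch compare(Integer a, Integer b) {
        if (a == null && b == null) return KeyMatch.BOTH_NULL;
        if (a == null || b == null) return KeyMatch.NO_MATCH;
        return a.equals(b) ? KeyMatch.MATCH : KeyMatch.NO_MATCH;
    }

    // Build-side insert for an Inner/Left join: duplicate nulls are discarded,
    // so the chain never grows quadratically with the number of null keys.
    static List<Integer> build(List<Integer> keys) {
        List<Integer> chain = new ArrayList<>();
        outer:
        for (Integer k : keys) {
            for (Integer existing : chain) {
                if (compare(k, existing) == KeyMatch.BOTH_NULL) {
                    continue outer;  // drop the duplicate null key
                }
            }
            chain.add(k);
        }
        return chain;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add(null);
        System.out.println(build(keys).size());  // 1: only one null key is kept
    }
}
```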
[jira] [Created] (DRILL-6864) Root POM: Update the git-commit-id plugin
Boaz Ben-Zvi created DRILL-6864: --- Summary: Root POM: Update the git-commit-id plugin Key: DRILL-6864 URL: https://issues.apache.org/jira/browse/DRILL-6864 Project: Apache Drill Issue Type: Improvement Components: Tools, Build Test Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.15.0 The Maven git-commit-id plugin is of version 2.1.9, which is 4.5 years old. Executing this plugin seems to take a significant portion of the mvn build time. Newer versions run more than twice as fast (see below). Suggestion: Upgrade to the latest (2.2.5), to shorten the Drill mvn build time. Here are the run times with our *current (2.1.9)* version: {code:java} [INFO] git-commit-id-plugin:revision (for-jars) . [25.320s] [INFO] git-commit-id-plugin:revision (for-jars) . [24.255s] [INFO] git-commit-id-plugin:revision (for-jars) . [22.821s] [INFO] git-commit-id-plugin:revision (for-jars) . [32.889s] [INFO] git-commit-id-plugin:revision (for-jars) . [34.557s] [INFO] git-commit-id-plugin:revision (for-jars) . [26.085s] [INFO] git-commit-id-plugin:revision (for-jars) . [46.135s] [INFO] git-commit-id-plugin:revision (for-jars) . [72.811s] [INFO] git-commit-id-plugin:revision (for-jars) . [45.956s] [INFO] git-commit-id-plugin:revision (for-jars) . [18.223s] [INFO] git-commit-id-plugin:revision (for-jars) . [19.841s] [INFO] git-commit-id-plugin:revision (for-jars) . [50.146s] [INFO] git-commit-id-plugin:revision (for-jars) . [30.993s] [INFO] git-commit-id-plugin:revision (for-jars) . [32.839s] [INFO] git-commit-id-plugin:revision (for-jars) . [33.852s] [INFO] git-commit-id-plugin:revision (for-jars) . [23.562s] [INFO] git-commit-id-plugin:revision (for-jars) . [25.333s] [INFO] git-commit-id-plugin:revision (for-jars) . [24.737s] [INFO] git-commit-id-plugin:revision (for-jars) . [19.098s] [INFO] git-commit-id-plugin:revision (for-jars) . [46.245s] [INFO] git-commit-id-plugin:revision (for-jars) . [40.350s] [INFO] git-commit-id-plugin:revision (for-jars) . 
[34.610s] [INFO] git-commit-id-plugin:revision (for-jars) . [78.756s] [INFO] git-commit-id-plugin:revision (for-source-tarball) ... [52.551s] [INFO] git-commit-id-plugin:revision (for-jars) . [10.940s] [INFO] git-commit-id-plugin:revision (for-jars) . [24.573s] [INFO] git-commit-id-plugin:revision (for-jars) . [24.404s] [INFO] git-commit-id-plugin:revision (for-jars) . [43.501s] [INFO] git-commit-id-plugin:revision (for-jars) . [25.041s] [INFO] git-commit-id-plugin:revision (for-jars) . [39.149s] [INFO] git-commit-id-plugin:revision (for-jars) . [40.310s] {code} And here are the run times with a newer (2.2.4) version: {code:java} [INFO] git-commit-id-plugin:revision (for-jars) . [6.964s] [INFO] git-commit-id-plugin:revision (for-jars) . [18.732s] [INFO] git-commit-id-plugin:revision (for-jars) . [7.441s] [INFO] git-commit-id-plugin:revision (for-jars) . [8.146s] [INFO] git-commit-id-plugin:revision (for-jars) . [6.404s] [INFO] git-commit-id-plugin:revision (for-jars) . [7.837s] [INFO] git-commit-id-plugin:revision (for-jars) . [9.788s] [INFO] git-commit-id-plugin:revision (for-jars) . [9.136s] [INFO] git-commit-id-plugin:revision (for-jars) . [19.607s] [INFO] git-commit-id-plugin:revision (for-jars) . [9.289s] [INFO] git-commit-id-plugin:revision (for-jars) . [8.046s] [INFO] git-commit-id-plugin:revision (for-jars) . [8.268s] [INFO] git-commit-id-plugin:revision (for-jars) . [7.868s] [INFO] git-commit-id-plugin:revision (for-jars) . [10.750s] [INFO] git-commit-id-plugin:revision (for-jars) . [8.558s] [INFO] git-commit-id-plugin:revision (for-jars) . [11.267s] [INFO] git-commit-id-plugin:revision (for-jars) . [15.696s] [INFO] git-commit-id-plugin:revision (for-jars) . [9.446s] [INFO] git-commit-id-plugin:revision (for-jars) . [6.187s] [INFO] git-commit-id-plugin:revision (for-jars) . [24.806s] [INFO] git-commit-id-plugin:revision (for-jars) . [14.591s] [INFO]
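The change itself is a one-line version bump in the root POM; a sketch of the relevant declaration (the plugin coordinates are from the plugin's own documentation; the surrounding executions in Drill's pom.xml stay unchanged):

```xml
<!-- Root pom.xml: bump the git-commit-id plugin from 2.1.9 to the latest. -->
<plugin>
  <groupId>pl.project13.maven</groupId>
  <artifactId>git-commit-id-plugin</artifactId>
  <version>2.2.5</version>  <!-- was 2.1.9 -->
</plugin>
```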
[jira] [Created] (DRILL-6861) Hash-Join: Spilled partitions are skipped following an empty probe side
Boaz Ben-Zvi created DRILL-6861: --- Summary: Hash-Join: Spilled partitions are skipped following an empty probe side Key: DRILL-6861 URL: https://issues.apache.org/jira/browse/DRILL-6861 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.15.0 Following DRILL-6755 (_Avoid building a hash table when the probe side is empty_) - the special case of an empty spilled probe-partition was not handled. When such a case happens, the Hash-Join terminates early (returns NONE) and the remaining partitions are not processed/returned (which may lead to incorrect results). A test case - force tpcds/query95 to spill:
{code:java}
0: jdbc:drill:zk=local> alter system set `exec.hashjoin.max_batches_in_memory` = 40;
+-------+-----------------------------------------------+
|  ok   |                    summary                    |
+-------+-----------------------------------------------+
| true  | exec.hashjoin.max_batches_in_memory updated.  |
+-------+-----------------------------------------------+
1 row selected (1.325 seconds)
0: jdbc:drill:zk=local> WITH ws_wh AS
. . . . . . . . . . . > (
. . . . . . . . . . . >   SELECT ws1.ws_order_number,
. . . . . . . . . . . >          ws1.ws_warehouse_sk wh1,
. . . . . . . . . . . >          ws2.ws_warehouse_sk wh2
. . . . . . . . . . . >   FROM dfs.`/data/tpcds/sf1/parquet/web_sales` ws1,
. . . . . . . . . . . >        dfs.`/data/tpcds/sf1/parquet/web_sales` ws2
. . . . . . . . . . . >   WHERE ws1.ws_order_number = ws2.ws_order_number
. . . . . . . . . . . >   AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
. . . . . . . . . . . > SELECT
. . . . . . . . . . . >   Count(DISTINCT ws1.ws_order_number) AS `order count` ,
. . . . . . . . . . . >   Sum(ws1.ws_ext_ship_cost) AS `total shipping cost` ,
. . . . . . . . . . . >   Sum(ws1.ws_net_profit) AS `total net profit`
. . . . . . . . . . . > FROM dfs.`/data/tpcds/sf1/parquet/web_sales` ws1 ,
. . . . . . . . . . . >      dfs.`/data/tpcds/sf1/parquet/date_dim` dd,
. . . . . . . . . . . >      dfs.`/data/tpcds/sf1/parquet/customer_address` ca,
. . . . . . . . . . . >      dfs.`/data/tpcds/sf1/parquet/web_site` wbst
. . . . . . . . . . . > WHERE dd.d_date BETWEEN '2000-04-01' AND (
. . . . . . . . . . . >   Cast('2000-04-01' AS DATE) + INTERVAL '60' day)
. . . . . . . . . . . > AND ws1.ws_ship_date_sk = dd.d_date_sk
. . . . . . . . . . . > AND ws1.ws_ship_addr_sk = ca.ca_address_sk
. . . . . . . . . . . > AND ca.ca_state = 'IN'
. . . . . . . . . . . > AND ws1.ws_web_site_sk = wbst.web_site_sk
. . . . . . . . . . . > AND wbst.web_company_name = 'pri'
. . . . . . . . . . . > AND ws1.ws_order_number IN
. . . . . . . . . . . > (
. . . . . . . . . . . >   SELECT ws_wh.ws_order_number
. . . . . . . . . . . >   FROM ws_wh)
. . . . . . . . . . . > AND ws1.ws_order_number IN
. . . . . . . . . . . > (
. . . . . . . . . . . >   SELECT wr.wr_order_number
. . . . . . . . . . . >   FROM dfs.`/data/tpcds/sf1/parquet/web_returns` wr,
. . . . . . . . . . . >        ws_wh
. . . . . . . . . . . >   WHERE wr.wr_order_number = ws_wh.ws_order_number)
. . . . . . . . . . . > ORDER BY count(DISTINCT ws1.ws_order_number)
. . . . . . . . . . . > LIMIT 100;
+--------------+----------------------+-------------------+
| order count  | total shipping cost  | total net profit  |
+--------------+----------------------+-------------------+
| 17           | 38508.1305           | 20822.3           |
+--------------+----------------------+-------------------+
1 row selected (105.621 seconds)
{code}
The correct results should be:
{code:java}
+--------------+----------------------+-------------------+
| order count  | total shipping cost  | total net profit  |
+--------------+----------------------+-------------------+
| 34           | 63754.72             | 15919.0098        |
+--------------+----------------------+-------------------+
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6860) SqlLine: EXPLAIN produces very long header lines
Boaz Ben-Zvi created DRILL-6860: --- Summary: SqlLine: EXPLAIN produces very long header lines Key: DRILL-6860 URL: https://issues.apache.org/jira/browse/DRILL-6860 Project: Apache Drill Issue Type: Bug Components: Client - CLI Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Arina Ielchiieva Fix For: 1.15.0 Maybe a result of upgrading to SqlLine 1.5.0 (DRILL-3853 - PR #1462), the header dividing lines displayed when using EXPLAIN became very long: {code} 0: jdbc:drill:zk=local> explain plan for select count(*) from dfs.`/data/tpcds/sf1/parquet/date_dim`; +-+---+ | text | json | +-+---+ | 00-00Screen 00-01 Project(EXPR$0=[$0]) 00-02DirectScan(groupscan=[files = [/data/tpcds/sf1/parquet/date_dim/0_0_0.parquet], numFiles = 1, DynamicPojoRecordReader{records = [[73049]]}]) | { "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ { "kind" : "BOOLEAN", "accessibleScopes" : "ALL", "name" : "planner.enable_nljoin_for_scalar_only", "bool_val" : true, "scope" : "SESSION" } ], "queue" : 0, "hasResourcePlan" : false, "resultMode" : "EXEC" }, "graph" : [ { "pop" : "metadata-direct-scan", "@id" : 2, "cost" : 1.0 }, { "pop" : "project", "@id" : 1, "exprs" : [ { "ref" : "`EXPR$0`", "expr" : "`count0$EXPR$0`"
[jira] [Created] (DRILL-6859) BETWEEN dates with a slightly malformed DATE string returns false
Boaz Ben-Zvi created DRILL-6859: --- Summary: BETWEEN dates with a slightly malformed DATE string returns false Key: DRILL-6859 URL: https://issues.apache.org/jira/browse/DRILL-6859 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Fix For: Future (This may be a Calcite issue.) In the following query using BETWEEN with dates, the "month" is specified as "4" instead of "04", which causes the BETWEEN clause to evaluate to FALSE. Note that rewriting the clause with less-than etc. does work correctly.
{code:java}
0: jdbc:drill:zk=local> select count(*) from `date_dim` dd where dd.d_date BETWEEN '2000-4-01' and ( Cast('2000-4-01' AS DATE) + INTERVAL '60' day) ;
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (0.184 seconds)
0: jdbc:drill:zk=local> select count(*) from `date_dim` dd where dd.d_date BETWEEN '2000-04-01' and ( Cast('2000-4-01' AS DATE) + INTERVAL '60' day) limit 10;
+---------+
| EXPR$0  |
+---------+
| 61      |
+---------+
1 row selected (0.209 seconds)
0: jdbc:drill:zk=local> select count(*) from `date_dim` dd where dd.d_date >= '2000-4-01' and dd.d_date <= '2000-5-31';
+---------+
| EXPR$0  |
+---------+
| 61      |
+---------+
1 row selected (0.227 seconds)
{code}
The physical plan for the second (good) case implements the BETWEEN clause with a FILTER on top of the scanner. For the first (failed) case, there is a "limit 0" on top of the scanner. (This query was extracted from TPC-DS 95, used over Parquet files.) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
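The failing literal is consistent with strict ISO-8601 parsing, which requires a two-digit month. A small standalone java.time illustration (independent of Drill's and Calcite's actual parsers) of why '2000-4-01' is rejected while '2000-04-01' is accepted:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

public class DateParseSketch {
    // Returns true when the string is a valid strict ISO-8601 date (yyyy-MM-dd).
    static boolean isValidIsoDate(String s) {
        try {
            LocalDate.parse(s);  // strict: month and day must be two digits
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidIsoDate("2000-04-01"));  // true
        System.out.println(isValidIsoDate("2000-4-01"));   // false: single-digit month
    }
}
```

A strict parser treating the malformed string as an invalid (hence never-matching) bound would explain the planner folding the predicate into a "limit 0".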
[jira] [Resolved] (DRILL-6798) Planner changes to support semi-join
[ https://issues.apache.org/jira/browse/DRILL-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-6798. - Resolution: Fixed Commit ID 71809ca6216d95540b2a41ce1ab2ebb742888671 > Planner changes to support semi-join > > > Key: DRILL-6798 > URL: https://issues.apache.org/jira/browse/DRILL-6798 > Project: Apache Drill > Issue Type: Sub-task > Components: Query Planning Optimization >Affects Versions: 1.14.0 >Reporter: Boaz Ben-Zvi >Assignee: Hanumath Rao Maduri >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6845) Eliminate duplicates for Semi Hash Join
Boaz Ben-Zvi created DRILL-6845: --- Summary: Eliminate duplicates for Semi Hash Join Key: DRILL-6845 URL: https://issues.apache.org/jira/browse/DRILL-6845 Project: Apache Drill Issue Type: Sub-task Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.15.0 Following DRILL-6735: The performance of the new Semi Hash Join may degrade if the build side contains an excessive number of join-key duplicate rows; this is mainly a result of the need to store all those rows first, before the hash table is built. Proposed solution: For Semi, the Hash-Join would create a Hash-Table initially, and use it to eliminate key-duplicate rows as they arrive. Proposed extra: That Hash-Table has an added cost (e.g., resizing). So perform "runtime stats" – check the initial number of incoming rows (e.g., 32K), and if the number of duplicates is less than some threshold (e.g., 20%) – cancel that "early" hash table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
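A rough sketch of the proposed "early" hash table with runtime stats, as standalone Java (the class and field names are invented for illustration; the 32K sample size and 20% threshold are the examples from the text):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiDedupSketch {
    static final int SAMPLE_ROWS = 32 * 1024;     // "e.g. 32k" from the issue
    static final double MIN_DUP_RATIO = 0.20;     // "e.g. 20%" from the issue

    private Set<Integer> seen = new HashSet<>();  // the "early" hash table
    private final List<Integer> kept = new ArrayList<>();
    private int in = 0, dups = 0;
    private boolean dedupActive = true;

    void add(int key) {
        in++;
        if (dedupActive && !seen.add(key)) {
            dups++;                               // duplicate key: discard for semi-join
            return;
        }
        // Runtime stats: after the initial sample, cancel the early table
        // (and its resizing cost) if duplicates turned out to be rare.
        if (dedupActive && in >= SAMPLE_ROWS && (double) dups / in < MIN_DUP_RATIO) {
            dedupActive = false;
            seen = null;                          // release the table's memory
        }
        kept.add(key);
    }

    int keptCount() { return kept.size(); }

    public static void main(String[] args) {
        SemiDedupSketch s = new SemiDedupSketch();
        for (int i = 0; i < 10; i++) s.add(7);
        System.out.println(s.keptCount());        // 1: nine duplicates discarded
    }
}
```

Rows arriving after a cancellation are stored as-is; the regular hash-table build that follows handles them, as in the pre-DRILL-6845 behavior.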
[jira] [Created] (DRILL-6836) Eliminate StreamingAggr for COUNT DISTINCT
Boaz Ben-Zvi created DRILL-6836: --- Summary: Eliminate StreamingAggr for COUNT DISTINCT Key: DRILL-6836 URL: https://issues.apache.org/jira/browse/DRILL-6836 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Query Planning Optimization Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.16.0 The COUNT DISTINCT operation is often implemented with a Hash-Aggr operator for the DISTINCT, and a Streaming-Aggr above it to perform the COUNT. That Streaming-Aggr does the counting like any aggregation, counting each value, batch after batch. While very efficient, that counting work is basically not needed, as the Hash-Aggr already knows the number of distinct values (in the in-memory partitions). Hence _a possible small performance improvement_ - eliminate the Streaming-Aggr operator, and notify the Hash-Aggr to return a COUNT (these are Planner changes). The Hash-Aggr operator would need to generate the single Float8 column output schema, and output that batch with a single value, just like the Streaming-Aggr did (likely without generating code). In case of a spill, the Hash-Aggr still needs to read and process those partitions, to get the exact distinct number. The expected improvement is the elimination of the batch-by-batch output from the Hash-Aggr, and the batch-by-batch, row-by-row processing of the Streaming-Aggr. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
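The redundancy can be illustrated with a minimal standalone example (not Drill code): once the hash aggregation has deduplicated the keys, the distinct count is just the size of its table, so a separate streaming counting pass adds only per-row overhead.

```java
import java.util.HashSet;
import java.util.Set;

public class CountDistinctSketch {
    // "Hash-Aggr" role: deduplicate; the answer is simply the table size,
    // with no second (streaming) counting pass over the distinct values.
    static long countDistinct(int[] values) {
        Set<Integer> table = new HashSet<>();
        for (int v : values) {
            table.add(v);  // duplicates collapse on insert
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(new int[] {1, 2, 2, 3, 3, 3}));  // 3
    }
}
```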
[jira] [Created] (DRILL-6799) Enhance the Hash-Join Operator to perform Anti-Semi-Join
Boaz Ben-Zvi created DRILL-6799: --- Summary: Enhance the Hash-Join Operator to perform Anti-Semi-Join Key: DRILL-6799 URL: https://issues.apache.org/jira/browse/DRILL-6799 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Query Planning Optimization Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.16.0 Similar to handling Semi-Join (see DRILL-6735), the Anti-Semi-Join can be enhanced by eliminating the extra DISTINCT (i.e. Hash-Aggr) operator. Example (note the NOT IN): select c.c_first_name, c.c_last_name from dfs.`/data/json/s1/customer` c where c.c_customer_sk NOT IN (select s.ss_customer_sk from dfs.`/data/json/s1/store_sales` s) limit 4; -- This message was sent by Atlassian JIRA (v7.6.3#76005)
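A minimal standalone sketch (not Drill's implementation) of what the enhanced operator would compute: build a hash set of the subquery's keys, with duplicates collapsing on insert so no separate DISTINCT operator is needed, then emit each probe row whose key is absent - the NOT IN semantics (ignoring SQL NULL handling, which NOT IN complicates):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AntiSemiJoinSketch {
    static List<Integer> antiSemiJoin(List<Integer> probeKeys, List<Integer> buildKeys) {
        Set<Integer> build = new HashSet<>(buildKeys);  // duplicates collapse here
        List<Integer> out = new ArrayList<>();
        for (Integer k : probeKeys) {
            if (!build.contains(k)) {
                out.add(k);  // keep probe rows with no build-side match
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(antiSemiJoin(List.of(1, 2, 3, 4), List.of(2, 2, 4)));  // [1, 3]
    }
}
```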
[jira] [Created] (DRILL-6798) Planner changes to support semi-join
Boaz Ben-Zvi created DRILL-6798: --- Summary: Planner changes to support semi-join Key: DRILL-6798 URL: https://issues.apache.org/jira/browse/DRILL-6798 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Hanumath Rao Maduri Fix For: 1.15.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6758) Hash Join should not return the join columns when they are not needed downstream
Boaz Ben-Zvi created DRILL-6758: --- Summary: Hash Join should not return the join columns when they are not needed downstream Key: DRILL-6758 URL: https://issues.apache.org/jira/browse/DRILL-6758 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Query Planning Optimization Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Hanumath Rao Maduri Fix For: 1.15.0 Currently the Hash-Join operator returns all its (both sides) incoming columns. In cases where the join columns are not used further downstream, this is a waste (allocating vectors, copying each value, etc). Suggestion: Have the planner pass this information to the Hash-Join operator, to enable skipping the return of these columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6735) Enhance the Hash-Join Operator to perform Semi and Anti-Semi joins
Boaz Ben-Zvi created DRILL-6735: --- Summary: Enhance the Hash-Join Operator to perform Semi and Anti-Semi joins Key: DRILL-6735 URL: https://issues.apache.org/jira/browse/DRILL-6735 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Query Planning Optimization Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.15.0 Currently Drill implements Semi-Join (see DRILL-402) by using a regular join, with a DISTINCT operator under the build upstream side to eliminate duplicates. Typically a physical plan for the Semi uses a hash-join, with a hash-aggr performing the DISTINCT (see example below). This effectively builds the same hash table(s) twice - a big waste of time and memory. +Improvement+: Eliminate the Hash-Aggr from the plan, and notify the Hash-Join to perform a Semi-join. The HJ then would just skip the duplicates in its hash table(s), thus performing a Semi -Join. Example: {code} select c.c_first_name, c.c_last_name from dfs.`/data/json/s1/customer` c where c.c_customer_sk in (select s.ss_customer_sk from dfs.`/data/json/s1/store_sales` s) limit 4; {code} And the result plan (see the HJ at 01-03, and the Hash Agg at 01-05): {code} 00-00Screen : rowType = RecordType(ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693752.96 rows, 2.309557672003E7 cpu, 0.0 io, 2.1598011392E9 network, 3.589586176005E7 memory}, id = 1320 00-01 Project(c_first_name=[$1], c_last_name=[$2]) : rowType = RecordType(ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693752.56 rows, 2.309557632004E7 cpu, 0.0 io, 2.1598011392E9 network, 3.589586176005E7 memory}, id = 1319 00-02Project(c_customer_sk=[$1], c_first_name=[$2], c_last_name=[$3], ss_customer_sk=[$0]) : rowType = RecordType(ANY c_customer_sk, ANY c_first_name, ANY c_last_name, ANY ss_customer_sk): rowcount = 4.0, cumulative cost = {4693748.56 rows, 2.309556832004E7 cpu, 0.0 io, 2.1598011392E9 network, 
3.589586176005E7 memory}, id = 1318 00-03 SelectionVectorRemover : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693744.56 rows, 2.309555232004E7 cpu, 0.0 io, 2.1598011392E9 network, 3.589586176005E7 memory}, id = 1317 00-04Limit(fetch=[4]) : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693740.56 rows, 2.309554832004E7 cpu, 0.0 io, 2.1598011392E9 network, 3.589586176005E7 memory}, id = 1316 00-05 UnionExchange : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693736.56 rows, 2.309553232004E7 cpu, 0.0 io, 2.1598011392E9 network, 3.589586176005E7 memory}, id = 1315 01-01SelectionVectorRemover : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693732.56 rows, 2.309550032004E7 cpu, 0.0 io, 2.1597356032E9 network, 3.589586176005E7 memory}, id = 1314 01-02 Limit(fetch=[4]) : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 4.0, cumulative cost = {4693728.56 rows, 2.309549632004E7 cpu, 0.0 io, 2.1597356032E9 network, 3.589586176005E7 memory}, id = 1313 01-03HashJoin(condition=[=($1, $0)], joinType=[inner]) : rowType = RecordType(ANY ss_customer_sk, ANY c_customer_sk, ANY c_first_name, ANY c_last_name): rowcount = 90182.8, cumulative cost = {4693724.56 rows, 2.309548032004E7 cpu, 0.0 io, 2.1597356032E9 network, 3.589586176005E7 memory}, id = 1312 01-05 HashAgg(group=[{0}]) : rowType = RecordType(ANY ss_customer_sk): rowcount = 18036.56, cumulative cost = {4509140.0 rows, 2.182423760005E7 cpu, 0.0 io, 1.4775549952E9 network, 3.491878016004E7 memory}, id = 1309 01-06Project(ss_customer_sk=[$0]) : rowType = RecordType(ANY ss_customer_sk): rowcount = 180365.6, cumulative cost = 
{4328774.4 rows, 2.038131280004E7 cpu, 0.0 io, 1.4775549952E9 network, 3.17443456E7 memory}, id = 1308 01-07 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY ss_customer_sk, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 180365.6, cumulative cost = {4148408.83 rows, 2.020094720003E7 cpu, 0.0 io, 1.4775549952E9 network, 3.17443456E7 memory}, id = 1307 02-01UnorderedMuxExchange : rowType = RecordType(ANY ss_customer_sk, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 180365.6, cumulative cost =
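The essence of the improvement can be sketched in standalone Java (not Drill code): the join's own build-side hash table already skips key duplicates on insert, so the separate DISTINCT (Hash-Aggr), which builds an equivalent hash table, is redundant; each probe row is then emitted at most once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiJoinSketch {
    // Semi-join: a probe row is emitted at most once, if its key exists on
    // the build side; duplicate build keys are skipped at insert time.
    static List<Integer> semiJoin(List<Integer> probeKeys, List<Integer> buildKeys) {
        Set<Integer> build = new HashSet<>();
        for (Integer k : buildKeys) {
            build.add(k);  // add() ignores duplicates: the built-in DISTINCT
        }
        List<Integer> out = new ArrayList<>();
        for (Integer k : probeKeys) {
            if (build.contains(k)) {
                out.add(k);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(semiJoin(List.of(1, 2, 3), List.of(2, 2, 2, 3)));  // [2, 3]
    }
}
```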
[jira] [Resolved] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.
[ https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-6566. - Resolution: Fixed Reviewer: Timothy Farkas Commit ID 71c6c689a083e7496f06e99b4d253f11866ee741 > Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. AGGR OOM at First Phase. > -- > > Key: DRILL-6566 > URL: https://issues.apache.org/jira/browse/DRILL-6566 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Robert Hou >Assignee: Boaz Ben-Zvi >Priority: Critical > Labels: ready-to-commit > Fix For: 1.15.0 > > Attachments: drillbit.log.6566 > > > This is TPCDS Query 66. > Query: tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql > SELECT w_warehouse_name, > w_warehouse_sq_ft, > w_city, > w_county, > w_state, > w_country, > ship_carriers, > year1, > Sum(jan_sales) AS jan_sales, > Sum(feb_sales) AS feb_sales, > Sum(mar_sales) AS mar_sales, > Sum(apr_sales) AS apr_sales, > Sum(may_sales) AS may_sales, > Sum(jun_sales) AS jun_sales, > Sum(jul_sales) AS jul_sales, > Sum(aug_sales) AS aug_sales, > Sum(sep_sales) AS sep_sales, > Sum(oct_sales) AS oct_sales, > Sum(nov_sales) AS nov_sales, > Sum(dec_sales) AS dec_sales, > Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, > Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, > Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, > Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, > Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, > Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, > Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, > Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, > Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, > Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, > Sum(nov_sales / 
w_warehouse_sq_ft) AS nov_sales_per_sq_foot, > Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, > Sum(jan_net) AS jan_net, > Sum(feb_net) AS feb_net, > Sum(mar_net) AS mar_net, > Sum(apr_net) AS apr_net, > Sum(may_net) AS may_net, > Sum(jun_net) AS jun_net, > Sum(jul_net) AS jul_net, > Sum(aug_net) AS aug_net, > Sum(sep_net) AS sep_net, > Sum(oct_net) AS oct_net, > Sum(nov_net) AS nov_net, > Sum(dec_net) AS dec_net > FROM (SELECT w_warehouse_name, > w_warehouse_sq_ft, > w_city, > w_county, > w_state, > w_country, > 'ZOUROS' > \|\| ',' > \|\| 'ZHOU' AS ship_carriers, > d_yearAS year1, > Sum(CASE > WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jan_sales, > Sum(CASE > WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS feb_sales, > Sum(CASE > WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS mar_sales, > Sum(CASE > WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS apr_sales, > Sum(CASE > WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS may_sales, > Sum(CASE > WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jun_sales, > Sum(CASE > WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jul_sales, > Sum(CASE > WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS aug_sales, > Sum(CASE > WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS sep_sales, > Sum(CASE > WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS oct_sales, > Sum(CASE > WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS nov_sales, > Sum(CASE > WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS dec_sales, > Sum(CASE > WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS jan_net, > Sum(CASE > WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS feb_net, > Sum(CASE > WHEN d_moy = 3 THEN 
ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS mar_net, > Sum(CASE > WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS apr_net,
[jira] [Created] (DRILL-6637) Root pom: Release build needs to remove dep to tests in maven-javadoc-plugin
Boaz Ben-Zvi created DRILL-6637: --- Summary: Root pom: Release build needs to remove dep to tests in maven-javadoc-plugin Key: DRILL-6637 URL: https://issues.apache.org/jira/browse/DRILL-6637 Project: Apache Drill Issue Type: Bug Components: Tools, Build Test Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 Error in the "Preparing the release" build: {code} 27111 [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:2.10.3:jar (attach-javadocs) on project drill-fmpp-maven-plugin: MavenReportException: Error while generating Javadoc: artifact not found - Failure to find org.apache.drill.exec:drill-java-exec:jar:tests:1.14.0 in http://conjars.org/repo was cached in the local repository, resolution will not be reattempted until the update interval of conjars has elapsed or updates are forced {code} (Temporary?) fix following Tim's suggestion: removing the dependencies to the test jars in the maven-javadoc-plugin in the drill-root pom. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6632) drill-jdbc-all jar size limit too small for release build
Boaz Ben-Zvi created DRILL-6632: --- Summary: drill-jdbc-all jar size limit too small for release build Key: DRILL-6632 URL: https://issues.apache.org/jira/browse/DRILL-6632 Project: Apache Drill Issue Type: Bug Components: Tools, Build Test Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 Among the changes for DRILL-6294, the limit for the drill-jdbc-all jar file size was increased to 3600, about what was needed to accommodate the new Calcite version. However a Release build requires a slightly larger size (probably due to adding several of those *org.codehaus.plexus.compiler.javac.JavacCompiler6931842185404907145arguments*). Proposed Fix: Increase the size limit to 36,500,000 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6626) Hash Aggregate: Index out of bounds with small output batch size and spilling
Boaz Ben-Zvi created DRILL-6626: --- Summary: Hash Aggregate: Index out of bounds with small output batch size and spilling Key: DRILL-6626 URL: https://issues.apache.org/jira/browse/DRILL-6626 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi This new IOOB failure was seen while trying to recreate the NPE failure in DRILL-6622 (over TPC-DS SF1). The proposed fix for the latter (PR #1391) does not seem to make a difference. This IOOB can easily be created with other large Hash-Agg queries that need to spill. The IOOB was caused after restricting the output batch size (to force many), and the Hash Aggr memory (to force a spill): {code} 0: jdbc:drill:zk=local> alter system set `drill.exec.memory.operator.output_batch_size` = 262144; +---++ | ok |summary | +---++ | true | drill.exec.memory.operator.output_batch_size updated. | +---++ 1 row selected (0.106 seconds) 0: jdbc:drill:zk=local> 0: jdbc:drill:zk=local> alter session set `exec.errors.verbose` = true; +---+---+ | ok |summary| +---+---+ | true | exec.errors.verbose updated. | +---+---+ 1 row selected (0.081 seconds) 0: jdbc:drill:zk=local> 0: jdbc:drill:zk=local> alter session set `exec.hashagg.mem_limit` = 16777216; +---+--+ | ok | summary | +---+--+ | true | exec.hashagg.mem_limit updated. | +---+--+ 1 row selected (0.089 seconds) 0: jdbc:drill:zk=local> 0: jdbc:drill:zk=local> SELECT c_customer_id FROM dfs.`/data/tpcds/sf1/parquet/customer` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT ca_address_id FROM dfs.`/data/tpcds/sf1/parquet/customer_address` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT cd_credit_rating FROM dfs.`/data/tpcds/sf1/parquet/customer_demographics` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT hd_buy_potential FROM dfs.`/data/tpcds/sf1/parquet/household_demographics` . . . . . . . . . . . > UNION . . . . . . . . . . . 
> SELECT i_item_id FROM dfs.`/data/tpcds/sf1/parquet/item` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT p_promo_id FROM dfs.`/data/tpcds/sf1/parquet/promotion` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT t_time_id FROM dfs.`/data/tpcds/sf1/parquet/time_dim` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT d_date_id FROM dfs.`/data/tpcds/sf1/parquet/date_dim` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT s_store_id FROM dfs.`/data/tpcds/sf1/parquet/store` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT w_warehouse_id FROM dfs.`/data/tpcds/sf1/parquet/warehouse` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT sm_ship_mode_id FROM dfs.`/data/tpcds/sf1/parquet/ship_mode` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT r_reason_id FROM dfs.`/data/tpcds/sf1/parquet/reason` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT cc_call_center_id FROM dfs.`/data/tpcds/sf1/parquet/call_center` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT web_site_id FROM dfs.`/data/tpcds/sf1/parquet/web_site` . . . . . . . . . . . > UNION . . . . . . . . . . . > SELECT wp_web_page_id FROM dfs.`/data/tpcds/sf1/parquet/web_page` . . . . . . . . . . . > UNION . . . . . . . . . . . 
> SELECT cp_catalog_page_id FROM dfs.`/data/tpcds/sf1/parquet/catalog_page`; Error: SYSTEM ERROR: IndexOutOfBoundsException: Index: 26474, Size: 7 Fragment 4:0 [Error Id: d44e64ea-f474-436e-94b0-61c61eec2227 on 172.30.8.176:31020] (java.lang.IndexOutOfBoundsException) Index: 26474, Size: 7 java.util.ArrayList.rangeCheck():653 java.util.ArrayList.get():429 org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.rehash():293 org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.access$1300():120 org.apache.drill.exec.physical.impl.common.HashTableTemplate.resizeAndRehashIfNeeded():805 org.apache.drill.exec.physical.impl.common.HashTableTemplate.put():682 org.apache.drill.exec.physical.impl.aggregate.HashAggTemplate.checkGroupAndAggrValues():1379 org.apache.drill.exec.physical.impl.aggregate.HashAggTemplate.doWork():604 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():273 org.apache.drill.exec.record.AbstractRecordBatch.next():172
[jira] [Created] (DRILL-6625) Intermittent failures in Kafka unit tests
Boaz Ben-Zvi created DRILL-6625: --- Summary: Intermittent failures in Kafka unit tests Key: DRILL-6625 URL: https://issues.apache.org/jira/browse/DRILL-6625 Project: Apache Drill Issue Type: Bug Components: Storage - Other Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Assignee: Abhishek Ravi Fix For: 1.15.0 The following failures have been seen (consistently on my Mac, or occasionally on Jenkins) when running the unit tests, in the Kafka test suite. After the failure, Maven hangs for a long time. Cost was 0.0 (instead of 26.0) : {code:java} Running org.apache.drill.exec.store.kafka.KafkaFilterPushdownTest 16:46:57.748 [main] ERROR org.apache.drill.TestReporter - Test Failed (d: -65.3 KiB(73.6 KiB), h: -573.5 MiB(379.5 MiB), nh: 1.2 MiB(117.1 MiB)): testPushdownWithOr(org.apache.drill.exec.store.kafka.KafkaFilterPushdownTest) java.lang.AssertionError: Unable to find expected string "kafkaScanSpec" : { "topicName" : "drill-pushdown-topic" }, "cost" : 26.0 in plan: { "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ { "kind" : "STRING", "accessibleScopes" : "ALL", "name" : "store.kafka.record.reader", "string_val" : "org.apache.drill.exec.store.kafka.decoders.JsonMessageReader", "scope" : "SESSION" }, { "kind" : "LONG", "accessibleScopes" : "ALL", "name" : "planner.width.max_per_node", "num_val" : 2, "scope" : "SESSION" }, { "kind" : "BOOLEAN", "accessibleScopes" : "ALL", "name" : "exec.errors.verbose", "bool_val" : true, "scope" : "SESSION" }, { "kind" : "LONG", "accessibleScopes" : "ALL", "name" : "store.kafka.poll.timeout", "num_val" : 200, "scope" : "SESSION" } ], "queue" : 0, "hasResourcePlan" : false, "resultMode" : "EXEC" }, "graph" : [ { "pop" : "kafka-scan", "@id" : 6, "userName" : "", "kafkaStoragePluginConfig" : { "type" : "kafka", "kafkaConsumerProps" : { "bootstrap.servers" : "127.0.0.1:63751", "group.id" : "drill-test-consumer" }, "enabled" : true }, "columns" : [ "`**`" 
], "kafkaScanSpec" : { "topicName" : "drill-pushdown-topic" }, "cost" : 0.0 }, { {code} Or occasionally: {code} --- T E S T S --- 11:52:57.571 [main] ERROR o.a.d.e.s.k.KafkaMessageGenerator - org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received. java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received. {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6577) Change Hash-Join default to not fallback (into pre-1.14 unlimited memory)
Boaz Ben-Zvi created DRILL-6577: --- Summary: Change Hash-Join default to not fallback (into pre-1.14 unlimited memory) Key: DRILL-6577 URL: https://issues.apache.org/jira/browse/DRILL-6577 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 Change the default for `drill.exec.hashjoin.fallback.enabled` to *false* (same as for the similar Hash-Agg option). This would force users to calculate and assign sufficient memory for the query, or explicitly choose to fall back. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
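For illustration only (this statement is an assumption, not part of the issue): if the option remains settable per session under the name above, a user who prefers the pre-1.14 unlimited-memory behavior could opt back in explicitly:

```sql
-- Hypothetical opt-in after the default changes to false:
ALTER SESSION SET `drill.exec.hashjoin.fallback.enabled` = true;
```

Leaving the default at false surfaces insufficient memory as a clear error with a remedy, rather than as unbounded allocation.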
[jira] [Created] (DRILL-6543) Options for memory mgmt: Reserve allowance for non-buffered, and Hash-Join default to not fallback
Boaz Ben-Zvi created DRILL-6543: --- Summary: Options for memory mgmt: Reserve allowance for non-buffered, and Hash-Join default to not fallback Key: DRILL-6543 URL: https://issues.apache.org/jira/browse/DRILL-6543 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 Changes to options related to memory budgeting: (1) Change the default for "drill.exec.hashjoin.fallback.enabled" to *false* (same as for the similar Hash-Agg option). This would force users to calculate and assign sufficient memory for the query, or explicitly choose to fall back. (2) When the "planner.memory.max_query_memory_per_node" (MQMPN) option is set equal (or "nearly equal") to the allocated *Direct Memory*, an OOM is still possible. The reason is that the memory used by the "non-buffered" operators is not taken into account. For example: MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. When other non-buffered operators (e.g., a Scanner, or a Sender) also grab some of the Direct Memory, less than 100 MB is left available. And if all those 5 Hash-Joins are pushing their limits, one HJ may have allocated only 12 MB so far, but on the next 1 MB allocation it will hit an OOM (from the JVM, as all 100 MB of the Direct memory is already used). A solution -- a new option to _*reserve*_ some of the Direct Memory for the non-buffered operators (e.g., 25% by default). This *allowance* may prevent many cases like the example above. The new option would return an error (when a query initiates) if the MQMPN is set too high. Note that this option +cannot+ address concurrent queries. This should also apply to the alternative for the MQMPN - the {{"planner.memory.percent_per_query"}} option (PPQ). 
The PPQ does not _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation explains this issue (that doc suggests reserving a 50% allowance, as it was written when the Hash-Join was non-buffered, i.e., before spill was implemented). The memory given to the buffered operators is the higher of the figures calculated from the MQMPN and the PPQ. The new reserve option would verify that this figure leaves room for the allowance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
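The arithmetic in the example above can be sketched as follows. This is illustrative only -- the helper name and the 25% reserve fraction are assumptions taken from the issue text, not Drill code:

```java
// Illustrative arithmetic only -- not Drill code. The method name and the
// reserve fraction are assumptions based on the example in the issue text.
public class MemoryBudgetSketch {
    // Memory promised to each buffered operator after reserving a fraction
    // of direct memory for the non-buffered operators (scanners, senders).
    static long perBufferedOperator(long directMemory, double reserveFraction,
                                    int bufferedOperators) {
        long usable = (long) (directMemory * (1.0 - reserveFraction));
        return usable / bufferedOperators;
    }

    public static void main(String[] args) {
        long direct = 100L * 1024 * 1024; // 100 MB, as in the example
        // Without a reserve, each of the 5 Hash-Joins is promised 20 MB,
        // yet the non-buffered operators draw from the same 100 MB pool.
        System.out.println(perBufferedOperator(direct, 0.0, 5));
        // With a 25% allowance, only 75 MB is divided up (15 MB each),
        // leaving headroom for the non-buffered operators.
        System.out.println(perBufferedOperator(direct, 0.25, 5));
    }
}
```

With these numbers, each Hash-Join is promised 15 MB instead of 20 MB, leaving 25 MB of direct memory for scanners and senders.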
[jira] [Created] (DRILL-6487) Negative row count when selecting from a json file with an OFFSET clause
Boaz Ben-Zvi created DRILL-6487: --- Summary: Negative row count when selecting from a json file with an OFFSET clause Key: DRILL-6487 URL: https://issues.apache.org/jira/browse/DRILL-6487 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Fix For: 1.14.0 This simple query fails: {code} select * from dfs.`/data/foo.json` offset 1 row; {code} where foo.json is {code} {"key": "aa", "sales": 11} {"key": "bb", "sales": 22} {code} The error returned is: {code} 0: jdbc:drill:zk=local> select * from dfs.`/data/foo.json` offset 1 row; Error: SYSTEM ERROR: AssertionError [Error Id: 960d66a9-b480-4a7e-9a25-beb4928e8139 on 10.254.130.25:31020] (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: null org.apache.drill.exec.work.foreman.Foreman.run():282 java.util.concurrent.ThreadPoolExecutor.runWorker():1142 java.util.concurrent.ThreadPoolExecutor$Worker.run():617 java.lang.Thread.run():745 Caused By (java.lang.AssertionError) null org.apache.calcite.rel.metadata.RelMetadataQuery.isNonNegative():900 org.apache.calcite.rel.metadata.RelMetadataQuery.validateResult():919 org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount():236 org.apache.calcite.rel.SingleRel.estimateRowCount():68 org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier$MajorFragmentStat.add():103 org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel():76 org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel():32 org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitProject():50 org.apache.drill.exec.planner.physical.ProjectPrel.accept():98 org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitScreen():63 org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitScreen():32 org.apache.drill.exec.planner.physical.ScreenPrel.accept():65 
org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.removeExcessiveEchanges():41 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():557 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():179 org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83 org.apache.drill.exec.work.foreman.Foreman.runSQL():567 org.apache.drill.exec.work.foreman.Foreman.run():264 java.util.concurrent.ThreadPoolExecutor.runWorker():1142 java.util.concurrent.ThreadPoolExecutor$Worker.run():617 java.lang.Thread.run():745 (state=,code=0) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6479) Support for EMIT outcome in Hash Aggregate
Boaz Ben-Zvi created DRILL-6479: --- Summary: Support for EMIT outcome in Hash Aggregate Key: DRILL-6479 URL: https://issues.apache.org/jira/browse/DRILL-6479 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 With the new Lateral and Unnest -- if a Hash-Aggregate operator is present in the sub-query, then it needs to handle the EMIT outcome correctly. This means that when an EMIT is received, the operator should perform the aggregation on the records buffered so far and produce output from it. After handling an EMIT, the Hash-Aggr should refresh its state and continue to work on the next batches of incoming records, unless an EMIT is seen again. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
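The required behavior can be sketched as follows; this is a minimal illustration under assumed names (`EmitAwareAggSketch` and its methods are hypothetical), not Drill's HashAgg code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of EMIT handling in a blocking aggregation: on EMIT,
// aggregate and output what was buffered so far, then refresh state and
// continue with the next incoming batches. All names are hypothetical.
class EmitAwareAggSketch {
    enum Outcome { OK, EMIT, NONE }

    private final Map<String, Long> sums = new HashMap<>();
    private final List<Map<String, Long>> outputs = new ArrayList<>();

    void consume(String key, long value) {     // one incoming record
        sums.merge(key, value, Long::sum);
    }

    void onOutcome(Outcome o) {
        if (o == Outcome.EMIT || o == Outcome.NONE) {
            outputs.add(new HashMap<>(sums));  // output the buffered aggregation
            sums.clear();                      // refresh state for what follows
        }
    }

    static List<Map<String, Long>> demo() {
        EmitAwareAggSketch agg = new EmitAwareAggSketch();
        agg.consume("a", 1);
        agg.consume("a", 2);
        agg.onOutcome(Outcome.EMIT);           // sub-query boundary: emits {a=3}
        agg.consume("a", 5);
        agg.onOutcome(Outcome.NONE);           // end of input: emits {a=5}
        return agg.outputs;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```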
[jira] [Created] (DRILL-6475) Unnest: Null fieldId Pointer
Boaz Ben-Zvi created DRILL-6475: --- Summary: Unnest: Null fieldId Pointer Key: DRILL-6475 URL: https://issues.apache.org/jira/browse/DRILL-6475 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Reporter: Boaz Ben-Zvi Assignee: Parth Chandra Fix For: 1.14.0 Executing the following (in TestE2EUnnestAndLateral.java) causes an NPE, as `fieldId` is null in `schemaChanged()`:
```
@Test
public void testMultipleBatchesLateral_twoUnnests() throws Exception {
  String sql = "SELECT t5.l_quantity FROM dfs.`lateraljoin/multipleFiles/` t, LATERAL " +
      "(SELECT t2.ordrs FROM UNNEST(t.c_orders) t2(ordrs)) t3(ordrs), LATERAL " +
      "(SELECT t4.l_quantity FROM UNNEST(t3.ordrs) t4(l_quantity)) t5";
  test(sql);
}
```
And the error is:
```
Error: SYSTEM ERROR: NullPointerException
Fragment 0:0
[Error Id: 25f42765-8f68-418e-840a-ffe65788e1e2 on 10.254.130.25:31020]
(java.lang.NullPointerException) null
org.apache.drill.exec.physical.impl.unnest.UnnestRecordBatch.schemaChanged():381
org.apache.drill.exec.physical.impl.unnest.UnnestRecordBatch.innerNext():199
org.apache.drill.exec.record.AbstractRecordBatch.next():172
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():229
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.physical.impl.join.LateralJoinBatch.prefetchFirstBatchFromBothSides():241
org.apache.drill.exec.physical.impl.join.LateralJoinBatch.buildSchema():264
org.apache.drill.exec.record.AbstractRecordBatch.next():152
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():229
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
org.apache.drill.exec.record.AbstractRecordBatch.next():172
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():229
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
org.apache.drill.exec.record.AbstractRecordBatch.next():172
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():229
org.apache.drill.exec.physical.impl.BaseRootExec.next():103
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
org.apache.drill.exec.physical.impl.BaseRootExec.next():93
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():292
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():279
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():279
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
(state=,code=0)
```
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6444) Hash Join: Avoid partitioning when memory is sufficient
Boaz Ben-Zvi created DRILL-6444: --- Summary: Hash Join: Avoid partitioning when memory is sufficient Key: DRILL-6444 URL: https://issues.apache.org/jira/browse/DRILL-6444 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi The Hash Join Spilling feature introduced partitioning (of the incoming build side), which adds some overhead (copying the incoming data, row by row). That happens even when no spilling is needed. Suggested optimization: Try reading the incoming build data without partitioning, while checking that enough memory is available. If the whole build side (plus the hash table) fits in memory, continue as a "single partition". If not, partition the data read so far and continue as usual (with partitions). (See optimization 8.1 in the Hash Join Spill design document: [https://docs.google.com/document/d/1-c_oGQY4E5d58qJYv_zc7ka834hSaB3wDQwqKcMoSAI/edit] ) This is currently implemented only for the case of num_partitions = 1 (i.e., no spilling and no memory checking). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
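The suggested optimization can be sketched roughly as below. Everything here (the names, the crude size accounting, the use of plain int batches) is an illustrative assumption, not Drill's implementation:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch: read build-side batches without partitioning while memory allows;
// only when the budget is exceeded, repartition what was read so far and
// continue as the (spilling) partitioned code path would.
class AdaptiveBuildSketch {
    static List<List<int[]>> build(Iterator<int[]> batches, long memoryBudget,
                                   int numPartitions) {
        List<int[]> single = new ArrayList<>();
        long used = 0;
        while (batches.hasNext()) {
            int[] batch = batches.next();
            used += batch.length * 4L;         // crude size of an int batch
            single.add(batch);
            if (used > memoryBudget) {
                // Fall back: split the rows read so far into partitions.
                List<List<int[]>> parts = new ArrayList<>();
                for (int i = 0; i < numPartitions; i++) {
                    parts.add(new ArrayList<>());
                }
                for (int[] b : single) {
                    for (int v : b) {
                        parts.get(Math.floorMod(v, numPartitions)).add(new int[]{v});
                    }
                }
                // (Remaining batches would be partitioned on arrival.)
                return parts;
            }
        }
        return List.of(single);                // everything fit: one partition
    }

    // Helper for demonstration: 3 rows, 4 candidate partitions.
    static int partitionsUsed(long memoryBudget) {
        List<int[]> batches = List.of(new int[]{1, 2, 3});
        return build(batches.iterator(), memoryBudget, 4).size();
    }
}
```

The point of the design is that the common case (everything fits) pays no row-copy cost at all, while the rare case pays a one-time repartitioning of the rows already read.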
[jira] [Created] (DRILL-6400) Hash-Aggr: Avoid recreating common Hash-Table setups for every partition
Boaz Ben-Zvi created DRILL-6400: --- Summary: Hash-Aggr: Avoid recreating common Hash-Table setups for every partition Key: DRILL-6400 URL: https://issues.apache.org/jira/browse/DRILL-6400 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 The current Hash-Aggr code (and soon the Hash-Join code) creates multiple partitions to hold the incoming data, each partition with its own HashTable. The current code invokes the HashTable method _createAndSetupHashTable()_ for *each* partition. But most of the setup done by this method is identical for all the partitions (e.g., code generation). Calling this method has a performance cost (some local tests measured between 3 and 30 milliseconds, depending on the key columns). Suggested performance improvement: Extract the common settings to be done *once*, and use the results later for all the partitions. When running with the default 32 partitions, this can yield a measurable improvement (and when spilling, this method is used again). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
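The suggested improvement can be sketched as below; `expensiveCommonSetup` is a hypothetical stand-in for the partition-independent work (such as code generation), not a Drill API:

```java
// Sketch: perform the common, partition-independent setup once and let all
// partitions reuse the result, instead of repeating it per partition.
class SharedSetupSketch {
    static int setupCalls = 0;

    // Stand-in for the expensive common work (code generation, etc.).
    static String expensiveCommonSetup() {
        setupCalls++;
        return "generated-hash-table-code";
    }

    // Returns how many times the common setup ran for numPartitions partitions.
    static int setupCallsFor(int numPartitions) {
        setupCalls = 0;
        String common = expensiveCommonSetup();  // once, not numPartitions times
        for (int i = 0; i < numPartitions; i++) {
            // each partition builds its own hash table, reusing `common`
        }
        return setupCalls;
    }
}
```

With 32 partitions and a 3-30 ms setup cost, hoisting the call saves roughly 0.1-1 second per operator instance (and more when spilling repeats the setup).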
[jira] [Created] (DRILL-6339) New option to disable TopN (for testing Sort)
Boaz Ben-Zvi created DRILL-6339: --- Summary: New option to disable TopN (for testing Sort) Key: DRILL-6339 URL: https://issues.apache.org/jira/browse/DRILL-6339 Project: Apache Drill Issue Type: Improvement Components: Query Planning Optimization Affects Versions: 1.13.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 When a query uses ... ORDER BY ... LIMIT ..., the planner unconditionally picks the TopN operator to (efficiently) implement this order. This precludes an easy way to test the External Sort operator over a large dataset (e.g., to test spilling). A new internal option to disable picking TopN (hence using the External Sort instead) would be useful for various tests (and maybe in other scenarios, e.g., if a bug is found in the TopN). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6162) Enhance record batch sizer to retain nesting information for map columns.
[ https://issues.apache.org/jira/browse/DRILL-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-6162. - Resolution: Fixed Merged with commit ID - 47c5d1feaaaf9b6384ed8ef1011fa58b9272b362 > Enhance record batch sizer to retain nesting information for map columns. > - > > Key: DRILL-6162 > URL: https://issues.apache.org/jira/browse/DRILL-6162 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Affects Versions: 1.12.0 >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Major > Fix For: 1.14.0 > > > Enhance the record batch sizer to maintain the columnSizes in nested fashion > for maps so given a column, we can get sizing information of all children > underneath. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6139) Travis CI hangs on TestVariableWidthWriter#testRestartRow
Boaz Ben-Zvi created DRILL-6139: --- Summary: Travis CI hangs on TestVariableWidthWriter#testRestartRow Key: DRILL-6139 URL: https://issues.apache.org/jira/browse/DRILL-6139 Project: Apache Drill Issue Type: Bug Affects Versions: 1.12.0 Reporter: Boaz Ben-Zvi The Travis CI fails (probably hangs, then times out) in the following test: {code:java} Running org.apache.drill.test.rowSet.test.DummyWriterTest Running org.apache.drill.test.rowSet.test.DummyWriterTest#testDummyScalar Running org.apache.drill.test.rowSet.test.DummyWriterTest#testDummyMap Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.109 sec - in org.apache.drill.test.rowSet.test.DummyWriterTest Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testSkipNulls Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testWrite Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testFillEmpties Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRollover Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testSizeLimit Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRolloverWithEmpties Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRestartRow Killed Results : Tests run: 1554, Failures: 0, Errors: 0, Skipped: 66{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6027) Implement spill to disk for the Hash Join
Boaz Ben-Zvi created DRILL-6027: --- Summary: Implement spill to disk for the Hash Join Key: DRILL-6027 URL: https://issues.apache.org/jira/browse/DRILL-6027 Project: Apache Drill Issue Type: New Feature Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.13.0 Implement the spill memory to disk (as needed) feature for the Hash Join operator (similar to the prior work on the Hash Aggregate). A design draft document has been published: https://docs.google.com/document/d/1-c_oGQY4E5d58qJYv_zc7ka834hSaB3wDQwqKcMoSAI/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5935) Hash Join projects unneeded columns
Boaz Ben-Zvi created DRILL-5935: --- Summary: Hash Join projects unneeded columns Key: DRILL-5935 URL: https://issues.apache.org/jira/browse/DRILL-5935 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Priority: Minor The Hash Join operator projects all its input columns, including unneeded ones, relying on (multiple) project operators downstream to remove those columns. This is significantly wasteful, in both time and space (as each value is copied individually). Instead, the Hash Join itself should not project these unneeded columns. In the following example, the join-key columns need not be projected. However the two hash join operators do project them. Another problem: The join-key columns are copied from BOTH sides (build and probe), which is a waste, as both are IDENTICAL. Last - the plan in this example places the first join under the _build_ side of the second join; and the unneeded column from the first join (the join-key) is taken and finally projected by the second join. The sample query is: {code} select c.c_first_name, c.c_last_name, s.ss_quantity, a.ca_city from dfs.`/data/json/s1/customer` c, dfs.`/data/json/s1/store_sales` s, dfs.`/data/json/s1/customer_address` a where c.c_customer_sk = s.ss_customer_sk and c.c_customer_id = a.ca_address_id; {code} The plan first builds on 'customer_address' and probes with 'customer', and the output projects all 6 columns (2 from 'a', 4 from 'c'). Then the second join builds on all those 6 columns from the first join, and probes from the large table 'store_sales', and finally all 8 columns are projected (see below). Then 3 project operators are used to remove the unneeded columns (see attached profile) - hence more waste. 
{code}
public void projectBuildRecord(int buildIndex, int outIndex) throws SchemaChangeException {
    { vv3.copyFromSafe(((buildIndex) & 65535), (outIndex), vv0[((buildIndex) >>> 16)]); }
    { vv9.copyFromSafe(((buildIndex) & 65535), (outIndex), vv6[((buildIndex) >>> 16)]); }
    { vv15.copyFromSafe(((buildIndex) & 65535), (outIndex), vv12[((buildIndex) >>> 16)]); }
    { vv21.copyFromSafe(((buildIndex) & 65535), (outIndex), vv18[((buildIndex) >>> 16)]); }
    { vv27.copyFromSafe(((buildIndex) & 65535), (outIndex), vv24[((buildIndex) >>> 16)]); }
    { vv33.copyFromSafe(((buildIndex) & 65535), (outIndex), vv30[((buildIndex) >>> 16)]); }
}

public void projectProbeRecord(int probeIndex, int outIndex) throws SchemaChangeException {
    { vv39.copyFromSafe((probeIndex), (outIndex), vv36); }
    { vv45.copyFromSafe((probeIndex), (outIndex), vv42); }
}
{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5915) Streaming aggregate with limit query does not return
Boaz Ben-Zvi created DRILL-5915: --- Summary: Streaming aggregate with limit query does not return Key: DRILL-5915 URL: https://issues.apache.org/jira/browse/DRILL-5915 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Reading a 1M-row table, in embedded mode, using sort + streaming-aggr -- the work completes, but the query does not return (see attached profile) {code} alter session set `planner.enable_hashagg` = false; select b.g, b.s from (select gby_int32, gby_date g, gby_int32_rand, sum(int32_field) s from dfs.`/data/PARQUET-1M.parquet` group by gby_int32, gby_date, gby_int32_rand) b limit 30; {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5852) Display the "memory per query per node" in the query's profile
Boaz Ben-Zvi created DRILL-5852: --- Summary: Display the "memory per query per node" in the query's profile Key: DRILL-5852 URL: https://issues.apache.org/jira/browse/DRILL-5852 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators, Web Server Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Priority: Minor Fix For: 1.12.0 Following DRILL-5815, the memory assigned per query per node is calculated before the query is run (in MemoryAllocationUtilities.java). It would be useful to show the final figure in the query's profile (in the Web UI), both for users and for developers or QA. (Maybe also show the memory per "buffered operator" -- currently the figure above divided by the number of such operators and the max concurrency.) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-5715) Performance of refactored HashAgg operator regressed
[ https://issues.apache.org/jira/browse/DRILL-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-5715. - Resolution: Fixed Reviewer: Paul Rogers The commit for DRILL-5694 (PR #938) also solves this performance bug (basically it removed calls to Setup before every hash computation, plus a few small changes, like replacing setSafe with set). > Performance of refactored HashAgg operator regressed > > > Key: DRILL-5715 > URL: https://issues.apache.org/jira/browse/DRILL-5715 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Codegen >Affects Versions: 1.11.0 > Environment: 10-node RHEL 6.4 (32 Core, 256GB RAM) >Reporter: Kunal Khatua >Assignee: Boaz Ben-Zvi > Labels: performance, regression > Fix For: 1.12.0 > > Attachments: 26736242-d084-6604-aac9-927e729da755.sys.drill, > 26736615-9e86-dac9-ad77-b022fd791f67.sys.drill, > 2675cc73-9481-16e0-7d21-5f1338611e5f.sys.drill, > 2675de42-3789-47b8-29e8-c5077af136db.sys.drill, drill-1.10.0_callTree.png, > drill-1.10.0_hotspot.png, drill-1.11.0_callTree.png, drill-1.11.0_hotspot.png > > > When running the following simple HashAgg-based query on a TPCH-table - > Lineitem with 6Billion rows on a 10 node setup (with a single partition to > disable any possible spilling to disk) > {code:sql} > select count(*) > from ( > select l_quantity > , count(l_orderkey) > from lineitem > group by l_quantity > ) {code} > the runtime increased from {{7.378 sec}} to {{11.323 sec}} [reported by the > JDBC client]. > To disable spill-to-disk in Drill-1.11.0, the {{drill-override.conf}} was > modified to > {code}drill.exec.hashagg.num_partitions : 1{code} > Attached are two profiles > Drill 1.10.0 : [^2675cc73-9481-16e0-7d21-5f1338611e5f.sys.drill] > Drill 1.11.0 : [^2675de42-3789-47b8-29e8-c5077af136db.sys.drill] > A separate run was done for both scenarios with the > {{planner.width.max_per_node=10}} and profiled with YourKit. 
> Image snippets are attached, indicating the hotspots in both builds: > *Drill 1.10.0* : > Profile: [^26736242-d084-6604-aac9-927e729da755.sys.drill] > CallTree: [^drill-1.10.0_callTree.png] > HotSpot: [^drill-1.10.0_hotspot.png] > !drill-1.10.0_hotspot.png|drill-1.10.0_hotspot! > *Drill 1.11.0* : > Profile: [^26736615-9e86-dac9-ad77-b022fd791f67.sys.drill] > CallTree: [^drill-1.11.0_callTree.png] > HotSpot: [^drill-1.11.0_hotspot.png] > !drill-1.11.0_hotspot.png|drill-1.11.0_hotspot! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-5740) hash agg fail to read spill file
[ https://issues.apache.org/jira/browse/DRILL-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-5740. - Resolution: Fixed Fix Version/s: 1.12.0 The commit for DRILL-5694 (PR #938) also solves this bug (basically removed an unneeded closing of the SpillSet). > hash agg fail to read spill file > > > Key: DRILL-5740 > URL: https://issues.apache.org/jira/browse/DRILL-5740 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.12.0 >Reporter: Chun Chang >Assignee: Boaz Ben-Zvi >Priority: Blocker > Fix For: 1.12.0 > > > -Build: | 1.12.0-SNAPSHOT | 11008d029bafa36279e3045c4ed1a64366080620 > -Multi-node drill cluster > Running a query causing hash agg spill fails with the following error. And > this seems to be a regression. > {noformat} > Execution Failures: > /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg5.q > Query: > select gby_date, gby_int32_rand, sum(int32_field), avg(float_field), > min(boolean_field), count(double_rand) from > dfs.`/drill/testdata/hagg/PARQUET-500M.parquet` group by gby_date, > gby_int32_rand order by gby_date, gby_int32_rand limit 30 > Failed with exception > java.sql.SQLException: SYSTEM ERROR: FileNotFoundException: File > /tmp/drill/spill/10.10.30.168-31010/265f91f9-78d2-78a6-68ad-4709674efe0a_HashAgg_1-4-34/spill3 > does not exist > Fragment 1:34 > [Error Id: 291a79f8-9b7a-485d-9404-e7b7fe1d8f1e on 10.10.30.168:31010] > (java.lang.RuntimeException) java.io.FileNotFoundException: File > /tmp/drill/spill/10.10.30.168-31010/265f91f9-78d2-78a6-68ad-4709674efe0a_HashAgg_1-4-34/spill3 > does not exist > > org.apache.drill.exec.physical.impl.aggregate.SpilledRecordbatch.():67 > > org.apache.drill.exec.test.generated.HashAggregatorGen1891.outputCurrentBatch():980 > org.apache.drill.exec.test.generated.HashAggregatorGen1891.doWork():617 > org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168 > 
org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():415 > org.apache.hadoop.security.UserGroupInformation.doAs():1595 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():745 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5755) TOP_N_SORT operator does not free memory while running
Boaz Ben-Zvi created DRILL-5755: --- Summary: TOP_N_SORT operator does not free memory while running Key: DRILL-5755 URL: https://issues.apache.org/jira/browse/DRILL-5755 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi The TOP_N_SORT operator should keep the top N rows while processing its input, and free the memory used to hold all rows below the top N. For example, the following query uses a table with 125M rows: {code} select row_count, sum(row_count), avg(double_field), max(double_rand), count(float_rand) from dfs.`/data/tmp` group by row_count order by row_count limit 30; {code} It failed with an OOM while each of the 3 TOP_N_SORT operators was holding about 2.44 GB (see attached profile). Holding just 30 rows should take far less memory! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
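The expected behavior can be sketched with a bounded min-heap: memory stays proportional to N (30 in the query above), not to the 125M input rows. This is an illustration of the technique, not Drill's TopN code:

```java
import java.util.PriorityQueue;

// Keep only the current top-N values; anything that falls below the
// N-th largest is freed immediately instead of being retained.
class TopNSketch {
    static PriorityQueue<Long> topN(long[] input, int n) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap
        for (long v : input) {
            heap.offer(v);
            if (heap.size() > n) {
                heap.poll();   // eagerly discard a below-top-N row
            }
        }
        return heap;           // never holds more than n entries
    }

    public static void main(String[] args) {
        System.out.println(topN(new long[]{5, 1, 9, 3, 7}, 3));
    }
}
```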
[jira] [Created] (DRILL-5728) Hash Aggregate: Useless bigint value vector in the values batch
Boaz Ben-Zvi created DRILL-5728: --- Summary: Hash Aggregate: Useless bigint value vector in the values batch Key: DRILL-5728 URL: https://issues.apache.org/jira/browse/DRILL-5728 Project: Apache Drill Issue Type: Improvement Components: Execution - Codegen Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Priority: Minor When aggregating a non-nullable column (like *sum(l_partkey)* below), the code generation creates an extra value vector (in addition to the actual "sum" vector) which is used as a "nonNullCount". This is useless (as the underlying column is non-nullable), and wastes considerable memory (8 bytes * 64K rows = 512 KB per aggregated value in each batch!). Example query: select sum(l_partkey) as slpk from cp.`tpch/lineitem.parquet` group by l_orderkey; And as can be seen in the generated code below, the bigint value vector *vv5* is only used to hold a *1* flag to note "not null":
{code}
public void updateAggrValuesInternal(int incomingRowIdx, int htRowIdx) throws SchemaChangeException {
    {
        IntHolder out11 = new IntHolder();
        {
            out11.value = vv8.getAccessor().get((incomingRowIdx));
        }
        IntHolder in = out11;
        work0.value = vv1.getAccessor().get((htRowIdx));
        BigIntHolder value = work0;
        work4.value = vv5.getAccessor().get((htRowIdx));
        BigIntHolder nonNullCount = work4;
        SumFunctions$IntSum_add: {
            nonNullCount.value = 1;
            value.value += in.value;
        }
        work0 = value;
        vv1.getMutator().set((htRowIdx), work0.value);
        work4 = nonNullCount;
        vv5.getMutator().set((htRowIdx), work4.value);
    }
}
{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5665) planner.force_2phase.aggr Set to TRUE for HashAgg may cause wrong results for VARIANCE and STD_DEV
Boaz Ben-Zvi created DRILL-5665: --- Summary: planner.force_2phase.aggr Set to TRUE for HashAgg may cause wrong results for VARIANCE and STD_DEV Key: DRILL-5665 URL: https://issues.apache.org/jira/browse/DRILL-5665 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.11.0

*planner.force_2phase.aggr* was added for testing the hash aggregate's 2-phase spill-to-disk implementation. However, if it is set to true, the streaming aggregate will run in two phases too and return wrong results for some functions, such as variance() and std_dev().

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5616) Hash Agg Spill: OOM while reading irregular varchar data
Boaz Ben-Zvi created DRILL-5616: --- Summary: Hash Agg Spill: OOM while reading irregular varchar data Key: DRILL-5616 URL: https://issues.apache.org/jira/browse/DRILL-5616 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.11.0

An OOM occurred while aggregating a table of two varchar columns whose sizes vary significantly (about 8 bytes long on average, but 250 bytes max):

{code}
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.memory.max_query_memory_per_node` = 327127360;
select count(*) from (select max(`filename`) from dfs.`/drill/testdata/hash-agg/data2` group by no_nulls_col, nulls_col) d;
{code}

Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. OOM at Second Phase. Partitions: 2. Estimated batch size: 12255232. Planned batches: 0. Rows spilled so far: 434127447 Memory limit: 163563680 so far allocated: 150601728. Fragment 1:0

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
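A rough sketch of the sizing arithmetic behind this kind of failure, assuming a varchar column in a 64K-row batch is budgeted as a 4-byte offset entry plus an assumed data width per value. The 8-byte average and 250-byte maximum are taken from the report above, but the formula itself is illustrative and is not Drill's actual batch-size estimator: budgeting by the maximum width inflates the estimate by more than 20x over the average.

```java
class BatchEstimate {
    // Rough per-batch memory estimate for one varchar column:
    // a 4-byte offset entry plus an assumed data width, per row.
    static long estimate(int rows, int assumedBytesPerValue) {
        return (long) rows * (4 + assumedBytesPerValue);
    }
}
```

The gap between the average-based and maximum-based estimates is what leaves either too little headroom (OOM) or far too much spilling.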
[jira] [Created] (DRILL-5588) Hash Aggregate: Avoid copy on output of aggregate columns
Boaz Ben-Zvi created DRILL-5588: --- Summary: Hash Aggregate: Avoid copy on output of aggregate columns Key: DRILL-5588 URL: https://issues.apache.org/jira/browse/DRILL-5588 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Boaz Ben-Zvi

When the Hash Aggregate operator outputs its result batches downstream, the key columns (value vectors) are returned as is, but for the aggregate columns new value vectors are allocated and the values are copied. This has an impact on performance (see the method allocateOutgoing()). A second effect is on memory management, as this allocation is not planned for by the code that controls spilling, etc.

For some simple aggregate functions (e.g. SUM), the stored value vectors for the aggregate values can be returned as is. For functions like AVG, there is a need to divide the SUM values by the COUNT values; still, this can be done in place (over the SUM values) and avoid a new allocation and copy.

For VarChar type aggregate values (only used by MAX or MIN), there is another issue -- currently any such value vector is allocated as an ObjectVector (see BatchHolder()), on the JVM heap rather than in direct memory. This is to manage the sizes of the values, which could change as the aggregation progresses (e.g., for MAX(name) -- the first record has 'abe', but the next record has 'benjamin', which is both bigger ('b' > 'a') and longer). For the final output, this requires a new allocation and a copy in order to have a compact value vector in direct memory. Maybe the ObjectVector could be replaced with some direct-memory implementation that is optimized for "good" values (e.g., all of similar size) but penalizes "bad" values (e.g., reallocates or moves values when needed)?

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
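The in-place AVG finalize suggested above can be sketched with plain arrays standing in for the SUM and COUNT value vectors (illustrative only, not Drill's code; mapping empty groups to NaN is an arbitrary choice here):

```java
class AvgFinalize {
    // In-place finalize for AVG: overwrite each SUM slot with SUM/COUNT,
    // so the existing sum vector can be handed downstream without a new
    // allocation-and-copy. Empty groups become NaN (an arbitrary choice).
    static void finalizeAvg(double[] sums, long[] counts) {
        for (int i = 0; i < sums.length; i++) {
            sums[i] = (counts[i] == 0) ? Double.NaN : sums[i] / counts[i];
        }
    }
}
```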
[jira] [Created] (DRILL-5457) Support Spill to Disk for the Hash Aggregate Operator
Boaz Ben-Zvi created DRILL-5457: --- Summary: Support Spill to Disk for the Hash Aggregate Operator Key: DRILL-5457 URL: https://issues.apache.org/jira/browse/DRILL-5457 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.11.0

Support gradually spilling memory to disk as the available memory becomes too small to allow in-memory work for the Hash Aggregate Operator.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5446) Offset Vector in VariableLengthVectors may waste up to 256KB per value vector
Boaz Ben-Zvi created DRILL-5446: --- Summary: Offset Vector in VariableLengthVectors may waste up to 256KB per value vector Key: DRILL-5446 URL: https://issues.apache.org/jira/browse/DRILL-5446 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.11.0

In exec/vector/src/main/codegen/templates/VariableLengthVectors.java, the implementation uses an "offset vector" to note the BEGINNING of each variable length element. In order to find the length (i.e. the END of the element), one needs to look at the FOLLOWING element. This requires the "offset vector" to have ONE MORE entry than the total number of elements -- in order to find the END of the LAST element.

Some places in the code (e.g., the hash table) use the maximum number of elements - 64K ( = 65536 ). Each entry in the "offset vector" is a 4-byte UInt4, hence seemingly needing 256KB. However, because of that "ONE MORE", the code in this case allocates for 65537 entries, thus (rounding to the next power of 2) allocating 512KB, where half is never used. (And this is for each varchar value vector, in each batch; e.g., in the qa test Functional/aggregates/tpcds_variants/text/aggregate25.q, where there are 10 key columns, each hash-table batch wastes 2.5MB!)

Possible fix: change the logic in VariableLengthVectors.java to keep the END point of each variable length element -- the first element's beginning is always ZERO, so it need not be kept.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
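The arithmetic above can be checked with a small sketch, assuming a power-of-2 allocator: the begin-offset scheme's 65537 entries push the buffer from 256KB + 4 bytes up to 512KB, while the proposed end-offset scheme needs exactly 65536 entries and stays at 256KB (the helper names here are made up for illustration):

```java
class OffsetVectorSize {
    static long nextPowerOfTwo(long n) {
        long p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    // Begin-offset scheme: 64K values need 64K + 1 offset entries of 4 bytes,
    // and the power-of-2 allocator rounds 256KB + 4 bytes up to 512KB.
    static long beginOffsetAlloc(int values) {
        return nextPowerOfTwo(4L * (values + 1));
    }

    // End-offset scheme: the first element always starts at 0, so only
    // 64K entries are needed and the allocation stays at exactly 256KB.
    static long endOffsetAlloc(int values) {
        return nextPowerOfTwo(4L * values);
    }
}
```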
[jira] [Created] (DRILL-5442) Managed Sort: IndexOutOfBounds with a join over an inlist
Boaz Ben-Zvi created DRILL-5442: --- Summary: Managed Sort: IndexOutOfBounds with a join over an inlist Key: DRILL-5442 URL: https://issues.apache.org/jira/browse/DRILL-5442 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Boaz Ben-Zvi Assignee: Paul Rogers Fix For: 1.11.0

The following query fails with an IOOB when the managed sort is used, but passes with the old default sort:

0: jdbc:drill:zk=local> alter session set `exec.sort.disable_managed` = false;
+-------+-------------------------------------+
| ok    | summary                             |
+-------+-------------------------------------+
| true  | exec.sort.disable_managed updated.  |
+-------+-------------------------------------+
1 row selected (0.16 seconds)
0: jdbc:drill:zk=local> select * from dfs.`/data/json/s1/date_dim` where d_year in(1990, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919) limit 3;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
Fragment 0:0
[Error Id: 370fd706-c365-421f-b57d-d6ab7fde82df on 10.250.56.251:31010] (state=,code=0)

(The above query was extracted from /root/drillAutomation/framework-master/framework/resources/Functional/tpcds/variants/hive/q4_1.sql)

Note that the inlist must have at least 20 items, in which case the plan becomes a join over a stream-aggregate over a sort over the (inlist's) values. When the IOOB happens, the stack no longer shows the sort, but is probably handling a NONE returned by the last next() on the sort (StreamingAggTemplate.doWork():182).

The "date_dim" can probably be recreated with any data. The one above was taken from:
[root@atsqa6c85 ~]# hadoop fs -ls /drill/testdata/tpcds/json/s1/date_dim
Found 1 items
-rwxr-xr-x 3 root root 50713534 2014-10-14 22:39 /drill/testdata/tpcds/json/s1/date_dim/0_0_0.json

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5293) Poor performance of Hash Table due to same hash value as distribution below
Boaz Ben-Zvi created DRILL-5293: --- Summary: Poor performance of Hash Table due to same hash value as distribution below Key: DRILL-5293 URL: https://issues.apache.org/jira/browse/DRILL-5293 Project: Apache Drill Issue Type: Bug Components: Execution - Codegen Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi

The computation of the hash value is basically the same whether for the Hash Table (used by Hash Agg and Hash Join) or for the distribution of rows at the exchange. As a result, a specific Hash Table (in a parallel minor fragment) gets only rows "filtered out" by the partition below ("upstream"), so the pattern of this filtering leads to non-uniform usage of the hash buckets in the table.

Here is a simplified example: an exchange partitions into TWO (minor fragments), each running a Hash Agg. The partition sends rows of EVEN hash values to the first, and rows of ODD hash values to the second. Now the first recomputes the _same_ hash value for its Hash Table -- and only the even buckets get used! (Or with a partition into EIGHT -- possibly only one eighth of the buckets would be used!) This leads to longer hash chains and thus _poor performance_!

A possible solution -- add a distribution function distFunc (only for partitioning) that takes the hash value and "scrambles" it so that the entropy in all the bits affects the low bits of the output. This function should be applied (in HashPrelUtil) over the generated code that produces the hash value, like: distFunc( hash32(field1, hash32(field2, hash32(field3, 0))) );

Tested with a huge hash aggregate (64M rows) and a parallelism of 8 (planner.width.max_per_node = 8); minor fragments 0 and 4 used only 1/8 of their buckets, the others used 1/4 of their buckets. Maybe the reason for this variance is that the distribution uses "hash32AsDouble" while the hash agg uses "hash32".

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
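One well-known candidate for such a distFunc is the MurmurHash3 fmix32 finalizer, shown here purely as an illustration of the idea, not as Drill's actual fix. Every step is invertible (xorshifts and multiplications by odd constants), so distinct hash values remain distinct while entropy from all 32 bits reaches the low bits used for partition selection:

```java
class DistScramble {
    // MurmurHash3's fmix32 finalizer: each step is invertible, so distinct
    // hash values stay distinct, but entropy from all 32 bits now reaches
    // the low bits used for bucket / partition selection.
    static int distFunc(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

Applying this only on the distribution side means the downstream Hash Table still sees the raw hash, so its buckets are no longer correlated with the partition's modulus.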
[jira] [Created] (DRILL-5236) Need more detail on specifying list of directories for spilling
Boaz Ben-Zvi created DRILL-5236: --- Summary: Need more detail on specifying list of directories for spilling Key: DRILL-5236 URL: https://issues.apache.org/jira/browse/DRILL-5236 Project: Apache Drill Issue Type: Improvement Components: Documentation Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi Priority: Minor Fix For: 1.9.0

Under "Start-Up Options" (https://drill.apache.org/docs/start-up-options/), for the option drill.exec.sort.external.spill.directories, we need to explain how multiple directories can (also) be specified -- using a comma-separated list of strings. For example (in the config file):

directories : [ "/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill" ],

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
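Using the fully dotted key, the same setting in drill-override.conf might look like the fragment below. The paths are made up for illustration, and the exact shape should be verified against the drill-override-example.conf shipped with your Drill version:

```
drill.exec.sort.external.spill.directories : [ "/fs1/drill/spill", "/fs2/drill/spill", "/fs3/drill/spill" ]
```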
[jira] [Created] (DRILL-4993) Documentation: Wrong output displayed for convert_from() with a map
Boaz Ben-Zvi created DRILL-4993: --- Summary: Documentation: Wrong output displayed for convert_from() with a map Key: DRILL-4993 URL: https://issues.apache.org/jira/browse/DRILL-4993 Project: Apache Drill Issue Type: Bug Components: Documentation Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi Priority: Trivial Fix For: 1.9.0

In the Drill docs: SQL REFERENCE -> SQL FUNCTIONS -> DATA TYPE CONVERSION, the output for the example shown for convert_from() with a map into JSON is wrong:

- BEGIN EXCERPT -
This example uses a map as input to return a repeated list vector (JSON).
SELECT CONVERT_FROM('[{a : 100, b: 200}, {a:300, b: 400}]' ,'JSON') AS MYCOL1 FROM (VALUES(1));
+--------------------+
| MYCOL1             |
+--------------------+
| [[1,2],[3,4],[5]]  |
+--------------------+
1 row selected (0.141 seconds)
- END EXCERPT -

The correct output should be:

SELECT CONVERT_FROM('[{a : 100, b: 200}, {a:300, b: 400}]' ,'JSON') AS MYCOL1 FROM (VALUES(1));
+----------------------------------------+
| MYCOL1                                 |
+----------------------------------------+
| [{"a":100,"b":200},{"a":300,"b":400}]  |
+----------------------------------------+
1 row selected (2.618 seconds)

The error probably resulted from copying the output of the previous example.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4961) Schema change error due to a missing column in a Json file
Boaz Ben-Zvi created DRILL-4961: --- Summary: Schema change error due to a missing column in a Json file Key: DRILL-4961 URL: https://issues.apache.org/jira/browse/DRILL-4961 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi

A missing column in a batch defaults to a (hard-coded) nullable INT (e.g., see line 128 in ExpressionTreeMaterializer.java), which can cause a schema conflict when that column in another batch has a conflicting type (e.g. VARCHAR).

To recreate (the following test also created DRILL-4960, which may be related): run a parallel aggregation over two small Json files (e.g. copy twice contrib/storage-mongo/src/test/resources/emp.json) where in one of the files a whole column was eliminated (e.g. "last_name").

0: jdbc:drill:zk=local> alter session set planner.slice_target = 1;
+-------+--------------------------------+
| ok    | summary                        |
+-------+--------------------------------+
| true  | planner.slice_target updated.  |
+-------+--------------------------------+
1 row selected (0.091 seconds)
0: jdbc:drill:zk=local> select first_name, last_name from `drill/data/emp` group by first_name, last_name;
Error: SYSTEM ERROR: SchemaChangeException: Incoming batches for merging receiver have different schemas!
Fragment 1:0
[Error Id: 1315ddc5-5c31-404f-917b-c7a082d016cf on 10.250.57.63:31010] (state=,code=0)

The above used a streaming aggregation; when switching to hash aggregation the same error manifests differently:

0: jdbc:drill:zk=local> alter session set `planner.enable_streamagg` = false;
+-------+------------------------------------+
| ok    | summary                            |
+-------+------------------------------------+
| true  | planner.enable_streamagg updated.  |
+-------+------------------------------------+
1 row selected (0.083 seconds)
0: jdbc:drill:zk=local> select first_name, last_name from `drill/data/emp` group by first_name, last_name;
Error: SYSTEM ERROR: IllegalStateException: Failure while reading vector.
Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.NullableVarCharVector, field= last_name(VARCHAR:OPTIONAL)[$bits$(UINT1:REQUIRED), last_name(VARCHAR:OPTIONAL)[$offsets$(UINT4:REQUIRED)]]
Fragment 2:0
[Error Id: 58d0-3bfe-4197-b4bd-44f9d7604d77 on 10.250.57.63:31010] (state=,code=0)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
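One way to think about a fix is to let the nullable-INT placeholder for a missing column yield to any concrete type observed in another batch. The sketch below is purely illustrative -- the type names and the merge rule are hypothetical, not Drill's schema-resolution code:

```java
class SchemaMerge {
    enum MinorType { NULLABLE_INT, VARCHAR }  // trimmed-down type set for illustration

    // Hypothetical merge rule: the nullable-INT placeholder (the default for a
    // missing column) yields to any concrete type observed in another batch;
    // genuinely conflicting concrete types would still fail.
    static MinorType merge(MinorType a, MinorType b) {
        if (a == b) {
            return a;
        }
        if (a == MinorType.NULLABLE_INT) {
            return b;
        }
        if (b == MinorType.NULLABLE_INT) {
            return a;
        }
        throw new IllegalStateException("Incompatible schemas: " + a + " vs " + b);
    }
}
```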
[jira] [Created] (DRILL-4954) allTextMode in the MapRDB plugin always return nulls
Boaz Ben-Zvi created DRILL-4954: --- Summary: allTextMode in the MapRDB plugin always return nulls Key: DRILL-4954 URL: https://issues.apache.org/jira/browse/DRILL-4954 Project: Apache Drill Issue Type: Bug Components: Storage - MapRDB Affects Versions: 1.8.0 Environment: MapRDB Reporter: Boaz Ben-Zvi Assignee: Smidth Panchamia Fix For: 1.9.0

Setting the "allTextMode" option to "true" in the MapR fs plugin, like:

"formats": {
  "maprdb": {
    "type": "maprdb",
    "allTextMode": true
  }
}

makes the returned results null. Here’s an example:

<< default plugin, unchanged >>
0: jdbc:drill:> use mfs.tpch_sf1_maprdb_json;
+-------+--------------------------------------------------------+
| ok    | summary                                                |
+-------+--------------------------------------------------------+
| true  | Default schema changed to [mfs1.tpch_sf1_maprdb_json]  |
+-------+--------------------------------------------------------+
1 row selected (0.153 seconds)
0: jdbc:drill:> select typeof(N_REGIONKEY) from nation limit 1;
+---------+
| EXPR$0  |
+---------+
| BIGINT  |
+---------+
1 row selected (0.206 seconds)
0: jdbc:drill:> select N_REGIONKEY from nation limit 2;
+--------------+
| N_REGIONKEY  |
+--------------+
| 0            |
| 2            |
+--------------+
2 rows selected (0.254 seconds)

<< plugin changed to all text mode (as shown above) >>
0: jdbc:drill:> select typeof(N_REGIONKEY) from nation limit 1;
+---------+
| EXPR$0  |
+---------+
| NULL    |
+---------+
1 row selected (0.321 seconds)
0: jdbc:drill:> select N_REGIONKEY from nation limit 2;
+--------------+
| N_REGIONKEY  |
+--------------+
| null         |
| null         |
+--------------+
2 rows selected (0.25 seconds)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4921) Scripts drill_config.sh, drillbit.sh, and drill-embedded fail when accessed via a symbolic link
Boaz Ben-Zvi created DRILL-4921: --- Summary: Scripts drill_config.sh, drillbit.sh, and drill-embedded fail when accessed via a symbolic link Key: DRILL-4921 URL: https://issues.apache.org/jira/browse/DRILL-4921 Project: Apache Drill Issue Type: Bug Components: Server Affects Versions: 1.8.0 Environment: The drill-embedded on the Mac; the other files on Linux Reporter: Boaz Ben-Zvi Priority: Minor Fix For: 1.9.0

Several of the drill... scripts under $DRILL_HOME/bin use "pwd" to produce the local path of that script. However "pwd" defaults to "logical" (i.e. the same as "pwd -L"), so if a script is accessed via a symbolic link, that link is used verbatim in the path, which can produce wrong paths (e.g., when followed by "cd .."). For example, creating a symbolic link and using it (on the Mac):

$ cd ~/drill
$ ln -s $DRILL_HOME/bin
$ bin/drill-embedded
ERROR: Drill config file missing: /Users/boazben-zvi/drill/conf/drill-override.conf -- Wrong config dir?

Similarly on Linux the CLASS_PATH gets set wrong (when running "drillbit.sh start" via a symlink).

Solution: replace all the "pwd" calls in all the scripts with "pwd -P", which produces the physical path (or replace a preceding "cd" with "cd -P", which does the same). Relevant scripts:

$ cd bin; grep pwd *
drillbit.sh:bin=`cd "$bin">/dev/null; pwd`
drillbit.sh: echo "cwd:" `pwd`
drill-conf:bin=`cd "$bin">/dev/null; pwd`
drill-config.sh:home=`cd "$bin/..">/dev/null; pwd`
drill-config.sh: DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
drill-config.sh:JAVA_HOME="$( cd -P "$( dirname "$SOURCE" )" && cd .. && pwd )"
drill-embedded:bin=`cd "$bin">/dev/null; pwd`
drill-localhost:bin=`cd "$bin">/dev/null; pwd`
submit_plan:bin=`cd "$bin">/dev/null; pwd`

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
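The same logical-vs-physical path distinction exists in Java's NIO API, where Path.toRealPath() plays the role of pwd -P. This is a side illustration of the concept, not part of the fix to the shell scripts:

```java
import java.io.IOException;
import java.nio.file.Path;

class RealPathDemo {
    // toRealPath() resolves symbolic links -- the analogue of `pwd -P` --
    // so a later "../conf" computed from the result lands in the physical
    // installation directory, not next to the symlink.
    static Path physicalDir(Path maybeSymlink) throws IOException {
        return maybeSymlink.toRealPath();
    }
}
```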
[jira] [Created] (DRILL-4904) NPE from backquoted LIKE argument
Boaz Ben-Zvi created DRILL-4904: --- Summary: NPE from backquoted LIKE argument Key: DRILL-4904 URL: https://issues.apache.org/jira/browse/DRILL-4904 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi

By mistakenly using backquotes (instead of regular single quotes) in the argument for LIKE, an NPE shows:

0: jdbc:drill:zk=local> select col from test1 where col like `24518133617%` group by col;
Error: SYSTEM ERROR: NullPointerException
Fragment 3:1
[Error Id: 5c8317c3-3b1c-415b-8967-56696eeff764 on 10.250.57.63:31010]
(java.lang.NullPointerException) null
org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8():199
org.apache.drill.exec.test.generated.FiltererGen58.doSetup():84
org.apache.drill.exec.test.generated.FiltererGen58.setup():54
org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():195
org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():107
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():97
org.apache.drill.exec.record.AbstractRecordBatch.next():142
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
(state=,code=0)

More info: Table is json; single varchar column.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-3898) No space error during external sort does not cancel the query
[ https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi resolved DRILL-3898. - Resolution: Fixed Fix Version/s: (was: Future) 1.9.0 Commit ID: 140304d47daf8b18c72a0ab8f39c67d9d8a3031d The code fix: "widen" the exception catch (from IOException to Throwable) to also cover the FSError returned from the Hadoop FS I/O call; in addition, ignore a repeat of the same exception while closing the new group (which triggers a flush()). This would produce a relatively "clean and clear" error message (instead of a Java stack dump; which is more appropriate for a software bug). > No space error during external sort does not cancel the query > - > > Key: DRILL-3898 > URL: https://issues.apache.org/jira/browse/DRILL-3898 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.2.0, 1.8.0 >Reporter: Victoria Markman >Assignee: Boaz Ben-Zvi > Fix For: 1.9.0 > > Attachments: drillbit.log, sqlline_3898.ver_1_8.log > > > While verifying DRILL-3732 I ran into a new problem. > I think drill somehow loses track of out of disk exception and does not > cancel rest of the query, which results in NPE: > Reproduction is the same as in DRILL-3732: > {code} > 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, > ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) > partition by (ss_promo_sk) as > . . . . . . . . . . . . > select > . . . . . . . . . . . . > case when columns[2] = '' then cast(null as > varchar(100)) else cast(columns[2] as varchar(100)) end, > . . . . . . . . . . . . > case when columns[3] = '' then cast(null as > varchar(100)) else cast(columns[3] as varchar(100)) end, > . . . . . . . . . . . . > case when columns[4] = '' then cast(null as > varchar(100)) else cast(columns[4] as varchar(100)) end, > . . . . . . . . . . . . 
> case when columns[5] = '' then cast(null as > varchar(100)) else cast(columns[5] as varchar(100)) end, > . . . . . . . . . . . . > case when columns[0] = '' then cast(null as > varchar(100)) else cast(columns[0] as varchar(100)) end, > . . . . . . . . . . . . > case when columns[8] = '' then cast(null as > varchar(100)) else cast(columns[8] as varchar(100)) end > . . . . . . . . . . . . > from > . . . . . . . . . . . . > `store_sales.dat` ss > . . . . . . . . . . . . > ; > Error: SYSTEM ERROR: NullPointerException > Fragment 1:16 > [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] > (state=,code=0) > {code} > This exception in drillbit.log should have triggered query cancellation: > {code} > 2015-10-06 17:01:34,463 [WorkManager-2] ERROR > o.apache.drill.exec.work.WorkManager - > org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception. > org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device > at > org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > ~[na:1.7.0_71] > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > ~[na:1.7.0_71] > at java.io.FilterOutputStream.close(FilterOutputStream.java:157) > ~[na:1.7.0_71] > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) > ~[hadoop-common-2.5.1-mapr-1503.jar:na] > at > org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152) > ~[drill-java-exec-1.2.0.jar:1.2.0] > at > org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) > ~[drill-common-1.2.0.jar:1.2.0] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:553) > ~[drill-java-exec-1.2.0.jar:1.2.0] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:362) > ~[drill-java-exec-1.2.0.jar:1.2.0] > at >
[jira] [Created] (DRILL-4896) After a failed CTAS, the table both exists and does not exist
Boaz Ben-Zvi created DRILL-4896: --- Summary: After a failed CTAS, the table both exists and does not exist Key: DRILL-4896 URL: https://issues.apache.org/jira/browse/DRILL-4896 Project: Apache Drill Issue Type: Improvement Components: Server Affects Versions: 1.8.0 Reporter: Boaz Ben-Zvi

After a CTAS failed (due to no space on the storage device) there were (incomplete) Parquet files left. A subsequent CTAS for the same table name fails with "table exists", and a subsequent DROP on the same table name fails with "table does not exist". A possible enhancement: allow DROP to clean up such a corrupted table.

0: jdbc:drill:zk=local> create table `/drill/spill/tt1` as
. . . . . . . . . . . > select
. . . . . . . . . . . >    case when columns[2] = '' then cast(null as varchar(100)) else cast(columns[2] as varchar(100)) end,
. . . . . . . . . . . >    case when columns[3] = '' then cast(null as varchar(100)) else cast(columns[3] as varchar(100)) end,
. . . . . . . . . . . >    case when columns[4] = '' then cast(null as varchar(100)) else cast(columns[4] as varchar(100)) end,
. . . . . . . . . . . >    case when columns[5] = '' then cast(null as varchar(100)) else cast(columns[5] as varchar(100)) end,
. . . . . . . . . . . >    case when columns[0] = '' then cast(null as varchar(100)) else cast(columns[0] as varchar(100)) end,
. . . . . . . . . . . >    case when columns[8] = '' then cast(null as varchar(100)) else cast(columns[8] as varchar(100)) end
. . . . . . . . . . . > FROM dfs.`/Users/boazben-zvi/data/store_sales/store_sales.dat`;
Exception in thread "drill-executor-4" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device . 39 more
Error: SYSTEM ERROR: IOException: The file being written is in an invalid state. Probably caused by an error thrown previously.
Current state: COLUMN
Fragment 0:0
[Error Id: de84c212-2400-4a08-a15c-8e3adb5ec774 on 10.250.57.63:31010] (state=,code=0)

0: jdbc:drill:zk=local> create table `/drill/spill/tt1` as select * from dfs.`/Users/boazben-zvi/data/store_sales/store_sales.dat`;
Error: VALIDATION ERROR: A table or view with given name [/drill/spill/tt1] already exists in schema [dfs.tmp]
[Error Id: 0ef99a15-9d67-49ad-87fb-023105dece3c on 10.250.57.63:31010] (state=,code=0)

0: jdbc:drill:zk=local> drop table `/drill/spill/tt1` ;
Error: DATA_WRITE ERROR: Failed to drop table: File /drill/spill/tt1 does not exist
[Error Id: c22da79f-ecbd-423c-b5b2-4eae7d1263d7 on 10.250.57.63:31010] (state=,code=0)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4872) NPE from CTAS partitioned by a projected casted null
Boaz Ben-Zvi created DRILL-4872: --- Summary: NPE from CTAS partitioned by a projected casted null Key: DRILL-4872 URL: https://issues.apache.org/jira/browse/DRILL-4872 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.7.0 Reporter: Boaz Ben-Zvi Fix For: Future

Extracted from DRILL-3898: running the same test case on a smaller table (store_sales.dat from TPCDS SF 1) has no space issues, but there is a Null Pointer Exception from the projection:

Caused by: java.lang.NullPointerException: null
at org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare(ByteFunctionHelpers.java:100) ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at org.apache.drill.exec.test.generated.ProjectorGen1.doEval(ProjectorTemplate.java:49) ~[na:na]
at org.apache.drill.exec.test.generated.ProjectorGen1.projectRecords(ProjectorTemplate.java:62) ~[na:na]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:199) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]

A simplified version of the test case:

0: jdbc:drill:zk=local> create table dfs.tmp.ttt partition by (x) as select case when columns[8] = '' then cast(null as varchar(10)) else cast(columns[8] as varchar(10)) end as x FROM dfs.`/Users/boazben-zvi/data/store_sales/store_sales.dat`;
Error: SYSTEM ERROR: NullPointerException

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
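A sketch of the kind of null guard that would avoid dereferencing the projected casted nulls in a byte-wise comparison. This is illustrative only: ordering nulls first is an arbitrary assumption, and this is not Drill's ByteFunctionHelpers code, which works over direct-memory buffers rather than byte arrays.

```java
class NullSafeCompare {
    // Null-guarded unsigned byte[] comparison: the projected
    // "cast(null as varchar)" values arrive as null, so compare must
    // handle them instead of dereferencing. Nulls sort first here
    // (an arbitrary choice for the sketch).
    static int compare(byte[] left, byte[] right) {
        if (left == null || right == null) {
            return (left == right) ? 0 : (left == null ? -1 : 1);
        }
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(left[i] & 0xff, right[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(left.length, right.length);
    }
}
```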