[jira] [Created] (DRILL-7227) TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
Robert Hou created DRILL-7227: - Summary: TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100 Key: DRILL-7227 URL: https://issues.apache.org/jira/browse/DRILL-7227 Project: Apache Drill Issue Type: Bug Components: Metadata Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.17.0 Attachments: 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.sys.drill, 2338ae93-155b-356d-382e-0da949c6f439.sys.drill Here is query 78:
{noformat}
WITH ws AS (SELECT d_year AS ws_sold_year, ws_item_sk, ws_bill_customer_sk ws_customer_sk, Sum(ws_quantity) ws_qty, Sum(ws_wholesale_cost) ws_wc, Sum(ws_sales_price) ws_sp
  FROM web_sales LEFT JOIN web_returns ON wr_order_number = ws_order_number AND ws_item_sk = wr_item_sk
  JOIN date_dim ON ws_sold_date_sk = d_date_sk
  WHERE wr_order_number IS NULL
  GROUP BY d_year, ws_item_sk, ws_bill_customer_sk),
cs AS (SELECT d_year AS cs_sold_year, cs_item_sk, cs_bill_customer_sk cs_customer_sk, Sum(cs_quantity) cs_qty, Sum(cs_wholesale_cost) cs_wc, Sum(cs_sales_price) cs_sp
  FROM catalog_sales LEFT JOIN catalog_returns ON cr_order_number = cs_order_number AND cs_item_sk = cr_item_sk
  JOIN date_dim ON cs_sold_date_sk = d_date_sk
  WHERE cr_order_number IS NULL
  GROUP BY d_year, cs_item_sk, cs_bill_customer_sk),
ss AS (SELECT d_year AS ss_sold_year, ss_item_sk, ss_customer_sk, Sum(ss_quantity) ss_qty, Sum(ss_wholesale_cost) ss_wc, Sum(ss_sales_price) ss_sp
  FROM store_sales LEFT JOIN store_returns ON sr_ticket_number = ss_ticket_number AND ss_item_sk = sr_item_sk
  JOIN date_dim ON ss_sold_date_sk = d_date_sk
  WHERE sr_ticket_number IS NULL
  GROUP BY d_year, ss_item_sk, ss_customer_sk)
SELECT ss_item_sk, Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) ratio, ss_qty store_qty, ss_wc store_wholesale_cost, ss_sp store_sales_price, COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0) other_chan_qty, COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0) other_chan_wholesale_cost, COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0) other_chan_sales_price
FROM ss LEFT JOIN ws ON (
ws_sold_year = ss_sold_year AND ws_item_sk = ss_item_sk AND ws_customer_sk = ss_customer_sk ) LEFT JOIN cs ON ( cs_sold_year = ss_sold_year AND cs_item_sk = cs_item_sk AND cs_customer_sk = ss_customer_sk ) WHERE COALESCE(ws_qty, 0) > 0 AND COALESCE(cs_qty, 0) > 0 AND ss_sold_year = 1999 ORDER BY ss_item_sk, ss_qty DESC, ss_wc DESC, ss_sp DESC, other_chan_qty, other_chan_wholesale_cost, other_chan_sales_price, Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) LIMIT 100; {noformat} The profile for the new plan is 2338ae93-155b-356d-382e-0da949c6f439. Hash partition sender operator (10-00) takes 10-15 minutes. I am not sure why it takes so long. It has 10 minor fragments sending to receiver (06-05), which has 62 minor fragments. But hash partition sender (16-00) has 10 minor fragments sending to receiver (12-06), which has 220 minor fragments, and there is no performance issue. The profile for the old plan is 23387ab0-cb1c-cd5e-449a-c9bcefc901c1. Both plans use the same commit. The old plan is created by disabling statistics. I have not included the plans in the Jira because Jira has a max of 32K. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7183) TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled
Robert Hou created DRILL-7183: - Summary: TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled Key: DRILL-7183 URL: https://issues.apache.org/jira/browse/DRILL-7183 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Hanumath Rao Maduri Fix For: 1.16.0 Query 69 runs 150% slower when Statistics is disabled. Here is the query: {noformat} SELECT cd_gender, cd_marital_status, cd_education_status, count(*) cnt1, cd_purchase_estimate, count(*) cnt2, cd_credit_rating, count(*) cnt3 FROM customer c, customer_address ca, customer_demographics WHERE c.c_current_addr_sk = ca.ca_address_sk AND ca_state IN ('KY', 'GA', 'NM') AND cd_demo_sk = c.c_current_cdemo_sk AND exists(SELECT * FROM store_sales, date_dim WHERE c.c_customer_sk = ss_customer_sk AND ss_sold_date_sk = d_date_sk AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2) AND (NOT exists(SELECT * FROM web_sales, date_dim WHERE c.c_customer_sk = ws_bill_customer_sk AND ws_sold_date_sk = d_date_sk AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2) AND NOT exists(SELECT * FROM catalog_sales, date_dim WHERE c.c_customer_sk = cs_ship_customer_sk AND cs_sold_date_sk = d_date_sk AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2)) GROUP BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating ORDER BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating LIMIT 100; {noformat} This regression is caused by commit 982e98061e029a39f1c593f695c0d93ec7079f0d. This commit should be reverted for now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7155) Create a standard logging message for batch sizes generated by individual operators
Robert Hou created DRILL-7155: - Summary: Create a standard logging message for batch sizes generated by individual operators Key: DRILL-7155 URL: https://issues.apache.org/jira/browse/DRILL-7155 Project: Apache Drill Issue Type: Task Components: Execution - Relational Operators Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Robert Hou QA reads log messages in drillbit.log to verify the sizes of data batches generated by individual operators. These log messages need to be standardized so that each operator creates the same message. This allows the QA test framework to verify the information in each message. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
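One possible shape for such a standardized message — purely illustrative, since the JIRA does not specify a format, and the class, method, and sample numbers below are made up — is a fixed prefix followed by key=value pairs, which a QA framework can grep out of drillbit.log and parse uniformly for every operator:

```java
// Illustrative sketch of a standardized batch-size log line (not Drill's
// actual format). A constant prefix plus key=value pairs keeps the message
// identical in shape across operators, so one parser can verify all of them.
public class BatchSizeLog {
    static String format(String operator, int batchCount, long avgBatchBytes, long recordCount) {
        return String.format(
            "BATCH_STATS operator=%s batches=%d avg_batch_bytes=%d records=%d",
            operator, batchCount, avgBatchBytes, recordCount);
    }

    public static void main(String[] args) {
        // Sample values only; any operator would emit the same key set.
        System.out.println(format("HASH_AGGREGATE", 401, 26_316_456L, 1_631_943L));
    }
}
```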
[jira] [Created] (DRILL-7154) TPCH queries 4 and 17 take longer with sf 1000 when Statistics are disabled
Robert Hou created DRILL-7154: - Summary: TPCH queries 4 and 17 take longer with sf 1000 when Statistics are disabled Key: DRILL-7154 URL: https://issues.apache.org/jira/browse/DRILL-7154 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Boaz Ben-Zvi Fix For: 1.16.0 Attachments: 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5.sys.drill, 235a471b-aa97-bfb5-207d-3f25b4b5fbbb.sys.drill, hashagg.nostats.log, hashagg.stats.disabled.log Here is TPCH 04 with sf 1000: {noformat} select o.o_orderpriority, count(*) as order_count from orders o where o.o_orderdate >= date '1996-10-01' and o.o_orderdate < date '1996-10-01' + interval '3' month and exists ( select * from lineitem l where l.l_orderkey = o.o_orderkey and l.l_commitdate < l.l_receiptdate ) group by o.o_orderpriority order by o.o_orderpriority; {noformat} TPCH query 4 takes 30% longer. The plan is the same, but the Hash Agg operator in the new plan takes longer. One possible reason is that it is not using as many buckets as it did in the old plan. The Hash Agg operator in the new plan also uses less memory than it did in the old plan.
Here is the old plan:
{noformat}
00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5645
00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5644
00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643
01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642
02-01 SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641
02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640
02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 5639
02-04 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 5638
03-01 HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 5637
03-02 Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636
03-03 Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5635
03-04 HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, ANY o_ord
[jira] [Created] (DRILL-7139) date_add produces incorrect results when adding to a timestamp
Robert Hou created DRILL-7139: - Summary: date_add produces incorrect results when adding to a timestamp Key: DRILL-7139 URL: https://issues.apache.org/jira/browse/DRILL-7139 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.15.0 Reporter: Robert Hou Assignee: Pritesh Maker I am using date_add() to create a sequence of timestamps: {noformat} select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107374,'M') as interval minute)) timestamp_id from (values(1)); +--+ | timestamp_id | +--+ | 1970-01-25 20:31:12.704 | +--+ 1 row selected (0.121 seconds) {noformat} When I add one more minute, I get an earlier timestamp: {noformat} 0: jdbc:drill:drillbit=10.10.51.5> select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107375,'M') as interval minute)) timestamp_id from (values(1)); +--+ | timestamp_id | +--+ | 1969-12-07 03:29:25.408 | +--+ 1 row selected (0.126 seconds) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
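The jump from 1970-01-25 back into 1969 is exactly what 32-bit truncation of the interval's millisecond value would produce: 107,374 minutes is 6,442,440,000 ms, which no longer fits in an int. A hypothetical reproduction — the class and method here are illustrative, not Drill's actual code path — yields both reported timestamps:

```java
import java.time.Instant;

// Sketch of the suspected failure mode: narrowing the interval's total
// milliseconds to a 32-bit int before adding it to the epoch.
public class DateAddOverflow {
    static Instant addMinutesWith32BitMillis(long minutes) {
        long millis = minutes * 60_000L;   // 107,374 min = 6,442,440,000 ms > 2^32
        int truncated = (int) millis;      // silent 32-bit wrap-around
        return Instant.EPOCH.plusMillis(truncated);
    }

    public static void main(String[] args) {
        // Matches the two results quoted in the bug report:
        System.out.println(addMinutesWith32BitMillis(107_374)); // 1970-01-25T20:31:12.704Z
        System.out.println(addMinutesWith32BitMillis(107_375)); // 1969-12-07T03:29:25.408Z
    }
}
```

At 107,375 minutes the truncated value crosses 2^31 and becomes negative, which is why the second result lands before the epoch.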
[jira] [Created] (DRILL-7136) Num_buckets for HashAgg in profile may be inaccurate
Robert Hou created DRILL-7136: - Summary: Num_buckets for HashAgg in profile may be inaccurate Key: DRILL-7136 URL: https://issues.apache.org/jira/browse/DRILL-7136 Project: Apache Drill Issue Type: Bug Components: Tools, Build & Test Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.16.0 Attachments: 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0.sys.drill I ran TPCH query 17 with sf 1000. Here is the query: {noformat} select sum(l.l_extendedprice) / 7.0 as avg_yearly from lineitem l, part p where p.p_partkey = l.l_partkey and p.p_brand = 'Brand#13' and p.p_container = 'JUMBO CAN' and l.l_quantity < ( select 0.2 * avg(l2.l_quantity) from lineitem l2 where l2.l_partkey = p.p_partkey ); {noformat} One of the hash agg operators has resized 6 times. It should have 4M buckets. But the profile shows it has 64K buckets. I have attached a sample profile. In this profile, the hash agg operator is (04-02). {noformat} Operator Metrics Minor Fragment NUM_BUCKETS NUM_ENTRIES NUM_RESIZING RESIZING_TIME_MS NUM_PARTITIONS SPILLED_PARTITIONS SPILL_MB SPILL_CYCLE INPUT_BATCH_COUNT AVG_INPUT_BATCH_BYTES AVG_INPUT_ROW_BYTES INPUT_RECORD_COUNT OUTPUT_BATCH_COUNT AVG_OUTPUT_BATCH_BYTES AVG_OUTPUT_ROW_BYTES OUTPUT_RECORD_COUNT 04-00-02 65,536 748,746 6 364 1 582 0 813 582,653 18 26,316,456 401 1,631,943 25 26,176,350 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
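The arithmetic behind the reporter's 4M expectation: if each resize doubles the hash table (an assumption; the doubling behavior is not stated in the report), six resizes starting from the 65,536 buckets shown give 4,194,304, suggesting the profile captured NUM_BUCKETS before the resizes rather than after:

```java
// Sanity check: 65,536 buckets doubled on each of the 6 reported resizes
// should end at 4M, not at the 65,536 shown in the profile.
public class BucketCheck {
    public static void main(String[] args) {
        int buckets = 65_536;
        for (int i = 0; i < 6; i++) {
            buckets *= 2;              // assume each resize doubles the table
        }
        System.out.println(buckets);   // 4194304, i.e. the expected 4M
    }
}
```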
[jira] [Resolved] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-7132. --- Resolution: Not A Problem > Metadata cache does not have correct min/max values for varchar and interval > data types > --- > > Key: DRILL-7132 > URL: https://issues.apache.org/jira/browse/DRILL-7132 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.14.0 >Reporter: Robert Hou >Priority: Major > Fix For: 1.17.0 > > Attachments: 0_0_10.parquet > > > The parquet metadata cache does not have correct min/max values for varchar > and interval data types. > I have attached a parquet file. Here is what parquet tools shows for varchar: > [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 > average: 67 total: 67 (raw data: 65 saving -3%) > values: min: 1 max: 1 average: 1 total: 1 > uncompressed: min: 65 max: 65 average: 65 total: 65 > column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0 > Here is what the metadata cache file shows: > "name" : [ "varchar_col" ], > "minValue" : "aW9lZ2pOSkt2bmtk", > "maxValue" : "aW9lZ2pOSkt2bmtk", > "nulls" : 0 > Here is what parquet tools shows for interval: > [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 > average: 52 total: 52 (raw data: 50 saving -4%) > values: min: 1 max: 1 average: 1 total: 1 > uncompressed: min: 50 max: 50 average: 50 total: 50 > column values statistics: min: P18582D, max: P18582D, num_nulls: 0 > Here is what the metadata cache file shows: > "name" : [ "interval_col" ], > "minValue" : "UDE4NTgyRA==", > "maxValue" : "UDE4NTgyRA==", > "nulls" : 0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
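The "Not A Problem" resolution is consistent with the cache storing the raw binary min/max serialized as base64: decoding the cached strings recovers exactly the values parquet-tools prints.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Decode the metadata-cache min/max strings quoted in the report; they are
// the base64 encoding of the same values shown by parquet-tools.
public class DecodeCacheMinMax {
    static String decode(String b64) {
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("aW9lZ2pOSkt2bmtk")); // ioegjNJKvnkd (varchar_col)
        System.out.println(decode("UDE4NTgyRA=="));     // P18582D (interval_col)
    }
}
```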
[jira] [Created] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
Robert Hou created DRILL-7132: - Summary: Metadata cache does not have correct min/max values for varchar and interval data types Key: DRILL-7132 URL: https://issues.apache.org/jira/browse/DRILL-7132 Project: Apache Drill Issue Type: Bug Components: Metadata Affects Versions: 1.14.0 Reporter: Robert Hou Fix For: 1.17.0 Attachments: 0_0_10.parquet The parquet metadata cache does not have correct min/max values for varchar and interval data types. I have attached a parquet file. Here is what parquet tools shows for varchar: [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 average: 67 total: 67 (raw data: 65 saving -3%) values: min: 1 max: 1 average: 1 total: 1 uncompressed: min: 65 max: 65 average: 65 total: 65 column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0 Here is what the metadata cache file shows: "name" : [ "varchar_col" ], "minValue" : "aW9lZ2pOSkt2bmtk", "maxValue" : "aW9lZ2pOSkt2bmtk", "nulls" : 0 Here is what parquet tools shows for interval: [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 average: 52 total: 52 (raw data: 50 saving -4%) values: min: 1 max: 1 average: 1 total: 1 uncompressed: min: 50 max: 50 average: 50 total: 50 column values statistics: min: P18582D, max: P18582D, num_nulls: 0 Here is what the metadata cache file shows: "name" : [ "interval_col" ], "minValue" : "UDE4NTgyRA==", "maxValue" : "UDE4NTgyRA==", "nulls" : 0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7122) TPCDS queries 29, 25, 17 are slower when Statistics is disabled
Robert Hou created DRILL-7122: - Summary: TPCDS queries 29, 25, 17 are slower when Statistics is disabled Key: DRILL-7122 URL: https://issues.apache.org/jira/browse/DRILL-7122 Project: Apache Drill Issue Type: Bug Reporter: Robert Hou -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7123) TPCDS query 83 runs slower when Statistics is disabled
Robert Hou created DRILL-7123: - Summary: TPCDS query 83 runs slower when Statistics is disabled Key: DRILL-7123 URL: https://issues.apache.org/jira/browse/DRILL-7123 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.16.0 Query is TPCDS 83 with sf 100: {noformat} WITH sr_items AS (SELECT i_item_id item_id, Sum(sr_return_quantity) sr_item_qty FROM store_returns, item, date_dim WHERE sr_item_sk = i_item_sk AND d_date IN (SELECT d_date FROM date_dim WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' ))) AND sr_returned_date_sk = d_date_sk GROUP BY i_item_id), cr_items AS (SELECT i_item_id item_id, Sum(cr_return_quantity) cr_item_qty FROM catalog_returns, item, date_dim WHERE cr_item_sk = i_item_sk AND d_date IN (SELECT d_date FROM date_dim WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' ))) AND cr_returned_date_sk = d_date_sk GROUP BY i_item_id), wr_items AS (SELECT i_item_id item_id, Sum(wr_return_quantity) wr_item_qty FROM web_returns, item, date_dim WHERE wr_item_sk = i_item_sk AND d_date IN (SELECT d_date FROM date_dim WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' ))) AND wr_returned_date_sk = d_date_sk GROUP BY i_item_id) SELECT sr_items.item_id, sr_item_qty, sr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 sr_dev, cr_item_qty, cr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 cr_dev, wr_item_qty, wr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 wr_dev, ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 average FROM sr_items, cr_items, wr_items WHERE sr_items.item_id = cr_items.item_id AND sr_items.item_id = wr_items.item_id ORDER BY sr_items.item_id, sr_item_qty LIMIT 100; {noformat} The number of threads 
for major fragments 1 and 2 has changed when Statistics is disabled: the number of minor fragments has been reduced from 10 and 15 down to 3. The rowcount estimate for major fragment 2 has changed from 1439754.0 down to 287950.8. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
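The two rowcount estimates differ by exactly a factor of five: 287950.8 is 1439754.0 × 0.2, which looks like a fixed filter-selectivity default being applied when statistics are absent (the 0.2 factor is inferred from these numbers alone, not confirmed from the planner source). A quick check:

```java
// Ratio of the two rowcount estimates quoted in the report; the result is
// exactly 0.2, hinting at a fixed default selectivity rather than a
// statistics-derived estimate.
public class RowcountRatio {
    public static void main(String[] args) {
        double withStats = 1_439_754.0;   // rowcount with statistics enabled
        double without   = 287_950.8;     // rowcount with statistics disabled
        System.out.println(without / withStats); // ~0.2
    }
}
```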
[jira] [Created] (DRILL-7121) TPCH 4 takes longer
Robert Hou created DRILL-7121: - Summary: TPCH 4 takes longer Key: DRILL-7121 URL: https://issues.apache.org/jira/browse/DRILL-7121 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.16.0 Here is TPCH 4 with sf 100: {noformat} select o.o_orderpriority, count(*) as order_count from orders o where o.o_orderdate >= date '1996-10-01' and o.o_orderdate < date '1996-10-01' + interval '3' month and exists ( select * from lineitem l where l.l_orderkey = o.o_orderkey and l.l_commitdate < l.l_receiptdate ) group by o.o_orderpriority order by o.o_orderpriority; {noformat} The plan has changed when Statistics is disabled. A Hash Agg and a Broadcast Exchange have been added. These two operators expand the number of rows from the lineitem table from 137M to 9B rows. This forces the hash join to use 6GB of memory instead of 30 MB. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7120) Query fails with ChannelClosedException
Robert Hou created DRILL-7120: - Summary: Query fails with ChannelClosedException Key: DRILL-7120 URL: https://issues.apache.org/jira/browse/DRILL-7120 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.16.0 TPCH query 5 fails at sf100. Here is the query: {noformat} select n.n_name, sum(l.l_extendedprice * (1 - l.l_discount)) as revenue from customer c, orders o, lineitem l, supplier s, nation n, region r where c.c_custkey = o.o_custkey and l.l_orderkey = o.o_orderkey and l.l_suppkey = s.s_suppkey and c.c_nationkey = s.s_nationkey and s.s_nationkey = n.n_nationkey and n.n_regionkey = r.r_regionkey and r.r_name = 'EUROPE' and o.o_orderdate >= date '1997-01-01' and o.o_orderdate < date '1997-01-01' + interval '1' year group by n.n_name order by revenue desc; {noformat} This is the error from drillbit.log: {noformat} 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED 2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED 2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144. 2019-03-04 18:17:51,454 [BitServer-13] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer. 2019-03-04 18:17:51,463 [BitServer-13] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). Closing connection. io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer. 
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) ~[netty-codec-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-common-4.0.48.Final.jar:4.0.48.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_1
[jira] [Created] (DRILL-7109) Statistics adds external sort, which spills to disk
Robert Hou created DRILL-7109: - Summary: Statistics adds external sort, which spills to disk Key: DRILL-7109 URL: https://issues.apache.org/jira/browse/DRILL-7109 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.16.0 TPCH query 4 with sf 100 runs many times slower. One issue is that an extra external sort has been added, and both external sorts spill to disk. Also, the hash join sees 100x more data. Here is the query: {noformat} select o.o_orderpriority, count(*) as order_count from orders o where o.o_orderdate >= date '1996-10-01' and o.o_orderdate < date '1996-10-01' + interval '3' month and exists ( select * from lineitem l where l.l_orderkey = o.o_orderkey and l.l_commitdate < l.l_receiptdate ) group by o.o_orderpriority order by o.o_orderpriority; {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7108) Statistics adds two exchange operators
Robert Hou created DRILL-7108: - Summary: Statistics adds two exchange operators Key: DRILL-7108 URL: https://issues.apache.org/jira/browse/DRILL-7108 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.16.0 Reporter: Robert Hou Assignee: Gautam Parai Fix For: 1.16.0 TPCH 16 with sf 100 runs 14% slower. Here is the query: {noformat} select p.p_brand, p.p_type, p.p_size, count(distinct ps.ps_suppkey) as supplier_cnt from partsupp ps, part p where p.p_partkey = ps.ps_partkey and p.p_brand <> 'Brand#21' and p.p_type not like 'MEDIUM PLATED%' and p.p_size in (38, 2, 8, 31, 44, 5, 14, 24) and ps.ps_suppkey not in ( select s.s_suppkey from supplier s where s.s_comment like '%Customer%Complaints%' ) group by p.p_brand, p.p_type, p.p_size order by supplier_cnt desc, p.p_brand, p.p_type, p.p_size; {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6957) Parquet rowgroup filtering can have incorrect file count
Robert Hou created DRILL-6957: - Summary: Parquet rowgroup filtering can have incorrect file count Key: DRILL-6957 URL: https://issues.apache.org/jira/browse/DRILL-6957 Project: Apache Drill Issue Type: Bug Reporter: Robert Hou Assignee: Jean-Blas IMBERT If a query accesses all the files, the Scan operator indicates that one file is accessed. The number of rowgroups is correct. Here is an example query: {noformat} select count(*) from dfs.`/custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120` where cur_tot_bal_amt < 100 {noformat} Here is the plan:
{noformat}
00-00 Screen : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721446E9 rows, 4.35668337906E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4477
00-01 Project(EXPR$0=[$0]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721445E9 rows, 4.35668337905E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4476
00-02 StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721435E9 rows, 4.35668337895E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4475
00-03 UnionExchange : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721425E9 rows, 4.35668337775E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4474
01-01 StreamAgg(group=[{}], EXPR$0=[COUNT()]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721415E9 rows, 4.35668337695E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4473
01-02 Project($f0=[0]) : rowType = RecordType(INTEGER $f0): rowcount = 1.4053817345E9, cumulative cost = {8.432290407E9 rows, 2.67022529555E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4472
01-03 SelectionVectorRemover : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 1.4053817345E9, cumulative cost = {7.0269086725E9 rows, 2.10807260175E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4471
01-04 Filter(condition=[<($0, 100)]) : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 1.4053817345E9, cumulative cost = {5.621526938E9 rows, 1.9675344283E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4470
01-05 Scan(table=[[dfs, /custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120]], selectionRoot=maprfs:/custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120, numFiles=1, numRowGroups=1007, usedMetadataFile=false, columns=[`cur_tot_bal_amt`]]]) : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 2.810763469E9, cumulative cost = {2.810763469E9 rows, 2.810763469E9 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4469
{noformat}
numFiles is set to 1 when it should be set to 21. All the files are in one directory. If I add a level of directories (i.e. a directory with multiple directories, each with files), then I get the correct file count. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6906) File permissions are not being honored
Robert Hou created DRILL-6906: - Summary: File permissions are not being honored Key: DRILL-6906 URL: https://issues.apache.org/jira/browse/DRILL-6906 Project: Apache Drill Issue Type: Bug Components: Client - JDBC, Client - ODBC Affects Versions: 1.15.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.15.0 I ran sqlline with user "kuser1". {noformat} /opt/mapr/drill/drill-1.15.0.apache/bin/sqlline -u "jdbc:drill:drillbit=10.10.30.206" -n kuser1 -p mapr {noformat} I tried to access a file that is only accessible by root: {noformat} [root@perfnode206 drill-test-framework_krystal]# hf -ls /drill/testdata/impersonation/neg_tc5/student -rwx-- 3 root root 64612 2018-06-19 10:30 /drill/testdata/impersonation/neg_tc5/student {noformat} I am able to read the table, which should not be possible. I used this commit for Drill 1.15. {noformat} git.commit.id=bf2b414ac62cfc515fdd77f2688bb110073d764d git.commit.message.full=DRILL-6866\: Upgrade to SqlLine 1.6.0\n\n1. Changed SqlLine version to 1.6.0.\n2. Overridden new getVersion method in DrillSqlLineApplication.\n3. Set maxColumnWidth to 80 to avoid issue described in DRILL-6769.\n4. Changed colorScheme to obsidian.\n5. Output null value for varchar / char / boolean types as null instead of empty string.\n6. Changed access modifier from package default to public for JDBC classes that implement external interfaces to avoid issues when calling methods from these classes using reflection.\n\ncloses \#1556 {noformat} This is from drillbit.log. It shows that user is kuser1. 
{noformat} 2018-12-15 05:00:52,516 [23eb04fb-1701-bea7-dd97-ecda58795b3b:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 23eb04fb-1701-bea7-dd97-ecda58795b3b: State change requested PREPARING --> PLANNING 2018-12-15 05:00:52,531 [23eb04fb-1701-bea7-dd97-ecda58795b3b:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query with id 23eb04fb-1701-bea7-dd97-ecda58795b3b issued by kuser1: select * from dfs.`/drill/testdata/impersonation/neg_tc5/student` {noformat} It is not clear to me if this is a Drill problem or a file system problem. I tested MFS by logging in as kuser1 and trying to copy the file using "hadoop fs -copyToLocal /drill/testdata/impersonation/neg_tc5/student" and got an error, and was not able to copy the file. So I think MFS permissions are working. I also tried with Drill 1.14, and I get the expected error: {noformat} 0: jdbc:drill:drillbit=10.10.30.206> select * from dfs.`/drill/testdata/impersonation/neg_tc5/student` limit 1; Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/drill/testdata/impersonation/neg_tc5/student' not found within 'dfs' [Error Id: cdf18c2a-b005-4f92-b819-d4324e8807d9 on perfnode206.perf.lab:31010] (state=,code=0) {noformat} The commit for Drill 1.14 is: {noformat} git.commit.message.full=[maven-release-plugin] prepare release drill-1.14.0\n git.commit.id=0508a128853ce796ca7e99e13008e49442f83147 {noformat} This problem exists with both Apache JDBC and Simba ODBC. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6902) Extra limit operator is not needed
Robert Hou created DRILL-6902: - Summary: Extra limit operator is not needed Key: DRILL-6902 URL: https://issues.apache.org/jira/browse/DRILL-6902 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.15.0 Reporter: Robert Hou Assignee: Pritesh Maker For TPCDS query 49, there is an extra limit operator that is not needed. Here is the query: {noformat} SELECT 'web' AS channel, web.item, web.return_ratio, web.return_rank, web.currency_rank FROM (SELECT item, return_ratio, currency_ratio, Rank() OVER ( ORDER BY return_ratio) AS return_rank, Rank() OVER ( ORDER BY currency_ratio) AS currency_rank FROM (SELECT ws.ws_item_sk AS item, ( Cast(Sum(COALESCE(wr.wr_return_quantity, 0)) AS DEC(15, 4)) / Cast( Sum(COALESCE(ws.ws_quantity, 0)) AS DEC(15, 4)) ) AS return_ratio, ( Cast(Sum(COALESCE(wr.wr_return_amt, 0)) AS DEC(15, 4)) / Cast( Sum( COALESCE(ws.ws_net_paid, 0)) AS DEC(15, 4)) ) AS currency_ratio FROM web_sales ws LEFT OUTER JOIN web_returns wr ON ( ws.ws_order_number = wr.wr_order_number AND ws.ws_item_sk = wr.wr_item_sk ), date_dim WHERE wr.wr_return_amt > 1 AND ws.ws_net_profit > 1 AND ws.ws_net_paid > 0 AND ws.ws_quantity > 0 AND ws_sold_date_sk = d_date_sk AND d_year = 1999 AND d_moy = 12 GROUP BY ws.ws_item_sk) in_web) web WHERE ( web.return_rank <= 10 OR web.currency_rank <= 10 ) UNION SELECT 'catalog' AS channel, catalog.item, catalog.return_ratio, catalog.return_rank, catalog.currency_rank FROM (SELECT item, return_ratio, currency_ratio, Rank() OVER ( ORDER BY return_ratio) AS return_rank, Rank() OVER ( ORDER BY currency_ratio) AS currency_rank FROM (SELECT cs.cs_item_sk AS item, ( Cast(Sum(COALESCE(cr.cr_return_quantity, 0)) AS DEC(15, 4)) / Cast( Sum(COALESCE(cs.cs_quantity, 0)) AS DEC(15, 4)) ) AS return_ratio, ( Cast(Sum(COALESCE(cr.cr_return_amount, 0)) AS DEC(15, 4 )) / Cast(Sum( COALESCE(cs.cs_net_paid, 0)) AS DEC( 15, 4)) ) AS currency_ratio FROM catalog_sales cs LEFT OUTER JOIN catalog_returns cr ON ( 
cs.cs_order_number = cr.cr_order_number AND cs.cs_item_sk = cr.cr_item_sk ), date_dim WHERE cr.cr_return_amount > 1 AND cs.cs_net_profit > 1 AND cs.cs_net_paid > 0 AND cs.cs_quantity > 0 AND cs_sold_date_sk = d_date_sk AND d_year = 1999 AND d_moy = 12 GROUP BY cs.cs_item_sk) in_cat) catalog WHERE ( catalog.return_rank <= 10 OR catalog.currency_rank <= 10 ) UNION SELECT 'store' AS channel, store.item, store.return_ratio, store.return_rank, store.currency_rank FROM (SELECT item, return_ratio, currency_ratio, Rank() OVER ( ORDER BY return_ratio) AS return_rank, Rank() OVER ( ORDER BY currency_ratio) AS currency_rank FROM (SELECT sts.ss_item_sk AS item, ( Cast(Sum(COALESCE(sr.sr_return_quantity, 0)
[jira] [Created] (DRILL-6897) TPCH 13 has regressed
Robert Hou created DRILL-6897: - Summary: TPCH 13 has regressed Key: DRILL-6897 URL: https://issues.apache.org/jira/browse/DRILL-6897 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.15.0 Reporter: Robert Hou Assignee: Karthikeyan Manivannan Attachments: 240099ed-ef2a-a23a-4559-f1b2e0809e72.sys.drill, 2400be84-c024-cb92-8743-3211589e0247.sys.drill I ran TPCH query 13 with both scale factors 100 and 1000, ran them 3x to get a warm start, and ran them twice to verify the regression. It is regressing between 26 and 33%. Here is the query: {noformat} select c_count, count(*) as custdist from ( select c.c_custkey, count(o.o_orderkey) from customer c left outer join orders o on c.c_custkey = o.o_custkey and o.o_comment not like '%special%requests%' group by c.c_custkey ) as orders (c_custkey, c_count) group by c_count order by custdist desc, c_count desc; {noformat} I have attached two profiles. 240099ed-ef2a-a23a-4559-f1b2e0809e72 is for Drill 1.15. 2400be84-c024-cb92-8743-3211589e0247 is for Drill 1.14. The commit for Drill 1.15 is 596227bbbecfb19bdb55dd8ea58159890f83bc9c. The commit for Drill 1.14 is 0508a128853ce796ca7e99e13008e49442f83147. The two plans are nearly the same. One difference is that Drill 1.15 is using four times more memory in operator 07-01 Unordered Mux Exchange. I think the problem may be in operator 09-01 Project. Drill 1.15 projects the comment field while Drill 1.14 does not. Another issue is that Drill 1.15 takes more processing time to filter the orders table. Filter operator 09-03 takes an average of 19.3s. For Drill 1.14, filter operator 09-04 takes an average of 15.6s. They process the same number of rows, and have the same number of minor fragments. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
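For reference, the filter slowdown by itself works out to roughly a 24% regression; a quick check of the averages quoted above (the 26-33% figure is for the whole query, so the filter is not the only contributor):

```python
# Filter operator averages from the two attached profiles (seconds).
t_115 = 19.3  # Drill 1.15, operator 09-03
t_114 = 15.6  # Drill 1.14, operator 09-04

regression_pct = (t_115 - t_114) / t_114 * 100
print(f"{regression_pct:.1f}%")  # 23.7%
```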
[jira] [Resolved] (DRILL-6828) Hit UnrecognizedPropertyException when run tpch queries
[ https://issues.apache.org/jira/browse/DRILL-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6828. --- Resolution: Cannot Reproduce > Hit UnrecognizedPropertyException when run tpch queries > --- > > Key: DRILL-6828 > URL: https://issues.apache.org/jira/browse/DRILL-6828 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.15.0 > Environment: RHEL 7, Apache Drill commit id: > 18e09a1b1c801f2691a05ae7db543bf71874cfea >Reporter: Dechang Gu >Assignee: Robert Hou >Priority: Blocker > Fix For: 1.15.0 > > > Installed Apache Drill 1.15.0 commit id: > 18e09a1b1c801f2691a05ae7db543bf71874cfea DRILL-6763: Codegen optimization of > SQL functions with constant values(\#1481) > Hit the following errors: > {code} > java.sql.SQLException: SYSTEM ERROR: UnrecognizedPropertyException: > Unrecognized field "outgoingBatchSize" (class > org.apache.drill.exec.physical.config.HashPartitionSender), not marked as > ignorable (9 known properties: "receiver-major-fragment", > "initialAllocation", "expr", "userName", "@id", "child", "cost", > "destinations", "maxAllocation"]) > at [Source: (StringReader); line: 1000, column: 29] (through reference > chain: > org.apache.drill.exec.physical.config.HashPartitionSender["outgoingBatchSize"]) > Fragment 3:175 > Please, refer to logs for more information. 
> [Error Id: cc023cdb-9a46-4edd-ad0b-6da1e9085291 on ucs-node6.perf.lab:31010] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) > at > org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:600) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1288) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1109) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1120) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:196) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) > at PipSQueak.executeQuery(PipSQueak.java:289) > at PipSQueak.runTest(PipSQueak.java:104) > at PipSQueak.main(PipSQueak.java:477) > Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM > ERROR: UnrecognizedPropertyException: Unrecognized field "outgoingBatchSize" > (class org.apache.drill.exec.physical.config.HashPartitionSender), not marked > as ignorable (9 known properties: "receiver-major-fragment", > "initialAllocation", "expr", "userName", "@id", "child", "cost", > "destinations", "maxAllocation"]) > at [Source: (StringReader); line: 1000, column: 29] (through reference > chain: > org.apache.drill.exec.physical.config.HashPartitionSender["outgoingBatchSize"]) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6567) Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.
[ https://issues.apache.org/jira/browse/DRILL-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6567. --- Resolution: Fixed Assignee: Vitalii Diravka (was: Robert Hou) > Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: > java.lang.reflect.UndeclaredThrowableException. > --- > > Key: DRILL-6567 > URL: https://issues.apache.org/jira/browse/DRILL-6567 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Robert Hou >Assignee: Vitalii Diravka >Priority: Critical > Fix For: 1.15.0 > > > This is TPCDS Query 93. > Query: > /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query93.sql > SELECT ss_customer_sk, > Sum(act_sales) sumsales > FROM (SELECT ss_item_sk, > ss_ticket_number, > ss_customer_sk, > CASE > WHEN sr_return_quantity IS NOT NULL THEN > ( ss_quantity - sr_return_quantity ) * ss_sales_price > ELSE ( ss_quantity * ss_sales_price ) > END act_sales > FROM store_sales > LEFT OUTER JOIN store_returns > ON ( sr_item_sk = ss_item_sk > AND sr_ticket_number = ss_ticket_number ), > reason > WHERE sr_reason_sk = r_reason_sk > AND r_reason_desc = 'reason 38') t > GROUP BY ss_customer_sk > ORDER BY sumsales, > ss_customer_sk > LIMIT 100; > Here is the stack trace: > 2018-06-29 07:00:32 INFO DrillTestLogger:348 - > Exception: > java.sql.SQLException: INTERNAL_ERROR ERROR: > java.lang.reflect.UndeclaredThrowableException > Setup failed for null > Fragment 4:56 > [Error Id: 3c72c14d-9362-4a9b-affb-5cf937bed89e on atsqa6c82.qa.lab:31010] > (org.apache.drill.common.exceptions.ExecutionSetupException) > java.lang.reflect.UndeclaredThrowableException > > org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30 > org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():327 > org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245 > 
org.apache.drill.exec.physical.impl.ScanBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 > org.apache.drill.exec.record.AbstractRecordBatch.next():172 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 > org.apache.drill.exec.record.AbstractRecordBatch.next():172 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > > org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():118 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.physical.impl.BaseRootExec.next():103 > > org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():152 > org.apache.drill.exec.physical.impl.BaseRootExec.next():93 > 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1595 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1149 > java.util.concurrent.ThreadPoolExecutor$Worker.run():624 > java.lang.Thread.run():748 > Caused By (
[jira] [Created] (DRILL-6787) Update Spnego webpage
Robert Hou created DRILL-6787: - Summary: Update Spnego webpage Key: DRILL-6787 URL: https://issues.apache.org/jira/browse/DRILL-6787 Project: Apache Drill Issue Type: Bug Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Bridget Bevens Fix For: 1.15.0 A few things should be updated on this webpage: https://drill.apache.org/docs/configuring-drill-to-use-spnego-for-http-authentication/ When configuring drillbits in drill-override.conf, the principal and keytab should be corrected. There are two places where this should be corrected. {noformat} drill.exec.http: { auth.spnego.principal:"HTTP/hostname@realm", auth.spnego.keytab:"path/to/keytab", auth.mechanisms: ["SPNEGO"] } {noformat} For the section on Chrome, we should change "hostname/domain" to "domain", or "hostname@domain", and the two blanks around the "=" should be removed. {noformat} google-chrome --auth-server-whitelist="hostname/domain" {noformat} Also for the section on Chrome, the "domain" should match the URL given to Chrome to access the Web UI. Finally, Linux and Mac should be treated in separate paragraphs. These should be the directions for Mac: {noformat} cd /Applications/Google Chrome.app/Contents/MacOS ./"Google Chrome" --auth-server-whitelist="example.com" {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6726) Drill should return a better error message when a view uses a table that has a mixed case schema
Robert Hou created DRILL-6726: - Summary: Drill should return a better error message when a view uses a table that has a mixed case schema Key: DRILL-6726 URL: https://issues.apache.org/jira/browse/DRILL-6726 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Arina Ielchiieva Fix For: 1.15.0 Drill 1.14 changes schemas to be case-insensitive (DRILL-6492). If a view references a schema which has upper case letters, the view needs to be rebuilt. For example: {noformat} create or replace view `dfs.drillTestDirP1`.student_parquet_v as select * from `dfs.drillTestDirP1`.student; {noformat} If a query references this schema, Drill will return an exception: {noformat} java.sql.SQLException: VALIDATION ERROR: Failure while attempting to expand view. Requested schema drillTestDirP1 not available in schema dfs. {noformat} It would be helpful to users if the error message explains that these views need to be re-created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6725) Views cannot use tables with mixed case schemas
Robert Hou created DRILL-6725: - Summary: Views cannot use tables with mixed case schemas Key: DRILL-6725 URL: https://issues.apache.org/jira/browse/DRILL-6725 Project: Apache Drill Issue Type: Bug Components: Documentation Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Bridget Bevens Fix For: 1.14.0 Drill 1.14 changes schemas to be case-insensitive (DRILL-6492). If a view references a schema which has upper case letters, the view needs to be rebuilt. For example: create or replace view `dfs.drillTestDirP1`.student_parquet_v as select * from `dfs.drillTestDirP1`.student; Do we have release notes? If so, this should be documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6710) Drill C++ Client does not handle scale = 0 properly for decimal
Robert Hou created DRILL-6710: - Summary: Drill C++ Client does not handle scale = 0 properly for decimal Key: DRILL-6710 URL: https://issues.apache.org/jira/browse/DRILL-6710 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Sorabh Hamirwasia Fix For: 1.15.0 Query is: select cast('99' as decimal(18,0)) + cast('9' as decimal(38,0)) from data limit 1 This is the error I get when my test program calls SQLExecDirect: The driver reported the following diagnostics whilst running SQLExecDirect HY000:1:40140:[MapR][Support] (40140) Scale can't be less than zero. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
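For reference, the expected arithmetic: both operands are cast to scale 0, and the sum of two scale-0 decimals is itself a scale-0 decimal, which is a legal scale the client must accept. This can be sketched with Python's Decimal (an analogy for decimal semantics, not the C++ client's code path):

```python
from decimal import Decimal

# Mirrors: cast('99' as decimal(18,0)) + cast('9' as decimal(38,0))
a = Decimal("99")  # scale 0
b = Decimal("9")   # scale 0
total = a + b

print(total)                       # 108
print(total.as_tuple().exponent)   # 0 -- scale 0, which is valid, not "less than zero"
```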
[jira] [Created] (DRILL-6709) Batch statistics logging utility needs to be extended to mid-stream operators
Robert Hou created DRILL-6709: - Summary: Batch statistics logging utility needs to be extended to mid-stream operators Key: DRILL-6709 URL: https://issues.apache.org/jira/browse/DRILL-6709 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: salim achouche Fix For: 1.15.0 A new batch logging utility has been created to log batch sizing messages to drillbit.log. It is being used by the Parquet reader. It needs to be enhanced so it can be used by mid-stream operators. In particular, mid-stream operators have both incoming batches and outgoing batches, while Parquet only has outgoing batches. So the utility needs to support incoming batches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6688) Data batches for Project operator exceed the maximum specified
Robert Hou created DRILL-6688: - Summary: Data batches for Project operator exceed the maximum specified Key: DRILL-6688 URL: https://issues.apache.org/jira/browse/DRILL-6688 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Karthikeyan Manivannan Fix For: 1.15.0 I ran this query: alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072; alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select chr(101) CharacterValuea, chr(102) CharacterValueb, chr(103) CharacterValuec, chr(104) CharacterValued, chr(105) CharacterValuee from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet`; The output has 1024 identical lines: e f g h i There is one incoming batch: 2018-08-09 15:50:14,794 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming: Batch size: { Records: 6, Total size: 0, Data size: 30, Gross row width: 0, Net row width: 5, Density: 0% } Batch schema & sizes: { `_DEFAULT_COL_TO_READ_`(type: OPTIONAL INT, count: 6, Per entry: std data size: 4, std net size: 5, actual data size: 4, actual net size: 5 Totals: data size: 24, net size: 30) } } There are four outgoing batches. All are too large. 
The first three look like this: 2018-08-09 15:50:14,799 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing: Batch size: { Records: 16383, Total size: 0, Data size: 409575, Gross row width: 0, Net row width: 25, Density: 0% } Batch schema & sizes: { CharacterValuea(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) } CharacterValueb(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) } CharacterValuec(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) } CharacterValued(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) } CharacterValuee(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) } } The last batch is smaller because it has the remaining records. The data size (409575) exceeds the maximum batch size (131072). character415.q -- This message was sent by Atlassian JIRA (v7.6.3#76005)
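The reported sizes are internally consistent; the bug is only that the outgoing batch was not split to respect the configured maximum. A quick check of the arithmetic, with the numbers taken from the log above:

```python
# Outgoing-batch numbers from the BATCH_STATS log entry above.
records = 16383       # rows per outgoing batch
columns = 5           # CharacterValuea .. CharacterValuee
net_per_entry = 5     # "actual net size: 5" bytes per varchar value

net_row_width = columns * net_per_entry   # 25, matches "Net row width: 25"
data_size = records * net_row_width       # 409575, matches "Data size: 409575"
limit = 131072                            # output_batch_size session setting

print(data_size, data_size > limit)  # 409575 True -- roughly 3x over the limit
print(limit // net_row_width)        # 5242 -- rows that would fit under the limit
```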
[jira] [Created] (DRILL-6685) Error in parquet record reader
Robert Hou created DRILL-6685: - Summary: Error in parquet record reader Key: DRILL-6685 URL: https://issues.apache.org/jira/browse/DRILL-6685 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: salim achouche Fix For: 1.15.0 This is the query: alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072; alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select * from ( select BigIntValue, BooleanValue, DateValue, FloatValue, DoubleValue, IntegerValue, TimeValue, TimestampValue, IntervalYearValue, IntervalDayValue, IntervalSecondValue, VarbinaryValue1, VarcharValue1, VarbinaryValue2, VarcharValue2 from (select * from dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet` order by IntegerValue)) d where d.VarcharValue1 = 'Fq'; It appears to be caused by this commit: DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader aee899c1b26ebb9a5781d280d5a73b42c273d4d5 This is the stack trace: {noformat} oadd.org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: Error in parquet record reader.^M Message: ^M Hadoop path: /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet/0_0_0.parquet^M Total records read: 0^M Row group index: 0^M Records in row group: 14565^M Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {^M optional int64 Index;^M optional binary VarbinaryValue1;^M optional int64 BigIntValue;^M optional boolean BooleanValue;^M optional int32 DateValue (DATE);^M optional float FloatValue;^M optional binary VarcharValue1 (UTF8);^M optional double DoubleValue;^M optional int32 IntegerValue;^M optional int32 TimeValue (TIME_MILLIS);^M optional int64 TimestampValue (TIMESTAMP_MILLIS);^M optional binary VarbinaryValue2;^M optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);^M optional fixed_len_byte_array(12) IntervalDayValue 
(INTERVAL);^M optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);^M optional binary VarcharValue2 (UTF8);^M }^M , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks: [BlockMetaData{14565, 268477520 [ColumnMetaData{UNCOMPRESSED [Index] optional int64 Index [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED [VarbinaryValue1] optional binary VarbinaryValue1 [PLAIN, RLE, BIT_PACKED], 116579}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue [PLAIN, RLE, BIT_PACKED], 91098467}, ColumnMetaData{UNCOMPRESSED [BooleanValue] optional boolean BooleanValue [PLAIN, RLE, BIT_PACKED], 91155431}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue (DATE) [PLAIN, RLE, BIT_PACKED], 91157291}, ColumnMetaData{UNCOMPRESSED [FloatValue] optional float FloatValue [PLAIN, RLE, BIT_PACKED], 91215598}, ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1 (UTF8) [PLAIN, RLE, BIT_PACKED], 91273905}, ColumnMetaData{UNCOMPRESSED [DoubleValue] optional double DoubleValue [PLAIN, RLE, BIT_PACKED], 114039039}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32 IntegerValue [PLAIN, RLE, BIT_PACKED], 114155614}, ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue (TIME_MILLIS) [PLAIN, RLE, BIT_PACKED], 114213921}, ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue (TIMESTAMP_MILLIS) [PLAIN, RLE, BIT_PACKED], 114272228}, ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2 [PLAIN, RLE, BIT_PACKED], 114388803}, ColumnMetaData{UNCOMPRESSED [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL) [PLAIN, RLE, BIT_PACKED], 222455665}, ColumnMetaData{UNCOMPRESSED [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL) [PLAIN, RLE, BIT_PACKED], 222630508}, ColumnMetaData{UNCOMPRESSED [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL) [PLAIN, RLE, 
BIT_PACKED], 222805351}, ColumnMetaData{UNCOMPRESSED [VarcharValue2] optional binary VarcharValue2 (UTF8) [PLAIN, RLE, BIT_PACKED], 222980194}]}]}^M ^M Fragment 0:0^M ^M [Error Id: c6690ea1-2f28-4fbe-969f-d8b90da488fb on qa-node186.qa.lab:31010]^M ^M (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet record reader.^M Message: ^M Hadoop path: /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet/0_0_0.parquet^M Total records read: 0^M Row group index: 0^M Records in row group: 14565^M Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {^M optional int64 Index;^M optional binary VarbinaryValue1;^M optional int64 BigIntValue;^M optional boolean BooleanValue;^M optional int32 DateValue (DATE);^M optiona
[jira] [Created] (DRILL-6682) Cast integer to binary returns incorrect result
Robert Hou created DRILL-6682: - Summary: Cast integer to binary returns incorrect result Key: DRILL-6682 URL: https://issues.apache.org/jira/browse/DRILL-6682 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker This query returns an empty binary string: select cast(123 as binary) from (values(1)); The same problem occurs for bigint, float and double. Casting works if the data type is date, time, timestamp, interval, varchar and binary. select cast(date '2018-08-10' as binary) from (values(1)); select length(string_binary(cast(123 as binary))), length(string_binary(cast(date '2018-08-10' as binary))) from (values(1)); {noformat}
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| 0       | 10      |
+---------+---------+
{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
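Assuming the cast is expected to produce the textual bytes of the value (which is what the working date cast appears to do, given its length of 10), the expected lengths can be illustrated as follows. This is a hypothetical expectation for comparison; Drill's exact binary encoding of each type is not specified here.

```python
# Lengths if casting to binary yields the value's textual bytes.
int_as_bytes = str(123).encode()        # b'123'
date_as_bytes = "2018-08-10".encode()   # b'2018-08-10'

print(len(int_as_bytes))   # 3  -- but the query above returned 0 for the integer
print(len(date_as_bytes))  # 10 -- matches the date result above
```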
[jira] [Created] (DRILL-6623) Drill encounters exception IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))
Robert Hou created DRILL-6623: - Summary: Drill encounters exception IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768)) Key: DRILL-6623 URL: https://issues.apache.org/jira/browse/DRILL-6623 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker This is the query: alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select * from ( select split_part(CharacterValuea, '8', 1) CharacterValuea, split_part(CharacterValueb, '8', 1) CharacterValueb, split_part(CharacterValuec, '8', 2) CharacterValuec, split_part(CharacterValued, '8', 3) CharacterValued, split_part(CharacterValuee, 'b', 1) CharacterValuee from (select * from dfs.`/drill/testdata/batch_memory/character5_1MB_1GB.parquet` order by CharacterValuea) d where d.CharacterValuea = '1234567890123110'); The query works with a smaller table. 
This is the stack trace: {noformat} 2018-07-19 16:59:48,803 [24aedae9-d1f3-8e12-2e1f-0479915c61b1:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768)) Fragment 0:0 [Error Id: edc75560-41ca-4fdd-907f-060be1795786 on qa-node186.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768)) Fragment 0:0 [Error Id: edc75560-41ca-4fdd-907f-060be1795786 on qa-node186.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] Caused by: java.lang.IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768)) at io.netty.buffer.AbstractByteBuf.writerIndex(AbstractByteBuf.java:104) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final] at org.apache.drill.exec.vector.VarCharVector$Mutator.setValueCount(VarCharVector.java:810) ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setValueCount(NullableVarCharVector.java:641) ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setValueCount(ProjectRecordBatch.java:329) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:242) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:117) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14
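One pattern worth noting, without claiming it is the confirmed root cause: a negative writerIndex can arise when a byte-offset computation overflows Java's 32-bit signed int, and -8373248 is exactly the 32-bit wraparound of 4,286,594,048 (about 4 GB). A minimal illustration of that wraparound (the Java int emulation below is for illustration only):

```python
def as_int32(n):
    # Emulate Java's 32-bit signed int arithmetic (Python ints don't wrap).
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n & 0x80000000 else n

print(as_int32(2**31))       # -2147483648 (Integer.MAX_VALUE + 1 wraps negative)
print(as_int32(4286594048))  # -8373248, the writerIndex in the stack trace
```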
[jira] [Resolved] (DRILL-6605) TPCDS-84 Query does not return any rows
[ https://issues.apache.org/jira/browse/DRILL-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6605. --- Resolution: Fixed > TPCDS-84 Query does not return any rows > --- > > Key: DRILL-6605 > URL: https://issues.apache.org/jira/browse/DRILL-6605 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Robert Hou >Assignee: Robert Hou >Priority: Major > Attachments: drillbit.log.node80, drillbit.log.node81, > drillbit.log.node82, drillbit.log.node83, drillbit.log.node85, > drillbit.log.node86, drillbit.log.node87, drillbit.log.node88 > > > Query is: > Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql > This uses the hive parquet reader. > {code:sql} > SELECT c_customer_id AS customer_id, > c_last_name > || ', ' > || c_first_name AS customername > FROM customer, > customer_address, > customer_demographics, > household_demographics, > income_band, > store_returns > WHERE ca_city = 'Green Acres' > AND c_current_addr_sk = ca_address_sk > AND ib_lower_bound >= 54986 > AND ib_upper_bound <= 54986 + 5 > AND ib_income_band_sk = hd_income_band_sk > AND cd_demo_sk = c_current_cdemo_sk > AND hd_demo_sk = c_current_hdemo_sk > AND sr_cdemo_sk = cd_demo_sk > ORDER BY c_customer_id > LIMIT 100 > {code} > This query should return 100 rows. It does not return any rows. 
> Here is the explain plan: > {noformat} > | 00-00Screen > 00-01 Project(customer_id=[$0], customername=[$1]) > 00-02SelectionVectorRemover > 00-03 Limit(fetch=[100]) > 00-04SingleMergeExchange(sort0=[0]) > 01-01 OrderedMuxExchange(sort0=[0]) > 02-01SelectionVectorRemover > 02-02 TopN(limit=[100]) > 02-03HashToRandomExchange(dist0=[[$0]]) > 03-01 Project(customer_id=[$0], customername=[||(||($5, > ', '), $4)]) > 03-02Project(c_customer_id=[$1], > c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], > c_first_name=[$5], c_last_name=[$6], ca_address_sk=[$8], ca_city=[$9], > cd_demo_sk=[$7], hd_demo_sk=[$10], hd_income_band_sk=[$11], > ib_income_band_sk=[$12], ib_lower_bound=[$13], ib_upper_bound=[$14], > sr_cdemo_sk=[$0]) > 03-03 HashJoin(condition=[=($7, $0)], > joinType=[inner]) > 03-05HashToRandomExchange(dist0=[[$0]]) > 04-01 Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:store_returns), > columns=[`sr_cdemo_sk`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/web_returns], > confProperties={}]]) > 03-04HashToRandomExchange(dist0=[[$6]]) > 05-01 HashJoin(condition=[=($2, $9)], > joinType=[inner]) > 05-03HashJoin(condition=[=($3, $7)], > joinType=[inner]) > 05-05 HashJoin(condition=[=($1, $6)], > joinType=[inner]) > 05-07Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer), > columns=[`c_customer_id`, `c_current_cdemo_sk`, `c_current_hdemo_sk`, > `c_current_addr_sk`, `c_first_name`, `c_last_name`], numPartitions=0, > partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer], > confProperties={}]]) > 05-06BroadcastExchange > 06-01 Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer_demographics), > columns=[`cd_demo_sk`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_demographics], > confProperties={}]]) > 05-04 
BroadcastExchange > 07-01SelectionVectorRemover > 07-02 Filter(condition=[=($1, 'Green > Acres')]) > 07-03Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer_address), > columns=[`ca_address_sk`, `ca_city`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_address], > confProperties={}]]) > 05-02BroadcastExchange > 08-01 HashJoin(condition=[=($1, $2)], > joinType=[inner]) > 08-03Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:household_demographics), > columns=[`hd_demo_sk`, `hd_income_band_sk`], numPartitions=0, partitions= > null, > inputDirectories=[maprfs:/drill/
[jira] [Created] (DRILL-6605) Query does not return any rows
Robert Hou created DRILL-6605: - Summary: Query does not return any rows Key: DRILL-6605 URL: https://issues.apache.org/jira/browse/DRILL-6605 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.13.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.15.0 Query is: Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql This uses the hive parquet reader. SELECT c_customer_id AS customer_id, c_last_name || ', ' || c_first_name AS customername FROM customer, customer_address, customer_demographics, household_demographics, income_band, store_returns WHERE ca_city = 'Green Acres' AND c_current_addr_sk = ca_address_sk AND ib_lower_bound >= 54986 AND ib_upper_bound <= 54986 + 5 AND ib_income_band_sk = hd_income_band_sk AND cd_demo_sk = c_current_cdemo_sk AND hd_demo_sk = c_current_hdemo_sk AND sr_cdemo_sk = cd_demo_sk ORDER BY c_customer_id LIMIT 100 This query should return 100 rows. Commit id is: 1.14.0-SNAPSHOT a77fd142d86dd5648cda8866b8ff3af39c7b6b11 (DRILL-6516: EMIT support in streaming agg) 11.07.2018 @ 18:40:03 PDT Unknown 12.07.2018 @ 01:50:37 PDT -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6603) Query does not return enough rows
Robert Hou created DRILL-6603: - Summary: Query does not return enough rows Key: DRILL-6603 URL: https://issues.apache.org/jira/browse/DRILL-6603 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.15.0 Query is: /root/drillAutomation/framework-master/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q67.q select * from widestrings where str_var is null and dec_var_prec5_sc2 between 10 and 15 This query should return 5 rows. It is missing 3 rows. 1664IaYIEviH tJHD 6nF33QQJn1p4uuTELHOR2z0FCzMK35JkNeDRKCduYKUiPaXFgwftf4Ciidk2d7IXxyrCoX56Vsb ITcI9yxPpd3Gu6zkk2kktmZv9oHxMVE1ccVh2iGzU7greQuUEJ1oYFHGzGN9MEeKc5DqbHHT0F65NF1LE88CAudZW5bv6AiIj2D714q72g8ULd2WaazavWBQ6PgdKax 5kVvGkt9czWgZOH9CfT0ApOWUWZlQcvtVC2UumK6Q8tmE5f5yjKhTqvXOiistNIMo4K1NqG8U5t9V33b3h9Hk1ymyeGNMrb5Is1jB5nL9zlpyx3y46WoxV9GornIyrLw W4wxtVsbj2yFYuU65RdDzkNKezE0LsPtpXeEpJeFoFSP lF0wj8xSQg1wx5cfOMXBGNA1nvqTELCPCEzUvFj8hXQ3gANHJ9bOt7QFZhxWLlBhCevbqA40IgJntlf0cAJM6V562fpGd16Trt3mI4YQUOkf3luTVRcBJRpIdoP3ZzgvhnVrgfblboAFMZ8CzCaH7QrZf02fPtYJlBAdoJB6DMjqh6mbkphod1QGYOkE0jqLMCnKoZSpOG9Rk9dIFdlkIrvea0f1KDGAuAlYiTTsdgU4R6CowbVNfEyjIv0Wp1CXC6SzM1Vex6Ye7CrRptvn92SOQCsAElScXa1EuErruEAyIEvtWraXL5X42RxTBsH3TZTR6NVuUcpObKbVIx0kLTdbxIElf33x31QwXUfUVZ T4zHEpu6f4mLR6N9uLVG0Fza Glq3UxixhgxPXgZpQt9GqT3HJXHEn9F0KGaxhC9VCqSk119HrrJuMpHiYS34MCkw1iFhGFUsRKI3fTFaByicJeCIkjFwn2cr74lONdco4AAFdGGVN1cMgJmlOxUZE0Okv68DocVXUMSXCdcTBBmGL2h2gDIagThjo8sVXORponMNTrXEP068Zy7pNkVJyW10EoZwqE2IIcoKdixYsJvPc0mRWnk3gfSmB6uHWgKvgGq4yzzbGp3NT01z8IRYKbmSXTmLyk9rJjUYatoIi 757C2F0Yq0gceouo3LMaz9h4eyiC9psNiL3aoxquqrisayOjPs5esQzoY2iVmVZ7evrVCfxhe2AATFgTvk8Ek78y8s4nVNztlyluIrckfLbnOa25r1h9emJzooVV0Xj945xj5jAUHTZU9kCHKnmkcpEo0a7BdELbL0IvQlitXxbZBS86PlCltLGpLs fmYeUzJfpp0Cql3MAECSQQbW4ErwWScaZ5D rPfbbDZbF2m2ZtSPNn81G5zZBxfHgpuSm4UVrdd24NlLeG1mxwv zU1PbpjSCqbn8rUCWqn5LFafTrmSdtrCuFaknTpqmk1wR9cLnPF3cD xvh0EqSwvCmCTK9xCpZkJF 
4WnBX6w5vg7gQkjvF1GOqP3LeV3qbJc SO68S2UrCBNYQKdWyq4HeGG3TTuFF4x74nWkPPi0txEGiGDoYRxPvEQzWyhZ8SHpHZ3 0UpHpuLWEXIO6VZlPJd4uC IaDEIaB rkCJ8TaIVvaBIf0t8FGY8MgXTWzKdUBkOcQawbODXRLEtdGABTnOqftRSfUSpdojmlwRIs8xJIKaxK9wSL67DKahL6E7CvDBaQx20G0o7u rMaponV4OZmHE45vaeAqfLSyWlNL4UvOstiDPaDd8nI08g9MSKFtYYxt3RxvydGxCtaYfgsl3KxjN5VHnAxkvChVlvdS2Yd8IBA 0dZwblnKUBibdQSgxcypDbRCPeAaOr169L9mrMv82w0V1Ndyt3qK wcpv5nKeO8P9kbVlWY9bGi9nxCVs804WBZMA9vc7AT4h7Jp0OsaHbJx0qyFyAnXP lu MMsOa28VxSW8thiTfIcx2qkdFN1KXrXpU4uo lxUOcJhH0HlyX6kLKhCnVqpG tFP93c5jJ7FdeSujFvxPgo1rQSN9DHXk4DR6nytgBrn2oGcM58zadRNaqoIL2wmWygQsnk7Euzypbg4KhlTICBl1mpb0JwbI7uaCudGcDNWIBMerY WgjahuC3QjIFd48o78CQSgqgQjzpHzdELrqMCKaKfdW4ihpHCA0sqNBYGQxxd T8iTWorOODkg5Kc7m4gPut8tuzEMOQus1xdajv9PqS8F7xwzAWyhymyYBJ8505HxZDuSFqBXSkpxGDh21fiBHkeKBC9RZp7r yD7i6xvRh47Vln0IxvnwcpahLltLr12yL0sDu9LXxHNAHU4gyvHud5J5xXJPD7r5xHXvtNOSiXVl hkBBib1k4IO9YjCgModazXNudTx2Mr8ccq6 kNLKwnrwGdssm3JYyjBsUcXyLMHpS7vncUeKSw2rov4Hg4gTZU8sJMJMAJvu8d6IDJYMHULwrawKOhK8rDTP6sk9Hv27mCG8Gf9inG38Pik7AfnEtUIiZZozEsiSkWvAA7YiHlNDUuL3OX2FRgt2qu9T7zXtQkhon8uSv5FncUq17XB9idflAO0rWIK57HoilaXgIDrzG61kfSKZXpdKuwBVsRNmgJVDSedRsSihlcVDdZ7bmqsgzbvKhFri8lSh8ez6ttlXgF8h4wJ2985bVw5PUmLdeGjlbfrLF0f22vqGi11qz2GUltrjBmmBSrbCLpFUkwqqpATRoQEwo27qi5XwHYWWBqPN9rxF orktFM5SRwG2IJmx8li8sRRchYnNYQgH7iuwKqd69jJJTwwdYla2296Lhw88YHzL60aq2XomN0BNNSoY8cALvy0QIHZpCFd3EmBojr46d6c8nBYMXJLlgKNzklk8vMTKrjAgBQevUH4U7gbQpOIWVf7Tx2BIXkdRGwQYHAuJzU5gtDuDqhuddXkGdACMmp0tgJVP2tpMW05Z3OGs6jYKb5xtqHotIJd7tUM33J85fRYOEIoGOaRblZr7RF82nSOSpPQnDgnVUhJ1j mCY1ofeqG7QqeV6LTdRyRPgiiPwHF1Xgpb3feAJ804NmX7xOkDPvw0WeqxrSVMCto r8E64UsRFypZ wtzVAlTJKgTMpzA4xeuVXuk85mpEJTIQpNxPjU3vgAacENiejcRs68Y85Ncb5ymC3fD0WAyh23VIsy GqaCV9hIFrAs tMM2zlkqpoBsSwgODBEsizaJkb4ZOWJj3Z2Wttr08YPpXSO6 IhQKD5SHqNXEDNar2UVZwFZbg1YJccvsjWEtfm0AUZ 3KHMUb3X1F3tWqIYrZucrsjUp2xfaGtqnsij4q7CRWhRucucjyKcKmiaGE7XllzVGPeHWmbtAFku355JLB2OlBXdsgWMVZFcaCOHff6OlSECOgdLGBSL297kgCVKLzDEvxS 
T4rb5neHQffvmAHOzdIuDGw1559XGVHwzz5lLoc3iSicYlwZTKN2VUOQPHRSqTI1hMJmgTcUaO3LEHyxL2so3EedaU9BSaTaA3kPefKSdu ibaW3h1 WKkznSnlmVjhLzq5e5ywYzwA26EusRtJmAAiiSrYG20uO7ejp1AlorSgOAfM9B5qxQAqaDqQMUlvhlu7SjK46egz5kK3xtcoUfyxyUwAonh3iv VJPXdvxm8ZuZbnm82xLkh4MeWbClb0jH5E42m9aFp8GrSQzAwhzciocZJABwerP1sfITnG6EMyPKdl7FBIjJKjNcFOVabzQX966h6WYnAOKuaYdJWNGgKOISIcR6OwHIaUWjqV9w84VYxXutZJ1rRlbeUPT8ygTZmFk2FK2Ix02rBzt0nFkiTNmoZSilSzSOxSF iwtXmtDRtjrQPQCVKlZM3KrYjiJfOem8PIOA8wadL0lHN87gpEqUsrvpohZ8FRW ILoeDeWeBYO94JOrYv7JdirgNH7MBdmrMQOrBPpY6bdX3is62JWMm9c0Xv7jyEVdq3hkSsJLWEr4Gu8TZBfjrd9rVX0gqjlQZsk30UwEDjvtfufkYcJj2sGbJ3HzJdIh1MCHIoPb1YyacfzEvnQsnlQagfRu51vSF8qehDJ2AtCezy6hOdwberI4qgP8HMuBKRjoyN91ipykonft9himO44rJtkiREFA9opJA9jKWM8kYzICDmE2 D3pZcmMGyUEyCY K7IEITWxzmISenhl1Ext2wzZxJoQcfLNU 8rmXNFLwxnJCEYq4bNrEn9IQw 6xhgjw8roQVEgL8NZTxtlcve8RAyLILFdfNsvvg7qa700PCc ZD
[jira] [Created] (DRILL-6594) Data batches for Project operator are not being split properly and exceed the maximum specified
Robert Hou created DRILL-6594: - Summary: Data batches for Project operator are not being split properly and exceed the maximum specified Key: DRILL-6594 URL: https://issues.apache.org/jira/browse/DRILL-6594 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Karthikeyan Manivannan Fix For: 1.14.0 I ran this query: alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072; alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select * from ( select case when false then c.CharacterValuea else i.IntegerValuea end IntegerValuea, case when false then c.CharacterValueb else i.IntegerValueb end IntegerValueb, case when false then c.CharacterValuec else i.IntegerValuec end IntegerValuec, case when false then c.CharacterValued else i.IntegerValued end IntegerValued, case when false then c.CharacterValuee else i.IntegerValuee end IntegerValuee from (select * from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet` order by CharacterValuea) c, dfs.`/drill/testdata/batch_memory/integer5_1MB.parquet` i where i.Index = c.Index and c.CharacterValuea = '1234567890123100') limit 10; An incoming batch looks like this: 2018-06-14 19:28:10,905 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming: Batch size: { Records: 32768, Total size: 20512768, Data size: 9175040, Gross row width: 626, Net row width: 280, Density: 45% } An outgoing batch looks like this: 2018-06-14 19:28:10,911 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing: Batch size: { Records: 1023, Total size: 11018240, Data size: 138105, Gross row width: 10771, Net row width: 135, Density: 2% } The data size (138105) exceeds the maximum batch size (131072).
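Using only the numbers from the BATCH_STATS lines above, a quick back-of-envelope check (a sketch of the arithmetic, not Drill's actual sizing code) shows why the outgoing batch is over the limit: at a 131072-byte output batch size and a 135-byte net row width, at most 970 rows fit, yet 1023 were emitted.

```python
# Sketch of the batch-sizing arithmetic implied by the BATCH_STATS log
# lines above; variable names are illustrative, not Drill's actual code.
batch_limit = 131072    # drill.exec.memory.operator.project.output_batch_size
net_row_width = 135     # bytes per row, from the outgoing BATCH_STATS entry

max_rows = batch_limit // net_row_width       # rows that would fit: 970
emitted_rows = 1023                           # rows the outgoing batch held
emitted_data = emitted_rows * net_row_width   # 138105 bytes, as reported

# The batch overflows the configured limit by 138105 - 131072 bytes.
print(max_rows, emitted_data, emitted_data > batch_limit)
```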
[jira] [Created] (DRILL-6569) Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_
Robert Hou created DRILL-6569: - Summary: Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet Key: DRILL-6569 URL: https://issues.apache.org/jira/browse/DRILL-6569 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.14.0 This is TPCDS Query 19. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query19.sql SELECT i_brand_id brand_id, i_brand brand, i_manufact_id, i_manufact, Sum(ss_ext_sales_price) ext_price FROM date_dim, store_sales, item, customer, customer_address, store WHERE d_date_sk = ss_sold_date_sk AND ss_item_sk = i_item_sk AND i_manager_id = 38 AND d_moy = 12 AND d_year = 1998 AND ss_customer_sk = c_customer_sk AND c_current_addr_sk = ca_address_sk AND Substr(ca_zip, 1, 5) <> Substr(s_zip, 1, 5) AND ss_store_sk = s_store_sk GROUP BY i_brand, i_brand_id, i_manufact_id, i_manufact ORDER BY ext_price DESC, i_brand, i_brand_id, i_manufact_id, i_manufact LIMIT 100; Here is the stack trace: 2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception: java.sql.SQLException: INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet Fragment 4:26 [Error Id: 6401a71e-7a5d-4a10-a17c-16873fc3239b on atsqa6c88.qa.lab:31010] (hive.org.apache.parquet.io.ParquetDecodingException) Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet hive.org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue():243 hive.org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue():227 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():199 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():57 
org.apache.drill.exec.store.hive.readers.HiveAbstractReader.hasNextValue():417 org.apache.drill.exec.store.hive.readers.HiveParquetReader.next():54 org.apache.drill.exec.physical.impl.ScanBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 
org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 o
[jira] [Created] (DRILL-6568) Jenkins Regression: TPCDS query 68 fails with IllegalStateException: Unexpected EMIT outcome received in buildSchema phase
Robert Hou created DRILL-6568: - Summary: Jenkins Regression: TPCDS query 68 fails with IllegalStateException: Unexpected EMIT outcome received in buildSchema phase Key: DRILL-6568 URL: https://issues.apache.org/jira/browse/DRILL-6568 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Khurram Faraaz Fix For: 1.14.0 This is TPCDS Query 68. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/maprdb/json/query68.sql SELECT c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number, extended_price, extended_tax, list_price FROM (SELECT ss_ticket_number, ss_customer_sk, ca_city bought_city, Sum(ss_ext_sales_price) extended_price, Sum(ss_ext_list_price) list_price, Sum(ss_ext_tax) extended_tax FROM store_sales, date_dim, store, household_demographics, customer_address WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk AND store_sales.ss_store_sk = store.s_store_sk AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk AND store_sales.ss_addr_sk = customer_address.ca_address_sk AND date_dim.d_dom BETWEEN 1 AND 2 AND ( household_demographics.hd_dep_count = 8 OR household_demographics.hd_vehicle_count = 3 ) AND date_dim.d_year IN ( 1998, 1998 + 1, 1998 + 2 ) AND store.s_city IN ( 'Fairview', 'Midway' ) GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, customer, customer_address current_addr WHERE ss_customer_sk = c_customer_sk AND customer.c_current_addr_sk = current_addr.ca_address_sk AND current_addr.ca_city <> bought_city ORDER BY c_last_name, ss_ticket_number LIMIT 100; Here is the stack trace: 2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception: java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Unexpected EMIT outcome received in buildSchema phase Fragment 0:0 [Error Id: edbe3477-805e-4f1f-8405-d5c194dc28c2 on atsqa6c87.qa.lab:31010] (java.lang.IllegalStateException) Unexpected EMIT 
outcome received in buildSchema phase org.apache.drill.exec.physical.impl.TopN.TopNBatch.buildSchema():178 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():87 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:600) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1904) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:64) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:630) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1109) at org.apache.drill.jdbc.impl.DrillMet
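The IllegalStateException above comes from TopNBatch.buildSchema rejecting an iterator outcome it does not expect while constructing its schema. A simplified model of that check (illustrative Python; the outcome names mirror Drill's IterOutcome values, but this is not Drill's actual Java code):

```python
# Simplified model of a record-batch iterator outcome check; the names
# mirror Drill's IterOutcome values but this is not Drill's actual code.
OK_NEW_SCHEMA, OK, EMIT, NONE = "OK_NEW_SCHEMA", "OK", "EMIT", "NONE"

def build_schema(first_outcome):
    # Schema construction expects the upstream's first batch to carry a
    # schema (OK_NEW_SCHEMA) or to signal an empty stream (NONE); EMIT
    # is not a legal outcome at this stage, hence the failure above.
    if first_outcome == EMIT:
        raise RuntimeError(
            "Unexpected EMIT outcome received in buildSchema phase")
    return first_outcome
```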
[jira] [Resolved] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.
[ https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6566. --- Resolution: Resolved Will remove from Jenkins for now. > Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. AGGR OOM at First Phase. > -- > > Key: DRILL-6566 > URL: https://issues.apache.org/jira/browse/DRILL-6566 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Robert Hou >Assignee: Boaz Ben-Zvi >Priority: Critical > Fix For: 1.14.0 > > > This is TPCDS Query 66. > Query: > /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql > SELECT w_warehouse_name, > w_warehouse_sq_ft, > w_city, > w_county, > w_state, > w_country, > ship_carriers, > year1, > Sum(jan_sales) AS jan_sales, > Sum(feb_sales) AS feb_sales, > Sum(mar_sales) AS mar_sales, > Sum(apr_sales) AS apr_sales, > Sum(may_sales) AS may_sales, > Sum(jun_sales) AS jun_sales, > Sum(jul_sales) AS jul_sales, > Sum(aug_sales) AS aug_sales, > Sum(sep_sales) AS sep_sales, > Sum(oct_sales) AS oct_sales, > Sum(nov_sales) AS nov_sales, > Sum(dec_sales) AS dec_sales, > Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, > Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, > Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, > Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, > Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, > Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, > Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, > Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, > Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, > Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, > Sum(nov_sales / w_warehouse_sq_ft) AS 
nov_sales_per_sq_foot, > Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, > Sum(jan_net) AS jan_net, > Sum(feb_net) AS feb_net, > Sum(mar_net) AS mar_net, > Sum(apr_net) AS apr_net, > Sum(may_net) AS may_net, > Sum(jun_net) AS jun_net, > Sum(jul_net) AS jul_net, > Sum(aug_net) AS aug_net, > Sum(sep_net) AS sep_net, > Sum(oct_net) AS oct_net, > Sum(nov_net) AS nov_net, > Sum(dec_net) AS dec_net > FROM (SELECT w_warehouse_name, > w_warehouse_sq_ft, > w_city, > w_county, > w_state, > w_country, > 'ZOUROS' > \|\| ',' > \|\| 'ZHOU' AS ship_carriers, > d_yearAS year1, > Sum(CASE > WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jan_sales, > Sum(CASE > WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS feb_sales, > Sum(CASE > WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS mar_sales, > Sum(CASE > WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS apr_sales, > Sum(CASE > WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS may_sales, > Sum(CASE > WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jun_sales, > Sum(CASE > WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS jul_sales, > Sum(CASE > WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS aug_sales, > Sum(CASE > WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS sep_sales, > Sum(CASE > WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS oct_sales, > Sum(CASE > WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS nov_sales, > Sum(CASE > WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity > ELSE 0 > END) AS dec_sales, > Sum(CASE > WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS jan_net, > Sum(CASE > WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS feb_net, > Sum(CASE > WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity > 
ELSE 0 > END) AS mar_net, > Sum(CASE > WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity > ELSE 0 > END) AS apr_net, > Sum(CASE > WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quanti
[jira] [Created] (DRILL-6567) Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.
Robert Hou created DRILL-6567: - Summary: Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException. Key: DRILL-6567 URL: https://issues.apache.org/jira/browse/DRILL-6567 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.14.0 This is TPCDS Query 93. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query93.sql SELECT ss_customer_sk, Sum(act_sales) sumsales FROM (SELECT ss_item_sk, ss_ticket_number, ss_customer_sk, CASE WHEN sr_return_quantity IS NOT NULL THEN ( ss_quantity - sr_return_quantity ) * ss_sales_price ELSE ( ss_quantity * ss_sales_price ) END act_sales FROM store_sales LEFT OUTER JOIN store_returns ON ( sr_item_sk = ss_item_sk AND sr_ticket_number = ss_ticket_number ), reason WHERE sr_reason_sk = r_reason_sk AND r_reason_desc = 'reason 38') t GROUP BY ss_customer_sk ORDER BY sumsales, ss_customer_sk LIMIT 100; Here is the stack trace: 2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception: java.sql.SQLException: INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException Setup failed for null Fragment 4:56 [Error Id: 3c72c14d-9362-4a9b-affb-5cf937bed89e on atsqa6c82.qa.lab:31010] (org.apache.drill.common.exceptions.ExecutionSetupException) java.lang.reflect.UndeclaredThrowableException org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30 org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():327 org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245 org.apache.drill.exec.physical.impl.ScanBatch.next():164 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 
org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():118 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():152 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 
org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 Caused By (java.util.concurrent.ExecutionException) java.lang.reflect.UndeclaredThrowableException java.util.concurrent.FutureTask.report():122 java.util.concurrent.FutureTask.get():192 org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():320 org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245 org.apache.drill.exec.physical.impl.ScanBatch.next():164 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJo
[jira] [Created] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.
Robert Hou created DRILL-6566: - Summary: Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase. Key: DRILL-6566 URL: https://issues.apache.org/jira/browse/DRILL-6566 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.14.0 This is TPCDS Query 66. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year1, Sum(jan_sales) AS jan_sales, Sum(feb_sales) AS feb_sales, Sum(mar_sales) AS mar_sales, Sum(apr_sales) AS apr_sales, Sum(may_sales) AS may_sales, Sum(jun_sales) AS jun_sales, Sum(jul_sales) AS jul_sales, Sum(aug_sales) AS aug_sales, Sum(sep_sales) AS sep_sales, Sum(oct_sales) AS oct_sales, Sum(nov_sales) AS nov_sales, Sum(dec_sales) AS dec_sales, Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, Sum(jan_net) AS jan_net, Sum(feb_net) AS feb_net, Sum(mar_net) AS mar_net, Sum(apr_net) AS apr_net, Sum(may_net) AS may_net, Sum(jun_net) AS jun_net, Sum(jul_net) AS jul_net, Sum(aug_net) AS aug_net, Sum(sep_net) AS 
sep_net, Sum(oct_net) AS oct_net, Sum(nov_net) AS nov_net, Sum(dec_net) AS dec_net FROM (SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, 'ZOUROS' || ',' || 'ZHOU' AS ship_carriers, d_yearAS year1, Sum(CASE WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jan_sales, Sum(CASE WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS feb_sales, Sum(CASE WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS mar_sales, Sum(CASE WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS apr_sales, Sum(CASE WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS may_sales, Sum(CASE WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jun_sales, Sum(CASE WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jul_sales, Sum(CASE WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS aug_sales, Sum(CASE WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS sep_sales, Sum(CASE WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS oct_sales, Sum(CASE WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS nov_sales, Sum(CASE WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS dec_sales, Sum(CASE WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jan_net, Sum(CASE WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS feb_net, Sum(CASE WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS mar_net, Sum(CASE WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS apr_net, Sum(CASE WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS may_net, Sum(CASE WHEN d_moy = 6 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jun_net, Sum(CASE WHEN d_moy = 7 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jul_net, Sum(CASE WHEN d_moy = 8 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS aug_net, 
Sum(CASE WHEN d_moy = 9 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS sep_net, Sum(CASE WHEN d_moy = 10 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS oct_net, Sum(CASE WHEN d_moy = 11 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS nov_net, Sum(CASE WHEN d_moy = 12 THEN ws_net_paid_inc_ship
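The monthly buckets in query 66 come from conditional aggregation: each month's total is SUM(CASE WHEN d_moy = m THEN ws_ext_sales_price * ws_quantity ELSE 0 END). A minimal Python sketch of that pivot pattern (the column names follow the query; the sample rows are invented, not TPCDS data):

```python
# Conditional-aggregation pivot, as in
# SUM(CASE WHEN d_moy = m THEN ws_ext_sales_price * ws_quantity ELSE 0 END).
# The rows below are invented sample data for illustration only.
rows = [
    {"d_moy": 1, "ws_ext_sales_price": 10.0, "ws_quantity": 2},
    {"d_moy": 1, "ws_ext_sales_price": 5.0,  "ws_quantity": 1},
    {"d_moy": 2, "ws_ext_sales_price": 3.0,  "ws_quantity": 4},
]

def monthly_sales(rows):
    """One accumulator per month; a row contributes only to its own month,
    mirroring the CASE WHEN ... ELSE 0 END inside each SUM."""
    sales = {m: 0.0 for m in range(1, 13)}
    for r in rows:
        sales[r["d_moy"]] += r["ws_ext_sales_price"] * r["ws_quantity"]
    return sales

totals = monthly_sales(rows)
```

With 24 such SUMs (12 for sales, 12 for net), a single first-phase hash aggregate must hold all the accumulators per group, which is relevant to the OOM reported here.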
[jira] [Created] (DRILL-6565) cume_dist does not return enough rows
Robert Hou created DRILL-6565: - Summary: cume_dist does not return enough rows Key: DRILL-6565 URL: https://issues.apache.org/jira/browse/DRILL-6565 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Attachments: drillbit.log.7802 This query should return 64 rows but only returns 38 rows: alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select * from ( select cume_dist() over (order by Index) IntervalSecondValuea, Index from (select * from dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet` order by BigIntvalue)) d where d.Index = 1; I tried to reproduce the problem with a smaller table, but could not. I also tried without the outer select statement, and could not reproduce it that way either. Here is the explain plan: {noformat}
00-00 Screen : rowType = RecordType(DOUBLE IntervalSecondValuea, ANY Index): rowcount = 12000.0, cumulative cost = {757200.0 rows, 1.1573335922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4034
00-01 ProjectAllowDup(IntervalSecondValuea=[$0], Index=[$1]) : rowType = RecordType(DOUBLE IntervalSecondValuea, ANY Index): rowcount = 12000.0, cumulative cost = {756000.0 rows, 1.1572135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4033
00-02 Project(w0$o0=[$1], $0=[$0]) : rowType = RecordType(DOUBLE w0$o0, ANY $0): rowcount = 12000.0, cumulative cost = {744000.0 rows, 1.1548135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4032
00-03 SelectionVectorRemover : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 12000.0, cumulative cost = {732000.0 rows, 1.1524135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4031
00-04 Filter(condition=[=($0, 1)]) : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 12000.0, cumulative cost = {72.0 rows, 1.1512135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4030
00-05 Window(window#0=[window(partition {} order by [0] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [CUME_DIST()])]) : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 8.0, cumulative cost = {64.0 rows, 1.1144135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4029
00-06 SelectionVectorRemover : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {56.0 rows, 1.0984135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4028
00-07 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {48.0 rows, 1.0904135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4027
00-08 Project($0=[ITEM($0, 'Index')]) : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {40.0 rows, 5692067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4026
00-09 SelectionVectorRemover : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {32.0 rows, 5612067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4025
00-10 Sort(sort0=[$1], dir0=[ASC]) : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {24.0 rows, 5532067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4024
00-11 Project(T2¦¦**=[$0], BigIntvalue=[$1]) : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {16.0 rows, 32.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4023
00-12 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet]], selectionRoot=maprfs:/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet, numFiles=1, numRowGroups=6, usedMetadataFile=false, columns=[`**`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {8.0 rows, 16.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4022
{noformat} I have attached the drillbit.log.
The commit id is: | 1.14.0-SNAPSHOT | aa127b70b1e46f7f4aa19881f25eda583627830a | DRILL-6523: Fix NPE for describe of partial schema | 22.06.2018 @ 11:28:23 PDT | r...@mapr.com | 23.06.2018 @ 02:05:10 PDT | fourvarchar_asc_nulls95.q -- This message was sent by Atlassian JIRA (v7.6.3#76005)
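For reference, CUME_DIST() assigns each row the fraction of rows in its window whose sort key is less than or equal to its own; it never drops rows, so the window should emit one output row per input row, and the outer filter on Index should then keep all 64. A standalone sketch of the semantics (invented data, not the parquet file from this report):

```python
def cume_dist(keys):
    """CUME_DIST over a single window ordered by `keys`:
    fraction of rows with key <= the current row's key.
    Every input row yields exactly one output row."""
    n = len(keys)
    return [sum(1 for other in keys if other <= k) / n for k in keys]

# Ties share one value, but each tied row still appears in the output.
dists = cume_dist([1, 1, 2, 3])
```

Since all 64 qualifying rows here have Index = 1, they are all ties and should all survive the filter; returning 38 rows means rows were lost somewhere in the window/filter pipeline.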
[jira] [Created] (DRILL-6547) IllegalStateException: Tried to remove unmanaged buffer.
Robert Hou created DRILL-6547: - Summary: IllegalStateException: Tried to remove unmanaged buffer. Key: DRILL-6547 URL: https://issues.apache.org/jira/browse/DRILL-6547 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker This is the query: select * from ( select Index, concat(BinaryValue, 'aaa') NewVarcharValue from (select * from dfs.`/drill/testdata/batch_memory/alltypes_large_1MB.parquet`)) d where d.Index = 1; This is the plan: {noformat}
00-00 Screen
00-01   Project(Index=[$0], NewVarcharValue=[$1])
00-02     SelectionVectorRemover
00-03       Filter(condition=[=($0, 1)])
00-04         Project(Index=[$0], NewVarcharValue=[CONCAT($1, 'aaa')])
00-05           Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/batch_memory/alltypes_large_1MB.parquet]], selectionRoot=maprfs:/drill/testdata/batch_memory/alltypes_large_1MB.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`Index`, `BinaryValue`]]])
{noformat} Here is the stack trace from drillbit.log: {noformat} 2018-06-27 13:55:03,291 [24cc0659-30b7-b290-7fae-ecb1c1f15c05:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Tried to remove unmanaged buffer. Fragment 0:0 [Error Id: bc1f2f72-c31b-4b9a-964f-96dec9e0f388 on qa-node186.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Tried to remove unmanaged buffer.
Fragment 0:0 [Error Id: bc1f2f72-c31b-4b9a-964f-96dec9e0f388 on qa-node186.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] Caused by: java.lang.IllegalStateException: Tried to remove unmanaged buffer. 
at org.apache.drill.exec.ops.BufferManagerImpl.replace(BufferManagerImpl.java:50) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at io.netty.buffer.DrillBuf.reallocIfNeeded(DrillBuf.java:97) ~[drill-memory-base-1.14.0-SNAPSHOT.jar:4.0.48.Final] at org.apache.drill.exec.test.generated.ProjectorGen4046.doEval(ProjectorTemplate.java:77) ~[na:na] at org.apache.drill.exec.test.generated.ProjectorGen4046.projectRecords(ProjectorTemplate.java:67) ~[na:na] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:236) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:117) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:147) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAP
[jira] [Created] (DRILL-6393) Radians should take an argument (x)
Robert Hou created DRILL-6393: - Summary: Radians should take an argument (x) Key: DRILL-6393 URL: https://issues.apache.org/jira/browse/DRILL-6393 Project: Apache Drill Issue Type: Bug Components: Documentation Affects Versions: 1.13.0 Reporter: Robert Hou Assignee: Bridget Bevens Fix For: 1.14.0 The radians function is missing an argument on this webpage: https://drill.apache.org/docs/math-and-trig/ The table has this information: {noformat} RADIANS FLOAT8 Converts x degress to radians. {noformat} It should be: {noformat} RADIANS(x) FLOAT8 Converts x degrees to radians. {noformat} Also, "degress" is misspelled; it should be "degrees". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
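The corrected entry is easy to sanity-check: RADIANS(x) computes x * pi / 180. In Python, for example:

```python
import math

# RADIANS(x) converts x degrees to radians, i.e. x * pi / 180.
# Use isclose rather than == to allow for floating-point rounding.
assert math.isclose(math.radians(180.0), math.pi)
assert math.isclose(math.radians(90.0), math.pi / 2)
assert math.radians(0.0) == 0.0
```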
[jira] [Resolved] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
[ https://issues.apache.org/jira/browse/DRILL-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5900. --- Resolution: Fixed This test is now passing. > Regression: TPCH query encounters random IllegalStateException: Memory was > leaked by query > -- > > Key: DRILL-5900 > URL: https://issues.apache.org/jira/browse/DRILL-5900 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Timothy Farkas >Priority: Blocker > Attachments: 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f.sys.drill, > drillbit.log.node81, drillbit.log.node88 > > > This is a random failure in the TPCH-SF100-baseline run. The test is > /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql. > This test has passed before. > TPCH query 6: > {noformat} > SELECT > SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY > FROM > lineitem L, > part P > WHERE > P.P_PARTKEY = L.L_PARTKEY > AND P.P_BRAND = 'BRAND#13' > AND P.P_CONTAINER = 'JUMBO CAN' > AND L.L_QUANTITY < ( > SELECT > 0.2 * AVG(L2.L_QUANTITY) > FROM > lineitem L2 > WHERE > L2.L_PARTKEY = P.P_PARTKEY > ) > {noformat} > Error is: > {noformat} > 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Memory was leaked by query. Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query. 
Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > ... 5 common frames omitted > 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO > o.a.d.e.w.f.FragmentStatusReporter - > 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: State to report: FINISHED > {noformat} > sys.version is: > 1.12.0-SNAPSHOT b0c4e0486d6d4620b04a1bb8198
[jira] [Created] (DRILL-6276) Drill CTAS creates parquet file having page greater than 200 MB.
Robert Hou created DRILL-6276: - Summary: Drill CTAS creates parquet file having page greater than 200 MB. Key: DRILL-6276 URL: https://issues.apache.org/jira/browse/DRILL-6276 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Affects Versions: 1.13.0 Reporter: Robert Hou Attachments: alltypes_asc_16MB.json I used this CTAS to create a parquet file from a json file: {noformat} create table `alltypes.parquet` as select cast(BigIntValue as BigInt) BigIntValue, cast(BooleanValue as Boolean) BooleanValue, cast (DateValue as Date) DateValue, cast (FloatValue as Float) FloatValue, cast (DoubleValue as Double) DoubleValue, cast (IntegerValue as Integer) IntegerValue, cast (TimeValue as Time) TimeValue, cast (TimestampValue as Timestamp) TimestampValue, cast (IntervalYearValue as INTERVAL YEAR) IntervalYearValue, cast (IntervalDayValue as INTERVAL DAY) IntervalDayValue, cast (IntervalSecondValue as INTERVAL SECOND) IntervalSecondValue, cast (BinaryValue as binary) Binaryvalue, cast (VarcharValue as varchar) VarcharValue from `alltypes.json`; {noformat} I ran parquet-tools/parquet-dump : VarcharValue TV=6885 RL=0 DL=1 page 0: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:17240317 VC:6885 The page size is 16MB. This is with a 16MB data set. When I try a similar 1GB data set, the page size starts at over 200 MB, decreasing down to 1MB. 
VarcharValue TV=208513 RL=0 DL=1
page 0:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:215243750 VC:87433
page 1:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:112350266 VC:43717
page 2:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:52501154 VC:21859
page 3:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:27725498 VC:10930
page 4:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:12181241 VC:5466
page 5:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:11005971 VC:2734
page 6:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1133237 VC:1797
page 7:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1462803 VC:899
page 8:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050967 VC:490
page 9:  DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1051603 VC:424
page 10: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050919 VC:378
page 11: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050487 VC:345
page 12: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050783 VC:319
page 13: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1052303 VC:299
page 14: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1053235 VC:282
page 15: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1055979 VC:268
The column is a varchar, and the value size varies from 2 bytes to 5000 bytes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6176) Drill skips a row when querying a text file but does not report it.
[ https://issues.apache.org/jira/browse/DRILL-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6176. --- Resolution: Not A Problem > Drill skips a row when querying a text file but does not report it. > --- > > Key: DRILL-6176 > URL: https://issues.apache.org/jira/browse/DRILL-6176 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.12.0 >Reporter: Robert Hou >Assignee: Pritesh Maker >Priority: Critical > Attachments: 10.tbl > > > I tried to query 10 rows from a tbl file. It skipped the 6th row, which only > has special symbols in it. So it shows 9 rows, and there was no warning > that a row had been skipped. > I checked the special symbols; the same symbols appear in other rows. > This also occurs if the file is a csv file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6178) Drill does not project extra columns in some cases
Robert Hou created DRILL-6178: - Summary: Drill does not project extra columns in some cases Key: DRILL-6178 URL: https://issues.apache.org/jira/browse/DRILL-6178 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker Attachments: 10.tbl Drill is supposed to project extra columns as null columns. This table has 10 columns. The extra columns are shown as null: {noformat} 0: jdbc:drill:zk=10.10.104.85:5181> select columns[0], columns[3], columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], columns[11], columns[12], columns[13], columns[14], columns[15] from `resource-manager/1.tbl`; +-+-+-+-+-+-+-+-+-+-+--+--+--+--+ | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 | EXPR$4 | EXPR$5 | EXPR$6 | EXPR$7 | EXPR$8 | EXPR$9 | EXPR$10 | EXPR$11 | EXPR$12 | EXPR$13 | +-+-+-+-+-+-+-+-+-+-+--+--+--+--+ | 1 | | null | null | null | null | -61 | -255.0 | null | null | null | null | null | null | +-+-+-+-+-+-+-+-+-+-+--+--+--+--+{noformat} If I run the same query against a table with 10 rows and 10 columns (attached to the Jira), only the 10 columns are shown. {noformat} select columns[0], columns[1], columns[2], columns[3], columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], columns[11], columns[12], columns[13], columns[14], columns[15] from `10.tbl`{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
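The documented behavior being tested here is that referencing columns[] past the end of a row should project NULL rather than dropping the column. That expectation can be modeled with a small sketch (the function name project_columns is illustrative, not a Drill API):

```python
def project_columns(row, indexes):
    """Model of Drill's columns[] projection on a delimited text row:
    an index past the end of the row should project as NULL (None here),
    not be silently omitted from the result."""
    return [row[i] if i < len(row) else None for i in indexes]

row = ["1", "a", "b"]  # a 3-column row, invented for illustration
projected = project_columns(row, range(5))  # asks for 5 columns
```

Under this model the result always has one entry per requested index; the bug report is that Drill sometimes returns only the existing columns instead.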
[jira] [Created] (DRILL-6176) Drill skips a row when querying a text file but does not report it.
Robert Hou created DRILL-6176: - Summary: Drill skips a row when querying a text file but does not report it. Key: DRILL-6176 URL: https://issues.apache.org/jira/browse/DRILL-6176 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker I tried to query 10 rows from a tbl file. It skipped the 6th row, which only has special symbols in it. So it shows 9 rows, and there was no warning that a row had been skipped. I checked the special symbols; the same symbols appear in other rows. This also occurs if the file is a csv file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6165) Drill should support versioning between Drill clients (JDBC/ODBC) and Drill server
Robert Hou created DRILL-6165: - Summary: Drill should support versioning between Drill clients (JDBC/ODBC) and Drill server Key: DRILL-6165 URL: https://issues.apache.org/jira/browse/DRILL-6165 Project: Apache Drill Issue Type: Bug Components: Client - JDBC, Client - ODBC Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker We need to determine which versions of JDBC/ODBC drivers can be used with which versions of Drill server. Due to recent improvements in security, a newer client had problems working with an older server. The current solution is to require drill clients and drill servers to be the same version. In some cases, different versions of drill clients can work with different versions of drill servers, but this compatibility is being determined on a version-by-version, feature-by-feature basis. We need an architecture that enables this to work automatically. In particular, if a new drill client requests a feature that the older drill server does not support, this should be handled gracefully without returning an error. This also has an impact on QA resources. We recently had a customer issue that needed to be fixed on three different Drill server releases, so three new drivers had to be created and tested. Note that drill clients and drill servers can be on different versions for various reasons: 1) A user may need to access different drill servers. They can only have one version of the drill client installed on their machine. 2) Many users may need to access the same drill server. Some users may have one version of the drill client installed while other users may have a different version of the drill client installed. In a large customer installation, it is difficult to get all users to upgrade their drill client at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6134) Many Drill queries fail when using JDBC Driver from Simba
Robert Hou created DRILL-6134: - Summary: Many Drill queries fail when using JDBC Driver from Simba Key: DRILL-6134 URL: https://issues.apache.org/jira/browse/DRILL-6134 Project: Apache Drill Issue Type: Bug Reporter: Robert Hou Assignee: Pritesh Maker Here is an example: Query: /root/drillAutomation/framework-master/framework/resources/Functional/limit0/union/data/union_51.q {noformat} (SELECT c2 FROM `union_01_v` ORDER BY c5 DESC nulls first) UNION (SELECT c2 FROM `union_02_v` ORDER BY c5 ASC nulls first){noformat} This is the error: {noformat} Exception: java.sql.SQLException: [JDBC Driver]The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 . at com.google.common.base.Preconditions.checkArgument(Preconditions.java:145) at org.apache.drill.exec.vector.BigIntVector.load(BigIntVector.java:287) at org.apache.drill.exec.vector.NullableBigIntVector.load(NullableBigIntVector.java:274) at org.apache.drill.exec.record.RecordBatchLoader.load(RecordBatchLoader.java:131) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doLoadRecordBatchData(Unknown Source) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.hasMoreRows(Unknown Source) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doMoveToNextRow(Unknown Source) at com.mapr.drill.jdbc.common.CommonResultSet.moveToNextRow(Unknown Source) at com.mapr.drill.jdbc.common.SForwardResultSet.next(Unknown Source) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:255) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 . ... 16 more{noformat} The commit that causes these errors to occur is: {noformat} https://issues.apache.org/jira/browse/DRILL-6049 Rollup of hygiene changes from "batch size" project commit ID e791ed62b1c91c39676c4adef438c689fd84fd4b{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6078) Query with INTERVAL in predicate does not return any rows
Robert Hou created DRILL-6078: - Summary: Query with INTERVAL in predicate does not return any rows Key: DRILL-6078 URL: https://issues.apache.org/jira/browse/DRILL-6078 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Chunhui Shi This query does not return any rows when accessing MapR DB tables. SELECT C.C_CUSTKEY, C.C_NAME, SUM(L.L_EXTENDEDPRICE * (1 - L.L_DISCOUNT)) AS REVENUE, C.C_ACCTBAL, N.N_NAME, C.C_ADDRESS, C.C_PHONE, C.C_COMMENT FROM customer C, orders O, lineitem L, nation N WHERE C.C_CUSTKEY = O.O_CUSTKEY AND L.L_ORDERKEY = O.O_ORDERKEY AND O.O_ORDERDate >= DATE '1994-03-01' AND O.O_ORDERDate < DATE '1994-03-01' + INTERVAL '3' MONTH AND L.L_RETURNFLAG = 'R' AND C.C_NATIONKEY = N.N_NATIONKEY GROUP BY C.C_CUSTKEY, C.C_NAME, C.C_ACCTBAL, C.C_PHONE, N.N_NAME, C.C_ADDRESS, C.C_COMMENT ORDER BY REVENUE DESC LIMIT 20 This query works against JSON tables. It should return 20 rows. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
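The date predicate in this query defines a half-open three-month window: O_ORDERDate >= DATE '1994-03-01' AND O_ORDERDate < DATE '1994-03-01' + INTERVAL '3' MONTH, i.e. [1994-03-01, 1994-06-01). A sketch of the expected interval arithmetic (the helper add_months is illustrative, not Drill's implementation):

```python
import calendar
from datetime import date

def add_months(d, months):
    """Add calendar months to a date, clamping the day to the target
    month's length (mirrors DATE + INTERVAL 'n' MONTH semantics)."""
    m = d.month - 1 + months
    year, month = d.year + m // 12, m % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

start = date(1994, 3, 1)
end = add_months(start, 3)  # upper bound of the half-open window

def in_window(order_date):
    # O_ORDERDate >= start AND O_ORDERDate < start + INTERVAL '3' MONTH
    return start <= order_date < end
```

Any order date in March, April, or May 1994 should qualify, so a correct plan over MapR DB tables must not evaluate the INTERVAL addition to an empty or shifted range.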
[jira] [Resolved] (DRILL-5898) Query returns columns in the wrong order
[ https://issues.apache.org/jira/browse/DRILL-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5898. --- Resolution: Fixed Updated expected results file. > Query returns columns in the wrong order > > > Key: DRILL-5898 > URL: https://issues.apache.org/jira/browse/DRILL-5898 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Robert Hou >Priority: Blocker > Fix For: 1.12.0 > > > This is a regression. It worked with this commit: > {noformat} > f1d1945b3772bb782039fd6811e34a7de66441c8 DRILL-5582: C++ Client: [Threat > Modeling] Drillbit may be spoofed by an attacker and this may lead to data > being written to the attacker's target instead of Drillbit > {noformat} > It fails with this commit, although there are six commits total between the > last good one and this one: > {noformat} > b0c4e0486d6d4620b04a1bb8198e959d433b4840 DRILL-5876: Use openssl profile > to include netty-tcnative dependency with the platform specific classifier > {noformat} > Query is: > {noformat} > select * from > dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where > dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, > l_extendedprice limit 10 > {noformat} > Columns are returned in a different order. Here are the expected results: > {noformat} > foxes. 
furiously final ideas cajol1994-05-27 0.071731.42 4 > F 653442 4965666.0 1.0 1994-06-23 A 1994-06-22 > NONESHIP215671 0.07200612 15 (1 time(s)) > lly final account 1994-11-09 0.0745881.783 F > 653412 1.320809E7 46.01994-11-24 R 1994-11-08 TAKE > BACK RETURNREG AIR 458104 0.08200612 15 (1 time(s)) > the asymptotes 1997-12-29 0.0760882.8 6 O 653413 > 1.4271413E7 44.01998-02-04 N 1998-01-20 DELIVER IN > PERSON MAIL21456 0.05200612 15 (1 time(s)) > carefully a 1996-09-23 0.075381.88 2 O 653378 > 1.6702792E7 3.0 1996-11-14 N 1996-10-15 NONEREG > AIR 952809 0.05200612 15 (1 time(s)) > ly final requests. boldly ironic theo 1995-09-04 0.072019.94 2 > O 653380 2416094.0 2.0 1995-11-14 N 1995-10-18 > COLLECT COD FOB 166101 0.02200612 15 (1 time(s)) > alongside of the even, e 1996-02-14 0.0786140.322 > O 653409 5622872.0 48.01996-05-02 N 1996-04-22 > NONESHIP372888 0.04200612 15 (1 time(s)) > es. regular instruct 1996-10-18 0.0725194.0 1 O 653382 > 6048060.0 25.01996-08-29 N 1996-08-20 DELIVER IN > PERSON AIR 798079 0.0 200612 15 (1 time(s)) > en package1993-09-19 0.0718718.322 F 653440 > 1.372054E7 12.01993-09-12 A 1993-09-09 DELIVER IN > PERSON TRUCK 970554 0.0 200612 15 (1 time(s)) > ly regular deposits snooze. unusual, even 1998-01-18 0.07 > 12427.921 O 653413 2822631.0 8.0 1998-02-09 > N 1998-02-05 TAKE BACK RETURNREG AIR 322636 0.01 > 200612 15 (1 time(s)) > ironic ideas. bra1996-10-13 0.0764711.533 O > 653383 6806672.0 41.01996-12-06 N 1996-11-10 TAKE > BACK RETURNAIR 556691 0.01200612 15 (1 time(s)) > {noformat} > Here are the actual results: > {noformat} > 2006 12 15 653383 6806672 556691 3 41.064711.53 > 0.070.01N O 1996-11-10 1996-10-13 1996-12-06 > TAKE BACK RETURNAIR ironic ideas. bra > 2006 12 15 653378 16702792952809 2 3.0 5381.88 > 0.070.05N O 1996-10-15 1996-09-23 1996-11-14 > NONEREG AIR carefully a > 2006 12 15 653380 2416094 166101 2 2.0 2019.94 0.07 > 0.02N O 1995-10-18 1995-09-04 1995-11-14 > COLLECT COD FOB ly final requests. 
boldly ironic theo > 2006 12 15 653413 2822631 322636 1 8.0 12427.92 > 0.070.01N O 1998-02-05 1998-01-18 1998-02-09 > TAKE BACK RETURNREG AIR ly regular deposits snooze. unusual, even > 2006 12 15 653382 6048060 798079 1 25.025194.0 0.07 > 0.0 N O
[jira] [Created] (DRILL-5908) Regression: Query intermittently may fail with error "Waited for 15000ms, but tasks for 'Get block maps' are not complete."
Robert Hou created DRILL-5908: - Summary: Regression: Query intermittently may fail with error "Waited for 15000ms, but tasks for 'Get block maps' are not complete." Key: DRILL-5908 URL: https://issues.apache.org/jira/browse/DRILL-5908 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Pritesh Maker This is from the Functional-Baseline-88.193 Jenkins run. The test is in the Functional test suite, partition_pruning/dfs/csv/plan/csvselectpartormultiplewithdir_MD-185.q Query is: {noformat} explain plan for select columns[0],columns[1],columns[4],columns[10],columns[13],dir0 from `/drill/testdata/partition_pruning/dfs/lineitempart` where (dir0=1993 and columns[0]>29600) or (dir0=1994 and columns[0]>29700) {noformat} The error is: {noformat} Failed with exception java.sql.SQLException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Get block maps' are not complete. Total runnable size 2, parallelism 2. 
[Error Id: ab911277-36cb-465c-a9aa-8e3d21bcc09c on atsqa4-195.qa.lab:31010] at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:224) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:136) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:748) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Get block maps' are not complete. Total runnable size 2, parallelism 2. 
[Error Id: ab911277-36cb-465c-a9aa-8e3d21bcc09c on atsqa4-195.qa.lab:31010] at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:465) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:102) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.f
[jira] [Resolved] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs
[ https://issues.apache.org/jira/browse/DRILL-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5901. --- Resolution: Not A Bug This is a bug in the Drill Test Framework, not in Drill itself. > Drill test framework can have successful run even if a random failure occurs > > > Key: DRILL-5901 > URL: https://issues.apache.org/jira/browse/DRILL-5901 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.11.0 >Reporter: Robert Hou > > From Jenkins: > http://10.10.104.91:8080/view/Nightly/job/TPCH-SF100-baseline/574/console > Random Failures: > /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql > Query: > SELECT > SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY > FROM > lineitem L, > part P > WHERE > P.P_PARTKEY = L.L_PARTKEY > AND P.P_BRAND = 'BRAND#13' > AND P.P_CONTAINER = 'JUMBO CAN' > AND L.L_QUANTITY < ( > SELECT > 0.2 * AVG(L2.L_QUANTITY) > FROM > lineitem L2 > WHERE > L2.L_PARTKEY = P.P_PARTKEY > ) > Failed with exception > java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked > by query. Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > (java.lang.IllegalStateException) Memory was leaked by query. 
Memory > leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > org.apache.drill.exec.memory.BaseAllocator.close():519 > org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 > org.apache.drill.exec.ops.OperatorContextImpl.close():108 > org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 > org.apache.drill.exec.ops.FragmentContext.close():424 > > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():744 > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) > at > org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at > oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) > at > oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) > at > oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) > at > oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) > at > org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) > at > org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206) > at > 
org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: > SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory > leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > (java.lang.IllegalStateException) Memory was leaked by query. Memory > leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > org.apache.drill.exec.memory.BaseAllocator.close():519 > org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 >
[jira] [Created] (DRILL-5903) Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."
Robert Hou created DRILL-5903:
---------------------------------

             Summary: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."
                 Key: DRILL-5903
                 URL: https://issues.apache.org/jira/browse/DRILL-5903
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata, Storage - Parquet
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Priority: Critical

Query is:
{noformat}
select a.int_col, b.date_col
from dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a
inner join (
  select date_col, int_col
  from dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large`
  where dir0 = '1.2' and date_col > '1996-03-07'
) b on cast(a.date_col as date) = date_add(b.date_col, 5)
where a.int_col = 7 and a.dir0 = '1.9'
group by a.int_col, b.date_col
{noformat}

From drillbit.log:
{noformat}
fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM vwOnParq_wCst_35
2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3.
2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3.
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3.
[Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:227) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:190) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at 
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:342) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:241) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:291) [drill-j
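A mitigation sometimes applied for 'Fetch parquet metadata' timeouts (this suggestion is not part of the original report; the path below is simply the one from the query above) is to build the Parquet metadata cache ahead of time, so that planning reads cached footers instead of fetching metadata from every file at query time:

{noformat}
REFRESH TABLE METADATA dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large`;
{noformat}

Whether this avoids the timeout depends on how stale the cache is, since Drill refreshes it when the underlying files change.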
[jira] [Created] (DRILL-5902) Regression: Queries encounter random failure due to RPC connection timed out
Robert Hou created DRILL-5902:
---------------------------------

             Summary: Regression: Queries encounter random failure due to RPC connection timed out
                 Key: DRILL-5902
                 URL: https://issues.apache.org/jira/browse/DRILL-5902
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - RPC
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Priority: Critical

Multiple random failures (25) occurred with the latest Functional-Baseline-88.193 run. Here is a sample query:
{noformat}
-- Kitchen sink
-- Use all supported functions
select rank() over W, dense_rank() over W, percent_rank() over W, cume_dist() over W,
       avg(c_integer + c_integer) over W, sum(c_integer/100) over W, count(*) over W,
       min(c_integer) over W, max(c_integer) over W, row_number() over W
from j7
where c_boolean is not null
window W as (partition by c_bigint, c_date, c_time, c_boolean order by c_integer)
{noformat}

From the logs:
{noformat}
2017-10-23 04:14:36,536 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. {noformat} {noformat} 2017-10-23 04:14:53,941 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> /10.10.88.193:38281 (user server) timed out. Timeout was set to 30 seconds. Closing connection. 2017-10-23 04:14:53,952 [UserServer-1] INFO o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> FAILED 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> FINISHED 2017-10-23 04:14:53,956 [UserServer-1] WARN o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc response. 
java.lang.IllegalArgumentException: Self-suppression not permitted at java.lang.Throwable.addSuppressed(Throwable.java:1043) ~[na:1.7.0_45] at org.apache.drill.common.DeferredException.addException(DeferredException.java:88) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:413) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.access$700(FragmentExecutor.java:55) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:427) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:213) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext$1.accept(FragmentContext
[jira] [Created] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs
Robert Hou created DRILL-5901: - Summary: Drill test framework can have successful run even if a random failure occurs Key: DRILL-5901 URL: https://issues.apache.org/jira/browse/DRILL-5901 Project: Apache Drill Issue Type: Bug Components: Tools, Build & Test Affects Versions: 1.11.0 Reporter: Robert Hou Random Failures: /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql Query: SELECT SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY FROM lineitem L, part P WHERE P.P_PARTKEY = L.L_PARTKEY AND P.P_BRAND = 'BRAND#13' AND P.P_CONTAINER = 'JUMBO CAN' AND L.L_QUANTITY < ( SELECT 0.2 * AVG(L2.L_QUANTITY) FROM lineitem L2 WHERE L2.L_PARTKEY = P.P_PARTKEY ) Failed with exception java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) Fragment 8:2 [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] (java.lang.IllegalStateException) Memory was leaked by query. 
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) org.apache.drill.exec.memory.BaseAllocator.close():519 org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 org.apache.drill.exec.ops.OperatorContextImpl.close():108 org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 org.apache.drill.exec.ops.FragmentContext.close():424 org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():744 at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) Fragment 8:2 [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] (java.lang.IllegalStateException) Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) org.apache.drill.exec.memory.BaseAllocator.close():519 org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 org.apache.drill.exec.ops.OperatorContextImpl.close():108 org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 org.apache.drill.exec.ops.FragmentContext.close():424 org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorke
[jira] [Created] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
Robert Hou created DRILL-5900:
---------------------------------

             Summary: Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
                 Key: DRILL-5900
                 URL: https://issues.apache.org/jira/browse/DRILL-5900
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker
            Priority: Blocker

This is a random failure. This test has passed before. TPCH query 6:
{noformat}
SELECT SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM lineitem L, part P
WHERE P.P_PARTKEY = L.L_PARTKEY
  AND P.P_BRAND = 'BRAND#13'
  AND P.P_CONTAINER = 'JUMBO CAN'
  AND L.L_QUANTITY < (
    SELECT 0.2 * AVG(L2.L_QUANTITY)
    FROM lineitem L2
    WHERE L2.L_PARTKEY = P.P_PARTKEY
  )
{noformat}

Error is:
{noformat}
2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152)
Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit)
Fragment 8:2
[Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query.
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) Fragment 8:2 [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] ... 5 common frames omitted 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: State to report: FINISHED {noformat} sys.version is: 1.12.0-SNAPSHOT b0c4e0486d6d4620b04a1bb8198e959d433b4840DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier 20.10.2017 @ 16:52:35 PDT The previous version that ran clean is this commit: {noformat} 1.12.0-SNAPSHOT f1d1945b3772bb782039fd6811e34a7de66441c8DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit 19.10.2017 @ 17:13:05 PDT {noformat} But since the failure is random, the problem could have been introduced earlier. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5898) Query returns columns in the wrong order
Robert Hou created DRILL-5898: - Summary: Query returns columns in the wrong order Key: DRILL-5898 URL: https://issues.apache.org/jira/browse/DRILL-5898 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Vitalii Diravka Priority: Blocker Fix For: 1.12.0 This is a regression. It worked with this commit: {noformat} f1d1945b3772bb782039fd6811e34a7de66441c8DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit {noformat} It fails with this commit, although there are six commits total between the last good one and this one: {noformat} b0c4e0486d6d4620b04a1bb8198e959d433b4840DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier {noformat} Query is: {noformat} select * from dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, l_extendedprice limit 10 {noformat} Columns are returned in a different order. Here are the expected results: {noformat} foxes. furiously final ideas cajol 1994-05-27 0.071731.42 4 F 653442 4965666.0 1.0 1994-06-23 A 1994-06-22 NONESHIP215671 0.07200612 15 (1 time(s)) lly final account 1994-11-09 0.0745881.783 F 653412 1.320809E7 46.01994-11-24 R 1994-11-08 TAKE BACK RETURNREG AIR 458104 0.08200612 15 (1 time(s)) the asymptotes 1997-12-29 0.0760882.8 6 O 653413 1.4271413E7 44.01998-02-04 N 1998-01-20 DELIVER IN PERSON MAIL21456 0.05200612 15 (1 time(s)) carefully a 1996-09-23 0.075381.88 2 O 653378 1.6702792E7 3.0 1996-11-14 N 1996-10-15 NONEREG AIR 952809 0.05200612 15 (1 time(s)) ly final requests. 
boldly ironic theo 1995-09-04 0.072019.94 2 O 653380 2416094.0 2.0 1995-11-14 N 1995-10-18 COLLECT COD FOB 166101 0.02200612 15 (1 time(s)) alongside of the even, e1996-02-14 0.0786140.322 O 653409 5622872.0 48.01996-05-02 N 1996-04-22 NONESHIP372888 0.04200612 15 (1 time(s)) es. regular instruct1996-10-18 0.0725194.0 1 O 653382 6048060.0 25.01996-08-29 N 1996-08-20 DELIVER IN PERSON AIR 798079 0.0 200612 15 (1 time(s)) en package 1993-09-19 0.0718718.322 F 653440 1.372054E7 12.01993-09-12 A 1993-09-09 DELIVER IN PERSON TRUCK 970554 0.0 200612 15 (1 time(s)) ly regular deposits snooze. unusual, even 1998-01-18 0.07 12427.921 O 653413 2822631.0 8.0 1998-02-09 N 1998-02-05 TAKE BACK RETURNREG AIR 322636 0.012006 12 15 (1 time(s)) ironic ideas. bra 1996-10-13 0.0764711.533 O 653383 6806672.0 41.01996-12-06 N 1996-11-10 TAKE BACK RETURNAIR 556691 0.01200612 15 (1 time(s)) {noformat} Here are the actual results: {noformat} 200612 15 653383 6806672 556691 3 41.064711.53 0.070.01N O 1996-11-10 1996-10-13 1996-12-06 TAKE BACK RETURNAIR ironic ideas. bra 200612 15 653378 16702792952809 2 3.0 5381.88 0.070.05N O 1996-10-15 1996-09-23 1996-11-14 NONEREG AIR carefully a 200612 15 653380 2416094 166101 2 2.0 2019.94 0.07 0.02N O 1995-10-18 1995-09-04 1995-11-14 COLLECT COD FOB ly final requests. boldly ironic theo 200612 15 653413 2822631 322636 1 8.0 12427.92 0.070.01N O 1998-02-05 1998-01-18 1998-02-09 TAKE BACK RETURNREG AIR ly regular deposits snooze. unusual, even 200612 15 653382 6048060 798079 1 25.025194.0 0.07 0.0 N O 1996-08-20 1996-10-18 1996-08-29 DELIVER IN PERSON AIR es. regular instruct 200612 15 653442 4965666 215671 4 1.0 1731.42 0.07 0.07A F 1994-06-22 1994-05-27 1994-06-23 NONE SHIPfoxes. furiously final ideas cajol 200612
[jira] [Created] (DRILL-5891) When Drill runs out of memory for a HashAgg, it should tell the user how much memory to allocate
Robert Hou created DRILL-5891:
---------------------------------

             Summary: When Drill runs out of memory for a HashAgg, it should tell the user how much memory to allocate
                 Key: DRILL-5891
                 URL: https://issues.apache.org/jira/browse/DRILL-5891
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker

Query is:
select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;

Error is:
Error: RESOURCE ERROR: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit

From drillbit.log:
{noformat}
2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE o.a.d.e.p.i.aggregate.HashAggregator - Incoming sizer: Actual batch schema & sizes {
  no_nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 130, data size: 132892)
  nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 112, data size: 113673)
  EXPR$0(type: REQUIRED BIGINT, count: 1023, std size: 8, actual size: 8, data size: 8184)
  EXPR$1(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 18, data size: 18414)
  Records: 1023, Total size: 524288, Data size: 273163, Gross row width: 513, Net row width: 268, Density: 53%}
2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 2nd phase.
Estimated internal row width: 166
Values row width: 66
batch size: 12779520
memory limit: 63161283
max column width: 50
2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:3:2] TRACE o.a.d.e.p.impl.common.HashTable - HT allocated 4784128 for varchar of max width 50
2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:15] INFO o.a.d.e.p.i.aggregate.HashAggregator - User Error Occurred: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit
{noformat}

I would recommend that we add a log message with the "alter" command to increase the amount of memory allocated, and how much memory to allocate. Otherwise, the user may not know what to do. I would also suggest not enabling "drill.exec.hashagg.fallback.enabled" except as a last resort.
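For reference, the two remediations named in the error message look like this in practice. This is only a sketch: the memory value is illustrative (it is the one used in DRILL-5889 below, not a value tuned for this query), and the fallback option should be treated as a last resort as noted above:

{noformat}
-- Option 1: raise the per-node query memory limit (value is illustrative)
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 10737418240;

-- Option 2 (last resort): allow HashAgg to fall back to unbounded memory
ALTER SYSTEM SET `drill.exec.hashagg.fallback.enabled` = true;
{noformat}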
[jira] [Created] (DRILL-5889) sqlline loses RPC connection with executing query with HashAgg
Robert Hou created DRILL-5889:
---------------------------------

             Summary: sqlline loses RPC connection with executing query with HashAgg
                 Key: DRILL-5889
                 URL: https://issues.apache.org/jira/browse/DRILL-5889
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou

Query is:
{noformat}
alter session set `planner.memory.max_query_memory_per_node` = 10737418240;
select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
{noformat}

Error is:
{noformat}
0: jdbc:drill:drillbit=10.10.100.190> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
Error: CONNECTION ERROR: Connection /10.10.100.190:45776 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
[Error Id: db4aea70-11e6-4e63-b0cc-13cdba0ee87a ] (state=,code=0)
{noformat}

From drillbit.log:
2017-10-18 14:04:23,044 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.100.190:31010 <--> /10.10.100.190:45776 (user server) timed out. Timeout was set to 30 seconds. Closing connection.
Plan is: {noformat}
00-00    Screen
00-01      Project(EXPR$0=[$0], EXPR$1=[$1])
00-02        UnionExchange
01-01          Project(EXPR$0=[$2], EXPR$1=[$3])
01-02            HashAgg(group=[{0, 1}], EXPR$0=[$SUM0($2)], EXPR$1=[MAX($3)])
01-03              Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], EXPR$1=[$3])
01-04                HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
02-01                  UnorderedMuxExchange
03-01                    Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], EXPR$1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, hash32AsDouble($0, 1301011))])
03-02                      HashAgg(group=[{0, 1}], EXPR$0=[COUNT()], EXPR$1=[MAX($2)])
03-03                        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/hash-agg/data1]], selectionRoot=maprfs:/drill/testdata/hash-agg/data1, numFiles=1, usedMetadataFile=false, columns=[`no_nulls_col`, `nulls_col`, `filename`]]])
{noformat}
[jira] [Resolved] (DRILL-5804) External Sort times out, may be infinite loop
[ https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5804. --- Resolution: Fixed > External Sort times out, may be infinite loop > - > > Key: DRILL-5804 > URL: https://issues.apache.org/jira/browse/DRILL-5804 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: drillbit.log > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist > ); > {noformat} > Plan is: > {noformat} > | 00-00Screen > 00-01 Project(EXPR$0=[$0]) > 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) > 00-03 UnionExchange > 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()]) > 01-02 Project($f0=[0]) > 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], > sort2=[6 ASC]) > 02-01 SelectionVectorRemover > 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], > dir1=[ASC], dir2=[ASC]) > 02-03 Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], > EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6]) > 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], > dist2=[[$6]]) > 03-01 UnorderedMuxExchange > 04-01Project(type=[$0], rptds=[$1], rms=[$2], > uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, > hash32AsDouble($4, 1301011)))]) > 04-02 Project(type=[$0], rptds=[$1], rms=[$2], > uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], > EXPR$6=[ITEM($1, 'do_not_exist')]) > 04-03Flatten(flattenField=[$1]) > 04-04 Project(type=[$0], rptds=[ITEM($2, > 'rptd')], rms=[$2], uid=[$1]) > 
04-05SingleMergeExchange(sort0=[1 ASC]) > 05-01 SelectionVectorRemover > 05-02Sort(sort0=[$1], dir0=[ASC]) > 05-03 Project(type=[$0], uid=[$1], > rms=[$2]) > 05-04 > HashToRandomExchange(dist0=[[$1]]) > 06-01 UnorderedMuxExchange > 07-01Project(type=[$0], > uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)]) > 07-02 > Flatten(flattenField=[$2]) > 07-03Project(type=[$0], > uid=[$1], rms=[ITEM($2, 'rm')]) > 07-04 > Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/resource-manager/nested_large]], > selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, > numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]]) > {noformat} > Here is a segment of the drillbit.log, starting at line 55890: > {noformat} > 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records > 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records > 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG > o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record > batch with status OK > 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records > 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG > o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record > batch with status OK > 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records > 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG > o.a.d.exec.compile.ClassTransformer - Compiled and merged > PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms. 
> 2017-09-19 04:22:56,
[jira] [Created] (DRILL-5886) Operators should create batch sizes that the next operator can consume to avoid OOM
Robert Hou created DRILL-5886: - Summary: Operators should create batch sizes that the next operator can consume to avoid OOM Key: DRILL-5886 URL: https://issues.apache.org/jira/browse/DRILL-5886 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Attachments: 26478262-f0a7-8fc1-1887-4f27071b9c0f.sys.drill, drillbit.log.exchange Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false alter session set `planner.memory.max_query_memory_per_node` = 482344960 alter session set `planner.width.max_per_node` = 1 alter session set `planner.width.max_per_query` = 1 alter session set `planner.disable_exchanges` = true select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50], columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350], columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530], columns[3210] ) d where d.col433 = 'sjka skjf'; {noformat} This is the error from drillbit.log: 2017-09-12 17:36:53,155 [26478262-f0a7-8fc1-1887-4f27071b9c0f:frag:0:0] ERROR o.a.d.e.p.i.x.m.ExternalSortBatch - Insufficient memory to merge two batches. 
Incoming batch size: 409305088, available memory: 482344960 Here is the plan: {noformat} | 00-00Screen 00-01 Project(EXPR$0=[$0]) 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()]) 00-03 Project($f0=[0]) 00-04SelectionVectorRemover 00-05 Filter(condition=[=(ITEM($0, 'col433'), 'sjka skjf')]) 00-06Project(T8¦¦*=[$0]) 00-07 SelectionVectorRemover 00-08Sort(sort0=[$1], sort1=[$2], sort2=[$3], sort3=[$4], sort4=[$5], sort5=[$6], sort6=[$7], sort7=[$8], sort8=[$9], sort9=[$10], sort10=[$11], sort11=[$12], sort12=[$9], sort13=[$13], sort14=[$14], sort15=[$15], sort16=[$16], sort17=[$17], sort18=[$18], sort19=[$19], sort20=[$20], sort21=[$21], sort22=[$12], sort23=[$22], sort24=[$23], sort25=[$24], sort26=[$25], sort27=[$26], sort28=[$27], sort29=[$28], sort30=[$29], sort31=[$30], sort32=[$31], sort33=[$32], sort34=[$33], sort35=[$34], sort36=[$35], sort37=[$36], sort38=[$37], sort39=[$38], sort40=[$39], sort41=[$40], sort42=[$41], sort43=[$42], sort44=[$43], sort45=[$44], sort46=[$45], sort47=[$46], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC], dir5=[ASC], dir6=[ASC], dir7=[ASC], dir8=[ASC], dir9=[ASC], dir10=[ASC], dir11=[ASC], dir12=[ASC], dir13=[ASC], dir14=[ASC], dir15=[ASC], dir16=[ASC], dir17=[ASC], dir18=[ASC], dir19=[ASC], dir20=[ASC], dir21=[ASC], dir22=[ASC], dir23=[ASC], dir24=[ASC], dir25=[ASC], dir26=[ASC], dir27=[ASC], dir28=[ASC], dir29=[ASC], dir30=[ASC], dir31=[ASC], dir32=[ASC], dir33=[ASC], dir34=[ASC], dir35=[ASC], dir36=[ASC], dir37=[ASC], dir38=[ASC], dir39=[ASC], dir40=[ASC], dir41=[ASC], dir42=[ASC], dir43=[ASC], dir44=[ASC], dir45=[ASC], dir46=[ASC], dir47=[ASC]) 00-09 Project(T8¦¦*=[$0], EXPR$1=[ITEM($1, 450)], EXPR$2=[ITEM($1, 330)], EXPR$3=[ITEM($1, 230)], EXPR$4=[ITEM($1, 220)], EXPR$5=[ITEM($1, 110)], EXPR$6=[ITEM($1, 90)], EXPR$7=[ITEM($1, 80)], EXPR$8=[ITEM($1, 70)], EXPR$9=[ITEM($1, 40)], EXPR$10=[ITEM($1, 10)], EXPR$11=[ITEM($1, 20)], EXPR$12=[ITEM($1, 30)], EXPR$13=[ITEM($1, 50)], EXPR$14=[ITEM($1, 454)], EXPR$15=[ITEM($1, 
413)], EXPR$16=[ITEM($1, 940)], EXPR$17=[ITEM($1, 834)], EXPR$18=[ITEM($1, 73)], EXPR$19=[ITEM($1, 140)], EXPR$20=[ITEM($1, 104)], EXPR$21=[ITEM($1, )], EXPR$22=[ITEM($1, 2420)], EXPR$23=[ITEM($1, 1520)], EXPR$24=[ITEM($1, 1410)], EXPR$25=[ITEM($1, 1110)], EXPR$26=[ITEM($1, 1290)], EXPR$27=[ITEM($1, 2380)], EXPR$28=[ITEM($1, 705)], EXPR$29=[ITEM($1, 45)], EXPR$30=[ITEM($1, 1054)], EXPR$31=[ITEM($1, 2430)], EXPR$32=[ITEM($1, 420)], EXPR$33=[ITEM($1, 404)], EXPR$34=[ITEM($1, 3350)], EXPR$35=[ITEM($1, )], EXPR$36=[ITEM($1, 153)], EXPR$37=[ITEM($1, 356)], EXPR$38=[ITEM($1, 84)], EXPR$39=[ITEM($1, 745)], EXPR$40=[ITEM($1, 1450)], EXPR$41=[ITEM($1, 103)], EXPR$42=[ITEM($1, 2065)], EXPR$43=[ITEM($1, 343)], EXPR$44=[ITEM($1, 3420)], EXPR$45=[ITEM($1, 530)], EXPR$46=[ITEM($1, 3210)]) 00-10Project(T8¦¦*=[$0], columns=[$1]) 00-11 Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/3500cols.tbl, numFiles=1, columns=[`*`],
[jira] [Created] (DRILL-5885) Drill consumes 2x memory when sorting and reading a spilled batch from disk.
Robert Hou created DRILL-5885: - Summary: Drill consumes 2x memory when sorting and reading a spilled batch from disk. Key: DRILL-5885 URL: https://issues.apache.org/jira/browse/DRILL-5885 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou The query is: {noformat} select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50], columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350], columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530], columns[3210] ) d where d.col433 = 'sjka skjf'; {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
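The arithmetic behind the 2x claim can be checked against the figures reported in DRILL-5886 ("Incoming batch size: 409305088, available memory: 482344960"); this is only a sanity-check sketch, not Drill's actual memory accounting:

```python
# Figures quoted from the DRILL-5886 error message; the factor of 2 models
# the behavior described in this issue (a second copy of a spilled batch
# is held in memory while it is read back and merged).
incoming_batch_bytes = 409_305_088
available_memory_bytes = 482_344_960

merge_footprint = 2 * incoming_batch_bytes
print(merge_footprint)                           # 818610176
print(merge_footprint > available_memory_bytes)  # True: the merge cannot fit
```

With the 2x factor, the merge needs roughly 818 MB against a 482 MB limit, which matches the "Insufficient memory to merge two batches" failure.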
[jira] [Resolved] (DRILL-5840) A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5840. --- Resolution: Not A Problem > A query that includes sort completes, and then loses Drill connection. Drill > becomes unresponsive, and cannot restart because it cannot communicate with > Zookeeper > -- > > Key: DRILL-5840 > URL: https://issues.apache.org/jira/browse/DRILL-5840 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > select count(*) from (select * from > dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d > where d.columns[0] = 'ljdfhwuehnoiueyf'; > {noformat} > Query tries to complete, but cannot. It takes 20 hours from the time the > query tries to complete, to the time Drill finally loses its connection. > From the drillbit.log: > {noformat} > 2017-10-03 16:28:14,892 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] DEBUG > o.a.drill.exec.work.foreman.Foreman - 262bec7f-3539-0dd7-6fea-f2959f9df3b6: > State change requested RUNNING --> COMPLETED > 2017-10-04 01:47:27,698 [UserServer-1] DEBUG > o.a.d.e.r.u.UserServerRequestHandler - Received query to run. Returning > query handle. > 2017-10-04 03:30:02,916 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] WARN > o.a.d.exec.work.foreman.QueryManager - Failure while trying to delete the > estore profile for this query. 
> org.apache.drill.common.exceptions.DrillRuntimeException: unable to delete > node at /running/262bec7f-3539-0dd7-6fea-f2959f9df3b6 > at > org.apache.drill.exec.coord.zk.ZookeeperClient.delete(ZookeeperClient.java:343) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.coord.zk.ZkEphemeralStore.remove(ZkEphemeralStore.java:108) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.updateEphemeralState(QueryManager.java:293) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:1043) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:964) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.access$2600(Foreman.java:113) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1025) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1018) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.EventProcessor.processEvents(EventProcessor.java:107) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:65) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.addEvent(Foreman.java:1020) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.addToEventQueue(Foreman.java:1038) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.nodeComplete(QueryManager.java:498) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at 
> org.apache.drill.exec.work.foreman.QueryManager.access$100(QueryManager.java:66) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager$NodeTracker.fragmentComplete(QueryManager.java:462) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.fragmentDone(QueryManager.java:147) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.access$400(QueryManager.java:66) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:525) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentS
[jira] [Created] (DRILL-5840) A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper
Robert Hou created DRILL-5840: - Summary: A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper Key: DRILL-5840 URL: https://issues.apache.org/jira/browse/DRILL-5840 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; select count(*) from (select * from dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d where d.columns[0] = 'ljdfhwuehnoiueyf'; {noformat} Query tries to complete, but cannot. From the drillbit.log: {noformat} 2017-10-03 16:28:14,892 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] DEBUG o.a.drill.exec.work.foreman.Foreman - 262bec7f-3539-0dd7-6fea-f2959f9df3b6: State change requested RUNNING --> COMPLETED 2017-10-04 01:47:27,698 [UserServer-1] DEBUG o.a.d.e.r.u.UserServerRequestHandler - Received query to run. Returning query handle. 2017-10-04 03:30:02,916 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] WARN o.a.d.exec.work.foreman.QueryManager - Failure while trying to delete the estore profile for this query. 
org.apache.drill.common.exceptions.DrillRuntimeException: unable to delete node at /running/262bec7f-3539-0dd7-6fea-f2959f9df3b6 at org.apache.drill.exec.coord.zk.ZookeeperClient.delete(ZookeeperClient.java:343) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.coord.zk.ZkEphemeralStore.remove(ZkEphemeralStore.java:108) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager.updateEphemeralState(QueryManager.java:293) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:1043) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:964) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.access$2600(Foreman.java:113) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1025) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1018) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.EventProcessor.processEvents(EventProcessor.java:107) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:65) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.addEvent(Foreman.java:1020) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.addToEventQueue(Foreman.java:1038) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager.nodeComplete(QueryManager.java:498) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager.access$100(QueryManager.java:66) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager$NodeTracker.fragmentComplete(QueryManager.java:462) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager.fragmentDone(QueryManager.java:147) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager.access$400(QueryManager.java:66) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:525) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentStatusReporter.sendStatus(FragmentStatusReporter.java:124) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentStatusReporter.stateChanged(FragmentStatusReporter.java:94) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:304) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [d
[jira] [Created] (DRILL-5813) A query that includes sort encounters Exception occurred with closed channel
Robert Hou created DRILL-5813: - Summary: A query that includes sort encounters Exception occurred with closed channel Key: DRILL-5813 URL: https://issues.apache.org/jira/browse/DRILL-5813 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.enable_decimal_data_type` = true; select count(*) from (select * from dfs.`/drill/testdata/resource-manager/all_types_large` order by missing11) d where d.missing3 is false; {noformat} This query has passed before when the number of threads and amount of memory is restricted. With more threads and memory, the query does not complete execution. Here is the stack trace: {noformat} Exception occurred with closed channel. Connection: /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:192) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384) at oadd.io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311) at oadd.io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:407) at oadd.io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:32) at oadd.io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:792) at oadd.io.netty.buffer.MutableWrappedByteBuf.setBytes(MutableWrappedByteBuf.java:280) at oadd.io.netty.buffer.ExpandableByteBuf.setBytes(ExpandableByteBuf.java:26) at oadd.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) at oadd.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:241) at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) at java.lang.Thread.run(Thread.java:745) User Error Occurred: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down? oadd.org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down? [Error Id: b97704a4-b8f0-4cd0-b428-2cf1bcf39a1d ] at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) at oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) at oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) at oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) at oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at oadd.io.netty.util.c
[jira] [Created] (DRILL-5805) External Sort runs out of memory
Robert Hou created DRILL-5805: - Summary: External Sort runs out of memory Key: DRILL-5805 URL: https://issues.apache.org/jira/browse/DRILL-5805 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 5; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 100; select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0; {noformat} Plan is: {noformat}
00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-03          Project($f0=[0])
00-04            SelectionVectorRemover
00-05              Filter(condition=[=($0, 0)])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$1], dir0=[ASC])
00-08                    Flatten(flattenField=[$1])
00-09                      Project(id=[$0], str=[$1])
00-10                        Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json, numFiles=1, columns=[`id`, `str_list`], files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
{noformat} sys.version is: {noformat} | 1.12.0-SNAPSHOT | c4211d3b545b0d1996b096a8e1ace35376a63977 | Fix for DRILL-5670 | 09.09.2017 @ 14:38:25 PDT | r...@qa-node190.qa.lab | 11.09.2017 @ 14:27:16 PDT | {noformat}
[jira] [Created] (DRILL-5804) Query times out, may be infinite loop
Robert Hou created DRILL-5804: - Summary: Query times out, may be infinite loop Key: DRILL-5804 URL: https://issues.apache.org/jira/browse/DRILL-5804 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; select count(*) from ( select * from ( select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid from ( select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid ) s1 ) s2 order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist ); {noformat} Plan is: {noformat} | 00-00Screen 00-01 Project(EXPR$0=[$0]) 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) 00-03 UnionExchange 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()]) 01-02 Project($f0=[0]) 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], sort2=[6 ASC]) 02-01 SelectionVectorRemover 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC]) 02-03 Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6]) 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], dist2=[[$6]]) 03-01 UnorderedMuxExchange 04-01Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, hash32AsDouble($4, 1301011)))]) 04-02 Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], EXPR$6=[ITEM($1, 'do_not_exist')]) 04-03Flatten(flattenField=[$1]) 04-04 Project(type=[$0], rptds=[ITEM($2, 'rptd')], rms=[$2], uid=[$1]) 04-05SingleMergeExchange(sort0=[1 ASC]) 05-01 SelectionVectorRemover 05-02Sort(sort0=[$1], dir0=[ASC]) 05-03 Project(type=[$0], uid=[$1], rms=[$2]) 05-04 HashToRandomExchange(dist0=[[$1]]) 06-01 UnorderedMuxExchange 07-01Project(type=[$0], uid=[$1], rms=[$2], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)]) 07-02 Flatten(flattenField=[$2]) 07-03Project(type=[$0], uid=[$1], rms=[ITEM($2, 'rm')]) 07-04 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/resource-manager/nested_large]], selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]]) {noformat} Here is a segment of the drillbit.log, starting at line 55890: {noformat} 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.exec.compile.ClassTransformer - Compiled and merged PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms. 
2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records 2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete 2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 14
[jira] [Created] (DRILL-5786) Query encounters Exception in RPC communication
Robert Hou created DRILL-5786: - Summary: Query encounters Exception in RPC communication Key: DRILL-5786 URL: https://issues.apache.org/jira/browse/DRILL-5786 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50], columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350], columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530], columns[3210] ) d where d.col433 = 'sjka skjf' {noformat} This is the same query as DRILL-5670 but no session variables are set. Here is the stack trace: {noformat} 2017-09-12 13:14:57,584 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server). Closing connection. io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer. 
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111] Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer. 
at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final] at org.apache.drill.exec.memory.AllocationManager.<init>(AllocationManager.java:81) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:260) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:243) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at io.netty.buffer.ExpandableByteBuf.capacity(ExpandableByteBuf.java:43) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final] at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] at io.n
[jira] [Resolved] (DRILL-5522) OOM during the merge and spill process of the managed external sort
[ https://issues.apache.org/jira/browse/DRILL-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5522. --- Resolution: Fixed This has been resolved. > OOM during the merge and spill process of the managed external sort > --- > > Key: DRILL-5522 > URL: https://issues.apache.org/jira/browse/DRILL-5522 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Attachments: 26e334aa-1afa-753f-3afe-862f76b80c18.sys.drill, > drillbit.log, drillbit.out, drill-env.sh > > > git.commit.id.abbrev=1e0a14c > The below query fails with an OOM > {code} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.memory.max_query_memory_per_node` = 1552428800; > create table dfs.drillTestDir.xsort_ctas3_multiple partition by (type, aCol) > as select type, rptds, rms, s3.rms.a aCol, uid from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a > ) s3; > {code} > Stack trace > {code} > 2017-05-17 15:15:35,027 [26e334aa-1afa-753f-3afe-862f76b80c18:frag:4:2] INFO > o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes > ran out of memory while executing the query. (Unable to allocate buffer of > size 2097152 due to memory limit. Current allocation: 29229064) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate buffer of size 2097152 due to memory limit. 
Current > allocation: 29229064 > [Error Id: 619e2e34-704c-4964-a354-1348fb33ce8a ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) > ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_111] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111] > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 2097152 due to memory limit. Current allocation: > 29229064 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) > ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) > ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.BigIntVector.reAlloc(BigIntVector.java:212) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.BigIntVector.copyFromSafe(BigIntVector.java:324) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableBigIntVector.copyFromSafe(NullableBigIntVector.java:367) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableBigIntVector$TransferImpl.copyValueSafe(NullableBigIntVector.java:328) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.copyValueSafe(RepeatedMapVector.java:360) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > 
org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe(MapVector.java:220) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.complex.MapVector.copyFromSafe(MapVector.java:82) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.test.generated.PriorityQueueCopierGen49.doCopy(PriorityQueueCopierTemplate.java:34) > ~[na:na] > at > org.apache.drill.exec.test.generated.PriorityQueueCopierGen49.next(PriorityQueueCopierTemplate.java:76) > ~[na:na] > at > org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:234) > ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeSpilledRuns(ExternalSortBatch
[jira] [Resolved] (DRILL-5443) Managed External Sort fails with OOM while spilling to disk
[ https://issues.apache.org/jira/browse/DRILL-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5443. --- Resolution: Fixed This has been resolved. > Managed External Sort fails with OOM while spilling to disk > --- > > Key: DRILL-5443 > URL: https://issues.apache.org/jira/browse/DRILL-5443 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0, 1.11.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 265a014b-8cae-30b5-adab-ff030b6c7086.sys.drill, > 27016969-ef53-40dc-b582-eea25371fa1c.sys.drill, drill5443.drillbit.log, > drillbit.log > > > git.commit.id.abbrev=3e8b01d > The below query fails with an OOM > {code} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.width.max_per_query` = 1; > alter session set `planner.memory.max_query_memory_per_node` = 52428800; > select s1.type type, flatten(s1.rms.rptd) rptds from (select d.type type, > d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid) s1 > order by s1.rms.mapid; > {code} > Exception from the logs > {code} > 2017-04-24 17:22:59,439 [27016969-ef53-40dc-b582-eea25371fa1c:frag:0:0] INFO > o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort > encountered an error while spilling to disk (Unable to allocate buffer of > size 524288 (rounded from 307197) due to memory limit. 
Current allocation: > 25886728) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External > Sort encountered an error while spilling to disk > [Error Id: a64e3790-3a34-42c8-b4ea-4cb1df780e63 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) > ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1445) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeRuns(ExternalSortBatch.java:1372) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.consolidateBatches(ExternalSortBatch.java:1299) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeSpilledRuns(ExternalSortBatch.java:1195) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:689) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > 
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0
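A pattern worth noting across these sort OOM reports: the failed requests are "rounded from" a smaller size (524288 rounded from 307197 above in DRILL-5443, and 4194304 rounded from 3276750 in DRILL-5744). That is consistent with the allocator rounding each request up to the next power of two. A minimal sketch of that arithmetic, assuming power-of-two rounding (the helper name is illustrative, not Drill's allocator API):

```python
def rounded_allocation(requested: int) -> int:
    """Round a buffer request up to the next power of two.

    Illustrative helper only; assumes Drill's allocator rounds this way,
    based on the 'rounded from' sizes quoted in the logs above.
    """
    size = 1
    while size < requested:
        size <<= 1
    return size

# Sizes quoted in the stack traces:
print(rounded_allocation(307197))    # 524288, as in DRILL-5443
print(rounded_allocation(3276750))   # 4194304, as in DRILL-5744
```

This also explains why an estimated need of ~307 KB can trip a limit sized to the raw estimate: the rounded request is nearly twice as large.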
[jira] [Resolved] (DRILL-5447) Managed External Sort : Unable to allocate sv2 vector
[ https://issues.apache.org/jira/browse/DRILL-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5447. --- Resolution: Fixed This has been resolved. > Managed External Sort : Unable to allocate sv2 vector > - > > Key: DRILL-5447 > URL: https://issues.apache.org/jira/browse/DRILL-5447 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 26550427-6adf-a52e-2ea8-dc52d8d8433f.sys.drill, > 26617a7e-b953-7ac3-556d-43fd88e51b19.sys.drill, > 26fee988-ed18-a86a-7164-3e75118c0ffc.sys.drill, drillbit.log, drillbit.log, > drillbit.log > > > git.commit.id.abbrev=3e8b01d > Dataset : > {code} > Every record contains a repeated type with 2000 elements. > The repeated type contains varchars of length 250 for the first 2000 records > and single character strings for the next 2000 records > The above pattern is repeated a few times > {code} > The below query fails > {code} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.width.max_per_query` = 1; > select count(*) from (select * from (select id, flatten(str_list) str from > dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by > d.str) d1 where d1.id=0; > Error: RESOURCE ERROR: Unable to allocate sv2 buffer > Fragment 0:0 > [Error Id: 9e45c293-ab26-489d-a90e-25da96004f15 on qa-node190.qa.lab:31010] > (state=,code=0) > {code} > Exception from the logs > {code} > [Error Id: 9e45c293-ab26-489d-a90e-25da96004f15 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) > ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.newSV2(ExternalSortBatch.java:1463) > 
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.makeSelectionVector(ExternalSortBatch.java:799) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:856) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > 
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > [drill-java-exe
[jira] [Resolved] (DRILL-5753) Managed External Sort: One or more nodes ran out of memory while executing the query.
[ https://issues.apache.org/jira/browse/DRILL-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5753. --- Resolution: Fixed > Managed External Sort: One or more nodes ran out of memory while executing > the query. > - > > Key: DRILL-5753 > URL: https://issues.apache.org/jira/browse/DRILL-5753 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 26596b4e-9883-7dc2-6275-37134f7d63be.sys.drill, > drillbit.log > > > The query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.memory.max_query_memory_per_node` = 1252428800; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist > ); > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > {noformat} > The stack trace is: > {noformat} > 2017-08-30 03:35:10,479 [BitServer-5] DEBUG > o.a.drill.exec.work.foreman.Foreman - 26596b4e-9883-7dc2-6275-37134f7d63be: > State change requested RUNNING --> FAILED > org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One > or more nodes ran out of memory while executing the query. > Unable to allocate buffer of size 4194304 due to memory limit. Current > allocation: 43960640 > Fragment 2:9 > [Error Id: f58210a2-7569-42d0-8961-8c7e42c7fea3 on atsqa6c80.qa.lab:31010] > (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate > buffer of size 4194304 due to memory limit. 
Current allocation: 43960640 > org.apache.drill.exec.memory.BaseAllocator.buffer():238 > org.apache.drill.exec.memory.BaseAllocator.buffer():213 > org.apache.drill.exec.vector.BigIntVector.reAlloc():252 > org.apache.drill.exec.vector.BigIntVector$Mutator.setSafe():452 > org.apache.drill.exec.vector.RepeatedBigIntVector$Mutator.addSafe():355 > org.apache.drill.exec.vector.RepeatedBigIntVector.copyFromSafe():220 > > org.apache.drill.exec.vector.RepeatedBigIntVector$TransferImpl.copyValueSafe():202 > > org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 > > org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 > org.apache.drill.exec.vector.complex.MapVector.copyFromSafe():82 > > org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.doCopy():47 > org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.next():77 > > org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next():267 > > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():374 > > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():303 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > 
javax.security.auth.Subject.doAs():415 > org.apache.hadoop.security.UserGroupInformation.doAs():1595 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():744 > at > org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:521) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > or
[jira] [Resolved] (DRILL-5744) External sort fails with OOM error
[ https://issues.apache.org/jira/browse/DRILL-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5744. --- Resolution: Fixed This has been verified. > External sort fails with OOM error > -- > > Key: DRILL-5744 > URL: https://issues.apache.org/jira/browse/DRILL-5744 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 265b163b-cf44-d2ff-2e70-4cd746b56611.sys.drill, > q34.drillbit.log > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.width.max_per_query` = 1; > alter session set `planner.memory.max_query_memory_per_node` = 152428800; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid > ); > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.width.max_per_node` = 17; > alter session set `planner.disable_exchanges` = false; > alter session set `planner.width.max_per_query` = 1000; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > {noformat} > Stack trace is: > {noformat} > 2017-08-23 06:59:42,763 [266275e5-ebdb-14ae-d52d-00fa3a154f6d:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes > ran out of memory while executing the query. (Unable to allocate buffer of > size 4194304 (rounded from 3276750) due to memory limit. 
Current allocation: 79986944) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate buffer of size 4194304 (rounded from 3276750) due to > memory limit. Current allocation: 79986944 > [Error Id: 4f4959df-0921-4a50-b75e-56488469ab10 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. 
> Current allocation: 79986944 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:238) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:402) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:236) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:33) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:46) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:113) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:95) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateMap(VectorInitializer.java:130) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:93) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateBatch(VectorInitializer.java:85) > ~[drill-j
[jira] [Created] (DRILL-5778) Drill seems to run out of memory but completes execution
Robert Hou created DRILL-5778: - Summary: Drill seems to run out of memory but completes execution Key: DRILL-5778 URL: https://issues.apache.org/jira/browse/DRILL-5778 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0; {noformat} Plan is: {noformat} | 00-00 Screen 00-01 Project(EXPR$0=[$0]) 00-02 StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) 00-03 UnionExchange 01-01 StreamAgg(group=[{}], EXPR$0=[COUNT()]) 01-02 Project($f0=[0]) 01-03 SelectionVectorRemover 01-04 Filter(condition=[=($0, 0)]) 01-05 SingleMergeExchange(sort0=[1 ASC]) 02-01 SelectionVectorRemover 02-02 Sort(sort0=[$1], dir0=[ASC]) 02-03 Project(id=[$0], str=[$1]) 02-04 HashToRandomExchange(dist0=[[$1]]) 03-01 UnorderedMuxExchange 04-01 Project(id=[$0], str=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)]) 04-02 Flatten(flattenField=[$1]) 04-03 Project(id=[$0], str=[$1]) 04-04 Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json, numFiles=1, columns=[`id`, `str_list`], files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]]) {noformat} From drillbit.log: {noformat} 2017-09-08 05:07:21,515 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Actual batch schema & sizes { str(type: REQUIRED VARCHAR, count: 4096, std size: 54, actual size: 134, data size: 548360) id(type: OPTIONAL BIGINT, count: 
4096, std size: 8, actual size: 9, data size: 36864) Records: 4096, Total size: 1073819648, Data size: 585224, Gross row width: 262163, Net row width: 143, Density: 1} 2017-09-08 05:07:21,515 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] ERROR o.a.d.e.p.i.x.m.ExternalSortBatch - Insufficient memory to merge two batches. Incoming batch size: 1073819648, available memory: 2147483648 2017-09-08 05:07:21,517 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] INFO o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: true 2017-09-08 05:07:21,517 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.compile.JaninoClassCompiler - Compiling (source size=3.3 KiB): ... 2017-09-08 05:07:21,536 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.exec.compile.ClassTransformer - Compiled and merged SingleBatchSorterGen2677: bytecode size = 3.6 KiB, time = 19 ms. 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.t.g.SingleBatchSorterGen2677 - Took 5608 us to sort 4096 records 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Input Batch Estimates: record size = 143 bytes; net = 1073819648 bytes, gross = 1610729472, records = 4096 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Spill batch size: net = 1048476 bytes, gross = 1572714 bytes, records = 7332; spill file = 268435456 bytes 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Output batch size: net = 9371505 bytes, gross = 14057257 bytes, records = 65535 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 2147483648, buffer memory = 2143289744, merge memory = 2128740638 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen2677 - Took 4303 us to sort 4096 records 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Input Batch Estimates: record size = 266 bytes; net = 1073819648 bytes, gross = 1610729472, records = 4096 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Spill batch size: net = 1048572 bytes, gross = 1572858 bytes, rec
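The DEBUG lines in the DRILL-5778 log above expose the sort's batch-size arithmetic: with a net row width of 143 bytes, the spill batch holds 7332 records (143 × 7332 = 1048476 net bytes, about 1 MiB) and the 65535-record output batch is 9371505 net bytes. The "Insufficient memory to merge two batches" error also follows from the logged figures, since two incoming batches of 1073819648 bytes exceed the 2147483648 bytes available. A sketch of that arithmetic, using the logged numbers (the variable names and the ~1 MiB spill target are inferred from the log, not taken from Drill's internals):

```python
# Figures taken from the DRILL-5778 drillbit.log excerpt above.
record_size = 143                          # logged net row width, in bytes

# Spill batch: roughly 1 MiB, sized in whole records.
spill_records = (1 << 20) // record_size   # 7332
spill_net = spill_records * record_size    # 1048476, matches "Spill batch size: net = 1048476"

# Output batch: capped at 65535 records.
output_net = 65535 * record_size           # 9371505, matches "Output batch size: net = 9371505"

# Merging needs room for two incoming batches at once -- one plausible
# reading of the "Insufficient memory to merge two batches" message.
incoming, available = 1073819648, 2147483648
print(2 * incoming <= available)           # False: 2147639296 > 2147483648
```

The same arithmetic holds for the second estimate in the log (record size 266 gives 3942 records and a 1048572-byte net spill batch), which supports the ~1 MiB reading.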
[jira] [Created] (DRILL-5774) Excessive memory allocation
Robert Hou created DRILL-5774: - Summary: Excessive memory allocation Key: DRILL-5774 URL: https://issues.apache.org/jira/browse/DRILL-5774 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 This query exhibits excessive memory allocation: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0; {noformat} This query does a flatten on a large table. The result is 160M records. Half the records have a one-byte string, and half have a 253-byte string. And then there are 40K records with 223-byte strings. {noformat} select length(str), count(*) from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) group by length(str);
+---------+-----------+
| EXPR$0  | EXPR$1    |
+---------+-----------+
| 223     | 4         |
| 1       | 80042001  |
| 253     | 8000      |
+---------+-----------+
{noformat} From the drillbit.log: {noformat} 2017-09-02 11:43:44,598 [26550427-6adf-a52e-2ea8-dc52d8d8433f:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Actual batch schema & sizes { str(type: REQUIRED VARCHAR, count: 4096, std size: 54, actual size: 134, data size: 548360) id(type: OPTIONAL BIGINT, count: 4096, std size: 8, actual size: 9, data size: 36864) Records: 4096, Total size: 1073819648, Data size: 585224, Gross row width: 262163, Net row width: 143, Density: 1} {noformat} The data size is 585K, but the batch size is 1 GB. The density is 1%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
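The density figure in DRILL-5774 can be reproduced from the logged sizes: 585224 bytes of actual data in a 1073819648-byte batch is a raw density of roughly 0.05%, which the log reports as 1 when rounded up to a whole percent. A sketch of the arithmetic (assuming the logged Density is this ratio rounded up; the variable names are illustrative):

```python
import math

# Sizes from the DRILL-5774 drillbit.log excerpt above.
data_size = 585224          # bytes of actual data in the batch
total_size = 1073819648     # bytes allocated for the batch (~1 GB)

density_pct = 100 * data_size / total_size   # ~0.054%: the batch is ~99.9% unused
print(math.ceil(density_pct))                # 1, matching "Density: 1" in the log
```

Either way, the batch allocates roughly 1800× more memory than the data it carries, which is the excessive allocation this report describes.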
[jira] [Created] (DRILL-5753) Managed External Sort: One or more nodes ran out of memory while executing the query.
Robert Hou created DRILL-5753: - Summary: Managed External Sort: One or more nodes ran out of memory while executing the query. Key: DRILL-5753 URL: https://issues.apache.org/jira/browse/DRILL-5753 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 The query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.memory.max_query_memory_per_node` = 1252428800; select count(*) from ( select * from ( select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid from ( select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid ) s1 ) s2 order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist ); ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; {noformat} The stack trace is: {noformat} 2017-08-30 03:35:10,479 [BitServer-5] DEBUG o.a.drill.exec.work.foreman.Foreman - 26596b4e-9883-7dc2-6275-37134f7d63be: State change requested RUNNING --> FAILED org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate buffer of size 4194304 due to memory limit. Current allocation: 43960640 Fragment 2:9 [Error Id: f58210a2-7569-42d0-8961-8c7e42c7fea3 on atsqa6c80.qa.lab:31010] (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate buffer of size 4194304 due to memory limit. 
Current allocation: 43960640 org.apache.drill.exec.memory.BaseAllocator.buffer():238 org.apache.drill.exec.memory.BaseAllocator.buffer():213 org.apache.drill.exec.vector.BigIntVector.reAlloc():252 org.apache.drill.exec.vector.BigIntVector$Mutator.setSafe():452 org.apache.drill.exec.vector.RepeatedBigIntVector$Mutator.addSafe():355 org.apache.drill.exec.vector.RepeatedBigIntVector.copyFromSafe():220 org.apache.drill.exec.vector.RepeatedBigIntVector$TransferImpl.copyValueSafe():202 org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 org.apache.drill.exec.vector.complex.MapVector.copyFromSafe():82 org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.doCopy():47 org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.next():77 org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next():267 org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():374 org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():303 org.apache.drill.exec.record.AbstractRecordBatch.next():164 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():164 org.apache.drill.exec.physical.impl.BaseRootExec.next():105 org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92 org.apache.drill.exec.physical.impl.BaseRootExec.next():95 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():415 
org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():744 at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:521) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:94) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:55) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.BasicServer.handle(BasicServer.java:157) [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.BasicServ
[jira] [Resolved] (DRILL-5732) Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill.
[ https://issues.apache.org/jira/browse/DRILL-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5732. --- Resolution: Not A Problem > Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. > - > > Key: DRILL-5732 > URL: https://issues.apache.org/jira/browse/DRILL-5732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Attachments: 26621eb2-daec-cef9-efed-5986e72a750a.sys.drill, > drillbit.log.83 > > > git commit id: > {noformat} > | 1.12.0-SNAPSHOT | e9065b55ea560e7f737d6fcb4948f9e945b9b14f | DRILL-5660: > Parquet metadata caching improvements | 15.08.2017 @ 09:31:00 PDT | > r...@qa-node190.qa.lab | 15.08.2017 @ 13:29:26 PDT | > {noformat} > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.memory.max_query_memory_per_node` = 104857600; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.width.max_per_query` = 1; > select max(col1), max(cs_sold_date_sk), max(cs_sold_time_sk), > max(cs_ship_date_sk), max(cs_bill_customer_sk), max(cs_bill_cdemo_sk), > max(cs_bill_hdemo_sk), max(cs_bill_addr_sk), max(cs_ship_customer_sk), > max(cs_ship_cdemo_sk), max(cs_ship_hdemo_sk), max(cs_ship_addr_sk), > max(cs_call_center_sk), max(cs_catalog_page_sk), max(cs_ship_mode_sk), > min(cs_warehouse_sk), max(cs_item_sk), max(cs_promo_sk), > max(cs_order_number), max(cs_quantity), max(cs_wholesale_cost), > max(cs_list_price), max(cs_sales_price), max(cs_ext_discount_amt), > min(cs_ext_sales_price), max(cs_ext_wholesale_cost), min(cs_ext_list_price), > min(cs_ext_tax), min(cs_coupon_amt), max(cs_ext_ship_cost), max(cs_net_paid), > max(cs_net_paid_inc_tax), min(cs_net_paid_inc_ship), > min(cs_net_paid_inc_ship_tax), min(cs_net_profit), min(c_customer_sk), > min(length(c_customer_id)), max(c_current_cdemo_sk), 
max(c_current_hdemo_sk), > min(c_current_addr_sk), min(c_first_shipto_date_sk), > min(c_first_sales_date_sk), min(length(c_salutation)), > min(length(c_first_name)), min(length(c_last_name)), > min(length(c_preferred_cust_flag)), max(c_birth_day), min(c_birth_month), > min(c_birth_year), max(c_last_review_date), c_email_address from (select > cs_sold_date_sk+cs_sold_time_sk col1, * from > dfs.`/drill/testdata/resource-manager/md1362` order by c_email_address nulls > first) d where d.col1 > 2536816 and c_email_address is not null group by > c_email_address; > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.disable_exchanges` = false; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > alter session set `planner.width.max_per_node` = 17; > alter session set `planner.width.max_per_query` = 1000; > {noformat} > Here is the stack trace: > {noformat} > 2017-08-18 13:15:27,052 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen27 - Took 6445 us to sort 9039 records > 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.p.i.xsort.ExternalSortBatch - Copier allocator current allocation 0 > 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.p.i.xsort.ExternalSortBatch - mergeAndSpill: starting total size in > memory = 71964288 > 2017-08-18 13:15:27,421 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes > ran out of memory while executing the query. > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. 
> batchGroups.size 1 > spilledBatchGroups.size 0 > allocated memory 71964288 > allocator limit 52428800 > [Error Id: 7b248f12-2b31-4013-86b6-92e6c842db48 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:637) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:379) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIt
[jira] [Created] (DRILL-5744) External sort fails with OOM error
Robert Hou created DRILL-5744: - Summary: External sort fails with OOM error Key: DRILL-5744 URL: https://issues.apache.org/jira/browse/DRILL-5744 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; alter session set `planner.memory.max_query_memory_per_node` = 152428800; select count(*) from ( select * from ( select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid from ( select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid ) s1 ) s2 order by s2.rms.mapid ); ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.width.max_per_node` = 17; alter session set `planner.disable_exchanges` = false; alter session set `planner.width.max_per_query` = 1000; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; {noformat} Stack trace is: {noformat} 2017-08-23 06:59:42,763 [266275e5-ebdb-14ae-d52d-00fa3a154f6d:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes ran out of memory while executing the query. (Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. Current allocation: 7 9986944) org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. 
Current allocation: 79986944 [Error Id: 4f4959df-0921-4a50-b75e-56488469ab10 ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. Cur rent allocation: 79986944 at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:238) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:402) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:236) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:33) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPS HOT] at org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:46) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:113) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT ] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:95) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateMap(VectorInitializer.java:130) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:93) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateBatch(VectorInitializer.java:85) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next(PriorityQueueCopierWrapper.java:262) ~[drill-java -exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:374) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12 .0-SNAPSHOT] at org.apache.dr
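The "rounded from 3276750" wording in the error reflects the allocator rounding each request up to a power of two. A minimal sketch of that arithmetic (an illustration of the rounding rule, not Drill's actual BaseAllocator code):

```python
def next_power_of_two(n: int) -> int:
    """Round an allocation request up to the nearest power of two."""
    if n <= 0:
        raise ValueError("allocation size must be positive")
    return 1 << (n - 1).bit_length()

# Figures from the error message above: a 3,276,750-byte request is
# charged against the limit as a 4,194,304-byte (4 MB) buffer.
requested = 3_276_750
charged = next_power_of_two(requested)
print(charged)  # 4194304
```

Because of this rounding, a request can consume up to twice its nominal size from the allocator limit, which makes sorts hit the memory cap earlier than the raw batch sizes alone would suggest.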
[jira] [Created] (DRILL-5732) Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill.
Robert Hou created DRILL-5732: - Summary: Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. Key: DRILL-5732 URL: https://issues.apache.org/jira/browse/DRILL-5732 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Robert Hou Assignee: Paul Rogers git commit id: {noformat} | 1.12.0-SNAPSHOT | e9065b55ea560e7f737d6fcb4948f9e945b9b14f | DRILL-5660: Parquet metadata caching improvements | 15.08.2017 @ 09:31:00 PDT | r...@qa-node190.qa.lab | 15.08.2017 @ 13:29:26 PDT | {noformat} Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.disable_exchanges` = true; alter session set `planner.memory.max_query_memory_per_node` = 104857600; alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select max(col1), max(cs_sold_date_sk), max(cs_sold_time_sk), max(cs_ship_date_sk), max(cs_bill_customer_sk), max(cs_bill_cdemo_sk), max(cs_bill_hdemo_sk), max(cs_bill_addr_sk), max(cs_ship_customer_sk), max(cs_ship_cdemo_sk), max(cs_ship_hdemo_sk), max(cs_ship_addr_sk), max(cs_call_center_sk), max(cs_catalog_page_sk), max(cs_ship_mode_sk), min(cs_warehouse_sk), max(cs_item_sk), max(cs_promo_sk), max(cs_order_number), max(cs_quantity), max(cs_wholesale_cost), max(cs_list_price), max(cs_sales_price), max(cs_ext_discount_amt), min(cs_ext_sales_price), max(cs_ext_wholesale_cost), min(cs_ext_list_price), min(cs_ext_tax), min(cs_coupon_amt), max(cs_ext_ship_cost), max(cs_net_paid), max(cs_net_paid_inc_tax), min(cs_net_paid_inc_ship), min(cs_net_paid_inc_ship_tax), min(cs_net_profit), min(c_customer_sk), min(length(c_customer_id)), max(c_current_cdemo_sk), max(c_current_hdemo_sk), min(c_current_addr_sk), min(c_first_shipto_date_sk), min(c_first_sales_date_sk), min(length(c_salutation)), min(length(c_first_name)), min(length(c_last_name)), min(length(c_preferred_cust_flag)), max(c_birth_day), min(c_birth_month), min(c_birth_year), 
max(c_last_review_date), c_email_address from (select cs_sold_date_sk+cs_sold_time_sk col1, * from dfs.`/drill/testdata/resource-manager/md1362` order by c_email_address nulls first) d where d.col1 > 2536816 and c_email_address is not null group by c_email_address; ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.disable_exchanges` = false; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; alter session set `planner.width.max_per_node` = 17; alter session set `planner.width.max_per_query` = 1000; {noformat} Here is the stack trace: {noformat} 2017-08-18 13:15:27,052 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.t.g.SingleBatchSorterGen27 - Took 6445 us to sort 9039 records 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.p.i.xsort.ExternalSortBatch - Copier allocator current allocation 0 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.p.i.xsort.ExternalSortBatch - mergeAndSpill: starting total size in memory = 71964288 2017-08-18 13:15:27,421 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes ran out of memory while executing the query. org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. 
batchGroups.size 1 spilledBatchGroups.size 0 allocated memory 71964288 allocator limit 52428800 [Error Id: 7b248f12-2b31-4013-86b6-92e6c842db48 ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:637) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:379) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSing
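The numbers in the error describe an unrecoverable state: the sort is already past its allocator limit, and with only one in-memory batch group there is nothing to merge and spill. A simplified sketch of that dead-end (the 2-bytes-per-record size is standard for a SelectionVector2; the control flow is a simplification, not Drill's actual code):

```python
SV2_BYTES_PER_RECORD = 2  # a SelectionVector2 stores a 2-byte index per record

def sv2_allocation_fails(records, allocated, limit, batch_groups):
    """True when the sort can neither allocate the sv2 nor spill.
    The sv2 itself is tiny (records * 2 bytes), but the allocator is
    already over its limit, and spilling needs more than one group."""
    need = records * SV2_BYTES_PER_RECORD
    over_limit = allocated + need > limit
    can_spill = batch_groups > 1
    return over_limit and not can_spill

# Figures from the error message above:
print(sv2_allocation_fails(records=9039,
                           allocated=71_964_288,   # allocated memory
                           limit=52_428_800,       # allocator limit (50 MB)
                           batch_groups=1))        # batchGroups.size
# True: 71964288 already exceeds the 50 MB limit, and a single batch
# group cannot be merged-and-spilled to free memory
```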
[jira] [Created] (DRILL-5374) Parquet filter pushdown does not prune partition with nulls when predicate uses float column
Robert Hou created DRILL-5374: Summary: Parquet filter pushdown does not prune partition with nulls when predicate uses float column Key: DRILL-5374 URL: https://issues.apache.org/jira/browse/DRILL-5374 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.9.0 Reporter: Robert Hou Assignee: Jinfeng Ni

Drill does not prune enough partitions for this query when filter pushdown is used with metadata caching. The float column is being compared with a double value.
{code}
0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_metadata where float_id < 1100.0;
{code}
To reproduce the problem, put the attached files into a directory. Then create the metadata:
{code}
refresh table metadata dfs.`path_to_directory`;
{code}
For example, if you put the files in /drill/testdata/filter/orders_parts_metadata, run this SQL command:
{code}
refresh table metadata dfs.`/drill/testdata/filter/orders_parts_metadata`;
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5136) Some SQL statements fail when using Simba ODBC driver 1.3
Robert Hou created DRILL-5136: Summary: Some SQL statements fail when using Simba ODBC driver 1.3 Key: DRILL-5136 URL: https://issues.apache.org/jira/browse/DRILL-5136 Project: Apache Drill Issue Type: Bug Components: Client - ODBC Affects Versions: 1.9.0 Reporter: Robert Hou

"show schemas" does not work with the Simba ODBC driver:

SQL>show schemas
1: SQLPrepare = [MapR][Drill] (1040) Drill failed to execute the query: show schemas
[30029]Query execution error. Details:[
PARSE ERROR: Encountered "( show" at line 1, column 15.
Was expecting one of: ... ... ... ... ... "LATERAL" ... "(" "WITH" ... "(" "+" ... "(" "-" ... "(" ... "(" ... "("
[jira] [Created] (DRILL-5093) Explain plan shows all partitions when query scans all partitions, and filter pushdown is used with metadata caching.
Robert Hou created DRILL-5093: Summary: Explain plan shows all partitions when query scans all partitions, and filter pushdown is used with metadata caching. Key: DRILL-5093 URL: https://issues.apache.org/jira/browse/DRILL-5093 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.9.0 Reporter: Robert Hou Assignee: Jinfeng Ni

This query scans all the partitions because the partitions cannot be pruned. When metadata caching is used, the explain plan shows all the partitions, when it should only show the parent.

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select * from orders_parts_metadata;
00-00 Screen
00-01   Project(*=[$0])
00-02     Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_1.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_3.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_4.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_5.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_2.parquet]], selectionRoot=/drill/testdata/filter/orders_parts_metadata, numFiles=5, usedMetadataFile=true, cacheFileRoot=/drill/testdata/filter/orders_parts_metadata, columns=[`*`]]])

Here is the same query with a table that does not have metadata caching.

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select * from orders_parts;
00-00 Screen
00-01   Project(*=[$0])
00-02     Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/filter/orders_parts]], selectionRoot=maprfs:/drill/testdata/filter/orders_parts, numFiles=1, usedMetadataFile=false, columns=[`*`]]])

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-5086) ClassCastException when filter pushdown is used with a bigint or float column.
Robert Hou created DRILL-5086: Summary: ClassCastException when filter pushdown is used with a bigint or float column. Key: DRILL-5086 URL: https://issues.apache.org/jira/browse/DRILL-5086 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.9.0 Reporter: Robert Hou Assignee: Aman Sinha

This query results in a ClassCastException when filter pushdown is used. The bigint column is being compared with an integer value.

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_metadata where bigint_id < 1100;
Error: SYSTEM ERROR: ClassCastException: java.lang.Integer cannot be cast to java.lang.Long

A similar problem occurs when a float column is being compared with a double value.

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_metadata where float_id < 1100.0;
Error: SYSTEM ERROR: ClassCastException

Also when a timestamp column is being compared with a string.

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_metadata where timestamp_id < '2016-10-13';
Error: SYSTEM ERROR: ClassCastException: java.lang.Integer cannot be cast to java.lang.Long

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
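The exception text suggests the pushdown comparison receives an Integer-boxed literal where the BIGINT column's row-group statistics are Long-boxed. A sketch of the coercion the comparison would need first (hypothetical helper names; Python stands in for the Java boxed types here):

```python
# Hypothetical sketch: widen a filter literal to the column's statistics
# type before comparing against Parquet row-group min/max. The failure
# above is the Java equivalent of skipping this step and casting an
# Integer-boxed literal straight to Long.
def coerce_literal(column_type, literal):
    if column_type == "BIGINT":
        return int(literal)      # Java: Integer -> Long promotion
    if column_type in ("FLOAT4", "FLOAT8"):
        return float(literal)    # Java: promote to Double
    if column_type == "TIMESTAMP":
        # a string literal needs parsing, not a numeric cast
        raise TypeError("parse timestamp literals explicitly")
    return literal

def prune_row_group(col_min, col_max, column_type, op, literal):
    """Decide whether a row group can be skipped for `col <op> literal`."""
    lit = coerce_literal(column_type, literal)
    if op == "<":
        return col_min >= lit    # no row can satisfy col < lit
    raise NotImplementedError(op)

# bigint_id < 1100 against a row group whose min is 2000: prunable
print(prune_row_group(2000, 9999, "BIGINT", "<", 1100))  # True
```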
[jira] [Created] (DRILL-5035) Selecting timestamp value from Hive table causes IndexOutOfBoundsException
Robert Hou created DRILL-5035: Summary: Selecting timestamp value from Hive table causes IndexOutOfBoundsException Key: DRILL-5035 URL: https://issues.apache.org/jira/browse/DRILL-5035 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.9.0 Reporter: Robert Hou

I used the new option to read Hive timestamps:
alter session set `store.parquet.reader.int96_as_timestamp` = true;

This query fails:
select timestamp_id from orders_parts_hive where timestamp_id = '2016-10-03 06:11:52.429';
Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 36288 (expected: 0 <= readerIndex <= writerIndex <= capacity(32768))
Fragment 0:0
[Error Id: 50537b32-cdc9-4898-9581-531066288fbd on qa-node211:31010] (state=,code=0)

Selecting all the columns succeeds:
0: jdbc:drill:zk=10.10.100.186:5181> select * from orders_parts_hive where timestamp_id = '2016-10-03 06:11:52.429';
| o_orderkey | o_custkey | o_orderstatus | o_totalprice | o_orderdate | o_clerk | o_shippriority | o_comment | int_id | bigint_id | float_id | double_id | varchar_id | date_id | timestamp_id | dir0 |
| 11335 | 871 | F | 133549.0 | 1994-10-22 | null | 0 | ealms. theodolites maintain. regular, even instructions against t | -4 | -4 | -4.0 | -4.0 | -4 | 2016-09-29 | 2016-10-03 06:11:52.429 | o_orderpriority=2-HIGH |

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-5018) Metadata cache has duplicate columnTypeInfo values
Robert Hou created DRILL-5018: Summary: Metadata cache has duplicate columnTypeInfo values Key: DRILL-5018 URL: https://issues.apache.org/jira/browse/DRILL-5018 Project: Apache Drill Issue Type: Bug Components: Metadata Affects Versions: 1.8.0 Reporter: Robert Hou Assignee: Parth Chandra

This lineitem table has duplicate entries in its metadata file, although the entries have slightly different values. The table uses directory-based partitioning on year and month.

"columnTypeInfo" : {
  "L_RETURNFLAG" : { "name" : [ "L_RETURNFLAG" ], "primitiveType" : "BINARY", "originalType" : null, "precision" : 0, "scale" : 0, "repetitionLevel" : 0, "definitionLevel" : 1 },
  "l_returnflag" : { "name" : [ "l_returnflag" ], "primitiveType" : "BINARY", "originalType" : "UTF8", "precision" : 0, "scale" : 0, "repetitionLevel" : 0, "definitionLevel" : 0 },

A second lineitem table, created with CTAS from the table above, also has two entries in its metadata file for each column, but here the two entries have different column names (one adds a trailing zero) and slightly different values.

  "l_shipinstruct" : { "name" : [ "l_shipinstruct" ], "primitiveType" : "BINARY", "originalType" : "UTF8", "precision" : 0, "scale" : 0, "repetitionLevel" : 0, "definitionLevel" : 0 },
  "L_SHIPINSTRUCT0" : { "name" : [ "L_SHIPINSTRUCT0" ], "primitiveType" : "BINARY", "originalType" : null, "precision" : 0, "scale" : 0, "repetitionLevel" : 0, "definitionLevel" : 1 },

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
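A quick way to spot such duplicates in a cache file is to group the columnTypeInfo keys case-insensitively, after stripping a trailing digit suffix like the "0" above. A sketch, assuming the cache's "columnTypeInfo" object has been parsed into a dict:

```python
from collections import defaultdict

def find_duplicate_columns(column_type_info):
    """Group columnTypeInfo keys that differ only by letter case or by a
    trailing digit suffix (like L_SHIPINSTRUCT0), and report groups with
    more than one entry."""
    groups = defaultdict(list)
    for name in column_type_info:
        groups[name.rstrip("0123456789").lower()].append(name)
    return {k: v for k, v in groups.items() if len(v) > 1}

# fragment mirroring the metadata entries quoted above
cache_fragment = {
    "L_RETURNFLAG":    {"originalType": None,   "definitionLevel": 1},
    "l_returnflag":    {"originalType": "UTF8", "definitionLevel": 0},
    "l_shipinstruct":  {"originalType": "UTF8", "definitionLevel": 0},
    "L_SHIPINSTRUCT0": {"originalType": None,   "definitionLevel": 1},
}
print(find_duplicate_columns(cache_fragment))
# {'l_returnflag': ['L_RETURNFLAG', 'l_returnflag'],
#  'l_shipinstruct': ['l_shipinstruct', 'L_SHIPINSTRUCT0']}
```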
[jira] [Created] (DRILL-4971) query encounters system error: Statement "break AndOP3" is not enclosed by a breakable statement with label "AndOP3"
Robert Hou created DRILL-4971: Summary: query encounters system error: Statement "break AndOP3" is not enclosed by a breakable statement with label "AndOP3" Key: DRILL-4971 URL: https://issues.apache.org/jira/browse/DRILL-4971 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Robert Hou Attachments: low_table, medium_table

This query returns an error:

select count(*) from test where ((int_id > 3060 and int_id < 6002) or (int_id > 9025 and int_id < 11976)) and ((int_id > 9025 and int_id < 11976) or (int_id > 3060 and int_id < 6002)) and (int_id > 3060 and int_id < 6002);
Error: SYSTEM ERROR: CompileException: Line 232, Column 30: Statement "break AndOP3" is not enclosed by a breakable statement with label "AndOP3"

There are two partitions to the test table. One covers the range 3061 - 6001 and the other covers the range 9026 - 11975.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4970) Wrong results when casting double to bigint or int
Robert Hou created DRILL-4970: Summary: Wrong results when casting double to bigint or int Key: DRILL-4970 URL: https://issues.apache.org/jira/browse/DRILL-4970 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.8.0 Reporter: Robert Hou

This query returns the wrong result:

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_rowgr1 where (int_id > -3025 and bigint_id <= -256) or (cast(double_id as bigint) >= -255 and double_id <= -5);
+---------+
| EXPR$0  |
+---------+
| 2769    |
+---------+

Without the cast, it returns the correct result:

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_rowgr1 where (int_id > -3025 and bigint_id <= -256) or (double_id >= -255 and double_id <= -5);
+---------+
| EXPR$0  |
+---------+
| 3020    |
+---------+

By itself, the result is also correct:

0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_rowgr1 where (cast(double_id as bigint) >= -255 and double_id <= -5);
+---------+
| EXPR$0  |
+---------+
| 251     |
+---------+

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
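One way a cast in the predicate can shift the count: converting DOUBLE to BIGINT must pick a rule for fractional values, and truncation versus rounding put a boundary value like -255.5 on opposite sides of `>= -255`. An illustrative sketch of the two conventions (it deliberately does not claim which rule Drill applies):

```python
import math

def cast_truncate(x):
    """Truncate toward zero (C-style cast)."""
    return math.trunc(x)

def cast_round_half_away(x):
    """Round to nearest, halves away from zero. (Java's Math.round
    differs: it rounds halves up; this is shown only as one plausible
    convention.)"""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

v = -255.5
# against the predicate  cast(double_id as bigint) >= -255
print(cast_truncate(v) >= -255)          # True:  trunc(-255.5) = -255
print(cast_round_half_away(v) >= -255)   # False: rounds to -256
```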
[jira] [Created] (DRILL-4883) Drill Explorer returns "SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference ; a field reference identifier must not have the form of a qualified name (
Robert Hou created DRILL-4883: - Summary: Drill Explorer returns "SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference ; a field reference identifier must not have the form of a qualified name (i.e., with "."). Key: DRILL-4883 URL: https://issues.apache.org/jira/browse/DRILL-4883 Project: Apache Drill Issue Type: Bug Components: Execution - Codegen Affects Versions: 1.8.0 Environment: Drill Explorer runs in Windows Reporter: Robert Hou When Drill Explorer submits this query, it returns an error regarding favorites.color: select age,`favorites.color` from `dfs`.`drillTestDir`.`./json_storage/employeeNestedArrayAndObject.json` The error is: ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: select age,`favorites.color` from `dfs`.`drillTestDir`.`./json_storage/employeeNestedArrayAndObject.json` [30027]Query execution error. Details:[ SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference "favorites.color"; a field reference identifier must not have the form of a qualified name (i.e., with "."). This query can be executed by sqlline (note that the format of the query is slightly different for sqlline and Drill Explorer). select age,`favorites.color` from `json_storage/employeeNestedArrayAndObject.json`; The physical plan for the query when using sqlline is different from the physical plan when using Drill Explorer. 
Here is the plan when using sqlline:

00-00 Screen : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {0.1 rows, 0.1 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19699870
00-01   Project(age=[$0], favorites.color=[$1]) : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19699869
00-02     Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/json_storage/employeeNestedArrayAndObject.json, numFiles=1, columns=[`age`, `favorites.color`], files=[maprfs:///drill/testdata/json_storage/employeeNestedArrayAndObject.json]]]) : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19699868

The physical plan when using Drill Explorer is:

00-00 Screen : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {1.1 rows, 1.1 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19675621
00-01   ComplexToJson : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19675620
00-02     Project(age=[$0], favorites.color=[$1]) : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19675619
00-03       Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/json_storage/employeeNestedArrayAndObject.json, numFiles=1, columns=[`age`, `favorites.color`], files=[maprfs:///drill/testdata/json_storage/employeeNestedArrayAndObject.json]]]) : rowType = RecordType(ANY age, ANY favorites.color): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 19675618

Drill Explorer has an extra ComplexToJson operator that may have a problem.
Here is the data file used: { "first": "John", "last": "Doe", "age": 39, "sex": "M", "salary": 7, "registered": true, "interests": [ "Reading", "Mountain Biking", "Hacking" ], "favorites": { "color": "Blue", "sport": "Soccer", "food": "Spaghetti" } } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
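The error hinges on one distinction: a backtick-quoted identifier containing a dot names a single column, while an unquoted dotted name is a multi-segment path. A toy sketch of that parsing rule (hypothetical helper, not Drill's SchemaPath implementation):

```python
def parse_field_reference(ref):
    """Split a field reference into path segments, honoring backtick
    quoting: `favorites.color` is ONE column whose name contains a dot;
    unquoted favorites.color is a two-segment path (map member access)."""
    if len(ref) >= 2 and ref.startswith("`") and ref.endswith("`"):
        return [ref[1:-1]]        # quoted: the dot is part of the name
    return ref.split(".")         # unquoted: the dot separates segments

print(parse_field_reference("`favorites.color`"))  # ['favorites.color']
print(parse_field_reference("favorites.color"))    # ['favorites', 'color']
```

Losing the quoting somewhere along the Drill Explorer plan would leave a single-segment name containing a dot, which is exactly what the "Unhandled field reference" error rejects; that is consistent with the ComplexToJson suspicion above, though the report does not confirm it.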