Re: [ANNOUNCE] New PMC Chair of Apache Drill
Congratulations Charles, and thanks for your contributions to Drill! Thank you Arina for all you have done as PMC Chair this past year.

--Robert

On Fri, Aug 23, 2019 at 4:16 PM Khurram Faraaz wrote:
> Congratulations Charles, and thank you Arina.
>
> Regards,
> Khurram
>
> On Fri, Aug 23, 2019 at 2:54 PM Niels Basjes wrote:
> > Congratulations Charles.
> >
> > Niels Basjes
> >
> > On Thu, Aug 22, 2019, 09:28 Arina Ielchiieva wrote:
> > > Hi all,
> > >
> > > It has been an honor to serve as Drill Chair during the past year, but it's high time for a new one...
> > >
> > > I am very pleased to announce that the Drill PMC has voted to elect Charles Givre as the new PMC chair of Apache Drill. He has also been approved unanimously by the Apache Board in the last board meeting.
> > >
> > > Congratulations, Charles!
> > >
> > > Kind regards,
> > > Arina
Re: [ANNOUNCE] New Committer: Jyothsna Donapati
Congratulations! Thanks for your contributions.

--Robert

On Thu, May 9, 2019 at 4:00 PM Sorabh Hamirwasia wrote:
> Congratulations!
>
> On Thu, May 9, 2019 at 3:45 PM Hanumanth Maduri wrote:
> > Congratulations Jyothsna!!
> >
> > On May 9, 2019, at 3:06 PM, Gautam Parai wrote:
> > > Congratulations Jyothsna!!
> > >
> > > Gautam
> > >
> > > On Thu, May 9, 2019 at 2:59 PM Timothy Farkas wrote:
> > >> Congrats!!
> > >>
> > >> On Thu, May 9, 2019 at 2:54 PM Bridget Bevens wrote:
> > >>> Congratulations, Jyothsna!!! :-)
> > >>>
> > >>> On Thu, May 9, 2019 at 2:46 PM Khurram Faraaz wrote:
> > >>>> Congratulations Jyothsna!
> > >>>>
> > >>>> On Thu, May 9, 2019 at 2:38 PM salim achouche wrote:
> > >>>>> Congratulations Jyothsna!
> > >>>>>
> > >>>>> On Thu, May 9, 2019 at 2:28 PM Aman Sinha wrote:
> > >>>>>> The Project Management Committee (PMC) for Apache Drill has invited Jyothsna Donapati to become a committer, and we are pleased to announce that she has accepted.
> > >>>>>>
> > >>>>>> Jyothsna has been contributing to Drill for about 1 1/2 years. She initially contributed the graceful shutdown capability and more recently has made several crucial improvements in the parquet metadata caching which have gone into the 1.16 release. She also co-authored the design document for this feature.
> > >>>>>>
> > >>>>>> Welcome Jyothsna, and thank you for your contributions. Keep up the good work!
> > >>>>>>
> > >>>>>> -Aman
> > >>>>>> (on behalf of Drill PMC)
> > >>>>>
> > >>>>> --
> > >>>>> Regards,
> > >>>>> Salim
[jira] [Created] (DRILL-7227) TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
Robert Hou created DRILL-7227:
-------------------------------------

Summary: TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
Key: DRILL-7227
URL: https://issues.apache.org/jira/browse/DRILL-7227
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.17.0
Attachments: 23387ab0-cb1c-cd5e-449a-c9bcefc901c1.sys.drill, 2338ae93-155b-356d-382e-0da949c6f439.sys.drill

Here is query 78:
{noformat}
WITH ws AS (SELECT d_year AS ws_sold_year, ws_item_sk,
                   ws_bill_customer_sk ws_customer_sk,
                   Sum(ws_quantity) ws_qty,
                   Sum(ws_wholesale_cost) ws_wc,
                   Sum(ws_sales_price) ws_sp
            FROM web_sales
            LEFT JOIN web_returns ON wr_order_number = ws_order_number AND ws_item_sk = wr_item_sk
            JOIN date_dim ON ws_sold_date_sk = d_date_sk
            WHERE wr_order_number IS NULL
            GROUP BY d_year, ws_item_sk, ws_bill_customer_sk),
cs AS (SELECT d_year AS cs_sold_year, cs_item_sk,
              cs_bill_customer_sk cs_customer_sk,
              Sum(cs_quantity) cs_qty,
              Sum(cs_wholesale_cost) cs_wc,
              Sum(cs_sales_price) cs_sp
       FROM catalog_sales
       LEFT JOIN catalog_returns ON cr_order_number = cs_order_number AND cs_item_sk = cr_item_sk
       JOIN date_dim ON cs_sold_date_sk = d_date_sk
       WHERE cr_order_number IS NULL
       GROUP BY d_year, cs_item_sk, cs_bill_customer_sk),
ss AS (SELECT d_year AS ss_sold_year, ss_item_sk, ss_customer_sk,
              Sum(ss_quantity) ss_qty,
              Sum(ss_wholesale_cost) ss_wc,
              Sum(ss_sales_price) ss_sp
       FROM store_sales
       LEFT JOIN store_returns ON sr_ticket_number = ss_ticket_number AND ss_item_sk = sr_item_sk
       JOIN date_dim ON ss_sold_date_sk = d_date_sk
       WHERE sr_ticket_number IS NULL
       GROUP BY d_year, ss_item_sk, ss_customer_sk)
SELECT ss_item_sk,
       Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2) ratio,
       ss_qty store_qty,
       ss_wc store_wholesale_cost,
       ss_sp store_sales_price,
       COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0) other_chan_qty,
       COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0) other_chan_wholesale_cost,
       COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0) other_chan_sales_price
FROM ss
LEFT JOIN ws ON ( ws_sold_year = ss_sold_year AND ws_item_sk = ss_item_sk AND ws_customer_sk = ss_customer_sk )
LEFT JOIN cs ON ( cs_sold_year = ss_sold_year AND cs_item_sk = cs_item_sk AND cs_customer_sk = ss_customer_sk )
WHERE COALESCE(ws_qty, 0) > 0
  AND COALESCE(cs_qty, 0) > 0
  AND ss_sold_year = 1999
ORDER BY ss_item_sk, ss_qty DESC, ss_wc DESC, ss_sp DESC,
         other_chan_qty, other_chan_wholesale_cost, other_chan_sales_price,
         Round(ss_qty / ( COALESCE(ws_qty + cs_qty, 1) ), 2)
LIMIT 100;
{noformat}

The profile for the new plan is 2338ae93-155b-356d-382e-0da949c6f439. The hash partition sender operator (10-00) takes 10-15 minutes; I am not sure why it takes so long. It has 10 minor fragments sending to receiver (06-05), which has 62 minor fragments. But hash partition sender (16-00) also has 10 minor fragments sending to receiver (12-06), which has 220 minor fragments, and it shows no performance issue.

The profile for the old plan is 23387ab0-cb1c-cd5e-449a-c9bcefc901c1. Both plans use the same commit; the old plan was created by disabling statistics. I have not included the plans in the Jira because Jira has a 32K maximum.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7183) TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled
Robert Hou created DRILL-7183:
-------------------------------------

Summary: TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled
Key: DRILL-7183
URL: https://issues.apache.org/jira/browse/DRILL-7183
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Hanumath Rao Maduri
Fix For: 1.16.0

Query 69 runs 150% slower when Statistics is disabled. Here is the query:
{noformat}
SELECT cd_gender, cd_marital_status, cd_education_status,
       count(*) cnt1, cd_purchase_estimate, count(*) cnt2,
       cd_credit_rating, count(*) cnt3
FROM customer c, customer_address ca, customer_demographics
WHERE c.c_current_addr_sk = ca.ca_address_sk
  AND ca_state IN ('KY', 'GA', 'NM')
  AND cd_demo_sk = c.c_current_cdemo_sk
  AND exists(SELECT * FROM store_sales, date_dim
             WHERE c.c_customer_sk = ss_customer_sk AND ss_sold_date_sk = d_date_sk
               AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2)
  AND (NOT exists(SELECT * FROM web_sales, date_dim
                  WHERE c.c_customer_sk = ws_bill_customer_sk AND ws_sold_date_sk = d_date_sk
                    AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2)
       AND NOT exists(SELECT * FROM catalog_sales, date_dim
                      WHERE c.c_customer_sk = cs_ship_customer_sk AND cs_sold_date_sk = d_date_sk
                        AND d_year = 2001 AND d_moy BETWEEN 4 AND 4 + 2))
GROUP BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating
ORDER BY cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating
LIMIT 100;
{noformat}

This regression is caused by commit 982e98061e029a39f1c593f695c0d93ec7079f0d, which should be reverted for now.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
Re: [ANNOUNCE] New PMC member: Sorabh Hamirwasia
Congratulations Sorabh! Thanks for your contributions.

--Robert

On Fri, Apr 5, 2019 at 4:57 PM weijie tong wrote:
> Congratulations Sorabh!
>
> On Sat, Apr 6, 2019 at 7:17 AM Sorabh Hamirwasia wrote:
> > Thank you everyone for your wishes!!
> >
> > Looking forward to everyone's help to vote on the release candidate next week :)
> >
> > Thanks,
> > Sorabh
> >
> > On Fri, Apr 5, 2019 at 2:12 PM Parth Chandra wrote:
> > > Congrats Sorabh. Just in time to manage the release!
> > >
> > > On Fri, Apr 5, 2019 at 9:06 AM Arina Ielchiieva wrote:
> > > > I am pleased to announce that Drill PMC invited Sorabh Hamirwasia to the PMC and he has accepted the invitation.
> > > >
> > > > Congratulations Sorabh and welcome!
> > > >
> > > > - Arina
> > > > (on behalf of Drill PMC)
[jira] [Created] (DRILL-7155) Create a standard logging message for batch sizes generated by individual operators
Robert Hou created DRILL-7155:
-------------------------------------

Summary: Create a standard logging message for batch sizes generated by individual operators
Key: DRILL-7155
URL: https://issues.apache.org/jira/browse/DRILL-7155
Project: Apache Drill
Issue Type: Task
Components: Execution - Relational Operators
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Robert Hou

QA reads log messages in drillbit.log to verify the sizes of data batches generated by individual operators. These log messages need to be standardized so that every operator emits the same message format, which allows the QA test framework to verify the information in each message.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
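One way to make such messages machine-checkable is a single parseable line per operator. The sketch below is hypothetical (the `BATCH_STATS` prefix and field names are illustrative, not Drill's actual format); it only shows how a fixed template lets a test framework emit and parse the same fields for every operator:

```python
import re

# Hypothetical one-line format for batch-size stats; every operator would
# log the same fields in the same order so tests can parse them uniformly.
TEMPLATE = ("BATCH_STATS operator={op} batches={batches} "
            "avg_batch_bytes={avg_batch_bytes} avg_row_bytes={avg_row_bytes} "
            "records={records}")

PATTERN = re.compile(
    r"BATCH_STATS operator=(?P<op>\S+) batches=(?P<batches>\d+) "
    r"avg_batch_bytes=(?P<avg_batch_bytes>\d+) avg_row_bytes=(?P<avg_row_bytes>\d+) "
    r"records=(?P<records>\d+)")

# Example numbers borrowed from the DRILL-7136 profile later in this digest.
line = TEMPLATE.format(op="HASH_PARTITION_SENDER", batches=813,
                       avg_batch_bytes=582_653, avg_row_bytes=18,
                       records=26_316_456)
stats = PATTERN.match(line).groupdict()
```

Verification in the test framework then reduces to comparing the parsed integers against the expected batch-size limits, instead of scraping operator-specific prose.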
[jira] [Created] (DRILL-7154) TPCH query 4 and 17 take longer with sf 1000 when Statistics are disabled
Robert Hou created DRILL-7154:
-------------------------------------

Summary: TPCH query 4 and 17 take longer with sf 1000 when Statistics are disabled
Key: DRILL-7154
URL: https://issues.apache.org/jira/browse/DRILL-7154
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Boaz Ben-Zvi
Fix For: 1.16.0
Attachments: 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5.sys.drill, 235a471b-aa97-bfb5-207d-3f25b4b5fbbb.sys.drill, hashagg.nostats.log, hashagg.stats.disabled.log

Here is TPCH 04 with sf 1000:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select *
    from lineitem l
    where l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by o.o_orderpriority
order by o.o_orderpriority;
{noformat}

TPCH query 4 takes 30% longer. The plan is the same, but the Hash Agg operator in the new plan takes longer. One possible reason is that it is not using as many buckets as the old plan did; the Hash Agg operator in the new plan also uses less memory than in the old plan.
Here is the old plan:
{noformat}
00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5645
00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5644
00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643
01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642
02-01 SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641
02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640
02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 375.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 5639
02-04 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 5638
03-01 HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 5637
03-02 Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636
03-03 Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5635
03-04 HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_order
[jira] [Created] (DRILL-7139) Date_add produces incorrect results when adding to a timestamp
Robert Hou created DRILL-7139:
-------------------------------------

Summary: Date_add produces incorrect results when adding to a timestamp
Key: DRILL-7139
URL: https://issues.apache.org/jira/browse/DRILL-7139
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Affects Versions: 1.15.0
Reporter: Robert Hou
Assignee: Pritesh Maker

I am using date_add() to create a sequence of timestamps:
{noformat}
select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107374,'M') as interval minute)) timestamp_id from (values(1));
+--------------------------+
|       timestamp_id       |
+--------------------------+
| 1970-01-25 20:31:12.704  |
+--------------------------+
1 row selected (0.121 seconds)
{noformat}

When I add one more minute, I get an older timestamp:
{noformat}
0: jdbc:drill:drillbit=10.10.51.5> select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107375,'M') as interval minute)) timestamp_id from (values(1));
+--------------------------+
|       timestamp_id       |
+--------------------------+
| 1969-12-07 03:29:25.408  |
+--------------------------+
1 row selected (0.126 seconds)
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
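The jump backwards is consistent with the interval's millisecond value overflowing signed 32-bit arithmetic somewhere in the conversion: 107,374 minutes is about 6.44e9 ms, well past the 2^31-1 ms limit. This is a hypothesis rather than a confirmed diagnosis of Drill's code, but the sketch below shows that 32-bit wraparound reproduces both observed results exactly:

```python
from datetime import datetime, timedelta

def wrap32(ms: int) -> int:
    """Interpret a millisecond count as a signed 32-bit integer (C-style overflow)."""
    ms &= 0xFFFFFFFF
    return ms - 2**32 if ms >= 2**31 else ms

EPOCH = datetime(1970, 1, 1)

def date_add_32bit(minutes: int) -> str:
    # Convert minutes to ms, then let the value wrap as a signed 32-bit int.
    wrapped_ms = wrap32(minutes * 60_000)
    return (EPOCH + timedelta(milliseconds=wrapped_ms)).isoformat(sep=" ", timespec="milliseconds")

print(date_add_32bit(107374))  # 1970-01-25 20:31:12.704
print(date_add_32bit(107375))  # 1969-12-07 03:29:25.408
```

107,375 minutes crosses 2^31 ms, so the wrapped value goes negative and the result lands before the epoch, exactly as reported.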
[jira] [Created] (DRILL-7136) Num_buckets for HashAgg in profile may be inaccurate
Robert Hou created DRILL-7136:
-------------------------------------

Summary: Num_buckets for HashAgg in profile may be inaccurate
Key: DRILL-7136
URL: https://issues.apache.org/jira/browse/DRILL-7136
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Pritesh Maker
Fix For: 1.16.0
Attachments: 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0.sys.drill

I ran TPCH query 17 with sf 1000. Here is the query:
{noformat}
select
  sum(l.l_extendedprice) / 7.0 as avg_yearly
from
  lineitem l,
  part p
where
  p.p_partkey = l.l_partkey
  and p.p_brand = 'Brand#13'
  and p.p_container = 'JUMBO CAN'
  and l.l_quantity < (
    select 0.2 * avg(l2.l_quantity)
    from lineitem l2
    where l2.l_partkey = p.p_partkey
  );
{noformat}

One of the hash agg operators has resized 6 times, so it should have 4M buckets, but the profile shows it has 64K buckets. I have attached a sample profile; in this profile, the hash agg operator is (04-02).
{noformat}
Operator Metrics
Minor Fragment | NUM_BUCKETS | NUM_ENTRIES | NUM_RESIZING | RESIZING_TIME_MS | NUM_PARTITIONS | SPILLED_PARTITIONS | SPILL_MB | SPILL_CYCLE | INPUT_BATCH_COUNT | AVG_INPUT_BATCH_BYTES | AVG_INPUT_ROW_BYTES | INPUT_RECORD_COUNT | OUTPUT_BATCH_COUNT | AVG_OUTPUT_BATCH_BYTES | AVG_OUTPUT_ROW_BYTES | OUTPUT_RECORD_COUNT
04-00-02       | 65,536      | 748,746     | 6            | 364              | 1              |                    | 582      | 0           | 813               | 582,653               | 18                  | 26,316,456         | 401                | 1,631,943              | 25                   | 26,176,350
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
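The expected 4M figure follows from a simple growth model: assuming the table starts at 64K buckets and doubles on each resize (a typical policy for this kind of hash table, and an assumption here rather than a fact from the source), six resizes should leave ~4M buckets, not the 65,536 the profile reports:

```python
# Assumed growth model: start at 64K buckets, double on each resize.
initial_buckets = 65_536
num_resizing = 6  # NUM_RESIZING from the profile above

final_buckets = initial_buckets * 2 ** num_resizing
print(f"{final_buckets:,}")  # 4,194,304 -- the ~4M the report expects
```

Under this model the profile's NUM_BUCKETS looks like the initial size rather than the post-resize size.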
[jira] [Resolved] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
[ https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou resolved DRILL-7132.
-------------------------------
Resolution: Not A Problem

> Metadata cache does not have correct min/max values for varchar and interval data types
> ---------------------------------------------------------------------------------------
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.14.0
> Reporter: Robert Hou
> Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
> The parquet metadata cache does not have correct min/max values for varchar and interval data types.
> I have attached a parquet file. Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 average: 67 total: 67 (raw data: 65 saving -3%)
> values: min: 1 max: 1 average: 1 total: 1
> uncompressed: min: 65 max: 65 average: 65 total: 65
> column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 average: 52 total: 52 (raw data: 50 saving -4%)
> values: min: 1 max: 1 average: 1 total: 1
> uncompressed: min: 50 max: 50 average: 50 total: 50
> column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
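The "Not A Problem" resolution makes sense once the cache values are read as base64: the metadata cache stores binary min/max statistics base64-encoded, and decoding the quoted values yields exactly what parquet-tools reports. A quick check:

```python
import base64

# Values copied from the metadata cache file quoted in the issue above.
varchar_min = base64.b64decode("aW9lZ2pOSkt2bmtk").decode("ascii")
interval_min = base64.b64decode("UDE4NTgyRA==").decode("ascii")

print(varchar_min)   # ioegjNJKvnkd -- matches the parquet-tools varchar statistic
print(interval_min)  # P18582D -- matches the parquet-tools interval statistic
```

So the cached min/max values are correct; they are just encoded differently than the parquet-tools display.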
[jira] [Created] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types
Robert Hou created DRILL-7132:
-------------------------------------

Summary: Metadata cache does not have correct min/max values for varchar and interval data types
Key: DRILL-7132
URL: https://issues.apache.org/jira/browse/DRILL-7132
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 1.14.0
Reporter: Robert Hou
Fix For: 1.17.0
Attachments: 0_0_10.parquet

The parquet metadata cache does not have correct min/max values for varchar and interval data types.

I have attached a parquet file. Here is what parquet tools shows for varchar:
[varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 average: 67 total: 67 (raw data: 65 saving -3%)
values: min: 1 max: 1 average: 1 total: 1
uncompressed: min: 65 max: 65 average: 65 total: 65
column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0

Here is what the metadata cache file shows:
"name" : [ "varchar_col" ],
"minValue" : "aW9lZ2pOSkt2bmtk",
"maxValue" : "aW9lZ2pOSkt2bmtk",
"nulls" : 0

Here is what parquet tools shows for interval:
[interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 average: 52 total: 52 (raw data: 50 saving -4%)
values: min: 1 max: 1 average: 1 total: 1
uncompressed: min: 50 max: 50 average: 50 total: 50
column values statistics: min: P18582D, max: P18582D, num_nulls: 0

Here is what the metadata cache file shows:
"name" : [ "interval_col" ],
"minValue" : "UDE4NTgyRA==",
"maxValue" : "UDE4NTgyRA==",
"nulls" : 0

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7122) TPCDS queries 29, 25, 17 are slower when Statistics is disabled.
Robert Hou created DRILL-7122:
-------------------------------------

Summary: TPCDS queries 29, 25, 17 are slower when Statistics is disabled.
Key: DRILL-7122
URL: https://issues.apache.org/jira/browse/DRILL-7122
Project: Apache Drill
Issue Type: Bug
Reporter: Robert Hou

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7123) TPCDS query 83 runs slower when Statistics is disabled
Robert Hou created DRILL-7123:
-------------------------------------

Summary: TPCDS query 83 runs slower when Statistics is disabled
Key: DRILL-7123
URL: https://issues.apache.org/jira/browse/DRILL-7123
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0

Query is TPCDS 83 with sf 100:
{noformat}
WITH sr_items AS
  (SELECT i_item_id item_id, Sum(sr_return_quantity) sr_item_qty
   FROM store_returns, item, date_dim
   WHERE sr_item_sk = i_item_sk
     AND d_date IN (SELECT d_date FROM date_dim
                    WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim
                                         WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
     AND sr_returned_date_sk = d_date_sk
   GROUP BY i_item_id),
cr_items AS
  (SELECT i_item_id item_id, Sum(cr_return_quantity) cr_item_qty
   FROM catalog_returns, item, date_dim
   WHERE cr_item_sk = i_item_sk
     AND d_date IN (SELECT d_date FROM date_dim
                    WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim
                                         WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
     AND cr_returned_date_sk = d_date_sk
   GROUP BY i_item_id),
wr_items AS
  (SELECT i_item_id item_id, Sum(wr_return_quantity) wr_item_qty
   FROM web_returns, item, date_dim
   WHERE wr_item_sk = i_item_sk
     AND d_date IN (SELECT d_date FROM date_dim
                    WHERE d_week_seq IN (SELECT d_week_seq FROM date_dim
                                         WHERE d_date IN ( '1999-06-30', '1999-08-28', '1999-11-18' )))
     AND wr_returned_date_sk = d_date_sk
   GROUP BY i_item_id)
SELECT sr_items.item_id,
       sr_item_qty,
       sr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 sr_dev,
       cr_item_qty,
       cr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 cr_dev,
       wr_item_qty,
       wr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 * 100 wr_dev,
       ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 average
FROM sr_items, cr_items, wr_items
WHERE sr_items.item_id = cr_items.item_id
  AND sr_items.item_id = wr_items.item_id
ORDER BY sr_items.item_id, sr_item_qty
LIMIT 100;
{noformat}

The number of threads for major fragments 1 and 2 has changed when Statistics is disabled: the number of minor fragments has been reduced from 10 and 15 down to 3. The rowcount for major fragment 2 has changed from 1439754.0 down to 287950.8.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
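The changed rowcount is exactly one fifth of the original, which is consistent with a flat 20% selectivity estimate being applied once statistics are disabled (the specific planner constant is an assumption here, not something confirmed by the report):

```python
# Rowcounts quoted from the DRILL-7123 report above.
old_rowcount = 1_439_754.0
new_rowcount = 287_950.8

# The ratio comes out to exactly 0.2, i.e. a flat 20% selectivity guess.
ratio = new_rowcount / old_rowcount
print(ratio)
```

A clean round ratio like this usually points at a default estimate rather than data-driven statistics.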
[jira] [Created] (DRILL-7121) TPCH 4 takes longer
Robert Hou created DRILL-7121:
-------------------------------------

Summary: TPCH 4 takes longer
Key: DRILL-7121
URL: https://issues.apache.org/jira/browse/DRILL-7121
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0

Here is TPCH 4 with sf 100:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select *
    from lineitem l
    where l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by o.o_orderpriority
order by o.o_orderpriority;
{noformat}

The plan has changed when Statistics is disabled. A Hash Agg and a Broadcast Exchange have been added. These two operators expand the number of rows from the lineitem table from 137M to 9B rows, which forces the hash join to use 6GB of memory instead of 30 MB.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7120) Query fails with ChannelClosedException
Robert Hou created DRILL-7120:
-------------------------------------

Summary: Query fails with ChannelClosedException
Key: DRILL-7120
URL: https://issues.apache.org/jira/browse/DRILL-7120
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0

TPCH query 5 fails at sf100. Here is the query:
{noformat}
select
  n.n_name,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
from
  customer c,
  orders o,
  lineitem l,
  supplier s,
  nation n,
  region r
where
  c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and l.l_suppkey = s.s_suppkey
  and c.c_nationkey = s.s_nationkey
  and s.s_nationkey = n.n_nationkey
  and n.n_regionkey = r.r_regionkey
  and r.r_name = 'EUROPE'
  and o.o_orderdate >= date '1997-01-01'
  and o.o_orderdate < date '1997-01-01' + interval '1' year
group by n.n_name
order by revenue desc;
{noformat}

This is the error from drillbit.log:
{noformat}
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.fragment.FragmentExecutor - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> FINISHED
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State to report: FINISHED
2019-03-04 18:17:51,454 [BitServer-13] WARN o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming stream due to memory limits. Current Allocation: 262144.
2019-03-04 18:17:51,454 [BitServer-13] ERROR o.a.drill.exec.rpc.data.DataServer - Out of memory in RPC layer.
2019-03-04 18:17:51,463 [BitServer-13] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.120.104:31012 <--> /10.10.120.106:53048 (data server). Closing connection.
io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271) ~[netty-codec-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.0.48.Final.jar:4.0.48.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-common-4.0.48.Final.jar:4.0.48.Final]
    at java.lang.Thread.run(Threa
[jira] [Created] (DRILL-7109) Statistics adds external sort, which spills to disk
Robert Hou created DRILL-7109:
-------------------------------------

Summary: Statistics adds external sort, which spills to disk
Key: DRILL-7109
URL: https://issues.apache.org/jira/browse/DRILL-7109
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0

TPCH query 4 with sf 100 runs many times slower. One issue is that an extra external sort has been added, and both external sorts spill to disk. Also, the hash join sees 100x more data. Here is the query:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select *
    from lineitem l
    where l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by o.o_orderpriority
order by o.o_orderpriority;
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-7108) Statistics adds two exchange operators
Robert Hou created DRILL-7108:
-------------------------------------

Summary: Statistics adds two exchange operators
Key: DRILL-7108
URL: https://issues.apache.org/jira/browse/DRILL-7108
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
Fix For: 1.16.0

TPCH 16 with sf 100 runs 14% slower. Here is the query:
{noformat}
select
  p.p_brand,
  p.p_type,
  p.p_size,
  count(distinct ps.ps_suppkey) as supplier_cnt
from
  partsupp ps,
  part p
where
  p.p_partkey = ps.ps_partkey
  and p.p_brand <> 'Brand#21'
  and p.p_type not like 'MEDIUM PLATED%'
  and p.p_size in (38, 2, 8, 31, 44, 5, 14, 24)
  and ps.ps_suppkey not in (
    select s.s_suppkey
    from supplier s
    where s.s_comment like '%Customer%Complaints%'
  )
group by p.p_brand, p.p_type, p.p_size
order by supplier_cnt desc, p.p_brand, p.p_type, p.p_size;
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-6957) Parquet rowgroup filtering can have incorrect file count
Robert Hou created DRILL-6957:
---------------------------------

             Summary: Parquet rowgroup filtering can have incorrect file count
                 Key: DRILL-6957
                 URL: https://issues.apache.org/jira/browse/DRILL-6957
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Robert Hou
            Assignee: Jean-Blas IMBERT

If a query accesses all the files, the Scan operator indicates that one file is accessed. The number of rowgroups is correct. Here is an example query:

{noformat}
select count(*) from dfs.`/custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120` where cur_tot_bal_amt < 100
{noformat}

Here is the plan:

{noformat}
Screen : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721446E9 rows, 4.35668337906E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4477
00-01 Project(EXPR$0=[$0]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721445E9 rows, 4.35668337905E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4476
00-02 StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721435E9 rows, 4.35668337895E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4475
00-03 UnionExchange : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721425E9 rows, 4.35668337775E10 cpu, 2.810763469E9 io, 4096.0 network, 0.0 memory}, id = 4474
01-01 StreamAgg(group=[{}], EXPR$0=[COUNT()]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {9.8376721415E9 rows, 4.35668337695E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4473
01-02 Project($f0=[0]) : rowType = RecordType(INTEGER $f0): rowcount = 1.4053817345E9, cumulative cost = {8.432290407E9 rows, 2.67022529555E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4472
01-03 SelectionVectorRemover : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 1.4053817345E9, cumulative cost = {7.0269086725E9 rows, 2.10807260175E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4471
01-04 Filter(condition=[<($0, 100)]) : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 1.4053817345E9, cumulative cost = {5.621526938E9 rows, 1.9675344283E10 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4470
01-05 Scan(table=[[dfs, /custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120]], selectionRoot=maprfs:/custdata/tudata/fact/vintage/snapshot_period_id=20151231/comp_id=120, numFiles=1, numRowGroups=1007, usedMetadataFile=false, columns=[`cur_tot_bal_amt`]]]) : rowType = RecordType(ANY cur_tot_bal_amt): rowcount = 2.810763469E9, cumulative cost = {2.810763469E9 rows, 2.810763469E9 cpu, 2.810763469E9 io, 0.0 network, 0.0 memory}, id = 4469
{noformat}

numFiles is set to 1 when it should be set to 21. All the files are in one directory. If I add a level of directories (i.e. a directory with multiple directories, each with files), then I get the correct file count.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
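The expected behavior is easy to state outside Drill: numFiles should count every Parquet file under the selection root, whether the layout is flat or nested. A minimal sketch (plain Python, not Drill's ParquetGroupScan code; the file names are made up):

```python
# Illustrative sketch, not Drill code: numFiles for a scan over a directory
# should equal the number of Parquet files under the selection root,
# regardless of whether they sit in one flat directory or in subdirectories.
import os
import tempfile

def count_parquet_files(selection_root):
    """Count .parquet files under a selection root, recursively."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(selection_root):
        total += sum(1 for f in filenames if f.endswith(".parquet"))
    return total

# Demo: 21 files in one flat directory, as in the bug report above.
root = tempfile.mkdtemp()
for i in range(21):
    open(os.path.join(root, "part-%02d.parquet" % i), "w").close()

print(count_parquet_files(root))  # 21 -- the expected numFiles, not 1
```

The bug report says the count is correct only once a second directory level is introduced, which suggests the flat-directory case takes a code path that records the selection root as a single "file".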
[jira] [Created] (DRILL-6906) File permissions are not being honored
Robert Hou created DRILL-6906:
---------------------------------

             Summary: File permissions are not being honored
                 Key: DRILL-6906
                 URL: https://issues.apache.org/jira/browse/DRILL-6906
             Project: Apache Drill
          Issue Type: Bug
          Components: Client - JDBC, Client - ODBC
    Affects Versions: 1.15.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker
             Fix For: 1.15.0

I ran sqlline as user "kuser1":

{noformat}
/opt/mapr/drill/drill-1.15.0.apache/bin/sqlline -u "jdbc:drill:drillbit=10.10.30.206" -n kuser1 -p mapr
{noformat}

I tried to access a file that is only accessible by root:

{noformat}
[root@perfnode206 drill-test-framework_krystal]# hf -ls /drill/testdata/impersonation/neg_tc5/student
-rwx------ 3 root root 64612 2018-06-19 10:30 /drill/testdata/impersonation/neg_tc5/student
{noformat}

I am able to read the table, which should not be possible. I used this commit for Drill 1.15:

{noformat}
git.commit.id=bf2b414ac62cfc515fdd77f2688bb110073d764d
git.commit.message.full=DRILL-6866\: Upgrade to SqlLine 1.6.0\n\n1. Changed SqlLine version to 1.6.0.\n2. Overridden new getVersion method in DrillSqlLineApplication.\n3. Set maxColumnWidth to 80 to avoid issue described in DRILL-6769.\n4. Changed colorScheme to obsidian.\n5. Output null value for varchar / char / boolean types as null instead of empty string.\n6. Changed access modifier from package default to public for JDBC classes that implement external interfaces to avoid issues when calling methods from these classes using reflection.\n\ncloses \#1556
{noformat}

This is from drillbit.log. It shows that the user is kuser1.
{noformat}
2018-12-15 05:00:52,516 [23eb04fb-1701-bea7-dd97-ecda58795b3b:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 23eb04fb-1701-bea7-dd97-ecda58795b3b: State change requested PREPARING --> PLANNING
2018-12-15 05:00:52,531 [23eb04fb-1701-bea7-dd97-ecda58795b3b:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query with id 23eb04fb-1701-bea7-dd97-ecda58795b3b issued by kuser1: select * from dfs.`/drill/testdata/impersonation/neg_tc5/student`
{noformat}

It is not clear to me whether this is a Drill problem or a file system problem. I tested MFS by logging in as kuser1 and trying to copy the file using "hadoop fs -copyToLocal /drill/testdata/impersonation/neg_tc5/student"; I got an error and was not able to copy the file. So I think MFS permissions are working.

I also tried with Drill 1.14, and I get the expected error:

{noformat}
0: jdbc:drill:drillbit=10.10.30.206> select * from dfs.`/drill/testdata/impersonation/neg_tc5/student` limit 1;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/drill/testdata/impersonation/neg_tc5/student' not found within 'dfs'
[Error Id: cdf18c2a-b005-4f92-b819-d4324e8807d9 on perfnode206.perf.lab:31010] (state=,code=0)
{noformat}

The commit for Drill 1.14 is:

{noformat}
git.commit.message.full=[maven-release-plugin] prepare release drill-1.14.0\n
git.commit.id=0508a128853ce796ca7e99e13008e49442f83147
{noformat}

This problem exists with both Apache JDBC and Simba ODBC.
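With impersonation, the check that should have blocked the read is a plain POSIX owner/group/mode test evaluated for the query user, not for the drillbit's own identity. A hedged sketch of that logic (illustrative Python, not Drill's implementation):

```python
# Illustrative sketch (assumed logic, not Drill's actual impersonation code):
# a read issued on behalf of "kuser1" must be checked against the file's
# owner, group, and mode bits before any data is returned.
import stat

def can_read(user, user_groups, owner, group, mode):
    """POSIX-style read check for the impersonated query user."""
    if user == owner:
        return bool(mode & stat.S_IRUSR)
    if group in user_groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# /drill/testdata/impersonation/neg_tc5/student is -rwx------ root:root,
# so kuser1 must be rejected -- reading it anyway is the bug.
print(can_read("kuser1", {"kuser1"}, "root", "root", 0o700))  # False
```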
[jira] [Created] (DRILL-6902) Extra limit operator is not needed
Robert Hou created DRILL-6902:
---------------------------------

             Summary: Extra limit operator is not needed
                 Key: DRILL-6902
                 URL: https://issues.apache.org/jira/browse/DRILL-6902
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.15.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker

For TPCDS query 49, there is an extra limit operator that is not needed. Here is the query:

{noformat}
SELECT 'web' AS channel,
       web.item,
       web.return_ratio,
       web.return_rank,
       web.currency_rank
FROM (SELECT item,
             return_ratio,
             currency_ratio,
             Rank() OVER ( ORDER BY return_ratio) AS return_rank,
             Rank() OVER ( ORDER BY currency_ratio) AS currency_rank
      FROM (SELECT ws.ws_item_sk AS item,
                   ( Cast(Sum(COALESCE(wr.wr_return_quantity, 0)) AS DEC(15, 4)) / Cast( Sum(COALESCE(ws.ws_quantity, 0)) AS DEC(15, 4)) ) AS return_ratio,
                   ( Cast(Sum(COALESCE(wr.wr_return_amt, 0)) AS DEC(15, 4)) / Cast( Sum( COALESCE(ws.ws_net_paid, 0)) AS DEC(15, 4)) ) AS currency_ratio
            FROM web_sales ws
                 LEFT OUTER JOIN web_returns wr
                   ON ( ws.ws_order_number = wr.wr_order_number
                        AND ws.ws_item_sk = wr.wr_item_sk ),
                 date_dim
            WHERE wr.wr_return_amt > 1
              AND ws.ws_net_profit > 1
              AND ws.ws_net_paid > 0
              AND ws.ws_quantity > 0
              AND ws_sold_date_sk = d_date_sk
              AND d_year = 1999
              AND d_moy = 12
            GROUP BY ws.ws_item_sk) in_web) web
WHERE ( web.return_rank <= 10
        OR web.currency_rank <= 10 )
UNION
SELECT 'catalog' AS channel,
       catalog.item,
       catalog.return_ratio,
       catalog.return_rank,
       catalog.currency_rank
FROM (SELECT item,
             return_ratio,
             currency_ratio,
             Rank() OVER ( ORDER BY return_ratio) AS return_rank,
             Rank() OVER ( ORDER BY currency_ratio) AS currency_rank
      FROM (SELECT cs.cs_item_sk AS item,
                   ( Cast(Sum(COALESCE(cr.cr_return_quantity, 0)) AS DEC(15, 4)) / Cast( Sum(COALESCE(cs.cs_quantity, 0)) AS DEC(15, 4)) ) AS return_ratio,
                   ( Cast(Sum(COALESCE(cr.cr_return_amount, 0)) AS DEC(15, 4)) / Cast(Sum( COALESCE(cs.cs_net_paid, 0)) AS DEC(15, 4)) ) AS currency_ratio
            FROM catalog_sales cs
                 LEFT OUTER JOIN catalog_returns cr
                   ON (
cs.cs_order_number = cr.cr_order_number AND cs.cs_item_sk = cr.cr_item_sk ), date_dim WHERE cr.cr_return_amount > 1 AND cs.cs_net_profit > 1 AND cs.cs_net_paid > 0 AND cs.cs_quantity > 0 AND cs_sold_date_sk = d_date_sk AND d_year = 1999 AND d_moy = 12 GROUP BY cs.cs_item_sk) in_cat) catalog WHERE ( catalog.return_rank <= 10 OR catalog.currency_rank <= 10 ) UNION SELECT 'store' AS channel, store.item, store.return_ratio, store.return_rank, store.currency_rank FROM (SELECT item, return_ratio, currency_ratio, Rank() OVER ( ORDER BY return_ratio) AS return_rank, Rank() OVER ( ORDER BY currency_ratio) AS currency_rank FROM (SELECT sts.ss_item_sk AS item, ( Cast(S
[jira] [Created] (DRILL-6897) TPCH 13 has regressed
Robert Hou created DRILL-6897:
---------------------------------

             Summary: TPCH 13 has regressed
                 Key: DRILL-6897
                 URL: https://issues.apache.org/jira/browse/DRILL-6897
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.15.0
            Reporter: Robert Hou
            Assignee: Karthikeyan Manivannan
         Attachments: 240099ed-ef2a-a23a-4559-f1b2e0809e72.sys.drill, 2400be84-c024-cb92-8743-3211589e0247.sys.drill

I ran TPCH query 13 with both scale factor 100 and 1000. I ran each query 3x to get a warm start, and then ran it twice more to verify the regression. It is regressing between 26 and 33%. Here is the query:

{noformat}
select c_count, count(*) as custdist
from (
  select c.c_custkey, count(o.o_orderkey)
  from customer c
    left outer join orders o
      on c.c_custkey = o.o_custkey
      and o.o_comment not like '%special%requests%'
  group by c.c_custkey
) as orders (c_custkey, c_count)
group by c_count
order by custdist desc, c_count desc;
{noformat}

I have attached two profiles. 240099ed-ef2a-a23a-4559-f1b2e0809e72 is for Drill 1.15. 2400be84-c024-cb92-8743-3211589e0247 is for Drill 1.14. The commit for Drill 1.15 is 596227bbbecfb19bdb55dd8ea58159890f83bc9c. The commit for Drill 1.14 is 0508a128853ce796ca7e99e13008e49442f83147.

The two plans are nearly the same. One difference is that Drill 1.15 uses four times more memory in operator 07-01 Unordered Mux Exchange. I think the problem may be in operator 09-01 Project: Drill 1.15 projects the comment field while Drill 1.14 does not.

Another issue is that Drill 1.15 takes more processing time to filter the orders table. Filter operator 09-03 takes an average of 19.3s. For Drill 1.14, filter operator 09-04 takes an average of 15.6s. They process the same number of rows and have the same number of minor fragments.
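For reference, the regression figure quoted above is the usual percent change of elapsed time against the 1.14 baseline; a one-liner makes the arithmetic explicit (the 100 s / 126 s pair is an invented example, the 15.6 s / 19.3 s pair is the Filter operator timing from this report):

```python
# Percent regression = (new - old) / old * 100, relative to the old release.
def regression_pct(old_secs, new_secs):
    return (new_secs - old_secs) / old_secs * 100.0

# Invented example: 100 s on 1.14 vs 126 s on 1.15 is a 26% regression.
print(round(regression_pct(100.0, 126.0)))  # 26

# Filter operator times from the report: 15.6 s -> 19.3 s.
print(round(regression_pct(15.6, 19.3), 1))  # 23.7
```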
Re: [ANNOUNCE] New Committer: Karthikeyan Manivannan
Congratulations, Karthik! Thanks for all your contributions. --Robert On Fri, Dec 7, 2018 at 11:15 PM weijie tong wrote: > Congratulations Karthik ! > > On Sat, Dec 8, 2018 at 12:10 PM Karthikeyan Manivannan < > kmanivan...@mapr.com> > wrote: > > > Thanks! In addition to all you wonderful Drillers, I would also like to > > thank Google, StackOverflow and Larry Tesler > > < > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.indiatoday.in_education-2Dtoday_gk-2Dcurrent-2Daffairs_story_copy-2Dpaste-2Dinventor-2D337401-2D2016-2D08-2D26=DwIBaQ=cskdkSMqhcnjZxdQVpwTXg=GXRJhB4g1YFDJsrcglHwUA=unIwO2bGiU-CmEDMlh04j5SH0l7I9oQQysVWsBaBe2o=v_edFrOdFEaw0rIWpVS2PNSEJjUlIq28Kh0O3ULmBPE= > > > > > . > > > > On Fri, Dec 7, 2018 at 3:59 PM Padma Penumarthy < > > penumarthy.pa...@gmail.com> > > wrote: > > > > > Congrats Karthik. > > > > > > Thanks > > > Padma > > > > > > > > > On Fri, Dec 7, 2018 at 1:33 PM Paul Rogers > > > wrote: > > > > > > > Congrats Karthik! > > > > > > > > - Paul > > > > > > > > Sent from my iPhone > > > > > > > > > On Dec 7, 2018, at 11:12 AM, Abhishek Girish > > > wrote: > > > > > > > > > > Congratulations Karthik! > > > > > > > > > >> On Fri, Dec 7, 2018 at 11:11 AM Arina Ielchiieva < > ar...@apache.org> > > > > wrote: > > > > >> > > > > >> The Project Management Committee (PMC) for Apache Drill has > invited > > > > >> Karthikeyan > > > > >> Manivannan to become a committer, and we are pleased to announce > > that > > > he > > > > >> has accepted. > > > > >> > > > > >> Karthik started contributing to the Drill project in 2016. He has > > > > >> implemented changes in various Drill areas, including batch > sizing, > > > > >> security, code-gen, C++ part. One of his latest improvements is > ACL > > > > >> support for Drill ZK nodes. > > > > >> > > > > >> Welcome Karthik, and thank you for your contributions! > > > > >> > > > > >> - Arina > > > > >> (on behalf of Drill PMC) > > > > >> > > > > > > > > > >
[jira] [Resolved] (DRILL-6828) Hit UnrecognizedPropertyException when run tpch queries
[ https://issues.apache.org/jira/browse/DRILL-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Hou resolved DRILL-6828.
-------------------------------
    Resolution: Cannot Reproduce

> Hit UnrecognizedPropertyException when run tpch queries
> -------------------------------------------------------
>
>                 Key: DRILL-6828
>                 URL: https://issues.apache.org/jira/browse/DRILL-6828
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.15.0
>         Environment: RHEL 7, Apache Drill commit id: 18e09a1b1c801f2691a05ae7db543bf71874cfea
>            Reporter: Dechang Gu
>            Assignee: Robert Hou
>            Priority: Blocker
>             Fix For: 1.15.0
>
> Installed Apache Drill 1.15.0 commit id: 18e09a1b1c801f2691a05ae7db543bf71874cfea DRILL-6763: Codegen optimization of SQL functions with constant values (\#1481)
> Hit the following errors:
> {code}
> java.sql.SQLException: SYSTEM ERROR: UnrecognizedPropertyException: Unrecognized field "outgoingBatchSize" (class org.apache.drill.exec.physical.config.HashPartitionSender), not marked as ignorable (9 known properties: "receiver-major-fragment", "initialAllocation", "expr", "userName", "@id", "child", "cost", "destinations", "maxAllocation"])
> at [Source: (StringReader); line: 1000, column: 29] (through reference chain: org.apache.drill.exec.physical.config.HashPartitionSender["outgoingBatchSize"])
> Fragment 3:175
> Please, refer to logs for more information.
> [Error Id: cc023cdb-9a46-4edd-ad0b-6da1e9085291 on ucs-node6.perf.lab:31010] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) > at > org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:600) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1288) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1109) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1120) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:196) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) > at PipSQueak.executeQuery(PipSQueak.java:289) > at PipSQueak.runTest(PipSQueak.java:104) > at PipSQueak.main(PipSQueak.java:477) > Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM > ERROR: UnrecognizedPropertyException: Unrecognized field "outgoingBatchSize" > (class org.apache.drill.exec.physical.config.HashPartitionSender), not marked > as ignorable (9 known properties: "receiver-major-fragment", > "initialAllocation", "expr", "userName", "@id", "child", "cost", > "destinations", "maxAllocation"]) > at [Source: (StringReader); line: 1000, column: 29] (through reference > chain: > org.apache.drill.exec.physical.config.HashPartitionSender["outgoingBatchSize"]) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
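The failure mode in this ticket is generic to strict deserializers: one side serializes a field ("outgoingBatchSize") that the other side's physical-plan deserializer does not recognize. Drill uses Jackson in Java for this; the strict-versus-lenient choice can be sketched with a Python analog (illustrative only, not Drill code):

```python
# Illustrative analog of the UnrecognizedPropertyException above (Drill uses
# Jackson in Java; this is a generic Python sketch). A strict deserializer
# rejects fields it does not know, which is what the error reports; a lenient
# one drops them, which keeps mismatched sender/receiver versions working.
KNOWN_PROPERTIES = {"receiver-major-fragment", "initialAllocation", "expr",
                    "userName", "@id", "child", "cost", "destinations",
                    "maxAllocation"}

def deserialize(payload, ignore_unknown=False):
    unknown = set(payload) - KNOWN_PROPERTIES
    if unknown and not ignore_unknown:
        raise ValueError("Unrecognized field(s): %s" % sorted(unknown))
    return {k: v for k, v in payload.items() if k in KNOWN_PROPERTIES}

payload = {"@id": 3, "outgoingBatchSize": 16777216}
try:
    deserialize(payload)  # strict: fails like the error in the report
except ValueError as e:
    print(e)
print(deserialize(payload, ignore_unknown=True))  # lenient: {'@id': 3}
```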
[jira] [Resolved] (DRILL-6567) Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.
[ https://issues.apache.org/jira/browse/DRILL-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6567. --- Resolution: Fixed Assignee: Vitalii Diravka (was: Robert Hou) > Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: > java.lang.reflect.UndeclaredThrowableException. > --- > > Key: DRILL-6567 > URL: https://issues.apache.org/jira/browse/DRILL-6567 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Robert Hou >Assignee: Vitalii Diravka >Priority: Critical > Fix For: 1.15.0 > > > This is TPCDS Query 93. > Query: > /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query93.sql > SELECT ss_customer_sk, > Sum(act_sales) sumsales > FROM (SELECT ss_item_sk, > ss_ticket_number, > ss_customer_sk, > CASE > WHEN sr_return_quantity IS NOT NULL THEN > ( ss_quantity - sr_return_quantity ) * ss_sales_price > ELSE ( ss_quantity * ss_sales_price ) > END act_sales > FROM store_sales > LEFT OUTER JOIN store_returns > ON ( sr_item_sk = ss_item_sk > AND sr_ticket_number = ss_ticket_number ), > reason > WHERE sr_reason_sk = r_reason_sk > AND r_reason_desc = 'reason 38') t > GROUP BY ss_customer_sk > ORDER BY sumsales, > ss_customer_sk > LIMIT 100; > Here is the stack trace: > 2018-06-29 07:00:32 INFO DrillTestLogger:348 - > Exception: > java.sql.SQLException: INTERNAL_ERROR ERROR: > java.lang.reflect.UndeclaredThrowableException > Setup failed for null > Fragment 4:56 > [Error Id: 3c72c14d-9362-4a9b-affb-5cf937bed89e on atsqa6c82.qa.lab:31010] > (org.apache.drill.common.exceptions.ExecutionSetupException) > java.lang.reflect.UndeclaredThrowableException > > org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30 > org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():327 > org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245 > 
org.apache.drill.exec.physical.impl.ScanBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 > org.apache.drill.exec.record.AbstractRecordBatch.next():172 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 > > org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 > org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 > org.apache.drill.exec.record.AbstractRecordBatch.next():172 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > > org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():118 > org.apache.drill.exec.record.AbstractRecordBatch.next():152 > org.apache.drill.exec.physical.impl.BaseRootExec.next():103 > > org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():152 > org.apache.drill.exec.physical.impl.BaseRootExec.next():93 > 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 > java.security.AccessController.doPrivileged():-2 > javax.security.auth.Subject.doAs():422 > org.apache.hadoop.security.UserGroupInformation.doAs():1595 > org.
Re: [ANNOUNCE] New Committer: Hanumath Rao Maduri
Congratulations, Hanu. Thanks for contributing to Drill. --Robert On Thu, Nov 1, 2018 at 4:06 PM Jyothsna Reddy wrote: > Congrats Hanu!! Well deserved :D > > Thank you, > Jyothsna > > On Thu, Nov 1, 2018 at 2:15 PM Sorabh Hamirwasia > wrote: > > > Congratulations Hanu! > > > > Thanks, > > Sorabh > > > > On Thu, Nov 1, 2018 at 1:35 PM Hanumath Rao Maduri > > wrote: > > > > > Thank you all for the wishes! > > > > > > Thanks, > > > -Hanu > > > > > > On Thu, Nov 1, 2018 at 1:28 PM Chunhui Shi > > .invalid> > > > wrote: > > > > > > > Congratulations Hanu! > > > > -- > > > > From:Arina Ielchiieva > > > > Send Time:2018 Nov 1 (Thu) 06:05 > > > > To:dev ; user > > > > Subject:[ANNOUNCE] New Committer: Hanumath Rao Maduri > > > > > > > > The Project Management Committee (PMC) for Apache Drill has invited > > > > Hanumath > > > > Rao Maduri to become a committer, and we are pleased to announce that > > he > > > > has accepted. > > > > > > > > Hanumath became a contributor in 2017, making changes mostly in the > > Drill > > > > planning side, including lateral / unnest support. He is also one of > > the > > > > contributors of index based planning and execution support. > > > > > > > > Welcome Hanumath, and thank you for your contributions! > > > > > > > > - Arina > > > > (on behalf of Drill PMC) > > > > > > > > > >
[jira] [Created] (DRILL-6787) Update Spnego webpage
Robert Hou created DRILL-6787:
---------------------------------

             Summary: Update Spnego webpage
                 Key: DRILL-6787
                 URL: https://issues.apache.org/jira/browse/DRILL-6787
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Bridget Bevens
             Fix For: 1.15.0

A few things should be updated on this webpage: https://drill.apache.org/docs/configuring-drill-to-use-spnego-for-http-authentication/

When configuring drillbits in drill-override.conf, the principal and keytab should be corrected. There are two places where this should be corrected.

{noformat}
drill.exec.http: {
  auth.spnego.principal: "HTTP/hostname@realm",
  auth.spnego.keytab: "path/to/keytab",
  auth.mechanisms: ["SPNEGO"]
}
{noformat}

For the section on Chrome, we should change "hostname/domain" to "domain" (or "hostname@domain"). Also, the two blanks around the "=" should be removed:

{noformat}
google-chrome --auth-server-whitelist="hostname/domain"
{noformat}

Also, for the section on Chrome, the "domain" should match the URL given to Chrome to access the Web UI.

Linux and Mac should be treated in separate paragraphs. These should be the directions for Mac:

{noformat}
cd /Applications/Google Chrome.app/Contents/MacOS
./"Google Chrome" --auth-server-whitelist="example.com"
{noformat}
Re: [ANNOUNCE] New Committer: Chunhui Shi
Congratulations, Chun-hui. Thanks for contributing to Drill. --Robert On Sat, Sep 29, 2018 at 11:47 AM rahul challapalli < challapallira...@gmail.com> wrote: > Congratulations Chunhui! > > On Sat, Sep 29, 2018, 11:39 AM Kunal Khatua wrote: > > > Congratulations, Chunhui !! > > On 9/28/2018 7:31:44 PM, Chunhui Shi > > wrote: > > Thank you Arina, PMCs, and every driller friends! I deeply appreciate the > > opportunity to be part of this global growing community of awesome > > developers. > > > > Best regards, > > Chunhui > > > > > > -- > > From:Arina Ielchiieva > > Send Time:2018 Sep 28 (Fri) 02:17 > > To:dev ; user > > Subject:[ANNOUNCE] New Committer: Chunhui Shi > > > > The Project Management Committee (PMC) for Apache Drill has invited > Chunhui > > Shi to become a committer, and we are pleased to announce that he has > > accepted. > > > > Chunhui Shi has become a contributor since 2016, making changes in > various > > Drill areas. He has shown profound knowledge in Drill planning side > during > > his work to support lateral join. He is also one of the contributors of > the > > upcoming feature to support index based planning and execution. > > > > Welcome Chunhui, and thank you for your contributions! > > > > - Arina > > (on behalf of Drill PMC) > > > > >
Re: [ANNOUNCE] New Committer: Weijie Tong
Congrats Weijie! Thanks for working on Drill. --Robert On Fri, Aug 31, 2018 at 1:38 PM, Boaz Ben-Zvi wrote: >Congrat.s Weijie - and thanks for implementing the Bloom Filters fro > Drill . > > Boaz > > > On 8/31/18 1:04 PM, Aman Sinha wrote: > >> Congratulations Weijie ! Thanks for your contributions. >> >> On Fri, Aug 31, 2018 at 11:58 AM salim achouche >> wrote: >> >> Congrats Weijie! >>> >>> On Fri, Aug 31, 2018 at 10:28 AM Paul Rogers >>> wrote: >>> >>> Congratulations Weijie, thanks for your contributions to Drill. Thanks, - Paul On Friday, August 31, 2018, 8:51:30 AM PDT, Arina Ielchiieva < ar...@apache.org> wrote: The Project Management Committee (PMC) for Apache Drill has invited >>> Weijie >>> Tong to become a committer, and we are pleased to announce that he has accepted. Weijie Tong has become a very active contributor to Drill in recent >>> months. >>> He contributed the Join predicate push down feature which will be >>> available >>> in Apache Drill 1.15. The feature is non trivial and has covered changes to all aspects of Drill: RPC layer, Planning, and Execution. Welcome Weijie, and thank you for your contributions! - Arina (on behalf of Drill PMC) >>> >>> -- >>> Regards, >>> Salim >>> >>> >
[jira] [Created] (DRILL-6726) Drill should return a better error message when a view uses a table that has a mixed case schema
Robert Hou created DRILL-6726:
---------------------------------

             Summary: Drill should return a better error message when a view uses a table that has a mixed case schema
                 Key: DRILL-6726
                 URL: https://issues.apache.org/jira/browse/DRILL-6726
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Arina Ielchiieva
             Fix For: 1.15.0

Drill 1.14 changed schemas to be case-insensitive (DRILL-6492). If a view references a schema which has upper case letters, the view needs to be rebuilt. For example:

{noformat}
create or replace view `dfs.drillTestDirP1`.student_parquet_v as select * from `dfs.drillTestDirP1`.student;
{noformat}

If a query references this schema, Drill will return an exception:

{noformat}
java.sql.SQLException: VALIDATION ERROR: Failure while attempting to expand view. Requested schema drillTestDirP1 not available in schema dfs.
{noformat}

It would be helpful to users if the error message explains that these views need to be re-created.
[jira] [Created] (DRILL-6725) Views cannot use tables with mixed case schemas
Robert Hou created DRILL-6725:
---------------------------------

             Summary: Views cannot use tables with mixed case schemas
                 Key: DRILL-6725
                 URL: https://issues.apache.org/jira/browse/DRILL-6725
             Project: Apache Drill
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Bridget Bevens
             Fix For: 1.14.0

Drill 1.14 changed schemas to be case-insensitive (DRILL-6492). If a view references a schema which has upper case letters, the view needs to be rebuilt. For example:

{noformat}
create or replace view `dfs.drillTestDirP1`.student_parquet_v as select * from `dfs.drillTestDirP1`.student;
{noformat}

Do we have release notes? If so, this should be documented.
Re: [ANNOUNCE] New PMC member: Volodymyr Vysotskyi
Congratulations, Volodymyr! Thank you for all your work on Drill. On Sat, Aug 25, 2018 at 2:38 PM, Timothy Farkas wrote: > Congratulations, Volodymyr! > > On Sat, Aug 25, 2018 at 9:00 AM, Kunal Khatua wrote: > > > Congratulations, Volodymyr! > > On 8/25/2018 6:32:07 AM, weijie tong wrote: > > Congratulations Volodymyr! > > > > On Sat, Aug 25, 2018 at 8:30 AM salim achouche wrote: > > > > > Congrats Volodymyr! > > > > > > On Fri, Aug 24, 2018 at 11:32 AM Gautam Parai wrote: > > > > > > > Congratulations Vova! > > > > > > > > Gautam > > > > > > > > On Fri, Aug 24, 2018 at 10:59 AM, Khurram Faraaz > > > wrote: > > > > > > > > > Congratulations Volodymyr! > > > > > > > > > > Regards, > > > > > Khurram > > > > > > > > > > On Fri, Aug 24, 2018 at 10:25 AM, Hanumath Rao Maduri > > > > hanu@gmail.com> > > > > > wrote: > > > > > > > > > > > Congratulations Volodymyr! > > > > > > > > > > > > Thanks, > > > > > > -Hanu > > > > > > > > > > > > On Fri, Aug 24, 2018 at 10:22 AM Paul Rogers > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Congratulations Volodymyr! > > > > > > > Thanks, > > > > > > > - Paul > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Friday, August 24, 2018, 5:53:25 AM PDT, Arina Ielchiieva > > > > > > > ar...@apache.org> wrote: > > > > > > > > > > > > > > I am pleased to announce that Drill PMC invited Volodymyr > > > Vysotskyi > > > > to > > > > > > the > > > > > > > PMC and he has accepted the invitation. > > > > > > > > > > > > > > Congratulations Vova and thanks for your contributions! > > > > > > > > > > > > > > - Arina > > > > > > > (on behalf of Drill PMC) > > > > > > > > > > > > > > > > > > > > > > > > > > > >
[jira] [Created] (DRILL-6710) Drill C++ Client does not handle scale = 0 properly for decimal
Robert Hou created DRILL-6710:
---------------------------------

             Summary: Drill C++ Client does not handle scale = 0 properly for decimal
                 Key: DRILL-6710
                 URL: https://issues.apache.org/jira/browse/DRILL-6710
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Sorabh Hamirwasia
             Fix For: 1.15.0

The query is:

{noformat}
select cast('99' as decimal(18,0)) + cast('9' as decimal(38,0)) from data limit 1
{noformat}

This is the error I get when my test program calls SQLExecDirect:

{noformat}
The driver reported the following diagnostics whilst running SQLExecDirect

HY000:1:40140:[MapR][Support] (40140) Scale can't be less than zero.
{noformat}
[jira] [Created] (DRILL-6709) Batch statistics logging utility needs to be extended to mid-stream operators
Robert Hou created DRILL-6709:
---------------------------------

             Summary: Batch statistics logging utility needs to be extended to mid-stream operators
                 Key: DRILL-6709
                 URL: https://issues.apache.org/jira/browse/DRILL-6709
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: salim achouche
             Fix For: 1.15.0

A new batch logging utility has been created to log batch sizing messages to drillbit.log. It is being used by the Parquet reader. It needs to be enhanced so it can be used by mid-stream operators. In particular, mid-stream operators have both incoming batches and outgoing batches, while Parquet only has outgoing batches. So the utility needs to support incoming batches.
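The requested enhancement can be pictured with a toy logger (hypothetical names; Drill's actual utility is Java and logs through the drillbit logger): one call site that labels each report as incoming or outgoing, so a mid-stream operator can report both sides while a scan-like operator such as the Parquet reader only ever reports outgoing batches.

```python
# Toy sketch of the requested enhancement (hypothetical API, not Drill's
# actual batch-stats utility): a logger that tags each batch report with a
# direction, so mid-stream operators can log both incoming and outgoing
# batches through the same interface.
class BatchStatsLogger:
    def __init__(self, operator_name):
        self.operator_name = operator_name
        self.lines = []

    def log_batch(self, direction, records, data_size):
        # direction is "incoming" or "outgoing"; a scan-like operator
        # would only ever pass "outgoing".
        assert direction in ("incoming", "outgoing")
        line = "BATCH_STATS, %s: %s { Records: %d, Data size: %d }" % (
            direction, self.operator_name, records, data_size)
        self.lines.append(line)
        return line

log = BatchStatsLogger("ProjectRecordBatch")
print(log.log_batch("incoming", 6, 30))        # mid-stream: input side
print(log.log_batch("outgoing", 16383, 409575))  # mid-stream: output side
```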
Re: [ANNOUNCE] New PMC member: Boaz Ben-Zvi
Congratulations, Boaz! Thanks for working on Drill. --Robert On Fri, Aug 17, 2018 at 4:45 PM, Padma Penumarthy < penumarthy.pa...@gmail.com> wrote: > Congratulations Boaz. > > Thanks > Padma > > > On Fri, Aug 17, 2018 at 2:33 PM, Robert Wu wrote: > > > Congratulations, Boaz! > > > > Best regards, > > > > Rob > > > > -Original Message- > > From: Abhishek Girish > > Sent: Friday, August 17, 2018 2:17 PM > > To: dev > > Subject: Re: [ANNOUNCE] New PMC member: Boaz Ben-Zvi > > > > Congratulations, Boaz! > > > > On Fri, Aug 17, 2018 at 2:15 PM Sorabh Hamirwasia > > wrote: > > > > > Congratulations Boaz! > > > > > > On Fri, Aug 17, 2018 at 11:42 AM, Karthikeyan Manivannan < > > > kmanivan...@mapr.com> wrote: > > > > > > > Congrats! Well deserved! > > > > > > > > On Fri, Aug 17, 2018, 11:31 AM Timothy Farkas > > wrote: > > > > > > > > > Congrats! > > > > > > > > > > On Fri, Aug 17, 2018 at 11:27 AM, Gautam Parai > > > wrote: > > > > > > > > > > > Congratulations Boaz!! > > > > > > > > > > > > Gautam > > > > > > > > > > > > On Fri, Aug 17, 2018 at 11:04 AM, Khurram Faraaz > > > > > > > > > > > wrote: > > > > > > > > > > > > > Congratulations Boaz. > > > > > > > > > > > > > > On Fri, Aug 17, 2018 at 10:47 AM, shi.chunhui < > > > > > > > shi.chun...@aliyun.com.invalid> wrote: > > > > > > > > > > > > > > > Congrats Boaz! > > > > > > > > > > > -- > > > > > > > > Sender:Arina Ielchiieva Sent at:2018 Aug > > > > > > > > 17 (Fri) 17:51 To:dev ; user > > > > > > > > Subject:[ANNOUNCE] New PMC member: > > > > > > > > Boaz Ben-Zvi > > > > > > > > > > > > > > > > I am pleased to announce that Drill PMC invited Boaz Ben-Zvi > > > > > > > > to > > > the > > > > > PMC > > > > > > > and > > > > > > > > he has accepted the invitation. > > > > > > > > > > > > > > > > Congratulations Boaz and thanks for your contributions! > > > > > > > > > > > > > > > > - Arina > > > > > > > > (on behalf of Drill PMC) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
[jira] [Created] (DRILL-6688) Data batches for Project operator exceed the maximum specified
Robert Hou created DRILL-6688:
---------------------------------

             Summary: Data batches for Project operator exceed the maximum specified
                 Key: DRILL-6688
                 URL: https://issues.apache.org/jira/browse/DRILL-6688
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Karthikeyan Manivannan
             Fix For: 1.15.0

I ran this query:

{noformat}
alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072;
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.width.max_per_query` = 1;
select chr(101) CharacterValuea, chr(102) CharacterValueb, chr(103) CharacterValuec, chr(104) CharacterValued, chr(105) CharacterValuee from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet`;
{noformat}

The output has 1024 identical lines:

{noformat}
e f g h i
{noformat}

There is one incoming batch:

{noformat}
2018-08-09 15:50:14,794 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming: Batch size: { Records: 6, Total size: 0, Data size: 30, Gross row width: 0, Net row width: 5, Density: 0% }
Batch schema & sizes: { `_DEFAULT_COL_TO_READ_`(type: OPTIONAL INT, count: 6, Per entry: std data size: 4, std net size: 5, actual data size: 4, actual net size: 5 Totals: data size: 24, net size: 30) } }
{noformat}

There are four outgoing batches. All are too large.
The first three look like this:

2018-08-09 15:50:14,799 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing:
  Batch size: { Records: 16383, Total size: 0, Data size: 409575, Gross row width: 0, Net row width: 25, Density: 0% }
  Batch schema & sizes: {
    CharacterValuea(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) }
    CharacterValueb(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) }
    CharacterValuec(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) }
    CharacterValued(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) }
    CharacterValuee(type: REQUIRED VARCHAR, count: 16383, Per entry: std data size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: data size: 16383, net size: 81915) }
  }

The last batch is smaller because it has the remaining records. The data size (409575) exceeds the maximum batch size (131072).

character415.q

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
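As a side note, the figures quoted in this report are internally consistent and can be checked with a few lines of arithmetic. The 131072-byte limit and the 25-byte net row width come from the session option and the BATCH_STATS log above; the rows-per-batch rule used here is only an illustration, not Drill's actual ProjectMemoryManager logic.

```python
# Sanity-check the batch sizing reported in DRILL-6688.
# Values are taken from the log lines in the report above.

batch_size_limit = 131072   # drill.exec.memory.operator.project.output_batch_size
net_row_width = 25          # bytes per row across the five VARCHAR columns
records_emitted = 16383     # rows in each of the first three outgoing batches

data_size = records_emitted * net_row_width
print(data_size)                      # 409575, matching "Data size: 409575"
print(data_size > batch_size_limit)   # True: the batch overshoots the limit

# Rows that would actually fit under the configured limit:
print(batch_size_limit // net_row_width)  # 5242
```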
[jira] [Created] (DRILL-6682) Cast integer to binary returns incorrect result
Robert Hou created DRILL-6682:
---------------------------------

             Summary: Cast integer to binary returns incorrect result
                 Key: DRILL-6682
                 URL: https://issues.apache.org/jira/browse/DRILL-6682
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.12.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker


This query returns an empty binary string:

select cast(123 as binary) from (values(1));

The same problem occurs for bigint, float and double. Casting works if the data type is date, time, timestamp, interval, varchar and binary.

select cast(date '2018-08-10' as binary) from (values(1));

select length(string_binary(cast(123 as binary))), length(string_binary(cast(date '2018-08-10' as binary))) from (values(1));
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| 0       | 10      |
+---------+---------+

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
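Judging from the working date case above (10 bytes for '2018-08-10', i.e. the length of its string form), the cast to binary appears intended to yield the value's string representation as bytes, so `cast(123 as binary)` would be expected to give 3 bytes rather than 0. A small sketch of that expectation, with the caveat that this models the inferred intent only, not Drill's actual cast implementation:

```python
# Hypothetical model of what "cast(<value> as binary)" should return,
# inferred from the date example in the report (length 10 for '2018-08-10').
# This is NOT Drill's cast code; it only illustrates the expected lengths.

def expected_binary_length(value) -> int:
    """Length of the value's string representation encoded as bytes."""
    return len(str(value).encode("utf-8"))

print(expected_binary_length("2018-08-10"))  # 10 -> matches EXPR$1 above
print(expected_binary_length(123))           # 3  -> but Drill returns 0 (the bug)
```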
[jira] [Created] (DRILL-6623) Drill encounters exception IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))
Robert Hou created DRILL-6623:
---------------------------------

             Summary: Drill encounters exception IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))
                 Key: DRILL-6623
                 URL: https://issues.apache.org/jira/browse/DRILL-6623
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker


This is the query:

alter session set `planner.width.max_per_node` = 1;
alter session set `planner.width.max_per_query` = 1;
select * from (
  select split_part(CharacterValuea, '8', 1) CharacterValuea,
         split_part(CharacterValueb, '8', 1) CharacterValueb,
         split_part(CharacterValuec, '8', 2) CharacterValuec,
         split_part(CharacterValued, '8', 3) CharacterValued,
         split_part(CharacterValuee, 'b', 1) CharacterValuee
  from (select * from dfs.`/drill/testdata/batch_memory/character5_1MB_1GB.parquet` order by CharacterValuea) d
  where d.CharacterValuea = '1234567890123110');

The query works with a smaller table.
This is the stack trace:
{noformat}
2018-07-19 16:59:48,803 [24aedae9-d1f3-8e12-2e1f-0479915c61b1:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))

Fragment 0:0

[Error Id: edc75560-41ca-4fdd-907f-060be1795786 on qa-node186.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))

Fragment 0:0

[Error Id: edc75560-41ca-4fdd-907f-060be1795786 on qa-node186.qa.lab:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Caused by: java.lang.IndexOutOfBoundsException: writerIndex: -8373248 (expected: readerIndex(0) <= writerIndex <= capacity(32768))
        at io.netty.buffer.AbstractByteBuf.writerIndex(AbstractByteBuf.java:104) ~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
        at org.apache.drill.exec.vector.VarCharVector$Mutator.setValueCount(VarCharVector.java:810) ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setValueCount(NullableVarCharVector.java:641) ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setValueCount(ProjectRecordBatch.java:329) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:242) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:117) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java
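One observation on the error value itself: a writerIndex of -8373248 is consistent with a byte count overflowing a 32-bit Java int, since interpreting it as unsigned gives a value just under 4 GiB, which is plausible for the 1 GB source table after split_part and sort. This is only an illustrative hypothesis, not a confirmed root-cause analysis; the 4286594048 figure below is hypothetical, chosen as 2^32 - 8373248.

```python
# Hypothesis sketch for DRILL-6623: the negative writerIndex may be a
# 32-bit overflow of a large byte count. 4286594048 is a hypothetical
# pre-overflow value (2**32 - 8373248), not a number from the report.

def to_signed_32(n: int) -> int:
    """Wrap a non-negative count the way Java's 32-bit int arithmetic would."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n

overflowed = 4286594048          # ~3.99 GiB written into an int field
print(to_signed_32(overflowed))  # -8373248, the writerIndex from the error
```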
Re: [ANNOUNCE] New PMC Chair of Apache Drill
Congratulations, Arina! --Robert On Wed, Jul 18, 2018 at 9:12 PM, Sorabh Hamirwasia wrote: > Congratulations Arina! > > On Wed, Jul 18, 2018 at 6:13 PM, Charles Givre wrote: > > > Congrats Arina!! Well done! > > > > > On Jul 18, 2018, at 20:59, Paul Rogers > > wrote: > > > > > > Congratulations Arina! > > > > > > - Paul > > > > > > > > > > > >On Wednesday, July 18, 2018, 2:19:44 PM PDT, Aman Sinha < > > amansi...@apache.org> wrote: > > > > > > Drill developers, > > > Time flies and it is time for a new PMC chair ! Thank you all for your > > > support during the past year. > > > > > > I am very pleased to announce that the Drill PMC has voted to elect > Arina > > > Ielchiieva as the new PMC chair of Apache Drill. She has also been > > > approved unanimously by the Apache Board in today's board meeting. > > Please > > > join me in congratulating Arina ! > > > > > > Thanks, > > > Aman > > > > >
[jira] [Resolved] (DRILL-6605) TPCDS-84 Query does not return any rows
[ https://issues.apache.org/jira/browse/DRILL-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6605. --- Resolution: Fixed > TPCDS-84 Query does not return any rows > --- > > Key: DRILL-6605 > URL: https://issues.apache.org/jira/browse/DRILL-6605 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators > Reporter: Robert Hou > Assignee: Robert Hou >Priority: Major > Attachments: drillbit.log.node80, drillbit.log.node81, > drillbit.log.node82, drillbit.log.node83, drillbit.log.node85, > drillbit.log.node86, drillbit.log.node87, drillbit.log.node88 > > > Query is: > Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql > This uses the hive parquet reader. > {code:sql} > SELECT c_customer_id AS customer_id, > c_last_name > || ', ' > || c_first_name AS customername > FROM customer, > customer_address, > customer_demographics, > household_demographics, > income_band, > store_returns > WHERE ca_city = 'Green Acres' > AND c_current_addr_sk = ca_address_sk > AND ib_lower_bound >= 54986 > AND ib_upper_bound <= 54986 + 5 > AND ib_income_band_sk = hd_income_band_sk > AND cd_demo_sk = c_current_cdemo_sk > AND hd_demo_sk = c_current_hdemo_sk > AND sr_cdemo_sk = cd_demo_sk > ORDER BY c_customer_id > LIMIT 100 > {code} > This query should return 100 rows. It does not return any rows. 
> Here is the explain plan: > {noformat} > | 00-00Screen > 00-01 Project(customer_id=[$0], customername=[$1]) > 00-02SelectionVectorRemover > 00-03 Limit(fetch=[100]) > 00-04SingleMergeExchange(sort0=[0]) > 01-01 OrderedMuxExchange(sort0=[0]) > 02-01SelectionVectorRemover > 02-02 TopN(limit=[100]) > 02-03HashToRandomExchange(dist0=[[$0]]) > 03-01 Project(customer_id=[$0], customername=[||(||($5, > ', '), $4)]) > 03-02Project(c_customer_id=[$1], > c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], > c_first_name=[$5], c_last_name=[$6], ca_address_sk=[$8], ca_city=[$9], > cd_demo_sk=[$7], hd_demo_sk=[$10], hd_income_band_sk=[$11], > ib_income_band_sk=[$12], ib_lower_bound=[$13], ib_upper_bound=[$14], > sr_cdemo_sk=[$0]) > 03-03 HashJoin(condition=[=($7, $0)], > joinType=[inner]) > 03-05HashToRandomExchange(dist0=[[$0]]) > 04-01 Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:store_returns), > columns=[`sr_cdemo_sk`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/web_returns], > confProperties={}]]) > 03-04HashToRandomExchange(dist0=[[$6]]) > 05-01 HashJoin(condition=[=($2, $9)], > joinType=[inner]) > 05-03HashJoin(condition=[=($3, $7)], > joinType=[inner]) > 05-05 HashJoin(condition=[=($1, $6)], > joinType=[inner]) > 05-07Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer), > columns=[`c_customer_id`, `c_current_cdemo_sk`, `c_current_hdemo_sk`, > `c_current_addr_sk`, `c_first_name`, `c_last_name`], numPartitions=0, > partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer], > confProperties={}]]) > 05-06BroadcastExchange > 06-01 Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer_demographics), > columns=[`cd_demo_sk`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_demographics], > confProperties={}]]) > 05-04 
BroadcastExchange > 07-01SelectionVectorRemover > 07-02 Filter(condition=[=($1, 'Green > Acres')]) > 07-03Scan(groupscan=[HiveScan > [table=Table(dbName:tpcds100_parquet, tableName:customer_address), > columns=[`ca_address_sk`, `ca_city`], numPartitions=0, partitions= null, > inputDirectories=[maprfs:/drill/testdata/tpcds_sf100/parquet/customer_address], > confProperties={}]]) > 05-02BroadcastExchange > 08-01 HashJoin(co
[jira] [Created] (DRILL-6605) Query does not return any rows
Robert Hou created DRILL-6605:
---------------------------------

             Summary: Query does not return any rows
                 Key: DRILL-6605
                 URL: https://issues.apache.org/jira/browse/DRILL-6605
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.13.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker
             Fix For: 1.15.0


Query is: Advanced/tpcds/tpcds_sf100/hive/parquet/query84.sql
This uses the hive parquet reader.

SELECT c_customer_id AS customer_id,
       c_last_name || ', ' || c_first_name AS customername
FROM customer,
     customer_address,
     customer_demographics,
     household_demographics,
     income_band,
     store_returns
WHERE ca_city = 'Green Acres'
  AND c_current_addr_sk = ca_address_sk
  AND ib_lower_bound >= 54986
  AND ib_upper_bound <= 54986 + 5
  AND ib_income_band_sk = hd_income_band_sk
  AND cd_demo_sk = c_current_cdemo_sk
  AND hd_demo_sk = c_current_hdemo_sk
  AND sr_cdemo_sk = cd_demo_sk
ORDER BY c_customer_id
LIMIT 100

This query should return 100 rows.

commit id is: 1.14.0-SNAPSHOT a77fd142d86dd5648cda8866b8ff3af39c7b6b11 DRILL-6516: EMIT support in streaming agg 11.07.2018 @ 18:40:03 PDT Unknown 12.07.2018 @ 01:50:37 PDT

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (DRILL-6603) Query does not return enough rows
Robert Hou created DRILL-6603: - Summary: Query does not return enough rows Key: DRILL-6603 URL: https://issues.apache.org/jira/browse/DRILL-6603 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.15.0 Query is: /root/drillAutomation/framework-master/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q67.q select * from widestrings where str_var is null and dec_var_prec5_sc2 between 10 and 15 This query should return 5 rows. It is missing 3 rows. 1664IaYIEviH tJHD 6nF33QQJn1p4uuTELHOR2z0FCzMK35JkNeDRKCduYKUiPaXFgwftf4Ciidk2d7IXxyrCoX56Vsb ITcI9yxPpd3Gu6zkk2kktmZv9oHxMVE1ccVh2iGzU7greQuUEJ1oYFHGzGN9MEeKc5DqbHHT0F65NF1LE88CAudZW5bv6AiIj2D714q72g8ULd2WaazavWBQ6PgdKax 5kVvGkt9czWgZOH9CfT0ApOWUWZlQcvtVC2UumK6Q8tmE5f5yjKhTqvXOiistNIMo4K1NqG8U5t9V33b3h9Hk1ymyeGNMrb5Is1jB5nL9zlpyx3y46WoxV9GornIyrLw W4wxtVsbj2yFYuU65RdDzkNKezE0LsPtpXeEpJeFoFSP lF0wj8xSQg1wx5cfOMXBGNA1nvqTELCPCEzUvFj8hXQ3gANHJ9bOt7QFZhxWLlBhCevbqA40IgJntlf0cAJM6V562fpGd16Trt3mI4YQUOkf3luTVRcBJRpIdoP3ZzgvhnVrgfblboAFMZ8CzCaH7QrZf02fPtYJlBAdoJB6DMjqh6mbkphod1QGYOkE0jqLMCnKoZSpOG9Rk9dIFdlkIrvea0f1KDGAuAlYiTTsdgU4R6CowbVNfEyjIv0Wp1CXC6SzM1Vex6Ye7CrRptvn92SOQCsAElScXa1EuErruEAyIEvtWraXL5X42RxTBsH3TZTR6NVuUcpObKbVIx0kLTdbxIElf33x31QwXUfUVZ T4zHEpu6f4mLR6N9uLVG0Fza Glq3UxixhgxPXgZpQt9GqT3HJXHEn9F0KGaxhC9VCqSk119HrrJuMpHiYS34MCkw1iFhGFUsRKI3fTFaByicJeCIkjFwn2cr74lONdco4AAFdGGVN1cMgJmlOxUZE0Okv68DocVXUMSXCdcTBBmGL2h2gDIagThjo8sVXORponMNTrXEP068Zy7pNkVJyW10EoZwqE2IIcoKdixYsJvPc0mRWnk3gfSmB6uHWgKvgGq4yzzbGp3NT01z8IRYKbmSXTmLyk9rJjUYatoIi 757C2F0Yq0gceouo3LMaz9h4eyiC9psNiL3aoxquqrisayOjPs5esQzoY2iVmVZ7evrVCfxhe2AATFgTvk8Ek78y8s4nVNztlyluIrckfLbnOa25r1h9emJzooVV0Xj945xj5jAUHTZU9kCHKnmkcpEo0a7BdELbL0IvQlitXxbZBS86PlCltLGpLs fmYeUzJfpp0Cql3MAECSQQbW4ErwWScaZ5D rPfbbDZbF2m2ZtSPNn81G5zZBxfHgpuSm4UVrdd24NlLeG1mxwv zU1PbpjSCqbn8rUCWqn5LFafTrmSdtrCuFaknTpqmk1wR9cLnPF3cD xvh0EqSwvCmCTK9xCpZkJF 
4WnBX6w5vg7gQkjvF1GOqP3LeV3qbJc SO68S2UrCBNYQKdWyq4HeGG3TTuFF4x74nWkPPi0txEGiGDoYRxPvEQzWyhZ8SHpHZ3 0UpHpuLWEXIO6VZlPJd4uC IaDEIaB rkCJ8TaIVvaBIf0t8FGY8MgXTWzKdUBkOcQawbODXRLEtdGABTnOqftRSfUSpdojmlwRIs8xJIKaxK9wSL67DKahL6E7CvDBaQx20G0o7u rMaponV4OZmHE45vaeAqfLSyWlNL4UvOstiDPaDd8nI08g9MSKFtYYxt3RxvydGxCtaYfgsl3KxjN5VHnAxkvChVlvdS2Yd8IBA 0dZwblnKUBibdQSgxcypDbRCPeAaOr169L9mrMv82w0V1Ndyt3qK wcpv5nKeO8P9kbVlWY9bGi9nxCVs804WBZMA9vc7AT4h7Jp0OsaHbJx0qyFyAnXP lu MMsOa28VxSW8thiTfIcx2qkdFN1KXrXpU4uo lxUOcJhH0HlyX6kLKhCnVqpG tFP93c5jJ7FdeSujFvxPgo1rQSN9DHXk4DR6nytgBrn2oGcM58zadRNaqoIL2wmWygQsnk7Euzypbg4KhlTICBl1mpb0JwbI7uaCudGcDNWIBMerY WgjahuC3QjIFd48o78CQSgqgQjzpHzdELrqMCKaKfdW4ihpHCA0sqNBYGQxxd T8iTWorOODkg5Kc7m4gPut8tuzEMOQus1xdajv9PqS8F7xwzAWyhymyYBJ8505HxZDuSFqBXSkpxGDh21fiBHkeKBC9RZp7r yD7i6xvRh47Vln0IxvnwcpahLltLr12yL0sDu9LXxHNAHU4gyvHud5J5xXJPD7r5xHXvtNOSiXVl hkBBib1k4IO9YjCgModazXNudTx2Mr8ccq6 kNLKwnrwGdssm3JYyjBsUcXyLMHpS7vncUeKSw2rov4Hg4gTZU8sJMJMAJvu8d6IDJYMHULwrawKOhK8rDTP6sk9Hv27mCG8Gf9inG38Pik7AfnEtUIiZZozEsiSkWvAA7YiHlNDUuL3OX2FRgt2qu9T7zXtQkhon8uSv5FncUq17XB9idflAO0rWIK57HoilaXgIDrzG61kfSKZXpdKuwBVsRNmgJVDSedRsSihlcVDdZ7bmqsgzbvKhFri8lSh8ez6ttlXgF8h4wJ2985bVw5PUmLdeGjlbfrLF0f22vqGi11qz2GUltrjBmmBSrbCLpFUkwqqpATRoQEwo27qi5XwHYWWBqPN9rxF orktFM5SRwG2IJmx8li8sRRchYnNYQgH7iuwKqd69jJJTwwdYla2296Lhw88YHzL60aq2XomN0BNNSoY8cALvy0QIHZpCFd3EmBojr46d6c8nBYMXJLlgKNzklk8vMTKrjAgBQevUH4U7gbQpOIWVf7Tx2BIXkdRGwQYHAuJzU5gtDuDqhuddXkGdACMmp0tgJVP2tpMW05Z3OGs6jYKb5xtqHotIJd7tUM33J85fRYOEIoGOaRblZr7RF82nSOSpPQnDgnVUhJ1j mCY1ofeqG7QqeV6LTdRyRPgiiPwHF1Xgpb3feAJ804NmX7xOkDPvw0WeqxrSVMCto r8E64UsRFypZ wtzVAlTJKgTMpzA4xeuVXuk85mpEJTIQpNxPjU3vgAacENiejcRs68Y85Ncb5ymC3fD0WAyh23VIsy GqaCV9hIFrAs tMM2zlkqpoBsSwgODBEsizaJkb4ZOWJj3Z2Wttr08YPpXSO6 IhQKD5SHqNXEDNar2UVZwFZbg1YJccvsjWEtfm0AUZ 3KHMUb3X1F3tWqIYrZucrsjUp2xfaGtqnsij4q7CRWhRucucjyKcKmiaGE7XllzVGPeHWmbtAFku355JLB2OlBXdsgWMVZFcaCOHff6OlSECOgdLGBSL297kgCVKLzDEvxS 
T4rb5neHQffvmAHOzdIuDGw1559XGVHwzz5lLoc3iSicYlwZTKN2VUOQPHRSqTI1hMJmgTcUaO3LEHyxL2so3EedaU9BSaTaA3kPefKSdu ibaW3h1 WKkznSnlmVjhLzq5e5ywYzwA26EusRtJmAAiiSrYG20uO7ejp1AlorSgOAfM9B5qxQAqaDqQMUlvhlu7SjK46egz5kK3xtcoUfyxyUwAonh3iv VJPXdvxm8ZuZbnm82xLkh4MeWbClb0jH5E42m9aFp8GrSQzAwhzciocZJABwerP1sfITnG6EMyPKdl7FBIjJKjNcFOVabzQX966h6WYnAOKuaYdJWNGgKOISIcR6OwHIaUWjqV9w84VYxXutZJ1rRlbeUPT8ygTZmFk2FK2Ix02rBzt0nFkiTNmoZSilSzSOxSF iwtXmtDRtjrQPQCVKlZM3KrYjiJfOem8PIOA8wadL0lHN87gpEqUsrvpohZ8FRW ILoeDeWeBYO94JOrYv7JdirgNH7MBdmrMQOrBPpY6bdX3is62JWMm9c0Xv7jyEVdq3hkSsJLWEr4Gu8TZBfjrd9rVX0gqjlQZsk30UwEDjvtfufkYcJj2sGbJ3HzJdIh1MCHIoPb1YyacfzEvnQsnlQagfRu51vSF8qehDJ2AtCezy6hOdwberI4qgP8HMuBKRjoyN91ipykonft9himO44rJtkiREFA9opJA9jKWM8kYzICDmE2 D3pZcmMGyUEyCY K7IEITWxzmISenhl1Ext2wzZxJoQcfLNU 8rmXNFLwxnJCEYq4bNrEn9IQw 6xhgjw8roQVEgL8NZTxtlcve8RAyLILFdfNsvvg7qa700PCc
[jira] [Created] (DRILL-6594) Data batches for Project operator are not being split properly and exceed the maximum specified
Robert Hou created DRILL-6594:
---------------------------------

             Summary: Data batches for Project operator are not being split properly and exceed the maximum specified
                 Key: DRILL-6594
                 URL: https://issues.apache.org/jira/browse/DRILL-6594
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Karthikeyan Manivannan
             Fix For: 1.14.0


I ran this query:

alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072;
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.width.max_per_query` = 1;
select * from (
  select case when false then c.CharacterValuea else i.IntegerValuea end IntegerValuea,
         case when false then c.CharacterValueb else i.IntegerValueb end IntegerValueb,
         case when false then c.CharacterValuec else i.IntegerValuec end IntegerValuec,
         case when false then c.CharacterValued else i.IntegerValued end IntegerValued,
         case when false then c.CharacterValuee else i.IntegerValuee end IntegerValuee
  from (select * from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet` order by CharacterValuea) c,
       dfs.`/drill/testdata/batch_memory/integer5_1MB.parquet` i
  where i.Index = c.Index and c.CharacterValuea = '1234567890123100')
limit 10;

An incoming batch looks like this:

2018-06-14 19:28:10,905 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming:
  Batch size: { Records: 32768, Total size: 20512768, Data size: 9175040, Gross row width: 626, Net row width: 280, Density: 45% }

An outgoing batch looks like this:

2018-06-14 19:28:10,911 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing:
  Batch size: { Records: 1023, Total size: 11018240, Data size: 138105, Gross row width: 10771, Net row width: 135, Density: 2% }

The data size (138105) exceeds the maximum batch size (131072).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
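Here too the reported numbers check out arithmetically. Using the figures from the outgoing BATCH_STATS line (1023 records, net row width 135), the data size matches exactly and overshoots the 131072-byte limit; the rows-per-batch division is an illustrative assumption, not the actual operator logic.

```python
# Check the outgoing-batch numbers quoted in DRILL-6594.
# Figures are taken from the BATCH_STATS lines in the report above.

limit = 131072        # output_batch_size set in the session
net_row_width = 135   # bytes per row, from the outgoing batch stats
records = 1023        # rows actually emitted in the outgoing batch

print(records * net_row_width)   # 138105 -> the reported oversized data size
print(limit // net_row_width)    # 970 rows per batch would stay under the limit
```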
[jira] [Created] (DRILL-6569) Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_
Robert Hou created DRILL-6569:
---------------------------------

             Summary: Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet
                 Key: DRILL-6569
                 URL: https://issues.apache.org/jira/browse/DRILL-6569
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker
             Fix For: 1.14.0


This is TPCDS Query 19.

Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query19.sql

SELECT i_brand_id brand_id,
       i_brand brand,
       i_manufact_id,
       i_manufact,
       Sum(ss_ext_sales_price) ext_price
FROM date_dim,
     store_sales,
     item,
     customer,
     customer_address,
     store
WHERE d_date_sk = ss_sold_date_sk
  AND ss_item_sk = i_item_sk
  AND i_manager_id = 38
  AND d_moy = 12
  AND d_year = 1998
  AND ss_customer_sk = c_customer_sk
  AND c_current_addr_sk = ca_address_sk
  AND Substr(ca_zip, 1, 5) <> Substr(s_zip, 1, 5)
  AND ss_store_sk = s_store_sk
GROUP BY i_brand, i_brand_id, i_manufact_id, i_manufact
ORDER BY ext_price DESC, i_brand, i_brand_id, i_manufact_id, i_manufact
LIMIT 100;

Here is the stack trace:

2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception:
java.sql.SQLException: INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet

Fragment 4:26

[Error Id: 6401a71e-7a5d-4a10-a17c-16873fc3239b on atsqa6c88.qa.lab:31010]
  (hive.org.apache.parquet.io.ParquetDecodingException) Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet
    hive.org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue():243
    hive.org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue():227
    org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():199
    org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():57
org.apache.drill.exec.store.hive.readers.HiveAbstractReader.hasNextValue():417 org.apache.drill.exec.store.hive.readers.HiveParquetReader.next():54 org.apache.drill.exec.physical.impl.ScanBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 
org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next
[jira] [Created] (DRILL-6568) Jenkins Regression: TPCDS query 68 fails with IllegalStateException: Unexpected EMIT outcome received in buildSchema phase
Robert Hou created DRILL-6568:
---------------------------------

             Summary: Jenkins Regression: TPCDS query 68 fails with IllegalStateException: Unexpected EMIT outcome received in buildSchema phase
                 Key: DRILL-6568
                 URL: https://issues.apache.org/jira/browse/DRILL-6568
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Khurram Faraaz
             Fix For: 1.14.0


This is TPCDS Query 68.

Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/maprdb/json/query68.sql

SELECT c_last_name,
       c_first_name,
       ca_city,
       bought_city,
       ss_ticket_number,
       extended_price,
       extended_tax,
       list_price
FROM (SELECT ss_ticket_number,
             ss_customer_sk,
             ca_city bought_city,
             Sum(ss_ext_sales_price) extended_price,
             Sum(ss_ext_list_price) list_price,
             Sum(ss_ext_tax) extended_tax
      FROM store_sales,
           date_dim,
           store,
           household_demographics,
           customer_address
      WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk
        AND store_sales.ss_store_sk = store.s_store_sk
        AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk
        AND store_sales.ss_addr_sk = customer_address.ca_address_sk
        AND date_dim.d_dom BETWEEN 1 AND 2
        AND ( household_demographics.hd_dep_count = 8
              OR household_demographics.hd_vehicle_count = 3 )
        AND date_dim.d_year IN ( 1998, 1998 + 1, 1998 + 2 )
        AND store.s_city IN ( 'Fairview', 'Midway' )
      GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn,
     customer,
     customer_address current_addr
WHERE ss_customer_sk = c_customer_sk
  AND customer.c_current_addr_sk = current_addr.ca_address_sk
  AND current_addr.ca_city <> bought_city
ORDER BY c_last_name, ss_ticket_number
LIMIT 100;

Here is the stack trace:

2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception:
java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Unexpected EMIT outcome received in buildSchema phase

Fragment 0:0

[Error Id: edbe3477-805e-4f1f-8405-d5c194dc28c2 on atsqa6c87.qa.lab:31010]
  (java.lang.IllegalStateException) Unexpected EMIT
outcome received in buildSchema phase org.apache.drill.exec.physical.impl.TopN.TopNBatch.buildSchema():178 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():87 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:600) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1904) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:64) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:630) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.jav
[jira] [Created] (DRILL-6567) Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.
Robert Hou created DRILL-6567:
---------------------------------

             Summary: Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.
                 Key: DRILL-6567
                 URL: https://issues.apache.org/jira/browse/DRILL-6567
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Pritesh Maker
             Fix For: 1.14.0


This is TPCDS Query 93.

Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query93.sql

SELECT ss_customer_sk,
       Sum(act_sales) sumsales
FROM (SELECT ss_item_sk,
             ss_ticket_number,
             ss_customer_sk,
             CASE
               WHEN sr_return_quantity IS NOT NULL THEN ( ss_quantity - sr_return_quantity ) * ss_sales_price
               ELSE ( ss_quantity * ss_sales_price )
             END act_sales
      FROM store_sales
           LEFT OUTER JOIN store_returns
                        ON ( sr_item_sk = ss_item_sk
                             AND sr_ticket_number = ss_ticket_number ),
           reason
      WHERE sr_reason_sk = r_reason_sk
        AND r_reason_desc = 'reason 38') t
GROUP BY ss_customer_sk
ORDER BY sumsales, ss_customer_sk
LIMIT 100;

Here is the stack trace:

2018-06-29 07:00:32 INFO DrillTestLogger:348 - Exception:
java.sql.SQLException: INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException

Setup failed for null
Fragment 4:56

[Error Id: 3c72c14d-9362-4a9b-affb-5cf937bed89e on atsqa6c82.qa.lab:31010]
  (org.apache.drill.common.exceptions.ExecutionSetupException) java.lang.reflect.UndeclaredThrowableException
    org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30
    org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():327
    org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245
    org.apache.drill.exec.physical.impl.ScanBatch.next():164
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
    org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276 org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238 org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147 org.apache.drill.exec.record.AbstractRecordBatch.next():172 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():118 org.apache.drill.exec.record.AbstractRecordBatch.next():152 org.apache.drill.exec.physical.impl.BaseRootExec.next():103 org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():152 org.apache.drill.exec.physical.impl.BaseRootExec.next():93 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():422 org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():281 
org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 Caused By (java.util.concurrent.ExecutionException) java.lang.reflect.UndeclaredThrowableException java.util.concurrent.FutureTask.report():122 java.util.concurrent.FutureTask.get():192 org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():320 org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245 org.apache.drill.exec.physical.impl.ScanBatch.next():164 org.apache.drill.exec.record.AbstractRecordBatch.next():119
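For reference, the CASE expression in query 93 above nets out returns against sales: a matching `store_returns` row reduces the sale, while a NULL `sr_return_quantity` (no match from the left outer join) leaves the full amount. A minimal Python sketch of that logic (the function name is mine, for illustration only):

```python
def act_sales(ss_quantity, ss_sales_price, sr_return_quantity):
    """Model of query 93's CASE expression: subtract returned quantity
    when a return exists (non-NULL), otherwise use the full sale."""
    if sr_return_quantity is not None:
        return (ss_quantity - sr_return_quantity) * ss_sales_price
    return ss_quantity * ss_sales_price
```
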
[jira] [Created] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.
Robert Hou created DRILL-6566: - Summary: Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase. Key: DRILL-6566 URL: https://issues.apache.org/jira/browse/DRILL-6566 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Fix For: 1.14.0 This is TPCDS Query 66. Query: /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year1, Sum(jan_sales) AS jan_sales, Sum(feb_sales) AS feb_sales, Sum(mar_sales) AS mar_sales, Sum(apr_sales) AS apr_sales, Sum(may_sales) AS may_sales, Sum(jun_sales) AS jun_sales, Sum(jul_sales) AS jul_sales, Sum(aug_sales) AS aug_sales, Sum(sep_sales) AS sep_sales, Sum(oct_sales) AS oct_sales, Sum(nov_sales) AS nov_sales, Sum(dec_sales) AS dec_sales, Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, Sum(jan_net) AS jan_net, Sum(feb_net) AS feb_net, Sum(mar_net) AS mar_net, Sum(apr_net) AS apr_net, Sum(may_net) AS may_net, Sum(jun_net) AS jun_net, Sum(jul_net) AS jul_net, Sum(aug_net) AS aug_net, Sum(sep_net) AS 
sep_net, Sum(oct_net) AS oct_net, Sum(nov_net) AS nov_net, Sum(dec_net) AS dec_net FROM (SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, 'ZOUROS' || ',' || 'ZHOU' AS ship_carriers, d_year AS year1, Sum(CASE WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jan_sales, Sum(CASE WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS feb_sales, Sum(CASE WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS mar_sales, Sum(CASE WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS apr_sales, Sum(CASE WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS may_sales, Sum(CASE WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jun_sales, Sum(CASE WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jul_sales, Sum(CASE WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS aug_sales, Sum(CASE WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS sep_sales, Sum(CASE WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS oct_sales, Sum(CASE WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS nov_sales, Sum(CASE WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS dec_sales, Sum(CASE WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jan_net, Sum(CASE WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS feb_net, Sum(CASE WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS mar_net, Sum(CASE WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS apr_net, Sum(CASE WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS may_net, Sum(CASE WHEN d_moy = 6 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jun_net, Sum(CASE WHEN d_moy = 7 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS jul_net, Sum(CASE WHEN d_moy = 8 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS aug_net,
Sum(CASE WHEN d_moy = 9 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS sep_net, Sum(CASE WHEN d_moy = 10 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS oct_net, Sum(CASE WHEN d_moy = 11 THEN ws_net_paid_inc_ship * ws_quantity ELSE 0 END) AS nov_net, Sum(CASE WHEN d_moy = 12
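Query 66 above uses the conditional-aggregation pattern `SUM(CASE WHEN d_moy = m THEN amount ELSE 0 END)` to pivot rows into one column per month. A minimal Python sketch of the same pattern (data and names here are made up for illustration):

```python
def pivot_by_month(rows):
    """rows: iterable of (d_moy, amount) pairs.
    Returns a 12-element list of totals, one per month, mirroring the
    SUM(CASE WHEN d_moy = m THEN amount ELSE 0 END) columns."""
    totals = [0.0] * 13          # index 0 unused so months map directly
    for d_moy, amount in rows:
        totals[d_moy] += amount  # the "THEN amount" branch for month d_moy
    return totals[1:]

# Example: two January sales and one March sale
monthly = pivot_by_month([(1, 10.0), (1, 5.0), (3, 7.5)])
```
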
[jira] [Created] (DRILL-6565) cume_dist does not return enough rows
Robert Hou created DRILL-6565: - Summary: cume_dist does not return enough rows Key: DRILL-6565 URL: https://issues.apache.org/jira/browse/DRILL-6565 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker Attachments: drillbit.log.7802 This query should return 64 rows but only returns 38 rows: alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select * from ( select cume_dist() over (order by Index) IntervalSecondValuea, Index from (select * from dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet` order by BigIntvalue)) d where d.Index = 1; I tried to reproduce the problem by using a smaller table, but it does not reproduce. I tried to reproduce the problem without the outside select statement, but it does not reproduce. Here is the explain plan: {noformat} | 00-00Screen : rowType = RecordType(DOUBLE IntervalSecondValuea, ANY Index): rowcount = 12000.0, cumulative cost = {757200.0 rows, 1.1573335922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4034 00-01 ProjectAllowDup(IntervalSecondValuea=[$0], Index=[$1]) : rowType = RecordType(DOUBLE IntervalSecondValuea, ANY Index): rowcount = 12000.0, cumulative cost = {756000.0 rows, 1.1572135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4033 00-02Project(w0$o0=[$1], $0=[$0]) : rowType = RecordType(DOUBLE w0$o0, ANY $0): rowcount = 12000.0, cumulative cost = {744000.0 rows, 1.1548135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4032 00-03 SelectionVectorRemover : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 12000.0, cumulative cost = {732000.0 rows, 1.1524135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4031 00-04Filter(condition=[=($0, 1)]) : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 12000.0, cumulative cost = {72.0 rows, 1.1512135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id 
= 4030 00-05 Window(window#0=[window(partition {} order by [0] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [CUME_DIST()])]) : rowType = RecordType(ANY $0, DOUBLE w0$o0): rowcount = 8.0, cumulative cost = {64.0 rows, 1.1144135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4029 00-06SelectionVectorRemover : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {56.0 rows, 1.0984135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4028 00-07 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {48.0 rows, 1.0904135922911648E7 cpu, 0.0 io, 0.0 network, 192.0 memory}, id = 4027 00-08Project($0=[ITEM($0, 'Index')]) : rowType = RecordType(ANY $0): rowcount = 8.0, cumulative cost = {40.0 rows, 5692067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4026 00-09 SelectionVectorRemover : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {32.0 rows, 5612067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4025 00-10Sort(sort0=[$1], dir0=[ASC]) : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {24.0 rows, 5532067.961455824 cpu, 0.0 io, 0.0 network, 128.0 memory}, id = 4024 00-11 Project(T2¦¦**=[$0], BigIntvalue=[$1]) : rowType = RecordType(DYNAMIC_STAR T2¦¦**, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {16.0 rows, 32.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4023 00-12Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet]], selectionRoot=maprfs:/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB_1GB.parquet, numFiles=1, numRowGroups=6, usedMetadataFile=false, columns=[`**`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY BigIntvalue): rowcount = 8.0, cumulative cost = {8.0 rows, 16.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4022 {noformat} I have attached the drillbit.log. 
The commit id is: | 1.14.0-SNAPSHOT | aa127b70b1e46f7f4aa19881f25eda583627830a | DRILL-6523: Fix NPE for describe of partial schema | 22.06.2018 @ 11:28:23 PDT | r...@mapr.com | 23.06.2018 @ 02:05:10 PDT | fourvarchar_asc_nulls95.q -- This message was sent by Atlassian JIRA (v7.6.3#76005)
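For context on why the bug above matters: CUME_DIST() is a per-row window function, so it must emit exactly one output row per input row, which is why 64 input rows should survive to the filter. A reference model of its semantics (fraction of rows with a value less than or equal to the current row's value), not Drill's implementation:

```python
from bisect import bisect_right

def cume_dist(values):
    """Reference semantics of CUME_DIST() OVER (ORDER BY v): for each row,
    the fraction of rows whose value is <= that row's value.
    Output length always equals input length."""
    n = len(values)
    ordered = sorted(values)
    # bisect_right counts how many sorted elements are <= v
    return [bisect_right(ordered, v) / n for v in values]
```
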
[jira] [Created] (DRILL-6547) IllegalStateException: Tried to remove unmanaged buffer.
Robert Hou created DRILL-6547: - Summary: IllegalStateException: Tried to remove unmanaged buffer. Key: DRILL-6547 URL: https://issues.apache.org/jira/browse/DRILL-6547 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.14.0 Reporter: Robert Hou Assignee: Pritesh Maker This is the query: select * from ( select Index, concat(BinaryValue, 'aaa') NewVarcharValue from (select * from dfs.`/drill/testdata/batch_memory/alltypes_large_1MB.parquet`)) d where d.Index = 1; This is the plan: {noformat} | 00-00Screen 00-01 Project(Index=[$0], NewVarcharValue=[$1]) 00-02SelectionVectorRemover 00-03 Filter(condition=[=($0, 1)]) 00-04Project(Index=[$0], NewVarcharValue=[CONCAT($1, 'aaa')]) 00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/batch_memory/alltypes_large_1MB.parquet]], selectionRoot=maprfs:/drill/testdata/batch_memory/alltypes_large_1MB.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`Index`, `BinaryValue`]]]) {noformat} Here is the stack trace from drillbit.log: {noformat} 2018-06-27 13:55:03,291 [24cc0659-30b7-b290-7fae-ecb1c1f15c05:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Tried to remove unmanaged buffer. Fragment 0:0 [Error Id: bc1f2f72-c31b-4b9a-964f-96dec9e0f388 on qa-node186.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Tried to remove unmanaged buffer. 
Fragment 0:0 [Error Id: bc1f2f72-c31b-4b9a-964f-96dec9e0f388 on qa-node186.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] Caused by: java.lang.IllegalStateException: Tried to remove unmanaged buffer. 
at org.apache.drill.exec.ops.BufferManagerImpl.replace(BufferManagerImpl.java:50) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at io.netty.buffer.DrillBuf.reallocIfNeeded(DrillBuf.java:97) ~[drill-memory-base-1.14.0-SNAPSHOT.jar:4.0.48.Final] at org.apache.drill.exec.test.generated.ProjectorGen4046.doEval(ProjectorTemplate.java:77) ~[na:na] at org.apache.drill.exec.test.generated.ProjectorGen4046.projectRecords(ProjectorTemplate.java:67) ~[na:na] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:236) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:117) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:147) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0
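The failing chain above is `DrillBuf.reallocIfNeeded` asking the fragment's `BufferManagerImpl` to swap in a larger buffer; the manager throws when the buffer it is asked to replace is not one it is tracking. A toy Python model of that contract (classes and method names are illustrative, not Drill's actual API):

```python
class BufferManager:
    """Toy model of a manager that owns buffers and swaps them on realloc."""
    def __init__(self):
        self.managed = set()

    def allocate(self, size):
        buf = bytearray(size)
        self.managed.add(id(buf))
        return buf

    def replace(self, old, size):
        # Mirrors the invariant behind "Tried to remove unmanaged buffer.":
        # only a buffer this manager is tracking may be swapped out.
        if id(old) not in self.managed:
            raise RuntimeError("Tried to remove unmanaged buffer.")
        self.managed.remove(id(old))
        return self.allocate(size)
```
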
Re: [ANNOUNCE] New PMC member: Vitalii Diravka
Congrats, Vitalii! --Robert On Tue, Jun 26, 2018 at 6:17 PM, Padma Penumarthy wrote: > Congrats Vitalii. > > Thanks > Padma > > > > On Jun 26, 2018, at 6:14 PM, Vlad Rozov wrote: > > > > Congratulations Vitalii! > > > > Thank you, > > > > Vlad > > > > On 6/26/18 17:11, Paul Rogers wrote: > >> Congratulations Vitalii! > >> - Paul > >> > >> > >> On Tuesday, June 26, 2018, 11:12:16 AM PDT, Aman Sinha < > amansi...@apache.org> wrote: > >>I am pleased to announce that Drill PMC invited Vitalii Diravka to > the PMC > >> and he has accepted the invitation. > >> > >> Congratulations Vitalii and thanks for your contributions ! > >> > >> -Aman > >> (on behalf of Drill PMC) > >> > > > >
Re: [ANNOUNCE] New Committer: Padma Penumarthy
Congratulations, Padma! --Robert From: rahul challapalli Sent: Monday, June 18, 2018 1:36 PM To: dev Subject: Re: [ANNOUNCE] New Committer: Padma Penumarthy Congratulations Padma! On Mon, Jun 18, 2018 at 1:35 PM Khurram Faraaz wrote: > Congratulations Padma! Well deserved. > > > Thanks, > > Khurram > > > From: Paul Rogers > Sent: Friday, June 15, 2018 7:50:05 PM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New Committer: Padma Penumarthy > > Congratulations! Well deserved, if just from the number of times you've > reviewed my code. > > Thanks, > - Paul > > > > On Friday, June 15, 2018, 9:36:44 AM PDT, Aman Sinha < > amansi...@apache.org> wrote: > > The Project Management Committee (PMC) for Apache Drill has invited Padma > Penumarthy to become a committer, and we are pleased to announce that she > has > accepted. > > Padma has been contributing to Drill for about 1 1/2 years. She has made > improvements for work-unit assignment in the parallelizer, performance of > filter operator for pattern matching and (more recently) on the batch > sizing for several operators: Flatten, MergeJoin, HashJoin, UnionAll. > > Welcome Padma, and thank you for your contributions. Keep up the good work > ! > > -Aman > (on behalf of Drill PMC) > >
Re: [Vote] Cleaning Up Old PRs
The Exchange PR was under active development, but there were some issues that could not be resolved at the time. So it was shelved until someone could get some time to resolve those issues. Thanks. --Robert From: Robert Hou Sent: Thursday, June 7, 2018 11:46 AM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs On a related note, someone created a PR to resolve some Exchange issues a year ago. It has been dormant since then, and the original author is probably not going to push it forward. However, a second person has picked it up now because we need to resolve the issue. There is a lot of good work in that PR, and it has provided a great starting point. I'm not against cleaning up old PRs. But I am not sure it is easy to automate without losing some good work. Thanks. --Robert From: Dave Oshinsky Sent: Thursday, June 7, 2018 11:34 AM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Hi Tim, Everyone's time is constrained, so I doubt that it will always be possible to give "timely" reviews to PR's, especially complex ones, or ones regarding problems that are not regarded as high priority. I suggest these changes to your scheme: 1) Once a PR reaches the 3 months point, send an email to the list and directly to the PR creator that the PR will automatically be closed in 1 more month if specific actions are not taken. The PR creator is less likely to miss an email that is sent directly to him/her. 2) Automatic removals should not be executed until an administrator has approved it. In other words, it should not be completely automatic, without a human in the loop. 3) PR's that are closed (either automatically or not) should remain in the system for some time (with "reopen" possible), in case a mistake occurs. It seems that github already supports this behavior. As of this writing, I see 105 open PR's, 1201 closed PR's for Apache Drill. Perhaps I'm missing something, but why the effort to make this automatic? 
Are there way more PR's than I'm seeing? Thanks, Dave O From: Timothy Farkas Sent: Thursday, June 7, 2018 1:38 PM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Hi Dave, I'm sorry you had a bad experience. We should do better giving timely reviews moving forward. I think there are some ways we can protect PRs from unresponsive committers while still closing PRs from unresponsive contributors. Here are some ideas. 1. Have an auto responder comment on each new PR after it is opened with all the information a contributor needs to be successful along with all the information about how PRs are autoclosed and what to do to keep the PR alive. Also encourage the contributor to spam us until we do a review in this message. 2. Auto labeling fresh PRs with a "needs-first-review" label (or something like that). PRs with this label are exempt from the auto closing process and the label will only be removed after a committer has looked at the PR and done a first round of review. This can protect a PR that had never been reviewed from being closed. 3. Allow the contributor to request a "pending" label to be placed on their PR. This label would make their PR permanently immune to auto closing even after a first round of review has been completed and the "needs-first-review" label has been removed. How do you feel about these protections? Do you think they would be sufficient? If not, do you have any alternative ideas to help improve the process? As a note, I think our motivations are the same. We both want quality PRs to make it into Drill. I want to do it by removing PRs where the contributor is unresponsive so committers can better focus on the PRs that need attention. And I think you are rightfully concerned about false positives when automating this process. Hopefully we can find a good middle ground that everyone can be happy with. 
Thanks, Tim From: Dave Oshinsky Sent: Wednesday, June 6, 2018 6:28:39 PM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Tim, It's too restrictive, unless something can be done to educate (outsider) PR authors like myself to "go against the grain" and keep asking. And asking. And asking. And asking. You get the picture? I did all that. And it was ignored. I assumed that people outside MapR aren't welcome to contribute, and/or there was little interest in making decimal work properly, and/or there was simply nobody available to review it (what I was most comfortable believing), and/or my emails smelled really bad (kidding on the last one 8-). I asked a few times, and asked again a few times a few months later, and nothing. What can you do to educate outsiders as to what they need to do to make sure a useful PR doesn't get flushed d
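The workflow discussed in this thread can be sketched concretely. The sketch below combines Dave's thresholds (warn at 3 months, close one month later, only with human approval) with Tim's exemption labels; the function and label names are from the emails above, everything else is an assumption for illustration:

```python
from datetime import datetime, timedelta

WARN_AFTER = timedelta(days=90)    # ~3 months: email the list and the author
CLOSE_AFTER = timedelta(days=120)  # ~1 month later: eligible for closing

def triage(last_activity, now, labels, admin_approved):
    """Return 'keep', 'warn', or 'close' for a PR.
    Labels 'needs-first-review' and 'pending' exempt a PR entirely,
    and closing requires a human in the loop (admin_approved)."""
    if "needs-first-review" in labels or "pending" in labels:
        return "keep"
    age = now - last_activity
    if age >= CLOSE_AFTER:
        return "close" if admin_approved else "warn"
    if age >= WARN_AFTER:
        return "warn"
    return "keep"
```

Closed PRs would remain reopenable on GitHub, per point 3 of the proposal.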
Re: [Vote] Cleaning Up Old PRs
On a related note, someone created a PR to resolve some Exchange issues a year ago. It has been dormant since then, and the original author is probably not going to push it forward. However, a second person has picked it up now because we need to resolve the issue. There is a lot of good work in that PR, and it has provided a great starting point. I'm not against cleaning up old PRs. But I am not sure it is easy to automate without losing some good work. Thanks. --Robert From: Dave Oshinsky Sent: Thursday, June 7, 2018 11:34 AM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Hi Tim, Everyone's time is constrained, so I doubt that it will always be possible to give "timely" reviews to PR's, especially complex ones, or ones regarding problems that are not regarded as high priority. I suggest these changes to your scheme: 1) Once a PR reaches the 3 months point, send an email to the list and directly to the PR creator that the PR will automatically be closed in 1 more month if specific actions are not taken. The PR creator is less likely to miss an email that is sent directly to him/her. 2) Automatic removals should not be executed until an administrator has approved it. In other words, it should not be completely automatic, without a human in the loop. 3) PR's that are closed (either automatically or not) should remain in the system for some time (with "reopen" possible), in case a mistake occurs. It seems that github already supports this behavior. As of this writing, I see 105 open PR's, 1201 closed PR's for Apache Drill. Perhaps I'm missing something, but why the effort to make this automatic? Are there way more PR's than I'm seeing? Thanks, Dave O From: Timothy Farkas Sent: Thursday, June 7, 2018 1:38 PM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Hi Dave, I'm sorry you had a bad experience. We should do better giving timely reviews moving forward. 
I think there are some ways we can protect PRs from unresponsive committers while still closing PRs from unresponsive contributors. Here are some ideas. 1. Have an auto responder comment on each new PR after it is opened with all the information a contributor needs to be successful along with all the information about how PRs are autoclosed and what to do to keep the PR alive. Also encourage the contributor to spam us until we do a review in this message. 2. Auto labeling fresh PRs with a "needs-first-review" label (or something like that). PRs with this label are exempt from the auto closing process and the label will only be removed after a committer has looked at the PR and done a first round of review. This can protect a PR that had never been reviewed from being closed. 3. Allow the contributor to request a "pending" label to be placed on their PR. This label would make their PR permanently immune to auto closing even after a first round of review has been completed and the "needs-first-review" label has been removed. How do you feel about these protections? Do you think they would be sufficient? If not, do you have any alternative ideas to help improve the process? As a note, I think our motivations are the same. We both want quality PRs to make it into Drill. I want to do it by removing PRs where the contributor is unresponsive so committers can better focus on the PRs that need attention. And I think you are rightfully concerned about false positives when automating this process. Hopefully we can find a good middle ground that everyone can be happy with. Thanks, Tim From: Dave Oshinsky Sent: Wednesday, June 6, 2018 6:28:39 PM To: dev@drill.apache.org Subject: Re: [Vote] Cleaning Up Old PRs Tim, It's too restrictive, unless something can be done to educate (outsider) PR authors like myself to "go against the grain" and keep asking. And asking. And asking. And asking. You get the picture? I did all that. And it was ignored. 
I assumed that people outside MapR aren't welcome to contribute, and/or there was little interest in making decimal work properly, and/or there was simply nobody available to review it (what I was most comfortable believing), and/or my emails smelled really bad (kidding on the last one 8-). I asked a few times, and asked again a few times a few months later, and nothing. What can you do to educate outsiders as to what they need to do to make sure a useful PR doesn't get flushed down the toilet? I spent days learning some amount of Drill internals and implementing VARDECIMAL (over 70 source files changed), and did it again months later to merge to then current master tip. All ignored for quite some time. Thanks to Volodymyr Vysotskyi for ultimately grabbing the ball and running with it. That complex a change required an "insider" to bring it fully to fruition. But if the PR had been
Re: help for native drive for .NET
I don't think there is a Drill driver for .NET. For Windows, we have ODBC and JDBC. Can you provide more information on your performance issue? Thanks. --Robert From: ariolov...@gmail.com Sent: Monday, May 21, 2018 3:06 PM To: dev@drill.apache.org Subject: help for native drive for .NET Hi, So, today I use ODBC driver, but is too slow.. Do you know a native driver for .NET ? Thanks! Ario
[jira] [Created] (DRILL-6393) Radians should take an argument (x)
Robert Hou created DRILL-6393: - Summary: Radians should take an argument (x) Key: DRILL-6393 URL: https://issues.apache.org/jira/browse/DRILL-6393 Project: Apache Drill Issue Type: Bug Components: Documentation Affects Versions: 1.13.0 Reporter: Robert Hou Assignee: Bridget Bevens Fix For: 1.14.0 The radians function is missing an argument on this webpage: https://drill.apache.org/docs/math-and-trig/ The table has this information: {noformat} RADIANS FLOAT8 Converts x degress to radians. {noformat} It should be: {noformat} RADIANS(x) FLOAT8 Converts x degrees to radians. {noformat} Also, degress is mis-spelled. It should be degrees. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
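The corrected doc entry above is just the standard degrees-to-radians conversion, x * π / 180. A quick Python check of the expected behavior (Python's standard `math.radians` computes the same thing):

```python
import math

def radians(x):
    """RADIANS(x): convert x degrees to radians, i.e. x * pi / 180."""
    return x * math.pi / 180.0
```
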
[jira] [Resolved] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
[ https://issues.apache.org/jira/browse/DRILL-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5900. --- Resolution: Fixed This test is now passing. > Regression: TPCH query encounters random IllegalStateException: Memory was > leaked by query > -- > > Key: DRILL-5900 > URL: https://issues.apache.org/jira/browse/DRILL-5900 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Timothy Farkas >Priority: Blocker > Attachments: 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f.sys.drill, > drillbit.log.node81, drillbit.log.node88 > > > This is a random failure in the TPCH-SF100-baseline run. The test is > /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql. > This test has passed before. > TPCH query 17: > {noformat} > SELECT > SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY > FROM > lineitem L, > part P > WHERE > P.P_PARTKEY = L.L_PARTKEY > AND P.P_BRAND = 'BRAND#13' > AND P.P_CONTAINER = 'JUMBO CAN' > AND L.L_QUANTITY < ( > SELECT > 0.2 * AVG(L2.L_QUANTITY) > FROM > lineitem L2 > WHERE > L2.L_PARTKEY = P.P_PARTKEY > ) > {noformat} > Error is: > {noformat} > 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Memory was leaked by query. Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Memory was leaked by query.
Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > at > org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) > [drill-java-exec-1.12.
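The stack trace above comes from `BaseAllocator.close()` refusing to close while an operator still holds allocated memory. A toy Python model of that leak check (class and method names are illustrative, not Drill's actual allocator API):

```python
class Allocator:
    """Toy allocator that, like the close() in the trace above, raises
    if memory is still outstanding when the allocator is closed."""
    def __init__(self):
        self.allocated = 0

    def buffer(self, size):
        self.allocated += size   # caller now owes a release of `size`
        return size

    def release(self, size):
        self.allocated -= size

    def close(self):
        if self.allocated != 0:
            raise RuntimeError(
                "Memory was leaked by query. "
                "Memory leaked: (%d)" % self.allocated)
```
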
[jira] [Created] (DRILL-6276) Drill CTAS creates parquet file having page greater than 200 MB.
Robert Hou created DRILL-6276: - Summary: Drill CTAS creates parquet file having page greater than 200 MB. Key: DRILL-6276 URL: https://issues.apache.org/jira/browse/DRILL-6276 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Affects Versions: 1.13.0 Reporter: Robert Hou Attachments: alltypes_asc_16MB.json I used this CTAS to create a parquet file from a json file: {noformat} create table `alltypes.parquet` as select cast(BigIntValue as BigInt) BigIntValue, cast(BooleanValue as Boolean) BooleanValue, cast (DateValue as Date) DateValue, cast (FloatValue as Float) FloatValue, cast (DoubleValue as Double) DoubleValue, cast (IntegerValue as Integer) IntegerValue, cast (TimeValue as Time) TimeValue, cast (TimestampValue as Timestamp) TimestampValue, cast (IntervalYearValue as INTERVAL YEAR) IntervalYearValue, cast (IntervalDayValue as INTERVAL DAY) IntervalDayValue, cast (IntervalSecondValue as INTERVAL SECOND) IntervalSecondValue, cast (BinaryValue as binary) Binaryvalue, cast (VarcharValue as varchar) VarcharValue from `alltypes.json`; {noformat} I ran parquet-tools/parquet-dump : VarcharValue TV=6885 RL=0 DL=1 page 0: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:17240317 VC:6885 The page size is 16MB. This is with a 16MB data set. When I try a similar 1GB data set, the page size starts at over 200 MB, decreasing down to 1MB. 
VarcharValue TV=208513 RL=0 DL=1 page 0: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:215243750 VC:87433 page 1: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:112350266 VC:43717 page 2: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:52501154 VC:21859 page 3: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:27725498 VC:10930 page 4: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:12181241 VC:5466 page 5: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:11005971 VC:2734 page 6: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1133237 VC:1797 page 7: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1462803 VC:899 page 8: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050967 VC:490 page 9: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1051603 VC:424 page 10: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050919 VC:378 page 11: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050487 VC:345 page 12: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1050783 VC:319 page 13: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1052303 VC:299 page 14: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1053235 VC:282 page 15: DLE:RLE RLE:BIT_PACKED VLE:PLAIN SZ:1055979 VC:268 The column has a varchar, and the size varies from 2 bytes to 5000 bytes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
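For what it's worth, the SZ/VC figures in the dump above can be turned into an average value size per page. A minimal sketch (the numbers are copied from the dump; nothing here is Drill code):

```python
# Hedged sketch: derive bytes-per-value for the first pages in the dump above.
# SZ is the uncompressed page size, VC the value count per page.
pages = [
    (215243750, 87433),   # page 0
    (112350266, 43717),   # page 1
    (52501154, 21859),    # page 2
]

def avg_value_size(size_bytes, value_count):
    """Average uncompressed bytes per value on a page."""
    return size_bytes / value_count

for i, (sz, vc) in enumerate(pages):
    print(f"page {i}: {avg_value_size(sz, vc):.0f} bytes/value")
```

Each page averages roughly 2.4 KB per value, consistent with varchar values between 2 and 5000 bytes, which suggests the oversized first page comes from accumulating too many values before a size check rather than from unusually large individual values.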
[jira] [Resolved] (DRILL-6176) Drill skips a row when querying a text file but does not report it.
[ https://issues.apache.org/jira/browse/DRILL-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-6176. --- Resolution: Not A Problem > Drill skips a row when querying a text file but does not report it. > --- > > Key: DRILL-6176 > URL: https://issues.apache.org/jira/browse/DRILL-6176 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.12.0 > Reporter: Robert Hou >Assignee: Pritesh Maker >Priority: Critical > Attachments: 10.tbl > > > I tried to query 10 rows from a tbl file. It skipped the 6th row, which only > has special symbols in it. So it shows 9 rows. And there was no warning > that a row is skipped. > I checked the special symbols. The same symbols appear in other rows. > This also occurs if the file is a csv file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [ANNOUNCE] New Committer: Kunal Khatua
Congrats Kunal! --Robert From: Robert Wu Sent: Wednesday, February 28, 2018 10:50 AM To: dev@drill.apache.org Subject: RE: [ANNOUNCE] New Committer: Kunal Khatua Congratulations, Kunal! Best regards, Rob -Original Message- From: Vitalii Diravka [mailto:vitalii.dira...@gmail.com] Sent: Wednesday, February 28, 2018 10:48 AM To: dev@drill.apache.org Subject: Re: [ANNOUNCE] New Committer: Kunal Khatua Congrats, Kunal! Kind regards Vitalii On Wed, Feb 28, 2018 at 6:39 PM, Timothy Farkas wrote: > Congrats! > > > From: Paul Rogers > Sent: Wednesday, February 28, 2018 9:58:32 AM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New Committer: Kunal Khatua > > Congrats, Kunal! Well deserved. > > - Paul > > > > On Feb 27, 2018, at 10:42 AM, Prasad Nagaraj Subramanya < > prasadn...@gmail.com> wrote: > > > > Congratulations Kunal! > > > > > > On Tue, Feb 27, 2018 at 10:41 AM, Padma Penumarthy > > > > > wrote: > > > >> Congratulations Kunal ! > >> > >> Thanks > >> Padma > >> > >> > >>> On Feb 27, 2018, at 8:42 AM, Aman Sinha wrote: > >>> > >>> The Project Management Committee (PMC) for Apache Drill has > >>> invited > Kunal > >>> Khatua to become a committer, and we are pleased to announce that > >>> he has accepted. > >>> > >>> Over the last couple of years, Kunal has made substantial > >>> contributions > >> to > >>> the process of creating and interpreting of query profiles, among > >>> other code contributions. He has led the efforts for Drill > >>> performance > >> evaluation > >>> and benchmarking. He is a prolific writer on the user mailing > >>> list, providing detailed responses. > >>> > >>> Welcome Kunal, and thank you for your contributions. Keep up the > >>> good work ! > >>> > >>> - Aman > >>> (on behalf of the Apache Drill PMC) > >> > >> > >
[jira] [Created] (DRILL-6178) Drill does not project extra columns in some cases
Robert Hou created DRILL-6178: - Summary: Drill does not project extra columns in some cases Key: DRILL-6178 URL: https://issues.apache.org/jira/browse/DRILL-6178 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker Attachments: 10.tbl Drill is supposed to project extra columns as null columns. This table has 10 columns. The extra columns are shown as null: {noformat} 0: jdbc:drill:zk=10.10.104.85:5181> select columns[0], columns[3], columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], columns[11], columns[12], columns[13], columns[14], columns[15] from `resource-manager/1.tbl`; +-+-+-+-+-+-+-+-+-+-+--+--+--+--+ | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 | EXPR$4 | EXPR$5 | EXPR$6 | EXPR$7 | EXPR$8 | EXPR$9 | EXPR$10 | EXPR$11 | EXPR$12 | EXPR$13 | +-+-+-+-+-+-+-+-+-+-+--+--+--+--+ | 1 | | null | null | null | null | -61 | -255.0 | null | null | null | null | null | null | +-+-+-+-+-+-+-+-+-+-+--+--+--+--+{noformat} If I run the same query against a table with 10 rows and 10 columns (attached to the Jira), only the 10 columns are shown. {noformat} select columns[0], columns[1], columns[2], columns[3], columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], columns[11], columns[12], columns[13], columns[14], columns[15] from `10.tbl`{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
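The expected projection semantics described above (columns beyond a row's width padded with null rather than dropped) can be sketched as follows; this models the intended behavior only, not Drill's implementation:

```python
# Hedged sketch: selecting columns[i] past the end of a row should
# project NULL (None here), never silently drop the column.
def project(row, indices):
    return [row[i] if i < len(row) else None for i in indices]

row = ["1", "a", "b"]               # a 3-column row, for illustration
print(project(row, [0, 2, 5]))      # index 5 is past the end -> None
```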
[jira] [Created] (DRILL-6176) Drill skips a row when querying a text file but does not report it.
Robert Hou created DRILL-6176: - Summary: Drill skips a row when querying a text file but does not report it. Key: DRILL-6176 URL: https://issues.apache.org/jira/browse/DRILL-6176 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker I tried to query 10 rows from a tbl file. It skipped the 6th row, which only has special symbols in it. So it shows 9 rows. And there was no warning that a row is skipped. I checked the special symbols. The same symbols appear in other rows. This also occurs if the file is a csv file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6165) Drill should support versioning between Drill clients (JDBC/ODBC) and Drill server
Robert Hou created DRILL-6165: - Summary: Drill should support versioning between Drill clients (JDBC/ODBC) and Drill server Key: DRILL-6165 URL: https://issues.apache.org/jira/browse/DRILL-6165 Project: Apache Drill Issue Type: Bug Components: Client - JDBC, Client - ODBC Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Pritesh Maker We need to determine which versions of JDBC/ODBC drivers can be used with which versions of Drill server. Due to recent improvements in security, a newer client had problems working with an older server. The current solution is to require drill clients and drill servers to be the same version. In some cases, different versions of drill clients can work with different versions of drill servers, but this compatibility is being determined on a version-by-version, feature-by-feature basis. We need an architecture that enables this to work automatically. In particular, if a new drill client requests a feature that the older drill server does not support, this should be handled gracefully without returning an error. This also has an impact on QA resources. We recently had a customer issue that needed to be fixed on three different Drill server releases, so three new drivers had to be created and tested. Note that drill clients and drill servers can be on different versions for various reasons: 1) A user may need to access different drill servers. They can only have one version of the drill client installed on their machine. 2) Many users may need to access the same drill server. Some users may have one version of the drill client installed while other users may have a different version of the drill client installed. In a large customer installation, it is difficult to get all users to upgrade their drill client at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
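The version negotiation proposed above could look roughly like this: client and server exchange versions at connection time, and each feature is gated on the minimum server version that supports it, so a newer client degrades gracefully instead of erroring. Feature names and version numbers here are invented for illustration:

```python
# Hedged sketch of feature gating by negotiated server version.
# The feature table is hypothetical, not Drill's actual capability list.
SERVER_FEATURES = {
    "sasl_encryption": (1, 10),      # feature -> minimum (major, minor)
    "prepared_statements": (1, 8),
}

def feature_supported(feature, server_version):
    """True if the connected server is new enough for the feature."""
    minimum = SERVER_FEATURES.get(feature)
    return minimum is not None and server_version >= minimum

# A 1.9 server: the client should skip encryption, not fail the query.
print(feature_supported("sasl_encryption", (1, 9)))
print(feature_supported("prepared_statements", (1, 9)))
```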
[jira] [Created] (DRILL-6134) Many Drill queries fail when using JDBC Driver from Simba
Robert Hou created DRILL-6134: - Summary: Many Drill queries fail when using JDBC Driver from Simba Key: DRILL-6134 URL: https://issues.apache.org/jira/browse/DRILL-6134 Project: Apache Drill Issue Type: Bug Reporter: Robert Hou Assignee: Pritesh Maker Here is an example: Query: /root/drillAutomation/framework-master/framework/resources/Functional/limit0/union/data/union_51.q {noformat} (SELECT c2 FROM `union_01_v` ORDER BY c5 DESC nulls first) UNION (SELECT c2 FROM `union_02_v` ORDER BY c5 ASC nulls first){noformat} This is the error: {noformat} Exception: java.sql.SQLException: [JDBC Driver]The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 . at com.google.common.base.Preconditions.checkArgument(Preconditions.java:145) at org.apache.drill.exec.vector.BigIntVector.load(BigIntVector.java:287) at org.apache.drill.exec.vector.NullableBigIntVector.load(NullableBigIntVector.java:274) at org.apache.drill.exec.record.RecordBatchLoader.load(RecordBatchLoader.java:131) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doLoadRecordBatchData(Unknown Source) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.hasMoreRows(Unknown Source) at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doMoveToNextRow(Unknown Source) at com.mapr.drill.jdbc.common.CommonResultSet.moveToNextRow(Unknown Source) at com.mapr.drill.jdbc.common.SForwardResultSet.next(Unknown Source) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:255) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 . ... 16 more{noformat} The commit that causes these errors to occur is: {noformat} https://issues.apache.org/jira/browse/DRILL-6049 Rollup of hygiene changes from "batch size" project commit ID e791ed62b1c91c39676c4adef438c689fd84fd4b{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
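As a sanity check on the numbers in the error message: for a BIGINT values vector, value_count 18 at 8 bytes per value gives exactly the reported buffer_length of 144, so the values buffer itself is internally consistent and the mismatch is presumably in the vector structure ($bits$/$values$ layout) rather than the byte counts. A minimal sketch of that arithmetic (not Drill's actual load-time check):

```python
# Hedged sketch: fixed-width vector buffer size = width * value_count.
# The type-width table is a small illustrative subset.
TYPE_WIDTH = {"BIGINT": 8, "INT": 4, "FLOAT8": 8}  # bytes per value

def values_buffer_bytes(minor_type, value_count):
    return TYPE_WIDTH[minor_type] * value_count

# The values from the error above: BIGINT, value_count=18 -> 144 bytes.
print(values_buffer_bytes("BIGINT", 18))
```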
Re: [ANNOUNCE] New PMC member: Paul Rogers
Congratulations, Paul! --Robert From: Abhishek Girish Sent: Tuesday, January 30, 2018 9:31 PM To: dev@drill.apache.org Subject: Re: [ANNOUNCE] New PMC member: Paul Rogers Congratulations, Paul! On Tue, Jan 30, 2018 at 2:48 PM, Sorabh Hamirwasia wrote: > Congratulations Paul! > > > Thanks, > Sorabh > > > From: AnilKumar B > Sent: Tuesday, January 30, 2018 2:43:07 PM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New PMC member: Paul Rogers > > Congratulations, Paul. > > Thanks & Regards, > B Anil Kumar. > > On Tue, Jan 30, 2018 at 2:34 PM, Chunhui Shi wrote: > > > Congrats Paul! Well deserved! > > > > > > From: Kunal Khatua > > Sent: Tuesday, January 30, 2018 2:05:56 PM > > To: dev@drill.apache.org > > Subject: RE: [ANNOUNCE] New PMC member: Paul Rogers > > > > Congratulations, Paul ! > > > > -Original Message- > > From: salim achouche [mailto:sachouc...@gmail.com] > > Sent: Tuesday, January 30, 2018 2:00 PM > > To: dev@drill.apache.org; Padma Penumarthy > > Subject: Re: [ANNOUNCE] New PMC member: Paul Rogers > > > > Congrats Paul! > > > > Regards, > > Salim > > > > > On Jan 30, 2018, at 1:58 PM, Padma Penumarthy > > wrote: > > > > > > Congratulations Paul. > > > > > > Thanks > > > Padma > > > > > > > > >> On Jan 30, 2018, at 1:55 PM, Gautam Parai wrote: > > >> > > >> Congratulations Paul! > > >> > > >> > > >> From: Timothy Farkas > > >> Sent: Tuesday, January 30, 2018 1:54:43 PM > > >> To: dev@drill.apache.org > > >> Subject: Re: [ANNOUNCE] New PMC member: Paul Rogers > > >> > > >> Congrats! > > >> > > >> > > >> From: Aman Sinha > > >> Sent: Tuesday, January 30, 2018 1:50:07 PM > > >> To: dev@drill.apache.org > > >> Subject: [ANNOUNCE] New PMC member: Paul Rogers > > >> > > >> I am pleased to announce that Drill PMC invited Paul Rogers to the > > >> PMC and he has accepted the invitation. > > >> > > >> Congratulations Paul and thanks for your contributions ! > > >> > > >> -Aman > > >> (on behalf of Drill PMC) > > > > > > > >
[jira] [Created] (DRILL-6078) Query with INTERVAL in predicate does not return any rows
Robert Hou created DRILL-6078: - Summary: Query with INTERVAL in predicate does not return any rows Key: DRILL-6078 URL: https://issues.apache.org/jira/browse/DRILL-6078 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.12.0 Reporter: Robert Hou Assignee: Chunhui Shi This query does not return any rows when accessing MapR DB tables. SELECT C.C_CUSTKEY, C.C_NAME, SUM(L.L_EXTENDEDPRICE * (1 - L.L_DISCOUNT)) AS REVENUE, C.C_ACCTBAL, N.N_NAME, C.C_ADDRESS, C.C_PHONE, C.C_COMMENT FROM customer C, orders O, lineitem L, nation N WHERE C.C_CUSTKEY = O.O_CUSTKEY AND L.L_ORDERKEY = O.O_ORDERKEY AND O.O_ORDERDate >= DATE '1994-03-01' AND O.O_ORDERDate < DATE '1994-03-01' + INTERVAL '3' MONTH AND L.L_RETURNFLAG = 'R' AND C.C_NATIONKEY = N.N_NATIONKEY GROUP BY C.C_CUSTKEY, C.C_NAME, C.C_ACCTBAL, C.C_PHONE, N.N_NAME, C.C_ADDRESS, C.C_COMMENT ORDER BY REVENUE DESC LIMIT 20 This query works against JSON tables. It should return 20 rows. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
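For reference, the predicate's upper bound DATE '1994-03-01' + INTERVAL '3' MONTH should evaluate to 1994-06-01, so the query selects orders in [1994-03-01, 1994-06-01). A stdlib sketch of that arithmetic (no Drill code involved):

```python
# Hedged sketch: month addition per standard SQL interval semantics,
# implemented with the Python stdlib for illustration.
from datetime import date

def add_months(d, months):
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    return d.replace(year=year, month=month)

start = date(1994, 3, 1)
end = add_months(start, 3)
print(end.isoformat())  # prints 1994-06-01
```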
Re: [ANNOUNCE] New Committer: Boaz Ben-Zvi
Congratulations, Boaz! --Robert From: Paul Rogers Sent: Wednesday, December 13, 2017 11:02 AM To: dev@drill.apache.org Subject: Re: [ANNOUNCE] New Committer: Boaz Ben-Zvi Congrats! Well deserved. - Paul > On Dec 13, 2017, at 11:00 AM, Timothy Farkas wrote: > > Congrats! > > > From: Kunal Khatua > Sent: Wednesday, December 13, 2017 10:47:14 AM > To: dev@drill.apache.org > Subject: RE: [ANNOUNCE] New Committer: Boaz Ben-Zvi > > Congratulations, Boaz!! > > -Original Message- > From: Abhishek Girish [mailto:agir...@apache.org] > Sent: Wednesday, December 13, 2017 10:25 AM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New Committer: Boaz Ben-Zvi > > Congratulations Boaz! > On Wed, Dec 13, 2017 at 10:23 AM Aman Sinha wrote: > >> The Project Management Committee (PMC) for Apache Drill has invited >> Boaz Ben-Zvi to become a committer, and we are pleased to announce >> that he has accepted. >> >> Boaz has been an active contributor to Drill for more than a year. >> He designed and implemented the Hash Aggregate spilling and is leading >> the efforts for Hash Join spilling. >> >> Welcome Boaz, and thank you for your contributions. Keep up the good >> work ! >> >> - Aman >> (on behalf of the Apache Drill PMC) >>
Re: [ANNOUNCE] New Committer: Vitalii Diravka
Congratulations! --Robert From: Paul Rogers Sent: Sunday, December 10, 2017 4:29 PM To: dev@drill.apache.org Subject: Re: [ANNOUNCE] New Committer: Vitalii Diravka Congrats! Well deserved. - Paul > On Dec 10, 2017, at 3:16 PM, AnilKumar B wrote: > > Congratulations Vitalii > > Thanks & Regards, > B Anil Kumar. > > On Sun, Dec 10, 2017 at 3:12 PM, rahul challapalli < > challapallira...@gmail.com> wrote: > >> Congratulations Vitalii! >> >> On Sun, Dec 10, 2017 at 3:05 PM, Kunal Khatua wrote: >> >>> Congratulations!! >>> >>> -Original Message- >>> From: Aman Sinha [mailto:amansi...@apache.org] >>> Sent: Sunday, December 10, 2017 11:06 AM >>> To: dev@drill.apache.org >>> Subject: [ANNOUNCE] New Committer: Vitalii Diravka >>> >>> The Project Management Committee (PMC) for Apache Drill has invited >>> Vitalii Diravka to become a committer, and we are pleased to announce >> that >>> he has accepted. >>> >>> Vitalii has been an active contributor to Drill over the last 1 1/2 >> years. >>> His contributions have spanned areas such as: CASTing issues with >>> Date/Timestamp, Parquet metadata and SQL enhancements, among others. >>> >>> Welcome Vitalii, and thank you for your contributions. Keep up the good >>> work ! >>> >>> - Aman >>> (on behalf of the Apache Drill PMC) >>> >>
[jira] [Resolved] (DRILL-5898) Query returns columns in the wrong order
[ https://issues.apache.org/jira/browse/DRILL-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5898. --- Resolution: Fixed Updated expected results file. > Query returns columns in the wrong order > > > Key: DRILL-5898 > URL: https://issues.apache.org/jira/browse/DRILL-5898 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 > Reporter: Robert Hou >Assignee: Robert Hou >Priority: Blocker > Fix For: 1.12.0 > > > This is a regression. It worked with this commit: > {noformat} > f1d1945b3772bb782039fd6811e34a7de66441c8 DRILL-5582: C++ Client: [Threat > Modeling] Drillbit may be spoofed by an attacker and this may lead to data > being written to the attacker's target instead of Drillbit > {noformat} > It fails with this commit, although there are six commits total between the > last good one and this one: > {noformat} > b0c4e0486d6d4620b04a1bb8198e959d433b4840 DRILL-5876: Use openssl profile > to include netty-tcnative dependency with the platform specific classifier > {noformat} > Query is: > {noformat} > select * from > dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where > dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, > l_extendedprice limit 10 > {noformat} > Columns are returned in a different order. Here are the expected results: > {noformat} > foxes. 
furiously final ideas cajol1994-05-27 0.071731.42 4 > F 653442 4965666.0 1.0 1994-06-23 A 1994-06-22 > NONESHIP215671 0.07200612 15 (1 time(s)) > lly final account 1994-11-09 0.0745881.783 F > 653412 1.320809E7 46.01994-11-24 R 1994-11-08 TAKE > BACK RETURNREG AIR 458104 0.08200612 15 (1 time(s)) > the asymptotes 1997-12-29 0.0760882.8 6 O 653413 > 1.4271413E7 44.01998-02-04 N 1998-01-20 DELIVER IN > PERSON MAIL21456 0.05200612 15 (1 time(s)) > carefully a 1996-09-23 0.075381.88 2 O 653378 > 1.6702792E7 3.0 1996-11-14 N 1996-10-15 NONEREG > AIR 952809 0.05200612 15 (1 time(s)) > ly final requests. boldly ironic theo 1995-09-04 0.072019.94 2 > O 653380 2416094.0 2.0 1995-11-14 N 1995-10-18 > COLLECT COD FOB 166101 0.02200612 15 (1 time(s)) > alongside of the even, e 1996-02-14 0.0786140.322 > O 653409 5622872.0 48.01996-05-02 N 1996-04-22 > NONESHIP372888 0.04200612 15 (1 time(s)) > es. regular instruct 1996-10-18 0.0725194.0 1 O 653382 > 6048060.0 25.01996-08-29 N 1996-08-20 DELIVER IN > PERSON AIR 798079 0.0 200612 15 (1 time(s)) > en package1993-09-19 0.0718718.322 F 653440 > 1.372054E7 12.01993-09-12 A 1993-09-09 DELIVER IN > PERSON TRUCK 970554 0.0 200612 15 (1 time(s)) > ly regular deposits snooze. unusual, even 1998-01-18 0.07 > 12427.921 O 653413 2822631.0 8.0 1998-02-09 > N 1998-02-05 TAKE BACK RETURNREG AIR 322636 0.01 > 200612 15 (1 time(s)) > ironic ideas. bra1996-10-13 0.0764711.533 O > 653383 6806672.0 41.01996-12-06 N 1996-11-10 TAKE > BACK RETURNAIR 556691 0.01200612 15 (1 time(s)) > {noformat} > Here are the actual results: > {noformat} > 2006 12 15 653383 6806672 556691 3 41.064711.53 > 0.070.01N O 1996-11-10 1996-10-13 1996-12-06 > TAKE BACK RETURNAIR ironic ideas. bra > 2006 12 15 653378 16702792952809 2 3.0 5381.88 > 0.070.05N O 1996-10-15 1996-09-23 1996-11-14 > NONEREG AIR carefully a > 2006 12 15 653380 2416094 166101 2 2.0 2019.94 0.07 > 0.02N O 1995-10-18 1995-09-04 1995-11-14 > COLLECT COD FOB ly final requests. 
boldly ironic theo > 2006 12 15 653413 2822631 322636 1 8.0 12427.92 > 0.070.01
[jira] [Created] (DRILL-5908) Regression: Query intermittently may fail with error "Waited for 15000ms, but tasks for 'Get block maps' are not complete."
Robert Hou created DRILL-5908: - Summary: Regression: Query intermittently may fail with error "Waited for 15000ms, but tasks for 'Get block maps' are not complete." Key: DRILL-5908 URL: https://issues.apache.org/jira/browse/DRILL-5908 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Pritesh Maker This is from the Functional-Baseline-88.193 Jenkins run. The test is in the Functional test suite, partition_pruning/dfs/csv/plan/csvselectpartormultiplewithdir_MD-185.q Query is: {noformat} explain plan for select columns[0],columns[1],columns[4],columns[10],columns[13],dir0 from `/drill/testdata/partition_pruning/dfs/lineitempart` where (dir0=1993 and columns[0]>29600) or (dir0=1994 and columns[0]>29700) {noformat} The error is: {noformat} Failed with exception java.sql.SQLException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Get block maps' are not complete. Total runnable size 2, parallelism 2. 
[Error Id: ab911277-36cb-465c-a9aa-8e3d21bcc09c on atsqa4-195.qa.lab:31010] at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:224) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:136) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:748) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Get block maps' are not complete. Total runnable size 2, parallelism 2. 
[Error Id: ab911277-36cb-465c-a9aa-8e3d21bcc09c on atsqa4-195.qa.lab:31010] at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:465) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:102) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandler
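The failure mode above ("Waited for 15000ms, but tasks for 'Get block maps' are not complete") is a bounded wait over a pool of parallel tasks that gives up at a deadline and reports the outstanding count. A minimal sketch of that pattern (timeout and task counts are illustrative, not Drill's TimedRunnable):

```python
# Hedged sketch: run tasks in a pool, give up after a fixed wait, and
# report how many were still outstanding -- mirroring the error text above.
from concurrent.futures import ThreadPoolExecutor, wait

def run_with_deadline(tasks, parallelism, timeout_s):
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(t) for t in tasks]
        done, not_done = wait(futures, timeout=timeout_s)
        if not_done:
            raise TimeoutError(
                f"Waited for {int(timeout_s * 1000)}ms, but {len(not_done)} "
                f"of {len(tasks)} tasks are not complete.")
        return [f.result() for f in done]

# Two fast tasks finish well inside the deadline:
print(sorted(run_with_deadline([lambda: 1, lambda: 2],
                               parallelism=2, timeout_s=15)))
```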
[jira] [Resolved] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs
[ https://issues.apache.org/jira/browse/DRILL-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5901. --- Resolution: Not A Bug This is a bug in the Drill Test Framework, not in Drill itself. > Drill test framework can have successful run even if a random failure occurs > > > Key: DRILL-5901 > URL: https://issues.apache.org/jira/browse/DRILL-5901 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.11.0 >Reporter: Robert Hou > > From Jenkins: > http://10.10.104.91:8080/view/Nightly/job/TPCH-SF100-baseline/574/console > Random Failures: > /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql > Query: > SELECT > SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY > FROM > lineitem L, > part P > WHERE > P.P_PARTKEY = L.L_PARTKEY > AND P.P_BRAND = 'BRAND#13' > AND P.P_CONTAINER = 'JUMBO CAN' > AND L.L_QUANTITY < ( > SELECT > 0.2 * AVG(L2.L_QUANTITY) > FROM > lineitem L2 > WHERE > L2.L_PARTKEY = P.P_PARTKEY > ) > Failed with exception > java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked > by query. Memory leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > (java.lang.IllegalStateException) Memory was leaked by query. 
Memory > leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > org.apache.drill.exec.memory.BaseAllocator.close():519 > org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 > org.apache.drill.exec.ops.OperatorContextImpl.close():108 > org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 > org.apache.drill.exec.ops.FragmentContext.close():424 > > org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():744 > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) > at > org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at > oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) > at > org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) > at > oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) > at > oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) > at > oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) > at > org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) > at > org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206) > at > 
org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: > SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory > leaked: (2097152) > Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 > (res/actual/peak/limit) > Fragment 8:2 > [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] > (java.lang.Illeg
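The resolution above places the bug in the Drill Test Framework rather than Drill: a run should report overall failure whenever any test fails, including "random" failures, instead of reporting success. A hedged sketch of that aggregation (names and statuses are invented):

```python
# Hedged sketch: a run is only SUCCESS if every test passed; random
# failures count as failures too. Not the framework's actual code.
def run_status(results):
    """results: mapping of test name -> 'pass' | 'fail' | 'random-fail'."""
    failed = [name for name, r in results.items() if r != "pass"]
    return ("FAILURE", failed) if failed else ("SUCCESS", [])

status, failed = run_status({"query17.sql": "random-fail",
                             "query18.sql": "pass"})
print(status, failed)
```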
[jira] [Created] (DRILL-5903) Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."
Robert Hou created DRILL-5903: - Summary: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete." Key: DRILL-5903 URL: https://issues.apache.org/jira/browse/DRILL-5903 Project: Apache Drill Issue Type: Bug Components: Metadata, Storage - Parquet Affects Versions: 1.11.0 Reporter: Robert Hou Priority: Critical Query is: {noformat} select a.int_col, b.date_col from dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a inner join ( select date_col, int_col from dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as date)= date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by a.int_col, b.date_col {noformat} From drillbit.log: {noformat} fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM vwOnParq_wCst_35 2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3. 2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3. org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, parallelism 3. 
[Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:227) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:190) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at 
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:342) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:241) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.j
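The pattern behind this error — submitting metadata-fetch work to a small pool and giving up after a fixed wait — can be sketched with the standard JDK executor API. This is a generic illustration of the technique, not Drill's actual `TimedRunnable` code; the class and method names below are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TimedFetch {
    // Run tasks with a hard deadline and report how many finished in time.
    // Tasks still running when the deadline passes are cancelled by invokeAll.
    public static int runWithDeadline(List<Callable<String>> tasks,
                                      long timeoutMs, int parallelism)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            int done = 0;
            for (Future<String> f : futures) {
                if (!f.isCancelled()) done++;   // cancelled == missed the deadline
            }
            if (done < tasks.size()) {
                // Analogous to Drill's "Waited for <N>ms, but tasks ... are not complete."
                System.out.printf("Waited for %dms, but %d of %d tasks are not complete.%n",
                        timeoutMs, tasks.size() - done, tasks.size());
            }
            return done;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Callable<String> fast = () -> "ok";
        Callable<String> slow = () -> { Thread.sleep(5_000); return "late"; };
        System.out.println(runWithDeadline(Arrays.asList(fast, slow), 200, 2));
    }
}
```

With a 15-second budget and three runnable fetches at parallelism 3, one slow file (or a stalled filesystem call) is enough to push the whole batch past the deadline, which matches the symptom in the log.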
[jira] [Created] (DRILL-5902) Regression: Queries encounter random failure due to RPC connection timed out
Robert Hou created DRILL-5902:
-
Summary: Regression: Queries encounter random failure due to RPC connection timed out
Key: DRILL-5902
URL: https://issues.apache.org/jira/browse/DRILL-5902
Project: Apache Drill
Issue Type: Bug
Components: Execution - RPC
Affects Versions: 1.11.0
Reporter: Robert Hou
Priority: Critical

Multiple random failures (25) occurred with the latest Functional-Baseline-88.193 run. Here is a sample query:
{noformat}
-- Kitchen sink
-- Use all supported functions
select rank() over W, dense_rank() over W, percent_rank() over W, cume_dist() over W,
       avg(c_integer + c_integer) over W, sum(c_integer/100) over W, count(*) over W,
       min(c_integer) over W, max(c_integer) over W, row_number() over W
from j7
where c_boolean is not null
window W as (partition by c_bigint, c_date, c_time, c_boolean order by c_integer)
{noformat}

From the logs:
{noformat}
2017-10-23 04:14:36,536 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. 2017-10-23 04:14:36,538 [BitServer-7] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request for early fragment termination for path 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable. {noformat} {noformat} 2017-10-23 04:14:53,941 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> /10.10.88.193:38281 (user server) timed out. Timeout was set to 30 seconds. Closing connection. 2017-10-23 04:14:53,952 [UserServer-1] INFO o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> FAILED 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> FINISHED 2017-10-23 04:14:53,956 [UserServer-1] WARN o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc response. 
java.lang.IllegalArgumentException: Self-suppression not permitted at java.lang.Throwable.addSuppressed(Throwable.java:1043) ~[na:1.7.0_45] at org.apache.drill.common.DeferredException.addException(DeferredException.java:88) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:413) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.access$700(FragmentExecutor.java:55) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:427) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:213) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.dri
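The secondary `IllegalArgumentException: Self-suppression not permitted` in this trace is JDK-specified behavior: `Throwable.addSuppressed` rejects the case where an exception is asked to suppress itself, which can happen when the same throwable instance is funneled into a deferred-exception collector twice. A minimal demonstration:

```java
public class SelfSuppression {
    // Returns true if addSuppressed rejects self-suppression, as the JDK
    // specifies: a throwable may not suppress itself.
    public static boolean rejectsSelf() {
        RuntimeException e = new RuntimeException("rpc timeout");
        try {
            e.addSuppressed(e);       // same instance on both sides
            return false;
        } catch (IllegalArgumentException expected) {
            return true;              // message is "Self-suppression not permitted"
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsSelf());
    }
}
```

So the warning "Failure while attempting to fail rpc response" is the side effect of the real RPC-timeout failure being recorded against itself, not an independent bug.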
[jira] [Created] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs
Robert Hou created DRILL-5901:
-
Summary: Drill test framework can have successful run even if a random failure occurs
Key: DRILL-5901
URL: https://issues.apache.org/jira/browse/DRILL-5901
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Affects Versions: 1.11.0
Reporter: Robert Hou

Random Failures:
/root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql

Query:
SELECT SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM lineitem L, part P
WHERE P.P_PARTKEY = L.L_PARTKEY
AND P.P_BRAND = 'BRAND#13'
AND P.P_CONTAINER = 'JUMBO CAN'
AND L.L_QUANTITY < (SELECT 0.2 * AVG(L2.L_QUANTITY) FROM lineitem L2 WHERE L2.L_PARTKEY = P.P_PARTKEY)

Failed with exception:
java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152)
Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit)
Fragment 8:2
[Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
(java.lang.IllegalStateException) Memory was leaked by query.
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) org.apache.drill.exec.memory.BaseAllocator.close():519 org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 org.apache.drill.exec.ops.OperatorContextImpl.close():108 org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 org.apache.drill.exec.ops.FragmentContext.close():424 org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():744 at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489) at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110) at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) Fragment 8:2 [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] (java.lang.IllegalStateException) Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) org.apache.drill.exec.memory.BaseAllocator.close():519 org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86 org.apache.drill.exec.ops.OperatorContextImpl.close():108 org.apache.drill.exec.ops.FragmentContext.suppressingClose():435 org.apache.drill.exec.ops.FragmentContext.close():424 org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155 org.apache.drill.exec.work.fragment.FragmentExecutor.run():267 org.apache.drill.common.SelfCleaningRunnable.r
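The fix for an issue like this is, in essence, to fold every per-test outcome into the run's final verdict so that one random failure fails the whole run. A tiny sketch of that aggregation (illustrative names, not the Drill test framework's actual classes):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RunStatus {
    // Aggregate per-test outcomes into a single run verdict. The bug here
    // is, in effect, a run reported as passing even though one randomly
    // scheduled test threw; the fix is to OR the failures together.
    public static int exitCode(Map<String, Boolean> passedByTest) {
        for (Boolean passed : passedByTest.values()) {
            if (!Boolean.TRUE.equals(passed)) return 1; // any failure fails the run
        }
        return 0;
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("query17.sql", false);   // the random memory-leak failure
        results.put("query18.sql", true);
        System.out.println(exitCode(results));
    }
}
```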
[jira] [Created] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
Robert Hou created DRILL-5900:
-
Summary: Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query
Key: DRILL-5900
URL: https://issues.apache.org/jira/browse/DRILL-5900
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou
Assignee: Pritesh Maker
Priority: Blocker

This is a random failure. This test has passed before. TPCH query 17:
{noformat}
SELECT SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM lineitem L, part P
WHERE P.P_PARTKEY = L.L_PARTKEY
AND P.P_BRAND = 'BRAND#13'
AND P.P_CONTAINER = 'JUMBO CAN'
AND L.L_QUANTITY < (SELECT 0.2 * AVG(L2.L_QUANTITY) FROM lineitem L2 WHERE L2.L_PARTKEY = P.P_PARTKEY)
{noformat}

Error is:
{noformat}
2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit)
Fragment 8:2
[Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query.
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) Fragment 8:2 [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
Memory leaked: (2097152) Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 (res/actual/peak/limit) at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] ... 5 common frames omitted 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: State to report: FINISHED {noformat} sys.version is: 1.12.0-SNAPSHOT b0c4e0486d6d4620b04a1bb8198e959d433b4840DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier 20.10.2017 @ 16:52:35 PDT The previous version that ran clean is this commit: {noformat} 1.12.0-SNAPSHOT f1d1945b3772bb782039fd6811e34a7de66441c8DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit 19.10.2017 @ 17:13:05 PDT {noformat} But since the failure is random, the problem could have been introduced earlier. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
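The "Memory was leaked by query" failure comes from the allocator's close path: closing an allocator that still has outstanding bytes is treated as a query bug and thrown, rather than silently reclaimed. A stripped-down sketch of that shape (illustrative, not `BaseAllocator` itself):

```java
public class LeakCheckingAllocator implements AutoCloseable {
    private final String name;
    private long allocated;

    public LeakCheckingAllocator(String name) { this.name = name; }

    public void allocate(long bytes) { allocated += bytes; }

    public void release(long bytes)  { allocated -= bytes; }

    // Mirrors the shape of the close() check in the stack trace above:
    // outstanding buffers at close time surface as an IllegalStateException.
    @Override
    public void close() {
        if (allocated != 0) {
            throw new IllegalStateException(
                "Memory was leaked by query. Memory leaked: (" + allocated
                + ") Allocator(" + name + ")");
        }
    }
}
```

In this bug, the operator named in the message (`op:8:2:6:ParquetRowGroupScan`) closed with 2097152 bytes (one 2 MiB buffer) still allocated.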
[jira] [Created] (DRILL-5898) Query returns columns in the wrong order
Robert Hou created DRILL-5898: - Summary: Query returns columns in the wrong order Key: DRILL-5898 URL: https://issues.apache.org/jira/browse/DRILL-5898 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Vitalii Diravka Priority: Blocker Fix For: 1.12.0 This is a regression. It worked with this commit: {noformat} f1d1945b3772bb782039fd6811e34a7de66441c8DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit {noformat} It fails with this commit, although there are six commits total between the last good one and this one: {noformat} b0c4e0486d6d4620b04a1bb8198e959d433b4840DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier {noformat} Query is: {noformat} select * from dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, l_extendedprice limit 10 {noformat} Columns are returned in a different order. Here are the expected results: {noformat} foxes. furiously final ideas cajol 1994-05-27 0.071731.42 4 F 653442 4965666.0 1.0 1994-06-23 A 1994-06-22 NONESHIP215671 0.07200612 15 (1 time(s)) lly final account 1994-11-09 0.0745881.783 F 653412 1.320809E7 46.01994-11-24 R 1994-11-08 TAKE BACK RETURNREG AIR 458104 0.08200612 15 (1 time(s)) the asymptotes 1997-12-29 0.0760882.8 6 O 653413 1.4271413E7 44.01998-02-04 N 1998-01-20 DELIVER IN PERSON MAIL21456 0.05200612 15 (1 time(s)) carefully a 1996-09-23 0.075381.88 2 O 653378 1.6702792E7 3.0 1996-11-14 N 1996-10-15 NONEREG AIR 952809 0.05200612 15 (1 time(s)) ly final requests. 
boldly ironic theo 1995-09-04 0.072019.94 2 O 653380 2416094.0 2.0 1995-11-14 N 1995-10-18 COLLECT COD FOB 166101 0.02200612 15 (1 time(s)) alongside of the even, e1996-02-14 0.0786140.322 O 653409 5622872.0 48.01996-05-02 N 1996-04-22 NONESHIP372888 0.04200612 15 (1 time(s)) es. regular instruct1996-10-18 0.0725194.0 1 O 653382 6048060.0 25.01996-08-29 N 1996-08-20 DELIVER IN PERSON AIR 798079 0.0 200612 15 (1 time(s)) en package 1993-09-19 0.0718718.322 F 653440 1.372054E7 12.01993-09-12 A 1993-09-09 DELIVER IN PERSON TRUCK 970554 0.0 200612 15 (1 time(s)) ly regular deposits snooze. unusual, even 1998-01-18 0.07 12427.921 O 653413 2822631.0 8.0 1998-02-09 N 1998-02-05 TAKE BACK RETURNREG AIR 322636 0.012006 12 15 (1 time(s)) ironic ideas. bra 1996-10-13 0.0764711.533 O 653383 6806672.0 41.01996-12-06 N 1996-11-10 TAKE BACK RETURNAIR 556691 0.01200612 15 (1 time(s)) {noformat} Here are the actual results: {noformat} 200612 15 653383 6806672 556691 3 41.064711.53 0.070.01N O 1996-11-10 1996-10-13 1996-12-06 TAKE BACK RETURNAIR ironic ideas. bra 200612 15 653378 16702792952809 2 3.0 5381.88 0.070.05N O 1996-10-15 1996-09-23 1996-11-14 NONEREG AIR carefully a 200612 15 653380 2416094 166101 2 2.0 2019.94 0.07 0.02N O 1995-10-18 1995-09-04 1995-11-14 COLLECT COD FOB ly final requests. boldly ironic theo 200612 15 653413 2822631 322636 1 8.0 12427.92 0.070.01N O 1998-02-05 1998-01-18 1998-02-09 TAKE BACK RETURNREG AIR ly regular deposits snooze. unusual, even 200612 15 653382 6048060 798079 1 25.025194.0 0.07 0.0 N O 1996-08-20 1996-10-18 1996-08-29 DELIVER IN PERSON AIR es. regular instruct 200612 15 653442 4965666 215671 4 1.0 1731.42 0.07 0.07A F 1994-06-22 1994-05-27 1994-06-23 NONE SHIPfoxes. furiously final ideas cajol 200612
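A regression like this is easiest to pin down by comparing column ordering separately from the data itself: the same column set in a different order is an order regression, not a data mismatch. A small sketch of that baseline check (hypothetical helper, not part of the Drill test framework):

```java
import java.util.HashSet;
import java.util.List;

public class ColumnOrderCheck {
    // Classify a `select *` result against the baseline: identical lists pass,
    // equal sets in a different order flag the DRILL-5898 symptom, and
    // anything else is a genuine schema difference.
    public static String diff(List<String> expected, List<String> actual) {
        if (expected.equals(actual)) return "OK";
        if (new HashSet<>(expected).equals(new HashSet<>(actual))) {
            return "ORDER_MISMATCH";
        }
        return "COLUMN_SET_MISMATCH";
    }
}
```

Here the actual results lead with the partition directories (`dir0`, `dir1`, `dir2`) before the table columns, so the check would report `ORDER_MISMATCH`.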
[jira] [Created] (DRILL-5891) When Drill runs out of memory for a HashAgg, it should tell the user how much memory to allocate
Robert Hou created DRILL-5891:
-
Summary: When Drill runs out of memory for a HashAgg, it should tell the user how much memory to allocate
Key: DRILL-5891
URL: https://issues.apache.org/jira/browse/DRILL-5891
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou
Assignee: Pritesh Maker

Query is:
select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;

Error is:
Error: RESOURCE ERROR: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit

From drillbit.log:
{noformat}
2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE o.a.d.e.p.i.aggregate.HashAggregator - Incoming sizer: Actual batch schema & sizes {
  no_nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 130, data size: 132892)
  nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 112, data size: 113673)
  EXPR$0(type: REQUIRED BIGINT, count: 1023, std size: 8, actual size: 8, data size: 8184)
  EXPR$1(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 18, data size: 18414)
  Records: 1023, Total size: 524288, Data size: 273163, Gross row width: 513, Net row width: 268, Density: 53%}
2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 2nd phase.
Estimated internal row width: 166 Values row width: 66 batch size: 12779520 memory limit: 63161283 max column width: 50 2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:3:2] TRACE o.a.d.e.p.impl.common.HashTable - HT allocated 4784128 for varchar of max width 50 2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:15] INFO o.a.d.e.p.i.aggregate.HashAggregator - User Error Occurred: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Not enough memory for internal partitioning and fallback mechanism for HashAgg to use unbounded memory is disabled. Either enable fallback config drill.exec.hashagg.fallback.enabled using Alter session/system command or increase memory limit for Drillbit {noformat} I would recommend that we add a log message with the "alter" command to increase the amount of memory allocated, and how much memory to allocate. Otherwise, the user may not know what to do. I would also not suggest enabling "drill.exec.hashagg.fallback.enabled" except as a last resort. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
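The recommendation above amounts to: when the HashAgg's requirement exceeds its limit, compute a concrete new value and print the exact `ALTER` command. A sketch of how such a message could be built (hypothetical helper and rounding policy, not Drill's actual code):

```java
public class HashAggAdvice {
    // Given what the HashAgg estimated it needed versus what it was given,
    // suggest a concrete setting, rounding the requirement up to a whole MiB.
    public static String advise(long neededBytes, long limitBytes) {
        if (neededBytes <= limitBytes) return "no change needed";
        long mib = (neededBytes + (1L << 20) - 1) >> 20;   // ceil to MiB
        return "ALTER SESSION SET `planner.memory.max_query_memory_per_node` = "
                + (mib << 20) + "  -- (" + mib + " MiB)";
    }
}
```

An actionable message like this spares the user from guessing, and keeps `drill.exec.hashagg.fallback.enabled` (unbounded memory) as the last resort the reporter suggests.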
[jira] [Created] (DRILL-5889) sqlline loses RPC connection while executing query with HashAgg
Robert Hou created DRILL-5889:
-
Summary: sqlline loses RPC connection while executing query with HashAgg
Key: DRILL-5889
URL: https://issues.apache.org/jira/browse/DRILL-5889
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou

Query is:
{noformat}
alter session set `planner.memory.max_query_memory_per_node` = 10737418240;
select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
{noformat}

Error is:
{noformat}
0: jdbc:drill:drillbit=10.10.100.190> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
Error: CONNECTION ERROR: Connection /10.10.100.190:45776 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
[Error Id: db4aea70-11e6-4e63-b0cc-13cdba0ee87a ] (state=,code=0)
{noformat}

From drillbit.log:
2017-10-18 14:04:23,044 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.100.190:31010 <--> /10.10.100.190:45776 (user server) timed out. Timeout was set to 30 seconds. Closing connection.
Plan is: {noformat} | 00-00Screen 00-01 Project(EXPR$0=[$0], EXPR$1=[$1]) 00-02UnionExchange 01-01 Project(EXPR$0=[$2], EXPR$1=[$3]) 01-02HashAgg(group=[{0, 1}], EXPR$0=[$SUM0($2)], EXPR$1=[MAX($3)]) 01-03 Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], EXPR$1=[$3]) 01-04HashToRandomExchange(dist0=[[$0]], dist1=[[$1]]) 02-01 UnorderedMuxExchange 03-01Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], EXPR$1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, hash32AsDouble($0, 1301011))]) 03-02 HashAgg(group=[{0, 1}], EXPR$0=[COUNT()], EXPR$1=[MAX($2)]) 03-03Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/hash-agg/data1]], selectionRoot=maprfs:/drill/testdata/hash-agg/data1, numFiles=1, usedMetadataFile=false, columns=[`no_nulls_col`, `nulls_col`, `filename`]]]) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
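The "(user server) timed out" line reflects a plain idle-timeout check: a connection silent for longer than the configured window is declared dead and closed, even if the drillbit is merely busy (here, a heavy HashAgg). A minimal sketch of that check (illustrative; the 30-second value is the timeout quoted in the log above):

```java
public class IdleTimeout {
    static final long TIMEOUT_MS = 30_000;   // matches "Timeout was set to 30 seconds"

    // A connection is considered dead once it has been silent for longer
    // than the timeout, which triggers the "Closing connection" path.
    public static boolean timedOut(long lastActivityMs, long nowMs) {
        return nowMs - lastActivityMs > TIMEOUT_MS;
    }
}
```

Incidentally, the session setting above (`10737418240`) is exactly 10 GiB, i.e. 10 * 2^30 bytes.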
[jira] [Resolved] (DRILL-5804) External Sort times out, may be infinite loop
[ https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5804. --- Resolution: Fixed > External Sort times out, may be infinite loop > - > > Key: DRILL-5804 > URL: https://issues.apache.org/jira/browse/DRILL-5804 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 > Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: drillbit.log > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist > ); > {noformat} > Plan is: > {noformat} > | 00-00Screen > 00-01 Project(EXPR$0=[$0]) > 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) > 00-03 UnionExchange > 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()]) > 01-02 Project($f0=[0]) > 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], > sort2=[6 ASC]) > 02-01 SelectionVectorRemover > 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], > dir1=[ASC], dir2=[ASC]) > 02-03 Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], > EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6]) > 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], > dist2=[[$6]]) > 03-01 UnorderedMuxExchange > 04-01Project(type=[$0], rptds=[$1], rms=[$2], > uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], > E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, > hash32AsDouble($4, 1301011)))]) > 04-02 Project(type=[$0], rptds=[$1], rms=[$2], > uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], > EXPR$6=[ITEM($1, 'do_not_exist')]) > 04-03Flatten(flattenField=[$1]) > 04-04 Project(type=[$0], rptds=[ITEM($2, > 'rptd')], rms=[$2], uid=[$1]) 
> 04-05SingleMergeExchange(sort0=[1 ASC]) > 05-01 SelectionVectorRemover > 05-02Sort(sort0=[$1], dir0=[ASC]) > 05-03 Project(type=[$0], uid=[$1], > rms=[$2]) > 05-04 > HashToRandomExchange(dist0=[[$1]]) > 06-01 UnorderedMuxExchange > 07-01Project(type=[$0], > uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)]) > 07-02 > Flatten(flattenField=[$2]) > 07-03Project(type=[$0], > uid=[$1], rms=[ITEM($2, 'rm')]) > 07-04 > Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/resource-manager/nested_large]], > selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, > numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]]) > {noformat} > Here is a segment of the drillbit.log, starting at line 55890: > {noformat} > 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records > 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records > 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG > o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record > batch with status OK > 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records > 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG > o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record > batch with status OK > 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG > o.a.d.e.t.g.Singl
[jira] [Created] (DRILL-5886) Operators should create batch sizes that the next operator can consume to avoid OOM
Robert Hou created DRILL-5886:
-
Summary: Operators should create batch sizes that the next operator can consume to avoid OOM
Key: DRILL-5886
URL: https://issues.apache.org/jira/browse/DRILL-5886
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou
Attachments: 26478262-f0a7-8fc1-1887-4f27071b9c0f.sys.drill, drillbit.log.exchange

Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false
alter session set `planner.memory.max_query_memory_per_node` = 482344960
alter session set `planner.width.max_per_node` = 1
alter session set `planner.width.max_per_query` = 1
alter session set `planner.disable_exchanges` = true
select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by
columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520],
columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
columns[3210]
) d where d.col433 = 'sjka skjf';
{noformat}

This is the error from drillbit.log:
2017-09-12 17:36:53,155 [26478262-f0a7-8fc1-1887-4f27071b9c0f:frag:0:0] ERROR o.a.d.e.p.i.x.m.ExternalSortBatch - Insufficient memory to merge two batches.
Incoming batch size: 409305088, available memory: 482344960 Here is the plan: {noformat} | 00-00Screen 00-01 Project(EXPR$0=[$0]) 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()]) 00-03 Project($f0=[0]) 00-04SelectionVectorRemover 00-05 Filter(condition=[=(ITEM($0, 'col433'), 'sjka skjf')]) 00-06Project(T8¦¦*=[$0]) 00-07 SelectionVectorRemover 00-08Sort(sort0=[$1], sort1=[$2], sort2=[$3], sort3=[$4], sort4=[$5], sort5=[$6], sort6=[$7], sort7=[$8], sort8=[$9], sort9=[$10], sort10=[$11], sort11=[$12], sort12=[$9], sort13=[$13], sort14=[$14], sort15=[$15], sort16=[$16], sort17=[$17], sort18=[$18], sort19=[$19], sort20=[$20], sort21=[$21], sort22=[$12], sort23=[$22], sort24=[$23], sort25=[$24], sort26=[$25], sort27=[$26], sort28=[$27], sort29=[$28], sort30=[$29], sort31=[$30], sort32=[$31], sort33=[$32], sort34=[$33], sort35=[$34], sort36=[$35], sort37=[$36], sort38=[$37], sort39=[$38], sort40=[$39], sort41=[$40], sort42=[$41], sort43=[$42], sort44=[$43], sort45=[$44], sort46=[$45], sort47=[$46], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC], dir5=[ASC], dir6=[ASC], dir7=[ASC], dir8=[ASC], dir9=[ASC], dir10=[ASC], dir11=[ASC], dir12=[ASC], dir13=[ASC], dir14=[ASC], dir15=[ASC], dir16=[ASC], dir17=[ASC], dir18=[ASC], dir19=[ASC], dir20=[ASC], dir21=[ASC], dir22=[ASC], dir23=[ASC], dir24=[ASC], dir25=[ASC], dir26=[ASC], dir27=[ASC], dir28=[ASC], dir29=[ASC], dir30=[ASC], dir31=[ASC], dir32=[ASC], dir33=[ASC], dir34=[ASC], dir35=[ASC], dir36=[ASC], dir37=[ASC], dir38=[ASC], dir39=[ASC], dir40=[ASC], dir41=[ASC], dir42=[ASC], dir43=[ASC], dir44=[ASC], dir45=[ASC], dir46=[ASC], dir47=[ASC]) 00-09 Project(T8¦¦*=[$0], EXPR$1=[ITEM($1, 450)], EXPR$2=[ITEM($1, 330)], EXPR$3=[ITEM($1, 230)], EXPR$4=[ITEM($1, 220)], EXPR$5=[ITEM($1, 110)], EXPR$6=[ITEM($1, 90)], EXPR$7=[ITEM($1, 80)], EXPR$8=[ITEM($1, 70)], EXPR$9=[ITEM($1, 40)], EXPR$10=[ITEM($1, 10)], EXPR$11=[ITEM($1, 20)], EXPR$12=[ITEM($1, 30)], EXPR$13=[ITEM($1, 50)], EXPR$14=[ITEM($1, 454)], EXPR$15=[ITEM($1, 
413)], EXPR$16=[ITEM($1, 940)], EXPR$17=[ITEM($1, 834)], EXPR$18=[ITEM($1, 73)], EXPR$19=[ITEM($1, 140)], EXPR$20=[ITEM($1, 104)], EXPR$21=[ITEM($1, )], EXPR$22=[ITEM($1, 2420)], EXPR$23=[ITEM($1, 1520)], EXPR$24=[ITEM($1, 1410)], EXPR$25=[ITEM($1, 1110)], EXPR$26=[ITEM($1, 1290)], EXPR$27=[ITEM($1, 2380)], EXPR$28=[ITEM($1, 705)], EXPR$29=[ITEM($1, 45)], EXPR$30=[ITEM($1, 1054)], EXPR$31=[ITEM($1, 2430)], EXPR$32=[ITEM($1, 420)], EXPR$33=[ITEM($1, 404)], EXPR$34=[ITEM($1, 3350)], EXPR$35=[ITEM($1, )], EXPR$36=[ITEM($1, 153)], EXPR$37=[ITEM($1, 356)], EXPR$38=[ITEM($1, 84)], EXPR$39=[ITEM($1, 745)], EXPR$40=[ITEM($1, 1450)], EXPR$41=[ITEM($1, 103)], EXPR$42=[ITEM($1, 2065)], EXPR$43=[ITEM($1, 343)], EXPR$44=[ITEM($1, 3420)], EXPR$45=[ITEM($1, 530)], EXPR$46=[ITEM($1, 3210)]) 00-10Project(T8¦¦*=[$0], columns=[$1]) 00-11 Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/3500cols.tbl, numFiles=1, columns
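The issue title states the remedy: an operator should size its outgoing batches so the next operator can hold them within its budget. A sketch of that sizing arithmetic (illustrative helper, not Drill's batch-sizing code; the two-batch headroom and 64K-row cap are stated assumptions):

```java
public class BatchSizer {
    // Given an average net row width and the downstream operator's memory
    // budget, pick a row count per outgoing batch so the receiver can hold
    // two batches (current + incoming) without OOM.
    public static int rowsPerBatch(int netRowWidthBytes, long downstreamBudgetBytes) {
        long perBatch = downstreamBudgetBytes / 2;          // leave room for two batches
        long rows = perBatch / netRowWidthBytes;
        return (int) Math.max(1, Math.min(rows, 65_536));   // value vectors cap at 64K rows
    }
}
```

In this bug the sort received a 409305088-byte batch against a 482344960-byte budget, so no such cap was applied upstream.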
[jira] [Created] (DRILL-5885) Drill consumes 2x memory when sorting and reading a spilled batch from disk.
Robert Hou created DRILL-5885:
---------------------------------

             Summary: Drill consumes 2x memory when sorting and reading a spilled batch from disk.
                 Key: DRILL-5885
                 URL: https://issues.apache.org/jira/browse/DRILL-5885
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou

The query is:
{noformat}
select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by
columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
columns[3210]
) d where d.col433 = 'sjka skjf';
{noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
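The "2x memory" in the title can be seen directly in the figures reported above: reading a spilled batch back from disk while also materializing an in-memory copy would need roughly twice the batch size, which exceeds the available memory in the log. A small arithmetic sketch (plain Python, using only the two numbers from the log; the 2x model is an interpretation of the report, not Drill code):

```python
# Back-of-envelope check (illustrative, not Drill code): if the sort must hold
# both the spilled batch being read back and its in-memory copy at once,
# peak usage is roughly 2x the incoming batch size.
incoming_batch = 409_305_088   # bytes, "Incoming batch size" from the log
available      = 482_344_960   # bytes, "available memory" from the log

peak = 2 * incoming_batch      # spilled copy + deserialized copy
print(peak)                    # 818610176
print(peak > available)        # True: the 2x peak exceeds available memory
```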
[jira] [Resolved] (DRILL-5840) A query that includes sort completes, and then loses Drill connection. Drill becomes unresponsive, and cannot restart because it cannot communicate with Zookeeper
[ https://issues.apache.org/jira/browse/DRILL-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5840. --- Resolution: Not A Problem > A query that includes sort completes, and then loses Drill connection. Drill > becomes unresponsive, and cannot restart because it cannot communicate with > Zookeeper > -- > > Key: DRILL-5840 > URL: https://issues.apache.org/jira/browse/DRILL-5840 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > select count(*) from (select * from > dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d > where d.columns[0] = 'ljdfhwuehnoiueyf'; > {noformat} > Query tries to complete, but cannot. It takes 20 hours from the time the > query tries to complete, to the time Drill finally loses its connection. > From the drillbit.log: > {noformat} > 2017-10-03 16:28:14,892 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] DEBUG > o.a.drill.exec.work.foreman.Foreman - 262bec7f-3539-0dd7-6fea-f2959f9df3b6: > State change requested RUNNING --> COMPLETED > 2017-10-04 01:47:27,698 [UserServer-1] DEBUG > o.a.d.e.r.u.UserServerRequestHandler - Received query to run. Returning > query handle. > 2017-10-04 03:30:02,916 [262bec7f-3539-0dd7-6fea-f2959f9df3b6:frag:0:0] WARN > o.a.d.exec.work.foreman.QueryManager - Failure while trying to delete the > estore profile for this query. 
> org.apache.drill.common.exceptions.DrillRuntimeException: unable to delete > node at /running/262bec7f-3539-0dd7-6fea-f2959f9df3b6 > at > org.apache.drill.exec.coord.zk.ZookeeperClient.delete(ZookeeperClient.java:343) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.coord.zk.ZkEphemeralStore.remove(ZkEphemeralStore.java:108) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.updateEphemeralState(QueryManager.java:293) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:1043) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:964) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.access$2600(Foreman.java:113) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1025) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:1018) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.EventProcessor.processEvents(EventProcessor.java:107) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:65) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman$StateSwitch.addEvent(Foreman.java:1020) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.addToEventQueue(Foreman.java:1038) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.nodeComplete(QueryManager.java:498) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at 
> org.apache.drill.exec.work.foreman.QueryManager.access$100(QueryManager.java:66) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager$NodeTracker.fragmentComplete(QueryManager.java:462) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.fragmentDone(QueryManager.java:147) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.QueryManager.access$400(QueryManager.java:66) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.e
[jira] [Created] (DRILL-5813) A query that includes sort encounters Exception occurred with closed channel
Robert Hou created DRILL-5813:
---------------------------------

             Summary: A query that includes sort encounters Exception occurred with closed channel
                 Key: DRILL-5813
                 URL: https://issues.apache.org/jira/browse/DRILL-5813
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Paul Rogers
             Fix For: 1.12.0

Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
alter session set `planner.enable_decimal_data_type` = true;
select count(*) from (select * from dfs.`/drill/testdata/resource-manager/all_types_large` order by missing11) d where d.missing3 is false;
{noformat}

This query has passed before when the number of threads and amount of memory is restricted. With more threads and memory, the query does not complete execution.

Here is the stack trace:
{noformat}
Exception occurred with closed channel.  Connection: /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client)
java.io.IOException: Connection reset by peer
  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
  at sun.nio.ch.IOUtil.read(IOUtil.java:192)
  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384)
  at oadd.io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
  at oadd.io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:407)
  at oadd.io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:32)
  at oadd.io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:792)
  at oadd.io.netty.buffer.MutableWrappedByteBuf.setBytes(MutableWrappedByteBuf.java:280)
  at oadd.io.netty.buffer.ExpandableByteBuf.setBytes(ExpandableByteBuf.java:26)
  at oadd.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
  at oadd.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:241)
  at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
  at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
  at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
  at java.lang.Thread.run(Thread.java:745)
User Error Occurred: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
oadd.org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?

[Error Id: b97704a4-b8f0-4cd0-b428-2cf1bcf39a1d ]
  at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
  at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
  at oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
  at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
  at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
  at oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
  at oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
  at oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
  at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
  at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
  at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
  at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
  at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
  at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
  at oadd.io.netty.channel.nio.NioEventLoop.run(Nio
[jira] [Created] (DRILL-5805) External Sort runs out of memory
Robert Hou created DRILL-5805:
---------------------------------

             Summary: External Sort runs out of memory
                 Key: DRILL-5805
                 URL: https://issues.apache.org/jira/browse/DRILL-5805
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Paul Rogers
             Fix For: 1.12.0

Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
alter session set `planner.width.max_per_node` = 5;
alter session set `planner.disable_exchanges` = true;
alter session set `planner.width.max_per_query` = 100;
select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0;
{noformat}

Plan is:
{noformat}
00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-03          Project($f0=[0])
00-04            SelectionVectorRemover
00-05              Filter(condition=[=($0, 0)])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$1], dir0=[ASC])
00-08                    Flatten(flattenField=[$1])
00-09                      Project(id=[$0], str=[$1])
00-10                        Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json, numFiles=1, columns=[`id`, `str_list`], files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
{noformat}

sys.version is:
{noformat}
| 1.12.0-SNAPSHOT | c4211d3b545b0d1996b096a8e1ace35376a63977 | Fix for DRILL-5670 | 09.09.2017 @ 14:38:25 PDT | r...@qa-node190.qa.lab | 11.09.2017 @ 14:27:16 PDT |
{noformat}
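The DRILL-5805 query sorts the output of flatten(str_list). FLATTEN emits one output row per list element, so the downstream sort can receive many times more rows than the scan produced, which is why the sort's memory budget comes under pressure. A plain-Python sketch of that semantics (illustrative data, not Drill code):

```python
# Plain-Python sketch of FLATTEN semantics (illustrative, not Drill code):
# each input row with an N-element list becomes N output rows, so the sort
# below flatten(str_list) may see far more rows than the scan read.
rows = [
    {"id": 0, "str_list": ["a", "b", "c"]},
    {"id": 1, "str_list": ["d"]},
]

flattened = [
    {"id": r["id"], "str": s}
    for r in rows
    for s in r["str_list"]
]
print(len(flattened))  # 4 output rows from 2 input rows
```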
[jira] [Created] (DRILL-5804) Query times out, may be infinite loop
Robert Hou created DRILL-5804:
---------------------------------

             Summary: Query times out, may be infinite loop
                 Key: DRILL-5804
                 URL: https://issues.apache.org/jira/browse/DRILL-5804
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Paul Rogers
             Fix For: 1.12.0

Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
select count(*) from (
  select * from (
    select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid
    from (
      select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
    ) s1
  ) s2
  order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
);
{noformat}

Plan is:
{noformat}
00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03          UnionExchange
01-01            StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02              Project($f0=[0])
01-03                SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], sort2=[6 ASC])
02-01                  SelectionVectorRemover
02-02                    Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC])
02-03                      Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
02-04                        HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], dist2=[[$6]])
03-01                          UnorderedMuxExchange
04-01                            Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, hash32AsDouble($4, 1301011)))])
04-02                              Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], EXPR$6=[ITEM($1, 'do_not_exist')])
04-03                                Flatten(flattenField=[$1])
04-04                                  Project(type=[$0], rptds=[ITEM($2, 'rptd')], rms=[$2], uid=[$1])
04-05                                    SingleMergeExchange(sort0=[1 ASC])
05-01                                      SelectionVectorRemover
05-02                                        Sort(sort0=[$1], dir0=[ASC])
05-03                                          Project(type=[$0], uid=[$1], rms=[$2])
05-04                                            HashToRandomExchange(dist0=[[$1]])
06-01                                              UnorderedMuxExchange
07-01                                                Project(type=[$0], uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
07-02                                                  Flatten(flattenField=[$2])
07-03                                                    Project(type=[$0], uid=[$1], rms=[ITEM($2, 'rm')])
07-04                                                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/resource-manager/nested_large]], selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
{noformat}

Here is a segment of the drillbit.log, starting at line 55890:
{noformat}
2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK
2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record batch with status OK
2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.exec.compile.ClassTransformer - Compiled and merged PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records
2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete
2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG o.a.d.e.t.g.SingleBatchSorterGen44 - Took
[jira] [Created] (DRILL-5786) Query enters Exception in RPC communication
Robert Hou created DRILL-5786:
---------------------------------

             Summary: Query enters Exception in RPC communication
                 Key: DRILL-5786
                 URL: https://issues.apache.org/jira/browse/DRILL-5786
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.11.0
            Reporter: Robert Hou
            Assignee: Paul Rogers
             Fix For: 1.12.0

Query is:
{noformat}
select count(*) from (select * from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by
columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], columns[1410], columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
columns[3210]
) d where d.col433 = 'sjka skjf'
{noformat}

This is the same query as DRILL-5670 but no session variables are set.

Here is the stack trace:
{noformat}
2017-09-12 13:14:57,584 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server). Closing connection.
io.netty.handler.codec.DecoderException: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
  at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
  at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
  at org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:81) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
  at org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:260) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
  at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:243) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
  at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
  at io.netty.buffer.ExpandableByteBuf.capacity(ExpandableByteBuf.java:43) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
  at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27
[jira] [Resolved] (DRILL-5522) OOM during the merge and spill process of the managed external sort
[ https://issues.apache.org/jira/browse/DRILL-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5522. --- Resolution: Fixed This has been resolved. > OOM during the merge and spill process of the managed external sort > --- > > Key: DRILL-5522 > URL: https://issues.apache.org/jira/browse/DRILL-5522 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Attachments: 26e334aa-1afa-753f-3afe-862f76b80c18.sys.drill, > drillbit.log, drillbit.out, drill-env.sh > > > git.commit.id.abbrev=1e0a14c > The below query fails with an OOM > {code} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.memory.max_query_memory_per_node` = 1552428800; > create table dfs.drillTestDir.xsort_ctas3_multiple partition by (type, aCol) > as select type, rptds, rms, s3.rms.a aCol, uid from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a > ) s3; > {code} > Stack trace > {code} > 2017-05-17 15:15:35,027 [26e334aa-1afa-753f-3afe-862f76b80c18:frag:4:2] INFO > o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes > ran out of memory while executing the query. (Unable to allocate buffer of > size 2097152 due to memory limit. Current allocation: 29229064) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate buffer of size 2097152 due to memory limit. 
Current > allocation: 29229064 > [Error Id: 619e2e34-704c-4964-a354-1348fb33ce8a ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) > ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_111] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_111] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111] > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 2097152 due to memory limit. Current allocation: > 29229064 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) > ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) > ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.BigIntVector.reAlloc(BigIntVector.java:212) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.BigIntVector.copyFromSafe(BigIntVector.java:324) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableBigIntVector.copyFromSafe(NullableBigIntVector.java:367) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableBigIntVector$TransferImpl.copyValueSafe(NullableBigIntVector.java:328) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.copyValueSafe(RepeatedMapVector.java:360) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > 
org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe(MapVector.java:220) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.vector.complex.MapVector.copyFromSafe(MapVector.java:82) > ~[vector-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.test.generated.PriorityQueueCopierGen49.doCopy(PriorityQueueCopierTemplate.java:34) > ~[na:na] > at > org.apache.drill.exec.test.generated.PriorityQueueCopierGen49.next(PriorityQueueCopierTemplate.java:76) > ~
[jira] [Resolved] (DRILL-5443) Managed External Sort fails with OOM while spilling to disk
[ https://issues.apache.org/jira/browse/DRILL-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5443. --- Resolution: Fixed This has been resolved. > Managed External Sort fails with OOM while spilling to disk > --- > > Key: DRILL-5443 > URL: https://issues.apache.org/jira/browse/DRILL-5443 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0, 1.11.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 265a014b-8cae-30b5-adab-ff030b6c7086.sys.drill, > 27016969-ef53-40dc-b582-eea25371fa1c.sys.drill, drill5443.drillbit.log, > drillbit.log > > > git.commit.id.abbrev=3e8b01d > The below query fails with an OOM > {code} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.width.max_per_query` = 1; > alter session set `planner.memory.max_query_memory_per_node` = 52428800; > select s1.type type, flatten(s1.rms.rptd) rptds from (select d.type type, > d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid) s1 > order by s1.rms.mapid; > {code} > Exception from the logs > {code} > 2017-04-24 17:22:59,439 [27016969-ef53-40dc-b582-eea25371fa1c:frag:0:0] INFO > o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort > encountered an error while spilling to disk (Unable to allocate buffer of > size 524288 (rounded from 307197) due to memory limit. 
Current allocation: > 25886728) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External > Sort encountered an error while spilling to disk > [Error Id: a64e3790-3a34-42c8-b4ea-4cb1df780e63 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) > ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1445) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeRuns(ExternalSortBatch.java:1372) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.consolidateBatches(ExternalSortBatch.java:1299) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeSpilledRuns(ExternalSortBatch.java:1195) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:689) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > 
[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.validate.IteratorV
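The error above reports "size 524288 (rounded from 307197)": 524288 is 2^19, the next power of two above 307197. That is consistent with an allocator that rounds each request up to a power of two, so a spill buffer can charge the memory budget noticeably more than its raw size suggests. A minimal sketch of that rounding (plain Python; the rounding policy is inferred from the message text, not from Drill source):

```python
def next_power_of_two(n: int) -> int:
    """Round n up to the nearest power of two (n >= 1)."""
    return 1 << (n - 1).bit_length()

# Size taken from the OOM message in this report:
print(next_power_of_two(307197))  # 524288, matching "rounded from 307197"
```

An exact power of two is returned unchanged, so already-aligned buffers are not inflated further.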
[jira] [Resolved] (DRILL-5753) Managed External Sort: One or more nodes ran out of memory while executing the query.
[ https://issues.apache.org/jira/browse/DRILL-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5753. --- Resolution: Fixed > Managed External Sort: One or more nodes ran out of memory while executing > the query. > - > > Key: DRILL-5753 > URL: https://issues.apache.org/jira/browse/DRILL-5753 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 26596b4e-9883-7dc2-6275-37134f7d63be.sys.drill, > drillbit.log > > > The query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.memory.max_query_memory_per_node` = 1252428800; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist > ); > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > {noformat} > The stack trace is: > {noformat} > 2017-08-30 03:35:10,479 [BitServer-5] DEBUG > o.a.drill.exec.work.foreman.Foreman - 26596b4e-9883-7dc2-6275-37134f7d63be: > State change requested RUNNING --> FAILED > org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One > or more nodes ran out of memory while executing the query. > Unable to allocate buffer of size 4194304 due to memory limit. Current > allocation: 43960640 > Fragment 2:9 > [Error Id: f58210a2-7569-42d0-8961-8c7e42c7fea3 on atsqa6c80.qa.lab:31010] > (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate > buffer of size 4194304 due to memory limit. 
Current allocation: 43960640 > org.apache.drill.exec.memory.BaseAllocator.buffer():238 > org.apache.drill.exec.memory.BaseAllocator.buffer():213 > org.apache.drill.exec.vector.BigIntVector.reAlloc():252 > org.apache.drill.exec.vector.BigIntVector$Mutator.setSafe():452 > org.apache.drill.exec.vector.RepeatedBigIntVector$Mutator.addSafe():355 > org.apache.drill.exec.vector.RepeatedBigIntVector.copyFromSafe():220 > > org.apache.drill.exec.vector.RepeatedBigIntVector$TransferImpl.copyValueSafe():202 > > org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 > > org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 > org.apache.drill.exec.vector.complex.MapVector.copyFromSafe():82 > > org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.doCopy():47 > org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.next():77 > > org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next():267 > > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():374 > > org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():303 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.record.AbstractRecordBatch.next():119 > org.apache.drill.exec.record.AbstractRecordBatch.next():109 > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 > > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 > org.apache.drill.exec.record.AbstractRecordBatch.next():164 > org.apache.drill.exec.physical.impl.BaseRootExec.next():105 > > org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92 > org.apache.drill.exec.physical.impl.BaseRootExec.next():95 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 > java.security.AccessController.doPrivileged():-2 > 
javax.security.auth.Subject.doAs():415 > org.apache.hadoop.security.UserGroupInformation.doAs():1595 > org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 > org.apache.drill.common.SelfCleaningRunnable.run():38 > java.util.concurrent.ThreadPoolExecutor.runWorker():1145 > java.util.concurrent.ThreadPoolExecutor$Worker.run():615 > java.lang.Thread.run():744 > at > org.apache.dri
[jira] [Resolved] (DRILL-5744) External sort fails with OOM error
[ https://issues.apache.org/jira/browse/DRILL-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5744. --- Resolution: Fixed This has been verified. > External sort fails with OOM error > -- > > Key: DRILL-5744 > URL: https://issues.apache.org/jira/browse/DRILL-5744 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.10.0 > Reporter: Robert Hou >Assignee: Paul Rogers > Fix For: 1.12.0 > > Attachments: 265b163b-cf44-d2ff-2e70-4cd746b56611.sys.drill, > q34.drillbit.log > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.width.max_per_query` = 1; > alter session set `planner.memory.max_query_memory_per_node` = 152428800; > select count(*) from ( > select * from ( > select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid > from ( > select d.type type, d.uid uid, flatten(d.map.rm) rms from > dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid > ) s1 > ) s2 > order by s2.rms.mapid > ); > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.width.max_per_node` = 17; > alter session set `planner.disable_exchanges` = false; > alter session set `planner.width.max_per_query` = 1000; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > {noformat} > Stack trace is: > {noformat} > 2017-08-23 06:59:42,763 [266275e5-ebdb-14ae-d52d-00fa3a154f6d:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes > ran out of memory while executing the query. (Unable to allocate buffer of > size 4194304 (rounded from 3276750) due to memory limit. 
Current allocation: 79986944) > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate buffer of size 4194304 (rounded from 3276750) due to > memory limit. Current allocation: 79986944 > [Error Id: 4f4959df-0921-4a50-b75e-56488469ab10 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to > allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. 
> Current allocation: 79986944 > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:238) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) > ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:402) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:236) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:33) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:46) > ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:113) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:95) > ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.record.VectorInitializer.allocateMap(VectorInitializer.java:130) > ~[drill-java-exec-1.12.0-
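The "rounded from 3276750" detail in the error above reflects the allocator rounding buffer requests up to the next power of two, so a ~3.1 MB request is charged as a full 4 MiB against the memory limit. A minimal sketch of that arithmetic (the power-of-two rule is inferred from the message itself, not quoted from Drill's allocator source):

```python
def next_power_of_two(n: int) -> int:
    """Round a byte count up to the next power of two, as buddy-style
    allocators commonly do when sizing buffers."""
    if n <= 0:
        raise ValueError("size must be positive")
    return 1 << (n - 1).bit_length()

requested = 3276750              # bytes actually needed, per the error message
charged = next_power_of_two(requested)
print(charged)                   # 4194304, matching "buffer of size 4194304"
```

This is why a query can exhaust its budget well before the sum of its *requested* bytes reaches the configured limit: each vector allocation can cost up to 2x what was asked for.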
[jira] [Created] (DRILL-5778) Drill seems to run out of memory but completes execution
Robert Hou created DRILL-5778: - Summary: Drill seems to run out of memory but completes execution Key: DRILL-5778 URL: https://issues.apache.org/jira/browse/DRILL-5778 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0; {noformat} Plan is: {noformat}
00-00 Screen
00-01   Project(EXPR$0=[$0])
00-02     StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03       UnionExchange
01-01         StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02           Project($f0=[0])
01-03             SelectionVectorRemover
01-04               Filter(condition=[=($0, 0)])
01-05                 SingleMergeExchange(sort0=[1 ASC])
02-01                   SelectionVectorRemover
02-02                     Sort(sort0=[$1], dir0=[ASC])
02-03                       Project(id=[$0], str=[$1])
02-04                         HashToRandomExchange(dist0=[[$1]])
03-01                           UnorderedMuxExchange
04-01                             Project(id=[$0], str=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
04-02                               Flatten(flattenField=[$1])
04-03                                 Project(id=[$0], str=[$1])
04-04                                   Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json, numFiles=1, columns=[`id`, `str_list`], files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
{noformat} From drillbit.log: {noformat} 2017-09-08 05:07:21,515 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Actual batch schema & sizes { str(type: REQUIRED VARCHAR, count: 4096, std size: 54, actual size: 134, data size: 548360) id(type: OPTIONAL BIGINT, count: 
4096, std size: 8, actual size: 9, data size: 36864) Records: 4096, Total size: 1073819648, Data size: 585224, Gross row width: 262163, Net row width: 143, Density: 1} 2017-09-08 05:07:21,515 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] ERROR o.a.d.e.p.i.x.m.ExternalSortBatch - Insufficient memory to merge two batches. Incoming batch size: 1073819648, available memory: 2147483648 2017-09-08 05:07:21,517 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] INFO o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: true 2017-09-08 05:07:21,517 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.compile.JaninoClassCompiler - Compiling (source size=3.3 KiB): ... 2017-09-08 05:07:21,536 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.exec.compile.ClassTransformer - Compiled and merged SingleBatchSorterGen2677: bytecode size = 3.6 KiB, time = 19 ms. 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.t.g.SingleBatchSorterGen2677 - Took 5608 us to sort 4096 records 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Input Batch Estimates: record size = 143 bytes; net = 1073819648 bytes, gross = 1610729472, records = 4096 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Spill batch size: net = 1048476 bytes, gross = 1572714 bytes, records = 7332; spill file = 268435456 bytes 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Output batch size: net = 9371505 bytes, gross = 14057257 bytes, records = 65535 2017-09-08 05:07:21,566 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 2147483648, buffer memory = 2143289744, merge memory = 2128740638 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen2677 - Took 4303 us to sort 4096 records 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Input Batch Estimates: record size = 266 bytes; net = 1073819648 bytes, gross = 1610729472, records = 4096 2017-09-08 05:07:21,571 [264d780f-41ac-2c4f-6bc8-bdbb5eeb3df0:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Spill batch size: net = 1048572 bytes, gross = 157285
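The "Spill batch size" figures in the log above are consistent with dividing a spill-batch byte budget by the estimated net row width. A back-of-the-envelope check, assuming a ~1 MiB spill-batch target (an inference from the logged numbers, not a documented Drill default):

```python
net_row_width = 143            # bytes per record, from the log's input batch estimate
spill_target = 1024 * 1024     # assumed ~1 MiB spill batch budget

records = spill_target // net_row_width   # whole records that fit in the budget
net_bytes = records * net_row_width       # resulting net spill batch size
print(records, net_bytes)      # 7332 records, 1048476 bytes -- matching the log
```

The same division explains why the earlier estimate (record size 143) and the later one (record size 266) produce different record counts per spill batch: the budget is fixed, the row-width estimate moves.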
[jira] [Created] (DRILL-5774) Excessive memory allocation
Robert Hou created DRILL-5774: - Summary: Excessive memory allocation Key: DRILL-5774 URL: https://issues.apache.org/jira/browse/DRILL-5774 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 This query exhibits excessive memory allocation: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; select count(*) from (select * from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by d.str) d1 where d1.id=0; {noformat} This query does a flatten on a large table. The result is 160M records. Half the records have a one-byte string, and half have a 253-byte string. And then there are 40K records with 223 byte strings. {noformat} select length(str), count(*) from (select id, flatten(str_list) str from dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) group by length(str);
+---------+-----------+
| EXPR$0  | EXPR$1    |
+---------+-----------+
| 223     | 4         |
| 1       | 80042001  |
| 253     | 8000      |
+---------+-----------+
{noformat} From the drillbit.log: {noformat} 2017-09-02 11:43:44,598 [26550427-6adf-a52e-2ea8-dc52d8d8433f:frag:0:0] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Actual batch schema & sizes { str(type: REQUIRED VARCHAR, count: 4096, std size: 54, actual size: 134, data size: 548360) id(type: OPTIONAL BIGINT, count: 4096, std size: 8, actual size: 9, data size: 36864) Records: 4096, Total size: 1073819648, Data size: 585224, Gross row width: 262163, Net row width: 143, Density: 1} {noformat} The data size is 585K, but the batch size is 1 GB. The density is 1%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
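The density figure reported above follows directly from comparing the bytes of data actually stored against the bytes allocated for the batch. A quick recomputation from the logged numbers (the rounding of the tiny true density up to a reported floor of 1 is an assumption about how the log formats it):

```python
total_size = 1_073_819_648   # bytes allocated for the batch, from the log
data_size = 585_224          # bytes of actual data, from the log
records = 4_096

gross_row_width = total_size // records   # bytes allocated per row
net_row_width = -(-data_size // records)  # bytes of data per row (ceiling division)
density_pct = 100 * data_size / total_size
print(gross_row_width, net_row_width)     # 262163 143, matching the log
print(f"{density_pct:.3f}%")              # ~0.054% of the allocation holds data
```

In other words, each 4096-record batch reserves roughly 262 KB per row to hold about 143 bytes of it, which is the "excessive allocation" the ticket describes.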
[jira] [Created] (DRILL-5753) Managed External Sort: One or more nodes ran out of memory while executing the query.
Robert Hou created DRILL-5753: - Summary: Managed External Sort: One or more nodes ran out of memory while executing the query. Key: DRILL-5753 URL: https://issues.apache.org/jira/browse/DRILL-5753 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.11.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 The query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.memory.max_query_memory_per_node` = 1252428800; select count(*) from ( select * from ( select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid from ( select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid ) s1 ) s2 order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist ); ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; {noformat} The stack trace is: {noformat} 2017-08-30 03:35:10,479 [BitServer-5] DEBUG o.a.drill.exec.work.foreman.Foreman - 26596b4e-9883-7dc2-6275-37134f7d63be: State change requested RUNNING --> FAILED org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate buffer of size 4194304 due to memory limit. Current allocation: 43960640 Fragment 2:9 [Error Id: f58210a2-7569-42d0-8961-8c7e42c7fea3 on atsqa6c80.qa.lab:31010] (org.apache.drill.exec.exception.OutOfMemoryException) Unable to allocate buffer of size 4194304 due to memory limit. 
Current allocation: 43960640 org.apache.drill.exec.memory.BaseAllocator.buffer():238 org.apache.drill.exec.memory.BaseAllocator.buffer():213 org.apache.drill.exec.vector.BigIntVector.reAlloc():252 org.apache.drill.exec.vector.BigIntVector$Mutator.setSafe():452 org.apache.drill.exec.vector.RepeatedBigIntVector$Mutator.addSafe():355 org.apache.drill.exec.vector.RepeatedBigIntVector.copyFromSafe():220 org.apache.drill.exec.vector.RepeatedBigIntVector$TransferImpl.copyValueSafe():202 org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 org.apache.drill.exec.vector.complex.MapVector$MapTransferPair.copyValueSafe():225 org.apache.drill.exec.vector.complex.MapVector.copyFromSafe():82 org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.doCopy():47 org.apache.drill.exec.test.generated.PriorityQueueCopierGen1466.next():77 org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next():267 org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():374 org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():303 org.apache.drill.exec.record.AbstractRecordBatch.next():164 org.apache.drill.exec.record.AbstractRecordBatch.next():119 org.apache.drill.exec.record.AbstractRecordBatch.next():109 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51 org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93 org.apache.drill.exec.record.AbstractRecordBatch.next():164 org.apache.drill.exec.physical.impl.BaseRootExec.next():105 org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92 org.apache.drill.exec.physical.impl.BaseRootExec.next():95 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234 org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227 java.security.AccessController.doPrivileged():-2 javax.security.auth.Subject.doAs():415 
org.apache.hadoop.security.UserGroupInformation.doAs():1595 org.apache.drill.exec.work.fragment.FragmentExecutor.run():227 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():744 at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:521) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:94) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:55) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.rpc.BasicServer.handle(BasicServer.java:157) [drill-rpc-1.12.0-SNAPSHOT.jar:1.12.0-SNAPS
[jira] [Resolved] (DRILL-5732) Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill.
[ https://issues.apache.org/jira/browse/DRILL-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Hou resolved DRILL-5732. --- Resolution: Not A Problem > Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. > - > > Key: DRILL-5732 > URL: https://issues.apache.org/jira/browse/DRILL-5732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 > Reporter: Robert Hou >Assignee: Paul Rogers > Attachments: 26621eb2-daec-cef9-efed-5986e72a750a.sys.drill, > drillbit.log.83 > > > git commit id: > {noformat} > | 1.12.0-SNAPSHOT | e9065b55ea560e7f737d6fcb4948f9e945b9b14f | DRILL-5660: > Parquet metadata caching improvements | 15.08.2017 @ 09:31:00 PDT | > r...@qa-node190.qa.lab | 15.08.2017 @ 13:29:26 PDT | > {noformat} > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.disable_exchanges` = true; > alter session set `planner.memory.max_query_memory_per_node` = 104857600; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.width.max_per_query` = 1; > select max(col1), max(cs_sold_date_sk), max(cs_sold_time_sk), > max(cs_ship_date_sk), max(cs_bill_customer_sk), max(cs_bill_cdemo_sk), > max(cs_bill_hdemo_sk), max(cs_bill_addr_sk), max(cs_ship_customer_sk), > max(cs_ship_cdemo_sk), max(cs_ship_hdemo_sk), max(cs_ship_addr_sk), > max(cs_call_center_sk), max(cs_catalog_page_sk), max(cs_ship_mode_sk), > min(cs_warehouse_sk), max(cs_item_sk), max(cs_promo_sk), > max(cs_order_number), max(cs_quantity), max(cs_wholesale_cost), > max(cs_list_price), max(cs_sales_price), max(cs_ext_discount_amt), > min(cs_ext_sales_price), max(cs_ext_wholesale_cost), min(cs_ext_list_price), > min(cs_ext_tax), min(cs_coupon_amt), max(cs_ext_ship_cost), max(cs_net_paid), > max(cs_net_paid_inc_tax), min(cs_net_paid_inc_ship), > min(cs_net_paid_inc_ship_tax), min(cs_net_profit), min(c_customer_sk), > min(length(c_customer_id)), max(c_current_cdemo_sk), 
max(c_current_hdemo_sk), > min(c_current_addr_sk), min(c_first_shipto_date_sk), > min(c_first_sales_date_sk), min(length(c_salutation)), > min(length(c_first_name)), min(length(c_last_name)), > min(length(c_preferred_cust_flag)), max(c_birth_day), min(c_birth_month), > min(c_birth_year), max(c_last_review_date), c_email_address from (select > cs_sold_date_sk+cs_sold_time_sk col1, * from > dfs.`/drill/testdata/resource-manager/md1362` order by c_email_address nulls > first) d where d.col1 > 2536816 and c_email_address is not null group by > c_email_address; > ALTER SESSION SET `exec.sort.disable_managed` = true; > alter session set `planner.disable_exchanges` = false; > alter session set `planner.memory.max_query_memory_per_node` = 2147483648; > alter session set `planner.width.max_per_node` = 17; > alter session set `planner.width.max_per_query` = 1000; > {noformat} > Here is the stack trace: > {noformat} > 2017-08-18 13:15:27,052 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.t.g.SingleBatchSorterGen27 - Took 6445 us to sort 9039 records > 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.p.i.xsort.ExternalSortBatch - Copier allocator current allocation 0 > 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG > o.a.d.e.p.i.xsort.ExternalSortBatch - mergeAndSpill: starting total size in > memory = 71964288 > 2017-08-18 13:15:27,421 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes > ran out of memory while executing the query. > org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more > nodes ran out of memory while executing the query. > Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. 
> batchGroups.size 1 > spilledBatchGroups.size 0 > allocated memory 71964288 > allocator limit 52428800 > [Error Id: 7b248f12-2b31-4013-86b6-92e6c842db48 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:637) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:379) > [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT
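An sv2 (two-byte selection vector) for 9039 records is tiny, about 18 KB, so the failure quoted above is not about the request size: the allocator was already past its limit before the sort asked. A sketch of that accounting, using the figures from the error message (the variable names are illustrative, not Drill's):

```python
SV2_ENTRY_BYTES = 2                 # a selection vector 2 holds 16-bit record offsets

allocated = 71_964_288              # "allocated memory" from the error message
limit = 52_428_800                  # "allocator limit" from the error message
request = 9_039 * SV2_ENTRY_BYTES   # bytes needed for the sv2 itself

over_limit = allocated + request > limit
print(request, over_limit)          # 18078 True -- any further allocation must fail
```

With one in-memory batch group and nothing yet spilled, the sort had no way to free memory, hence "not enough batchGroups to spill".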
[jira] [Created] (DRILL-5744) External sort fails with OOM error
Robert Hou created DRILL-5744: - Summary: External sort fails with OOM error Key: DRILL-5744 URL: https://issues.apache.org/jira/browse/DRILL-5744 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.10.0 Reporter: Robert Hou Assignee: Paul Rogers Fix For: 1.12.0 Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.width.max_per_node` = 1; alter session set `planner.disable_exchanges` = true; alter session set `planner.width.max_per_query` = 1; alter session set `planner.memory.max_query_memory_per_node` = 152428800; select count(*) from ( select * from ( select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid from ( select d.type type, d.uid uid, flatten(d.map.rm) rms from dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid ) s1 ) s2 order by s2.rms.mapid ); ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.width.max_per_node` = 17; alter session set `planner.disable_exchanges` = false; alter session set `planner.width.max_per_query` = 1000; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; {noformat} Stack trace is: {noformat} 2017-08-23 06:59:42,763 [266275e5-ebdb-14ae-d52d-00fa3a154f6d:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes ran out of memory while executing the query. (Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. Current allocation: 79986944) org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. 
Current allocation: 79986944 [Error Id: 4f4959df-0921-4a50-b75e-56488469ab10 ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. Current allocation: 79986944 at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:238) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:402) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:236) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:33) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:46) ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:113) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:95) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateMap(VectorInitializer.java:130) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:93) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.VectorInitializer.allocateBatch(VectorInitializer.java:85) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next(PriorityQueueCopierWrapper.java:262) ~[drill-java -exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:374) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12 .0-SNAPSHOT
[jira] [Created] (DRILL-5732) Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill.
Robert Hou created DRILL-5732: - Summary: Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. Key: DRILL-5732 URL: https://issues.apache.org/jira/browse/DRILL-5732 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Robert Hou Assignee: Paul Rogers git commit id: {noformat} | 1.12.0-SNAPSHOT | e9065b55ea560e7f737d6fcb4948f9e945b9b14f | DRILL-5660: Parquet metadata caching improvements | 15.08.2017 @ 09:31:00 PDT | r...@qa-node190.qa.lab | 15.08.2017 @ 13:29:26 PDT | {noformat} Query is: {noformat} ALTER SESSION SET `exec.sort.disable_managed` = false; alter session set `planner.disable_exchanges` = true; alter session set `planner.memory.max_query_memory_per_node` = 104857600; alter session set `planner.width.max_per_node` = 1; alter session set `planner.width.max_per_query` = 1; select max(col1), max(cs_sold_date_sk), max(cs_sold_time_sk), max(cs_ship_date_sk), max(cs_bill_customer_sk), max(cs_bill_cdemo_sk), max(cs_bill_hdemo_sk), max(cs_bill_addr_sk), max(cs_ship_customer_sk), max(cs_ship_cdemo_sk), max(cs_ship_hdemo_sk), max(cs_ship_addr_sk), max(cs_call_center_sk), max(cs_catalog_page_sk), max(cs_ship_mode_sk), min(cs_warehouse_sk), max(cs_item_sk), max(cs_promo_sk), max(cs_order_number), max(cs_quantity), max(cs_wholesale_cost), max(cs_list_price), max(cs_sales_price), max(cs_ext_discount_amt), min(cs_ext_sales_price), max(cs_ext_wholesale_cost), min(cs_ext_list_price), min(cs_ext_tax), min(cs_coupon_amt), max(cs_ext_ship_cost), max(cs_net_paid), max(cs_net_paid_inc_tax), min(cs_net_paid_inc_ship), min(cs_net_paid_inc_ship_tax), min(cs_net_profit), min(c_customer_sk), min(length(c_customer_id)), max(c_current_cdemo_sk), max(c_current_hdemo_sk), min(c_current_addr_sk), min(c_first_shipto_date_sk), min(c_first_sales_date_sk), min(length(c_salutation)), min(length(c_first_name)), min(length(c_last_name)), min(length(c_preferred_cust_flag)), max(c_birth_day), min(c_birth_month), min(c_birth_year), 
max(c_last_review_date), c_email_address from (select cs_sold_date_sk+cs_sold_time_sk col1, * from dfs.`/drill/testdata/resource-manager/md1362` order by c_email_address nulls first) d where d.col1 > 2536816 and c_email_address is not null group by c_email_address; ALTER SESSION SET `exec.sort.disable_managed` = true; alter session set `planner.disable_exchanges` = false; alter session set `planner.memory.max_query_memory_per_node` = 2147483648; alter session set `planner.width.max_per_node` = 17; alter session set `planner.width.max_per_query` = 1000; {noformat} Here is the stack trace: {noformat} 2017-08-18 13:15:27,052 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.t.g.SingleBatchSorterGen27 - Took 6445 us to sort 9039 records 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.p.i.xsort.ExternalSortBatch - Copier allocator current allocation 0 2017-08-18 13:15:27,420 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] DEBUG o.a.d.e.p.i.xsort.ExternalSortBatch - mergeAndSpill: starting total size in memory = 71964288 2017-08-18 13:15:27,421 [2668b522-5833-8fd2-0b6d-e685197f0ae3:frag:0:0] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes ran out of memory while executing the query. org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Unable to allocate sv2 for 9039 records, and not enough batchGroups to spill. 
batchGroups.size 1 spilledBatchGroups.size 0 allocated memory 71964288 allocator limit 52428800 [Error Id: 7b248f12-2b31-4013-86b6-92e6c842db48 ] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:637) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:379) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerN
Re: [ANNOUNCE] New PMC member: Arina Ielchiieva
Congratulations! Thanks for your contributions. --Robert From: Vitalii Diravka Sent: Thursday, August 3, 2017 3:32 AM To: dev@drill.apache.org Subject: Re: [ANNOUNCE] New PMC member: Arina Ielchiieva Congratulations! Well deserved. Kind regards Vitalii On Thu, Aug 3, 2017 at 2:53 AM, Arina Yelchiyeva wrote: > Thank all you! > > Kind regards > Arina > > On Thu, Aug 3, 2017 at 5:58 AM, Sudheesh Katkam > wrote: > > > Congratulations and thank you, Arina. > > > > On Wed, Aug 2, 2017 at 1:38 PM, Paul Rogers wrote: > > > > > The success of the Drill 1.11 release proves this is a well-deserved > > move. > > > Congratulations! > > > > > > - Paul > > > > > > > On Aug 2, 2017, at 11:23 AM, Aman Sinha > wrote: > > > > > > > > I am pleased to announce that Drill PMC invited Arina Ielchiieva to > the > > > PMC > > > > and she has accepted the invitation. > > > > > > > > Congratulations Arina and thanks for your contributions ! > > > > > > > > -Aman > > > > (on behalf of Drill PMC) > > > > > > > > >
Re: [ANNOUNCE] New Committer: Laurent Goujon
Congrats, Laurent! Thanks for all your work on the client side. --Robert From: Jinfeng Ni Sent: Friday, June 9, 2017 1:11 PM To: dev Subject: Re: [ANNOUNCE] New Committer: Laurent Goujon Congratulations, Laurent! On Fri, Jun 9, 2017 at 10:02 AM, Julien Le Dem wrote: > Congrats Laurent! > > On Fri, Jun 9, 2017 at 9:57 AM, rahul challapalli < > challapallira...@gmail.com> wrote: > > > Congratulations Laurent! > > > > On Fri, Jun 9, 2017 at 9:49 AM, Paul Rogers wrote: > > > > > Congratulations and welcome! > > > > > > - Paul > > > > > > > On Jun 9, 2017, at 3:33 AM, Khurram Faraaz wrote: > > > > > > > > Congratulations Laurent. > > > > > > > > > > > > From: Parth Chandra > > > > Sent: Friday, June 9, 2017 3:14:00 AM > > > > To: dev@drill.apache.org > > > > Subject: [ANNOUNCE] New Committer: Laurent Goujon > > > > > > > > The Project Management Committee (PMC) for Apache Drill has invited > > > Laurent > > > > Goujon to become a committer, and we are pleased to announce that he > > has > > > > accepted. > > > > > > > > Laurent has a long list of contributions many in the client side > > > interfaces > > > > and metadata queries. > > > > > > > > Welcome Laurent, and thank you for your contributions. Keep up the > > good > > > > work ! > > > > > > > > - Parth > > > > (on behalf of the Apache Drill PMC) > > > > > > > > > > > > -- > Julien >
Re: [ANNOUNCE] New Committer: Paul Rogers
Congrats, Paul! From: Chunhui Shi Sent: Friday, May 19, 2017 9:44 AM To: dev Subject: Re: [ANNOUNCE] New Committer: Paul Rogers Congrats Paul! Thank you for your contributions! From: rahul challapalli Sent: Friday, May 19, 2017 9:20:52 AM To: dev Subject: Re: [ANNOUNCE] New Committer: Paul Rogers Congratulations Paul. Well Deserved. On Fri, May 19, 2017 at 8:46 AM, Gautam Parai wrote: > Congratulations Paul and thank you for your contributions! > > > Gautam > > > From: Abhishek Girish > Sent: Friday, May 19, 2017 8:27:05 AM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New Committer: Paul Rogers > > Congrats Paul! > > On Fri, May 19, 2017 at 8:23 AM, Charles Givre wrote: > > > Congrats Paul!! > > > > On Fri, May 19, 2017 at 11:22 AM, Aman Sinha > wrote: > > > > > The Project Management Committee (PMC) for Apache Drill has invited > Paul > > > Rogers to become a committer, and we are pleased to announce that he > has > > > accepted. > > > > > > Paul has a long list of contributions that have touched many aspects of > > the > > > product. > > > > > > Welcome Paul, and thank you for your contributions. Keep up the good > > work > > > ! > > > > > > - Aman > > > > > > (on behalf of the Apache Drill PMC) > > > > > >
[jira] [Created] (DRILL-5374) Parquet filter pushdown does not prune partition with nulls when predicate uses float column
Robert Hou created DRILL-5374: - Summary: Parquet filter pushdown does not prune partition with nulls when predicate uses float column Key: DRILL-5374 URL: https://issues.apache.org/jira/browse/DRILL-5374 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.9.0 Reporter: Robert Hou Assignee: Jinfeng Ni Drill does not prune enough partitions for this query when filter pushdown is used with metadata caching. The float column is being compared with a double value. {code} 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> select count(*) from orders_parts_metadata where float_id < 1100.0; {code} To reproduce the problem, put the attached files into a directory. Then create the metadata: {code} refresh table metadata dfs.`path_to_directory`; {code} For example, if you put the files in /drill/testdata/filter/orders_parts_metadata, then run this SQL command: {code} refresh table metadata dfs.`/drill/testdata/filter/orders_parts_metadata`; {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
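The precision mismatch the ticket describes (a 32-bit Parquet FLOAT column compared against a 64-bit double literal such as 1100.0) can be illustrated outside Drill. A minimal Python sketch of the rounding effect — illustration only, not Drill's actual pruning code:

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python double to the nearest 32-bit float, then widen it back."""
    return struct.unpack('f', struct.pack('f', x))[0]

# 1100.0 is exactly representable at both widths, so the comparison is exact.
print(to_float32(1100.0) == 1100.0)   # True

# 1100.1 is not: the float32 value widens back to a slightly different double,
# the kind of mismatch that can trip up min/max-based partition pruning.
print(to_float32(1100.1))             # 1100.0999755859375
print(to_float32(1100.1) == 1100.1)   # False
```

Because the round-tripped value differs slightly from the original double, a pruning check that compares float32 column statistics against a double literal needs to widen the statistics to double consistently, or partitions can be kept or dropped incorrectly.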
Re: [ANNOUNCE] New Committer: Arina Ielchiieva
Congratulations, Arina! From: rahul challapalli Sent: Friday, February 24, 2017 9:48 AM To: dev Subject: Re: [ANNOUNCE] New Committer: Arina Ielchiieva Congrats Arina! On Fri, Feb 24, 2017 at 9:42 AM, Julian Hyde wrote: > Congratulations, and welcome! > > On Fri, Feb 24, 2017 at 9:17 AM, Abhishek Girish > wrote: > > Congratulations Arina! > > > > On Fri, Feb 24, 2017 at 9:06 AM, Sudheesh Katkam > > wrote: > > > >> The Project Management Committee (PMC) for Apache Drill has invited > Arina > >> Ielchiieva to become a committer, and we are pleased to announce that > she > >> has accepted. > >> > >> Arina has a long list of contributions [1] that have touched many > aspects > >> of the product. Her work includes features such as dynamic UDF support > and > >> temporary tables support. > >> > >> Welcome Arina, and thank you for your contributions. > >> > >> - Sudheesh, on behalf of the Apache Drill PMC > >> > >> [1] https://github.com/apache/drill/commits/master?author= > arina-ielchiieva > >> >
[jira] [Created] (DRILL-5136) Some SQL statements fail when using Simba ODBC driver 1.3
Robert Hou created DRILL-5136: - Summary: Some SQL statements fail when using Simba ODBC driver 1.3 Key: DRILL-5136 URL: https://issues.apache.org/jira/browse/DRILL-5136 Project: Apache Drill Issue Type: Bug Components: Client - ODBC Affects Versions: 1.9.0 Reporter: Robert Hou "show schemas" does not work with Simba ODBC driver SQL>show schemas 1: SQLPrepare = [MapR][Drill] (1040) Drill failed to execute the query: show schemas [30029]Query execution error. Details:[ PARSE ERROR: Encountered "( show" at line 1, column 15. Was expecting one of: ... ... ... ... ... "LATERAL" ... "(" "WITH" ... "(" "+" ... "(" "-" ... "(" ... "(" ... "("