Hi Peter >From the logs you shared, the Parquet-related messages are a warning and has >nothing to do with the unresponsiveness and subsequent crash of the Drillbit.
The Zookeeper timeout indicates that either the ZK or your DBit was possibly overloaded. Since th Drillbit is reporting the timeout, the lack of a response is because of the Zookeeper. That, however, should not crash that Drillbit. Do you have GC logs of the Drillbit? You'll need to edit the conf/drill-env.sh file to start the Drillbit JVM with additional parameters: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc -Xloggc:/var/log/drill/gc.log The fact that your Drillbit's webconsole is unresponsive makes me believe that it most likely ran out of (heap) memory. The GC log would indicate that. Now, if that is the case, heap memory would need to be increased. However, this is assuming you are NOT running your query via the WebConsole. If you are, I'd recommend you use a JDBC tool like DBeaver or SQuirreL. With the WebConsole, the Drillbit uses the heap to collect the query's result set and would also involve the scarce webserver threads to be busy building this into a single JSON response to your HTTP based query submission. This results in those threads being unavailable to address other relatively quick HTTP requests (like your other general interactions with the WebConsole). If you MUST use the WebConsole to submit, it would make sense to view the progress from another Drillbit's WebConsole. In the meanwhile, you can mitigate the ZK timeout with these params: drill.exec.zk.timeout drill.exec.zk.retry.delay drill.exec.zk.retry.count Hope this helps! Let us know how things turn out. Kunal On 8/1/2018 3:25:40 AM, Peter Edike <[email protected]> wrote: Hello everyone, I am currently using apache drill to query a directory of parquet files (total size 64gb with each file averaging 6.4Gb per size). I am running the following query select * from dfs.rj.vw_reports_201805 where unique_key in ( select unique_key from dfs.rj.vw_reports_201805 group by unique_key having count(unique_key) > 1 ) limit 40 Now this query takes about 42 minutes after which it fails because one of the drillbits in the drillbit cluster crashes. I check the logs and I see several errors summarized as follows 20:35:18.006 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (172.35.15.129:5181,172.35.15.151:5181,172.35.15.152:5181) and timeout (5000) / elapsed (13347) org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) [curator-client-2.7.1.jar:na] at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [curator-client-2.7.1.jar:na] at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.7.1.jar:na] at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806) [curator-framework-2.7.1.jar:na] at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792) [curator-framework-2.7.1.jar:na] at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62) [curator-framework-2.7.1.jar:na] at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257) [curator-framework-2.7.1.jar:na] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_171] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171] AND Aug 1, 2018 7:54:18 AM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb) org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb) using format: (.+) version ((.*) )?\(build ?(.*)\) at org.apache.parquet.VersionParser.parse(VersionParser.java:112) at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66) at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264) at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568) at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545) at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455) at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:404) at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:390) at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:117) at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:53) at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146) at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1633) at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:142) at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182) at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:110) at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87) at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:206) at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) The Schema for the data I am querying in question is as follows index_no ANY YES bank_code ANY YES trxn_category ANY YES Debit_Account_Type ANY YES Credit_Account_Type ANY YES trxn_amount ANY YES trxn_fee ANY YES trxn_date ANY YES Currency ANY YES Late_reversal ANY YES card_type ANY YES terminal_type ANY YES Acquirer ANY YES Issuer ANY YES Value_SettleAmount ANY YES Value_RequestedAmount ANY YES Volume ANY YES unique_key ANY YES source_node_name ANY YES business_date ANY YES PT_datetime_tran_local ANY YES PTC_pan ANY YES PTC_terminal_id ANY YES PTC_card_acceptor_id_code ANY YES PTC_card_acceptor_name_loc ANY YES PT_system_trace_audit_nr ANY YES PT_message_type ANY YES PT_from_account_id ANY YES PT_to_account_id ANY YES PT_tran_type ANY YES PT_rsp_code_req ANY YES PT_rsp_code_rsp ANY YES PT_settle_amount_req ANY YES PT_settle_amount_rsp ANY YES PTC_merchant_type ANY YES PT_settle_amount_impact ANY YES PT_settle_tran_fee_req ANY YES PT_settle_tran_fee_rsp ANY YES PT_auth_id_rsp ANY YES PT_retrieval_reference_nr ANY YES PTC_totals_group ANY YES PTC_card_product ANY YES PT_tran_currency_code ANY YES PT_payee ANY YES PT_tran_amount_req ANY YES PT_tran_amount_rsp ANY YES PT_extended_tran_type ANY YES PT_tran_postilion_originated ANY YES PT_tran_completed ANY YES PT_tran_nr ANY YES PT_retention_data ANY YES PT_acquiring_inst_id_code ANY YES PT_message_reason_code ANY YES PT_sponsor_bank ANY YES PT_datetime_tran_gmt ANY YES PT_datetime_req ANY YES PT_datetime_rsp ANY YES PT_realtime_business_date ANY YES PT_recon_business_date ANY YES PT_from_account_type ANY YES PT_to_account_type ANY YES PT_tran_cash_req ANY YES PT_tran_cash_rsp ANY YES PT_tran_tran_fee_req ANY YES PT_tran_tran_fee_rsp ANY YES PT_tran_tran_fee_currency_code ANY YES PT_tran_proc_fee_req ANY YES PT_tran_proc_fee_rsp ANY YES PT_tran_proc_fee_currency_code ANY YES PT_settle_currency_code ANY YES PT_pos_entry_mode ANY YES PT_pos_condition_code ANY YES PT_tran_reversed ANY YES PT_card_verification_result ANY YES PT_online_system_id ANY YES PT_participant_id ANY YES PT_receiving_inst_id_code ANY YES PT_routing_type ANY YES PT_pt_pos_card_input_mode ANY YES PT_source_node_key ANY YES PT_proc_online_system_id ANY YES PTC_post_tran_cust_id ANY YES PTC_source_node_name ANY YES PTC_card_seq_nr ANY YES PTC_expiry_date ANY YES PTC_service_restriction_code ANY YES PTC_terminal_owner ANY YES PTC_mapped_card_acceptor_id_code ANY YES PTC_pos_terminal_type ANY YES PTC_pan_encrypted ANY YES PTSP_Account_Nr ANY YES PTSP_Code ANY YES account_PTSP_Code ANY YES PTSP_Name ANY YES Sort_Code ANY YES rdm_amt ANY YES Reward_Code ANY YES Reward_discount ANY YES ptsp_terminal_id ANY YES reward_terminal_id ANY YES txn_id ANY YES web_category_code ANY YES web_category_name ANY YES web_fee_type ANY YES web_merchant_disc ANY YES web_fee_cap ANY YES Account_Name ANY YES account_nr ANY YES Acquiring_bank ANY YES journal_amount ANY YES xls_amount ANY YES merch_cat_amount_cap ANY YES merch_cat_visa_amount_cap ANY YES reward_amount_cap ANY YES Amount_config_state ANY YES Amount_description ANY YES Authorized_Person ANY YES ACC_BANK_CODE ANY YES merch_cat_bearer ANY YES merch_cat_visa_bearer ANY YES merch_cat_visa_category_code ANY YES merch_cat_category_code ANY YES merch_cat_visa_category_name ANY YES merch_cat_category_name ANY YES credit_acc_id ANY YES credit_acc_nr_id ANY YES credit_cardholder_acc_id ANY YES credit_cardholder_acc_type ANY YES CreditAccNr_acc_id ANY YES CreditAccNr_acc_nr ANY YES CreditAccNr_acc_nr_id ANY YES CreditAccNr_se_id ANY YES CreditAccNr_state ANY YES Date_Modified ANY YES debit_acc_id ANY YES debit_acc_nr_id ANY YES debit_cardholder_acc_id ANY YES debit_cardholder_acc_type ANY YES DebitAccNr_acc_id ANY YES DebitAccNr_acc_nr ANY YES DebitAccNr_acc_nr_id ANY YES extended_trans_type ANY YES fee ANY YES Fee_amount_id ANY YES merch_cat_fee_cap ANY YES merch_cat_visa_fee_cap ANY YES reward_fee_cap ANY YES Fee_description ANY YES Fee_Discount ANY YES Fee_fee_id ANY YES fee_id ANY YES Fee_name ANY YES Fee_se_id ANY YES merch_cat_category_fee_type ANY YES merch_cat_category_visa_fee_type ANY YES merch_cat_category_merch_discount ANY YES merch_cat_category_visa_merch_discount ANY YES PT_sink_node_name ANY YES post_tran_id ANY YES Miscellaneous ANY YES terminal_owner ANY YES terminal_owner_name ANY YES REF_rpt_account_type ANY YES REF_IsPoolAccount ANY YES From_Account_Type ANY YES To_Account_Type ANY YES Tran_type_description ANY YES Region ANY YES Transaction_Status ANY YES Transaction_type_Impact ANY YES Message_Type_Desc ANY YES Response_Code_description ANY YES REF_BatchId ANY YES Beneficiary_Account ANY YES Rate ANY YES final_fee ANY YES final_fee_type ANY YES settlement_date ANY YES Please what can I do to improve performance of this query
