Hello, we just upgraded our dev cluster from Kudu 1.3 to Kudu 1.5.0-cdh5.13.1 and noticed a severe performance degradation. We run a CTAS from an Impala Parquet table into Kudu using the query below. The source table has not changed at all since the upgrade (even the row count is the same).
It used to take 11-11.5 hours on Kudu 1.3 and now takes 50-55 hours. Of course, the Impala version was also bumped with CDH 5.13. Any clue why it takes so much longer now? The table has 5.5B rows.

CREATE TABLE kudutest_ts.clinical_event_nots
PRIMARY KEY (clinical_event_id)
PARTITION BY HASH(clinical_event_id) PARTITIONS 120
STORED AS KUDU
AS SELECT
  clinical_event_id,
  encntr_id,
  person_id,
  encntr_financial_id,
  event_id,
  event_title_text,
  CAST(view_level AS STRING) AS view_level,
  order_id,
  catalog_cd,
  series_ref_nbr,
  accession_nbr,
  contributor_system_cd,
  reference_nbr,
  parent_event_id,
  event_reltn_cd,
  event_class_cd,
  event_cd,
  event_tag,
  CAST(event_end_dt_tm_os AS BIGINT) AS event_end_dt_tm_os,
  result_val,
  result_units_cd,
  result_time_units_cd,
  task_assay_cd,
  record_status_cd,
  result_status_cd,
  CAST(authentic_flag AS STRING) authentic_flag,
  CAST(publish_flag AS STRING) publish_flag,
  qc_review_cd,
  normalcy_cd,
  normalcy_method_cd,
  inquire_security_cd,
  resource_group_cd,
  resource_cd,
  CAST(subtable_bit_map AS STRING) subtable_bit_map,
  collating_seq,
  verified_prsnl_id,
  performed_prsnl_id,
  updt_id,
  CAST(updt_task AS STRING) updt_task,
  updt_cnt,
  CAST(updt_applctx AS STRING) updt_applctx,
  normal_low,
  normal_high,
  critical_low,
  critical_high,
  CAST(event_tag_set_flag AS STRING) event_tag_set_flag,
  CAST(note_importance_bit_map AS STRING) note_importance_bit_map,
  CAST(order_action_sequence AS STRING) order_action_sequence,
  entry_mode_cd,
  source_cd,
  clinical_seq,
  CAST(event_end_tz AS STRING) event_end_tz,
  CAST(event_start_tz AS STRING) event_start_tz,
  CAST(performed_tz AS STRING) performed_tz,
  CAST(verified_tz AS STRING) verified_tz,
  task_assay_version_nbr,
  modifier_long_text_id,
  ce_dynamic_label_id,
  CAST(nomen_string_flag AS STRING) nomen_string_flag,
  src_event_id,
  CAST(last_utc_ts AS BIGINT) last_utc_ts,
  device_free_txt,
  CAST(trait_bit_map AS STRING) trait_bit_map,
  CAST(clu_subkey1_flag AS STRING) clu_subkey1_flag,
  CAST(clinsig_updt_dt_tm AS BIGINT) clinsig_updt_dt_tm,
  CAST(event_end_dt_tm AS BIGINT) event_end_dt_tm,
  CAST(event_start_dt_tm AS BIGINT) event_start_dt_tm,
  CAST(expiration_dt_tm AS BIGINT) expiration_dt_tm,
  CAST(verified_dt_tm AS BIGINT) verified_dt_tm,
  CAST(src_clinsig_updt_dt_tm AS BIGINT) src_clinsig_updt_dt_tm,
  CAST(updt_dt_tm AS BIGINT) updt_dt_tm,
  CAST(valid_from_dt_tm AS BIGINT) valid_from_dt_tm,
  CAST(valid_until_dt_tm AS BIGINT) valid_until_dt_tm,
  CAST(performed_dt_tm AS BIGINT) performed_dt_tm,
  txn_id_text,
  CAST(ingest_dt_tm AS BIGINT) ingest_dt_tm
FROM v500.clinical_event