[ https://issues.apache.org/jira/browse/HIVE-27898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819475#comment-17819475 ]
yongzhi.shao commented on HIVE-27898: ------------------------------------- [~zhangbutao] I tried the master branch, using the latest updated code today, and the problem persists. {code:java} 0: jdbc:hive2://smaster01:10000> select version(); INFO : Compiling command(queryId=hive_20240222134713_2aaf182e-be55-4946-9a7c-c19d4a047e13): select version() INFO : Semantic Analysis Completed (retrial = false) INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:string, comment:null)], properties:null) INFO : Completed compiling command(queryId=hive_20240222134713_2aaf182e-be55-4946-9a7c-c19d4a047e13); Time taken: 1.974 seconds INFO : Concurrency mode is disabled, not creating a lock manager INFO : Executing command(queryId=hive_20240222134713_2aaf182e-be55-4946-9a7c-c19d4a047e13): select version() INFO : Completed executing command(queryId=hive_20240222134713_2aaf182e-be55-4946-9a7c-c19d4a047e13); Time taken: 0.004 seconds _c0 4.0.0-beta-2-SNAPSHOT r6e061e6559522c8a060c1b55439ada0001bf5e5d 0: jdbc:hive2://smaster01:10000> select * from (select * from iceberg_dwd.test_data_04 limit 10) s1; INFO : Compiling command(queryId=hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6): select * from (select * from iceberg_dwd.test_data_04 limit 10) s1 INFO : Semantic Analysis Completed (retrial = false) INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:s1.id, type:string, comment:null), FieldSchema(name:s1.name, type:string, comment:null)], properties:null) INFO : Completed compiling command(queryId=hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6); Time taken: 1.007 seconds INFO : Concurrency mode is disabled, not creating a lock manager INFO : Executing command(queryId=hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6): select * from (select * from iceberg_dwd.test_data_04 limit 10) s1 INFO : Query ID = hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6 INFO : Total jobs = 1 INFO : Launching Job 1 out of 1 INFO : Starting task [Stage-1:MAPRED] in serial mode INFO : Subscribed to counters: [] for queryId: hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6 INFO : Tez session hasn't been created yet. Opening session INFO : Dag name: select * from (select......a_04 limit 10) s1 (Stage-1) INFO : HS2 Host: [smaster01], Query ID: [hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6], Dag ID: [dag_1706163520799_39421_1], DAG Session ID: [application_1706163520799_39421] INFO : Status: Running (Executing on YARN cluster with App id application_1706163520799_39421)---------------------------------------------------------------------------------------------- VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED ---------------------------------------------------------------------------------------------- Map 1 .......... container SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... container SUCCEEDED 1 1 0 0 0 0 ---------------------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 20.10 s ---------------------------------------------------------------------------------------------- INFO : Status: DAG finished successfully in 19.90 seconds INFO : DAG ID: dag_1706163520799_39421_1 INFO : INFO : Query Execution Summary INFO : ---------------------------------------------------------------------------------------------- INFO : OPERATION DURATION INFO : ---------------------------------------------------------------------------------------------- INFO : Compile Query 1.01s INFO : Prepare Plan 16.62s INFO : Get Query Coordinator (AM) 0.03s INFO : Submit Plan 0.54s INFO : Start DAG 0.07s INFO : Run DAG 19.90s INFO : ---------------------------------------------------------------------------------------------- INFO : INFO : Task Execution Summary INFO : ---------------------------------------------------------------------------------------------- INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS INFO : ---------------------------------------------------------------------------------------------- INFO : Map 1 0.00 10,290 290 2 2 INFO : Reducer 2 0.00 1,020 0 2 0 INFO : ---------------------------------------------------------------------------------------------- INFO : INFO : org.apache.tez.common.counters.DAGCounter: INFO : NUM_SUCCEEDED_TASKS: 2 INFO : TOTAL_LAUNCHED_TASKS: 2 INFO : RACK_LOCAL_TASKS: 1 INFO : AM_CPU_MILLISECONDS: 5430 INFO : WALL_CLOCK_MILLIS: 9971 INFO : AM_GC_TIME_MILLIS: 51 INFO : INITIAL_HELD_CONTAINERS: 0 INFO : TOTAL_CONTAINERS_USED: 1 INFO : TOTAL_CONTAINER_ALLOCATION_COUNT: 2 INFO : TOTAL_CONTAINER_LAUNCH_COUNT: 1 INFO : TOTAL_CONTAINER_REUSE_COUNT: 1 INFO : File System Counters: INFO : FILE_BYTES_READ: 0 INFO : FILE_BYTES_WRITTEN: 0 INFO : FILE_READ_OPS: 0 INFO : FILE_LARGE_READ_OPS: 0 INFO : FILE_WRITE_OPS: 0 INFO : HDFS_BYTES_READ: 1210 INFO : HDFS_BYTES_WRITTEN: 121 INFO : HDFS_READ_OPS: 8 INFO : HDFS_LARGE_READ_OPS: 0 INFO : HDFS_WRITE_OPS: 2 INFO : org.apache.tez.common.counters.TaskCounter: INFO : SPILLED_RECORDS: 0 INFO : NUM_SHUFFLED_INPUTS: 0 INFO : NUM_FAILED_SHUFFLE_INPUTS: 0 INFO : GC_TIME_MILLIS: 290 INFO : CPU_MILLISECONDS: 11310 INFO : WALL_CLOCK_MILLISECONDS: 9561 INFO : PHYSICAL_MEMORY_BYTES: 3435134976 INFO : VIRTUAL_MEMORY_BYTES: 7557877760 INFO : COMMITTED_HEAP_BYTES: 3435134976 INFO : INPUT_RECORDS_PROCESSED: 4 INFO : INPUT_SPLIT_LENGTH_BYTES: 880 INFO : OUTPUT_RECORDS: 2 INFO : APPROXIMATE_INPUT_RECORDS: 2 INFO : OUTPUT_LARGE_RECORDS: 0 INFO : OUTPUT_BYTES: 8 INFO : OUTPUT_BYTES_WITH_OVERHEAD: 18 INFO : OUTPUT_BYTES_PHYSICAL: 46 INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0 INFO : ADDITIONAL_SPILLS_BYTES_READ: 0 INFO : ADDITIONAL_SPILL_COUNT: 0 INFO : SHUFFLE_BYTES: 0 INFO : SHUFFLE_BYTES_DECOMPRESSED: 0 INFO : SHUFFLE_BYTES_TO_MEM: 0 INFO : SHUFFLE_BYTES_TO_DISK: 0 INFO : SHUFFLE_BYTES_DISK_DIRECT: 0 INFO : SHUFFLE_PHASE_TIME: 158 INFO : FIRST_EVENT_RECEIVED: 157 INFO : LAST_EVENT_RECEIVED: 157 INFO : DATA_BYTES_VIA_EVENT: 22 INFO : HIVE: INFO : CREATED_FILES: 1 INFO : DESERIALIZE_ERRORS: 0 INFO : RECORDS_IN_Map_1: 2 INFO : RECORDS_OUT_0: 2 INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 2 INFO : RECORDS_OUT_INTERMEDIATE_Reducer_2: 0 INFO : RECORDS_OUT_OPERATOR_FS_13: 2 INFO : RECORDS_OUT_OPERATOR_LIM_11: 2 INFO : RECORDS_OUT_OPERATOR_LIM_8: 2 INFO : RECORDS_OUT_OPERATOR_MAP_0: 0 INFO : RECORDS_OUT_OPERATOR_RS_10: 2 INFO : RECORDS_OUT_OPERATOR_SEL_12: 2 INFO : RECORDS_OUT_OPERATOR_SEL_9: 2 INFO : RECORDS_OUT_OPERATOR_TS_0: 2 INFO : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters: INFO : GROUPED_INPUT_SPLITS_Map_1: 1 INFO : INPUT_DIRECTORIES_Map_1: 1 INFO : INPUT_FILES_Map_1: 1 INFO : RAW_INPUT_SPLITS_Map_1: 1 INFO : Completed executing command(queryId=hive_20240222134526_87055172-e808-4726-acdf-89f6b8b815e6); Time taken: 37.363 seconds s1.id NULL s1.name a s1.id NULL s1.name b 2 rows selected (38.804 seconds) {code} > HIVE4 can't use ICEBERG table in subqueries > ------------------------------------------- > > Key: HIVE-27898 > URL: https://issues.apache.org/jira/browse/HIVE-27898 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration > Affects Versions: 4.0.0-beta-1 > Reporter: yongzhi.shao > Priority: Critical > > Currently, we found that when using HIVE4-BETA1 version, if we use ICEBERG > table in the subquery, we can't get any data in the end. > I have used HIVE3-TEZ for cross validation and HIVE3 does not have this > problem when querying ICEBERG. > {code:java} > --spark3.4.1+iceberg 1.4.2 > CREATE TABLE datacenter.dwd.b_std_trade ( > uni_order_id STRING, > data_from BIGINT, > partner STRING, > plat_code STRING, > order_id STRING, > uni_shop_id STRING, > uni_id STRING, > guide_id STRING, > shop_id STRING, > plat_account STRING, > total_fee DOUBLE, > item_discount_fee DOUBLE, > trade_discount_fee DOUBLE, > adjust_fee DOUBLE, > post_fee DOUBLE, > discount_rate DOUBLE, > payment_no_postfee DOUBLE, > payment DOUBLE, > pay_time STRING, > product_num BIGINT, > order_status STRING, > is_refund STRING, > refund_fee DOUBLE, > insert_time STRING, > created STRING, > endtime STRING, > modified STRING, > trade_type STRING, > receiver_name STRING, > receiver_country STRING, > receiver_state STRING, > receiver_city STRING, > receiver_district STRING, > receiver_town STRING, > receiver_address STRING, > receiver_mobile STRING, > trade_source STRING, > delivery_type STRING, > consign_time STRING, > orders_num BIGINT, > is_presale BIGINT, > presale_status STRING, > first_fee_paytime STRING, > last_fee_paytime STRING, > first_paid_fee DOUBLE, > tenant STRING, > tidb_modified STRING, > step_paid_fee DOUBLE, > seller_flag STRING, > is_used_store_card BIGINT, > store_card_used DOUBLE, > store_card_basic_used DOUBLE, > store_card_expand_used DOUBLE, > order_promotion_num BIGINT, > item_promotion_num BIGINT, > buyer_remark STRING, > seller_remark STRING, > trade_business_type STRING) > USING iceberg > PARTITIONED BY (uni_shop_id, truncate(4, created)) > LOCATION '/iceberg-catalog/warehouse/dwd/b_std_trade' > TBLPROPERTIES ( > 'current-snapshot-id' = '7217819472703702905', > 'format' = 'iceberg/orc', > 'format-version' = '1', > 'hive.stored-as' = 'iceberg', > 'read.orc.vectorization.enabled' = 'true', > 'sort-order' = 'uni_shop_id ASC NULLS FIRST, created ASC NULLS FIRST', > 'write.distribution-mode' = 'hash', > 'write.format.default' = 'orc', > 'write.metadata.delete-after-commit.enabled' = 'true', > 'write.metadata.previous-versions-max' = '3', > 'write.orc.bloom.filter.columns' = 'order_id', > 'write.orc.compression-codec' = 'zstd') > --hive-iceberg > CREATE EXTERNAL TABLE iceberg_dwd.b_std_trade > STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' > LOCATION 'hdfs://xxxx/iceberg-catalog/warehouse/dwd/b_std_trade' > TBLPROPERTIES > ('iceberg.catalog'='location_based_table','engine.hive.enabled'='true'); > select * from iceberg_dwd.b_std_trade > where uni_shop_id = 'TEST|11111' limit 10 --10 rows > select * > from ( > select * from iceberg_dwd.b_std_trade > where uni_shop_id = 'TEST|11111' limit 10 > ) t1; --10 rows > select uni_shop_id > from ( > select * from iceberg_dwd.b_std_trade > where uni_shop_id = 'TEST|11111' limit 10 > ) t1; --0 rows > select uni_shop_id > from ( > select uni_shop_id as uni_shop_id from iceberg_dwd.b_std_trade > where uni_shop_id = 'TEST|11111' limit 10 > ) t1; --0 rows > --hive-orc > select uni_shop_id > from ( > select * from iceberg_dwd.trade_test > where uni_shop_id = 'TEST|11111' limit 10 > ) t1; --10 ROWS{code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)