lcs559 opened a new issue #3113:
URL: https://github.com/apache/iceberg/issues/3113


   1. **Environment**
   
      ```shell
      hdfs: 3.1.1.3.1
      hive: 3.1.0
      iceberg: master branch build(2021/9/14)
      ```
   
   2. **Log in to Hive with Beeline**
   
      ```shell
      # beeline
      SLF4J: Class path contains multiple SLF4J bindings.
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
      Connecting to jdbc:hive2://dev.bdp.mgmt01:2181,dev.bdp.mgmt02:2181,dev.bdp.mgmt03:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
      Enter username for jdbc:hive2://dev.bdp.mgmt01:2181,dev.bdp.mgmt02:2181,dev.bdp.mgmt03:2181/default: hive
      Enter password for jdbc:hive2://dev.bdp.mgmt01:2181,dev.bdp.mgmt02:2181,dev.bdp.mgmt03:2181/default:
      21/09/14 14:12:04 [main]: INFO jdbc.HiveConnection: Connected to dev.bdp.mgmt01:10000
      Connected to: Apache Hive (version 3.1.0.3.1.5.0-152)
      Driver: Hive JDBC (version 3.1.0.3.1.5.0-152)
      Transaction isolation: TRANSACTION_REPEATABLE_READ
      Beeline version 3.1.0.3.1.5.0-152 by Apache Hive
      0: jdbc:hive2://dev.bdp.mgmt01:2181,dev.bdp.m>
      ```
   
   3. **Add `iceberg-hive-runtime-5f90476.jar`**
   
      ```sql
      > add jar hdfs://bdptest/user/hive/lib/iceberg-hive-runtime-5f90476.jar;
      INFO  : Added [/tmp/0be97055-c189-4e8d-ad33-54a50ac828bd_resources/iceberg-hive-runtime-5f90476.jar] to class path
      INFO  : Added resources: [hdfs://bdptest/user/hive/lib/iceberg-hive-runtime-5f90476.jar]
      No rows affected (0.263 seconds)
      ```
   
   4. **Create an Iceberg table**
   
      ```sql
      > CREATE TABLE iceberg.t5 (
      . . . . . . . . . . . . . . . . . . . . . . .>   id bigint,
      . . . . . . . . . . . . . . . . . . . . . . .>   name string
      . . . . . . . . . . . . . . . . . . . . . . .> ) PARTITIONED BY (
      . . . . . . . . . . . . . . . . . . . . . . .>   dept string
      . . . . . . . . . . . . . . . . . . . . . . .> ) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler';
      INFO  : Compiling command(queryId=hive_20210914141346_272fe204-b77a-4576-9cca-219d9442daf8): CREATE TABLE iceberg.t5 (
      id bigint,
      name string
      ) PARTITIONED BY (
      dept string
      ) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
      INFO  : Completed compiling command(queryId=hive_20210914141346_272fe204-b77a-4576-9cca-219d9442daf8); Time taken: 0.132 seconds
      INFO  : Executing command(queryId=hive_20210914141346_272fe204-b77a-4576-9cca-219d9442daf8): CREATE TABLE iceberg.t5 (
      id bigint,
      name string
      ) PARTITIONED BY (
      dept string
      ) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
      INFO  : Starting task [Stage-0:DDL] in serial mode
      INFO  : Completed executing command(queryId=hive_20210914141346_272fe204-b77a-4576-9cca-219d9442daf8); Time taken: 2.056 seconds
      INFO  : OK
      No rows affected (2.222 seconds)
      ```
   
   5. **Insert data**
   
      ```sql
      > insert into  iceberg.t5 values(1,'t1','d1');
      INFO  : Compiling command(queryId=hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383): insert into  iceberg.t5 values(1,'t1','d1')
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:bigint, comment:null), FieldSchema(name:_col1, type:string, comment:null), FieldSchema(name:_col2, type:string, comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383); Time taken: 0.294 seconds
      INFO  : Executing command(queryId=hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383): insert into  iceberg.t5 values(1,'t1','d1')
      INFO  : Query ID = hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383
      INFO  : Total jobs = 1
      INFO  : Starting task [Stage-0:DDL] in serial mode
      INFO  : Starting task [Stage-1:DDL] in serial mode
      INFO  : Launching Job 1 out of 1
      INFO  : Starting task [Stage-2:MAPRED] in serial mode
      INFO  : Subscribed to counters: [] for queryId: hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383
      INFO  : Tez session hasn't been created yet. Opening session
      INFO  : Dag name: insert into  iceberg.t5 values(1,'t1','d1') (Stage-2)
      INFO  : Status: Running (Executing on YARN cluster with App id application_1631502306736_0204)
      
      
      ----------------------------------------------------------------------------------------------
              VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
      ----------------------------------------------------------------------------------------------
      Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
      ----------------------------------------------------------------------------------------------
      VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 6.48 s
      ----------------------------------------------------------------------------------------------
      INFO  : Status: DAG finished successfully in 6.43 seconds
      INFO  :
      INFO  : Query Execution Summary
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  : OPERATION                            DURATION
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  : Compile Query                           0.29s
      INFO  : Prepare Plan                            6.28s
      INFO  : Get Query Coordinator (AM)              0.00s
      INFO  : Submit Plan                             0.67s
      INFO  : Start DAG                               0.69s
      INFO  : Run DAG                                 6.43s
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  :
      INFO  : Task Execution Summary
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  :      Map 1           3314.00          4,800            179               3                0
      INFO  : ----------------------------------------------------------------------------------------------
      INFO  :
      INFO  : org.apache.tez.common.counters.DAGCounter:
      INFO  :    NUM_SUCCEEDED_TASKS: 1
      INFO  :    TOTAL_LAUNCHED_TASKS: 1
      INFO  :    RACK_LOCAL_TASKS: 1
      INFO  :    AM_CPU_MILLISECONDS: 2090
      INFO  :    AM_GC_TIME_MILLIS: 0
      INFO  : File System Counters:
      INFO  :    HDFS_BYTES_WRITTEN: 896
      INFO  :    HDFS_WRITE_OPS: 1
      INFO  :    HDFS_OP_CREATE: 1
      INFO  : org.apache.tez.common.counters.TaskCounter:
      INFO  :    GC_TIME_MILLIS: 179
      INFO  :    TASK_DURATION_MILLIS: 3507
      INFO  :    CPU_MILLISECONDS: 4800
      INFO  :    PHYSICAL_MEMORY_BYTES: 260046848
      INFO  :    VIRTUAL_MEMORY_BYTES: 4520706048
      INFO  :    COMMITTED_HEAP_BYTES: 260046848
      INFO  :    INPUT_RECORDS_PROCESSED: 4
      INFO  :    INPUT_SPLIT_LENGTH_BYTES: 1
      INFO  :    OUTPUT_RECORDS: 0
      INFO  : HIVE:
      INFO  :    CREATED_FILES: 1
      INFO  :    DESERIALIZE_ERRORS: 0
      INFO  :    RECORDS_IN_Map_1: 3
      INFO  :    RECORDS_OUT_1_iceberg.t5: 1
      INFO  :    RECORDS_OUT_INTERMEDIATE_Map_1: 0
      INFO  :    RECORDS_OUT_OPERATOR_FS_5: 1
      INFO  :    RECORDS_OUT_OPERATOR_MAP_0: 0
      INFO  :    RECORDS_OUT_OPERATOR_SEL_1: 1
      INFO  :    RECORDS_OUT_OPERATOR_SEL_3: 1
      INFO  :    RECORDS_OUT_OPERATOR_TS_0: 1
      INFO  :    RECORDS_OUT_OPERATOR_UDTF_2: 1
      INFO  : TaskCounter_Map_1_INPUT__dummy_table:
      INFO  :    INPUT_RECORDS_PROCESSED: 4
      INFO  :    INPUT_SPLIT_LENGTH_BYTES: 1
      INFO  : TaskCounter_Map_1_OUTPUT_out_Map_1:
      INFO  :    OUTPUT_RECORDS: 0
      INFO  : Starting task [Stage-4:DDL] in serial mode
      INFO  : Completed executing command(queryId=hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383); Time taken: 14.075 seconds
      INFO  : OK
      No rows affected (14.431 seconds)
      ```
   
   6. **Query the Iceberg table with Hive**
   
      ```sql
      > select * from iceberg.t5;
      INFO  : Compiling command(queryId=hive_20210914141630_3f4534a0-659e-4100-af91-f7c687d75245): select * from iceberg.t5
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:t5.id, type:bigint, comment:null), FieldSchema(name:t5.name, type:string, comment:null), FieldSchema(name:t5.dept, type:string, comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20210914141630_3f4534a0-659e-4100-af91-f7c687d75245); Time taken: 0.136 seconds
      INFO  : Executing command(queryId=hive_20210914141630_3f4534a0-659e-4100-af91-f7c687d75245): select * from iceberg.t5
      INFO  : Completed executing command(queryId=hive_20210914141630_3f4534a0-659e-4100-af91-f7c687d75245); Time taken: 0.005 seconds
      INFO  : OK
      +--------+----------+----------+
      | t5.id  | t5.name  | t5.dept  |
      +--------+----------+----------+
      +--------+----------+----------+
      No rows selected (0.202 seconds)
      ```
   
   7. **Show create table**

      ```sql
      > show create table iceberg.t5;
      INFO  : Compiling command(queryId=hive_20210914141931_a48f0a14-4d07-4867-aaee-4db4ece1d05f): show create table iceberg.t5
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:createtab_stmt, type:string, comment:from deserializer)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20210914141931_a48f0a14-4d07-4867-aaee-4db4ece1d05f); Time taken: 0.027 seconds
      INFO  : Executing command(queryId=hive_20210914141931_a48f0a14-4d07-4867-aaee-4db4ece1d05f): show create table iceberg.t5
      INFO  : Starting task [Stage-0:DDL] in serial mode
      INFO  : Completed executing command(queryId=hive_20210914141931_a48f0a14-4d07-4867-aaee-4db4ece1d05f); Time taken: 0.036 seconds
      INFO  : OK
      +----------------------------------------------------+
      |                   createtab_stmt                   |
      +----------------------------------------------------+
      | CREATE EXTERNAL TABLE `iceberg.t5`(                |
      |   `id` bigint COMMENT 'from deserializer',         |
      |   `name` string COMMENT 'from deserializer',       |
      |   `dept` string COMMENT 'from deserializer')       |
      | ROW FORMAT SERDE                                   |
      |   'org.apache.iceberg.mr.hive.HiveIcebergSerDe'    |
      | STORED BY                                          |
      |   'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'  |
      |                                                    |
      | LOCATION                                           |
      |   'hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5' |
      | TBLPROPERTIES (                                    |
      |   'TRANSLATED_TO_EXTERNAL'='TRUE',                 |
      |   'bucketing_version'='2',                         |
      |   'engine.hive.enabled'='true',                    |
      |   'external.table.purge'='TRUE',                   |
      |   'last_modified_by'='hive',                       |
      |   'last_modified_time'='1631600137',               |
      |   'metadata_location'='hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/metadata/00000-6a199018-22ec-4f82-8434-5672bd26b7c4.metadata.json',  |
      |   'table_type'='ICEBERG',                          |
      |   'transient_lastDdlTime'='1631600137')            |
      +----------------------------------------------------+
      21 rows selected (0.104 seconds)
      ```
   
   8. **Check the table's files on HDFS**

      ```shell
      # sudo -u hdfs hadoop fs -ls hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5
      Found 2 items
      drwxrwx---+  - hive hadoop          0 2021-09-14 14:15 hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/data
      drwxrwx---+  - hive hadoop          0 2021-09-14 14:13 hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/metadata
      # sudo -u hdfs hadoop fs -ls hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/metadata
      Found 1 items
      -rw-rw----+  3 hive hadoop       1731 2021-09-14 14:13 hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/metadata/00000-6a199018-22ec-4f82-8434-5672bd26b7c4.metadata.json
      # sudo -u hdfs hadoop fs -ls hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/data
      Found 1 items
      drwxrwx---+  - hive hadoop          0 2021-09-14 14:15 hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/data/dept=d1
      # sudo -u hdfs hadoop fs -ls hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/data/dept=d1
      Found 1 items
      -rw-rw----+  3 hive hadoop        896 2021-09-14 14:15 hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5/data/dept=d1/00000-0-hive_20210914141536_be24fd62-5922-472d-b1df-24c2dfa8e383-job_1631502306736_0204-00001.parquet
      ```
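One detail worth noting from the listings above: the `metadata/` directory still contains only the `00000-*` file written when the table was created (14:13), even though the data file landed at 14:15, and `metadata_location` in the table properties still points at that same file. That suggests the insert's Iceberg snapshot commit never produced new table metadata, which would explain the empty `select`. One way to confirm is to parse the metadata JSON itself; the sketch below assumes only the standard Iceberg table-metadata layout (a table with no commits has `current-snapshot-id` = -1 and an empty `snapshots` list), and the inline sample is a hypothetical stand-in for the real file:

```python
import json

def has_committed_snapshot(metadata_json: str) -> bool:
    """Return True if this Iceberg table metadata records a committed snapshot."""
    md = json.loads(metadata_json)
    # Per the Iceberg table spec, a table with no commits has
    # current-snapshot-id = -1 and no entries under "snapshots".
    return md.get("current-snapshot-id", -1) != -1 and bool(md.get("snapshots"))

# Hypothetical minimal sample shaped like a freshly created table's
# 00000-*.metadata.json, i.e. before any snapshot has been committed.
fresh = json.dumps({
    "format-version": 1,
    "location": "hdfs://bdptest/warehouse/tablespace/managed/hive/iceberg.db/t5",
    "current-snapshot-id": -1,
    "snapshots": [],
})
print(has_committed_snapshot(fresh))  # → False
```

In practice the real file could be pulled down with `hadoop fs -cat .../t5/metadata/00000-6a199018-22ec-4f82-8434-5672bd26b7c4.metadata.json` and fed to this check; `False` would mean the table has never recorded a snapshot despite the successful-looking insert.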
   
   

