sassai commented on issue #1962:
URL: https://github.com/apache/hudi/issues/1962#issuecomment-702568499


   @bvaradar: Sorry for the late reply. I was not able to investigate this issue further until now.
   
   In the meantime I updated Hudi to 0.6.0 to check whether the issue still occurs. Unfortunately, it does. I created a COPY_ON_WRITE table with test data for further debugging. Please find the requested information below:
   
   Hudi data set:
   
   ```console
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap/.fileids
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap/.partitions
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.temp
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer       9133 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.commit
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.commit.requested
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer        999 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.inflight
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer       1169 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.commit
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.commit.requested
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer        380 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.inflight
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer       2986 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.commit
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.commit.requested
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer       1653 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.inflight
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/archived
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer        228 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/hoodie.properties
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10
   drwxr-xr-x   - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer          0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer         93 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/.hoodie_partition_metadata
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7095995 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7096113 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_8-25-180_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7126955 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_0-25-172_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7126790 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7144341 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/4f7d1a69-112a-42d1-b0ac-adf8de1e8dad-0_7-25-179_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7120178 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/669c9159-a795-40ac-9827-4551965e1750-0_3-25-175_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7197719 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/ab9c4c36-ea10-47bc-bf05-1555bf07c4ad-0_2-25-174_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7158006 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/ad939d4e-d8dc-4723-b5a5-8ec2c064f3e3-0_6-25-178_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7170312 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/c4c0b379-79e8-4856-9605-e56b1beb1b09-0_4-25-176_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7118844 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/f5fae4bc-396d-4ff6-8b89-01c08559cb50-0_5-25-177_20201001171431.parquet
   -rw-r--r--   1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer    7156753 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/f804937c-8de2-4692-90d9-4b485305d721-0_1-25-173_20201001171431.parquet
   ```
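
For readers comparing the listing against the commit metadata, the following is a minimal sketch of how the base file names above can be grouped by file ID. It assumes the naming pattern `<fileId>_<writeToken>_<instantTime>.parquet`, which matches the names in the listing; the helper names `parse_base_file` and `latest_base_files` are hypothetical and are not part of Hudi.

```python
def parse_base_file(path):
    """Split '<fileId>_<writeToken>_<instantTime>.parquet' into its three parts."""
    name = path.rsplit("/", 1)[-1]
    if not name.endswith(".parquet"):
        raise ValueError(f"not a parquet base file: {name}")
    stem = name[: -len(".parquet")]
    # The file IDs in the listing contain hyphens but no underscores,
    # so splitting twice from the right separates the three parts.
    file_id, write_token, instant_time = stem.rsplit("_", 2)
    return file_id, write_token, instant_time


def latest_base_files(paths):
    """Return {file_id: path}, keeping only the newest instant per file ID."""
    latest = {}
    for path in paths:
        file_id, _, instant_time = parse_base_file(path)
        # Instant times are fixed-width timestamps, so string comparison orders them.
        if file_id not in latest or instant_time > latest[file_id][0]:
            latest[file_id] = (instant_time, path)
    return {file_id: path for file_id, (_, path) in latest.items()}


# A subset of the file names from the listing above.
listing = [
    "08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet",
    "08b5ed87-a749-4a82-a298-59071381dbc9-0_8-25-180_20201001171431.parquet",
    "3c49d05f-8a9b-4365-8158-a32f879d674f-0_0-25-172_20201001171431.parquet",
    "3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet",
    "4f7d1a69-112a-42d1-b0ac-adf8de1e8dad-0_7-25-179_20201001171431.parquet",
]

for file_id, path in sorted(latest_base_files(listing).items()):
    print(file_id, "->", path)
```

Under that assumption, the two file IDs that appear twice in the listing (`08b5ed87...` and `3c49d05f...`) each have an older file from instant `20201001171431` and a newer one from instant `20201001173346`.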
   
   Latest commit:
   
   ```json
   {
     "partitionToWriteStats" : {
       "year=2020/month=10/day=1" : [ {
         "fileId" : "08b5ed87-a749-4a82-a298-59071381dbc9-0",
         "path" : "year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet",
         "prevCommit" : "20201001171431",
         "numWrites" : 110569,
         "numDeletes" : 0,
         "numUpdateWrites" : 1,
         "numInserts" : 0,
         "totalWriteBytes" : 7095995,
         "totalWriteErrors" : 0,
         "tempPath" : null,
         "partitionPath" : "year=2020/month=10/day=1",
         "totalLogRecords" : 0,
         "totalLogFilesCompacted" : 0,
         "totalLogSizeCompacted" : 0,
         "totalUpdatedRecordsCompacted" : 0,
         "totalLogBlocks" : 0,
         "totalCorruptLogBlock" : 0,
         "totalRollbackBlocks" : 0,
         "fileSizeInBytes" : 7095995
       }, {
         "fileId" : "3c49d05f-8a9b-4365-8158-a32f879d674f-0",
         "path" : "year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet",
         "prevCommit" : "20201001171431",
         "numWrites" : 111111,
         "numDeletes" : 0,
         "numUpdateWrites" : 0,
         "numInserts" : 1,
         "totalWriteBytes" : 7126790,
         "totalWriteErrors" : 0,
         "tempPath" : null,
         "partitionPath" : "year=2020/month=10/day=1",
         "totalLogRecords" : 0,
         "totalLogFilesCompacted" : 0,
         "totalLogSizeCompacted" : 0,
         "totalUpdatedRecordsCompacted" : 0,
         "totalLogBlocks" : 0,
         "totalCorruptLogBlock" : 0,
         "totalRollbackBlocks" : 0,
         "fileSizeInBytes" : 7126790
       } ]
     },
     "compacted" : false,
     "extraMetadata" : {
       "schema" : "{\"type\":\"record\",\"name\":\"address_record\",\"namespace\":\"hoodie.address\",\"fields\":[{\"name\":\"id\",\"type\":[\"int\",\"null\"]},{\"name\":\"zipCode\",\"type\":[\"int\",\"null\"]},{\"name\":\"city\",\"type\":[\"string\",\"null\"]},{\"name\":\"street\",\"type\":[\"string\",\"null\"]},{\"name\":\"streetNumber\",\"type\":[\"int\",\"null\"]},{\"name\":\"uuid\",\"type\":[\"string\",\"null\"]},{\"name\":\"start_date\",\"type\":[\"string\",\"null\"]},{\"name\":\"end_date\",\"type\":[\"string\",\"null\"]},{\"name\":\"is_current\",\"type\":\"boolean\"},{\"name\":\"event_time\",\"type\":[\"string\",\"null\"]},{\"name\":\"year\",\"type\":\"int\"},{\"name\":\"month\",\"type\":\"int\"},{\"name\":\"day\",\"type\":\"int\"},{\"name\":\"its\",\"type\":\"string\"}]}"
     },
     "operationType" : "UPSERT",
     "fileIdAndRelativePaths" : {
       "08b5ed87-a749-4a82-a298-59071381dbc9-0" : "year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet",
       "3c49d05f-8a9b-4365-8158-a32f879d674f-0" : "year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet"
     },
     "totalRecordsDeleted" : 0,
     "totalLogRecordsCompacted" : 0,
     "totalScanTime" : 0,
     "totalCreateTime" : 0,
     "totalUpsertTime" : 11307,
     "totalCompactedRecordsUpdated" : 0,
     "totalLogFilesCompacted" : 0,
     "totalLogFilesSize" : 0
   }
   ```
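
As a reading aid for the commit metadata above, here is a small sketch that totals the per-file write statistics. The embedded JSON is an abbreviated copy of the commit shown above (only the counters are kept), and the function name `summarize_write_stats` is hypothetical, not a Hudi API.

```python
import json

# Abbreviated copy of the commit metadata above; only the counters are kept.
COMMIT_JSON = """
{
  "partitionToWriteStats": {
    "year=2020/month=10/day=1": [
      {"fileId": "08b5ed87-a749-4a82-a298-59071381dbc9-0",
       "prevCommit": "20201001171431",
       "numWrites": 110569, "numDeletes": 0,
       "numUpdateWrites": 1, "numInserts": 0},
      {"fileId": "3c49d05f-8a9b-4365-8158-a32f879d674f-0",
       "prevCommit": "20201001171431",
       "numWrites": 111111, "numDeletes": 0,
       "numUpdateWrites": 0, "numInserts": 1}
    ]
  },
  "operationType": "UPSERT"
}
"""


def summarize_write_stats(commit):
    """Count rewritten files and sum the record counters across all partitions."""
    totals = {"numWrites": 0, "numUpdateWrites": 0, "numInserts": 0, "numDeletes": 0}
    files = 0
    for stats in commit["partitionToWriteStats"].values():
        for stat in stats:
            files += 1
            for key in totals:
                totals[key] += stat.get(key, 0)
    return files, totals


commit = json.loads(COMMIT_JSON)
files, totals = summarize_write_stats(commit)
print(commit["operationType"], "touched", files, "files:", totals)
```

The totals show one updated record and one inserted record, while `numWrites` covers every record in the two rewritten files. As far as I understand it, this is expected for COPY_ON_WRITE, where an upsert rewrites the whole base file.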
   
   Hudi properties file:
   
   ```console
   #Properties saved on Thu Oct 01 17:14:31 UTC 2020
   #Thu Oct 01 17:14:31 UTC 2020
   hoodie.table.name=address
   hoodie.archivelog.folder=archived
   hoodie.table.type=COPY_ON_WRITE
   hoodie.table.version=1
   hoodie.timeline.layout.version=1
   ```
   
   Describe table:
   
   ```console
   
   +-----------------------------------+----------------------------------------------------+-----------------------+
   |             col_name              |                     data_type                      |        comment        |
   +-----------------------------------+----------------------------------------------------+-----------------------+
   | _hoodie_commit_time               | string                                             |                       |
   | _hoodie_commit_seqno              | string                                             |                       |
   | _hoodie_record_key                | string                                             |                       |
   | _hoodie_partition_path            | string                                             |                       |
   | _hoodie_file_name                 | string                                             |                       |
   | id                                | int                                                |                       |
   | zipcode                           | int                                                |                       |
   | city                              | string                                             |                       |
   | street                            | string                                             |                       |
   | streetnumber                      | int                                                |                       |
   | uuid                              | string                                             |                       |
   | start_date                        | string                                             |                       |
   | end_date                          | string                                             |                       |
   | is_current                        | boolean                                            |                       |
   | event_time                        | string                                             |                       |
   | its                               | string                                             |                       |
   |                                   | NULL                                               | NULL                  |
   | # Partition Information           | NULL                                               | NULL                  |
   | # col_name                        | data_type                                          | comment               |
   | year                              | int                                                |                       |
   | month                             | int                                                |                       |
   | day                               | int                                                |                       |
   |                                   | NULL                                               | NULL                  |
   | # Detailed Partition Information  | NULL                                               | NULL                  |
   | Partition Value:                  | [2020, 10, 1]                                      | NULL                  |
   | Database:                         | nyc_taxi                                           | NULL                  |
   | Table:                            | address                                            | NULL                  |
   | CreateTime:                       | Thu Oct 01 17:15:51 UTC 2020                       | NULL                  |
   | LastAccessTime:                   | UNKNOWN                                            | NULL                  |
   | Location:                         | abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1 | NULL                  |
   | Partition Parameters:             | NULL                                               | NULL                  |
   |                                   | numFiles                                           | 9                     |
   |                                   | totalSize                                          | 64289221              |
   |                                   | transient_lastDdlTime                              | 1601572551            |
   |                                   | NULL                                               | NULL                  |
   | # Storage Information             | NULL                                               | NULL                  |
   | SerDe Library:                    | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL                  |
   | InputFormat:                      | org.apache.hudi.hadoop.HoodieParquetInputFormat    | NULL                  |
   | OutputFormat:                     | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL                  |
   | Compressed:                       | No                                                 | NULL                  |
   | Num Buckets:                      | -1                                                 | NULL                  |
   | Bucket Columns:                   | []                                                 | NULL                  |
   | Sort Columns:                     | []                                                 | NULL                  |
   | Storage Desc Params:              | NULL                                               | NULL                  |
   |                                   | serialization.format                               | 1                     |
   +-----------------------------------+----------------------------------------------------+-----------------------+
   ```

