sassai commented on issue #1962:
URL: https://github.com/apache/hudi/issues/1962#issuecomment-702568499
@bvaradar: Sorry for the late reply. I was not able to investigate this
issue further until now.
In the meantime, I updated Hudi to 0.6.0 to check whether the issue still occurs.
Unfortunately, it does. I created a COPY_ON_WRITE table with test data for
further debugging. Please find the requested information below:
Hudi data set:
```console
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap/.fileids
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.aux/.bootstrap/.partitions
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/.temp
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 9133 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.commit
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.commit.requested
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 999 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171431.inflight
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 1169 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.commit
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.commit.requested
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 380 2020-10-01 17:18 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001171823.inflight
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 2986 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.commit
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.commit.requested
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 1653 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/20201001173346.inflight
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/archived
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 228 2020-10-01 17:14 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/.hoodie/hoodie.properties
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10
drwxr-xr-x - 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 0 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 93 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/.hoodie_partition_metadata
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7095995 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7096113 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_8-25-180_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7126955 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_0-25-172_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7126790 2020-10-01 17:34 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7144341 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/4f7d1a69-112a-42d1-b0ac-adf8de1e8dad-0_7-25-179_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7120178 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/669c9159-a795-40ac-9827-4551965e1750-0_3-25-175_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7197719 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/ab9c4c36-ea10-47bc-bf05-1555bf07c4ad-0_2-25-174_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7158006 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/ad939d4e-d8dc-4723-b5a5-8ec2c064f3e3-0_6-25-178_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7170312 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/c4c0b379-79e8-4856-9605-e56b1beb1b09-0_4-25-176_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7118844 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/f5fae4bc-396d-4ff6-8b89-01c08559cb50-0_5-25-177_20201001171431.parquet
-rw-r--r-- 1 3d88417a-c602-4b19-b581-ac7265074929 srv_tu_usecase2_producer 7156753 2020-10-01 17:15 abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1/f804937c-8de2-4692-90d9-4b485305d721-0_1-25-173_20201001171431.parquet
```
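To make the listing easier to read, here is a minimal sketch that groups the parquet base files by file ID and keeps the newest instant for each. It assumes the file names follow the pattern `<fileId>_<writeToken>_<instantTime>.parquet`, which is inferred from the listing above; the helper name `latest_base_files` is hypothetical and not part of Hudi.

```python
import re

# Assumed naming pattern, inferred from the listing:
# <fileId>_<writeToken>_<instantTime>.parquet
BASE_FILE_RE = re.compile(
    r"(?P<file_id>[0-9a-f-]+)_(?P<write_token>\d+-\d+-\d+)_(?P<instant>\d{14})\.parquet$"
)


def latest_base_files(paths):
    """Return {file_id: (instant, file_name)} with the newest instant per file ID."""
    latest = {}
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        match = BASE_FILE_RE.match(name)
        if match is None:
            continue  # not a base file, e.g. .hoodie_partition_metadata
        file_id = match.group("file_id")
        instant = match.group("instant")
        # Instants are fixed-width timestamps, so string comparison orders them.
        if file_id not in latest or instant > latest[file_id][0]:
            latest[file_id] = (instant, name)
    return latest
```

Applied to the listing, the two file IDs rewritten at 20201001173346 would map to that instant, and the remaining file IDs would stay at 20201001171431.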
Latest commit:
```json
{
"partitionToWriteStats" : {
"year=2020/month=10/day=1" : [ {
"fileId" : "08b5ed87-a749-4a82-a298-59071381dbc9-0",
"path" :
"year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet",
"prevCommit" : "20201001171431",
"numWrites" : 110569,
"numDeletes" : 0,
"numUpdateWrites" : 1,
"numInserts" : 0,
"totalWriteBytes" : 7095995,
"totalWriteErrors" : 0,
"tempPath" : null,
"partitionPath" : "year=2020/month=10/day=1",
"totalLogRecords" : 0,
"totalLogFilesCompacted" : 0,
"totalLogSizeCompacted" : 0,
"totalUpdatedRecordsCompacted" : 0,
"totalLogBlocks" : 0,
"totalCorruptLogBlock" : 0,
"totalRollbackBlocks" : 0,
"fileSizeInBytes" : 7095995
}, {
"fileId" : "3c49d05f-8a9b-4365-8158-a32f879d674f-0",
"path" :
"year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet",
"prevCommit" : "20201001171431",
"numWrites" : 111111,
"numDeletes" : 0,
"numUpdateWrites" : 0,
"numInserts" : 1,
"totalWriteBytes" : 7126790,
"totalWriteErrors" : 0,
"tempPath" : null,
"partitionPath" : "year=2020/month=10/day=1",
"totalLogRecords" : 0,
"totalLogFilesCompacted" : 0,
"totalLogSizeCompacted" : 0,
"totalUpdatedRecordsCompacted" : 0,
"totalLogBlocks" : 0,
"totalCorruptLogBlock" : 0,
"totalRollbackBlocks" : 0,
"fileSizeInBytes" : 7126790
} ]
},
"compacted" : false,
"extraMetadata" : {
"schema" :
"{\"type\":\"record\",\"name\":\"address_record\",\"namespace\":\"hoodie.address\",\"fields\":[{\"name\":\"id\",\"type\":[\"int\",\"null\"]},{\"name\":\"zipCode\",\"type\":[\"int\",\"null\"]},{\"name\":\"city\",\"type\":[\"string\",\"null\"]},{\"name\":\"street\",\"type\":[\"string\",\"null\"]},{\"name\":\"streetNumber\",\"type\":[\"int\",\"null\"]},{\"name\":\"uuid\",\"type\":[\"string\",\"null\"]},{\"name\":\"start_date\",\"type\":[\"string\",\"null\"]},{\"name\":\"end_date\",\"type\":[\"string\",\"null\"]},{\"name\":\"is_current\",\"type\":\"boolean\"},{\"name\":\"event_time\",\"type\":[\"string\",\"null\"]},{\"name\":\"year\",\"type\":\"int\"},{\"name\":\"month\",\"type\":\"int\"},{\"name\":\"day\",\"type\":\"int\"},{\"name\":\"its\",\"type\":\"string\"}]}"
},
"operationType" : "UPSERT",
"fileIdAndRelativePaths" : {
"08b5ed87-a749-4a82-a298-59071381dbc9-0" :
"year=2020/month=10/day=1/08b5ed87-a749-4a82-a298-59071381dbc9-0_0-89-2258_20201001173346.parquet",
"3c49d05f-8a9b-4365-8158-a32f879d674f-0" :
"year=2020/month=10/day=1/3c49d05f-8a9b-4365-8158-a32f879d674f-0_1-89-2259_20201001173346.parquet"
},
"totalRecordsDeleted" : 0,
"totalLogRecordsCompacted" : 0,
"totalScanTime" : 0,
"totalCreateTime" : 0,
"totalUpsertTime" : 11307,
"totalCompactedRecordsUpdated" : 0,
"totalLogFilesCompacted" : 0,
"totalLogFilesSize" : 0
}
```
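The write counters in the commit file can be totalled per partition with a short script. This is only a sketch: it assumes the commit file has already been read into a string, and the helper name `summarize_commit` is hypothetical; the field names are taken from the JSON above.

```python
import json


def summarize_commit(commit_json):
    """Aggregate per-partition write counters from a commit metadata JSON string."""
    meta = json.loads(commit_json)
    summary = {}
    for partition, stats in meta["partitionToWriteStats"].items():
        summary[partition] = {
            "files": len(stats),
            "numWrites": sum(s["numWrites"] for s in stats),
            "numUpdateWrites": sum(s["numUpdateWrites"] for s in stats),
            "numInserts": sum(s["numInserts"] for s in stats),
            "numDeletes": sum(s["numDeletes"] for s in stats),
        }
    return summary
```

For the commit shown above, this reports 2 files, 221680 written records, 1 update, 1 insert and 0 deletes for partition `year=2020/month=10/day=1`.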
Hudi properties file:
```console
#Properties saved on Thu Oct 01 17:14:31 UTC 2020
#Thu Oct 01 17:14:31 UTC 2020
hoodie.table.name=address
hoodie.archivelog.folder=archived
hoodie.table.type=COPY_ON_WRITE
hoodie.table.version=1
hoodie.timeline.layout.version=1
```
Describe table:
```console
+-----------------------------------+----------------------------------------------------+-----------------------+
| col_name                          | data_type                                          | comment               |
+-----------------------------------+----------------------------------------------------+-----------------------+
| _hoodie_commit_time               | string                                             |                       |
| _hoodie_commit_seqno              | string                                             |                       |
| _hoodie_record_key                | string                                             |                       |
| _hoodie_partition_path            | string                                             |                       |
| _hoodie_file_name                 | string                                             |                       |
| id                                | int                                                |                       |
| zipcode                           | int                                                |                       |
| city                              | string                                             |                       |
| street                            | string                                             |                       |
| streetnumber                      | int                                                |                       |
| uuid                              | string                                             |                       |
| start_date                        | string                                             |                       |
| end_date                          | string                                             |                       |
| is_current                        | boolean                                            |                       |
| event_time                        | string                                             |                       |
| its                               | string                                             |                       |
|                                   | NULL                                               | NULL                  |
| # Partition Information           | NULL                                               | NULL                  |
| # col_name                        | data_type                                          | comment               |
| year                              | int                                                |                       |
| month                             | int                                                |                       |
| day                               | int                                                |                       |
|                                   | NULL                                               | NULL                  |
| # Detailed Partition Information  | NULL                                               | NULL                  |
| Partition Value:                  | [2020, 10, 1]                                      | NULL                  |
| Database:                         | nyc_taxi                                           | NULL                  |
| Table:                            | address                                            | NULL                  |
| CreateTime:                       | Thu Oct 01 17:15:51 UTC 2020                       | NULL                  |
| LastAccessTime:                   | UNKNOWN                                            | NULL                  |
| Location:                         | abfs://[email protected]/data/hudi/batch/tables/nyc_taxi/address/year=2020/month=10/day=1 | NULL                  |
| Partition Parameters:             | NULL                                               | NULL                  |
|                                   | numFiles                                           | 9                     |
|                                   | totalSize                                          | 64289221              |
|                                   | transient_lastDdlTime                              | 1601572551            |
|                                   | NULL                                               | NULL                  |
| # Storage Information             | NULL                                               | NULL                  |
| SerDe Library:                    | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL                  |
| InputFormat:                      | org.apache.hudi.hadoop.HoodieParquetInputFormat    | NULL                  |
| OutputFormat:                     | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL                  |
| Compressed:                       | No                                                 | NULL                  |
| Num Buckets:                      | -1                                                 | NULL                  |
| Bucket Columns:                   | []                                                 | NULL                  |
| Sort Columns:                     | []                                                 | NULL                  |
| Storage Desc Params:              | NULL                                               | NULL                  |
|                                   | serialization.format                               | 1                     |
+-----------------------------------+----------------------------------------------------+-----------------------+
```