jai20242 opened a new issue, #11249:
URL: https://github.com/apache/hudi/issues/11249
Hi.
I have ingested data using the following configuration:
option(OPERATION_OPT_KEY, "upsert").
option(CDC_ENABLED.key(), "true").
option(TABLE_NAME, tableName).
option("hoodie.datasource.write.payload.class","CustomOverwriteWithLatestAvroPayload").
option("hoodie.avro.schema.validate","false").
option("hoodie.datasource.write.recordkey.field",keysTable.mkString(",")).
option("hoodie.datasource.write.precombine.field",COLUMN_TO_SORT).
option("hoodie.datasource.write.new.columns.nullable", "true").
option("hoodie.datasource.write.reconcile.schema","true").
option("hoodie.metadata.enable","false").
option("hoodie.index.type","SIMPLE").
option("hoodie.datasource.write.table.type","MERGE_ON_READ").
option("hoodie.compact.inline","false").
option("hoodie.datasource.write.partitionpath.field","bdp_partition").
option("hoodie.compact.inline.max.delta.commits","1").
mode(Append).
save(dataPath)
But I can't run the compaction.
The .hoodie folder has the following files:
.
..
.schema
.temp
.20240516143453846.deltacommit.crc
20240516143453846.deltacommit
.20240516143453846.deltacommit.inflight.crc
20240516143453846.deltacommit.inflight
.20240516143453846.deltacommit.requested.crc
20240516143453846.deltacommit.requested
.20240516144403250.deltacommit.crc
20240516144403250.deltacommit
.20240516144403250.deltacommit.inflight.crc
20240516144403250.deltacommit.inflight
.20240516144403250.deltacommit.requested.crc
20240516144403250.deltacommit.requested
.20240516154539132.deltacommit.crc
20240516154539132.deltacommit
.20240516154539132.deltacommit.inflight.crc
20240516154539132.deltacommit.inflight
.20240516154539132.deltacommit.requested.crc
20240516154539132.deltacommit.requested
.aux
archived
.hoodie.properties.crc
hoodie.properties
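To make the listing above easier to read, here is a minimal Scala sketch of how the timeline file names can be interpreted. It is not part of Hudi: TimelineSketch, Instant, parse and pendingCompactions are hypothetical names used only for illustration. It assumes the <instant>.<action>[.<state>] naming visible in the listing, and it assumes that a scheduled compaction would show up as a file with the action "compaction" (for example <instant>.compaction.requested). Under those assumptions the listing contains only deltacommit instants, which would be consistent with the CLI reporting no pending compaction.

```scala
// Sketch only: interprets .hoodie timeline file names, assuming the
// <instant>.<action>[.<state>] naming seen in the listing above.
object TimelineSketch {
  final case class Instant(time: String, action: String, state: String)

  // Returns None for entries that are not timeline files
  // (hidden files, .crc checksums, hoodie.properties, archived, ...).
  def parse(fileName: String): Option[Instant] = {
    if (fileName.startsWith(".") || fileName.endsWith(".crc")) None
    else fileName.split('.') match {
      case Array(time, action) if time.nonEmpty && time.forall(_.isDigit) =>
        // No state suffix: treated here as a completed instant.
        Some(Instant(time, action, "completed"))
      case Array(time, action, state) if time.nonEmpty && time.forall(_.isDigit) =>
        Some(Instant(time, action, state))
      case _ => None
    }
  }

  // Assumption: a scheduled but unfinished compaction carries the action "compaction".
  def pendingCompactions(fileNames: Seq[String]): Seq[Instant] =
    fileNames.flatMap(parse).filter(_.action == "compaction")

  def main(args: Array[String]): Unit = {
    // A few entries copied from the .hoodie listing in this issue.
    val listing = Seq(
      "20240516143453846.deltacommit",
      "20240516143453846.deltacommit.inflight",
      "20240516143453846.deltacommit.requested",
      "20240516144403250.deltacommit",
      "hoodie.properties",
      "archived"
    )
    listing.flatMap(parse).foreach(println)
    // No entry in this listing has the action "compaction".
    println(s"pending compactions: ${pendingCompactions(listing).size}")
  }
}
```

Running the same check over the full listing gives the same picture: every instant is a deltacommit, and none of the files has a compaction action.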
And a partition:
.
..
..47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_20240516143453846.log.1_1-26-114.crc
..47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_20240516143453846.log.2_1-60-262.crc
..7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_20240516143453846.log.1_0-26-113.crc
..7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_20240516143453846.log.2_0-60-261.crc
..hoodie_partition_metadata.crc
.47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_0-26-105_20240516143453846.parquet.crc
.47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_20240516143453846.log.1_1-26-114
.47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_20240516143453846.log.2_1-60-262
.7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_1-26-106_20240516143453846.parquet.crc
.7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_20240516143453846.log.1_0-26-113
.7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_20240516143453846.log.2_0-60-261
.hoodie_partition_metadata
47546248-a9d6-4a99-9c56-6dc1b0c9ad82-0_0-26-105_20240516143453846.parquet
7504f0fe-c40f-4bfa-88c0-bf905840f04b-0_1-26-106_20240516143453846.parquet
Finally, I am trying to run the compaction using the Hudi CLI.
I can see two commits:
hudi:prueba->commits show --sortBy "Total Bytes Written" --desc true --limit 10
╔═══════════════════╤═════════════════════╤═══════════════════╤═════════════════════╤══════════════════════════╤═══════════════════════╤══════════════════════════════╤══════════════╗
║ CommitTime        │ Total Bytes Written │ Total Files Added │ Total Files Updated │ Total Partitions Written │ Total Records Written │ Total Update Records Written │ Total Errors ║
╠═══════════════════╪═════════════════════╪═══════════════════╪═════════════════════╪══════════════════════════╪═══════════════════════╪══════════════════════════════╪══════════════╣
║ 20240516144403250 │ 752,5 MB            │ 0                 │ 14                  │ 7                        │ 1435323               │ 1435323                      │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20240516143453846 │ 41,7 MB             │ 14                │ 0                   │ 7                        │ 1435323               │ 0                            │ 0            ║
╚═══════════════════╧═════════════════════╧═══════════════════╧═════════════════════╧══════════════════════════╧═══════════════════════╧══════════════════════════════╧══════════════╝
But there are no compactions:
compactions show all
╔═════════════════════════╤═══════╤═══════════════════════════════╗
║ Compaction Instant Time │ State │ Total FileIds to be Compacted ║
╠═════════════════════════╧═══════╧═══════════════════════════════╣
║ (empty)                                                         ║
╚═════════════════════════════════════════════════════════════════╝
And the compaction run command returns the following message (after
first executing the compaction schedule command):
prueba->compaction run --tableName prueba
2024-05-16 14:17:08.633 INFO 58141 --- [ main] o.a.h.c.t.t.HoodieActiveTimeline : Loaded instants upto : Option{val=[20240516134708181__deltacommit__COMPLETED__20240516135028000]}
NO PENDING COMPACTION TO RUN