RajasekarSribalan opened a new issue #1823:
URL: https://github.com/apache/hudi/issues/1823
**Describe the problem you faced**
We are writing to a Hudi MOR table via Spark Streaming: we read data from
Kafka and write to Hudi MOR. We have a huge volume of inserts/upserts and want
good performance, so we chose MOR tables. We have disabled inline compaction to
avoid blocking ingestion, and we want compaction to run asynchronously via the Hudi CLI.
The issue is that we are unable to see any COMPACTION instant in the DFS, so we
get an error saying "No Pending compaction". We do see a lot of delta logs
being created/appended, but compaction is never requested.
We want to understand when the compaction request is triggered when
inline compaction is switched OFF, so that we can run compaction via hudi-cli.
Please assist, @vinothchandar @bhasudha. There is not much information
on async compaction in the Hudi documentation.
upsertDf.write
  .format("hudi")
  .options(getQuickstartWriteConfigs)
  .option(OPERATION_OPT_KEY, "upsert")
  .option(PRECOMBINE_FIELD_OPT_KEY, hudi_precombine_key)
  .option(RECORDKEY_FIELD_OPT_KEY, hudi_key)
  .option(PARTITIONPATH_FIELD_OPT_KEY, "")
  .option(KEYGENERATOR_CLASS_OPT_KEY, classOf[NonpartitionedKeyGenerator].getName)
  .option(TABLE_NAME, tablename)
  .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY, DataSourceWriteOptions.MOR_STORAGE_TYPE_OPT_VAL)
  .option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL)
  .option(HIVE_SYNC_ENABLED_OPT_KEY, "true")
  .option(HIVE_URL_OPT_KEY, "XXXXXXX")
  .option(HIVE_DATABASE_OPT_KEY, hudi_db)
  .option(HIVE_TABLE_OPT_KEY, tablename)
  .option(HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, classOf[NonPartitionedExtractor].getName)
  .option(HoodieStorageConfig.PARQUET_COMPRESSION_CODEC, "snappy")
  .option(HoodieCompactionConfig.INLINE_COMPACT_PROP, "false")
  .option(HoodieCompactionConfig.INLINE_COMPACT_NUM_DELTA_COMMITS_PROP, "24")
  .mode(Append)
  .save("/user/xyz/hudi/" + tablename)
**Environment Description**
* Hudi version : 0.5.2
* Spark version : 2.2.0
* Hive version : 1.0
* Hadoop version : 2.7
* Storage (HDFS/S3/GCS..) :
* Running on Docker? (yes/no) :
**Stacktrace**
hudi:user_emails->compactions show all
╔═════════════════════════╤═══════╤═══════════════════════════════╗
║ Compaction Instant Time │ State │ Total FileIds to be Compacted ║
╠═════════════════════════╧═══════╧═══════════════════════════════╣
║ (empty) ║
╚═════════════════════════════════════════════════════════════════╝
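
For reference, a minimal sketch of how the timeline folder could be checked by hand for pending compaction instants, independent of the CLI. It assumes that scheduled compactions appear under the table's `.hoodie` directory as files ending in `.compaction.requested` or `.compaction.inflight`, and that the directory has been copied to (or mounted on) the local filesystem. The object name `PendingCompactionCheck` and the default path are hypothetical.

```scala
import java.io.File

object PendingCompactionCheck {
  // Assumption: pending compaction instants are stored as timeline files
  // with one of these suffixes.
  private val pendingSuffixes = Seq(".compaction.requested", ".compaction.inflight")

  /** Returns the timeline file names that look like pending compaction instants, sorted. */
  def pendingCompactions(fileNames: Seq[String]): Seq[String] =
    fileNames.filter(name => pendingSuffixes.exists(s => name.endsWith(s))).sorted

  def main(args: Array[String]): Unit = {
    // Hypothetical local copy of the table's .hoodie directory.
    val timelineDir = new File(args.headOption.getOrElse(".hoodie"))
    // File.list() returns null when the path is missing or not a directory.
    val names = Option(timelineDir.list()).map(_.toSeq).getOrElse(Seq.empty)
    val pending = pendingCompactions(names)
    if (pending.isEmpty) println("No pending compaction instants found")
    else pending.foreach(println)
  }
}
```

If this prints nothing but the "not found" message while delta commit files keep accumulating, it would match what the CLI reports above: no compaction has been scheduled at all, rather than one being scheduled and not executed.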