Hello,

Please let me know if this is not the correct forum; I’m happy to post it elsewhere.
I’m trying to use XTable to convert a Hudi source to a Delta target, and I am receiving the exception below. The table is active and frequently updated, and it is being actively queried as a Hudi table. Is there any other debug information I can provide to make this more useful?

Environment:
- git HEAD: 4a96627a
- OS: Linux/Ubuntu
- Java 11
- log4j2.xml modified to set level=trace for org.apache.hudi and org.apache.xtable (snippet at the end of this message)

Run with stacktrace:

$ java -jar ./xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig config.yaml
WARNING: Runtime environment or build system does not support multi-release JARs. This will impact location-based features.
2024-06-05 23:22:05 INFO org.apache.xtable.utilities.RunSync:148 - Running sync for basePath s3://hidden-s3-bucket/hidden-prefix/ for following table formats [DELTA]
2024-06-05 23:22:05 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:05 WARN org.apache.hadoop.util.NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2024-06-05 23:22:05 WARN org.apache.hadoop.metrics2.impl.MetricsConfig:136 - Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
2024-06-05 23:22:06 WARN org.apache.hadoop.fs.s3a.SDKV2Upgrade:39 - Directly referencing AWS SDK V1 credential provider com.amazonaws.auth.DefaultAWSCredentialsProviderChain. AWS SDK V1 credential providers will be removed once S3A is upgraded to SDK V2
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:155 - Loading Active commit timeline for s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableConfig:276 - Loading table properties from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__deltacommit__COMPLETED__20240605231917000]}
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 7 ms to read 0 instants, 0 replaced file groups
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.hbase.util.UnsafeAvailChecker (file:/incubator-xtable/xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.hbase.util.UnsafeAvailChecker
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2024-06-05 23:22:08 INFO org.apache.hudi.common.util.ClusteringUtils:147 - Found 0 files in pending clustering operations
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.FileSystemViewManager:243 - Creating View Manager with storage type :MEMORY
2024-06-05 23:22:08 INFO org.apache.hudi.common.table.view.FileSystemViewManager:255 - Creating in-memory based Table View
2024-06-05 23:22:11 INFO org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore `LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme `s3`
2024-06-05 23:22:11 INFO org.apache.spark.sql.delta.DeltaLog:60 - Creating initial snapshot without metadata, because the directory is empty
2024-06-05 23:22:13 INFO org.apache.spark.sql.delta.InitialSnapshot:60 - [tableId=8eda3e8f-9dae-4d19-ac72-f625b8ccb0c5] Created snapshot InitialSnapshot(path=s3://hidden-s3-bucket/hidden-prefix/_delta_log, version=-1, metadata=Metadata(167f7b26-f82d-4765-97b9-b6e47d9147ec,null,null,Format(parquet,Map()),null,List(),Map(),Some(1717629733296)), logSegment=LogSegment(s3://hidden-s3-bucket/hidden-prefix/_delta_log,-1,List(),None,-1), checksumOpt=None)
2024-06-05 23:22:13 INFO org.apache.xtable.conversion.ConversionController:240 - No previous InternalTable sync for target. Falling back to snapshot sync.
2024-06-05 23:22:13 INFO org.apache.hudi.common.table.TableSchemaResolver:317 - Reading schema from s3://hidden-s3-bucket/hidden-prefix/op_date=2024-06-05/3b5d27af-ef39-4862-bbd9-d4a010f6056e-0_0-71-375_20240605231837826.parquet
2024-06-05 23:22:14 INFO org.apache.hudi.metadata.HoodieTableMetadataUtil:927 - Loading latest merged file slices for metadata table partition files
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 1 ms to read 0 instants, 0 replaced file groups
2024-06-05 23:22:14 INFO org.apache.hudi.common.util.ClusteringUtils:147 - Found 0 files in pending clustering operations
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView:429 - Building file system view for partition (files)
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:435 - #files found in partition (files) =30, Time taken =40
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.HoodieTableFileSystemView:386 - Adding file-groups for partition :files, #FileGroups=1
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:165 - addFilesToView: NumFiles=30, NumFileGroups=1, FileGroupsCreationTime=15, StoreTimeTaken=1
2024-06-05 23:22:14 DEBUG org.apache.hudi.common.table.view.AbstractTableFileSystemView:449 - Time to load partition (files) =57
2024-06-05 23:22:14 INFO org.apache.hudi.metadata.HoodieBackedTableMetadata:451 - Opened metadata base file from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/files/files-0000-0_0-67-1304_20240605210834482001.hfile at instant 20240605210834482001 in 9 ms
2024-06-05 23:22:14 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded instants upto : Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:14 ERROR org.apache.xtable.utilities.RunSync:171 - Error running sync for s3://hidden-s3-bucket/hidden-prefix/
org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata
    at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:127) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.hudi.HudiDataFileExtractor.getFilesCurrentState(HudiDataFileExtractor.java:116) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.hudi.HudiConversionSource.getCurrentSnapshot(HudiConversionSource.java:97) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:38) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:183) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.sync(ConversionController.java:121) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.utilities.RunSync.main(RunSync.java:169) [xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
Caused by: java.lang.IllegalStateException: Recursive update
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1739) ~[?:?]
    at org.apache.avro.util.MapUtil.computeIfAbsent(MapUtil.java:42) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:257) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:508) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:355) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:263) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:248) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:209) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeHoodieRollbackMetadata(TimelineMetadataUtils.java:177) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.getRollbackedCommits(HoodieTableMetadataUtil.java:1355) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.lambda$getValidInstantTimestamps$37(HoodieTableMetadataUtil.java:1284) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
    at org.apache.hudi.metadata.HoodieTableMetadataUtil.getValidInstantTimestamps(HoodieTableMetadataUtil.java:1283) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:473) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getOrCreateReaders$10(HoodieBackedTableMetadata.java:412) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705) ~[?:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:412) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lookupKeysFromFileSlice(HoodieBackedTableMetadata.java:291) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:255) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:145) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:316) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:125) ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    ... 6 more

(A minimal standalone illustration of the JDK "Recursive update" behavior from the Caused by above is included at the end of this message.)

config.yaml:

sourceFormat: HUDI
targetFormats:
  - DELTA
datasets:
  - tableBasePath: s3://hidden-s3-bucket/hidden-prefix
    tableName: hidden_table
    partitionSpec: op_date:VALUE

hoodie.properties from the table:

hoodie.table.timeline.timezone=LOCAL
hoodie.table.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator
hoodie.table.precombine.field=ts_millis
hoodie.table.version=6
hoodie.database.name=
hoodie.datasource.write.hive_style_partitioning=true
hoodie.table.metadata.partitions.inflight=
hoodie.table.checksum=2622850774
hoodie.partition.metafile.use.base.format=false
hoodie.table.cdc.enabled=false
hoodie.archivelog.folder=archived
hoodie.table.name=hidden_table
hoodie.populate.meta.fields=true
hoodie.table.type=COPY_ON_WRITE
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.base.file.format=PARQUET
hoodie.datasource.write.drop.partition.columns=false
hoodie.table.metadata.partitions=files
hoodie.timeline.layout.version=1
hoodie.table.recordkey.fields=record_id
hoodie.table.partition.fields=op_date
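For completeness, the log4j2.xml change mentioned under Environment was along these lines. This is only a sketch of the two Logger entries I added; the appender setup (the Console reference shown here is assumed) and the rest of the file are whatever ships in the repo:

<Loggers>
  <!-- Added for this report: trace-level output from Hudi and XTable -->
  <Logger name="org.apache.hudi" level="trace"/>
  <Logger name="org.apache.xtable" level="trace"/>
  <!-- Existing root logger and appenders left unchanged -->
  <Root level="info">
    <AppenderRef ref="Console"/>
  </Root>
</Loggers>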
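In case it helps whoever triages this: the "Recursive update" message in the Caused by is thrown by ConcurrentHashMap itself when a computeIfAbsent mapping function re-enters computeIfAbsent on the same map. From the trace it looks like Avro's SpecificData class cache is being re-entered while the HoodieRollbackMetadata record is deserialized, though I haven't confirmed that. The toy program below has nothing to do with Hudi/Avro (the key and value are made up); it just reproduces the same JDK exception on Java 11:

import java.util.concurrent.ConcurrentHashMap;

public class RecursiveUpdateDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        // The mapping function re-enters computeIfAbsent on the same map.
        // When the nested call hits the same bin (trivially so for the same
        // key), JDK 9+ throws java.lang.IllegalStateException: Recursive update.
        cache.computeIfAbsent("schema", outer ->
                cache.computeIfAbsent("schema", inner -> "value"));
    }
}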
Thanks,
Lucas Fairchild-Madar