Hello,

Please let me know if this is not the correct forum; I’m happy to post it
elsewhere.

I’m trying to use XTable to convert a Hudi source to a Delta target and I’m
receiving the exception below. The table is active and frequently updated, and
it is being actively queried as a Hudi table.

Is there any other debug information I can provide to make this more useful?

Git HEAD: 4a96627a
OS: Linux (Ubuntu)
Java: 11
log4j2.xml modified to set level=trace for org.apache.hudi and org.apache.xtable

Run output with stack trace:

$ java -jar 
./xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar 
--datasetConfig config.yaml
WARNING: Runtime environment or build system does not support multi-release 
JARs. This will impact location-based features.
2024-06-05 23:22:05 INFO  org.apache.xtable.utilities.RunSync:148 - Running 
sync for basePath s3://hidden-s3-bucket/hidden-prefix/ for following table 
formats [DELTA]
2024-06-05 23:22:05 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:05 WARN  org.apache.hadoop.util.NativeCodeLoader:60 - Unable 
to load native-hadoop library for your platform... using builtin-java classes 
where applicable
2024-06-05 23:22:05 WARN  org.apache.hadoop.metrics2.impl.MetricsConfig:136 - 
Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
2024-06-05 23:22:06 WARN  org.apache.hadoop.fs.s3a.SDKV2Upgrade:39 - Directly 
referencing AWS SDK V1 credential provider 
com.amazonaws.auth.DefaultAWSCredentialsProviderChain. AWS SDK V1 credential 
providers will be removed once S3A is upgraded to SDK V2
2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 - 
Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:155 - Loading Active commit 
timeline for s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 - 
Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3://hidden-s3-bucket/hidden-prefix
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 - 
Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/.hoodie/hoodie.properties
2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__deltacommit__COMPLETED__20240605231917000]}
2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 7 ms 
to read  0 instants, 0 replaced file groups
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by 
org.apache.hadoop.hbase.util.UnsafeAvailChecker 
(file:/incubator-xtable/xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar)
 to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of 
org.apache.hadoop.hbase.util.UnsafeAvailChecker
WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
WARNING: All illegal access operations will be denied in a future release
2024-06-05 23:22:08 INFO  org.apache.hudi.common.util.ClusteringUtils:147 - 
Found 0 files in pending clustering operations
2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.FileSystemViewManager:243 - Creating View 
Manager with storage type :MEMORY
2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.FileSystemViewManager:255 - Creating 
in-memory based Table View
2024-06-05 23:22:11 INFO  
org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore 
`LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme 
`s3`
2024-06-05 23:22:11 INFO  org.apache.spark.sql.delta.DeltaLog:60 - Creating 
initial snapshot without metadata, because the directory is empty
2024-06-05 23:22:13 INFO  org.apache.spark.sql.delta.InitialSnapshot:60 - 
[tableId=8eda3e8f-9dae-4d19-ac72-f625b8ccb0c5] Created snapshot 
InitialSnapshot(path=s3://hidden-s3-bucket/hidden-prefix/_delta_log, 
version=-1, 
metadata=Metadata(167f7b26-f82d-4765-97b9-b6e47d9147ec,null,null,Format(parquet,Map()),null,List(),Map(),Some(1717629733296)),
 
logSegment=LogSegment(s3://hidden-s3-bucket/hidden-prefix/_delta_log,-1,List(),None,-1),
 checksumOpt=None)
2024-06-05 23:22:13 INFO  org.apache.xtable.conversion.ConversionController:240 
- No previous InternalTable sync for target. Falling back to snapshot sync.
2024-06-05 23:22:13 INFO  org.apache.hudi.common.table.TableSchemaResolver:317 
- Reading schema from 
s3://hidden-s3-bucket/hidden-prefix/op_date=2024-06-05/3b5d27af-ef39-4862-bbd9-d4a010f6056e-0_0-71-375_20240605231837826.parquet
2024-06-05 23:22:14 INFO  org.apache.hudi.metadata.HoodieTableMetadataUtil:927 
- Loading latest merged file slices for metadata table partition files
2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 1 ms 
to read  0 instants, 0 replaced file groups
2024-06-05 23:22:14 INFO  org.apache.hudi.common.util.ClusteringUtils:147 - 
Found 0 files in pending clustering operations
2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:429 - Building 
file system view for partition (files)
2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:435 - #files 
found in partition (files) =30, Time taken =40
2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.HoodieTableFileSystemView:386 - Adding 
file-groups for partition :files, #FileGroups=1
2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:165 - 
addFilesToView: NumFiles=30, NumFileGroups=1, FileGroupsCreationTime=15, 
StoreTimeTaken=1
2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:449 - Time to 
load partition (files) =57
2024-06-05 23:22:14 INFO  
org.apache.hudi.metadata.HoodieBackedTableMetadata:451 - Opened metadata base 
file from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/files/files-0000-0_0-67-1304_20240605210834482001.hfile
 at instant 20240605210834482001 in 9 ms
2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
2024-06-05 23:22:14 ERROR org.apache.xtable.utilities.RunSync:171 - Error 
running sync for s3://hidden-s3-bucket/hidden-prefix/
org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of 
partition from metadata
    at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:127)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.xtable.hudi.HudiDataFileExtractor.getFilesCurrentState(HudiDataFileExtractor.java:116)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.xtable.hudi.HudiConversionSource.getCurrentSnapshot(HudiConversionSource.java:97)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:38)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:183)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.xtable.conversion.ConversionController.sync(ConversionController.java:121)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.xtable.utilities.RunSync.main(RunSync.java:169) 
[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
Caused by: java.lang.IllegalStateException: Recursive update
    at 
java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1739)
 ~[?:?]
    at org.apache.avro.util.MapUtil.computeIfAbsent(MapUtil.java:42) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:257) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:508) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:355) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:263) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:248) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:209)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeHoodieRollbackMetadata(TimelineMetadataUtils.java:177)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getRollbackedCommits(HoodieTableMetadataUtil.java:1355)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.lambda$getValidInstantTimestamps$37(HoodieTableMetadataUtil.java:1284)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) 
~[?:?]
    at 
java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
    at 
java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) 
~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) 
~[?:?]
    at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) 
~[?:?]
    at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
~[?:?]
    at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
 ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) 
~[?:?]
    at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getValidInstantTimestamps(HoodieTableMetadataUtil.java:1283)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:473)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getOrCreateReaders$10(HoodieBackedTableMetadata.java:412)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705)
 ~[?:?]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:412)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.lookupKeysFromFileSlice(HoodieBackedTableMetadata.java:291)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:255)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:145)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:316)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:125)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
    ... 6 more
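
In case it helps with triage: the "Recursive update" at the bottom of the trace
is the JDK's own guard in ConcurrentHashMap.computeIfAbsent, which throws
IllegalStateException when the mapping function re-enters computeIfAbsent on the
same map while the bin is still reserved by the outer call. From the frames it
looks like Avro's SpecificData class cache is being re-entered while
deserializing the rollback metadata, though I can't tell from this trace exactly
where the re-entry happens. A minimal sketch, independent of Hudi and Avro, that
reproduces the same exception on Java 11:

import java.util.concurrent.ConcurrentHashMap;

public class RecursiveUpdateDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        // The outer computeIfAbsent reserves the bin for "key"; the nested call
        // on the same map and key finds that reservation and throws
        // java.lang.IllegalStateException: Recursive update
        cache.computeIfAbsent("key", k -> cache.computeIfAbsent("key", k2 -> "value"));
    }
}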

config.yaml:

sourceFormat: HUDI
targetFormats:
  - DELTA
datasets:
  -
    tableBasePath: s3://hidden-s3-bucket/hidden-prefix
    tableName: hidden_table
    partitionSpec: op_date:VALUE


hoodie.properties from the table:

hoodie.table.timeline.timezone=LOCAL
hoodie.table.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator
hoodie.table.precombine.field=ts_millis
hoodie.table.version=6
hoodie.database.name=
hoodie.datasource.write.hive_style_partitioning=true
hoodie.table.metadata.partitions.inflight=
hoodie.table.checksum=2622850774
hoodie.partition.metafile.use.base.format=false
hoodie.table.cdc.enabled=false
hoodie.archivelog.folder=archived
hoodie.table.name=hidden_table
hoodie.populate.meta.fields=true
hoodie.table.type=COPY_ON_WRITE
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.base.file.format=PARQUET
hoodie.datasource.write.drop.partition.columns=false
hoodie.table.metadata.partitions=files
hoodie.timeline.layout.version=1
hoodie.table.recordkey.fields=record_id
hoodie.table.partition.fields=op_date


Thanks,
Lucas Fairchild-Madar

