Peeyush Gupta created ASTERIXDB-3602:
----------------------------------------
Summary: Failed to read valid Delta table with timestamp partition key
Key: ASTERIXDB-3602
URL: https://issues.apache.org/jira/browse/ASTERIXDB-3602
Project: Apache AsterixDB
Issue Type: Bug
Components: EXT - External data
Reporter: Peeyush Gupta
When the partition key is a timestamp and the user has inserted a tuple whose
partition key value is in ISO 8601 timestamp format, the Delta Kernel API that
we use to read Delta tables throws the following exception:
{code:java}
2025-04-09T23:24:40.732+00:00 WARN CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-3] handleException: unexpected exception: uuid=f7e020e9-8342-492b-b934-93df3bfdd769, clientContextID=3355a664-82b6-4e65-9f3a-8c9269d38d36
org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
    at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:72) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.hyracks.control.nc.Task.run(Task.java:399) ~[hyracks-control-nc-1.1.0-1238.jar:1.1.0-1238]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
    at java.sql/java.sql.Timestamp.valueOf(Timestamp.java:196) ~[java.sql:?]
    at io.delta.kernel.internal.util.PartitionUtils.literalForPartitionValue(PartitionUtils.java:447) ~[delta-kernel-api-3.2.1.jar:3.2.1]
    at io.delta.kernel.internal.util.PartitionUtils.withPartitionColumns(PartitionUtils.java:100) ~[delta-kernel-api-3.2.1.jar:3.2.1]
    at io.delta.kernel.Scan$1.next(Scan.java:205) ~[delta-kernel-api-3.2.1.jar:3.2.1]
    at io.delta.kernel.Scan$1.next(Scan.java:136) ~[delta-kernel-api-3.2.1.jar:3.2.1]
    at org.apache.asterix.external.input.record.reader.aws.delta.DeltaFileRecordReader.<init>(DeltaFileRecordReader.java:93) ~[asterix-external-data-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.asterix.external.input.record.reader.aws.delta.DeltaReaderFactory.createRecordReader(DeltaReaderFactory.java:201) ~[asterix-external-data-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.asterix.external.provider.DataflowControllerProvider.getDataflowController(DataflowControllerProvider.java:70) ~[asterix-external-data-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.asterix.external.adapter.factory.GenericAdapterFactory.createAdapter(GenericAdapterFactory.java:110) ~[asterix-external-data-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:79) ~[asterix-external-data-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.hyracks.api.dataflow.ProfiledFrameWriter.timeMethod(ProfiledFrameWriter.java:65) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.hyracks.api.dataflow.ProfiledOperatorNodePushable.initialize(ProfiledOperatorNodePushable.java:53) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
    at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:245) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
    ... 3 more
{code}
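The root cause at the bottom of the trace, java.sql.Timestamp.valueOf rejecting the string, can be reproduced in isolation. The following is a minimal sketch, not code from AsterixDB or Delta Kernel: the class name TimestampFormatDemo, the helper accepts, and both sample values are made up for illustration and are not taken from the failing table.

```java
import java.sql.Timestamp;

class TimestampFormatDemo {
    // Returns true if java.sql.Timestamp.valueOf can parse the string,
    // false if it rejects it with IllegalArgumentException.
    static boolean accepts(String value) {
        try {
            Timestamp.valueOf(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // JDBC escape format "yyyy-mm-dd hh:mm:ss[.fffffffff]": parsed.
        System.out.println(accepts("2025-04-09 23:24:40.732"));  // true
        // ISO 8601 with 'T' separator and 'Z' suffix (hypothetical
        // partition value): rejected, as in the stack trace above.
        System.out.println(accepts("2025-04-09T23:24:40.732Z")); // false
    }
}
```

Timestamp.valueOf only understands the space-separated JDBC escape format, so any partition value written with the ISO 8601 'T' separator fails before the scan returns a single row.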
This is a known bug in the Delta Kernel library:
[[BUG] Kernel cannot read partition column of type `ISO8601 formatted timestamp adjusted to UTC` · Issue #4268 · delta-io/delta|https://github.com/delta-io/delta/issues/4268]
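For illustration only, one way a reader could tolerate both encodings is to try the JDBC format first and fall back to ISO 8601. This is a sketch under assumptions, not the fix adopted upstream or in AsterixDB: PartitionTimestampParser and parse are hypothetical names, and the sample value in the usage note is invented.

```java
import java.sql.Timestamp;
import java.time.OffsetDateTime;

class PartitionTimestampParser {
    // Hypothetical helper: parses a partition value given either as
    // "yyyy-mm-dd hh:mm:ss[.fffffffff]" or as an ISO 8601 timestamp
    // with an offset, such as a trailing 'Z' for UTC.
    static Timestamp parse(String value) {
        try {
            // Caveat: valueOf interprets the string in the JVM's
            // default time zone.
            return Timestamp.valueOf(value);
        } catch (IllegalArgumentException e) {
            // Fallback: ISO 8601 with offset. Throws
            // DateTimeParseException if this format does not match either.
            return Timestamp.from(OffsetDateTime.parse(value).toInstant());
        }
    }
}
```

With this helper, PartitionTimestampParser.parse("2025-04-09T23:24:40.732Z") yields a Timestamp for that UTC instant instead of throwing IllegalArgumentException, while values in the space-separated format are parsed as before.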