sanjdow opened a new issue, #18606: URL: https://github.com/apache/druid/issues/18606
Running Druid on an OpenShift cluster and using the Druid Delta Lake extension (https://github.com/apache/druid/tree/master/extensions-contrib/druid-deltalake-extensions) to connect to and load Delta tables. Facing the following issues:

- Error while loading with the Delta connector: only 1024 records of each constituent Parquet file (each partition of the Delta table) are loaded.
- There is also an error on the UI as soon as the load is over: `ERROR: Request failed with status code 404`. This may be unrelated to the issue with the Parquet data load.

### Affected Version

33.0.0

### Description

- Running Druid on an OpenShift cluster with the Datainfra Druid operator, version 0.3.8.
- Trying to load a Delta table from Parquet using an MSQ load with the query context below:

```json
{
  "finalizeAggregations": false,
  "groupByEnableMultiValueUnnesting": false,
  "arrayIngestMode": "array",
  "maxNumTasks": 11,
  "externalDataSampleRows": 0,
  "taskStatusCheckPeriodMs": 5000,
  "sqlInsertTaskNumSlots": "max"
}
```

- Query used for the load:

```sql
REPLACE INTO "table" OVERWRITE ALL
WITH "ext" AS (
  SELECT *
  FROM TABLE(
    EXTERN(
      '{"type":"delta","tablePath":"path"}',
      '{"type":"parquet"}'
    )
  ) EXTEND ("col1" VARCHAR, "col2" VARCHAR, "col3" VARCHAR, "col4" BIGINT, "col5" VARCHAR, "col6" VARCHAR, "col7" BIGINT, "col8" VARCHAR, "col9" VARCHAR)
)
SELECT
  MILLIS_TO_TIMESTAMP("dop" * 1000) AS "__time",
  "col1",
  "col2",
  "col3",
  "col4",
  "col5",
  "col6",
  "col7",
  "col8"
FROM "ext"
PARTITIONED BY DAY
```
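For reference, one way to double-check the per-file counts on the source side is to read each constituent Parquet file's row count from its footer metadata. This is a minimal sketch, not part of the original report: it assumes local access to the Delta table's data files and that `pyarrow` is installed, and `"path"` is the same placeholder table path used in the `EXTERN` spec above.

```python
import glob

import pyarrow.parquet as pq

# "path" is the placeholder Delta table path from the EXTERN spec above.
# Delta tables store data as Parquet files, possibly nested under
# partition directories, hence the recursive glob.
for f in sorted(glob.glob("path/**/*.parquet", recursive=True)):
    # num_rows comes from the Parquet footer metadata, so no data pages
    # are read; compare these counts against what Druid ingested.
    print(f, pq.ParquetFile(f).metadata.num_rows)
```

If each source file reports well over 1024 rows while the ingested datasource gains exactly 1024 rows per file, that would isolate the truncation to the Delta input source read path rather than the files themselves.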
