gabotechs opened a new issue, #1837:
URL: https://github.com/apache/datafusion-ballista/issues/1837

   **Environment:**
   - Ballista: 53.0.0
   - DataFusion: 53.1.0
   - Dataset: TPCDS SF1 (Parquet on S3)
   - Cluster: 1 coordinator (`c5n.2xlarge`) + 11 workers (`c5n.2xlarge`)
   - Scheduling: PushStaged
   
   **Reference:** DataFusion main passes all of the queries listed below.
   
   ---
   
   **Affects:** TPCDS SF1 Q6, Q47, Q57, Q70
   
   **Exact error messages:**
   
   Q6 (stage 13):
   ```
   Error: Query failed: 500 Arrow error: External error: Execution error: Job S7j0bce failed:
     Job failed due to stage 13 failed: Task failed due to runtime execution error:
     DataFusionError(NotImplemented("Unsupported data type in sort merge join comparator: Dictionary(UInt16, Utf8)"))
   ```
   
   Q47 (stage 20):
   ```
   Error: Query failed: 500 Arrow error: External error: Execution error: Job tH108Ae failed:
     Job failed due to stage 20 failed: Task failed due to runtime execution error:
     DataFusionError(NotImplemented("Unsupported data type in sort merge join comparator: Dictionary(UInt16, Utf8)"))
   ```
   
   Q57 (stage 20):
   ```
   Error: Query failed: 500 Arrow error: External error: Execution error: Job FwCaoXB failed:
     Job failed due to stage 20 failed: Task failed due to runtime execution error:
     DataFusionError(NotImplemented("Unsupported data type in sort merge join comparator: Dictionary(UInt16, Utf8)"))
   ```
   
   Q70 (stage 11):
   ```
   Error: Query failed: 500 Arrow error: External error: Execution error: Job hd52nrG failed:
     Job failed due to stage 11 failed: Task failed due to runtime execution error:
     DataFusionError(NotImplemented("Unsupported data type in sort merge join comparator: Dictionary(UInt16, Utf8)"))
   ```
   
   **Description:**
   
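   These queries involve sort-merge joins or sorts on string columns encoded as `Dictionary(UInt16, Utf8)` in the TPCDS Parquet files. In every case the failure is raised by the sort-merge join comparator, which reports that key type as unsupported.
   
   DataFusion main passes all four queries on the same SF1 dataset.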
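
For readers unfamiliar with dictionary encoding, here is a minimal conceptual sketch in plain Python of why a join comparator needs explicit support for `Dictionary(UInt16, Utf8)` keys. It is not Ballista, DataFusion, or Arrow code; `DictionaryColumn` and `compare_rows` are hypothetical names used only for illustration. The point is that each side of a join carries its own dictionary, so the integer keys are not comparable across sides and the comparator has to look at the decoded strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DictionaryColumn:
    """Toy stand-in for a dictionary-encoded string column (hypothetical type)."""
    keys: List[int]    # one small integer per row, indexing into `values`
    values: List[str]  # the distinct strings; their order is arbitrary per column

    def decode(self) -> List[str]:
        # Expand the column back into plain strings, one per row.
        return [self.values[k] for k in self.keys]


def compare_rows(left: DictionaryColumn, i: int,
                 right: DictionaryColumn, j: int) -> int:
    """Three-way comparison of row i of `left` with row j of `right`.

    Compares the decoded strings, not the keys, because the two columns
    may assign different keys to the same string.
    Returns -1, 0, or 1.
    """
    a = left.values[left.keys[i]]
    b = right.values[right.keys[j]]
    return (a > b) - (a < b)


# The two sides use different dictionaries for the same set of strings.
left = DictionaryColumn(keys=[0, 1, 0], values=["TN", "CA"])
right = DictionaryColumn(keys=[0, 1], values=["CA", "TN"])

# Row 0 on the left is "TN" (key 0); row 1 on the right is "TN" (key 1).
same_value_different_key = compare_rows(left, 0, right, 1)
```

Comparing raw keys here would treat left row 0 and right row 0 as equal even though one is "TN" and the other is "CA", which is why a comparator that only knows plain `Utf8` cannot simply be pointed at the key buffer.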
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

