GitHub user vardaram-zetta created a discussion: Gluten + Velox fallback when 
reading Delta Lake (Spark 3.5.3 + Delta 3.3.2)

I am using **Spark 3.5.3**, **Delta Lake 3.3.2**, and the **latest Gluten + 
Velox** build.
My TPC-H 2 GB tables were written as **Delta** on S3A using regular Spark (without Gluten).
When I query these Delta tables with Gluten enabled, execution falls back to vanilla Spark.

Example:

```sql
select sum(p_partkey) from part;
```

### **Fallback Messages**

```
25/12/04 17:38:43 WARN GlutenFallbackReporter: Validation failed for plan: Scan json [QueryId=1], due to:
 - Unsupported file format UnknownFormat.
25/12/04 17:38:48 WARN SparkStringUtils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
25/12/04 17:38:48 WARN GlutenFallbackReporter: Validation failed for plan: Scan json [QueryId=3], due to:
 - Unsupported file format UnknownFormat.
25/12/04 17:38:48 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=3], due to:
 - Validation failed with exception from: ProjectExecTransformer, reason: UDF name is not found!
```
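To make these messages easier to compare across queries, here is a minimal sketch that groups the fallback reasons by plan node and `QueryId`. It is only an illustration: the sample text is the log above with any wrapped lines joined back into single lines, and the regular expression and the helper name `summarize_fallbacks` are my own assumptions about the log layout, not part of Gluten.

```python
import re
from collections import defaultdict

# Sample copied from the fallback messages above (wrapped lines rejoined).
LOG = """\
25/12/04 17:38:43 WARN GlutenFallbackReporter: Validation failed for plan: Scan json [QueryId=1], due to:
 - Unsupported file format UnknownFormat.
25/12/04 17:38:48 WARN SparkStringUtils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
25/12/04 17:38:48 WARN GlutenFallbackReporter: Validation failed for plan: Scan json [QueryId=3], due to:
 - Unsupported file format UnknownFormat.
25/12/04 17:38:48 WARN GlutenFallbackReporter: Validation failed for plan: Project[QueryId=3], due to:
 - Validation failed with exception from: ProjectExecTransformer, reason: UDF name is not found!
"""

# Assumed layout of a GlutenFallbackReporter header line, based only on the sample above.
HEADER = re.compile(
    r"GlutenFallbackReporter: Validation failed for plan: "
    r"(?P<plan>.+?)\s*\[QueryId=(?P<qid>\d+)\], due to:"
)


def summarize_fallbacks(log_text):
    """Return {(plan node, query id): [reason, ...]} for each reported fallback."""
    reasons = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        match = HEADER.search(line)
        if match:
            current = (match.group("plan"), int(match.group("qid")))
            reasons[current]  # register the key even if no reason line follows
        elif current is not None and line.lstrip().startswith("- "):
            # Reason lines are the " - ..." bullets directly under a header.
            reasons[current].append(line.lstrip()[2:])
        else:
            # Any other log line (for example the SparkStringUtils warning) ends the group.
            current = None
    return dict(reasons)


summary = summarize_fallbacks(LOG)
for (plan, qid), why in summary.items():
    print(f"QueryId={qid} {plan}: {'; '.join(why)}")
```

Grouped this way, the sample shows two `Scan json` nodes rejected for the file format and one `Project` node rejected by `ProjectExecTransformer`.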

### **Question**

**Does Gluten + Velox fully support Delta Lake 3.x on Spark 3.5.x?**
Or is fallback expected with this combination?

If there is a specific Delta version recommended for Spark 3.5 + Gluten + 
Velox, please let me know.

### **Environment**

* Spark 3.5.3
* Delta Lake 3.3.2
* Latest Gluten (built from main)
* Velox backend
* Data stored on S3A (Delta format)
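
For anyone trying to reproduce this, below is a rough sketch of the kind of session settings such an environment typically needs. The exact settings used in this setup are not listed in this post, so every key and value in `ASSUMED_CONF` is an assumption drawn from the usual Gluten and Delta Lake setup instructions, and the helper `to_submit_args` is hypothetical.

```python
# Hypothetical settings for a Spark 3.5 session with Gluten (Velox) and Delta Lake.
# None of these values come from the post; adjust them to the actual deployment.
ASSUMED_CONF = {
    # Load the Gluten plugin.
    "spark.plugins": "org.apache.gluten.GlutenPlugin",
    # Gluten's native engine allocates from Spark's off-heap memory pool.
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "4g",  # assumed size, not from the post
    # Columnar shuffle manager shipped with Gluten.
    "spark.shuffle.manager": "org.apache.spark.shuffle.sort.ColumnarShuffleManager",
    # Delta Lake SQL extension and catalog.
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
}


def to_submit_args(conf):
    """Render a config mapping as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(ASSUMED_CONF)
print(" ".join(submit_args))
```

The S3A credentials and the Gluten and Delta jars are deliberately left out, since they depend on the cluster.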


GitHub link: https://github.com/apache/incubator-gluten/discussions/11254

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

