prashantwason commented on code in PR #18669:
URL: https://github.com/apache/hudi/pull/18669#discussion_r3198847170
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala:
##########
@@ -134,7 +134,19 @@ class DefaultSource extends RelationProvider
         parameters
       }
-      val relation = DefaultSource.createRelation(sqlContext, metaClient, schema, options.toMap)
+      // Spark's DataSource.resolveRelation() invokes this 3-arg overload directly via the
+      // SchemaRelationProvider path when a user-supplied schema is present (e.g.
+      // spark.read.schema(...).load(path)). The 2-arg overload catches
+      // HoodieSchemaNotFoundException and returns an EmptyRelation, but that catch is bypassed
+      // on this path, so we mirror the same handling here. Preserve the caller-supplied schema
+      // so subsequent query analysis (e.g. column resolution in WHERE clauses) sees the
+      // HMS-known columns even though the on-disk table is schemaless.
+      val relation = try {
+        DefaultSource.createRelation(sqlContext, metaClient, schema, options.toMap)
+      } catch {
+        case _: HoodieSchemaNotFoundException =>
+          new EmptyRelation(sqlContext, Option(schema).getOrElse(new StructType()))
Review Comment:
Done in bd4e4c24a2f7 - passed schema through directly. Agreed the Option
wrapper was overly defensive given the SchemaRelationProvider contract.
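
For context, a plausible sketch of what "passed schema through directly" means here, assuming the overload shown in the diff above (not verified against commit bd4e4c24a2f7):

```scala
// Hypothetical sketch: under the SchemaRelationProvider contract, Spark only
// calls this 3-arg createRelation overload when the user supplied a schema,
// so `schema` is non-null on this path and the Option(...).getOrElse wrapper
// can be dropped.
val relation = try {
  DefaultSource.createRelation(sqlContext, metaClient, schema, options.toMap)
} catch {
  case _: HoodieSchemaNotFoundException =>
    // Preserve the caller-supplied schema so downstream analysis
    // (e.g. WHERE-clause column resolution) still resolves columns.
    new EmptyRelation(sqlContext, schema)
}
```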
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]