markj-db opened a new pull request, #55721:
URL: https://github.com/apache/spark/pull/55721

   ### What changes were proposed in this pull request?
   
   `PartitioningAwareFileIndex.listFiles` rejects the combination of 
`recursiveFileLookup=true` and a non-empty `partitionSpec().partitionColumns` 
by throwing a raw `java.lang.IllegalArgumentException` with the message 
"Datasource with partition do not allow recursive file loading."
   
   This PR replaces that with a tagged `AnalysisException` that uses a new 
error condition:
   
   - New error condition 
`RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE` 
(`sqlState`: `0A000`) in `error-conditions.json`.
   - New helper 
`QueryCompilationErrors.recursiveFileLookupNotSupportedForPartitionedDataSourceError()`.
   - Throw site in `PartitioningAwareFileIndex.scala` updated to use the helper.
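
The before/after behavior of the throw site can be modeled without Spark on the classpath. The following is a minimal sketch, not the code in this PR: `TaggedAnalysisException`, `RecursiveLookupCheck`, `checkOld`, `checkNew`, and `describe` are hypothetical names, and the rendered message format is an assumption. Only the condition name, the SQLSTATE, and the message texts come from the description above.

```scala
// Dependency-free model of the change; these are NOT Spark's real classes.
// TaggedAnalysisException, RecursiveLookupCheck, checkOld, checkNew and
// describe are hypothetical names used only for illustration.
final case class TaggedAnalysisException(
    condition: String,
    sqlState: String,
    detail: String)
  extends Exception(s"[$condition] $detail SQLSTATE: $sqlState")

object RecursiveLookupCheck {
  val Condition: String =
    "RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE"
  val SqlState: String = "0A000"

  // The rejected combination: recursive lookup requested while the
  // partition spec has at least one partition column.
  private def rejected(
      recursiveFileLookup: Boolean,
      partitionColumns: Seq[String]): Boolean =
    recursiveFileLookup && partitionColumns.nonEmpty

  // Old behavior: an untagged exception carrying the legacy message.
  def checkOld(recursiveFileLookup: Boolean, partitionColumns: Seq[String]): Unit =
    if (rejected(recursiveFileLookup, partitionColumns)) {
      throw new IllegalArgumentException(
        "Datasource with partition do not allow recursive file loading.")
    }

  // New behavior: an exception tagged with a condition name and SQLSTATE.
  def checkNew(recursiveFileLookup: Boolean, partitionColumns: Seq[String]): Unit =
    if (rejected(recursiveFileLookup, partitionColumns)) {
      throw TaggedAnalysisException(
        Condition,
        SqlState,
        "Recursive file loading is not supported when the data source has " +
          "explicit partition columns.")
    }

  // Callers can branch on the stable condition name instead of message text.
  def describe(recursiveFileLookup: Boolean, partitionColumns: Seq[String]): String =
    try {
      checkNew(recursiveFileLookup, partitionColumns)
      "ok"
    } catch {
      case e: TaggedAnalysisException => s"${e.condition} (${e.sqlState})"
    }
}
```

In this model, `RecursiveLookupCheck.describe(true, Seq("year"))` yields the condition name followed by `(0A000)`, while an empty partition column list or `recursiveFileLookup = false` yields `"ok"`. The point of the tag is that callers and tests can match on the condition rather than on the wording of the message.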
   
   ### Why are the changes needed?
   
   The raw `IllegalArgumentException` carries no error condition or SQLSTATE, so 
it is not classified as a user error, and its message does not say how to fix 
the read. Replacing it with an `AnalysisException` that uses a proper error 
class classifies the failure as a user error and gives an actionable message.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Users who hit this error will now see a clearer message:
   
   > Recursive file loading is not supported when the data source has explicit 
partition columns. Either remove the option "recursiveFileLookup", or read the 
data without supplying partition columns (for example, do not read a 
partitioned table or set partition-column options such as 
"cloudFiles.partitionColumns").
   
   Previously the error was a raw `IllegalArgumentException` with the message 
"Datasource with partition do not allow recursive file loading."
   
   ### How was this patch tested?
   
   Added `"recursiveFileLookup with a partitioned catalog table is rejected"` 
in `FileBasedDataSourceSuite`, which creates a partitioned Parquet catalog 
table, then asserts that reading it with `recursiveFileLookup=true` throws an 
`AnalysisException` with condition 
`RECURSIVE_FILE_LOOKUP_NOT_SUPPORTED_FOR_PARTITIONED_DATA_SOURCE`.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude (claude-sonnet-4-6)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

