AnuragKDwivedi opened a new pull request, #56732:
URL: https://github.com/apache/spark/pull/56732

   ## Description
   
   This is a follow-up to PR #56356, which improved validation consistency for 
namespace locations by treating whitespace-only values as invalid locations.
   
   ### What changes were proposed in this pull request?
   
   This PR extends the same validation behavior to direct file-path queries.
   
   Currently, direct file-path validation rejects only the empty string, using 
`isEmpty`. Whitespace-only paths such as `" "`, `"\t"`, and `"\n"` therefore 
pass analysis-time validation and may fail later with datasource-specific 
errors.
   
   This PR updates the validation to use `SparkStringUtils.isBlank(...)`, 
ensuring that whitespace-only paths are treated as invalid and consistently 
fail with the standard `INVALID_EMPTY_LOCATION` error.
   
   By doing so, the change aligns direct file-path validation with the existing 
namespace location validation logic and improves consistency across Spark SQL 
location handling.
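
   The difference between the two checks can be illustrated with a minimal, language-neutral sketch. The helpers `is_empty` and `is_blank` below are hypothetical stand-ins written for illustration only; they are not Spark's `isEmpty` or `SparkStringUtils.isBlank(...)` implementations, and null handling is deliberately left out.

```python
def is_empty(path: str) -> bool:
    """Empty check: true only for the zero-length string."""
    return len(path) == 0


def is_blank(path: str) -> bool:
    """Blank check: true for the empty string and for whitespace-only strings."""
    return path.strip() == ""


# The example path is made up for illustration.
samples = ["", " ", "\t", "\n", "/data/events.parquet"]

# Map each sample to (result of empty check, result of blank check).
results = {s: (is_empty(s), is_blank(s)) for s in samples}

for sample, (empty, blank) in results.items():
    print(f"{sample!r}: empty={empty}, blank={blank}")
```

   Only the blank check flags `" "`, `"\t"`, and `"\n"`; both checks agree on `""` and on a regular path.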
   
   ### Why are the changes needed?
   
   Currently, direct file-path validation treats empty and whitespace-only 
values differently:
   
   * Empty paths (`""`) are rejected during analysis with 
`INVALID_EMPTY_LOCATION`.
   * Whitespace-only paths (`" "`, `"\t"`, `"\n"`) may bypass analysis-time 
validation and fail later with datasource-specific errors.
   
   Using `SparkStringUtils.isBlank(...)` ensures consistent handling of all 
blank path values across Spark SQL.
   
   ### Does this PR introduce any user-facing change?
   
   Yes.
   
   Whitespace-only direct file paths are now rejected during analysis with 
`INVALID_EMPTY_LOCATION`, providing behavior consistent with namespace location 
validation.
   
   ### How was this patch tested?
   
   Added regression test coverage for blank path values, including:
   
   * `""`
   * `" "`
   * `"\t"`
   * `"\n"`
   
   and verified that they consistently fail with `INVALID_EMPTY_LOCATION`.
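
   The shape of such a table-driven regression check might look like the sketch below. It is an assumption-laden illustration, not the test added in this PR: `AnalysisError`, `validate_direct_path`, and `failing_condition` are hypothetical names, and the error message text is made up. Only the error condition name `INVALID_EMPTY_LOCATION` and the four blank values come from the description above.

```python
class AnalysisError(Exception):
    """Hypothetical stand-in for an analysis-time error with an error condition."""

    def __init__(self, condition: str, message: str):
        super().__init__(f"[{condition}] {message}")
        self.condition = condition


def validate_direct_path(path: str) -> str:
    """Reject empty or whitespace-only paths before any datasource is consulted."""
    if path.strip() == "":
        # Illustrative message; the real message text is not given in the source.
        raise AnalysisError("INVALID_EMPTY_LOCATION", f"blank location: {path!r}")
    return path


def failing_condition(path: str):
    """Return the error condition raised for `path`, or None if it validates."""
    try:
        validate_direct_path(path)
    except AnalysisError as err:
        return err.condition
    return None


# The four blank values listed in the PR description.
blank_paths = ["", " ", "\t", "\n"]
conditions = [failing_condition(p) for p in blank_paths]

for path, condition in zip(blank_paths, conditions):
    print(f"{path!r} -> {condition}")
```

   Keeping the inputs in one list makes it easy to add further whitespace variants later without duplicating the assertion logic.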
   
   Jira: https://issues.apache.org/jira/browse/SPARK-57295


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

