[ https://issues.apache.org/jira/browse/SPARK-57695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yang Jie updated SPARK-57695:
-----------------------------
Description:
{{TestUtils.recursiveList}} (core, {{org.apache.spark.TestUtils}}) duplicates
the directory walk that SPARK-57530 just fixed in
{{SparkFileUtils.recursiveList}}, and carries the same problems: it calls
{{File.listFiles}} without a null check (so an unreadable directory throws a
NullPointerException), and its running time is not guaranteed to be linear in
the number of entries.
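The null return is documented behavior of {{java.io.File.listFiles}}: it yields null, not an empty array, when the path does not denote a directory or an I/O error occurs. The following is a minimal Java sketch of that failure mode and the guard against it. The class name, the {{childCount}} helper, and the temp path are hypothetical and are not Spark code.

```java
import java.io.File;

final class ListFilesNullSketch {
    // Hypothetical helper: counts direct children, treating a null listing as empty.
    static int childCount(File dir) {
        // listFiles() returns null if dir is not a directory or an I/O error occurs.
        File[] children = dir.listFiles();
        return children == null ? 0 : children.length;
    }

    public static void main(String[] args) {
        // Hypothetical path that does not exist, so listFiles() returns null.
        File missing = new File(System.getProperty("java.io.tmpdir"),
                "no-such-dir-" + System.nanoTime());
        System.out.println(missing.listFiles() == null);
        System.out.println(childCount(missing));
    }
}
```

Dereferencing the unguarded result (for example, iterating over it directly) is what produces the exception described above.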
Since {{core}} already depends on {{common/utils}} and {{LocalSparkCluster}}
already calls the fixed {{Utils.recursiveList}}, the cleanest fix is to delete
the duplicate and point its callers at {{Utils.recursiveList}}, so that they
pick up the null-safety and linear-time behavior. All current callers only
filter or count entries by name, so the difference in traversal order
(BFS vs. DFS) is not observable.
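For illustration only, a null-safe walk that visits each entry once might look like the Java sketch below. It is not the actual {{Utils.recursiveList}} implementation: the class name, the breadth-first order, and the choice to exclude the root from the result are assumptions made for this example.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class RecursiveListSketch {
    // Hypothetical sketch: lists every entry below root (root itself excluded).
    static List<File> recursiveList(File root) {
        List<File> result = new ArrayList<>();
        Deque<File> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            File dir = pending.poll();
            // Null means "not a directory" or "I/O error"; skip instead of throwing.
            File[] children = dir.listFiles();
            if (children == null) {
                continue;
            }
            for (File child : children) {
                result.add(child); // each entry is appended exactly once
                if (child.isDirectory()) {
                    pending.add(child); // each directory is listed exactly once
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        File start = new File(args.length > 0 ? args[0] : ".");
        System.out.println(recursiveList(start).size() + " entries");
    }
}
```

Appending to a single list and using an explicit work queue avoids building and concatenating an intermediate collection per directory level. The sketch does not guard against symbolic-link cycles.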
Follow-up to SPARK-57530.
> Make TestUtils.recursiveList null-safe by reusing Utils.recursiveList
> ---------------------------------------------------------------------
>
> Key: SPARK-57695
> URL: https://issues.apache.org/jira/browse/SPARK-57695
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.3.0
> Reporter: Yang Jie
> Priority: Major