Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10392#discussion_r48133110
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -2072,14 +2072,15 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
         // Otherwise, the driver may attempt to reconstruct the checkpointed RDD from
         // its own local file system, which is incorrect because the checkpoint files
         // are actually on the executor machines.
    -    if (!isLocal && Utils.nonLocalPaths(directory).isEmpty) {
    +    val path = new Path(directory, UUID.randomUUID().toString)
    +    val fs = path.getFileSystem(hadoopConfiguration)
    +    val isDirLocal = fs.isInstanceOf[LocalFileSystem]
    +    if (!isLocal && Utils.nonLocalPaths(directory).isEmpty && !isDirLocal) {
    --- End diff --
    
    Is the "host:port" part the issue? If so, write the path as `hdfs:///path/xxx`.
    
    My suggestion is to change the warning message to something like `s"If Spark is not running in local mode, then the checkpoint directory must not be on the local filesystem. Directory '$directory' appears to be on the local filesystem."` That would still produce a warning in your case, but I believe the warning is avoidable (?) with the right `hdfs` URI, so it remains useful.
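
A minimal sketch of the two points above, under stated assumptions: it uses only `java.net.URI` to show that `hdfs:///path/xxx` carries a scheme but no host:port, and it pairs a rough scheme-based locality guess with the proposed warning text. The names `CheckpointDirSketch`, `looksLocal`, and `checkpointWarning`, and the path `/tmp/checkpoints`, are hypothetical and not part of Spark. This does not reproduce `Utils.nonLocalPaths` or any Hadoop `FileSystem` lookup.

```scala
import java.net.URI

// Illustrative sketch only; the names below are hypothetical and not part of Spark.
object CheckpointDirSketch {

  // Rough scheme-based guess: no scheme, or the "file" scheme, is treated as local.
  // Assumption: scheme-less paths may resolve against Hadoop's configured default
  // filesystem, which this sketch deliberately ignores.
  def looksLocal(directory: String): Boolean = {
    val scheme = new URI(directory).getScheme
    scheme == null || scheme == "file"
  }

  // Returns the suggested warning text when Spark is not in local mode and the
  // directory looks local; otherwise returns None.
  def checkpointWarning(isLocal: Boolean, directory: String): Option[String] =
    if (!isLocal && looksLocal(directory)) {
      Some("If Spark is not running in local mode, then the checkpoint directory " +
        "must not be on the local filesystem. " +
        s"Directory '$directory' appears to be on the local filesystem.")
    } else {
      None
    }

  def main(args: Array[String]): Unit = {
    // Three slashes: the scheme is present but the authority (host:port) is left out.
    val uri = new URI("hdfs:///path/xxx")
    println(s"scheme=${uri.getScheme}, authority=${uri.getAuthority}, path=${uri.getPath}")

    // An explicit hdfs URI yields no warning; a scheme-less path does.
    println(checkpointWarning(isLocal = false, directory = "hdfs:///path/xxx"))
    println(checkpointWarning(isLocal = false, directory = "/tmp/checkpoints"))
  }
}
```

With an explicit `hdfs` scheme the sketch returns no warning, while a bare path still triggers it, which is the behavior the reworded message is meant to explain to the user.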

