GitHub user awarrior commented on the issue:
https://github.com/apache/spark/pull/19118
@jiangxb1987 well, I got past that part above, but ran into other
initialization points before runJob. They are in the write function of
SparkHadoopWriter:
    // Assert the output format/key/value class is set in JobConf.
    config.assertConf(jobContext, rdd.conf)          // <= initialization point

    val committer = config.createCommitter(stageId)
    committer.setupJob(jobContext)                   // <= initialization point

    // Try to write all RDD partitions as a Hadoop OutputFormat.
    try {
      val ret = sparkContext.runJob(rdd, (context: TaskContext, iter: Iterator[(K, V)]) => {
        executeTask(
          context = context,
          config = config,
          jobTrackerId = jobTrackerId,
          sparkStageId = context.stageId,
          sparkPartitionId = context.partitionId,
          sparkAttemptNumber = context.attemptNumber,
          committer = committer,
          iterator = iter)
      })
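Both marked calls can end up resolving a Hadoop FileSystem on the driver
before any task runs. As a minimal sketch of that resolution path (the path
string and Configuration below are placeholders, not values from this PR):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // The first lookup for a scheme misses FileSystem$Cache, so
    // createFileSystem -> initialize runs on the calling (driver) thread,
    // which is exactly the path the trace below goes through.
    val conf = new Configuration()
    val fs: FileSystem = new Path("file:///tmp/out").getFileSystem(conf)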
One stack trace from this phase:
    java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.fs.FileSystem.getStatistics(FileSystem.java:3270)
        - locked <0x126a> (a java.lang.Class)
        at org.apache.hadoop.fs.FileSystem.initialize(FileSystem.java:202)
        at org.apache.hadoop.fs.RawLocalFileSystem.initialize(RawLocalFileSystem.java:92)
        at org.apache.hadoop.fs.LocalFileSystem.initialize(LocalFileSystem.java:47)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2598)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:91)
        at org.apache.hadoop.mapred.FileOutputCommitter.getWrapped(FileOutputCommitter.java:65)
        at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
        at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:233)
        at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:125)
        at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:74)
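Reading the trace bottom-up: setupJob constructs a FileOutputCommitter, whose
constructor calls Path.getFileSystem, and the cache miss lands in
FileSystem.getStatistics under a class-level lock. If the goal were simply to
move that initialization off the write path, a driver-side warm-up could look
like the sketch below; warmUpFileSystem and its parameters are illustrative
only, not part of this PR:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    // Hypothetical helper: resolve the output FileSystem before calling
    // SparkHadoopWriter.write, so the lock in FileSystem.getStatistics is
    // taken outside committer.setupJob.
    def warmUpFileSystem(output: String, hadoopConf: Configuration): Unit = {
      new Path(output).getFileSystem(hadoopConf) // populates FileSystem$Cache
    }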