Re: File not found exceptions on S3 while running spark jobs
https://examples.javacodegeeks.com/java-io-filenotfoundexception-how-to-solve-file-not-found-exception/

Are you a programmer?

Regards,
Hulio

> Sent: Friday, July 17, 2020 at 2:41 AM
> From: "Nagendra Darla"
> To: user@spark.apache.org
> Subject: File not found exceptions on S3 while running spark jobs
> [...]
File not found exceptions on S3 while running spark jobs
Hello All,

I am converting an existing Parquet table (size: 50 GB) into Delta format. The conversion took around 1 hr 45 min, and I see that there are a lot of FileNotFoundExceptions in the logs:

Caused by: java.io.FileNotFoundException: No such file or directory:
s3a://old-data/delta-data/PL1/output/denorm_table/part-00031-183e54ef-50bc-46fc-83a3-7836baa28f86-c000.snappy.parquet

*How do I fix these errors?* I am using the options below in my spark-submit command:

spark-submit \
  --packages io.delta:delta-core_2.11:0.6.0,org.apache.hadoop:hadoop-aws:2.8.5 \
  --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore \
  --class Pipeline1 Pipeline.jar

Thank You,
Nagendra Darla
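One thing worth checking (an assumption, not confirmed by the logs in this thread): after a CONVERT TO DELTA, any reader that still lists the path as a plain Parquet directory can hit part files the Delta log has already replaced, and on S3 in 2020 the listing itself was only eventually consistent. A hedged sketch, keeping the thread's own packages and log store and only adding comments on how the converted table should be read back:

```
# Sketch only, not a confirmed fix. After conversion, make sure every reader
# goes through the Delta transaction log rather than a stale Parquet listing,
# e.g. in the job itself:
#   spark.read.format("delta").load("s3a://old-data/delta-data/PL1/output/denorm_table")
# and refresh any table that was cached under the old Parquet path:
#   spark.sql("REFRESH TABLE denorm_table")
# Also note S3SingleDriverLogStore only guarantees correctness for a single
# concurrent writer; a second job writing the same path can leave readers
# pointing at deleted files.
spark-submit \
  --packages io.delta:delta-core_2.11:0.6.0,org.apache.hadoop:hadoop-aws:2.8.5 \
  --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore \
  --class Pipeline1 Pipeline.jar
```

The table name in the REFRESH example is inferred from the path and is illustrative.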
Re: “Pyspark.zip does not exist” using Spark in cluster mode with Yarn
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-10795
https://stackoverflow.com/questions/34632617/spark-python-submission-error-file-does-not-exist-pyspark-zip

> Sent: Thursday, July 16, 2020 at 6:54 PM
> From: "Davide Curcio"
> To: "user@spark.apache.org"
> Subject: “Pyspark.zip does not exist” using Spark in cluster mode with Yarn
> [...]
Re: Using spark.jars conf to override jars present in spark default classpath
That's what I'm saying you don't want to do :) If you have two versions of a library with different APIs, the safest approach is shading, and ordering probably can't be relied on. In my experience reflection will behave in ways you may not like, as will which classpath has priority when a class is loading. spark.jars will never be able to reorder, so you'll need to get those jars on the system class loader using the driver (and executor) extra classpath args (with userClassPathFirst). I will stress again that it would be my last choice for getting it working, and I would try shading first if I really have a conflict.

On Thu, Jul 16, 2020 at 2:17 PM Nupur Shukla wrote:
> Thank you Russell and Jeff,
>
> My bad, I wasn't clear before about the conflicting jars. By that, I meant
> my application needs to use updated versions of certain jars compared to
> what is present in the default classpath. [...]
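The shading route recommended above can be sketched with the Maven Shade plugin's relocation feature. This is a hedged illustration, not taken from the thread: the relocated package (jackson) and the shaded prefix are stand-ins for whatever library actually conflicts.

```
<!-- Sketch: relocate a conflicting library inside the application's fat jar so
     the copy on Spark's default classpath never collides with it. The pattern
     and shadedPattern below are illustrative assumptions. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.fasterxml.jackson</pattern>
            <shadedPattern>com.example.shaded.jackson</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After shading, the application's bytecode references com.example.shaded.jackson, so whichever jackson ships with Spark no longer matters and no classpath ordering is needed.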
Re: Using spark.jars conf to override jars present in spark default classpath
Thank you Russell and Jeff,

My bad, I wasn't clear before about the conflicting jars. By that, I meant my application needs to use updated versions of certain jars compared to what is present in the default classpath. What would be the best way to use the confs spark.jars and spark.driver.extraClassPath together to do a classpath reordering so that the updated versions get picked up first? It looks like extraClassPath is the one conf to use here.

On Thu, 16 Jul 2020 at 12:05, Jeff Evans wrote:
> If you can't avoid it, you need to make use of the
> spark.driver.userClassPathFirst and/or spark.executor.userClassPathFirst
> properties.
> [...]
Re: Using spark.jars conf to override jars present in spark default classpath
If you can't avoid it, you need to make use of the spark.driver.userClassPathFirst and/or spark.executor.userClassPathFirst properties.

On Thu, Jul 16, 2020 at 2:03 PM Russell Spitzer wrote:
> I believe the main issue here is that spark.jars is a bit "too late" to
> actually prepend things to the class path. [...]
Re: Using spark.jars conf to override jars present in spark default classpath
I believe the main issue here is that spark.jars is a bit "too late" to actually prepend things to the class path. For most use cases this value is not read until after the JVM has already started and the system classloader has already loaded.

The jar argument gets added via the dynamic class loader, so it necessarily has to come afterwards :/ Driver extra classpath and its friends modify the actual launch command of the driver (or executors), so they can prepend whatever they want.

In general you do not want to have conflicting jars at all if possible, and I would recommend looking into shading if it's really important for your application to use a specific incompatible version of a library. spark.jars (and extraClassPath) are really just for adding additional jars, and I personally would try not to rely on classpath ordering to get the right libraries recognized.

On Thu, Jul 16, 2020 at 1:55 PM Nupur Shukla wrote:
> Hello,
>
> How can we use *spark.jars* to specify conflicting jars (that is, jars
> that are already present in Spark's default classpath)? [...]
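The approach described in the replies above can be sketched as a spark-submit invocation; the jar path and class name below are illustrative, not from the thread. extraClassPath entries end up on the JVM launch command, so they reach the system classloader, and userClassPathFirst makes user-supplied classes win where they conflict with Spark's bundled copies:

```
# Sketch only: paths and names are hypothetical.
spark-submit \
  --conf spark.driver.extraClassPath=/home/user/JarsConf/sample-project-3.0.0.jar \
  --conf spark.executor.extraClassPath=/home/user/JarsConf/sample-project-3.0.0.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class SampleProject application.jar
```

Note that spark.jars cannot express any ordering at all, so a precedence chain involving it (spark.jars > extraClassPath > default) is not achievable; only the extraClassPath/userClassPathFirst pair can push jars ahead of the defaults.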
Using spark.jars conf to override jars present in spark default classpath
Hello,

How can we use *spark.jars* to specify conflicting jars (that is, jars that are already present in Spark's default classpath)? Jars specified in this conf get "appended" to the classpath, and thus get looked at after the default classpath. Is it not intended to be used to specify conflicting jars?

Meanwhile, when the *spark.driver.extraClassPath* conf is specified, this path is "prepended" to the classpath and thus takes precedence over the default classpath.

How can I use both to specify different jars and paths but achieve a precedence of spark.jars path > spark.driver.extraClassPath > spark default classpath (left-to-right precedence order)?

Experiment conducted:

I am using sample-project.jar, which has one class in it, SampleProject. This has a method which prints the version number of the jar. For this experiment I am using 3 versions of this sample-project.jar:
- sample-project-1.0.0.jar is present in the spark default classpath in my test cluster
- sample-project-2.0.0.jar is present in folder /home//ClassPathConf on the driver
- sample-project-3.0.0.jar is present in folder /home//JarsConf on the driver

(An empty cell in the image below means that conf was not specified.)

[image: image.png]

Thank you,
Nupur
“Pyspark.zip does not exist” using Spark in cluster mode with Yarn
I'm trying to run some Spark scripts in cluster mode using Yarn, but I always get this error. I read in other similar questions that the cause can be:

- "local" hard-coded as the master, but I don't have that
- a HADOOP_CONF_DIR environment variable that's wrong inside spark-env.sh, but it seems right

I've tried with every kind of code, even simple code, but it still doesn't work, even though in local mode it works. Here is my log when I try to execute the code:

spark/bin/spark-submit --deploy-mode cluster --master yarn ~/prova7.py
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/07/16 16:10:27 INFO Client: Requesting a new application from cluster with 2 NodeManagers
20/07/16 16:10:27 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (1536 MB per container)
20/07/16 16:10:27 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
20/07/16 16:10:27 INFO Client: Setting up container launch context for our AM
20/07/16 16:10:27 INFO Client: Setting up the launch environment for our AM container
20/07/16 16:10:27 INFO Client: Preparing resources for our AM container
20/07/16 16:10:27 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
20/07/16 16:10:31 INFO Client: Uploading resource file:/tmp/spark-750fb229-4166--9c69-eb90e9a2318d/__spark_libs__4588035472069967339.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/__spark_libs__4588035472069967339.zip
20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/prova7.py -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/prova7.py
20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/spark/python/lib/pyspark.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip
20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/spark/python/lib/py4j-0.10.7-src.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/py4j-0.10.7-src.zip
20/07/16 16:10:32 INFO Client: Uploading resource file:/tmp/spark-750fb229-4166--9c69-eb90e9a2318d/__spark_conf__1291791519024875749.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/__spark_conf__.zip
20/07/16 16:10:32 INFO SecurityManager: Changing view acls to: ubuntu
20/07/16 16:10:32 INFO SecurityManager: Changing modify acls to: ubuntu
20/07/16 16:10:32 INFO SecurityManager: Changing view acls groups to:
20/07/16 16:10:32 INFO SecurityManager: Changing modify acls groups to:
20/07/16 16:10:32 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set()
20/07/16 16:10:33 INFO Client: Submitting application application_1594914119543_0010 to ResourceManager
20/07/16 16:10:33 INFO YarnClientImpl: Submitted application application_1594914119543_0010
20/07/16 16:10:34 INFO Client: Application report for application_1594914119543_0010 (state: FAILED)
20/07/16 16:10:34 INFO Client:
     client token: N/A
     diagnostics: Application application_1594914119543_0010 failed 2 times due to AM Container for appattempt_1594914119543_0010_02 exited with exitCode: -1000
Failing this attempt. Diagnostics: [2020-07-16 16:10:34.391]File file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
java.io.FileNotFoundException: File file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at
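A likely clue in the log above, offered as an assumption rather than a confirmed diagnosis: every "Uploading resource" line targets file:/home/ubuntu/.sparkStaging/..., i.e. the submitting machine's local filesystem. In cluster mode the AM container can launch on any NodeManager, which cannot see that local directory, hence "pyspark.zip does not exist". This usually means fs.defaultFS is still the local-filesystem default instead of a shared filesystem. A sketch of the fix (the namenode host and port are placeholders):

```
# Sketch only: namenode:9000 is a placeholder for your actual HDFS address.
# With fs.defaultFS pointing at HDFS, .sparkStaging lands on a filesystem
# every YARN container can read, and the upload lines should show hdfs:/ paths.
spark-submit \
  --deploy-mode cluster --master yarn \
  --conf spark.hadoop.fs.defaultFS=hdfs://namenode:9000 \
  ~/prova7.py
# Alternatively, set fs.defaultFS in $HADOOP_CONF_DIR/core-site.xml and make
# sure HADOOP_CONF_DIR is exported in the environment running spark-submit.
```

If the cluster genuinely has no shared filesystem, the staging files would have to exist at the same local path on every node, which is fragile; a shared fs.defaultFS is the usual setup.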