How to see user defined variables in spark shell

2022-12-12 Thread Salil Surendran
I often define vals and vars in the spark shell, and after some time I forget
which ones I defined. I want to see only the ones I defined, not the ones
that come from Spark. I have tried the following commands, but all of them
also show the variables and defs defined by Spark:

$intp.definedTerms
  .map(t => s"${t.toTermName}: ${$intp.typeOfTerm(t.toTermName.toString)}")
  .foreach(println)

$intp.definedSymbolList
$intp.namedDefinedTerms
$intp.definedTerms.filter(x => !x.startsWith("res")).foreach(println)
$intp.allDefinedNames.filter(!_.startsWith("$"))

Any other command that works?
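
One idea, as an untested sketch (it assumes $intp behaves here the same way
it does in the attempts above): take a snapshot of the names Spark defines
at startup, then diff against the current set later.

// Run once, right after the shell starts and before defining anything:
val sparkDefined = $intp.definedTerms.map(_.toString).toSet

// Later, list only the terms added since the snapshot, skipping the
// auto-generated resN values. (sparkDefined itself will show up, since
// it was defined after the snapshot was taken.)
$intp.definedTerms
  .map(_.toString)
  .filterNot(sparkDefined)
  .filterNot(_.startsWith("res"))
  .foreach(println)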
-- 
Thanks,
Salil
"The surest sign that intelligent life exists elsewhere in the universe is
that none of it has tried to contact us."


Spark-on-Yarn ClassNotFound Exception

2022-12-12 Thread Hariharan
Hello folks,

I have a spark app with a custom S3 client factory, configured via
*fs.s3a.s3.client.factory.impl*, which is packaged into the same jar as the
app. Output of *jar tf*:

*2620 Mon Dec 12 11:23:00 IST 2022 aws/utils/MyS3ClientFactory.class*
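
For reference, the factory would be a class along these lines (a hypothetical
sketch only; the real implementation differs, and DefaultS3ClientFactory is
just one plausible base class from hadoop-aws):

package aws.utils

import org.apache.hadoop.fs.s3a.DefaultS3ClientFactory

// Hypothetical sketch: whatever the real MyS3ClientFactory does, it must
// satisfy Hadoop's S3ClientFactory contract and be loadable on both the
// driver and the executors for fs.s3a.s3.client.factory.impl to resolve.
class MyS3ClientFactory extends DefaultS3ClientFactory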

However, when I run my spark app with spark-submit in cluster mode, it
fails with the following error:

*java.lang.RuntimeException: java.lang.RuntimeException:
java.lang.ClassNotFoundException: Class aws.utils.MyS3ClientFactory not
found*

I tried:
1. Passing the jar to the *--jars* option (with a local path)
2. Passing the jar to the *spark.yarn.jars* option (with an HDFS path)

but I still get the same error.

Any suggestions on what I'm missing?

Other pertinent details:
Spark version: 3.3.0
Hadoop version: 3.3.4

Command used to run the app:

*/spark/bin/spark-submit --class MyMainClass --deploy-mode cluster \
  --master yarn --conf spark.executor.instances=6 /path/to/my/jar*
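
One more variant worth trying, as an unverified sketch (my.jar below is a
placeholder for the jar's file name): pass the jar via *--jars* and also put
its bare name on the driver and executor classpaths. YARN localizes *--jars*
files into each container's working directory, and the S3A filesystem can be
initialized before the application jar's classloader is in place, so having
the class on the container classpath directly may help:

*/spark/bin/spark-submit --class MyMainClass --deploy-mode cluster \
  --master yarn --conf spark.executor.instances=6 \
  --jars /path/to/my/jar \
  --conf spark.driver.extraClassPath=my.jar \
  --conf spark.executor.extraClassPath=my.jar \
  /path/to/my/jar*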

TIA!