[Spark 3.0.0] Job fails with NPE - worked in Spark 2.4.4

2020-07-23 Thread Neelesh Salian
Hi folks,
Been trying to debug this issue: https://gist.github.com/nssalian/203e20432c2ed237717be28642b1871a
*Context:*
*The application (Pyspark):*
1. Read a Hive table from the Metastore (Running Hive 1.2.2)
2. Print schema of the Dataframe read.
3. Do a show() on the df captured.
The above err
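The three steps listed in this message can be sketched roughly as follows. This is only an illustration, not the code from the linked gist: the helper names `inspect_hive_table` and `build_session`, the app name, and the table name `default.some_table` are all hypothetical, and the sketch assumes a SparkSession created with Hive support.

```python
def inspect_hive_table(spark, table_name):
    """Read a metastore table, print its schema, and show some rows."""
    df = spark.table(table_name)  # step 1: read the Hive table via the metastore
    df.printSchema()              # step 2: print the schema of the DataFrame
    df.show()                     # step 3: the action that actually runs a job
    return df


def build_session():
    """Create a Hive-enabled session (assumed setup, not taken from the thread)."""
    # Imported lazily so inspect_hive_table can be exercised without a Spark install.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("hive-read-check")  # hypothetical app name
        .enableHiveSupport()
        .getOrCreate()
    )
```

With a working installation this would be driven as `inspect_hive_table(build_session(), "default.some_table")`. Since `printSchema()` only reads metadata while `show()` triggers execution, keeping the steps separate helps narrow down which stage a failure comes from.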

Re: Spark books

2017-05-03 Thread Neelesh Salian
The Apache Spark documentation is good to begin with. All the programming guides, particularly. On Wed, May 3, 2017 at 5:07 PM, ayan guha wrote: > I would suggest do not buy any book, just start with databricks community > edition > > On Thu, May 4, 2017 at 9:30 AM, Tobi Bosede wrote: > >> Wel

Re: Steps to Run Spark Scala job from Oozie on EC2 Hadoop cluster

2016-03-07 Thread Neelesh Salian
Hi Divya, This link should have the details that you need to begin using the Spark Action on Oozie: https://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html Thanks. On Mon, Mar 7, 2016 at 7:52 AM, Benjamin Kim wrote: > To comment… > > At my company, we have not gotten it to work in any

Re: Spark Performance on Yarn

2015-04-22 Thread Neelesh Salian
Does it still hit the memory limit for the container? An expensive transformation? On Wed, Apr 22, 2015 at 8:45 AM, Ted Yu wrote: > In master branch, overhead is now 10%. > That would be 500 MB > > FYI > > > > > On Apr 22, 2015, at 8:26 AM, nsalian wrote: > > > > +1 to executor-memory to 5g. >
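The overhead figure quoted in this exchange can be worked through with a small calculation. This is a rough sketch under stated assumptions: "5g" is taken as 5120 MB, the overhead is taken as the larger of 10% of executor memory and a 384 MB floor (the floor is not mentioned in the thread), and the helper names are hypothetical.

```python
def yarn_overhead_mb(executor_memory_mb, overhead_percent=10, floor_mb=384):
    """Overhead added on top of executor memory, in MB.

    Assumes overhead = max(percent of executor memory, fixed floor).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return max(executor_memory_mb * overhead_percent // 100, floor_mb)


def container_request_mb(executor_memory_mb, overhead_percent=10, floor_mb=384):
    """Total memory the container would need: executor memory plus overhead."""
    return executor_memory_mb + yarn_overhead_mb(
        executor_memory_mb, overhead_percent, floor_mb
    )


executor_memory_mb = 5 * 1024  # "--executor-memory 5g", assumed to mean 5120 MB
print(yarn_overhead_mb(executor_memory_mb))      # 512
print(container_request_mb(executor_memory_mb))  # 5632
```

Under these assumptions the overhead comes to 512 MB, in line with the "500 MB" estimate, so the container limit to compare against is about 5.5 GB rather than 5 GB.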