Best Regards,
Vamshi T
Nirav,
Spark does not create a duplicate join column when you pass the join
expression as a list of column names, as in the example below; note that this
requires the column name to be the same in both DataFrames.
Example: df1.join(df2, ['a'])
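For instance, a quick PySpark sketch (the DataFrames and column names below
are made up for illustration):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").getOrCreate()

    df1 = spark.createDataFrame([(1, "x")], ["a", "v1"])
    df2 = spark.createDataFrame([(1, "y")], ["a", "v2"])

    # Joining on a list of column names keeps a single 'a' column
    df1.join(df2, ["a"]).printSchema()   # columns: a, v1, v2

    # Joining on a column expression keeps 'a' from both sides
    df1.join(df2, df1["a"] == df2["a"]).printSchema()  # columns: a, v1, a, v2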
Thanks.
Vamshi Talla
On Jul 6, 2018, at 4:47 PM, Gokula Krishnan D
Hi Ravi,
RDDs are immutable, so you cannot change them; instead, you create new RDDs
by transforming existing ones. repartition is a transformation, so it is
lazily evaluated and computed only when you call an action on the resulting
RDD.
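A small sketch of what I mean (a local session with made-up numbers):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").getOrCreate()
    rdd = spark.sparkContext.parallelize(range(100), 4)

    repartitioned = rdd.repartition(8)       # transformation: no job runs yet
    print(repartitioned.getNumPartitions())  # 8 (metadata only, still no job)
    print(repartitioned.count())             # action: the shuffle executes now

Note that rdd itself still has 4 partitions afterwards; repartition returned
a new RDD instead of changing the original.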
Thanks.
Vamshi Talla
On Jul 8, 2018, at 12:26 PM, ryanda
Raymond,
Is your SPARK_HOME set? In your .bash_profile, try setting the below:
export SPARK_HOME=/home/Downloads/spark (or wherever your Spark is downloaded)
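You will likely also need Spark's bin directory on your PATH so the shell can
find spark-shell (assuming the standard layout of a Spark download):

export PATH=$SPARK_HOME/bin:$PATH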
Once done, source your .bash_profile (or restart the shell) and try
spark-shell again.
Best Regards,
Vamshi T
___
Hi Raymond,
I see that your spark-submit command needs a small correction. It should be
of the form:
spark-submit --master local --class <package name>.<class name> <jar location
and jar name>
Example:
spark-submit --master local \
--class retail_db.GetRevenuePerOrder \
C:\RXIE\Learning\Scal
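One caveat (an assumption on my part, since I can't see which shell you're
using): the trailing \ continues a line only in Unix-like shells such as Git
Bash; in the Windows command prompt use ^ instead, or put the whole command
on one line.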
Aakash,
Are you able to run your code in the pyspark shell without issues?
Best Regards,
Vamshi T
From: Hyukjin Kwon
Sent: Friday, June 15, 2018 10:18 AM
To: Marcelo Vanzin
Cc: aakash.spark@gmail.com; user@spark
Subject: Re: Issue upgrading to Spark 2.3.1 (M
Aakash,
Like Jorn suggested, did you increase your test data set? If so, did you also
update your executor-memory setting? It seems like you might be exceeding the
executor memory threshold.
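If it helps, here is a sketch of where those settings go; the 4g/2g values
and the your_job.py name are placeholders, so tune them to your job and
cluster:

    spark-submit --executor-memory 4g --driver-memory 2g your_job.py

or, equivalently, when building the session in code:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .config("spark.executor.memory", "4g")  # placeholder value
             .getOrCreate())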
Thanks
Vamshi Talla
Sent from my iPhone
On Jun 11, 2018, at 8:54 AM, Aakash Basu <aakash.spark@gmail.com>