Unsubscribe

2017-06-20 Thread Palash Gupta
Unsubscribe  Thanks & Best Regards, Engr. Palash Gupta Consultant, OSS/CEM/Big Data Skype: palash2494 https://www.linkedin.com/in/enggpalashgupta

Unsubscribe

2017-06-18 Thread Palash Gupta
 Thanks & Best Regards, Engr. Palash Gupta Consultant, OSS/CEM/Big Data Skype: palash2494 https://www.linkedin.com/in/enggpalashgupta

[Spark 2.0.0] java.util.concurrent.TimeoutException while writing to mongodb from Spark

2017-02-07 Thread Palash Gupta
ala:274)     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)     ... 1 more 17/02/08 07:03:51 INFO spark.SparkContext: Invoking stop() from shutdown hook Thanks & Best Regards, Palash Gupta

Re: spark 2.02 error when writing to s3

2017-01-19 Thread Palash Gupta
Hi, You need to add the overwrite save mode option to avoid this error. //P.Gupta Sent from Yahoo Mail on Android On Fri, 20 Jan, 2017 at 2:15 am, VND Tremblay, Paul wrote: I have come across a problem when writing CSV files to S3 in Spark 2.02. The problem does not exist
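The overwrite-mode fix suggested in this reply can be sketched as a small helper; `df` and the output path are placeholders, and `s3a://` is the usual (but here assumed) scheme for Spark-on-S3 writes.

```python
def write_csv(df, path):
    """Write a DataFrame as CSV, replacing any existing output at `path`.

    Without mode("overwrite"), a second run fails because Spark refuses
    to write to a path that already exists (the default save mode errors
    if the destination is present).
    """
    df.write.mode("overwrite").option("header", "true").csv(path)
```

Usage would be something like `write_csv(df, "s3a://my-bucket/reports/")`, where the bucket name is hypothetical.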

Re: Spark #cores

2017-01-18 Thread Palash Gupta
spark.default.parallelism=32 Later, I read that the term "cores" doesn't mean physical CPU cores but rather the number of tasks that an executor can execute. Anyway, I don't have a clear idea how to set the number of executors per physical node. I see there's an option in the Yarn mode, but it's not av
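The knobs discussed in this thread map to standard Spark configuration keys; a sketch of an executor layout for YARN mode follows (the numeric values are illustrative only, not recommendations).

```python
# Executor sizing for YARN mode, expressed as standard Spark config keys.
# Values here are illustrative, not tuning advice.
yarn_executor_conf = {
    "spark.executor.instances": "8",    # number of executors (YARN mode)
    "spark.executor.cores": "4",        # concurrent task slots per executor
    "spark.default.parallelism": "32",  # default partition/task count
}
```

These can be passed either via `SparkSession.builder.config(key, value)` or as `--conf key=value` on `spark-submit`.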

Re: Spark #cores

2017-01-18 Thread Palash Gupta
Hi, Can you please share how you are assigning CPU cores, and tell us which Spark version and language you are using? //Palash Sent from Yahoo Mail on Android On Wed, 18 Jan, 2017 at 10:16 pm, Saliya Ekanayake wrote: Thank you for the quick response. No, this is not Spark SQL.

Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Palash Gupta
go, please create a collection with a record. Otherwise mongo may not keep that db if the online session dies //Palash Sent from Yahoo Mail on Android On Tue, 17 Jan, 2017 at 12:44 pm, Palash Gupta<spline_pal...@yahoo.com> wrote: Hi Marco, What is the user and password you are using for

Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Palash Gupta
Hi Marco, What is the user and password you are using for the mongodb connection? Did you enable authorization? It's better to include the user & password in the mongo URL. I remember I tested with python successfully. Best Regards, Palash Sent from Yahoo Mail on Android On Tue, 17 Jan, 2017 at 5:37 am, Marco
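The advice to embed the user and password in the Mongo URL can be sketched as a small helper; every argument below is a placeholder.

```python
def mongo_uri(user, password, host, db, collection, port=27017):
    """Build a MongoDB URI with embedded credentials, as suggested above.

    Note: special characters in the password would need percent-encoding;
    that is omitted here for brevity.
    """
    return "mongodb://{0}:{1}@{2}:{3}/{4}.{5}".format(
        user, password, host, port, db, collection)
```

The resulting URI is the shape the MongoDB Spark connector expects in its `spark.mongodb.output.uri` setting (key name per the 2.x connector; verify against your connector version).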

Re: spark-shell running out of memory even with 6GB ?

2017-01-09 Thread Palash Gupta
Hello Mr. Burton, Can you share example code of how you implemented it, for other users to see? "So I think what we did is did a repartition too large and now we ran out of memory in spark shell." Thanks! P.Gupta Sent from Yahoo Mail on Android On Tue, 10 Jan, 2017 at 8:20 am, Kevin
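The quoted failure (a repartition that was too large) is commonly avoided by tying the partition count to the cluster's core count and shrinking with `coalesce` rather than a full shuffle; the heuristic below is illustrative, not a Spark recommendation.

```python
def safe_partition_count(total_cores, tasks_per_core=3, cap=2000):
    """Heuristic partition count: a few tasks per core, with an upper cap
    so a typo cannot request an enormous shuffle."""
    return min(total_cores * tasks_per_core, cap)

# Usage against a DataFrame (requires a live Spark session):
# df = df.repartition(safe_partition_count(32))  # full shuffle
# df = df.coalesce(8)                            # shrink without a full shuffle
```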

[Spark 2.1.0] Resource Scheduling Challenge in pyspark sparkSession

2017-01-05 Thread Palash Gupta
8g").config("spark.executor.memory", "4g").appName(APP_NAME).getOrCreate() Thanks & Best Regards, Engr. Palash Gupta WhatsApp/Viber: +8801817181502 Skype: palash2494 Thanks & Best Regards, Palash Gupta
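The builder chain in this snippet is cut off at the start; assuming the truncated "8g" belonged to `spark.driver.memory` (an assumption, not confirmed by the archive), the configuration pairs would be:

```python
def session_config_pairs():
    """Config pairs for the truncated builder chain above.

    The key paired with "8g" is cut off in the archive;
    spark.driver.memory is an assumption.
    """
    return [
        ("spark.driver.memory", "8g"),    # assumed key for the truncated "8g"
        ("spark.executor.memory", "4g"),  # present in the original snippet
    ]

# Usage (requires pyspark):
# builder = SparkSession.builder.appName(APP_NAME)
# for key, value in session_config_pairs():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```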

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2017-01-05 Thread Palash Gupta
Hi Marco, Yes, it was on the same host when the problem was found. Even when I tried to start on a different host, the problem was still there. Any hints or suggestions will be appreciated. Thanks & Best Regards, Palash Gupta From: Marco Mistroni <mmistr...@gmail.com> To: Pa

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2017-01-05 Thread Palash Gupta
/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco   File "/usr/local/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o58.load. Thanks & Best Regards, Palas

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-31 Thread Palash Gupta
on at a time and try to reproduce this error for both the file system and HDFS loading cases. When I'm done, I will share details with you. If you have any more suggestions from a debugging point of view, you can add them here for me. Thanks & Best Regards, Palash Gupta From: Marco Mistroni <mmistr..

Re: What's the best practice to load data from RDMS to Spark

2016-12-30 Thread Palash Gupta
Hi, If you want to load from CSV, you can use the procedure below. Of course, you need to define the Spark context first. (The example given loads all CSVs under a folder; use a specific file name for a single file.) // these lines are equivalent in Spark 2.0 spark.read.format("csv").option("header",
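The Spark 2.0 loading lines in this snippet are truncated; a completed version of the two equivalent forms (header option and folder glob as described in the thread) might look like this, with the session passed in so the sketch stays self-contained.

```python
def read_csv_both_ways(spark, path="../Downloads/*.csv"):
    """Two equivalent Spark 2.0 ways to load CSV files with a header row.

    `spark` is an existing SparkSession; the glob path loads every CSV
    under the folder, or pass a specific file name for a single file.
    """
    df_long = spark.read.format("csv").option("header", "true").load(path)
    df_short = spark.read.option("header", "true").csv(path)
    return df_long, df_short
```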

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-30 Thread Palash Gupta
;guha.a...@gmail.com> wrote: @Palash: I think what Macro meant by "reduce functionality" is to reduce scope of your application's functionality so that you can isolate the issue in certain part(s) of the app...I do not think he meant "reduce" operation :) On Fri, Dec 30, 2

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-30 Thread Palash Gupta
example of reduce functionality, since I'm using Spark data frames, joining data frames, and using SQL statements to manipulate KPI(s). How could I apply reduce functionality here? Thanks & Best Regards, Palash Gupta From: Marco Mistroni <mmistr...@gmail.com> To: "spline_pal...@ya

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-30 Thread Palash Gupta
loads/*.csv") spark.read.option("header", "true").csv("../Downloads/*.csv") Thanks & Best Regards, Palash Gupta From: Nicholas Hakobian <nicholas.hakob...@rallyhealth.com> To: "spline_pal...@yahoo.com" <spline_pal...@yahoo.com> Cc: M

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-29 Thread Palash Gupta
filesystem instead of hadoop. If you can read it successfully, then your Hadoop file is the issue and you can start debugging from there. Hth On 29 Dec 2016 6:26 am, "Palash Gupta" <spline_pal...@yahoo.com.invalid> wrote: Hi Apache Spark User team, Greetings! I started developing a
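The debugging suggestion here (read the same file from the local filesystem first, then from HDFS) comes down to the path scheme; the host, port, and paths below are placeholders.

```python
def candidate_paths(relative="data/sample.csv"):
    """Same file addressed two ways, to isolate whether HDFS is at fault.

    If the file:// read succeeds but the hdfs:// read fails, the problem
    is on the Hadoop side rather than in the Spark job itself.
    """
    return {
        "local": "file:///tmp/" + relative,
        "hdfs": "hdfs://namenode:8020/user/spark/" + relative,  # placeholder host:port
    }
```

Each path would then be fed to the same `spark.read` call in turn.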

[TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-28 Thread Palash Gupta
Hi Apache Spark User team, Greetings! I started developing an application using Apache Hadoop and Spark using Python. My pyspark application randomly terminated saying "Failed to get broadcast_1*" and I have been searching for suggestions and support on Stack Overflow at Failed to get