Hi,
I am running an application on Spark v1.6.2 (in standalone mode) over more than
100 GB of data. Given below are my configurations:

Job configuration
spark.driver.memory=5g
spark.executor.memory=5g
spark.cores.max=4

spark-env.sh
export SPARK_WORKER_INSTANCES=3;
export SPARK_WORKER_MEMORY=5g;
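
For completeness, here is roughly how those job-level settings map onto the
application code (the app name and master URL below are placeholders;
spark.driver.memory is actually passed on the spark-submit command line, since
it has to be set before the driver JVM starts):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Illustrative setup only -- names are placeholders.
val conf = new SparkConf()
  .setAppName("CsvUnionAggregateJob")       // hypothetical app name
  .setMaster("spark://master-host:7077")    // placeholder standalone master URL
  .set("spark.executor.memory", "5g")
  .set("spark.cores.max", "4")

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)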


There is one DataFrame that is periodically unioned and aggregated with data
from multiple CSV files as they are streamed in. All goes well, but towards the
end, when I need to persist this DataFrame (using Spark JDBC), the executor
seems to hang for good! I even tried dataframe.show(), but no luck.
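
To make the pattern concrete, here is a simplified sketch of what the job does,
using the sqlContext from the snippet above (column names, file paths and the
JDBC URL/credentials are placeholders; the real job picks up CSV files as they
arrive rather than from a fixed list):

import java.util.Properties
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.sum

// Each incoming CSV batch is unioned into the running DataFrame and
// re-aggregated ("key"/"value" are placeholder columns).
var aggregated: DataFrame = null
val incomingFiles = Seq("/data/in/batch_0001.csv", "/data/in/batch_0002.csv")  // placeholder paths

for (path <- incomingFiles) {
  val batch = sqlContext.read
    .format("com.databricks.spark.csv")     // spark-csv package on Spark 1.6
    .option("header", "true")
    .option("inferSchema", "true")
    .load(path)

  val unioned = if (aggregated == null) batch else aggregated.unionAll(batch)
  aggregated = unioned
    .groupBy("key")
    .agg(sum("value").as("value"))
}

// The final persist step where the executors appear to hang:
val props = new Properties()
props.setProperty("user", "dbuser")         // placeholder credentials
props.setProperty("password", "dbpass")
aggregated.write
  .mode(SaveMode.Append)
  .jdbc("jdbc:mysql://dbhost:3306/mydb", "agg_table", props)   // placeholder URL/table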

I have read about this and tried multiple things, but nothing has worked so far.


Any suggestion would really help!

Sincerely,
-Vivek
