Re: EOFException while reading from HDFS

2016-04-28 Thread Saurav Sinha
to conf/spark-env.sh:

export SPARK_DIST_CLASSPATH="/usr/local/hadoop-1.0.4/bin/hadoop"

but none of it seems to work. However, the following command works from 172.26.49.55 and gives the directory listing:

/usr/local/hadoop-1.0.4/bin/hadoop fs -ls hdfs://172.26.49.156:54310/

Any suggestion?

Thanks
Bibudh

--
Bibudh Lahiri
Data Scientist, Impetus Technologies
5300 Stevens Creek Blvd
San Jose, CA 95129
http://knowthynumbers.blogspot.com/

--
Thanks and Regards,
Saurav Sinha
Contact: 9742879062
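A likely culprit in the setup above: SPARK_DIST_CLASSPATH expects the classpath string that `hadoop classpath` prints, not the path to the hadoop binary itself. A minimal sketch of conf/spark-env.sh, reusing the paths and NameNode address from this thread as illustrations:

```shell
# conf/spark-env.sh sketch (paths from the thread, shown as illustrations).
# SPARK_DIST_CLASSPATH should hold the classpath emitted by `hadoop classpath`,
# not the location of the hadoop executable.
export HADOOP_HOME=/usr/local/hadoop-1.0.4
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
export SPARK_DIST_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)

# With the NameNode address from the thread, the URI that works for
# `hadoop fs -ls` should then also work from Spark, e.g.:
#   sc.textFile("hdfs://172.26.49.156:54310/path/to/file")
```

Separately, an EOFException during the HDFS RPC handshake is a classic symptom of a Hadoop client/server version mismatch, so checking that the Hadoop version Spark was built against matches the cluster (here, 1.0.4) is also worthwhile.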

Error in Spark job

2016-07-12 Thread Saurav Sinha
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1426) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
pper.flush(CompressionCodec.scala:197) at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822) Help needed.

Re: Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
be set. > On Monday, July 18, 2016 6:31 PM, Saurav Sinha wrote: > Hi, > I am running a Spark job. > Master memory - 5G > executor memory - 10G (running on 4 nodes) > My job is getting killed as the number of partitions increases to 20K.
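For a driver-side OOM that appears as the partition count grows, the usual first levers are raising driver memory and capping result sizes at submit time. A sketch with illustrative values (com.example.MyJob and myjob.jar are placeholders); note that `--driver-memory` must be given to spark-submit or spark-defaults.conf, because setting it in SparkConf from application code is too late in client mode:

```shell
# Illustrative values only; the class name and jar are placeholders.
spark-submit \
  --driver-memory 8g \
  --conf spark.driver.maxResultSize=2g \
  --executor-memory 10g \
  --class com.example.MyJob \
  myjob.jar
```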

Re: Spark driver getting out of memory

2016-07-19 Thread Saurav Sinha
. Thanks, Saurav Sinha On Tue, Jul 19, 2016 at 2:42 AM, Mich Talebzadeh wrote: > can you please clarify: > > >1. In what mode are you running the spark standalone, yarn-client, >yarn cluster etc >2. You have 4 nodes with each executor having 10G. How many actual >

Re: Spark driver getting out of memory

2016-07-19 Thread Saurav Sinha
utilization. Thanks, Saurav Sinha On Tue, Jul 19, 2016 at 10:14 PM, RK Aduri wrote: > Just want to see if this helps. > > Are you doing heavy collects and persist that? If that is so, you might > want to parallelize that collection by converting to an RDD. > > Thanks, > RK > &

Explanation regarding Spark Streaming

2016-08-03 Thread Saurav Sinha
Hi, I have a query. Q1. What will happen if a Spark Streaming job has a batch duration of 60 sec and the processing time of the complete pipeline is greater than 60 sec?
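To the question itself: when processing time exceeds the batch duration, new batches keep arriving on schedule and queue up, so scheduling delay and memory use grow without bound until the job destabilizes. One mitigation available from Spark 1.5 onward is backpressure, sketched here with illustrative values (app.jar is a placeholder):

```shell
# Backpressure adapts the ingestion rate to the rate the pipeline actually
# sustains; the per-partition cap is an optional hard upper bound for the
# Kafka direct stream (value illustrative).
spark-submit \
  --conf spark.streaming.backpressure.enabled=true \
  --conf spark.streaming.kafka.maxRatePerPartition=1000 \
  app.jar
```

The more fundamental fix is to make each batch finish inside the batch interval (more parallelism, cheaper per-record work, or a longer batch duration).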

Master getting down with Memory issue.

2015-09-27 Thread Saurav Sinha
more than 5 min to respond with the status of jobs. Running Spark 1.4.1 in standalone mode on a 5-machine cluster. Kindly suggest a solution for the memory issue; it is a blocker. Thanks, Saurav Sinha
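In standalone mode the Master holds UI and application state for finished apps on its own heap, which is a common cause of Master OOM on long-running clusters. A hedged sketch of conf/spark-env.sh settings that address both sides (values illustrative):

```shell
# conf/spark-env.sh sketch, values illustrative.
# Raise the heap used by the standalone Master/Worker daemon processes.
export SPARK_DAEMON_MEMORY=2g
# Cap how many finished applications/drivers the Master retains in memory.
export SPARK_MASTER_OPTS="-Dspark.deploy.retainedApplications=50 -Dspark.deploy.retainedDrivers=25"
```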

Re: Master getting down with Memory issue.

2015-09-28 Thread Saurav Sinha
Hi Akhil, Can you please explain how increasing the number of partitions (which is a worker-node setting) will help, as the issue is that my master is getting OOM? Thanks, Saurav Sinha On Mon, Sep 28, 2015 at 2:32 PM, Akhil Das wrote: > This behavior totally depends on the job that you

Re: Master getting down with Memory issue.

2015-09-28 Thread Saurav Sinha
Hi Akhil, My job creates 47 stages in one cycle and it runs every hour. Can you please suggest the optimum number of stages for a Spark job? How can we reduce the number of stages in a Spark job? Thanks, Saurav Sinha On Mon, Sep 28, 2015 at 3:23 PM, Saurav Sinha wrote: > Hi Ak

Spark job is running infinitely

2015-10-12 Thread Saurav Sinha

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
' in your reply. > Thanks > On Mon, Oct 12, 2015 at 10:07 AM, Saurav Sinha wrote: >> Hi Experts, >> I am facing an issue in which a Spark job is running infinitely. >> When I start the Spark job on a 4-node cluster. >> In w

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
art, Spark experts may have answer for you. > > On Mon, Oct 12, 2015 at 11:09 AM, Saurav Sinha > wrote: > >> Hi Ted, >> >> *Do you have monitoring put in place to detect 'no space left' scenario ?* >> >> No, I don't have any monitoring in plac

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
Hi Ted, Which monitoring service would you suggest for me. Thanks, Saurav On Mon, Oct 12, 2015 at 11:55 PM, Saurav Sinha wrote: > Hi Ted, > > Which would you suggest for monitoring service for me. > > Thanks, > Saurav > > On Mon, Oct 12, 2015 at 11:47 PM, Ted Yu wro
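Short of a full monitoring stack (Nagios, Ganglia and similar were the common suggestions at the time), a cron-able shell check is enough to catch the "no space left on device" case early. A minimal sketch; the threshold and mount point are illustrative:

```shell
# Minimal disk-usage check, suitable for cron; prints ALERT when usage on
# the chosen filesystem crosses the threshold, OK otherwise.
THRESHOLD=90
MOUNT="/"
USAGE=$(df -P "$MOUNT" | awk 'NR==2 {gsub(/%/,""); print $5}')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
  echo "ALERT: $MOUNT at ${USAGE}% used (threshold ${THRESHOLD}%)"
else
  echo "OK: $MOUNT at ${USAGE}% used"
fi
```

Pointing MOUNT at the Spark work and event-log directories (and piping ALERT lines to mail) is the obvious extension.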

Fwd: Issue with high no of skipped task

2015-09-20 Thread Saurav Sinha
Hi Users, I am new to Spark and I have written a flow. When we deployed our code it was completing jobs in 4-5 min, but now it is taking 20+ min with almost the same set of data. Can you please help me figure out the reason?

Fwd: Issue with high no of skipped task

2015-09-21 Thread Saurav Sinha
-- Forwarded message -- From: "Saurav Sinha" Date: 21-Sep-2015 11:48 am Subject: Issue with high no of skipped task To: Cc: Hi Users, I am new to Spark and I have written a flow. When we deployed our code it was completing jobs in 4-5 min. But now it is taking 20+ min in compl

Re: Unreachable dead objects permanently retained on heap

2015-09-25 Thread Saurav Sinha
in standalone mode on a 5-machine cluster. Kindly suggest a solution for the memory issue; it is a blocker. Thanks, Saurav Sinha On Fri, Sep 25, 2015 at 5:01 PM, James Aley wrote: > Hi, > We have an application that submits several thousand jobs within the same > SparkContext, using a thr
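With many thousands of jobs in one SparkContext, the driver also retains per-job and per-stage UI state on its heap; capping those caches is a common mitigation for exactly this "dead objects permanently retained" pattern. A sketch with illustrative values (app.jar is a placeholder):

```shell
# Limit how much completed-job/stage/query state the driver keeps for the UI.
spark-submit \
  --conf spark.ui.retainedJobs=200 \
  --conf spark.ui.retainedStages=200 \
  --conf spark.sql.ui.retainedExecutions=100 \
  app.jar
```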

Re: Finding unique across all columns in dataset

2016-09-19 Thread Saurav Sinha
o hdfs. > How can I achieve this? > Is there any distributed data structure that I can use and keep on > updating it as I traverse the new rows? > Regards, > Abhi
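One way to collect the distinct values across all columns while staying distributed until the final write, sketched in Scala (`df` and the output path are illustrative; assumes cell values can be compared via their toString form):

```scala
// Sketch: distinct values across every column of a DataFrame.
val distinctValues = df.rdd
  .flatMap(row => row.toSeq)   // flatten each Row into its cell values
  .filter(_ != null)
  .map(_.toString)
  .distinct()                  // distributed de-duplication, no driver collect

distinctValues.saveAsTextFile("hdfs:///tmp/unique-values") // path illustrative
```

Because `distinct` shuffles rather than collecting to the driver, this scales with the cluster instead of with driver memory.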

PermGen space error

2016-10-06 Thread Saurav Sinha
mpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606)
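PermGen exhaustion on Java 7 is typically addressed by enlarging the permanent generation on both the driver and the executors (Java 8 replaced PermGen with Metaspace, so these flags only apply to Java 7). Sketch, sizes illustrative and app.jar a placeholder:

```shell
spark-submit \
  --driver-java-options "-XX:MaxPermSize=512m" \
  --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512m" \
  app.jar
```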

Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.

2016-10-06 Thread Saurav Sinha
ploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Please help.

Re: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.

2016-10-07 Thread Saurav Sinha
I am submitting the job with spark-submit but it still gives the message "Please use spark-submit." Can anyone give me the reason for this error? Thanks, Saurav Sinha On Thu, Oct 6, 2016 at 3:38 PM, Saurav Sinha wrote: > I did not get you. I am submitting the job by spark-submit but still it is >
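SparkContext raises this error when it is constructed with master "yarn-cluster" outside a YARN application master, which most often happens when the master is hardcoded via setMaster(...) in the application code rather than supplied to spark-submit. A hedged sketch of the fix (class and jar are placeholders):

```shell
# Remove setMaster(...) from the application's SparkConf and supply the
# master only at submit time (Spark 1.x syntax, matching the thread's trace).
spark-submit \
  --master yarn-cluster \
  --class com.example.MyJob \
  myjob.jar
```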

Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
null| |null|2439d6db-16a2-44b...| +----+--------+

Re: Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
Can anyone help me out? On Mon, Oct 17, 2016 at 7:27 PM, Saurav Sinha wrote: > Hi, > I am in a situation where I want to generate a unique Id for each row. > I have used monotonicallyIncreasingId but it gives increasing values > and starts generating from the start if it fa
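Two hedged alternatives, assuming a DataFrame `df` and a Spark 1.6-era `sqlContext` (names illustrative): zipWithIndex when ids must be contiguous, or a UUID column when they only need to be unique across restarts, which is what the thread's truncated sample output appears to show. monotonicallyIncreasingId is unique within a run but neither contiguous nor stable across restarts.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Option 1: contiguous 0..N-1 ids via zipWithIndex.
val withIndex = df.rdd.zipWithIndex.map { case (row, idx) =>
  Row.fromSeq(row.toSeq :+ idx)
}
val schema = StructType(df.schema.fields :+ StructField("row_id", LongType, nullable = false))
val dfWithId = sqlContext.createDataFrame(withIndex, schema)

// Option 2: restart-safe UUIDs (unique, but unordered).
val uuid = udf(() => java.util.UUID.randomUUID.toString)
val dfWithUuid = df.withColumn("row_uuid", uuid())
```

One caveat on option 2: the UUID UDF is non-deterministic, so recomputation of a partition can regenerate different values; persist or write out the result before relying on the ids.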

Setting spark.yarn.stagingDir in 1.6

2017-03-15 Thread Saurav Sinha
in spark 1.6.
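For context: spark.yarn.stagingDir was only added in Spark 2.0 (SPARK-13063); on 1.6 the staging directory is fixed at `.sparkStaging` under the submitting user's home directory on the default filesystem. On 2.0+ it can be redirected (path and jar illustrative):

```shell
spark-submit \
  --conf spark.yarn.stagingDir=hdfs:///user/shared/staging \
  app.jar
```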

Re: What is the best way for Spark to read HDF5@scale?

2018-09-17 Thread Saurav Sinha
or HDF5? > The following link does not work anymore: > https://www.hdfgroup.org/downloads/spark-connector/ > Thanks, > Kathleen