Guys,
We have a project which builds upon Spark Streaming.
We use Kafka as the input stream, and create 5 receivers.
After this application had been running for around 90 hours, all 5
receivers failed for unknown reasons.
In my understanding, it is not guaranteed that a Spark Streaming receiver
will be restarted automatically after it fails.
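For reference, the receivers are created roughly like this (sketch only,
against the Spark 1.x receiver-based Kafka API; the ZooKeeper quorum,
consumer group, and topic name below are placeholders, not our real
settings):

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("kafka-streaming")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder connection settings.
    val zkQuorum = "zk1:2181,zk2:2181"
    val group = "consumer-group"
    val topics = Map("events" -> 1)

    // Create 5 receiver-based Kafka streams and union them, so the rest
    // of the job sees a single DStream.
    val streams = (1 to 5).map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, group, topics,
        StorageLevel.MEMORY_AND_DISK_SER)
    }
    val unified = ssc.union(streams)

    unified.count().print()

    ssc.start()
    ssc.awaitTermination()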
> Thanks
> Best Regards
>
> On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang wrote:
>
>> Guys,
>>
>> We have a project which builds upon Spark Streaming.
>>
>> We use Kafka as the input stream, and create 5 receivers.
>>
>> After this application had been running for around 90 hours, all 5
>> receivers failed for unknown reasons.
>> can enable log rotation etc.), and if you are doing groupBy, join, etc.
>> type of operations, then there will be a lot of shuffle data. So you need
>> to check the worker logs and see what happened (whether the disk filled up etc.).
>> We have streaming pipelines running for weeks.
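On the log rotation point: if I remember correctly, Spark 1.1+ can roll
executor logs by itself via the spark.executor.logs.rolling.* properties,
e.g.:

    import org.apache.spark.SparkConf

    // Roll executor stdout/stderr daily and keep only the last few files,
    // so a weeks-long streaming job does not slowly fill the disk.
    val conf = new SparkConf()
      .setAppName("long-running-streaming")
      .set("spark.executor.logs.rolling.strategy", "time")
      .set("spark.executor.logs.rolling.time.interval", "daily")
      .set("spark.executor.logs.rolling.maxRetainedFiles", "3")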
> If a receiver fails, Spark Streaming will automatically
> spawn another receiver on another machine or on the same machine.
>
> Thanks
> Best Regards
>
> On Mon, Mar 16, 2015 at 1:08 PM, Jun Yang wrote:
>
>> Dibyendu,
>>
>> Thanks for the reply.
>>
>> I am reading your project homepage now.
Guys,
Recently we are migrating our backend pipeline to Spark.
In our pipeline we have an MPI-based HAC (hierarchical agglomerative
clustering) implementation, and to ensure result consistency across the
migration, we also want to port this MPI code to Spark.
However, during the migration process, I found that there are so
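One concrete difficulty: every HAC merge step needs the closest pair over
all clusters, which on an RDD is naturally a cartesian product and
therefore quadratic. A rough sketch of a single merge step (the Cluster
type and centroid representation here are simplified stand-ins, not our
real code):

    import org.apache.spark.rdd.RDD

    // Simplified cluster representation: an id plus a centroid.
    case class Cluster(id: Long, centroid: Array[Double])

    def squaredDistance(a: Array[Double], b: Array[Double]): Double =
      a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

    // One HAC merge step: find the pair of clusters whose centroids are
    // closest. cartesian() makes this O(n^2) in the number of clusters,
    // which is the part that is hard to make fast on Spark.
    def closestPair(clusters: RDD[Cluster]): (Long, Long, Double) =
      clusters.cartesian(clusters)
        .filter { case (a, b) => a.id < b.id } // each unordered pair once
        .map { case (a, b) =>
          (a.id, b.id, squaredDistance(a.centroid, b.centroid))
        }
        .reduce((p, q) => if (p._3 <= q._3) p else q)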
Guys,
As to the pre-processing questions, you could just migrate your logic to
Spark before running K-means.
I have only used Scala on Spark, not the Python bindings, but I think the
basic steps must be the same.
BTW, if your data set is big, with huge high-dimensional sparse feature vectors
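In Scala the basic flow is only a few lines. A rough sketch, assuming an
existing SparkContext sc, a known feature dimensionality, and a made-up
input format of space-separated index:value pairs:

    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val dim = 1000000 // assumed feature dimensionality

    // Pre-processing: parse "index:value" pairs into MLlib sparse
    // vectors before clustering (indices must be sorted ascending).
    val data = sc.textFile("hdfs:///path/to/features")
      .map { line =>
        val pairs = line.split(" ").map(_.split(":"))
        val indices = pairs.map(_(0).toInt)
        val values = pairs.map(_(1).toDouble)
        Vectors.sparse(dim, indices, values)
      }
      .cache()

    // k = 10 clusters, 20 iterations (both just example values).
    val model = KMeans.train(data, 10, 20)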
Guys,
I have a question regarding the broadcast implementation in Spark 1.1.
In our pipeline, we have a large multi-class LR (logistic regression)
model, which is about 1GiB in size.
To exploit Spark's parallelism, the natural approach is to
broadcast this model to the worker nodes.
However, it looks that bro
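For context, the pattern we are trying is the standard one below (sketch
only; Model, loadModel, score, and the paths are made-up stand-ins for our
real 1GiB model). If I remember correctly, Spark 1.1 switched the default
broadcast implementation to TorrentBroadcast, which distributes the value
in chunks instead of having every executor fetch the whole thing from the
driver:

    import org.apache.spark.{SparkConf, SparkContext}

    // Made-up stand-ins for the real 1GiB multi-class LR model.
    case class Model(weights: Array[Array[Double]])
    def loadModel(path: String): Model = ???
    def score(m: Model, features: Array[Double]): Int = ???

    val sc = new SparkContext(new SparkConf().setAppName("lr-broadcast"))

    // Load the model once on the driver and broadcast it, so each
    // executor JVM fetches one copy instead of one copy per task.
    val model = sc.broadcast(loadModel("hdfs:///path/to/model"))

    val predictions = sc.textFile("hdfs:///path/to/input")
      .map(_.split(" ").map(_.toDouble))
      .map(features => score(model.value, features))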