Yes.
On Wed, Aug 12, 2015 at 12:12 PM, Mohit Anchlia
wrote:
> Thanks! To write to hdfs I do need to use saveAs method?
>
> On Wed, Aug 12, 2015 at 12:01 PM, Tathagata Das
> wrote:
>
>> This is how Spark does it. It writes the task output to a uniquely-named
>> temporary file, and then atomically (
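A minimal Scala sketch of that behaviour (the output path and app name here are just examples): saveAsTextFile writes one part-NNNNN file per partition, named after the partition index, so tasks running on different executors never collide on filenames.

import org.apache.spark.{SparkConf, SparkContext}

object SaveExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("save-example"))

    // Four partitions -> four output files, one per task.
    val data = sc.parallelize(1 to 100, numSlices = 4)

    // Each task writes to a uniquely named temporary file and the output
    // committer moves it into place; the final directory contains
    // part-00000 .. part-00003 plus a _SUCCESS marker.
    data.saveAsTextFile("hdfs:///tmp/save-example-output")

    sc.stop()
  }
}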
Thanks for the info. When data is written to HDFS, how does Spark keep the
filenames written by multiple executors unique?
On Tue, Aug 11, 2015 at 9:35 PM, Hemant Bhanawat
wrote:
> Posting a comment from my previous mail post:
>
> When data is received from a stream source, receiver creates block
Posting a comment from my previous mail post:
When data is received from a stream source, the receiver creates blocks of
data. A new block of data is generated every blockInterval milliseconds. N
blocks of data are created during the batchInterval, where N =
batchInterval/blockInterval. An RDD is created out of these blocks for every
batchInterval, and each block becomes a partition of that RDD.
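A minimal Scala sketch of how those intervals fit together (the socket source, host, port and interval values are just examples):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object BlockIntervalExample {
  def main(args: Array[String]): Unit = {
    // A 10 second batch with a 500 ms block interval gives
    // N = batchInterval / blockInterval = 20 blocks per batch,
    // and hence 20 partitions per batch RDD for a receiver-based stream.
    val conf = new SparkConf()
      .setAppName("block-interval-example")
      .set("spark.streaming.blockInterval", "500ms")
    val ssc = new StreamingContext(conf, Seconds(10))

    // socketTextStream is only an example of a receiver-based source.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.foreachRDD { rdd =>
      println(s"partitions in this batch: ${rdd.partitions.length}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}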
I am also trying to understand how files are named when writing to Hadoop.
For example, how does the "saveAs" method ensure that each executor generates
unique files?
On Tue, Aug 11, 2015 at 4:21 PM, ayan guha wrote:
> partitioning - by itself - is a property of RDD. so essentially it is no
> differ
Partitioning - by itself - is a property of an RDD, so essentially it is no
different in the case of streaming, where each batch is one RDD. You can use
partitionBy on the RDD and pass your custom partitioner function to it.
One thing you should consider is how balanced your partitions are, i.e. your
partitio
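A minimal Scala sketch of partitionBy with a custom partitioner along those lines (the key type, partition count and balance check are just examples):

import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Example partitioner: places keys by a non-negative hash of the key.
class CustomPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }
}

object PartitionByExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("partition-by-example"))

    // partitionBy is available on key/value (pair) RDDs.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3), ("a", 4)))
    val repartitioned = pairs.partitionBy(new CustomPartitioner(4))

    // A quick way to see how balanced the resulting partitions are.
    val sizes = repartitioned.mapPartitions(it => Iterator(it.size)).collect()
    println(sizes.mkString(", "))

    sc.stop()
  }
}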