Re: Codegen In Shuffle

2015-11-04 Thread
I see. Thanks very much. 2015-11-04 16:25 GMT+08:00 Reynold Xin : > GenerateUnsafeProjection -- projects any internal row data structure > directly into bytes (UnsafeRow). > > > On Wed, Nov 4, 2015 at 12:21 AM, 牛兆捷 wrote: > >> Dear all: >> >> Tungsten

Codegen In Shuffle

2015-11-04 Thread
Dear all: The Tungsten project has mentioned that it is applying code generation to speed up the conversion of data from the in-memory binary format to the wire protocol for shuffle. Where can I find the related implementation in the Spark codebase? -- *Regards,* *Zhaojie*

RDD checkpoint

2015-07-13 Thread
A checkpointed RDD is computed twice. Why not checkpoint the RDD as soon as it is first computed? Is there any special reason for this? -- *Regards,* *Zhaojie*
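The double computation asked about above can be illustrated with a toy model. This is only a sketch in plain Python, not Spark's implementation: `ToyRDD`, `run`, and the `compute_count` counter are hypothetical names invented for illustration. It assumes the behaviour described in the question (the action computes the data, then a separate pass recomputes it to write the checkpoint) and shows how persisting the RDD first lets the second pass reuse the cached copy, which is the workaround the `RDD.checkpoint` documentation recommends.

```python
class ToyRDD:
    """Toy stand-in for a lazily evaluated RDD (not Spark's implementation)."""

    def __init__(self, source):
        self.source = source              # function that produces the data
        self.compute_count = 0            # how many times the lineage was evaluated
        self.persisted = False
        self.cached = None                # in-memory copy, filled only if persisted
        self.checkpoint_requested = False
        self.checkpoint_data = None       # stands in for files on reliable storage

    def persist(self):
        self.persisted = True
        return self

    def checkpoint(self):
        # Only marks the RDD; nothing is written until an action runs.
        self.checkpoint_requested = True

    def _compute(self):
        if self.checkpoint_data is not None:
            return self.checkpoint_data
        if self.cached is not None:
            return self.cached
        self.compute_count += 1           # the lineage is actually evaluated here
        data = list(self.source())
        if self.persisted:
            self.cached = data
        return data

    def count(self):
        result = len(self._compute())     # the job for the action itself
        if self.checkpoint_requested and self.checkpoint_data is None:
            # A second pass writes the data out after the action finishes.
            self.checkpoint_data = self._compute()
        return result


def run(persist_first):
    """Return how many times the toy lineage was evaluated for one action."""
    rdd = ToyRDD(lambda: range(5))
    if persist_first:
        rdd.persist()
    rdd.checkpoint()
    rdd.count()
    return rdd.compute_count
```

In this model, `run(False)` evaluates the lineage twice and `run(True)` evaluates it once, because the checkpoint pass reads the cached copy instead of recomputing.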

Re: Questions about Fault tolerance of Spark

2015-07-11 Thread
well. > Mike > > > Original message > From: 牛兆捷 > Date:07-09-2015 04:19 (GMT-05:00) > To: dev@spark.apache.org, u...@spark.apache.org > Subject: Questions about Fault tolerance of Spark > > Hi All: > > We already know that Spark utilizes the lineage to recompute t

Questions about Fault tolerance of Spark

2015-07-09 Thread
Hi All: We already know that Spark utilizes the lineage to recompute the RDDs when failure occurs. I want to study the performance of this fault-tolerant approach and have some questions about it. 1) Is there any benchmark (or standard failure model) to test the fault tolerance of these kinds of

Workload for spark testing

2014-09-13 Thread
Hi All: We know that some of Spark's memory is used for computing (e.g., spark.shuffle.memoryFraction) and some is used for caching RDDs for future use (e.g., spark.storage.memoryFraction). Is there any existing workload that can utilize both of them during its running life cycle? I want to do some pe

Re: memory size for caching RDD

2014-09-04 Thread
s done by RDD unit, not by block unit. And then, if the storage level > including disk level, the data on the disk will be removed too. > > Best Regards, > Raymond Liu > > From: 牛兆捷 [mailto:nzjem...@gmail.com] > Sent: Thursday, September 04, 2014 2:57 PM > To: Liu,

Re: memory size for caching RDD

2014-09-03 Thread
ion conf. > e.g. spark.shuffle.memoryFraction which you also set the up limit. > > > > Best Regards, > > *Raymond Liu* > > > > *From:* 牛兆捷 [mailto:nzjem...@gmail.com] > *Sent:* Thursday, September 04, 2014 2:27 PM > *To:* Patrick Wendell > *Cc:* u...@spark.apache.org; dev@spar

Re: memory size for caching RDD

2014-09-03 Thread
Thanks, Raymond. I duplicated the question. Please see the reply here. 2014-09-04 14:27 GMT+08:00 牛兆捷 : > But is it possible to make it resizable? When we don't have many RDDs to > cache, we can give some memory to others. > > > 2014-09-04 13:45 GMT+08:00 Patrick Wendell

Re: memory size for caching RDD

2014-09-03 Thread
But is it possible to make it resizable? When we don't have many RDDs to cache, we can give some memory to others. 2014-09-04 13:45 GMT+08:00 Patrick Wendell : > Changing this is not supported, it is immutable similar to other spark > configuration settings. > > On Wed, Sep 3, 20

memory size for caching RDD

2014-09-03 Thread
Dear all: Spark uses memory to cache RDDs, and the size of that memory is specified by "spark.storage.memoryFraction". Once the executor starts, does Spark support adjusting/resizing this part of memory dynamically? Thanks. -- *Regards,* *Zhaojie*
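As a rough illustration of what a fixed fraction means for the cache size, the arithmetic can be sketched in plain Python. This is an assumption-laden sketch, not Spark code: `storage_budget_bytes` is a hypothetical helper, the heap size passed in is made up, and the sketch ignores any extra safety margin Spark may apply on top of the fraction. The default of 0.6 is the documented default of "spark.storage.memoryFraction" in Spark 1.x.

```python
def storage_budget_bytes(executor_heap_bytes, storage_memory_fraction=0.6):
    """Approximate bytes available for caching RDDs under a fixed fraction.

    Hypothetical helper for illustration only; it ignores any additional
    safety margin and is not part of Spark's API.
    """
    if not 0.0 <= storage_memory_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return int(executor_heap_bytes * storage_memory_fraction)


# Example: a hypothetical 4 GiB executor heap with half reserved for caching.
four_gib = 4 * 1024 ** 3
cache_budget = storage_budget_bytes(four_gib, storage_memory_fraction=0.5)
```

Because the fraction is read from the configuration when the executor starts, the budget computed this way stays the same for the executor's lifetime, which is the limitation the question is about.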

acquire and give back resources dynamically

2014-08-13 Thread
Dear all: Can Spark acquire resources from, and give resources back to, YARN dynamically? -- *Regards,* *Zhaojie*