Re: temporary table size is 0, which makes reducer number too small

2016-10-18 Thread
issue: crunch-624. link: https://issues.apache.org/jira/browse/CRUNCH-624?jql=project%20%3D%20CRUNCH 2016-10-18 13:54 GMT+08:00 Josh Wills <josh.wi...@gmail.com>: > Yep, that's right-- can you file a JIRA, and I'll post the patch? > > On Mon, Oct 17, 2016 at 10:52 PM, 陈竞 <c

Re: temporary table size is 0, which makes reducer number too small

2016-10-17 Thread
h the scale factor on operations that may blow up > data? > > On Sun, Oct 16, 2016, 10:04 PM 陈竞 <cj.mag...@gmail.com> wrote: > >> that's a solution, but since the user may not clearly know which step will >> produce a temporary table, I think setting reduce number automa

Re: temporary table size is 0, which makes reducer number too small

2016-10-16 Thread
. 2016-10-14 18:59 GMT+08:00 David Ortiz <dpo5...@gmail.com>: > You can manually set the reducer number using the conf object among other > things. > > On Fri, Oct 14, 2016, 5:43 AM 陈竞 <cj.mag...@gmail.com> wrote: > >> hi, i found that if the pipeline produce tempo

temporary table size is 0, which makes reducer number too small

2016-10-14 Thread
Hi, I found that if the pipeline produces a temporary table, the reducer count for a temporary table whose input is itself a temporary table becomes too small, since the input temporary table has no content.
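
The effect described above can be illustrated with a minimal sketch, assuming the planner picks the reducer count from the input size in bytes. The class `ReducerEstimate`, the method `recommend`, and the figures used (1 GiB per reducer, a cap of 500) are hypothetical and chosen for illustration only; this is a simplified model, not Crunch's actual code.

```java
// Simplified model of a size-based reducer estimate (hypothetical names and values).
final class ReducerEstimate {
    private ReducerEstimate() {}

    // Returns 1 + inputBytes / bytesPerReducer, capped at maxReducers when the cap is positive.
    static int recommend(long inputBytes, long bytesPerReducer, int maxReducers) {
        if (inputBytes < 0 || bytesPerReducer <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and bytesPerReducer positive");
        }
        long estimate = 1 + inputBytes / bytesPerReducer;
        if (maxReducers > 0 && estimate > maxReducers) {
            return maxReducers;
        }
        return (int) estimate;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // A temporary table that has not been written yet reports size 0.
        System.out.println(recommend(0L, gib, 500));         // prints 1
        // The same table once its real size (here 200 GiB) is known.
        System.out.println(recommend(200L * gib, gib, 500)); // prints 201
    }
}
```

Under this model, any job planned while its input still reports size 0 gets a single reducer, however large the data turns out to be once the upstream job has run.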

Re: is the temporary output's sequence id stable in when pipeline runs every time

2016-09-23 Thread
est to support > it in most cases. > > On Thu, Sep 22, 2016 at 10:24 PM 陈竞 <cj.mag...@gmail.com> wrote: > >> I found out that Crunch gives every temporary output a sequence >> ID, which is generated when constructing the data graph. My question is: is >> the t

is the temporary output's sequence id stable in when pipeline runs every time

2016-09-22 Thread
I found out that Crunch gives every temporary output a sequence ID, which is generated when constructing the data graph. My question is: is the temporary output's sequence ID stable every time the pipeline runs?

Re: confused about the MapsideJoinStrategy, why use LoadLeftSideMapsideJoinStrategy, what if left table is too large to store in memory?

2016-05-10 Thread
+08:00 David Ortiz <dpo5...@gmail.com>: > Try MapsideJoinStrategy.create() > > On Mon, May 9, 2016, 9:29 PM 陈竞 <cj.mag...@gmail.com> wrote: > >> Hi, I'm very confused when I use MapsideJoinStrategy. The original >> constructor was deprecated, instead, LoadLeftSideMa

confused about the MapsideJoinStrategy, why use LoadLeftSideMapsideJoinStrategy, what if left table is too large to store in memory?

2016-05-09 Thread
we use LoadLeftSideMapsideJoinStrategy, we use A as the right side and B as the left side, which brings no improvement while adding a reverse DoFn -- 陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center Jing Chen HPCC.ICT.AC China

Re: confused about node split in MSCRPlanner.prepareFinalGraph

2016-05-04 Thread
think* the reason this works is because > this line: > > graph.getEdge(vertex, splitTail).addNodePath(headPath); > > doesn't actually do anything-- the headPath here is always empty, so > there's no impact to the final graph. I'd be curious if anything failed if > we removed

Re: crunch's pipeline doesn't support re-running from failed job?

2016-04-21 Thread
OK, thanks, I'll try it. 2016-04-21 19:41 GMT+08:00 David Ortiz <dpo5...@gmail.com>: > To get it to pick up from where it left off, you have to write out > intermediate data with WriteMode.CHECKPOINT > > On Thu, Apr 21, 2016, 3:42 AM 陈竞 <cj.mag...@gmail.com> wrote: > >>

crunch's pipeline doesn't support re-running from failed job?

2016-04-21 Thread
Hi, I'm new to Crunch. I have started using it and found that once the pipeline fails, it can't resume from the latest failed job when re-run. Thanks -- 陈竞, Institute of Computing Technology, Chinese Academy of Sciences, High Performance Computer Center Jing Chen HPCC.ICT.AC China