[ 
https://issues.apache.org/jira/browse/CRUNCH-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Whitacre updated CRUNCH-624:
----------------------------------
    Fix Version/s: 0.15.0

> temporary table size is 0, which makes reducer number too small
> ---------------------------------------------------------------
>
>                 Key: CRUNCH-624
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-624
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: JingChen
>            Assignee: Josh Wills
>             Fix For: 0.15.0
>
>         Attachments: CRUNCH-624.patch
>
>
> If the pipeline produces a temporary table, the number of reducers for a 
> temporary table whose input is itself a temporary table may become very 
> small in some cases, because the input temporary table has no content yet.
> I believe I have found the root cause in my case:
> {code:title=PCollectionImpl.java|borderStyle=solid}
> public void materializeAt(SourceTarget<S> sourceTarget) {
>   this.materializedAt = sourceTarget;
>   this.size = materializedAt.getSize(getPipeline().getConfiguration());
> }
> @Override
> public long getSize() {
>     if (size < 0) {
>         this.size = getSizeInternal();
>     }
>     return size;
> }
> {code}
> PCollectionImpl.materializeAt(sourceTarget) is invoked when a node is split 
> to create a temporary table. The sourceTarget is bound to the new temporary 
> table, whose size is 0 because its path was just created, so this.size is 
> set to 0. Later, when getSize() is invoked to set the number of reducers, 
> the size < 0 check fails (size is 0), getSizeInternal() is never called, and 
> 0 is returned, which makes the number of reducers too small.
> So I think materializeAt() should check the sourceTarget's size, as shown 
> below:
> {code:title=PCollectionImpl.java|borderStyle=solid}
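> public void materializeAt(SourceTarget<S> sourceTarget) {
>   this.materializedAt = sourceTarget;
>   long size = materializedAt.getSize(getPipeline().getConfiguration());
>   // Only cache a positive size; for a freshly created (empty) target,
>   // this.size is left unchanged so getSize() can still compute it.
>   if (size > 0) {
>     this.size = size;
>   }
> }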
> {code}
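
The effect described above can be reproduced in isolation with a small, self-contained sketch. Everything in it is hypothetical and only for illustration: `CachedSizeCollection` stands in for the caching behaviour of PCollectionImpl, `upstreamEstimate` stands in for whatever getSizeInternal() would return, and `reducersFor` is an assumed "one reducer per N bytes" heuristic, not Crunch's actual reducer calculation.

```java
/** Simplified model of the size caching discussed in CRUNCH-624; not the real Crunch classes. */
public class SizeCacheSketch {

    /** Hypothetical stand-in for a collection that caches its estimated size. */
    static class CachedSizeCollection {
        private long size = -1;               // negative means "not computed yet"
        private final long upstreamEstimate;  // assumed result of getSizeInternal()
        private final boolean guardZero;      // true = apply the proposed size > 0 check

        CachedSizeCollection(long upstreamEstimate, boolean guardZero) {
            this.upstreamEstimate = upstreamEstimate;
            this.guardZero = guardZero;
        }

        /** Models materializeAt(): records the on-disk size of the target. */
        void materializeAt(long targetSizeOnDisk) {
            if (!guardZero || targetSizeOnDisk > 0) {
                this.size = targetSizeOnDisk;
            }
        }

        /** Models getSize(): only falls back to the estimate while size is negative. */
        long getSize() {
            if (size < 0) {
                size = upstreamEstimate;
            }
            return size;
        }
    }

    /** Assumed heuristic: one reducer per bytesPerReducer of input, never fewer than one. */
    static int reducersFor(long inputBytes, long bytesPerReducer) {
        return (int) Math.max(1, (inputBytes + bytesPerReducer - 1) / bytesPerReducer);
    }

    public static void main(String[] args) {
        long estimate = 10L * 1024 * 1024 * 1024;  // 10 GiB expected from upstream
        long perReducer = 1024L * 1024 * 1024;     // 1 GiB per reducer

        // Current behaviour: the empty temporary path pins the cached size to 0.
        CachedSizeCollection current = new CachedSizeCollection(estimate, false);
        current.materializeAt(0);

        // Proposed behaviour: a size of 0 is ignored, so the estimate is still used.
        CachedSizeCollection patched = new CachedSizeCollection(estimate, true);
        patched.materializeAt(0);

        System.out.println("current reducers: " + reducersFor(current.getSize(), perReducer));
        System.out.println("patched reducers: " + reducersFor(patched.getSize(), perReducer));
    }
}
```

Under these assumptions the unguarded version yields a single reducer for an input expected to be about 10 GiB, while the guarded version yields ten. One trade-off of the guard is that a temporary table that is legitimately empty is no longer cached as size 0 and falls back to the upstream estimate instead.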



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
