[jira] [Commented] (HAMA-647) Make the input spliter robustly

Edward J. Yoon (JIRA) Mon, 24 Sep 2012 08:16:14 -0700

    [ 
https://issues.apache.org/jira/browse/HAMA-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461851#comment-13461851
 ]


Edward J. Yoon commented on HAMA-647:
-------------------------------------

{code}
   protected long computeSplitSize(long goalSize, long minSize, long blockSize) {
-    return Math.max(minSize, Math.min(goalSize, blockSize));

+    if (goalSize > blockSize) {
+      return Math.max(minSize, Math.max(goalSize, blockSize));
+    } else {
+      return Math.max(minSize, Math.min(goalSize, blockSize));
+    }
{code}

This is a good catch.
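To make the effect of the change concrete, here is a small standalone sketch that compares the two versions. It is not Hama code: the class name SplitSizeDemo and the sample sizes (64 MB blocks, 256 MB goal) are made up for illustration. Note that both branches of the patched version reduce to Math.max(minSize, goalSize), so blockSize no longer caps the split size.

```java
class SplitSizeDemo {
  // Original formula: the split size is capped at blockSize.
  static long oldSplitSize(long goalSize, long minSize, long blockSize) {
    return Math.max(minSize, Math.min(goalSize, blockSize));
  }

  // Patched formula, copied from the diff quoted above.
  static long newSplitSize(long goalSize, long minSize, long blockSize) {
    if (goalSize > blockSize) {
      return Math.max(minSize, Math.max(goalSize, blockSize));
    } else {
      return Math.max(minSize, Math.min(goalSize, blockSize));
    }
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    long blockSize = 64 * mb;  // assumed block size, for illustration only
    long minSize = 1;
    long goalSize = 256 * mb;  // e.g. 1 GB of input divided among 4 tasks

    // Old: capped at the block size, so more splits than tasks.
    System.out.println(oldSplitSize(goalSize, minSize, blockSize) / mb); // 64
    // New: follows the goal size, so the split count matches the task count.
    System.out.println(newSplitSize(goalSize, minSize, blockSize) / mb); // 256
  }
}
```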

By the way,

{code}
@@ -214,9 +215,13 @@
         }
       }
       return splits.toArray(new FileSplit[splits.size()]);
+    } else if (files.length == 1) {
+      goalSize = totalSize / (numSplits == 0 ? 1 : numSplits - 1);
{code}

If files.length == 1 and numSplits == 1, Java will throw an ArithmeticException 
(integer division by zero), because numSplits - 1 equals zero, correct?
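A minimal sketch of the problem, outside of Hama: the class and method names below are hypothetical. The first method uses the expression from the patch as written. The second shows one possible guard, which is only an assumption about a fix and not necessarily what the final patch should do.

```java
class GoalSizeDemo {
  // Expression exactly as written in the patch.
  static long goalSizeFromPatch(long totalSize, int numSplits) {
    return totalSize / (numSplits == 0 ? 1 : numSplits - 1);
  }

  // Hypothetical guard: treat numSplits <= 1 as a single split
  // so the divisor can never be zero.
  static long goalSizeGuarded(long totalSize, int numSplits) {
    return totalSize / (numSplits <= 1 ? 1 : numSplits - 1);
  }

  public static void main(String[] args) {
    long totalSize = 1000L;  // made-up input size in bytes

    try {
      goalSizeFromPatch(totalSize, 1);  // divisor is 1 - 1 == 0
    } catch (ArithmeticException e) {
      System.out.println("numSplits == 1 fails: " + e.getMessage());
    }

    System.out.println(goalSizeGuarded(totalSize, 1)); // 1000
    System.out.println(goalSizeGuarded(totalSize, 5)); // 250
  }
}
```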
                
> Make the  input spliter robustly
> --------------------------------
>
>                 Key: HAMA-647
>                 URL: https://issues.apache.org/jira/browse/HAMA-647
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp core
>    Affects Versions: 0.5.0, 0.6.0
>            Reporter: Yuesheng Hu
>            Assignee: Yuesheng Hu
>            Priority: Critical
>             Fix For: 0.6.0
>
>         Attachments: HAMA-647-2.patch, HAMA-647.patch
>
>
> Currently, the splitter in FileInputFormat is based on MapReduce's 
> splitter. But Hama differs from MapReduce: a Hama task cannot be left 
> pending until a slot becomes free. So the current splitter is not suitable 
> for Hama. When the input file is small this may be fine, but when the input 
> is very large, the number of splits will be very large too, even if our 
> cluster is powerful enough to handle the input. For more details, please 
> see the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
