[ 
https://issues.apache.org/jira/browse/HAMA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089481#comment-13089481
 ] 

Thomas Jungblut commented on HAMA-413:
--------------------------------------

Yes, we can keep that, but then the tasks would operate on the same partitioned 
file, since the files are tagged with the groom's name.
Do you see the problem?

Another idea would be to keep this partitioning and let each task determine 
how many other tasks are running for this job on this groom. Each task would 
then take every record modulo the number of tasks on this groom to decide 
whether the record is its own.
This is ultra-hacky, but I don't see a non-hacky solution without 
implementing a whole IO system now...
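
A minimal sketch of the modulo idea, only to make the proposal concrete. It is 
not taken from any HAMA-413 patch: the class name, the method names, and the 
assumption that each task knows its own index among the tasks of the same job 
on the same groom are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hama: splits one groom-tagged partition
// file among the tasks of a job that run on that groom.
class GroomLocalSplit {

    // Returns true if the task with the given groom-local index should
    // process the record at recordIndex (record index modulo task count).
    static boolean ownsRecord(long recordIndex, int localTaskIndex, int tasksOnGroom) {
        if (tasksOnGroom <= 0) {
            throw new IllegalArgumentException("tasksOnGroom must be positive");
        }
        if (localTaskIndex < 0 || localTaskIndex >= tasksOnGroom) {
            throw new IllegalArgumentException("localTaskIndex out of range");
        }
        if (recordIndex < 0) {
            throw new IllegalArgumentException("recordIndex must be non-negative");
        }
        return recordIndex % tasksOnGroom == localTaskIndex;
    }

    // Filters an in-memory stand-in for the partition file down to the
    // records that belong to one task.
    static <T> List<T> recordsFor(List<T> partition, int localTaskIndex, int tasksOnGroom) {
        List<T> mine = new ArrayList<>();
        for (int i = 0; i < partition.size(); i++) {
            if (ownsRecord(i, localTaskIndex, tasksOnGroom)) {
                mine.add(partition.get(i));
            }
        }
        return mine;
    }
}
```

With two tasks on a groom and records a, b, c, d, e, task 0 would take a, c, e 
and task 1 would take b, d. Every task still reads the whole file and skips 
the records it does not own, which is part of why this is hacky.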

> Remove limitation on the number of tasks
> ----------------------------------------
>
>                 Key: HAMA-413
>                 URL: https://issues.apache.org/jira/browse/HAMA-413
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.3.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 0.4.0
>
>         Attachments: HAMA-413_v01.patch, HAMA-413_v02.patch, 
> HAMA-413_v03.patch
>
>
> By HAMA-410 patch, BSPPeer object will be constructed at child process. Now 
> we can just remove limitation on the number of tasks.
> Here's TODO list:
> 1. The number of tasks per groom should be configurable e.g., 
> 'bsp.local.tasks.maximum'.
> 2. The 'totalTaskCapacity' should be calculated at 
> BSPMaster.getClusterStatus().
> 3. When scheduling tasks, consider how to allocate them.
> 4. Each BSPPeer should know all created peers of Hama cluster by job. It can 
> be listed based on actions of GroomServer.
> 5. In examples, 'cluster.getGroomServers()' can be changed to 
> 'cluster.getMaxTasks()'.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
