[
https://issues.apache.org/jira/browse/HAMA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089481#comment-13089481
]
Thomas Jungblut commented on HAMA-413:
--------------------------------------
Yes, we can keep that, but then the tasks would operate on the same partitioned
file, since the files are tagged with the groom's name.
Do you see the problem?
Another idea would be to keep this partitioning and just let each task
determine how many other tasks are running for this job on its groom, and then
apply a modulo (record index % number of tasks on this groom) to decide which
task handles each record.
This is ultra-hacky, but I don't see a non-hacky solution without
implementing a whole IO system now...
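The modulo idea above could be sketched roughly like this. This is only an illustration of the record-assignment scheme, not Hama code; the class and field names (RecordFilter, numTasksOnGroom, taskIndexOnGroom) are made up for the example:

```java
/**
 * Sketch of the modulo-based record assignment described above: when several
 * tasks of one job share the same partitioned file on a groom, each task
 * claims only the records whose index is congruent to its own index.
 */
public class RecordFilter {
    private final int numTasksOnGroom;  // tasks of this job running on this groom (assumed known)
    private final int taskIndexOnGroom; // this task's 0-based index among them (assumed known)

    public RecordFilter(int numTasksOnGroom, int taskIndexOnGroom) {
        this.numTasksOnGroom = numTasksOnGroom;
        this.taskIndexOnGroom = taskIndexOnGroom;
    }

    /** A task processes record i iff i % numTasksOnGroom == taskIndexOnGroom. */
    public boolean shouldProcess(long recordIndex) {
        return recordIndex % numTasksOnGroom == taskIndexOnGroom;
    }

    public static void main(String[] args) {
        // Two tasks sharing one partitioned file: each one takes every other record.
        RecordFilter task0 = new RecordFilter(2, 0);
        RecordFilter task1 = new RecordFilter(2, 1);
        for (long i = 0; i < 6; i++) {
            System.out.println("record " + i + " -> task " + (task0.shouldProcess(i) ? 0 : 1));
        }
    }
}
```

Each task would read the whole file but skip records not assigned to it, which is why it only makes sense as a stopgap until real per-task IO exists.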
> Remove limitation on the number of tasks
> ----------------------------------------
>
> Key: HAMA-413
> URL: https://issues.apache.org/jira/browse/HAMA-413
> Project: Hama
> Issue Type: Sub-task
> Components: bsp
> Affects Versions: 0.3.0
> Reporter: Edward J. Yoon
> Assignee: Edward J. Yoon
> Fix For: 0.4.0
>
> Attachments: HAMA-413_v01.patch, HAMA-413_v02.patch,
> HAMA-413_v03.patch
>
>
> By the HAMA-410 patch, the BSPPeer object will be constructed in the child
> process. Now we can simply remove the limitation on the number of tasks.
> Here's TODO list:
> 1. The number of tasks per groom should be configurable e.g.,
> 'bsp.local.tasks.maximum'.
> 2. The 'totalTaskCapacity' should be calculated at
> BSPMaster.getClusterStatus().
> 3. When scheduling tasks, consider how to allocate them.
> 4. Each BSPPeer should know all created peers of Hama cluster by job. It can
> be listed based on actions of GroomServer.
> 5. In examples, 'cluster.getGroomServers()' can be changed to
> 'cluster.getMaxTasks()'.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira