The blocker is a disagreement among a small group of PMCers. I have
never seen a productive discussion about input partitioning, even when
input partitioning was the topic. VertexInputReader, DiskVerticesInfo,
and SpillingQueue were always mixed in. Hence, I still don't know
whether you understood or not.

To be blunt, you have offered no opinion on the plans for 0.6.1 or the
0.6.2 roadmap, and you didn't vote on 0.6.1. Furthermore, I felt that
you want to create your own branch. Is this a tacit objection, a
misunderstanding, or a gesture of defiance?

On Sun, May 12, 2013 at 10:47 PM, Suraj Menon <[email protected]> wrote:
> We've had discussions on the same many times.
>
> "But please don't block other developments" - I want to understand where
> the development is blocked especially for partitioning.
>
> -Suraj
>
>
> On Sun, May 12, 2013 at 6:54 AM, Edward J. Yoon <[email protected]> wrote:
>
>> Hi dev (especially BSP core committers and PMCers),
>>
>> First of all, input re-partitioning is a very important and
>> unavoidable part of Apache Hama. Since there are still people who say
>> "as if everything can be settled by Spilling Queue with something" or
>> "It should be also able to solve for the large input without large
>> cluster", let me explain again.
>>
>> Restricting the number of task processors to the number of input block
>> files means that both of the situations below are problematic:
>>
>> Case 1. A user wants to process 1GB of input with 1,000 tasks on a large cluster.
>> Case 2. A user wants to process 10GB of input with 3 tasks on a small cluster.
>>
>> I believe this part has higher priority than other issues, such as
>> VertexInputReader and Spilling Queue. Hence, please don't mix everything
>> together when we talk about this in the future. To re-partition raw
>> data and create partitions as desired, we currently have a
>> PartitioningJobRunner. So, before working on future projects, please
>> test it with various scenarios, for example, whether it works well with
>> compressed files, the latest Hadoop (HDFS 2.0), or on a large cluster.
>>
>> Second, there is a lack of active discussion on the roadmap, and a
>> difference of opinion on the release. There's a limit as to what we can
>> do. Moreover, as I mentioned above, there are many high-priority issues.
>> I don't understand why you need to develop BSP core or create separate
>> branches without working together on the basic issues.
>>
>> Of course, research tasks are fine. If you want to work on them in
>> your free time, then feel free to do so. But please don't block other
>> developments.
>>
>> I hope you understand my meaning.
>> Thanks.
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>



--
Best Regards, Edward J. Yoon
@eddieyoon
