On 2010-06-27 17:34, Mattmann, Chris A (388J) wrote:
> Hi Julien,
> 
>>>      (a) svn copy NutchBase from GitHub to the nutchbase branch in
>>> http://svn.apache.org/repos/asf/nutch/branches/nutchbase bringing the ASF
>>> branch up to date.
>>
>> this seems like an unnecessary step. There has been an enormous amount of
>> changes between the nutchbase branch and the version on GitHub - pretty much
>> EVERY class has been modified + a lot of classes have been removed etc... the
>> nutchbase branch on svn is completely obsolete so I suggest that we simply
>> get rid of the nutchbase branch and move the code to trunk directly (after
>> doing c below of course)
> 
> That's the problem. If _every_ class has been modified like you state, then
> it's been modified outside of Apache and there is no SVN commit history and
> therefore no public log of the code that's been modified. We need to rectify
> that somehow...

I have to agree with Chris that this is problematic. It's also an issue
of Nutch community involvement, and of the fact that other committers
don't really know that code.

What about an intermediate solution: we delete the svn:nutchbase, import
the code from github on a branch, and then merge it piecewise according
to the major areas of functionality that you outlined? I understand that
this would be more work than importing it wholesale as trunk, but at
least the process of arriving at the final codebase will be clear.
Perhaps we could treat this as an investment in the education of the
rest of the team, which will pay off in the end because other developers
will be able to catch up sooner.

> I'm not sure I get it? How does seeing above deal with (c)? In terms of
> nutchbase branch merging with trunk, again, I'm a bit worried here since the
> three of you (Julien, Enis, and Doğacan) are the only ones that were
> significantly involved (correct me if I'm wrong) with the development of
> Nutchbase at Github, right, yet there are 7 Nutch PMC members and committers
> (one of which does not include Enis). How do you expect us to maintain the
> code you bring over unless those of us that were not involved in the Github
> development have some history/notion of what's been done vis-à-vis Github
> and Apache SVN?

There are precedents for this, e.g. a wholesale import of mapred+NDFS by
Doug and Mike C... but I have to say that it took a _very_ long time (1
year?) for other developers to get up to speed with the new code base...
and it could've been a failure if not for a steady involvement of Doug
throughout this process and long thereafter.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
