Hi guys,

> >>>      (a) svn copy NutchBase from GitHub to the nutchbase branch in
> >>> http://svn.apache.org/repos/asf/nutch/branches/nutchbase bringing the
> >>> ASF branch up to date.
> >>
> >> this seems like an unnecessary step. There has been an enormous amount
> >> of changes between the nutchbase branch and the version on GitHub -
> >> pretty much EVERY class has been modified + a lot of classes have been
> >> removed etc... the nutchbase branch on svn is completely obsolete so I
> >> suggest that simply get rid of the nutchbase branch and move the code to
> >> trunk directly (after doing c below of course)
> >
> > That's the problem. If _every_ class has been modified like you state,
> > then it's been modified outside of Apache and there is no SVN commit
> > history and therefore no public log of the code that's been modified. We
> > need to rectify that somehow...
>
> I have to agree with Chris that this is problematic, it's also an issue
> of the Nutch community involvement, and the fact that other committers
> don't really know that code.
>

I certainly agree that this is an issue. My point was that the code on the
SVN nutchbase branch is badly out of date, and that merging it with the Git
version would not necessarily be the best way of getting a clear picture of
the new architecture. You are right that putting it into trunk directly is
probably a bit drastic.


>
> What about an intermediate solution: we delete the svn:nutchbase, import
> the code from github on a branch, and then merge it piecewise according
> to the major areas of functionality that you outlined? I understand that
> this would be more work than importing it wholesale as trunk, but at
> least the process of arriving at the final codebase will be clear.
> Perhaps we could treat this as an investment in the education of the
> rest of the team, which will pay off in the end because other developers
> will be able to catch up sooner.
>

That would be a good option. If we do as you suggest, we would get a
good balance between the time spent and traceability.


J.
-- 
DigitalPebble Ltd

Open Source Solutions for Text Engineering
http://www.digitalpebble.com
