As I already suggested on this list, I would really like to move the
LanguageIdentifier class (and profiles) to an independent Lucene
sub-project (and the MimeType repository too).
I don't remember why, but there were some objections to this...
I think most people agree that it would be worthwhile to un-tie
this component from Nutch internals. The only objections were
related not to the idea itself, but to the management aspects of
creating a full-blown sub-project, in terms of both the initial setup
and the continuing maintenance. An alternative solution was
proposed (creating a contrib/ package). This would still help to
separate the code from Nutch internals, so that it can be used in
other projects, but it would require much less effort to set up and
maintain.
+1. What about the Lucene sandbox, or just opening a SourceForge project
with the Apache 2 license? Then we can just use the jar.
Stefan
- Re: lang identifier and nutch analyzer in trunk Stefan Groschupf