Hello Robert,

That's a good question. The core SEN is under LGPL, yes. However, I didn't need to make changes to that, though given that there are 2 versions floating around, I think it needs a good home.
But the glue layer is under the Apache 2.0 license, and that's the part that needed fixing. I think that means it's ASF / contrib compatible? There are also 2 other ancillary libraries and some source dictionaries which I need to research.

Working from the other direction, which might give you some ideas: the goal is to make this more accessible. It'd be really nice if, in a Lucene distribution, the SEN library could be switched on as easily as CJK. Or at most you'd run an ant script to fetch all the parts and assemble it.

As it stands now, I think it's not used much because it's a bit complex to set up, even prior to May '09's change, and most of its users discuss it in Japanese.

So that's the goal; I'm very open to ideas on the "how".

Mark

--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513

On Mon, Oct 12, 2009 at 11:10 AM, Robert Muir <rcm...@gmail.com> wrote:

> Mark, does this mean Sen will be under the Apache license? (it is currently
> LGPL)
>
> On Mon, Oct 12, 2009 at 1:46 PM, Mark Bennett <mbenn...@ideaeng.com> wrote:
>
>> Hi folks,
>>
>> I've been working to fix the Japanese SEN morphological analyzer, which is
>> currently hosted at:
>> https://sen.dev.java.net
>>
>> To review, Japanese doesn't use whitespace for word breaks. The
>> traditional approach to CJK (Chinese, Japanese, Korean) is to use bigram
>> character pairs in the index. While this works to a point, some believe
>> that using proper word breaks provides better results.
>>
>> The "lucene-ja" glue layer between Lucene and the core SEN library broke
>> in May of '09 when a fix was made in Lucene:
>> http://issues.apache.org/jira/browse/LUCENE-1636
>>
>> Uwe S. had a very good insight for a quick fix, and I have been cleaning
>> up some other issues with the code. I have also spoken with the author,
>> Takashi Okamoto, and he is fine with having this moved from java.net to
>> ASF; I think it will be easier for folks to find and use it if it's in ASF.
>>
>> I'm not quite ready to submit a patch, but the Wiki suggests emailing the
>> list with the idea in advance. There are some packaging questions I'll
>> have; there are actually quite a few parts. Also, the wiki didn't quite
>> spell out the process to get things into contrib, beyond emailing and
>> submitting a patch. I also plan to eventually submit a Solr-specific
>> wrapper to the solr dev list, to allow for dynamic config changes to be
>> made from Solr's schema. But since the original code was Lucene based, and
>> it provides the broadest reach, I think having it in core Lucene would be
>> a good start.
>>
>> Any comments, suggestions, or mentor volunteers? :-)
>>
>> Mark
>>
>> --
>> Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
>> Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
>>
>
> --
> Robert Muir
> rcm...@gmail.com
>