Hi Soomyung,
I agree with Christian, this sounds fantastic!
First, we need to know a couple of things:
1. Are you the only author of the code? We need to get agreement from all
contributors. (When I browse CVS on the SourceForge site, the only author I
see is smlee0818, which I assume is you.)
2. Do you need permission from your employer to make this donation? If so,
we'll need your employer to submit a Corporate CLA (Contributor License
Agreement)[1] before we can accept the donation.
To get started, the first step is creating a Lucene JIRA issue here:
https://issues.apache.org/jira/browse/LUCENE - you'll need to create an ASF
JIRA account first if you don't already have one: click the Log In link at
the top right of the page, then click the Sign up link where it says Not a
member? Sign up for an account.
Once you've created a JIRA issue, you should make a compressed tarball of
everything you want to contribute - as far as I can tell, this is everything in
the lucenekorean sourceforge project in CVS under modules kr.dictionary,
kr.analysis.4x, and kr.morph - and then attach it to the JIRA issue, with
the MD5 hash for the tarball in the comment that you provide when you attach
the tarball to the issue.
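As a rough illustration of the tarball-plus-MD5 step, here is a minimal Python sketch. It assumes the three CVS modules have already been checked out into the current directory. The function names and the tarball file name are hypothetical, not part of any Lucene or ASF tooling.

```python
import hashlib
import tarfile
from pathlib import Path


def make_tarball(source_dirs, tarball_path):
    """Pack each directory into a single gzip-compressed tarball."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        for src in source_dirs:
            src = Path(src)
            # Store each module under its own directory name, not its full path.
            tar.add(src, arcname=src.name)
    return Path(tarball_path)


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (hypothetical file name; module names taken from the CVS layout):
# modules = ["kr.dictionary", "kr.analysis.4x", "kr.morph"]
# tarball = make_tarball(modules, "lucenekorean-contribution.tar.gz")
# print(md5_hex(tarball), tarball.name)
```

The printed hash is what would go in the JIRA attachment comment, so that anyone downloading the tarball can check it against the same value.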
Once you've created the JIRA issue and attached your contribution, we can make
progress on further steps that need to be taken: you should submit an
individual CLA[2] and a code grant[3], and I (in my role as Lucene PMC chair)
will be managing the IP clearance process[4][5].
See http://wiki.apache.org/lucene-java/HowToContribute for more information
about contributing.
I look forward to working with you on this - thank you for contributing!
Steve
[1] http://www.apache.org/licenses/cla-corporate.txt
[2] http://www.apache.org/licenses/icla.txt
[3] http://www.apache.org/licenses/software-grant.txt
[4] http://incubator.apache.org/ip-clearance/index.html
[5] http://incubator.apache.org/ip-clearance/ip-clearance-template.html
On Apr 24, 2013, at 7:00 AM, Christian Moen c...@atilika.com wrote:
Hello Soomyung,
Thanks a lot for this. This is very good news.
Let's await the PMC Chair's suggestion on next steps. See LUCENE-3305 to get
an idea of how the process went for Japanese.
If the process goes well, I'm happy to see how I can set aside some time
after Lucene Revolution to work on integrating this.
Best regards,
Christian Moen
アティリカ株式会社
http://www.atilika.com
On Apr 24, 2013, at 7:40 PM, 이수명 smlee0...@gmail.com wrote:
Hello Christian.
Thanks for your reply.
I'm happy to hear about a code grant process.
To make the dictionaries, I collected the words themselves and their features
from books and the internet.
I then organized all of the information that I collected to build the Korean
morphological analyzer.
The dictionaries are therefore my own work.
I think it is enough to attach a file (License Notice) that describes where
the dictionaries originate from and the kind of licensing (Apache License
2.0).
If that is not enough, please leave me a message and give me some guidance.
Thanks.
Soomyung Lee
2013/4/24 Christian Moen c...@atilika.com
Hello SooMyung,
Thanks a lot! It will be great to get Korean supported out-of-the-box in
Lucene/Solr.
In terms of process, I'll leave this to Steve Rowe, PMC Chair, to comment
on, but a code grant process sounds likely.
I see that the code itself has an Apache License 2.0, but could you
elaborate on where the dictionaries originate from and what kind of
licensing terms are applicable?
Many thanks,
Christian Moen
On Apr 24, 2013, at 2:05 PM, smlee0...@gmail.com wrote:
Hello,
I developed the Korean Analyzer and have distributed it since 2008.
Many people who use Lucene with Korean use it.
I posted it on SourceForge
(http://sourceforge.net/projects/lucenekorean)
Here is the CVS address:
d:pserver:anonym...@lucenekorean.cvs.sourceforge.net:/cvsroot/lucenekorean
KoreanAnalyzer consists of a Korean Morphological Analyzer, a Korean Dictionary
and a Korean Filter.
When using Lucene with Korean, one thinks of the CJK Analyzer.
But the CJK Analyzer is not well suited to Korean.
Korean has specific characteristics and requires morphological analysis when
extracting index keywords.
Korean Analyzer solves this problem with the Korean Morphological
Analyzer.
Korean Analyzer also has a feature for splitting compound nouns.
Now, I want to contribute the Korean Analyzer to the Lucene project.
Please let me know how to contribute it.
If you want to check the source code, please visit the SourceForge CVS
repository.
Best regards.
--
SooMyung Lee
Director of Research Center
Argonet co. ltd,
Manager of Lucene Korean Analyzer
http://korlucene.naver.com
Contact: +82-10-6480-5710