[ https://issues.apache.org/jira/browse/LUCENE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788719#action_12788719 ]

Michael McCandless commented on LUCENE-1377:
--------------------------------------------

Here's a semi-concrete proposal: how about we plan to move all
analyzers (Solr's, Lucene's core analyzers, snowball,
contrib/analyzers) into one place.

We could probably just use Lucene's existing contrib/analyzers as that
place.

We then change Solr's checkout/build/ship process to pull directly
from contrib/analyzers.  So when I check out Solr, I get Lucene's
contrib analyzers and test against them.  We also fix Lucene's build
scripts -- we'd have to build the analyzers first, and make them
available for core tests.

Any changes to contrib/analyzers must pass both Lucene's and Solr's
unit tests before being committed, which is great because it also
means more/better test coverage for all analyzer changes.

We may have some issues with someone being a committer on one project
but not another, but we can take those up on a case by case basis.
The worst case is we post a patch to a Solr or Lucene issue and a
committer picks it up, which would be fine.

This will require some one-time effort -- fixing the ant build scripts
for both Solr and Lucene, doing the initial move, etc.  I'm happy to
help out, but will probably need help with ant ;)

We could even promote contrib/analyzers to its own sub-project, but
I think that's probably overkill for now.

Could something like this work?


> Add HTMLStripReader and WordDelimiterFilter from SOLR
> -----------------------------------------------------
>
>                 Key: LUCENE-1377
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1377
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 2.3.2
>            Reporter: Jason Rutherglen
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> SOLR has two classes HTMLStripReader and WordDelimiterFilter which are very 
> useful for a wide variety of use cases.  It would be good to place them into 
> core Lucene.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
