Regarding the location of the Filters and Factories ... I agree that the Filters would be best located in Lucene, as users of both packages would then have access.
What I'm struggling with is the timing of putting Filters into Lucene, and then Factories into Solr. The Factories in Solr would be useless until the Filters had been accepted and released in Lucene, then the Lucene version upgraded in Solr. What I'm inclined to do is release the Filters to both, and have the Factories point to the Solr version, until they become available in the Lucene version, then switch them over and drop the Solr version.

How is this handled with other new Filter/Factory sets? Just let me know, and I'll get the ball rolling on those.

I'm going to follow up on Protocol Buffers in response to some other messages I see coming in.

Thanks,
Todd Feak

-----Original Message-----
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 16, 2008 7:12 AM
To: [email protected]
Subject: Re: Offer to submit some custom enhancements

Hi Todd,

All of these sound good. Personally, I think analyzers like these belong in Lucene's contrib/analyzers package, with Solr factory implementations built on those, but that's your call.

As for the Protocol Buffers, I am assuming you mean: http://code.google.com/p/protobuf/

That is an Apache license, so it is fine to incorporate. Sounds like it might be a contrib to start, but that's just my take. Sounds like they might be worth using in SolrJ and for distributed, but am interested in how it compares to other similar technologies. Can you share your use case for them?

-Grant

On Oct 15, 2008, at 2:48 PM, Feak, Todd wrote:

> Reposting, as I inadvertently thread hijacked on the first one. My bad.
>
> Hi all,
>
> I have a handful of custom classes that we've created for our purposes
> here. I'd like to share them if you think they have value for the rest
> of the community, but I wanted to check here before creating JIRA
> tickets and patches.
>
> Here's what I have:
>
> 1. DoubleMetaphoneFilter and Factory. This replaces usage of the
> PhoneticFilter and Factory allowing access to set maxCodeLength() on
> the DoubleMetaphone encoder and access to the "alternate" encodings
> that the encoder provides for some words.
>
> 2. JapaneseHalfWidthFilter and Factory. Some Japanese characters (and
> Latin alphabet) exist in both a FullWidth and HalfWidth form. This
> filter normalizes by switching to the FullWidth form for all the
> characters. I have seen at least one JIRA ticket about this issue.
> This implementation doesn't rely on Java 1.6.
>
> 3. JapaneseHiraganaFilter and Factory. Japanese Hiragana can be
> translated to Katakana. This filter normalizes to Katakana so that
> data and queries can come in either way and get hits.
>
> Also, I have been requested to create a prototype that you may be
> interested in. I'm to construct a QueryResponseWriter that returns
> documents using Google's Protocol Buffers. This would rely on an
> existing patch that exposes the OutputStream, but I would like to
> start the work soon. Are there license concerns that would block
> sharing this with you? Is there any interest in this?
>
> Thanks for your consideration,
> Todd Feak

--------------------------
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
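
For anyone following the thread who wants a concrete picture of the character normalization described in items 2 and 3 above, here is a rough sketch. It is not the code Todd is offering; the class and method names are hypothetical, and it works on plain strings rather than Lucene tokens. It assumes the standard Unicode layout, where Hiragana U+3041..U+3096 sits exactly 0x60 below the matching Katakana, and printable ASCII U+0021..U+007E sits exactly 0xFEE0 below its FullWidth form. HalfWidth Katakana needs a lookup table and is left out here, as is the space character. Like the filter described above, it avoids java.text.Normalizer, so it does not need Java 1.6.

```java
// Hypothetical illustration only; not the JapaneseHalfWidthFilter or
// JapaneseHiraganaFilter from the thread.
final class KanaWidthSketch {
    private KanaWidthSketch() {}

    /** Shifts Hiragana (U+3041..U+3096) to the corresponding Katakana. */
    public static String hiraganaToKatakana(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x3041 && c <= 0x3096) {
                c = (char) (c + 0x60); // Katakana block mirrors Hiragana
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Shifts printable ASCII (U+0021..U+007E) to its FullWidth form. */
    public static String halfWidthLatinToFullWidth(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x21 && c <= 0x7E) {
                c = (char) (c + 0xFEE0); // e.g. 'A' becomes U+FF21
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

Applying the same mapping at both index and query time is what lets data and queries "come in either way and get hits"; in a real TokenFilter the same arithmetic would presumably run over each token's term text.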
