Re: FuzzyQuery vs SlowFuzsyQuery docs? -- was: Re: [jira] [Commented] (LUCENE-2667) Fix FuzzyQuery's defaults, so its fast.

Mark Bennett Fri, 09 Nov 2012 16:53:46 -0800

Hi Robert,

I acknowledge your "-1" vote, and I'm guessing that your objection is maybe
70% "scalability", and only 30% use-case?


The older Levenshtein stuff has been around for a long time, scalable or
not, and is already in real systems.

You seem to have a very "binary" view of code being "in" or "out".  Is there any
room in your world-view of code for "gray code", unsupported, incubator,
what-have-you?  Maybe analogous to people who jailbreak their iPhones or
something?

You're an important part of the community, and working at Lucid, etc., and
clearly concerned about software quality.  When smart folks like you have
such sharp opinions I do try to ponder them against my own circumstances.

And on the quality of the old code, was it just the scalability, or were
there other concerns such as stability, coding style, or possibly
inconsistent results?

Isn't the sandbox, plus a warning in the Javadocs, sufficient?

I'm harping on this because I'm really between a rock and a hard place, and
I've also posted another question.

Just trying to understand your very strong opinions, and I thank you for
your patience in this matter.  This issue is either going to fix or break
my weekend / next deliverable.

Sincere thanks,
Mark

--
Mark Bennett / New Idea Engineering, Inc. / [email protected]
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513


On Fri, Nov 9, 2012 at 4:37 PM, Robert Muir <[email protected]> wrote:

> I'm -1 for having unscalable shit in lucene's core. This query should
> have never been added.
>
> I don't care if a few people complain because they aren't using
> lowercasefilter or some other insanity. Fix your analysis chain. I
> don't have any sympathy.
>
> On Fri, Nov 9, 2012 at 7:35 PM, Jack Krupansky <[email protected]>
> wrote:
> > +1 for permitting a choice of fuzzy query implementation.
> >
> > I agree that we want a super-fast fuzzy query for simple variations, but I
> > also agree that we should have the option to trade off speed for function.
> >
> > But I am also sympathetic to assuring that any core Lucene features be as
> > performant as possible.
> >
> > Ultimately, if there was a single fuzzy query implementation that did
> > everything for everybody all of the time, that would be the way to go, but
> > if choices need to be made to satisfy competing goals, we should support
> > going that route.
> >
> > -- Jack Krupansky
> >
> > From: Mark Bennett
> > Sent: Friday, November 09, 2012 3:48 PM
> > To: [email protected]
> > Subject: Re: FuzzyQuery vs SlowFuzsyQuery docs? -- was: Re: [jira]
> > [Commented] (LUCENE-2667) Fix FuzzyQuery's defaults, so its fast.
> >
> > Hi Robert,
> >
> > On Thu, Sep 13, 2012 at 7:39 PM, Robert Muir <[email protected]> wrote:
> >>
> >> ...
> >> ... I'm strongly against having this
> >> unscalable garbage in lucene's core.
> >>
> >> There is no use case for ed > 2, thats just crazy.
> >
> >
> > I promise you there ARE use cases for edit distances > 2, especially with
> > longer words.  Due to NDA I can't go into details.
> >
> > Also ed>2 can be useful when COMBINING that low-quality part of the search
> > with other sub-queries, or additional business rules.  Maybe instead of
> > boiling an ocean this lets you just boil the sea.  ;-)
> >
> > I won't comment on the quality of the older Levenshtein code, or the likely
> > very slow performance, nor where the code should live, etc.
> >
> > But your statement about "no use case for ed > 2" is simply not true.
> > (whether you'd agree with any of them or not is certainly another matter)
> >
> > I understand your concerns about not having it be the default.  (or maybe
> > having a giant warning message or something, whatever)
> >
> >> --
> >> lucidworks.com
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
> >
>
