+1. We don't use Solr, but we do have quite a lot of medium- and
short-sized documents, plus heaps of metadata fields.
I'm yet to read Uwe's example, but I feel I'm a bit misunderstood by
some of you. My gripe with the new API is not that it
Did you read it yet? What do you think about it?
I'm not responding to just you there, but more to the growing
pack of those speaking against the new API. I don't see specific
issues being brought up - the only issues I have seen brought up have
been addressed in JIRA issues that have received no comments
indicating the fix was not
On Mon, Aug 10, 2009 at 9:12 PM, Grant Ingersoll <gsing...@apache.org> wrote:
Or... and this is one crazy idea... maybe we should simply release 3.0
next, not removing any deprecated APIs until 3.1 or later. Ie, normally
software with this many major changes would get an X.0
release; I
On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch <busch...@gmail.com> wrote:
There was a performance test in Solr that apparently ran much slower
after upgrading to the new Lucene jar. This test is testing a rather
uncommon scenario: very very short documents.
Actually, it's more uncommon than that:
On Tue, Aug 11, 2009 at 6:50 AM, Robert Muir <rcm...@gmail.com> wrote:
On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch <busch...@gmail.com> wrote:
There was a performance test in Solr that apparently ran much slower
after upgrading to the new Lucene jar. This test is testing a rather
uncommon
On Aug 11, 2009, at 4:28 AM, Michael Busch wrote:
I'm not just responding to just you there, but more to the growing
pack of those speaking against the new API. I don't see specific
issues being brought up - the only issues I have seen brought up
have been addressed in JIRA issues that
On Tue, Aug 11, 2009 at 15:09, Yonik Seeley <yo...@lucidimagination.com> wrote:
On Tue, Aug 11, 2009 at 6:50 AM, Robert Muir <rcm...@gmail.com> wrote:
On Tue, Aug 11, 2009 at 4:28 AM, Michael Busch <busch...@gmail.com> wrote:
There was a performance test in Solr that apparently ran much slower
after
Earwin Burrfoot wrote:
The only person that tried to disprove this claim is Uwe. Others
either say the problems are solved, so it's okay to move to the new
API, or this will be usable when flexindexing arrives.
Others (not me) have spent a lot of time going over this before (more
than once I
The only person that tried to disprove this claim is Uwe. Others
either say the problems are solved, so it's okay to move to the new
API, or this will be usable when flexindexing arrives.
Others (not me) have spent a lot of time going over this before (more than
once I think) - they prob are
Earwin Burrfoot wrote:
The only person that tried to disprove this claim is Uwe. Others
either say the problems are solved, so it's okay to move to the new
API, or this will be usable when flexindexing arrives.
Others (not me) have spent a lot of time going over this before (more than
I think extensible analysis (the new TokenStream API) is a net
positive: it gives us strongly typed and high performance
extensibility to a Token, so apps can now add whatever attrs they
want.
And, I see it as the first (of 3) big legs that we need to reach
flexible indexing. We really have to
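To make "strongly typed extensibility" concrete, here is a toy sketch of the attribute pattern the new API is built around. These are made-up stand-ins, not the real Lucene classes (the real ones live in org.apache.lucene.util.AttributeSource and org.apache.lucene.analysis.tokenattributes), PartOfSpeechAttribute is a hypothetical app-defined attribute, and the code uses modern Java for brevity:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model (not the real Lucene classes) of the attribute-based
// TokenStream API: a stream exposes typed attributes instead of a fixed Token.
public class AttributeDemo {

    // Attributes are plain classes; apps can define their own, e.g. this
    // hypothetical part-of-speech attribute.
    static class TermAttribute { String term = ""; }
    static class PartOfSpeechAttribute { String pos = "unknown"; }

    // A minimal AttributeSource: one instance per attribute class, shared by
    // the whole chain, looked up in a strongly typed way.
    static class AttributeSource {
        private final Map<Class<?>, Object> attrs = new LinkedHashMap<>();

        @SuppressWarnings("unchecked")
        <T> T addAttribute(Class<T> clazz) {
            return (T) attrs.computeIfAbsent(clazz, c -> {
                try { return c.getDeclaredConstructor().newInstance(); }
                catch (Exception e) { throw new RuntimeException(e); }
            });
        }
    }

    public static void main(String[] args) {
        AttributeSource source = new AttributeSource();
        TermAttribute term = source.addAttribute(TermAttribute.class);
        // An app-defined attribute rides along without changing Token itself:
        PartOfSpeechAttribute pos = source.addAttribute(PartOfSpeechAttribute.class);

        term.term = "lucene";
        pos.pos = "noun";

        // Repeated lookups return the same instance (no casts, no string keys).
        if (source.addAttribute(TermAttribute.class) != term)
            throw new AssertionError("expected the same TermAttribute instance");
        System.out.println(term.term + "/" + pos.pos);  // prints "lucene/noun"
    }
}
```

Because lookup is by class, every filter in the chain shares the same attribute instances, which is exactly why the "who clears them?" question below matters.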
On 08/11/2009 08:22 AM, Michael McCandless wrote:
I do still think a longish 2.9 beta is warranted, if we can succeed in
getting users outside the dev group to kick the tires and uncover
stuff.
I think a beta would be a great idea. Not sure it needs to be longish.
Having not looked at it,
From: Shai Erera [mailto:ser...@gmail.com]
Sent: Monday, August 10, 2009 11:13 PM
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
It sounds like the 'old' API should stay a bit longer than 3.0. We'd
like to give more people a chance to experiment w
branch release maintenance would be a new thing.)
Steve
-Original Message-
From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
Sent: Tuesday, August 11, 2009 12:51 PM
To: java-dev@lucene.apache.org
Subject: Re: Beta (was Re: who clears attributes?)
I thought 2.9 was on track
On 8/11/09 4:13 AM, Grant Ingersoll wrote:
On Aug 11, 2009, at 4:28 AM, Michael Busch wrote:
I'm not just responding to just you there, but more to the growing
pack of those speaking against the new API. I don't see specific
issues being brought up - the only issues I have seen brought up
On Aug 11, 2009, at 3:21 PM, Michael Busch wrote:
On 8/11/09 4:13 AM, Grant Ingersoll wrote:
On Aug 11, 2009, at 4:28 AM, Michael Busch wrote:
I'm not just responding to just you there, but more to the
growing pack of those speaking against the new API. I don't see
specific issues
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
Uwe,
Is this example available? I think that an example like this would help the
user community see the current value in the change. At least, I'd love to
see the code for it.
-- DM
On 08/10/2009 06:49 PM, Uwe Schindler wrote:
UIMA
The new API
CharTokenizer.incrementToken() clears *all* attributes in the entire
tokenizer chain.
StandardTokenizer.incrementToken() clears only the term attribute.
So... which is right? Seems like the tokenizer should be responsible?
On a performance related note, CharTokenizer.clearAttributes() could be
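For anyone following along without the code in front of them, the difference between the two clearing policies can be shown with a toy model (made-up classes, not real Lucene code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two clearing policies in question: a tokenizer that clears
// every attribute per token vs. one that resets only the term text, leaving
// other attributes holding their previous values.
public class ClearingDemo {

    static class Attributes {
        String term = "";
        String payload = null;   // e.g. set by some filter in the chain

        void clearAll() { term = ""; payload = null; }
        void clearTermOnly() { term = ""; }
    }

    // Emits terms; 'clearEverything' selects the policy under discussion.
    static List<String> run(boolean clearEverything) {
        Attributes attrs = new Attributes();
        List<String> seenPayloads = new ArrayList<>();
        String[] terms = { "foo", "bar" };
        for (String t : terms) {
            if (clearEverything) attrs.clearAll(); else attrs.clearTermOnly();
            attrs.term = t;
            // A filter sets a payload for "foo" only:
            if (t.equals("foo")) attrs.payload = "P";
            seenPayloads.add(String.valueOf(attrs.payload));
        }
        return seenPayloads;
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // [P, null] - payload cleared per token
        System.out.println(run(false));  // [P, P]    - stale payload leaks into "bar"
    }
}
```

With the clear-everything policy the payload set for "foo" does not leak into "bar"; with the term-only policy it does, which is the stale-attribute hazard being discussed.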
...@thetaphi.de
-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
Seeley
Sent: Monday, August 10, 2009 6:01 PM
To: java-dev@lucene.apache.org
Subject: who clears attributes?
CharTokenizer.incrementToken() clears *all* attributes in the entire
tokenizer chain, or each tokenizer
("or each tokenizer" should read "or each Tokenizer or TokenFilter")
On Mon, Aug 10, 2009 at 12:55 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
On Mon, Aug 10, 2009 at 12:44 PM, Uwe Schindler <u...@thetaphi.de> wrote:
the CharTokenizer should only clear the TermAttribute, as it is only using
On Mon, Aug 10, 2009 at 12:44 PM, Uwe Schindler <u...@thetaphi.de> wrote:
the CharTokenizer should only clear the TermAttribute, as it is only
using this attribute.
I changed this in the latest patch for
https://issues.apache.org/jira/browse/LUCENE-1796
It's certainly not clear to me - is there
Thinking through this a little more, I don't see an alternative to the
tokenizer clearing all attributes at the start of incrementToken().
Consider a DefaultPayloadTokenFilter that only sets a payload if one
isn't already set - it's clear that this filter can't clear the
payload attribute, so it
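The DefaultPayloadTokenFilter argument, as a self-contained toy (DefaultPayloadTokenFilter is a hypothetical filter from the discussion, and these classes are stand-ins, not Lucene APIs):

```java
// Toy sketch of the DefaultPayloadTokenFilter case: a filter that supplies a
// payload only when none is set. It cannot clear the payload itself, because
// that would wipe a payload legitimately set by an earlier filter - so the
// tokenizer at the head of the chain has to clear all attributes per token.
public class DefaultPayloadDemo {

    static class State { String payload; }

    // The filter under discussion: fill in a default only if nothing is set.
    static void defaultPayloadFilter(State s) {
        if (s.payload == null) s.payload = "DEFAULT";
    }

    public static void main(String[] args) {
        State s = new State();

        // Token 1: an upstream filter already set a payload; ours must not touch it.
        s.payload = "UPSTREAM";
        defaultPayloadFilter(s);
        System.out.println(s.payload);  // prints "UPSTREAM"

        // Token 2: only the tokenizer's clear-all step can reset the state;
        // without it, token 1's payload would still look "already set".
        s.payload = null;               // stands in for clearAttributes()
        defaultPayloadFilter(s);
        System.out.println(s.payload);  // prints "DEFAULT"
    }
}
```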
Clearing the attributes should be required in those places where we
cleared (or reinit'ed) Token previously, right?
Michael
On 8/10/09 10:42 AM, Yonik Seeley wrote:
Thinking through this a little more, I don't see an alternative to the
tokenizer clearing all attributes at the start of
-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
Seeley
Sent: Monday, August 10, 2009 7:42 PM
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
Thinking
On Aug 10, 2009, at 2:00 PM, Earwin Burrfoot wrote:
I'll deviate from the topic somewhat.
What are exact benefits that new tokenstream API yields? Are we sure
we want it released with 2.9?
By now I only see various elaborate problems, but haven't seen a
single piece of code becoming simpler.
Grant Ingersoll wrote:
On Aug 10, 2009, at 2:00 PM, Earwin Burrfoot wrote:
2.9 was _SUPPOSED_ to be a deprecation release,
What's a deprecation release? We deprecate stuff in every release ...
does it make sense to do a release just to deprecate anything we might
not have yet? And if you add
I think we should change the backwards-compatibility policy as proposed
in LUCENE-1698 and remove some deprecated things (including the old
TokenStream API, maybe query parser) in 3.1, not 3.0.
I don't think we should have a 2.5 release - this clearly shows the
disadvantages of our current
On Aug 10, 2009, at 3:06 PM, Michael Busch wrote:
I think we should change the backwards-compatibility policy as
proposed in LUCENE-1698 and remove some deprecated things (including
the old TokenStream API, maybe query parser) in 3.1, not 3.0.
Maybe. I'm not convinced yet that the
Michael Busch wrote:
I think we should change the backwards-compatibility policy as
proposed in LUCENE-1698 and remove some deprecated things (including
the old TokenStream API, maybe query parser) in 3.1, not 3.0.
I don't think we should have a 2.5 release - this clearly shows the
You didn't really comment on my proposal: I suggested to not remove the
old Token API and old queryparser in 3.0. Instead with 3.0 change the
bw-policy, so that we can remove deprecated things in minor releases
(e.g. 3.1 in this case).
I think your 2.5 proposal has drawbacks: if we release
On Mon, Aug 10, 2009 at 22:50, Grant Ingersoll <gsing...@apache.org> wrote:
On Aug 10, 2009, at 2:00 PM, Earwin Burrfoot wrote:
I'll deviate from the topic somewhat.
What are exact benefits that new tokenstream API yields? Are we sure
we want it released with 2.9?
By now I only see various
Hi Grant,
I have serious doubts about releasing this new API until these
performance issues are resolved and better proven out from a usability
standpoint.
I think LUCENE-1796 has fixed the performance problems, which were caused by
a missing reflection-cache needed for bw compatibility. I
On 8/10/09 12:52 PM, Uwe Schindler wrote:
Michael: The TokenWrapper added cost was there in 2.9 before the TokenStream
overhaul, too, as the TokenWrapper-like code was implemented
similarly inside DocInverter.
You're right. It will only be more costly in case you mix multiple old
From: Michael Busch [mailto:busch...@gmail.com]
Sent: Monday, August 10, 2009 9:58 PM
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
On 8/10/09 12:52 PM, Uwe Schindler wrote:
Michael: The TokenWrapper added cost was there in 2.9 before the
TokenStream
overhaul, too, as the TokenWrapper-like code
From: Michael Busch [mailto:busch...@gmail.com]
Sent: Monday, August 10, 2009 10:09 PM
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
On 8/10/09 1:02 PM, Uwe Schindler wrote:
If both filters would only implement new API there would be direct calls
from the filter to the input
On Aug 10, 2009, at 3:36 PM, Michael Busch wrote:
You didn't really comment on my proposal: I suggested to not remove
the old Token API and old queryparser in 3.0. Instead with 3.0
change the bw-policy, so that we can remove deprecated things in
minor releases (e.g. 3.1 in this case).
On 8/10/09 1:02 PM, Uwe Schindler wrote:
If both filters would only implement new API there would be direct calls
from the filter to the input TokenStream. If all streams/filters would
implement only the old API, the bw-delegation would only be used for the
incrementToken() calls from
On 8/10/09 1:30 PM, Grant Ingersoll wrote:
I think your 2.5 proposal has drawbacks: if we release 2.5 now to
test the new major features in the field, then do you want to stop
adding new features to trunk until we release 2.9 to not have the
same situation then again? How long should this
On Aug 10, 2009, at 3:52 PM, Uwe Schindler wrote:
Hi Grant,
I have serious doubts about releasing this new API until these
performance issues are resolved and better proven out from a
usability
standpoint.
I think LUCENE-1796 has fixed the performance problems, which was
caused by
a
I have serious doubts about releasing this new API until these
performance issues are resolved and better proven out from a
usability
standpoint.
I think LUCENE-1796 has fixed the performance problems, which was
caused by
a missing reflection-cache needed for bw compatibility. I
On Tue, Aug 11, 2009 at 00:37, Michael Busch <busch...@gmail.com> wrote:
On 8/10/09 1:30 PM, Grant Ingersoll wrote:
I think your 2.5 proposal has drawbacks: if we release 2.5 now to test
the new major features in the field, then do you want to stop adding new
features to trunk until we release
I do agree 2.9 has tons of changes: new analysis API, segment-based
searching/collection/sorting, new QP, etc.
One option might be to have a looong beta period for 2.9, and focus on
testing/docs?
Or... and this is one crazy idea... maybe we should simply release 3.0
next, not removing any
On Tue, Aug 11, 2009 at 00:54, Uwe Schindler <u...@thetaphi.de> wrote:
I have serious doubts about releasing this new API until these
performance issues are resolved and better proven out from a
usability
standpoint.
I think LUCENE-1796 has fixed the performance problems, which was
It sounds like the 'old' API should stay a bit longer than 3.0. We'd like to
give more people a chance to experiment w/ the new API before we claim it is
the new Analysis API in Lucene. And that means that more users will have to
live w/ the bit of slowness more than what is believed in this
Does this mean we still move to Java 5 in 3.0? If so, +1 from me too.
On Tue, Aug 11, 2009 at 12:06 AM, Mark Miller markrmil...@gmail.com wrote:
Michael McCandless wrote:
Or... and this is one crazy idea... maybe we should simply release 3.0
next, not removing any deprecated APIs until 3.1
You'll sell your vote for pork? :)
If by some miracle we went with this, with so many back compat issues
with this update, I don't see why we wouldn't throw Java 1.5 in as well.
That just complicates things here though. I'd save that discussion.
Shai Erera wrote:
Does this mean we still move
On Aug 10, 2009, at 5:12 PM, Shai Erera wrote:
Maybe we should follow what I seem to read from Earwin and Grant -
come up w/ real use cases, try to implement them w/ the current API,
then if it's impossible, discuss how we can make the current API
more adaptive. If at the end of this
On 8/10/09 3:19 PM, Grant Ingersoll wrote:
Oh, and now it seems the new QP is dependent on it all.
The new QP uses Attributes for config settings, but doesn't require the
TokenStream to be an AttributeSource.
Grant Ingersoll wrote:
On Aug 10, 2009, at 5:12 PM, Shai Erera wrote:
Maybe we should follow what I seem to read from Earwin and Grant -
come up w/ real use cases, try to implement them w/ the current API,
then if it's impossible, discuss how we can make the current API more
adaptive. If
Well, I have real use cases for it, but all of it is still missing the
biggest piece: search side support. It's the 900 lb. elephant in the room.
The 500 lb. elephant is the fact that all these attributes, AIUI, require
you to hook in your own indexing chain, etc. in order to even be
On 8/10/09 2:05 PM, Michael McCandless wrote:
Or... and this is one crazy idea... maybe we should simply release 3.0
next, not removing any deprecated APIs until 3.1 or later. Ie,
normal software on having so many major changes would release an X.0
release; I agree the deprecation release is
From: Shai Erera [mailto:ser...@gmail.com]
Sent: Monday, August 10, 2009 11:13 PM
To: java-dev@lucene.apache.org
Subject: Re: who clears attributes?
It sounds like the 'old' API should stay a bit longer than 3.0. We'd like to
give more people a chance to experiment w/ the new API before we claim
On Aug 10, 2009, at 18:48, Michael Busch busch...@gmail.com wrote:
On 8/10/09 2:05 PM, Michael McCandless wrote:
Or... and this is one crazy idea... maybe we should simply release
3.0
next, not removing any deprecated APIs until 3.1 or later. Ie,
normal software on having so many major
On Aug 10, 2009, at 6:28 PM, Mark Miller wrote:
Grant Ingersoll wrote:
On Aug 10, 2009, at 5:12 PM, Shai Erera wrote:
Maybe we should follow what I seem to read from Earwin and Grant -
come up w/ real use cases, try to implement them w/ the current
API, then if it's impossible, discuss