Chris Hostetter wrote:
...personally i think the analysis contrib should have the same compat
requirements as the core given how heavily used it is.
In this specific case it is possible to introduce the new stemmers via
one method and leave the old stemmers accessible using old methods.
But
: Do we require the contrib to adhere to the same back compatibility rules as
: trunk? I don't know that it has been established. Thoughts? Analysis is a
: pretty tricky one, as compared to the other packages.
we discussed this a little while back and put it on the wiki...
>> "All contribs ar
It may seem like a socialist or a communist or a free love hippy attitude,
It sounds like a perfect attitude.
(In particular the "free love hippie" part - does it come with LSD and
tie-dyed/batik clothes too?)
Kind regards,
Endre.
+1. And, we always have the major version release at our disposal if
need be.
At any rate, I think we have beaten this one to death. I think it is
useful to look back every now and then on the major things that
guide us and make sure we all still agree, at least for the most
part. F
+1
On Jan 27, 2008, at 8:34 PM, Chris Hostetter wrote:
: But I do agree, benchmark doesn't have the same litmus test.
the generalization of that statement probably being "all contribs
are not
created equal."
I propose making some comments in the BackwardsCompatibility wiki page
about the c
And then you can end up like the Soviet Union...
The basic problems of communism - those that don't contribute their
fair share, but suck out the minimum resources (but maximum in
totality), and those that want to lead (their contribution) and suck
the minimum, and then those that contribut
: But I do agree, benchmark doesn't have the same litmus test.
the generalization of that statement probably being "all contribs are not
created equal."
I propose making some comments in the BackwardsCompatibility wiki page
about the compatibility commitments of contribs depends largely on the
: > So, in hindsight, the acronym/host setting for StandardAnalyzer really
: > should have defaulted to "true", meaning the bug is fixed, but users who
: > somehow depend on the bug (which should be a tiny minority) have an avenue
: > (setReplaceInvalidAcronym) to keep back compatibility if needed
: I would guess the number of people/organizations using Lucene vs. contributing
: to Lucene is much greater.
:
: The contributors work in head (should IMO). The users can select a particular
: version of Lucene and code their apps accordingly. They can also back-port
: features from a later to an
Well, contrib/Wikipedia has a dependency on it, but at least it is
self contained. I would love to see the Wikipedia stuff extracted out
of benchmark and be in contrib/wikipedia (thus flipping the
dependency), but the effort isn't particularly high on my list.
But I do agree, benchmark doe
On Jan 25, 2008 8:04 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> One more thought on back compatibility:
>
> Do we have the same requirements for any and all contrib modules? I
> am especially thinking about the benchmark contrib, but it probably
> applies to others as well.
>
> -Grant
>
In
One more thought on back compatibility:
Do we have the same requirements for any and all contrib modules? I
am especially thinking about the benchmark contrib, but it probably
applies to others as well.
-Grant
On Jan 24, 2008, at 8:42 AM, Grant Ingersoll wrote:
On Jan 24, 2008, at 4:2
I will do so.
On Jan 24, 2008, at 12:44 PM, DM Smith wrote:
This is now a hijacked thread. It is very interesting, but it may
be hard to find again. Wouldn't it be better to record this thread
differently, perhaps opening a Jira issue to add XA to Lucene?
-- DM
Doron Cohen wrote:
On Jan
This is now a hijacked thread. It is very interesting, but it may be
hard to find again. Wouldn't it be better to record this thread
differently, perhaps opening a Jira issue to add XA to Lucene?
-- DM
Doron Cohen wrote:
On Jan 24, 2008 6:55 PM, robert engels <[EMAIL PROTECTED]> wrote:
T
On Jan 24, 2008 6:55 PM, robert engels <[EMAIL PROTECTED]> wrote:
> Thanks, you are correct, but I am not sure it covers the complete case.
>
> Change it a bit to be:
>
> A opens reader.
> B opens reader.
> A performs query decides a new document is needed
> B performs query decides a new document
Thanks, you are correct, but I am not sure it covers the complete case.
Change it a bit to be:
A opens reader.
B opens reader.
A performs query decides a new document is needed
B performs query decides a new document is needed
B gets writer, adds document, closes
A gets writer, adds document, cl
Sorry, I am using "gets lock" to mean 'opening the index'. I was
simplifying the procedure.
I think your comment is not correct in this context.
On Jan 24, 2008, at 3:16 AM, Michael McCandless wrote:
Doron Cohen wrote:
On Jan 24, 2008, at 4:27 AM, Michael McCandless wrote:
Grant Ingersoll wrote:
Yes, I agree these are what is about (despite the divergence into
locking).
As I see it, the question is about whether we should try to do
major releases on the order of a year, rather than the current 2+
ye
Grant Ingersoll wrote:
Yes, I agree these are what is about (despite the divergence into
locking).
As I see it, the question is about whether we should try to do
major releases on the order of a year, rather than the current 2+
year schedule and also how to best handle bad behavior when
Doron Cohen wrote:
On Jan 24, 2008 12:31 AM, robert engels <[EMAIL PROTECTED]> wrote:
You must get the write lock before opening the reader if you
On Jan 24, 2008 12:31 AM, robert engels <[EMAIL PROTECTED]> wrote:
> You must get the write lock before opening the reader if you want
> transactional consistency and are performing updates.
>
> No other way to do it.
>
> Otherwise.
>
> A opens reader.
> B opens reader.
> A performs query decides
Yes, I agree these are what is about (despite the divergence into
locking).
As I see it, the question is about whether we should try to do major
releases on the order of a year, rather than the current 2+ year
schedule and also how to best handle bad behavior when producing
tokens that pr
Top posting because this is a response to the thread as a whole.
It appears that this thread has identified some different reasons for
"needing" to break compatibility:
1) A current behavior is now deemed bad or wrong. Examples: the silent
truncation of large documents or an analyzer that wor
Right.
But, that can, and should, be done outside of the Lucene core.
Mike
robert engels wrote:
You must get the write lock before opening the reader if you want
transactional consistency and are performing updates.
No other way to do it.
Otherwise.
A opens reader.
B opens reader.
A per
The statement upon rereading seems much stronger than intended. You
are correct, but I think the number of users that become contributors
is still far less than the number of users.
The only abandonment of the users was from the standpoint of
maintaining a legacy API. The users are free to
I don't think I can say that this needs to happen now either. :)
An interesting question to answer would be:
If Lucene did not exist, and given all of the knowledge we have, we
decided to create a Java based search engine, would the API look like
it does today?
The answer may be yes. I dou
You must get the write lock before opening the reader if you want
transactional consistency and are performing updates.
No other way to do it.
Otherwise.
A opens reader.
B opens reader.
A performs query decides an update is needed based on results
B performs query decides an update is needed
Hi robert,
On 01/23/2008 at 4:55 PM, robert engels wrote:
> If the users are "just dropping in a new version" they are not
> contributing to the community... I think just the opposite, they are
> parasites.
I reject your characterization of passive users as "parasites"; I suspect that
you intend
robert engels wrote:
I think you are incorrect.
I would guess the number of people/organizations using Lucene vs.
contributing to Lucene is much greater.
The contributors work in head (should IMO). The users can select a
particular version of Lucene and code their apps accordingly. They
chris Hostetter wrote:
: I do like the idea of a static/system property to match legacy
: behavior. For example, the bugs around how StandardTokenizer
: mislabels tokens (eg LUCENE-1100), this would be the perfect
solution.
: Clearly those are silly bugs that should be fixed, quickly, with
robert engels wrote:
Thanks.
So all writers still need to get the write lock, before opening the
reader in order to maintain transactional consistency.
I don't understand what you mean by "before opening the reader"? A
writer acquires the write.lock before opening. Readers do not,
un
I think you are incorrect.
I would guess the number of people/organizations using Lucene vs.
contributing to Lucene is much greater.
The contributors work in head (should IMO). The users can select a
particular version of Lucene and code their apps accordingly. They
can also back-port fea
: I guess I don't see the back-porting as an issue. Only those that want to need
: to do the back-porting. Head moves on...
I view it as a potential risk to the overall productivity of the community.
If upgrading from A to B is easy people (in general) won't spend a lot of
time/effort backport
I guess I don't see the back-porting as an issue. Only those that
want to need to do the back-porting. Head moves on...
On Jan 23, 2008, at 2:00 PM, Chris Hostetter wrote:
: I do like the idea of a static/system property to match legacy
: behavior. For example, the bugs around how Standard
Thanks.
So all writers still need to get the write lock, before opening the
reader in order to maintain transactional consistency.
Was there performance testing done on the lockless commits with heavy
contention? I would think that reading the directory to find the
latest segments file wo
: I do like the idea of a static/system property to match legacy
: behavior. For example, the bugs around how StandardTokenizer
: mislabels tokens (eg LUCENE-1100), this would be the perfect solution.
: Clearly those are silly bugs that should be fixed, quickly, with this
: back-compatible mode t
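The idea quoted above, a static or system property that switches a fixed component back to its legacy behavior, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the property name `example.tokenizer.legacyTypes`, the class `LegacyModeSketch`, and the stand-in "bug" (host names labelled as acronyms) are invented here and are not Lucene API.

```java
// Hypothetical sketch of a "legacy behavior" switch; none of these names exist in Lucene.
final class LegacyModeSketch {
    // Assumed property name, invented for this example.
    static final String LEGACY_PROPERTY = "example.tokenizer.legacyTypes";

    /** Returns true when the caller has asked for the old (buggy) labelling. */
    public static boolean legacyMode() {
        return Boolean.parseBoolean(System.getProperty(LEGACY_PROPERTY, "false"));
    }

    /** Labels a token. The fixed behavior is the default; the old one is opt-in. */
    public static String tokenType(String token) {
        boolean looksLikeHost = token.indexOf('.') > 0 && !token.endsWith(".");
        if (looksLikeHost) {
            // Stand-in bug: the legacy code path mislabels hosts as acronyms.
            return legacyMode() ? "<ACRONYM>" : "<HOST>";
        }
        return "<ALPHANUM>";
    }

    public static void main(String[] args) {
        System.out.println(tokenType("lucene.apache.org"));
        System.setProperty(LEGACY_PROPERTY, "true");
        System.out.println(tokenType("lucene.apache.org"));
    }
}
```

With this shape the fix is what new users get, and anyone who depends on the old output can start the JVM with `-Dexample.tokenizer.legacyTypes=true` without changing code.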
robert engels wrote:
I guess I don't understand what a commit lock is, or what its
purpose is. It seems the write lock is all that is needed.
The commit.lock was used to guard access to the "segments" file. A
reader would acquire the lock (blocking out other readers and
writers) when r
I guess I don't understand what a commit lock is, or what its
purpose is. It seems the write lock is all that is needed.
If you still need a write lock, then what is the purpose of
"lockless" commits.
You can get consistency if all writers get the write lock before
performing any read.
robert engels wrote:
Maybe I don't understand lockless commits then.
I just don't think you can enforce transactional consistency
without either 1) locking, or 2) optimistic collision detection. I
could be wrong here, but this has been my experience.
By effectively removing the locking req
Maybe I don't understand lockless commits then.
I just don't think you can enforce transactional consistency without
either 1) locking, or 2) optimistic collision detection. I could be
wrong here, but this has been my experience.
By effectively removing the locking requirement, I think you
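For readers unfamiliar with the second option named above, optimistic collision detection, here is a minimal in-memory sketch. It is not how Lucene's lockless commits work; the class `OptimisticCommitSketch` and its methods are hypothetical and only model the "A opens reader, B opens reader" scenario discussed in this thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of optimistic collision detection; not Lucene code.
final class OptimisticCommitSketch {
    // Monotonically increasing version of the (imaginary) index.
    private final AtomicLong version = new AtomicLong();

    /** Opening a reader records the index version that reader will see. */
    public long openReader() {
        return version.get();
    }

    /**
     * Commits an update that was decided on while reading seenVersion.
     * Succeeds only if nobody else has committed since then.
     */
    public boolean tryCommit(long seenVersion) {
        return version.compareAndSet(seenVersion, seenVersion + 1);
    }

    public static void main(String[] args) {
        OptimisticCommitSketch index = new OptimisticCommitSketch();
        long a = index.openReader();   // A opens reader
        long b = index.openReader();   // B opens reader
        System.out.println("B commit accepted: " + index.tryCommit(b));
        System.out.println("A commit accepted: " + index.tryCommit(a));
    }
}
```

In this model A's commit is rejected because B committed first, so A has to reopen a reader, rerun its query, and decide again. No lock is held while reading, but the collision is detected rather than silently producing a duplicate update.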
On Jan 23, 2008 9:53 AM, Mark Miller <[EMAIL PROTECTED]> wrote:
> Also, as he mentioned, we really need a good distributed system that
> allows for index partitioning. That's the ticket to more enterprise
> adoption. Could be Solr's work though...
Yes, we're working on that :-)
-Yonik
---
That's where Robert is confusing me as well. To have XA support you just
need to be able to define a transaction, atomically commit, or rollback.
You also need a consistent state after any of these operations.
LUCENE-1044 seems to guarantee that, and so isn't it more like finishing
up needed wor
Robert, besides LUCENE-1044 (syncing on commit), what is the Lucene
core missing in order for you (or, someone) to build XA compliance on
top of it?
Ie, you can open a writer with autoCommit=false and no changes are
committed until you close it. You can abort the session by calling
writer.abort
Catching up here...
Re the fracturing when Maven went from v1 -> v2: I think Lucene is a
totally different animal. Maven is an immense framework; Lucene is a
fairly small "core" set of APIs. I think for these "core" type
packages it's very important to keep drop-in compatibility as long as
poss
A specific example:
You have a criminal justice system that indexes past court cases.
You do a search for cases involving Joe Smith because you are a judge
and you want to review priors before sentencing. Similar issues with
related cases, case history, etc.
Is it better to return somethin
robert engels wrote:
I think there are a lot of applications using Lucene where "whether
it's lost a bit of data or not" is not acceptable.
Yeah, and I have one of them. Which is why I would love the support you're
talking about. But it's not there yet and I am just grateful that I can
get my cus
I think there are a lot of applications using Lucene where "whether
it's lost a bit of data or not" is not acceptable.
However, it is probably fine for a web search, or intranet search.
As to your first point, that is why the really great open-source
projects (eclipse, open office) have a fin
: To paraphrase a dead English guy: A rose by any other name is still the same,
: right?
:
: Basically, all the version number tick saves them from is having to read the
: CHANGES file, right?
Correct: i'm not disagreeing with your basic premise, just pointing out
that it can be done with the c
Grant Ingersoll wrote:
Does anyone have experience w/ how other open source projects deal with
this?
Use abstract base classes instead of interfaces: they're much easier to
evolve back-compatibly. In Hadoop, for example, we really wish that
Mapper and Reducer were not interfaces and are very
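The point about abstract base classes can be illustrated with a small sketch. The names `TokenHandler`, `LowerCaseHandler`, and `handleWithType` are hypothetical and come from neither Lucene nor Hadoop. The assumption is a Java version without default methods on interfaces (they arrived in Java 8), where adding a method to an interface breaks every existing implementation.

```java
// Hypothetical extension point; not taken from Lucene or Hadoop.
final class AbstractBaseSketch {
    /** The extension point, shipped as an abstract class rather than an interface. */
    public abstract static class TokenHandler {
        /** The only method that existed in the first release. */
        public abstract String handle(String token);

        /**
         * Added in a later release. Existing subclasses keep compiling and
         * running because the base class supplies a default implementation.
         */
        public String handleWithType(String token, String type) {
            return handle(token);
        }
    }

    /** A user's subclass written against the first release. */
    public static final class LowerCaseHandler extends TokenHandler {
        @Override
        public String handle(String token) {
            return token.toLowerCase(java.util.Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        TokenHandler handler = new LowerCaseHandler();
        // The new method works on the old subclass without any change to it.
        System.out.println(handler.handleWithType("Lucene", "<ALPHANUM>"));
    }
}
```

Had `TokenHandler` been an interface in that setting, adding `handleWithType` would have forced every user implementation to change before it could compile against the new release.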
On Jan 22, 2008, at 3:45 PM, Chris Hostetter wrote:
Perhaps the crux of the issue is that we as a community need to become
more willing to crank out "major" releases ... if we just released
3.0 and
now someone came up with the "Magic" field type and it's really
magically
and we want to star
I humbly disagree about NFS. Arguing about where free time was invested,
or wasted, or inefficient, in an open source project just seems silly.
One of the great benefits is esoteric work that would normally not be
allowed for. NFS is easy. A lot of Lucene users don't care about Lucene.
They jus
One more example on this. A lot of work was done on transaction
support. I would argue that this falls way short of what is needed,
since there is no XA transaction support. Since the lucene index
(unless stored in an XA db) is a separate resource, it really needs
XA support in order to be
I don't think group C is interested in bug fixes. I just don't see
how Lucene is at all useful if the users are encountering any bug -
so they either don't use that feature, or they have already developed
a work-around (or they have patched the code in a way that avoids the
bug, yet is spec
: I guess I am suggesting that instead of maintaining the whole major/minor
: thing (not including file format) that we relax a bit and say that any given
: feature we choose to remove or add has to go through two release cycles, which
: according to your averages, would equal just over 1 year's ti
: If they are " no longer actively developing the portion of the code that's
: broken, aren't seeking the new feature, etc", and they stay back on old
: versions... isn't that exactly what we want? They can stay on the old version,
: and new application development uses the newer version.
This ba
Me too...
On Jan 18, 2008, at 4:33 AM, Uwe Schindler wrote:
Sort of keeping all version in the trunk at once? IndexWriter2 is
IndexWriter with some features replaced with something better?
And then IndexWriter3..? That's a bit messy if you ask me. But it
would work. But terribly messy.
B
> Sort of keeping all version in the trunk at once? IndexWriter2 is
> IndexWriter with some features replaced with something better?
> And then IndexWriter3..? That's a bit messy if you ask me. But it
> would work. But terribly messy.
Brrr, I hate this. Microsoft does this always when they up
That wasn't what I was thinking. They would use lucene23.jar if they
wanted the 2.3 API. Newer code uses the lucene30.jar for the 3.0 API.
The others could continue to back-port 3.0 features to 2.3.X if they
wished (and could do so without changing the API - private changes
only).
I thin
On Jan 18, 2008, at 07:41, robert engels wrote:
Look at similar problems and how they are handled in the JDK. The Date
class has been notorious since its inception. The Calendar class is
almost no better, now they are developing JSR-310 to replace both.
Existing code can still use the Date or Calen
That brings us back to an earlier discussion: "if majority want to
break compatibility, then we should do so, and the minority can back-
port the changes to a previous release if they feel it is warranted."
I don't understand why that isn't a viable approach.
I agree that maintaining interfac
On Jan 18, 2008, at 03:39, Grant Ingersoll wrote:
Does anyone have experience w/ how other open source projects deal
with this?
Would be a pain to implement, but it could be done as libcompat.
lucene-2.4-compat-core-3.0.jar
--
karl
-
On Jan 17, 2008, at 9:30 PM, DM Smith wrote:
On Jan 17, 2008, at 7:57 PM, robert engels wrote:
If they are " no longer actively developing the portion of the code
that's broken, aren't seeking the new feature, etc", and they stay
back on old versions... isn't that exactly what we want? Th
I guess I am suggesting that instead of maintaining the whole major/
minor thing (not including file format) that we relax a bit and say
that any given feature we choose to remove or add has to go through two
release cycles, which according to your averages, would equal just
over 1 year's tim
On Jan 17, 2008, at 7:57 PM, robert engels wrote:
If they are " no longer actively developing the portion of the code
that's broken, aren't seeking the new feature, etc", and they stay
back on old versions... isn't that exactly what we want? They can
stay on the old version, and new applic
If they are " no longer actively developing the portion of the code
that's broken, aren't seeking the new feature, etc", and they stay
back on old versions... isn't that exactly what we want? They can
stay on the old version, and new application development uses the
newer version.
It woul
Hi Grant,
On 01/17/2008 at 7:51 AM, Grant Ingersoll wrote:
> Our minor release cycles are currently in the 3-6 months range
> and our major release cycles are in the 1-1.5 year range.
Since 2.0.0, including 2.3.0 - assuming it will be released in the next week or
so - the minor release intervals
Grant Ingersoll wrote:
My reasoning for this solution: Our minor release cycles are
currently in the 3-6 months range and our major release cycles are in
the 1-1.5 year range. I think giving someone 4-8 (or whatever) months
is more than enough time to prepare for API changes. I am not sur
On Jan 17, 2008, at 4:14 PM, Doug Cutting wrote:
Grant Ingersoll wrote:
1. We add a new section to CHANGES for each release, at the top
where we can declare what deprecations will be removed in the
_next_ release (major or minor) and also any interface API changes
2. When deprecating, the
Grant Ingersoll wrote:
1. We add a new section to CHANGES for each release, at the top where we
can declare what deprecations will be removed in the _next_ release
(major or minor) and also any interface API changes
2. When deprecating, the @deprecated tag should declare what version it
will be
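Point 2 of the proposal, a deprecation notice that names the release in which the method goes away, could be written as in the following sketch. The class `DeprecationSketch`, its `normalize` methods, and the version number 3.0 are placeholders chosen for illustration, not an actual Lucene deprecation.

```java
// Hypothetical API used only to show the shape of the deprecation notice.
final class DeprecationSketch {
    /**
     * Old entry point, kept for one more release cycle.
     *
     * @deprecated Use {@link #normalize(String, boolean)} instead.
     *             Will be removed in 3.0 (a placeholder version number).
     */
    @Deprecated
    public static String normalize(String text) {
        // Delegate so the old and new methods cannot drift apart.
        return normalize(text, false);
    }

    /** Replacement API: the caller states explicitly whether to lower-case. */
    public static String normalize(String text, boolean lowerCase) {
        String trimmed = text.trim();
        return lowerCase ? trimmed.toLowerCase(java.util.Locale.ROOT) : trimmed;
    }

    public static void main(String[] args) {
        System.out.println(normalize(" Lucene ", true));
    }
}
```

The Javadoc `@deprecated` text carries the removal version for people reading the docs, while the `@Deprecated` annotation makes the compiler warn at every call site.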
On Jan 17, 2008, at 2:42 PM, Bill Janssen wrote:
Examples of the former issue include things like removing
deprecations sooner and the ability to add new methods to interfaces
(both of these are not to be done ad-hoc)
What would be the difference between ad-hoc and non-ad-hoc?
Maybe bad ch
> Examples of the former issue include things like removing
> deprecations sooner and the ability to add new methods to interfaces
> (both of these are not to be done ad-hoc)
What would be the difference between ad-hoc and non-ad-hoc?
Bill