Re: Back Compatibility of Contrib. was Re: [jira] Commented: (LUCENE-1142) Updated Snowball package

2008-04-17 Thread Karl Wettin
Chris Hostetter wrote: ...personally I think the analysis contrib should have the same compat requirements as the core given how heavily used it is. In this specific case it is possible to introduce the new stemmers via one method and leave the old stemmers accessible using old methods. But

Re: Back Compatibility of Contrib. was Re: [jira] Commented: (LUCENE-1142) Updated Snowball package

2008-04-17 Thread Chris Hostetter
: Do we require the contrib to adhere to the same back compatibility rules as : trunk? I don't know that it has been established. Thoughts? Analysis is a : pretty tricky one, as compared to the other packages. we discussed this a little while back and put it on the wiki... >> "All contribs ar

Re: Back Compatibility

2008-01-28 Thread Endre Stølsvik
It may seem like a socialist or a communist or a free love hippy attitude, It sounds like a perfect attitude. (In particular the "free love hippie" part - does it come with LSD and tie-dyed/batik clothes too?) Kind regards, Endre.

Re: Back Compatibility

2008-01-27 Thread Grant Ingersoll
+1. And, we always have the major version release at our disposal if need be. At any rate, I think we have beaten this one to death. I think it is useful to look back every now and then on the major things that guide us and make sure we all still agree, at least for the most part. F

Re: Back Compatibility

2008-01-27 Thread Grant Ingersoll
+1 On Jan 27, 2008, at 8:34 PM, Chris Hostetter wrote: : But I do agree, benchmark doesn't have the same litmus test. the generalization of that statement probably being "all contribs are not created equal." I propose making some comments in the BackwardsCompatibility wiki page about the c

Re: Back Compatibility

2008-01-27 Thread robert engels
And then you can end up like the Soviet Union... The basic problems of communism - those that don't contribute their fair share, but suck out the minimum resources (but maximum in totality), and those that want to lead (their contribution) and suck the minimum, and then those that contribut

Re: Back Compatibility

2008-01-27 Thread Chris Hostetter
: But I do agree, benchmark doesn't have the same litmus test. the generalization of that statement probably being "all contribs are not created equal." I propose making some comments in the BackwardsCompatibility wiki page about the compatibility commitments of contribs depends largely on the

Re: Back Compatibility

2008-01-27 Thread Chris Hostetter
: > So, in hindsight, the acronym/host setting for StandardAnalyzer really : > should have defaulted to "true", meaning the bug is fixed, but users who : > somehow depend on the bug (which should be a tiny minority) have an avenue : > (setReplaceInvalidAcronym) to keep back compatibility if needed

Re: Back Compatibility

2008-01-27 Thread Chris Hostetter
: I would guess the number of people/organizations using Lucene vs. contributing : to Lucene is much greater. : : The contributors work in head (should IMO). The users can select a particular : version of Lucene and code their apps accordingly. They can also back-port : features from a later to an

Re: Back Compatibility

2008-01-25 Thread Grant Ingersoll
Well, contrib/Wikipedia has a dependency on it, but at least it is self contained. I would love to see the Wikipedia stuff extracted out of benchmark and be in contrib/wikipedia (thus flipping the dependency), but the effort isn't particularly high on my list. But I do agree, benchmark doe

Re: Back Compatibility

2008-01-25 Thread Doron Cohen
On Jan 25, 2008 8:04 PM, Grant Ingersoll wrote: > One more thought on back compatibility: > > Do we have the same requirements for any and all contrib modules? I > am especially thinking about the benchmark contrib, but it probably > applies to others as well. > > -Grant > In

Re: Back Compatibility

2008-01-25 Thread Grant Ingersoll
One more thought on back compatibility: Do we have the same requirements for any and all contrib modules? I am especially thinking about the benchmark contrib, but it probably applies to others as well. -Grant On Jan 24, 2008, at 8:42 AM, Grant Ingersoll wrote: On Jan 24, 2008, at 4:2

Re: Back Compatibility

2008-01-24 Thread robert engels
I will do so. On Jan 24, 2008, at 12:44 PM, DM Smith wrote: This is now a hijacked thread. It is very interesting, but it may be hard to find again. Wouldn't it be better to record this thread differently, perhaps opening a Jira issue to add XA to Lucene? -- DM Doron Cohen wrote: On Jan

Re: Back Compatibility

2008-01-24 Thread DM Smith
This is now a hijacked thread. It is very interesting, but it may be hard to find again. Wouldn't it be better to record this thread differently, perhaps opening a Jira issue to add XA to Lucene? -- DM Doron Cohen wrote: On Jan 24, 2008 6:55 PM, robert engels wrote: T

Re: Back Compatibility

2008-01-24 Thread Doron Cohen
On Jan 24, 2008 6:55 PM, robert engels wrote: > Thanks, you are correct, but I am not sure it covers the complete case. > > Change it a bit to be: > > A opens reader. > B opens reader. > A performs query decides a new document is needed > B performs query decides a new document

Re: Back Compatibility

2008-01-24 Thread robert engels
Thanks, you are correct, but I am not sure it covers the complete case. Change it a bit to be: A opens reader. B opens reader. A performs query decides a new document is needed B performs query decides a new document is needed B gets writer, adds document, closes A gets writer, adds document, cl

Re: Back Compatibility

2008-01-24 Thread robert engels
Sorry, I am using "gets lock" to mean 'opening the index'. I was simplifying the procedure. I think your comment is not correct in this context. On Jan 24, 2008, at 3:16 AM, Michael McCandless wrote: Doron Cohen wrote:

Re: Back Compatibility

2008-01-24 Thread Grant Ingersoll
On Jan 24, 2008, at 4:27 AM, Michael McCandless wrote: Grant Ingersoll wrote: Yes, I agree these are what is about (despite the divergence into locking). As I see it, the question is about whether we should try to do major releases on the order of a year, rather than the current 2+ ye

Re: Back Compatibility

2008-01-24 Thread Michael McCandless
Grant Ingersoll wrote: Yes, I agree these are what is about (despite the divergence into locking). As I see it, the question is about whether we should try to do major releases on the order of a year, rather than the current 2+ year schedule and also how to best handle bad behavior when

Re: Back Compatibility

2008-01-24 Thread Michael McCandless
Doron Cohen wrote: On Jan 24, 2008 12:31 AM, robert engels wrote: You must get the write lock before opening the reader if you

Re: Back Compatibility

2008-01-24 Thread Doron Cohen
On Jan 24, 2008 12:31 AM, robert engels wrote: > You must get the write lock before opening the reader if you want > transactional consistency and are performing updates. > > No other way to do it. > > Otherwise. > > A opens reader. > B opens reader. > A performs query decides

Re: Back Compatibility

2008-01-23 Thread Grant Ingersoll
Yes, I agree these are what is about (despite the divergence into locking). As I see it, the question is about whether we should try to do major releases on the order of a year, rather than the current 2+ year schedule and also how to best handle bad behavior when producing tokens that pr

Re: Back Compatibility

2008-01-23 Thread DM Smith
Top posting because this is a response to the thread as a whole. It appears that this thread has identified some different reasons for "needing" to break compatibility: 1) A current behavior is now deemed bad or wrong. Examples: the silent truncation of large documents or an analyzer that wor

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
Right. But, that can, and should, be done outside of the Lucene core. Mike robert engels wrote: You must get the write lock before opening the reader if you want transactional consistency and are performing updates. No other way to do it. Otherwise. A opens reader. B opens reader. A per

Re: Back Compatibility

2008-01-23 Thread robert engels
The statement upon rereading seems much stronger than intended. You are correct, but I think the number of users that become contributors is still far less than the number of users. The only abandonment of the users was from the standpoint of maintaining a legacy API. The users are free to

Re: Back Compatibility

2008-01-23 Thread robert engels
I don't think I can say that this needs to happen now either. :) An interesting question to answer would be: If Lucene did not exist, and given all of the knowledge we have, we decided to create a Java based search engine, would the API look like it does today? The answer may be yes. I dou

Re: Back Compatibility

2008-01-23 Thread robert engels
You must get the write lock before opening the reader if you want transactional consistency and are performing updates. No other way to do it. Otherwise. A opens reader. B opens reader. A performs query decides an update is needed based on results B performs query decides an update is needed

RE: Back Compatibility

2008-01-23 Thread Steven A Rowe
Hi robert, On 01/23/2008 at 4:55 PM, robert engels wrote: > If the users are "just dropping in a new version" they are not > contributing to the community... I think just the opposite, they are > parasites. I reject your characterization of passive users as "parasites"; I suspect that you intend

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
robert engels wrote: I think you are incorrect. I would guess the number of people/organizations using Lucene vs. contributing to Lucene is much greater. The contributors work in head (should IMO). The users can select a particular version of Lucene and code their apps accordingly. They

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
chris Hostetter wrote: : I do like the idea of a static/system property to match legacy : behavior. For example, the bugs around how StandardTokenizer : mislabels tokens (eg LUCENE-1100), this would be the perfect solution. : Clearly those are silly bugs that should be fixed, quickly, with

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
robert engels wrote: Thanks. So all writers still need to get the write lock, before opening the reader in order to maintain transactional consistency. I don't understand what you mean by "before opening the reader"? A writer acquires the write.lock before opening. Readers do not, un

Re: Back Compatibility

2008-01-23 Thread robert engels
I think you are incorrect. I would guess the number of people/organizations using Lucene vs. contributing to Lucene is much greater. The contributors work in head (should IMO). The users can select a particular version of Lucene and code their apps accordingly. They can also back-port fea

Re: Back Compatibility

2008-01-23 Thread Chris Hostetter
: I guess I don't see the back-porting as an issue. Only those that want to need : to do the back-porting. Head moves on... I view it as a potential risk to the overall productivity of the community. If upgrading from A to B is easy, people (in general) won't spend a lot of time/effort backport

Re: Back Compatibility

2008-01-23 Thread robert engels
I guess I don't see the back-porting as an issue. Only those that want to need to do the back-porting. Head moves on... On Jan 23, 2008, at 2:00 PM, Chris Hostetter wrote: : I do like the idea of a static/system property to match legacy : behavior. For example, the bugs around how Standard

Re: Back Compatibility

2008-01-23 Thread robert engels
Thanks. So all writers still need to get the write lock, before opening the reader in order to maintain transactional consistency. Was there performance testing done on the lockless commits with heavy contention? I would think that reading the directory to find the latest segments file wo

Re: Back Compatibility

2008-01-23 Thread Chris Hostetter
: I do like the idea of a static/system property to match legacy : behavior. For example, the bugs around how StandardTokenizer : mislabels tokens (eg LUCENE-1100), this would be the perfect solution. : Clearly those are silly bugs that should be fixed, quickly, with this : back-compatible mode t
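The "static/system property to match legacy behavior" idea quoted above could look roughly like the following. This is only a sketch under stated assumptions: the property name, the class, and the token labels are all made up for illustration and are not Lucene API. The point is that the fixed behavior is the default and the old behavior is opt-in.

```java
/**
 * Hypothetical token labeler. Nothing here is Lucene API; the property
 * name and the labels are assumptions chosen for illustration.
 */
class TokenTyper {
    /** Assumed property name; set -Dexample.tokenizer.legacyTypes=true to opt in. */
    static final String LEGACY_PROP = "example.tokenizer.legacyTypes";

    /** Read on every call so the switch can be flipped without a restart. */
    static boolean legacyMode() {
        // Boolean.getBoolean returns true only if the system property exists
        // and equals "true" (case-insensitive); otherwise false.
        return Boolean.getBoolean(LEGACY_PROP);
    }

    /** Labels a token; in legacy mode it reproduces the old (wrong) label. */
    static String typeOf(String token) {
        boolean looksLikeHost = token.indexOf('.') > 0 && !token.endsWith(".");
        if (looksLikeHost && legacyMode()) {
            return "ACRONYM"; // the old, buggy label kept for users who depend on it
        }
        return looksLikeHost ? "HOST" : "WORD";
    }
}

class TokenTyperDemo {
    public static void main(String[] args) {
        System.out.println(TokenTyper.typeOf("lucene.apache.org"));
        System.setProperty(TokenTyper.LEGACY_PROP, "true");
        System.out.println(TokenTyper.typeOf("lucene.apache.org"));
    }
}
```

Defaulting to the corrected behavior means new users never see the bug, while anyone who depends on it has a one-line escape hatch.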

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
robert engels wrote: I guess I don't understand what a commit lock is, or what its purpose is. It seems the write lock is all that is needed. The commit.lock was used to guard access to the "segments" file. A reader would acquire the lock (blocking out other readers and writers) when r

Re: Back Compatibility

2008-01-23 Thread robert engels
I guess I don't understand what a commit lock is, or what its purpose is. It seems the write lock is all that is needed. If you still need a write lock, then what is the purpose of "lockless" commits? You can get consistency if all writers get the write lock before performing any read.

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
robert engels wrote: Maybe I don't understand lockless commits then. I just don't think you can enforce transactional consistency without either 1) locking, or 2) optimistic collision detection. I could be wrong here, but this has been my experience. By effectively removing the locking req

Re: Back Compatibility

2008-01-23 Thread robert engels
Maybe I don't understand lockless commits then. I just don't think you can enforce transactional consistency without either 1) locking, or 2) optimistic collision detection. I could be wrong here, but this has been my experience. By effectively removing the locking requirement, I think you
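For readers unfamiliar with the second option named above, optimistic collision detection, here is a minimal sketch. It assumes a made-up in-memory store with a version counter; it is not how Lucene commits work and none of the names come from Lucene. Each client remembers the version it read and its commit is rejected if someone else committed in between.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical versioned store; purely illustrative, not Lucene API. */
class VersionedStore {
    private long version = 0;
    private final List<String> docs = new ArrayList<>();

    /** The version a client should remember when it starts reading. */
    synchronized long currentVersion() {
        return version;
    }

    synchronized int size() {
        return docs.size();
    }

    /**
     * Commits the document only if no other commit happened since the
     * caller read expectedVersion. Returns false on a collision so the
     * caller can re-read and decide again.
     */
    synchronized boolean addIfUnchanged(long expectedVersion, String doc) {
        if (version != expectedVersion) {
            return false; // someone else committed first
        }
        docs.add(doc);
        version++;
        return true;
    }
}

class OptimisticDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        long seenByA = store.currentVersion(); // A "opens reader"
        long seenByB = store.currentVersion(); // B "opens reader"
        System.out.println("B commit accepted: " + store.addIfUnchanged(seenByB, "doc"));
        System.out.println("A commit accepted: " + store.addIfUnchanged(seenByA, "doc"));
    }
}
```

In the two-client scenario discussed in this thread, the client that commits second is told about the collision instead of silently adding a duplicate.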

Re: Back Compatibility

2008-01-23 Thread Yonik Seeley
On Jan 23, 2008 9:53 AM, Mark Miller wrote: > Also, as he mentioned, we really need a good distributed system that > allows for index partitioning. That's the ticket to more enterprise > adoption. Could be Solr's work though... Yes, we're working on that :-) -Yonik

Re: Back Compatibility

2008-01-23 Thread Mark Miller
That's where Robert is confusing me as well. To have XA support you just need to be able to define a transaction, atomically commit, or rollback. You also need a consistent state after any of these operations. LUCENE-1044 seems to guarantee that, and so isn't it more like finishing up needed wor

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
Robert, besides LUCENE-1044 (syncing on commit), what is the Lucene core missing in order for you (or, someone) to build XA compliance on top of it? Ie, you can open a writer with autoCommit=false and no changes are committed until you close it. You can abort the session by calling writer.abort

Re: Back Compatibility

2008-01-23 Thread Michael McCandless
Catching up here... Re the fracturing when Maven went from v1 -> v2: I think Lucene is a totally different animal. Maven is an immense framework; Lucene is a fairly small "core" set of APIs. I think for these "core" type packages it's very important to keep drop-in compatibility as long as poss

Re: Back Compatibility

2008-01-22 Thread robert engels
A specific example: You have a criminal justice system that indexes past court cases. You do a search for cases involving Joe Smith because you are a judge and you want to review priors before sentencing. Similar issues with related cases, case history, etc. Is it better to return somethin

Re: Back Compatibility

2008-01-22 Thread Mark Miller
robert engels wrote: I think there are a lot of applications using Lucene where "whether it's lost a bit of data or not" is not acceptable. Yeah, and I have one of them. Which is why I would love the support you're talking about. But it's not there yet and I am just grateful that I can get my cus

Re: Back Compatibility

2008-01-22 Thread robert engels
I think there are a lot of applications using Lucene where "whether it's lost a bit of data or not" is not acceptable. However, it is probably fine for a web search, or intranet search. As to your first point, that is why the really great open-source projects (eclipse, open office) have a fin

Re: Back Compatibility

2008-01-22 Thread Chris Hostetter
: To paraphrase a dead English guy: A rose by any other name is still the same, : right? : : Basically, all the version number tick saves them from is having to read the : CHANGES file, right? Correct: i'm not disagreeing with your basic premise, just pointing out that it can be done with the c

Re: Back Compatibility

2008-01-22 Thread Doug Cutting
Grant Ingersoll wrote: Does anyone have experience w/ how other open source projects deal with this? Use abstract base classes instead of interfaces: they're much easier to evolve back-compatibly. In Hadoop, for example, we really wish that Mapper and Reducer were not interfaces and are very
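To illustrate why an abstract base class is easier to evolve than an interface: a later release can add a method with a default body, and subclasses written against the earlier release keep compiling and running. The sketch below uses made-up names (it is not Hadoop's Mapper or any Lucene class). As a side note, Java 8 and later also allow default methods on interfaces, which did not exist at the time of this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical extension point published as an abstract class. */
abstract class RecordHandler {
    /** Present since the first release; every subclass implements it. */
    abstract void handle(String record);

    /**
     * Added in a later release. Because it ships with a body, subclasses
     * compiled against the first release are unaffected. Adding the same
     * method to a pre-Java-8 interface would break every implementor.
     */
    void handleAll(List<String> records) {
        for (String record : records) {
            handle(record);
        }
    }
}

/** "User code" written against the first release; it never mentions handleAll. */
class CollectingHandler extends RecordHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    void handle(String record) {
        seen.add(record.toUpperCase());
    }
}

class AbstractBaseDemo {
    public static void main(String[] args) {
        CollectingHandler handler = new CollectingHandler();
        handler.handleAll(Arrays.asList("a", "b")); // new API, old subclass
        System.out.println(handler.seen);
    }
}
```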

Re: Back Compatibility

2008-01-22 Thread Grant Ingersoll
On Jan 22, 2008, at 3:45 PM, Chris Hostetter wrote: Perhaps the crux of the issue is that we as a community need to become more willing to crank out "major" releases ... if we just released 3.0 and now someone came up with the "Magic" field type and it's really magical and we want to star

Re: Back Compatibility

2008-01-22 Thread Mark Miller
I humbly disagree about NFS. Arguing about where free time was invested, or wasted, or inefficient, in an open source project just seems silly. One of the great benefits is esoteric work that would normally not be allowed for. NFS is easy. A lot of Lucene users don't care about Lucene. They jus

Re: Back Compatibility

2008-01-22 Thread robert engels
One more example on this. A lot of work was done on transaction support. I would argue that this falls way short of what is needed, since there is no XA transaction support. Since the lucene index (unless stored in an XA db) is a separate resource, it really needs XA support in order to be

Re: Back Compatibility

2008-01-22 Thread robert engels
I don't think group C is interested in bug fixes. I just don't see how Lucene is at all useful if the users are encountering any bug - so they either don't use that feature, or they have already developed a work-around (or they have patched the code in a way that avoids the bug, yet is spec

Re: Back Compatibility

2008-01-22 Thread Chris Hostetter
: I guess I am suggesting that instead of maintaining the whole major/minor : thing (not including file format) that we relax a bit and say that any given : feature we choose to remove or add has to go through two release cycles, which : according to your averages, would equal just over 1 year's ti

Re: Back Compatibility

2008-01-22 Thread Chris Hostetter
: If they are "no longer actively developing the portion of the code that's : broken, aren't seeking the new feature, etc", and they stay back on old : versions... isn't that exactly what we want? They can stay on the old version, : and new application development uses the newer version. This ba

Re: Back Compatibility

2008-01-21 Thread Grant Ingersoll
Me too... On Jan 18, 2008, at 4:33 AM, Uwe Schindler wrote: Sort of keeping all versions in the trunk at once? IndexWriter2 is IndexWriter with some features replaced with something better? And then IndexWriter3..? That's a bit messy if you ask me. But it would work. But terribly messy. B

RE: Back Compatibility

2008-01-18 Thread Uwe Schindler
> Sort of keeping all versions in the trunk at once? IndexWriter2 is > IndexWriter with some features replaced with something better? > And then IndexWriter3..? That's a bit messy if you ask me. But it > would work. But terribly messy. Brrr, I hate this. Microsoft does this always when they up

Re: Back Compatibility

2008-01-17 Thread robert engels
That wasn't what I was thinking. They would use lucene23.jar if they wanted the 2.3 API. Newer code uses the lucene30.jar for the 3.0 API. The others could continue to back-port 3.0 features to 2.3.X if they wished (and could do so without changing the API - private changes only). I thin

Re: Back Compatibility

2008-01-17 Thread Karl Wettin
On 18 Jan 2008, at 07:41, robert engels wrote: Look at similar problems and how they are handled in the JDK. The Date class has been notorious since its inception. The Calendar class is almost no better, now they are developing JSR-310 to replace both. Existing code can still use the Date or Calen

Re: Back Compatibility

2008-01-17 Thread robert engels
That brings us back to an earlier discussion: "if majority want to break compatibility, then we should do so, and the minority can back-port the changes to a previous release if they feel it is warranted." I don't understand why that isn't a viable approach. I agree that maintaining interfac

Re: Back Compatibility

2008-01-17 Thread Karl Wettin
On 18 Jan 2008, at 03:39, Grant Ingersoll wrote: Does anyone have experience w/ how other open source projects deal with this? Would be a pain to implement, but it could be done as libcompat. lucene-2.4-compat-core-3.0.jar -- karl

Re: Back Compatibility

2008-01-17 Thread Grant Ingersoll
On Jan 17, 2008, at 9:30 PM, DM Smith wrote: On Jan 17, 2008, at 7:57 PM, robert engels wrote: If they are "no longer actively developing the portion of the code that's broken, aren't seeking the new feature, etc", and they stay back on old versions... isn't that exactly what we want? Th

Re: Back Compatibility

2008-01-17 Thread Grant Ingersoll
I guess I am suggesting that instead of maintaining the whole major/minor thing (not including file format) that we relax a bit and say that any given feature we choose to remove or add has to go through two release cycles, which according to your averages, would equal just over 1 year's tim

Re: Back Compatibility

2008-01-17 Thread DM Smith
On Jan 17, 2008, at 7:57 PM, robert engels wrote: If they are "no longer actively developing the portion of the code that's broken, aren't seeking the new feature, etc", and they stay back on old versions... isn't that exactly what we want? They can stay on the old version, and new applic

Re: Back Compatibility

2008-01-17 Thread robert engels
If they are "no longer actively developing the portion of the code that's broken, aren't seeking the new feature, etc", and they stay back on old versions... isn't that exactly what we want? They can stay on the old version, and new application development uses the newer version. It woul

RE: Back Compatibility

2008-01-17 Thread Steven A Rowe
Hi Grant, On 01/17/2008 at 7:51 AM, Grant Ingersoll wrote: > Our minor release cycles are currently in the 3-6 months range > and our major release cycles are in the 1-1.5 year range. Since 2.0.0, including 2.3.0 - assuming it will be released in the next week or so - the minor release intervals

Re: Back Compatibility

2008-01-17 Thread DM Smith
Grant Ingersoll wrote: My reasoning for this solution: Our minor release cycles are currently in the 3-6 months range and our major release cycles are in the 1-1.5 year range. I think giving someone 4-8 (or whatever) months is more than enough time to prepare for API changes. I am not sur

Re: Back Compatibility

2008-01-17 Thread Grant Ingersoll
On Jan 17, 2008, at 4:14 PM, Doug Cutting wrote: Grant Ingersoll wrote: 1. We add a new section to CHANGES for each release, at the top where we can declare what deprecations will be removed in the _next_ release (major or minor) and also any interface API changes 2. When deprecating, the

Re: Back Compatibility

2008-01-17 Thread Doug Cutting
Grant Ingersoll wrote: 1. We add a new section to CHANGES for each release, at the top where we can declare what deprecations will be removed in the _next_ release (major or minor) and also any interface API changes 2. When deprecating, the @deprecated tag should declare what version it will be
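Point 2 of the quoted proposal, a deprecation note that names the release in which the method goes away, might look like this in Javadoc. The class, the method names, and the version number are hypothetical and chosen only for illustration; they are not taken from Lucene.

```java
/** Hypothetical writer-like class; none of these names come from Lucene. */
class ExampleWriter {
    private int maxTokens = 10000;

    /**
     * Old setter, kept for one more release so callers have time to migrate.
     *
     * @deprecated Use {@link #setMaxTokens(int)} instead. Scheduled for
     *             removal in release 2.9 (an illustrative version number).
     */
    @Deprecated
    void setTokenLimit(int limit) {
        setMaxTokens(limit); // delegate so old and new callers behave the same
    }

    /** Replacement API. */
    void setMaxTokens(int maxTokens) {
        if (maxTokens <= 0) {
            throw new IllegalArgumentException("maxTokens must be positive");
        }
        this.maxTokens = maxTokens;
    }

    int getMaxTokens() {
        return maxTokens;
    }
}

class DeprecationDemo {
    public static void main(String[] args) {
        ExampleWriter writer = new ExampleWriter();
        writer.setTokenLimit(42); // compiles with a deprecation warning
        System.out.println(writer.getMaxTokens());
    }
}
```

Stating the removal release in the tag itself lets users see the deadline from their compiler warnings or IDE, without having to read the CHANGES file.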

Re: Back Compatibility

2008-01-17 Thread Grant Ingersoll
On Jan 17, 2008, at 2:42 PM, Bill Janssen wrote: Examples of the former issue include things like removing deprecations sooner and the ability to add new methods to interfaces (both of these are not to be done ad-hoc) What would be the difference between ad-hoc and non-ad-hoc? Maybe bad ch

Re: Back Compatibility

2008-01-17 Thread Bill Janssen
> Examples of the former issue include things like removing > deprecations sooner and the ability to add new methods to interfaces > (both of these are not to be done ad-hoc) What would be the difference between ad-hoc and non-ad-hoc? Bill