Coming in late to the discussion, and without really understanding the underlying Lucene issues, but...
I think the size of the reindexing problem is under-appreciated. Somewhere in my company is the original data I indexed, but the effort it would take to resurrect it is O(unknown). An unfortunate reality of commercial products is that they often receive very little love for extended periods of time until, all of a sudden, more work is required. There ensues an extended period of re-orientation, even if the people who originally worked on the project are still around.

*Assuming* the data is available to reindex (and there are many reasons besides poor practice on the part of the company that it may not be), remembering/finding out exactly which of the various backups you made of the original data is the one that's actually in your product can be highly non-trivial. This is compounded by the fact that the product manager will be adamant about "Do NOT surprise our customers". So having to say "I *think* I have the original data set, and I *think* I have the original code used to index it, and if I get a new version of Lucene I *think* I can recreate the index and I *think* that the user will see the expected change. After all that effort is completed, I *think* we'll see the expected changes, but we won't know until we try it" puts me in a very precarious position.

This assumes that I have a reasonable chance of getting the original data. But say I've been indexing data from a live feed. Sure as hell hope I stored the data somewhere, because going back to the source and saying "please resend me 10 years worth of data that I have in my index" is...er...hard. Or say that the original provider has gone out of business, or the licensing arrangement specifies a one-time transmission of data that may not be retained in its original form or.....

The point of this long diatribe is that there are many reasons why reindexing is impossible and/or impractical.
Making any decision that requires reindexing for a new version potentially locks a user into their current version forever. We should not underestimate how painful that can be and should never think that "just reindex" is acceptable in all situations. It's not. Period.

Be very clear that some number of Lucene users will absolutely not be able to reindex. We may still make a decision that requires this, but let's make it without deluding ourselves that it's a possible solution for everyone.

So an upgrade tool seems like a reasonable compromise. I agree that being hampered in what we can develop in Lucene by having to accommodate reading old indexes slows new features etc. It's always nice to be able to work without dealing with pesky legacy issues <G>. Perhaps splitting out the indexing upgrades into a separate program lets us accommodate both concerns.

FWIW
Erick

On Thu, Apr 15, 2010 at 9:42 AM, Danil ŢORIN <torin...@gmail.com> wrote:

> True. Just need the tool.
>
> On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot <ear...@gmail.com> wrote:
> >
> > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
> > > Seamless online upgrades have their place too... say you are upgrading
> > > one server at a time in a cluster.
> >
> > Nothing here that can't be solved with an upgrade tool. Down one
> > server, upgrade index, upgrade software, up.
> >
> > --
> > Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)