If you absolutely cannot re-index, and you have *no* access to the data again - you are one ballsy mofo to upgrade to a new version of Lucene for "features". It means you likely BASE jump in your free time?

On 04/15/2010 10:14 AM, Erick Erickson wrote:
Coming in late to the discussion, and without really understanding the
underlying Lucene issues, but...

The size of the problem of reindexing is under-appreciated, I think. Somewhere
in my company is the original data I indexed. But the effort it would take to
resurrect it is O(unknown). An unfortunate reality of commercial products is
that they often receive very little love for extended periods of time until,
all of a sudden, more work is required. There ensues an extended period of
re-orientation, even if the people who originally worked on the project are
still around.

*Assuming* the data is available to reindex (and there are many reasons
besides poor practice on the part of the company that it may not be),
remembering/finding out exactly which of the various backups you made
of the original data is the one that's actually in your product can be highly
non-trivial. This is compounded by the fact that the product manager will be
adamant about "Do NOT surprise our customers".

So being in the spot of saying "I *think* I have the original data set, and
I *think* I have the original code used to index it, and if I get a new version
of Lucene I *think* I can recreate the index. After all that effort is
completed, I *think* the user will see the expected changes, but we won't know
until we try it" puts me in a very precarious position.

This assumes that I have a reasonable chance of getting the original data. But say I've been indexing data from a live feed. Sure as hell hope I stored the
data somewhere, because going back to the source and saying "please resend
me 10 years' worth of data that I have in my index" is...er...hard. Or say
that the original provider has gone out of business, or the licensing arrangement specifies a one-time transmission of data that may not be retained in its original
form or.....

The point of this long diatribe is that there are many reasons why reindexing is impossible and/or impractical. Making any decision that requires reindexing for a new version is locking a user into a version potentially forever. We should not underestimate how painful that can be and should never think that "just reindex"
is acceptable in all situations. It's not. Period.

Be very clear that some number of Lucene users will absolutely not be able
to reindex. We may still make a decision that requires this, but let's make it
without deluding ourselves that it's a possible solution for everyone.

So an upgrade tool seems like a reasonable compromise. I agree that being
hampered in what we can develop in Lucene by having to accommodate
reading old indexes slows new features etc. It's always nice to be
able to work without dealing with pesky legacy issues <G>. Perhaps
splitting out the indexing upgrades into a separate program lets us
accommodate both concerns.

FWIW
Erick

On Thu, Apr 15, 2010 at 9:42 AM, Danil ŢORIN <torin...@gmail.com> wrote:

    True. Just need the tool.

    On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot <ear...@gmail.com> wrote:
    >
    > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley <yo...@lucidimagination.com> wrote:
    > > Seamless online upgrades have their place too... say you are upgrading
    > > one server at a time in a cluster.
    >
    > Nothing here that can't be solved with an upgrade tool. Down one
    > server, upgrade index, upgrade software, up.
    >
    > --
    > Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
    > Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
    > ICQ: 104465785
    >
    > ---------------------------------------------------------------------
    > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
    > For additional commands, e-mail: java-dev-h...@lucene.apache.org
    >

--
- Mark

http://www.lucidimagination.com



