[ 
https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652740#action_12652740
 ] 

Michael McCandless commented on LUCENE-1473:
--------------------------------------------


{quote}
> At the risk of pissing off the Lucene powerhouse, I feel I have to express 
> some candor. I am growing more and more frustrated with the lack of the open 
> source nature of this project and its unwillingness to work with the 
> developer community. This is a rather trivial issue, and it is taking 7 
> back-and-forth's to reiterate some standard Java behavior that has been 
> around for years.
{quote}
Whoa!  I'm sorry if my questions are giving this impression.  I don't
intend to.

But I do have real questions, still, because I don't think
Serialization is actually so simple.  I too was surprised: what
started as a simple patch uncovered some real challenges once I dug
into it.

{quote}
>Use case: deploying lucene in a distributed environment, we have a 
>broker/server architecture (standard stuff), and we want to roll out search 
>servers with lucene 2.4 instance by instance. The problem is that the broker 
>is sending a Query object to the searcher via java serialization at the 
>server level, and the broker is running 2.3. And because of specifically this 
>problem, 2.3 brokers cannot talk to 2.4 search servers even when the Query 
>object was not changed.
{quote}
OK that is a great use case -- thanks.  That helps focus the many
questions here.

{quote}
> It is a known good java programming practice to include a suid to the class 
> (as a static variable) when the object declares itself to be Serializable.
{quote}

But that alone gives a back-compat solution that is too fragile,
because it's too coarse.  If we add field X to a class implementing
Serializable and must bump the SUID, that's a hard break on back
compat.  So really we need to override read/writeObject() or
read/writeExternal() to do our own versioning.
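
To make "our own versioning" concrete, here is a minimal sketch of what
overriding writeObject()/readObject() with a format version could look
like.  VersionedTerm, FORMAT_VERSION and the toBytes/fromBytes helpers
are hypothetical names made up for this illustration; this is not code
from Lucene or from the attached patch.

```java
import java.io.*;

// Hypothetical class for illustration only; not an actual Lucene class.
class VersionedTerm implements Serializable {
    // Pinned, so adding methods or constructors does not change it.
    private static final long serialVersionUID = 1L;
    // Our own format version, written ahead of the object's data.
    private static final byte FORMAT_VERSION = 1;

    // transient: we write these ourselves instead of using the default format.
    private transient String field;
    private transient String text;

    VersionedTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    String field() { return field; }
    String text() { return text; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeByte(FORMAT_VERSION);
        out.writeUTF(field);
        out.writeUTF(text);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        byte version = in.readByte();
        if (version != FORMAT_VERSION) {
            // A later format would add a branch here rather than fail.
            throw new InvalidObjectException("unsupported format version " + version);
        }
        field = in.readUTF();
        text = in.readUTF();
    }

    // Helper: serialize to a byte array, wrapping checked exceptions.
    static byte[] toBytes(Serializable o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: deserialize from a byte array, wrapping checked exceptions.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        VersionedTerm copy =
            (VersionedTerm) fromBytes(toBytes(new VersionedTerm("body", "lucene")));
        System.out.println(copy.field() + ":" + copy.text());
    }
}
```

The point of the version byte is that a future change to the fields
only needs a new branch in readObject(), while the SUID stays fixed.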

Consider this actual example: RangeQuery, in 2.9, now separately
stores "boolean includeLower" and "boolean includeUpper".  In versions
<= 2.4, it only stores "boolean inclusive".  This means we can't rely
on the JVM's default versioning for serialization.
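
As a rough sketch of how a reader could cope with that change, the
decoder below switches on a version byte and maps the old single flag
onto both new flags.  RangeBounds, the version constants and the byte
layout are all assumptions made for illustration.  In particular, it
assumes the old writer also emitted a version byte first, which the
default serialization in 2.4 does not do.

```java
import java.io.*;

// Hypothetical stand-in for the bound flags of RangeQuery; not Lucene code.
final class RangeBounds {
    static final byte V_INCLUSIVE = 1;    // assumed old layout: one boolean
    static final byte V_LOWER_UPPER = 2;  // assumed new layout: two booleans

    final boolean includeLower;
    final boolean includeUpper;

    RangeBounds(boolean includeLower, boolean includeUpper) {
        this.includeLower = includeLower;
        this.includeUpper = includeUpper;
    }

    // Always writes the newest layout.
    void write(DataOutput out) throws IOException {
        out.writeByte(V_LOWER_UPPER);
        out.writeBoolean(includeLower);
        out.writeBoolean(includeUpper);
    }

    // Reads either layout, switching on the version byte.
    static RangeBounds read(DataInput in) throws IOException {
        byte version = in.readByte();
        switch (version) {
            case V_INCLUSIVE: {
                // Old format: one flag applies to both ends of the range.
                boolean inclusive = in.readBoolean();
                return new RangeBounds(inclusive, inclusive);
            }
            case V_LOWER_UPPER: {
                boolean lower = in.readBoolean();
                boolean upper = in.readBoolean();
                return new RangeBounds(lower, upper);
            }
            default:
                throw new IOException("unknown version " + version);
        }
    }

    // Convenience wrappers over byte arrays, wrapping checked exceptions.
    byte[] encode() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static RangeBounds decode(byte[] bytes) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Bytes as the assumed old writer would have produced them.
        RangeBounds old = decode(new byte[] {V_INCLUSIVE, 1});
        System.out.println(old.includeLower + " " + old.includeUpper);
    }
}
```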

{quote}
> The serialVersionUID (suid) is a long because it is a java thing.
{quote}

But, that's only if you rely on the JVM's default serialization.  If
we implement our own (overriding read/writeObject or
read/writeExternal) we don't have to use "long SUID".

{quote}
> The problem was two different people did the release with different compilers.
{quote}

I think it's more likely that the addition of a new ctor to Term (one
that takes only a String field) changed the SUID.
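
For background: when a class declares no serialVersionUID, the JVM
computes one from the class's shape, and that computation includes the
non-private constructors.  A small sketch for inspecting the value the
JVM uses follows; SuidDemo, Computed and Pinned are hypothetical names
for illustration, and only ObjectStreamClass is a real JDK class here.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class; the nested classes are not Lucene classes.
class SuidDemo {
    // No explicit SUID: the JVM derives one from fields, methods and ctors.
    static class Computed implements Serializable {
        String field;
        Computed(String field) { this.field = field; }
    }

    // Explicit SUID: stays the same no matter which ctors are added later.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        String field;
        Pinned(String field) { this.field = field; }
    }

    // Returns the SUID the serialization runtime uses for a Serializable class.
    static long suidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("computed: " + suidOf(Computed.class));
        System.out.println("pinned:   " + suidOf(Pinned.class));
    }
}
```

Running this before and after adding a constructor to Computed is one
way to check whether a given change moves the computed value.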

{quote}
> If it is not meant to be serialized, why did it implement Serializable.
{quote}

Because there are two different things it can "mean" when a class
implements Serializable, and I think that's the core
disconnect/challenge to this issue.

The first meaning (let's call it "live serialization") is: "within the
same version of Lucene you can serialize/deserialize this object".

The second meaning (let's call it "long-term persistence") is: "you
can serialize this object in version X of Lucene and later deserialize
it using a newer version Y of Lucene".

Lucene, today, only guarantees "live serialization", and that's the
intention when "implements Serializable" is added to a class.

But what's now being asked for (expected) with this issue is
"long-term persistence", which is really a very different beast and a
much taller order.  With it come a number of challenges that warrant
scrutiny:

  * What's our back-compat policy for "long-term persistence"?

  * The storage protocol must have a version header, so future changes
    can switch on that and decode older formats.

  * We need strong test cases that deserialize older versions of these
    serialized classes so we don't accidentally break it.

  * We should look carefully at the protocol and not waste bytes if we
    can (1 byte vs 8 byte version header).
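
To illustrate the last two points, here is a small sketch that
compares a 1 byte header with an 8 byte one and keeps a frozen byte
fixture that a test can decode to catch accidental format breaks.
HeaderSketch, its methods and GOLDEN_V1 are hypothetical names, and the
payload layout (header followed by writeUTF) is only an assumption for
the example.

```java
import java.io.*;

// Hypothetical sketch; the layout is an assumption, not a Lucene format.
class HeaderSketch {
    // Frozen bytes of "version 1": header byte, then writeUTF("hi").
    static final byte[] GOLDEN_V1 = {1, 0, 2, 'h', 'i'};

    // Writes a 1 byte version header followed by the payload.
    static byte[] withByteHeader(String payload) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(1);
            out.writeUTF(payload);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes an 8 byte (long) version header followed by the payload.
    static byte[] withLongHeader(String payload) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeLong(1L);
            out.writeUTF(payload);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes the 1 byte header format; rejects versions it does not know.
    static String readByteHeader(byte[] bytes) {
        try {
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes));
            byte version = in.readByte();
            if (version != 1) {
                throw new IOException("unknown version " + version);
            }
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withByteHeader("hi").length);
        System.out.println(withLongHeader("hi").length);
        System.out.println(readByteHeader(GOLDEN_V1));
    }
}
```

A real back-compat test would keep fixtures like GOLDEN_V1 for every
released format and assert that the current reader still decodes them.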

These issues are the same issues we face with the index file format,
because that is also long-term persistence.


> Implement Externalizable in main top level searcher classes
> -----------------------------------------------------------
>
>                 Key: LUCENE-1473
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1473
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-1473.patch
>
>
> To maintain serialization compatibility between Lucene versions, major 
> classes can implement Externalizable.  This will make Serialization faster 
> due to no reflection required and maintain backwards compatibility.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

