[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652740#action_12652740 ]

Michael McCandless commented on LUCENE-1473:
--------------------------------------------

{quote}
> At the risk of pissing off the Lucene powerhouse, I feel I have to express
> some candor. I am growing more and more frustrated with the lack of the open
> source nature of this project and its unwillingness to work with the
> developer community. This is a rather trivial issue, and it is taking 7
> back-and-forth's to reiterate some standard Java behavior that has been
> around for years.
{quote}

Whoa! I'm sorry if my questions are giving this impression. I don't intend
them to. But I do still have real questions, because I don't think
Serialization is actually so simple. I too was surprised: what started as a
simple patch uncovered some real challenges once I dug into it.

{quote}
> Use case: deploying lucene in a distributed environment, we have a
> broker/server architecture. (standard stuff), we want roll out search servers
> with lucene 2.4 instance by instance. The problem is that the broker is
> sending a Query object to the searcher via java serialization at the server
> level, and the broker is running 2.3. And because of specifically this
> problem, 2.3 brokers cannot to talk to 2.4 search servers even when the Query
> object was not changed.
{quote}

OK that is a great use case -- thanks. That helps focus the many questions
here.

{quote}
> It is a known good java programming practice to include a suid to the class
> (as a static variable) when the object declares itself to be Serializable.
{quote}

But that alone gives a too-fragile back-compat solution because it's too
coarse. If we add field X to a class implementing Serializable, and must bump
the SUID, that's a hard break on back compat. So really we need to override
read/writeObject() or read/writeExternal() to do our own versioning.

Consider this actual example: RangeQuery, in 2.9, now separately stores
"boolean includeLower" and "boolean includeUpper".
In versions <= 2.4, it only stores "boolean inclusive". This means we can't
rely on the JVM's default versioning for serialization.

{quote}
> The serialVersionUID (suid) is a long because it is a java thing.
{quote}

But that's only if you rely on the JVM's default serialization. If we
implement our own (overriding read/writeObject or read/writeExternal) we
don't have to use "long SUID".

{quote}
> The problem was two different people did the release with different compilers.
{quote}

I think it's more likely that the addition of a new ctor to Term (one that
takes only the String field) changed the SUID.

{quote}
> If it is not meant to be serialized, why did it implement Serializable.
{quote}

Because there are two different things it can "mean" when a class implements
Serializable, and I think that's the core disconnect/challenge of this issue.

The first meaning (let's call it "live serialization") is: "within the same
version of Lucene you can serialize/deserialize this object".

The second meaning (let's call it "long-term persistence") is: "you can
serialize this object in version X of Lucene and later deserialize it using a
newer version Y of Lucene".

Lucene, today, only guarantees "live serialization", and that's the intention
when "implements Serializable" is added to a class.

But what's now being asked for (expected) with this issue is "long-term
persistence", which is really a very different beast and a much taller order.
With it come a number of challenges that warrant scrutiny:

* What's our back-compat policy for "long-term persistence"?

* The storage protocol must have a version header, so future changes can
  switch on that and decode older formats.

* We need strong test cases that deserialize older versions of these
  serialized classes so we don't accidentally break it.

* We should look carefully at the protocol and not waste bytes if we can
  (1 byte vs 8 byte version header).
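
To make the version-header point concrete, here is a minimal sketch of what "doing our own versioning" could look like. It is not the LUCENE-1473 patch and not Lucene's RangeQuery: the class VersionedRange, the constants V1_INCLUSIVE and V2_SEPARATE, and the helpers writeFields, readFields and roundTrip are all hypothetical names made up for illustration. It assumes a 1-byte header and a reader that maps the old single "inclusive" flag onto both new flags. Note that the JVM still compares serialVersionUID for Externalizable classes, so the sketch pins it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for a range query; NOT Lucene's RangeQuery.
class VersionedRange implements Externalizable {
    // Pinned explicitly: the default value is derived from the class's
    // members, so adding a constructor or field would otherwise change it.
    private static final long serialVersionUID = 1L;

    static final byte V1_INCLUSIVE = 1; // old layout: single "inclusive" flag
    static final byte V2_SEPARATE = 2;  // new layout: includeLower + includeUpper

    String lower;
    String upper;
    boolean includeLower;
    boolean includeUpper;

    // Externalizable requires a public no-arg constructor.
    public VersionedRange() {}

    VersionedRange(String lower, String upper, boolean includeLower, boolean includeUpper) {
        this.lower = lower;
        this.upper = upper;
        this.includeLower = includeLower;
        this.includeUpper = includeUpper;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        writeFields(out);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        readFields(in);
    }

    // Always writes the newest layout, led by a 1-byte version header.
    void writeFields(DataOutput out) throws IOException {
        out.writeByte(V2_SEPARATE);
        out.writeUTF(lower);
        out.writeUTF(upper);
        out.writeBoolean(includeLower);
        out.writeBoolean(includeUpper);
    }

    // Switches on the version header so older payloads still decode.
    void readFields(DataInput in) throws IOException {
        byte version = in.readByte();
        lower = in.readUTF();
        upper = in.readUTF();
        switch (version) {
            case V1_INCLUSIVE: {
                boolean inclusive = in.readBoolean();
                includeLower = inclusive;
                includeUpper = inclusive;
                break;
            }
            case V2_SEPARATE:
                includeLower = in.readBoolean();
                includeUpper = in.readBoolean();
                break;
            default:
                throw new IOException("Unknown format version: " + version);
        }
    }

    // Serializes and deserializes through standard Java object streams.
    static VersionedRange roundTrip(VersionedRange q) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(q);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (VersionedRange) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        VersionedRange copy = roundTrip(new VersionedRange("a", "m", true, false));
        System.out.println(copy.includeLower + " " + copy.includeUpper);
    }
}
```

Keeping the field encoding in writeFields/readFields, separate from the Externalizable entry points, is one way to let a test feed a hand-built old-format payload to the reader, which is the kind of back-compat test case the list above asks for.
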

These issues are the same issues we face with the index file format, because
that is also long-term persistence.

> Implement Externalizable in main top level searcher classes
> -----------------------------------------------------------
>
>                 Key: LUCENE-1473
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1473
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-1473.patch
>
>
> To maintain serialization compatibility between Lucene versions, major
> classes can implement Externalizable. This will make Serialization faster
> due to no reflection required and maintain backwards compatibility.