Hello all,

I've been thinking about the strategy for upgrading to Lucene 3. Ignoring new features for the moment, the main migration issues are:
1- Store.COMPRESS is not supported anymore
2- Some Analyzers and the QueryParser require an additional constructor parameter
3- The DocIdSet interface (used in filters) changed; it already changed in Lucene 2.9.x, which makes a step-by-step migration harder

Point 2 is not a great problem (I have a patch ready), but points 1 and 3 are connected: DocIdSet must be solved as soon as we move to 2.9, while 2.9 is a requirement to implement COMPRESS in a different way, if we choose to.

It appears we can't maintain binary index compatibility, but supporting the feature is an option. Lucene 3 will transparently decompress an old-style compressed field when reading it, and it will even decompress all fields during optimization, effectively transforming the index to the new format.

If we still want to support the contract of org.hibernate.search.annotations.Store.COMPRESS, we will have to compress the field ourselves, possibly using a pluggable strategy. Assuming org.apache.lucene.document.CompressionTools as the default implementation, we can provide a backwards-compatible API, but the resulting index is going to have a different format.

A future improvement could be to allow any external compression/decompression function (a user-provided implementation). Any idea where this could be plugged in? Maybe replace the Store enum with an interface?

What should I do to solve HSEARCH-425?
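To make the pluggable-strategy idea above a bit more concrete, here is a minimal sketch of what such an extension point could look like. Everything in it is an assumption for illustration only: FieldCompressor, DeflateFieldCompressor and CompressionSketch are hypothetical names, not existing Hibernate Search types, and java.util.zip is used as a stand-in so the example runs without Lucene on the classpath. A real default implementation would presumably delegate to CompressionTools.compress(byte[]) and CompressionTools.decompress(byte[]) instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical extension point; not an existing Hibernate Search interface.
interface FieldCompressor {
    byte[] compress(byte[] raw);
    byte[] decompress(byte[] stored);
}

// Illustrative default strategy based on java.util.zip (stand-in for CompressionTools).
final class DeflateFieldCompressor implements FieldCompressor {

    @Override
    public byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    @Override
    public byte[] decompress(byte[] stored) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and no more input: the stored value is truncated or invalid.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("incomplete compressed field");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("cannot decompress stored field", e);
        } finally {
            inflater.end();
        }
    }
}

final class CompressionSketch {

    // Compress a field value as it would be stored, then read it back.
    static String roundTrip(FieldCompressor compressor, String value) {
        byte[] stored = compressor.compress(value.getBytes(StandardCharsets.UTF_8));
        return new String(compressor.decompress(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        FieldCompressor compressor = new DeflateFieldCompressor();
        String value = "a stored field value, a stored field value, a stored field value";
        System.out.println(value.equals(roundTrip(compressor, value)));
    }
}
```

With something along these lines, Store.COMPRESS would keep its contract at the API level while the compression itself happens before the value is handed to Lucene as a binary stored field; a user-provided strategy would just be another FieldCompressor implementation.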
The options I've considered so far:

A) Deprecate Store.COMPRESS, without providing an alternative
B) Change the implementation to make use of Lucene's CompressionTools

CompressionTools only exists since Lucene 2.9, so an upgrade is mandatory, but other features are going to break, like filters (org.hibernate.search.filter.AndDocIdSet needs to implement an updated interface). So basically I'll need to branch, break some tests temporarily, or provide a single huge patch refactoring some features and tests at the same time :-/

An alternative to branching would be to solve the COMPRESS issue later and focus on the build-breaking changes first. In practice this would break the compression feature until it's fixed, but this shouldn't be a great problem as it is going to change anyway...

WDYT? I'm working on the new DocIdSet; even that will be a considerable change.

Cheers,
Sanne

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev