I forgot: For old 2.4 indexes this would also work, as the compressed fields appear as uncompressed. So the detection with binary/string also works here.
Please note, as soon as you optimize or update old indexes, compressed fields get automatically decompressed. But there is nothing to do from your side! Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: [email protected] > -----Original Message----- > From: Ivan Vasilev [mailto:[email protected]] > Sent: Tuesday, December 29, 2009 12:19 PM > To: [email protected] > Subject: Re: Compressing field content with Lucene 3.0 > > 10x Uwe, > > That is fine :) > > Cheers, > Ivan > > Uwe Schindler wrote: > > It is still open to you how you handle it. On my projects I normally > only > > store string fields. If I compress them, they are binary. So I use a > similar > > approach like you to compress only large fields and store the others as > > string fields like before. > > > > When I retrieve the contents of the documents I use > > Document.getField("name") and then use the Fieldable.isBinary() to > detect if > > it was stored binary (and therefore compressed) or if it is string. > > > > ----- > > Uwe Schindler > > H.-H.-Meier-Allee 63, D-28213 Bremen > > http://www.thetaphi.de > > eMail: [email protected] > > > > > >> -----Original Message----- > >> From: Ivan Vasilev [mailto:[email protected]] > >> Sent: Tuesday, December 29, 2009 11:50 AM > >> To: [email protected] > >> Subject: Re: Compressing field content with Lucene 3.0 > >> > >> 10x Uwe for your answer, > >> > >> It is good news that data compressed with Field.Store.COMPRESS with 2.4 > >> will be retrieved properly from 3.0. > >> > >> From your answer I understand that in 3.0 there is no API way to > >> compress some of the values of some field and not to compress other > >> values for the same field. We should choose either to compress all of > >> the values of that field (and keep them as bin values) or not to > >> compress any of values for that field? > >> > >> Cheers, > >> Ivan > >> > >> > >> Uwe Schindler wrote: > >> > >>> If it is a 2.4 index, you can read it without any problems. It is only > >>> > >> no > >> > >>> longer possible to add fields with Field.Store.COMPRESS. Nothing more > >>> changed. > >>> > >>> If you want to add field with some compression, you have to compress > >>> yourself e.g. to a byte[]. You can then add this byte[] as a binary > >>> > >> stored > >> > >>> field. How you deal with this is in your responsibility. > >>> > >>> ----- > >>> Uwe Schindler > >>> H.-H.-Meier-Allee 63, D-28213 Bremen > >>> http://www.thetaphi.de > >>> eMail: [email protected] > >>> > >>> > >>> > >>>> -----Original Message----- > >>>> From: Ivan Vasilev [mailto:[email protected]] > >>>> Sent: Monday, December 28, 2009 7:13 PM > >>>> To: LUCENE MAIL LIST > >>>> Subject: Compressing field content with Lucene 3.0 > >>>> > >>>> Hi Guys, > >>>> > >>>> Could you give me advice how to deal with Lucene 3.0 with 2.4 indexes > >>>> that contain compressed data. > >>>> > >>>> Our case is following - we have code like this: > >>>> > >>>> Field.Store fieldStored = storedFieldsSet.contains(fieldName) ? > >>>> (fieldValue.length() >= COMPRESS_THRESHOLD ? Field.Store.COMPRESS : > >>>> Field.Store.YES) : Field.Store.NO; > >>>> Field.Index fieldIndexed = indexedFieldsSet.contains(fieldName) ? > >>>> Field.Index.ANALYZED : Field.Index.NO; > >>>> doc.add(new Field(fieldName, fieldValue, fieldStored, fieldIndexed)); > >>>> > >>>> So for one and the same field some values are compressed in the > indexes > >>>> other values are not. > >>>> Reading Lucene documentation I understand that all the values for > some > >>>> field should be either compressed or not compressed OR we should > >>>> > >> somehow > >> > >>>> keep which values for which fields are compressed. If this is so it > >>>> would be very difficult to migrate our system to Lucene 3.0 without > >>>> > >> need > >> > >>>> of re indexing. > >>>> If I am not right could you tell me please: > >>>> 1. How to read these indexes (created with the code above with Lucene > >>>> 2.4) with Lucene 3.0? I mean when fetching field values how to know > >>>> > >> when > >> > >>>> to use CompressionTools.decompressString(..) and when to skip it? > >>>> 2. Is there possibility in Lucene 3.0 again to compress values of > some > >>>> docs for certain field and for other docs the values for the same > field > >>>> not to be compressed? > >>>> > >>>> Cheers, > >>>> Ivan > >>>> > >>>> > >>>> --------------------------------------------------------------------- > >>>> To unsubscribe, e-mail: [email protected] > >>>> For additional commands, e-mail: [email protected] > >>>> > >>>> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: [email protected] > >>> For additional commands, e-mail: [email protected] > >>> > >>> > >>> __________ NOD32 3990 (20090406) Information __________ > >>> > >>> This message was checked by NOD32 antivirus system. > >>> http://www.eset.com > >>> > >>> > >>> > >>> > >>> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: [email protected] > >> For additional commands, e-mail: [email protected] > >> > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [email protected] > > For additional commands, e-mail: [email protected] > > > > > > __________ NOD32 3990 (20090406) Information __________ > > > > This message was checked by NOD32 antivirus system. > > http://www.eset.com > > > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
