Hello,

I am trying to retrieve values stored with the StoredField type from inside a 
Collector, at the point where its setNextReader(AtomicReaderContext) method is 
called. I used the following FieldCache method, but it does not return any 
values:
      FieldCache.DEFAULT.getTerms(indexReader, field, false);

Retrieving the values from the document itself during the call to 
Collector.collect(int) works fine, but it is much slower than fetching all 
terms at once with the method above.

My question:
Is there a way to get binary content with performance similar to the approach 
described above, i.e. retrieving the field terms when the reader is set in a 
Collector?


Incidentally, the approach works fine for any stored field that is also 
indexed, as in the following code snippet:

            final FieldType fieldType = new FieldType();
            {
                fieldType.setStored(true);
                // need to index, otherwise no fast retrieval of terms in
                // collector is possible
                fieldType.setIndexed(true);
                fieldType.setIndexOptions(IndexOptions.DOCS_ONLY);
                fieldType.setTokenized(false);
                fieldType.setOmitNorms(true);
                fieldType.freeze();
            }

            // fieldValue is of type String
            Field field = new Field(fieldName, fieldValue, fieldType);

However, this does not let me store binary content (i.e. values in byte[] 
arrays) the way StoredField does, because the constructor expects field content 
of type String.
I have tried converting the content to base64-encoded strings, but decoding 
those strings back into byte arrays is quite expensive for large indexes.
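
To make the base64 workaround concrete, here is a minimal sketch of the round 
trip. It assumes java.util.Base64 (Java 8 or later), which may not be the codec 
actually used, and the class and method names are hypothetical. It shows only 
the encode/decode step, not the Lucene indexing or collector code:

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of Lucene: illustrates the base64 round trip only.
final class Base64RoundTrip {

    // Encode binary content so it can be handed to a String-based Field constructor.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode a term read back in the collector; allocates a new byte[] on every call.
    static byte[] decode(String term) {
        return Base64.getDecoder().decode(term);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, (byte) 0xFF, 127, -128};
        String term = encode(original);
        byte[] restored = decode(term);
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

The decode step runs once per value, so on a large index the allocations and 
character-to-byte conversion add up, which is the cost I would like to avoid.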


Thanks for your advice.

Best regards,

Josef
