[
https://issues.apache.org/jira/browse/LUCENE-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erick Erickson resolved LUCENE-1757.
------------------------------------
Resolution: Won't Fix
SPRING_CLEANING_2013 JIRAS. I think this has been long since changed.
> Support adding a "stored" field via a Reader
> --------------------------------------------
>
> Key: LUCENE-1757
> URL: https://issues.apache.org/jira/browse/LUCENE-1757
> Project: Lucene - Core
> Issue Type: Wish
> Components: core/index
> Reporter: Tim Smith
>
> All current constructors for Field() that take a Reader explicitly say the
> field will not be stored.
> It would be highly desirable to support adding a stored field to a Document
> using a Reader (or some special interface that can go direct to the source
> data).
> This could greatly reduce the memory required for adding very large stored
> fields (if used efficiently by IndexWriter).
> This will support two primary use cases:
> 1. Creating a stored field from an arbitrary CharSequence
> I may internally use a MutableString-type class during document processing to
> conserve memory; however, I would currently have to convert this to a
> String prior to adding it as a stored field. If I could just pass a Reader
> for this mutable string/char sequence, indexing could be smart enough to not
> require allocating double the space.
> 2. Creating a stored field from a file on disk
> If adding large stored fields, the actual value may be on disk to reduce
> memory use during indexing. In order to support using this as a stored field,
> it would currently have to be entirely loaded into memory as a String/byte[]
> in order to be added to a Field() (this could be quite large and provoke
> an OutOfMemoryError).
> Document retrieval considerations:
> It would then also be ideal if, when fetching a Document from the index, you
> could specify a "max string size" for the returned stored field.
> If the field was larger than this cutoff, a Reader going directly to disk
> would be returned instead of a String/byte[]. This would again allow smart
> applications to save memory during document retrieval (this would be
> especially nice for highlighting, as the source data could be streamed
> right into the highlighter).
> It would also be acceptable if some new interface were accepted instead
> of Reader.
> This could be some form of "sized" input stream that reports the number
> of bytes/chars that will be produced in total.
> ex:
> {code}
> import java.io.InputStream;
> import java.io.Reader;
>
> public interface FieldSource {
>   /**
>    * Size of stored field value (in bytes if isBinary() is true, in chars if
>    * isBinary() is false)
>    */
>   public int size();
>
>   /** if true, use getInputStream(), if false, use getReader() */
>   public boolean isBinary();
>
>   /**
>    * Get the input stream for pulling this from its source (null if
>    * isBinary() is false)
>    */
>   public InputStream getInputStream();
>
>   /** Get the reader for reading character data (null if isBinary() is true) */
>   public Reader getReader();
> }
> {code}
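For illustration only, here is a minimal sketch of how the FieldSource interface proposed in the quoted issue might be implemented for its two use cases (a mutable CharSequence and a file on disk). Nothing here is Lucene API: FieldSource exists only as a proposal in this issue, and the class names CharSequenceFieldSource, FileFieldSource and FieldSourceDemo are hypothetical names chosen for the example. It assumes values fit in an int, as the proposed size() signature implies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** The interface proposed in the issue, repeated so the sketch compiles on its own. */
interface FieldSource {
    int size();
    boolean isBinary();
    InputStream getInputStream();
    Reader getReader();
}

/** Use case 1 (hypothetical class): stream chars from a CharSequence without copying to a String. */
final class CharSequenceFieldSource implements FieldSource {
    private final CharSequence value;

    CharSequenceFieldSource(CharSequence value) { this.value = value; }

    public int size() { return value.length(); }
    public boolean isBinary() { return false; }
    public InputStream getInputStream() { return null; } // character data only

    public Reader getReader() {
        return new Reader() {
            private int pos = 0;

            @Override
            public int read(char[] buf, int off, int len) {
                if (len == 0) return 0;
                if (pos >= value.length()) return -1; // end of data
                int n = Math.min(len, value.length() - pos);
                for (int i = 0; i < n; i++) {
                    buf[off + i] = value.charAt(pos + i);
                }
                pos += n;
                return n;
            }

            @Override
            public void close() { /* nothing to release */ }
        };
    }
}

/** Use case 2 (hypothetical class): binary value left on disk until it is actually read. */
final class FileFieldSource implements FieldSource {
    private final Path path;

    FileFieldSource(Path path) { this.path = path; }

    public int size() {
        try {
            // Throws ArithmeticException if the file is larger than Integer.MAX_VALUE bytes.
            return Math.toIntExact(Files.size(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public boolean isBinary() { return true; }

    public InputStream getInputStream() {
        try {
            return Files.newInputStream(path); // caller is responsible for closing
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public Reader getReader() { return null; } // binary data only
}

final class FieldSourceDemo {
    public static void main(String[] args) throws IOException {
        FieldSource chars = new CharSequenceFieldSource(new StringBuilder("a mutable field value"));
        System.out.println(chars.size() + " chars, binary=" + chars.isBinary());

        Path tmp = Files.createTempFile("stored-field", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        FieldSource file = new FileFieldSource(tmp);
        try (InputStream in = file.getInputStream()) {
            System.out.println(file.size() + " bytes, first byte=" + in.read());
        }
        Files.delete(tmp);
    }
}
```

Because the proposed interface declares no checked exceptions, the file-backed sketch wraps IOException in UncheckedIOException; a real design might prefer to let getInputStream() and getReader() throw IOException directly.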