[ 
https://issues.apache.org/jira/browse/LUCENE-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-1757.
------------------------------------

    Resolution: Won't Fix

SPRING_CLEANING_2013 JIRAS. I think this has been long since changed.
                
> Support adding a "stored" field via a Reader
> --------------------------------------------
>
>                 Key: LUCENE-1757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1757
>             Project: Lucene - Core
>          Issue Type: Wish
>          Components: core/index
>            Reporter: Tim Smith
>
> All current constructors for Field() that take a Reader explicitly state 
> that the field will not be stored.
> It would be highly desirable to support adding a stored field to a Document 
> using a Reader (or some special interface that can go directly to the source 
> data).
> This could greatly reduce the memory required for adding very large stored 
> fields (if used efficiently by IndexWriter).
> This would support two primary use cases:
> 1. Create a stored field from an arbitrary CharSequence.
> I may internally use a MutableString-type class during document processing to 
> conserve memory; however, I currently have to convert it to a 
> String before adding it as a stored field. If I could just pass a Reader 
> for this mutable string/char sequence, indexing could be smart enough to 
> avoid allocating double the space.
> 2. Create a stored field from a file on disk.
> When adding large stored fields, the actual value may be kept on disk to 
> reduce memory use during indexing. To use it as a stored field today, 
> it has to be loaded entirely into memory as a String/byte[] 
> before it can be added to a Field() (this could be quite large and provoke 
> an OutOfMemoryError).
> Document retrieval considerations:
> It would also be ideal if, when fetching a Document from the index, you 
> could specify a "max string size" for the returned stored field.
> If the field were larger than this cutoff, a Reader going directly to disk 
> would be returned instead of a String/byte[]. This would again allow smart 
> applications to save memory during document retrieval (this would be 
> especially nice for highlighting, as the source data could be streamed 
> right into the highlighter).
> It would also be acceptable if some new interface were accepted instead 
> of Reader.
> This could be some form of "sized" input stream that reports the total 
> number of bytes/chars it will produce.
> Example:
> {code}
> import java.io.InputStream;
> import java.io.Reader;
>
> public interface FieldSource {
>   /** Size of the stored field value: in bytes if isBinary() is true,
>    *  in chars if isBinary() is false. */
>   public int size();
>   /** If true, use getInputStream(); if false, use getReader(). */
>   public boolean isBinary();
>   /** Input stream for pulling binary data from its source
>    *  (null if isBinary() is false). */
>   public InputStream getInputStream();
>   /** Reader for reading character data (null if isBinary() is true). */
>   public Reader getReader();
> }
> {code}
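
As a rough illustration of the two use cases described in the issue, the sketch below implements the proposed FieldSource interface twice using only the JDK. The class names CharSequenceFieldSource and FileFieldSource are hypothetical, and the interface is redeclared only so that the block compiles on its own. Nothing here is part of Lucene's API, and the sketch does not show how IndexWriter would consume such a source.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Restates the interface proposed in the issue so the sketch is self-contained.
interface FieldSource {
    int size();
    boolean isBinary();
    InputStream getInputStream();
    Reader getReader();
}

/** Hypothetical (use case 1): streams any CharSequence without calling toString(). */
final class CharSequenceFieldSource implements FieldSource {
    private final CharSequence chars;

    CharSequenceFieldSource(CharSequence chars) { this.chars = chars; }

    public int size() { return chars.length(); }
    public boolean isBinary() { return false; }
    public InputStream getInputStream() { return null; }

    public Reader getReader() {
        return new Reader() {
            private int pos = 0;

            @Override
            public int read(char[] buf, int off, int len) {
                if (len == 0) return 0;
                int remaining = chars.length() - pos;
                if (remaining <= 0) return -1;          // end of data
                int n = Math.min(len, remaining);
                for (int i = 0; i < n; i++) {
                    buf[off + i] = chars.charAt(pos + i);
                }
                pos += n;
                return n;
            }

            @Override
            public void close() { /* nothing to release */ }
        };
    }
}

/** Hypothetical (use case 2): binary value that stays on disk until it is read. */
final class FileFieldSource implements FieldSource {
    private final Path path;

    FileFieldSource(Path path) { this.path = path; }

    public int size() {
        try {
            // The proposed size() returns int, so files over 2 GB fail here.
            return Math.toIntExact(Files.size(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public boolean isBinary() { return true; }

    public InputStream getInputStream() {
        try {
            return Files.newInputStream(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public Reader getReader() { return null; }
}
```

A caller would check isBinary() first and then pull data in fixed-size chunks from whichever accessor is non-null, so the full value never has to be materialized as a single String or byte[]. The int return type of size() is kept as proposed in the issue; a long might be a better fit for very large files.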

