[ 
https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039436#comment-13039436
 ] 

Christopher Currens commented on LUCENENET-417:
-----------------------------------------------

Good call.  I think I was confusing storing the whole field with storing the 
term vectors, which Lucene.Net can do.

I still think that, at the very least, being able to store binary values via a 
stream is a necessary addition to Lucene.Net.  Making strings streamable is 
less of an issue, to me at least.  However, I can see the benefit when indexing 
large items, which is really all this is attempting to solve.  Being forced to 
load large quantities of data into memory to perform any sort of indexing 
operation on them creates speed and memory problems.  This may not be a major 
use case for some people, but anyone trying to write a multi-threaded indexing 
system would certainly benefit from a lower memory footprint and faster 
indexing.
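
As a rough illustration of the memory argument (a sketch only, not the code in 
BinaryStream.patch): copying a value through a fixed-size buffer keeps peak 
memory at the buffer size, regardless of how large the value is.  The example 
is in Java and uses only java.io; the class name ChunkedCopy and its copy 
method are hypothetical and do not exist in Lucene or Lucene.Net.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper for illustration; not part of Lucene or Lucene.Net. */
final class ChunkedCopy {
    private ChunkedCopy() {}

    /**
     * Copies all bytes from in to out through a buffer of bufferSize bytes.
     * Only one buffer's worth of data is held in memory at a time.
     * Returns the total number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream, otherwise the number of bytes read
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

An index writer that accepted a stream could, in principle, consume a binary 
value the same way instead of requiring the full byte array up front.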

> implement streams as field values
> ---------------------------------
>
>                 Key: LUCENENET-417
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-417
>             Project: Lucene.Net
>          Issue Type: New Feature
>          Components: Lucene.Net Core
>            Reporter: Christopher Currens
>         Attachments: BinaryStream.patch
>
>
> Adding binary values to a field is an expensive operation, as the whole 
> binary data must be loaded into memory and then written to the index.  Adding 
> the ability to use a stream instead of a byte array could not only speed up 
> the indexing process, but also reduce the memory footprint.
> Java Lucene has the ability to use a TextReader to both analyze and store 
> text in the index.  .NET lacks the ability to store the data in the index, 
> due to the fact that .NET TextReaders cannot seek or reset the position of 
> the stream.  This should be a feature added into Lucene.Net as well.  My 
> thought is to add another Field constructor, namely Field(string name, 
> System.IO.Stream stream, System.Text.Encoding encoding), that will allow the 
> text to be analyzed and stored into the index.
> Comments about this approach are greatly appreciated.

