[ 
https://issues.apache.org/jira/browse/BLUR-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820714#comment-13820714
 ] 

Aaron McCurry commented on BLUR-290:
------------------------------------

This is the very thing I have struggled to find an answer to over the last 3+ 
years of developing the system.  :-(

I'm really not sure how to solve this problem without doing more work at 
indexing time, assuming we keep the read/search side the same.

Let me throw out a slightly different idea.  Suppose we assume that indexing 
and search performance are the 2 things we want to optimize the system for, 
and that storage is less of a concern.

What if we index each Row into a mini-index, an index of just that Row?  Then 
we merge that mini-index into the main index (deleting any existing copy of 
the Row) and store the mini-index as a single binary field on the Row.  That 
way, instead of re-indexing the documents a second time when we modify a Row, 
we read in its mini-index, make the update, and rewrite the Row plus 
mini-index.
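
A minimal sketch of that flow, to make the proposal concrete.  It is a toy 
model only: plain Java maps stand in for the Lucene main index and for the 
binary field, and every name in it (MiniIndexSketch, putRecord, the 
tab/newline serialization) is hypothetical and not part of Blur.  It assumes a 
Row is a set of records keyed by record id, and that record ids contain no 
tabs or newlines.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of the per-Row mini-index idea; maps stand in for Lucene. */
final class MiniIndexSketch {
    // Main index: term -> ids of the Rows that contain the term.
    private final Map<String, Set<String>> mainIndex = new HashMap<>();
    // Stand-in for the "single binary field": row id -> serialized mini-index.
    private final Map<String, byte[]> miniIndexField = new HashMap<>();
    // Counts how often record text is analyzed (the cost we want to avoid).
    private int analyzeCalls = 0;

    public int getAnalyzeCalls() { return analyzeCalls; }

    // Hypothetical analyzer: lowercase, split on non-word characters.
    private Set<String> analyze(String text) {
        analyzeCalls++;
        Set<String> terms = new TreeSet<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Mini-index layout (an assumption): one "recordId<TAB>terms" line per record.
    private static byte[] serialize(Map<String, Set<String>> mini) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Set<String>> e : mini.entrySet()) {
            sb.append(e.getKey()).append('\t')
              .append(String.join(" ", e.getValue())).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    private static Map<String, Set<String>> deserialize(byte[] bytes) {
        Map<String, Set<String>> mini = new TreeMap<>();
        if (bytes == null) return mini; // Row has not been indexed yet
        for (String line : new String(bytes, StandardCharsets.UTF_8).split("\n")) {
            if (line.isEmpty()) continue;
            String[] parts = line.split("\t", 2);
            Set<String> terms = new TreeSet<>();
            for (String t : parts[1].split(" ")) {
                if (!t.isEmpty()) terms.add(t);
            }
            mini.put(parts[0], terms);
        }
        return mini;
    }

    /** Add or replace one record of a Row without re-analyzing the others. */
    public void putRecord(String rowId, String recordId, String text) {
        // 1. Read in the Row's stored mini-index instead of re-indexing the Row.
        Map<String, Set<String>> mini = deserialize(miniIndexField.get(rowId));
        // 2. Make the update: only the changed record is analyzed.
        mini.put(recordId, analyze(text));
        // 3. Merge into the main index, deleting any existing copy of the Row.
        for (Set<String> rows : mainIndex.values()) rows.remove(rowId);
        for (Set<String> terms : mini.values()) {
            for (String term : terms) {
                mainIndex.computeIfAbsent(term, k -> new TreeSet<>()).add(rowId);
            }
        }
        // 4. Rewrite the mini-index alongside the Row.
        miniIndexField.put(rowId, serialize(mini));
    }

    /** Ids of the Rows containing the term; empty if there are none. */
    public Set<String> search(String term) {
        return mainIndex.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        MiniIndexSketch idx = new MiniIndexSketch();
        idx.putRecord("row1", "rec1", "hello world");
        idx.putRecord("row1", "rec2", "goodbye world");
        idx.putRecord("row1", "rec1", "hello again"); // modify one record
        System.out.println(idx.search("world"));
        System.out.println(idx.getAnalyzeCalls());
    }
}
```

In this model each putRecord analyzes exactly one record, however many 
records the Row already holds.  The trade-off is the one assumed above: the 
serialized mini-index is extra storage carried on every Row.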

What do you think?

> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>
>                 Key: BLUR-290
>                 URL: https://issues.apache.org/jira/browse/BLUR-290
>             Project: Apache Blur
>          Issue Type: New Feature
>    Affects Versions: experimental-dev
>            Reporter: Ravikumar
>         Attachments: BlurFlushingIndexWriter.java, BlurIndexTracker.java, 
> BlurRealTimeIndex.java, BlurRealTimeIndexWriter.java, 
> BlurRealTimeManager.java, BlurRealTimeManagerReopenThread.java, 
> RealTimeTransactionRecorder.java, SlabAllocator.java, SlabRAMDirectory.java, 
> SlabRAMFile.java, SlabRAMInputStream.java, SlabRAMOutputStream.java, 
> SortingMultiReader.java
>
>
> We have been discussing how to handle humongous rows in Blur (BLUR-220). 
> Explore the idea of using a RAMDirectory at the front, backed by a 
> persistent index.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
