[ 
https://issues.apache.org/jira/browse/BLUR-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravikumar updated BLUR-290:
---------------------------

    Attachment: BlurFieldsConsumer.java
                CompressingRowWriter.java
                CompressingRowReader.java
                RowReaderCache.java
                SortingMultiReader.java
                PrimeDocCache.java
                RowDocsCollector.java

The last patch handled the write side, so that the doc-id to row-id mapping 
can be determined quickly.

This patch handles the reader/scoring part.

BlurRealTimeManager.java internally uses a SortingMultiReader class 
to open readers for searching. This class opens a CompressingRowReader 
and adds it to RowReaderCache. The cache relies on a CoreClosedListener 
to remove obsolete readers.
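
The eviction idea can be sketched roughly as below. This is not the attached 
RowReaderCache: CoreClosedCallback, RowReaderStub and RowReaderCacheSketch are 
hypothetical stand-ins that use only the JDK, and the real code would hook into 
Lucene's own core-closed listener instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a core-closed callback; NOT the Lucene interface.
interface CoreClosedCallback {
    void onCoreClosed(Object coreKey);
}

// Hypothetical per-segment row reader; the real CompressingRowReader differs.
final class RowReaderStub implements AutoCloseable {
    private volatile boolean closed;

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

// Cache keyed by a segment core key. An entry is dropped and its reader
// closed as soon as the owning segment core reports that it has closed.
final class RowReaderCacheSketch implements CoreClosedCallback {
    private final Map<Object, RowReaderStub> readers = new ConcurrentHashMap<>();

    // Returns the cached reader for this core, opening one on first use.
    RowReaderStub getOrOpen(Object coreKey) {
        return readers.computeIfAbsent(coreKey, k -> new RowReaderStub());
    }

    @Override
    public void onCoreClosed(Object coreKey) {
        RowReaderStub reader = readers.remove(coreKey);
        if (reader != null) {
            reader.close();
        }
    }

    int size() {
        return readers.size();
    }
}
```

Keying on the segment core means a reopened reader that still shares the same 
core can reuse the cached row reader.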

The scoring part is contained in PrimeDocCache and RowDocsCollector.

PrimeDocCache simply loads the BitSet from CompressingRowReader, based 
on the real-time flag.

RowDocsCollector gathers rows across segments and returns a globally 
score-sorted TopDocs. I have left out "Super" scoring, as I do not know how to 
do it correctly. It has some hairy logic that I don't understand.

Depending on the number of rows matched, this Collector will take up a lot of 
memory, unlike a PriorityQueue. We need to hold on to all rows until the last 
row is fully examined.
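
To illustrate the memory trade-off, here is a minimal sketch that is not the 
attached RowDocsCollector. RowHit and RowCollectorSketch are hypothetical names. 
It assumes that documents of one row can turn up in more than one segment, and 
it assumes a row's score is the plain sum of its document scores (the "Super" 
scoring is not modelled).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical result holder: one entry per matched row.
final class RowHit {
    final String rowId;
    final float score;

    RowHit(String rowId, float score) {
        this.rowId = rowId;
        this.score = score;
    }
}

// Holds every matched row until collection ends, so memory grows with the
// number of matched rows rather than with the requested top-N size.
final class RowCollectorSketch {
    private final Map<String, Float> rowScores = new HashMap<>();

    // Called once per matching document, in any segment order.
    // Assumption: a row's score is the sum of its document scores.
    void collect(String rowId, float docScore) {
        rowScores.merge(rowId, docScore, Float::sum);
    }

    int heldRows() {
        return rowScores.size();
    }

    // Sorts all held rows by score (descending, ties broken by row id)
    // and returns the first n.
    List<RowHit> topRows(int n) {
        List<RowHit> hits = new ArrayList<>();
        for (Map.Entry<String, Float> e : rowScores.entrySet()) {
            hits.add(new RowHit(e.getKey(), e.getValue()));
        }
        hits.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : a.rowId.compareTo(b.rowId);
        });
        return new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
    }
}
```

A bounded PriorityQueue cannot be used under these assumptions, because a row 
dropped from the queue early could still gain score from a later segment.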

Please go through this when you find time, and see whether it fits in with 
the existing Blur logic.

There is still the problem of plugging in RAMDirs here, which I shall probably 
attempt to solve down the line. 

> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>
>                 Key: BLUR-290
>                 URL: https://issues.apache.org/jira/browse/BLUR-290
>             Project: Apache Blur
>          Issue Type: New Feature
>    Affects Versions: experimental-dev
>            Reporter: Ravikumar
>         Attachments: BlurFieldsConsumer.java, BlurFieldsConsumer.java, 
> BlurFlushingIndexWriter.java, BlurIndexTracker.java, 
> BlurPostingsConsumer.java, BlurPostingsFormat.java, BlurRealTimeIndex.java, 
> BlurRealTimeIndexWriter.java, BlurRealTimeManager.java, 
> BlurRealTimeManagerReopenThread.java, BlurRowCodec.java, 
> BlurTermsConsumer.java, CompressingRowIndexReader.java, 
> CompressingRowIndexWriter.java, CompressingRowReader.java, 
> CompressingRowReader.java, CompressingRowWriter.java, 
> CompressingRowWriter.java, GrowableByteArrayDataOutput.java, 
> PrimeDocCache.java, RealTimeTransactionRecorder.java, RowCache.java, 
> RowDocsCollector.java, RowReaderCache.java, SlabAllocator.java, 
> SlabRAMDirectory.java, SlabRAMFile.java, SlabRAMInputStream.java, 
> SlabRAMOutputStream.java, SortingMultiReader.java, SortingMultiReader.java, 
> TestCompressingRowWriter.java
>
>
> We have been discussing how to handle humongous rows in Blur (BLUR-220). 
> Explore the idea of using a RAMDirectory at the front, backed by a 
> persistent index.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
