[
https://issues.apache.org/jira/browse/BLUR-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865211#comment-13865211
]
Ravikumar commented on BLUR-290:
--------------------------------
Thanks, Aaron, for the findings.
I think this is cool stuff, and it sounds very interesting. Commits within
milliseconds are awesome.
"In Blur now there's no need for the WAL because everything is committed to
disk".
I do not understand this part. Do you mean that FastHdfsKeyValueDirectory
points to a local directory and every operation takes place there, with an
implicit commit?
I have also observed a few things that I wanted to get your opinion on:
1. In TransactionRecorder, we have a synchronized block around writer.commit().
For real-time indexes, this could be detrimental, right?
2. commit() is expected to be a slow, asynchronous operation, as documented in
the Lucene API [tens of seconds is also acceptable!]. So the effort of
improving commit time might turn out to be too much work for too little gain.
3. NRTCachingDirectory is meant exactly for frequent NRT reopen calls, as in
our case. Newly created files stay in RAM and are synced to disk only during
commit() calls [which are async anyway]. So HDFS metadata calls and FileStatus
calls are fully avoided.
It would be interesting to see if you can wrap the HDFSDir with an
NRTCachingDirectory and benchmark it against the experimental KeyValueDir.
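
To make the caching idea in point 3 concrete, here is a toy sketch in plain
Java. SlowStore and CachingStore are hypothetical stand-ins that I made up for
illustration; they are not Lucene or Blur classes. In Lucene the real wrapper
would be built with new NRTCachingDirectory(delegate, maxMergeSizeMB,
maxCachedMB). The sketch ignores those size limits and only shows that small
writes stay in RAM until a commit-time sync.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these types are hypothetical, NOT Lucene or Blur classes.
class NrtCacheSketch {

    /** Stand-in for a remote store such as HDFS; counts every write it receives. */
    static class SlowStore {
        final Map<String, byte[]> files = new HashMap<>();
        int writeCalls = 0;

        void write(String name, byte[] data) {
            writeCalls++;
            files.put(name, data.clone());
        }

        byte[] read(String name) {
            return files.get(name);
        }
    }

    /** Buffers new files in RAM and pushes them to the slow store only on sync(). */
    static class CachingStore {
        private final SlowStore delegate;
        private final Map<String, byte[]> ram = new HashMap<>();

        CachingStore(SlowStore delegate) {
            this.delegate = delegate;
        }

        // Called whenever a small NRT segment is flushed: no remote call happens here.
        void write(String name, byte[] data) {
            ram.put(name, data.clone());
        }

        // Readers see RAM-resident files first, then fall back to the slow store.
        byte[] read(String name) {
            byte[] cached = ram.get(name);
            return cached != null ? cached : delegate.read(name);
        }

        // Called at commit time: move cached files to the slow store and free the RAM.
        void sync() {
            for (Map.Entry<String, byte[]> e : ram.entrySet()) {
                delegate.write(e.getKey(), e.getValue());
            }
            ram.clear();
        }

        int cachedFileCount() {
            return ram.size();
        }
    }

    public static void main(String[] args) {
        SlowStore hdfs = new SlowStore();
        CachingStore cache = new CachingStore(hdfs);
        for (int i = 0; i < 100; i++) {
            cache.write("_seg" + i, new byte[] {(byte) i});
        }
        System.out.println("remote writes before commit: " + hdfs.writeCalls);
        cache.sync();
        System.out.println("remote writes after commit: " + hdfs.writeCalls);
    }
}
```

A benchmark along these lines would compare the number and latency of remote
calls per reopen for the wrapped directory against the key-value directory.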
> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>
> Key: BLUR-290
> URL: https://issues.apache.org/jira/browse/BLUR-290
> Project: Apache Blur
> Issue Type: New Feature
> Affects Versions: experimental-dev
> Reporter: Ravikumar
> Attachments: BlurFieldsConsumer.java, BlurFieldsConsumer.java,
> BlurFieldsConsumer.java, BlurFlushingIndexWriter.java, BlurIndexTracker.java,
> BlurPostingsConsumer.java, BlurPostingsConsumer.java,
> BlurPostingsFormat.java, BlurPostingsFormat.java, BlurRealTimeIndex.java,
> BlurRealTimeIndex.java, BlurRealTimeIndexTest.java,
> BlurRealTimeIndexWriter.java, BlurRealTimeManager.java,
> BlurRealTimeManagerReopenThread.java, BlurRowCodec.java, BlurRowCodec.java,
> BlurSegmentInfoFormat.java, BlurSegmentInfoWriter.java,
> BlurTermsConsumer.java, BlurTermsConsumer.java,
> CompressingRowIndexReader.java, CompressingRowIndexWriter.java,
> CompressingRowReader.java, CompressingRowReader.java,
> CompressingRowReader.java, CompressingRowWriter.java,
> CompressingRowWriter.java, CompressingRowWriter.java,
> GrowableByteArrayDataOutput.java, PrimeDocCache.java,
> RealTimeTransactionRecorder.java, RealTimeTransactionRecorder.java,
> RowCache.java, RowDocsCollector.java, RowDocsCollector.java,
> RowReaderCache.java, RowReaderCache.java, SlabAllocator.java,
> SlabRAMDirectory.java, SlabRAMFile.java, SlabRAMInputStream.java,
> SlabRAMOutputStream.java, SortingMultiReader.java, SortingMultiReader.java,
> TestCompressingRowWriter.java, TestCompressingRowWriter.java
>
>
> We have been discussing how to handle humongous rows in Blur (BLUR-220).
> Explore the idea of using a RAMDirectory at the front, backed by a
> persistent index.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)