On Sun, Nov 30, 2014 at 3:53 PM, Mark Kerzner <[email protected]> wrote:
> Hi,
>
> Latest Lucene 4.0 (and Solr) has the feature of near-real-time search:
> index is updated in memory and is available for searches, but not committed
> to the hard drive, with all the accompanying features.
>
> Blur has the same, I believe, but I am guessing that it has implemented it
> directly, without the latest Lucene in-memory features. Why do I think so?
> Because Blur had this seemingly before Lucene 4.0.
>
> Could you please either give me the answer, or tell me where in the code to
> look?
>

Yes, Blur has an NRT-like capability, though it is not implemented with the
Lucene NRT classes. Currently there are 3 different ways that Blur accepts
data mutations.

1. Thrift API mutate call. This call is blocking: it commits and refreshes
the index during the call. It is also an atomic call.

http://incubator.apache.org/blur/docs/0.2.3/Blur.html#Fn_Blur_mutate

A variant of the call is mutate batch, which just batches the calls to each
shard server. However, this is not an atomic call, meaning that a mutate
failure in one shard does not fail the entire batch.

http://incubator.apache.org/blur/docs/0.2.3/Blur.html#Fn_Blur_mutateBatch

2. Thrift API enqueue mutate call. This call is similar to the Lucene NRT
updates in that it will index for 5 seconds (configurable) and then commit
and refresh. One thing to note about this method that differs from the
default Lucene implementation is that Blur will not return results to the
user that are not committed to the index. The call is implemented by placing
an in-memory queue in front of the indexing process. Currently the queue is
not backed by disk, but that is something we want to add.

http://incubator.apache.org/blur/docs/0.2.3/Blur.html#Fn_Blur_enqueueMutate

3. The last method is not NRT but is worth mentioning. MapReduce batch
processing can produce a bulk incremental load for Blur.

All of the index changes are performed per shard through a single internal
API.

https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/manager/writer/IndexAction.java

And this is the writer that handles all mutates:

https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java

There will also be a 4th method for index mutations soon. We will be
implementing a write API in our new command platform. In concept, commands
are similar to stored procedures: they allow developers to embed their own
methods, indexing, and query models into Blur.

Does this answer your question?

Aaron

> Thank you.
>
> Sincerely,
> Mark
>
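
P.S. To make the enqueue mutate behavior in option 2 more concrete, here is a rough toy model in Python. It is only a sketch of the general pattern (an in-memory queue in front of the index, with searches seeing committed data only). It is not Blur code: the class and method names (QueuedShardIndex, enqueue_mutate, commit_and_refresh, search) are made up for illustration, and the periodic timer is assumed rather than implemented.

```python
import queue


class QueuedShardIndex:
    """Toy stand-in for one shard's index. Hypothetical, not a Blur class."""

    def __init__(self):
        self._pending = queue.Queue()  # in-memory only, so lost on a crash
        self._committed = {}           # the only data searches may see

    def enqueue_mutate(self, row_id, record):
        # Returns immediately; the mutation is not searchable yet.
        self._pending.put((row_id, record))

    def commit_and_refresh(self):
        # Assumption: a background thread would call this on an interval
        # (5 seconds by default in the description above).
        applied = 0
        while True:
            try:
                row_id, record = self._pending.get_nowait()
            except queue.Empty:
                break
            self._committed[row_id] = record
            applied += 1
        return applied

    def search(self, row_id):
        # Uncommitted mutations are never returned to the caller.
        return self._committed.get(row_id)


index = QueuedShardIndex()
index.enqueue_mutate("row-1", {"title": "hello"})
before = index.search("row-1")   # None: queued but not committed
index.commit_and_refresh()
after = index.search("row-1")    # the record, now that it is committed
```

The blocking mutate call in option 1 would correspond to doing the write and the commit in the same call, which is why it can be atomic while the queued path cannot promise durability until the queue is backed by disk.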
