[jira] [Commented] (BLUR-445) Remove online mutates from the Blur thrift api

Garrett Barton (JIRA) Wed, 11 Nov 2015 06:47:53 -0800

    [ https://issues.apache.org/jira/browse/BLUR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000449#comment-15000449 ]


Garrett Barton commented on BLUR-445:
-------------------------------------

Yes, please add more info. I am actually working on a prototype that dumps Kafka 
topics via the enqueue mutate path.  It sounds like you're looking to stand up 
something outside of the shard server itself to coordinate ingestion of new 
data, and that seems cool.  Apache Druid does something similar; maybe look at 
what they have done?

> Remove online mutates from the Blur thrift api
> ----------------------------------------------
>
>                 Key: BLUR-445
>                 URL: https://issues.apache.org/jira/browse/BLUR-445
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur
>    Affects Versions: 0.3.0
>            Reporter: Aaron McCurry
>             Fix For: 0.3.0
>
>
> The primary use case for Blur is for massive ingestion of information to be 
> indexed and searched.  Currently I believe the system has been made overly 
> complex due to the atomic operations in the online index mutation system.  It 
> forces the shard servers to keep writers open to each of the indexes in the 
> given table, which requires a lot of memory, CPU, and file resources per shard.
> Currently the system only allows mutates to be atomic when mutating a 
> single row.  Batch mutates are not atomic.
> I propose that we move all index mutations to the bulk indexing approach and 
> utilize HDFS snapshots for committing index information within a given table.  
> This will allow the controller and shard servers to become read-only with 
> respect to the indexes.
> Assuming we move forward with this approach, a new daemon will need to be 
> created: an index manager.  This daemon will coordinate indexing (MR, Spark, 
> Tez, Flink, etc.) and merging globally for the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
