[jira] [Commented] (BLUR-445) Remove online mutates from the Blur thrift api

Tim Williams (JIRA) Wed, 11 Nov 2015 06:41:39 -0800

    [ https://issues.apache.org/jira/browse/BLUR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000444#comment-15000444 ]


Tim Williams commented on BLUR-445:
-----------------------------------

Can you clarify "move all index mutations to the bulk indexing approach"?
There are at least four different bulk-ish approaches (m/r, batch, hive,
enqueue), so what does *the* bulk indexing approach mean in this context? I
reckon you'll want to share more about the new daemon too... is that similar
to what supports hive-style indexing?

> Remove online mutates from the Blur thrift api
> ----------------------------------------------
>
>                 Key: BLUR-445
>                 URL: https://issues.apache.org/jira/browse/BLUR-445
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur
>    Affects Versions: 0.3.0
>            Reporter: Aaron McCurry
>             Fix For: 0.3.0
>
>
> The primary use case for Blur is for massive ingestion of information to be 
> indexed and searched.  Currently I believe the system has been made overly 
> complex due to the atomic operations in the online index mutation system.  It
> forces the shard servers to keep writers open to each of the indexes in the
> given table, which requires a lot of memory, CPU, and file resources per shard.
> Currently the system only allows for mutates to be atomic when mutating a 
> single row.  Batch mutates are not atomic.
> I propose that we move all index mutations to the bulk indexing approach and
> utilize HDFS snapshots for committing index information within a given table.
> This will allow the controller and shard servers to become read-only with
> respect to the indexes.
> Assuming we move forward with this approach, a new daemon, an index manager,
> will need to be created.  This daemon will coordinate indexing (MR, Spark,
> Tez, Flink, etc.) and merging globally for the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
