[jira] [Commented] (METRON-1448) Update SolrWriter to conform to new collection strategy

ASF GitHub Bot (JIRA) Wed, 07 Feb 2018 14:15:25 -0800

    [ 
https://issues.apache.org/jira/browse/METRON-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356155#comment-16356155
 ]


ASF GitHub Bot commented on METRON-1448:
----------------------------------------

Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/929
  
    You can set it to not commit manually (set `solr.commitPerBatch` to `false` 
and no committing happens), but you risk losing data if a worker dies.  
Honestly, you want a durable commit with an fsync before you ack the tuples in a 
batch; otherwise you're courting data loss.  This is the same strategy we use 
for ES and HDFS (though commit there is an fsync).  That being said, I'm 
sensitive to the performance concerns people may have around that, so I let 
people turn it off with a strong warning in the docs (also, this was legacy 
behavior in the SolrWriter).
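
    To make the ordering concrete, here is a minimal sketch of "commit before 
ack". It is not the code in the PR: `IndexClient`, `RecordingClient`, and 
`BatchWriter` are hypothetical stand-ins (not the SolrJ API), and only the 
`commitPerBatch` flag name is taken from the `solr.commitPerBatch` setting 
mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index client; this is NOT the SolrJ API.
interface IndexClient {
    void add(List<String> docs);
    void commit(); // assumed to be durable once it returns
}

// Test double that records the order of operations.
class RecordingClient implements IndexClient {
    final List<String> log = new ArrayList<>();
    public void add(List<String> docs) { log.add("add:" + docs.size()); }
    public void commit() { log.add("commit"); }
}

// Hypothetical writer: adds the batch, optionally commits, then acks.
class BatchWriter {
    private final IndexClient client;
    private final boolean commitPerBatch;

    BatchWriter(IndexClient client, boolean commitPerBatch) {
        this.client = client;
        this.commitPerBatch = commitPerBatch;
    }

    void write(List<String> batch, Runnable ack) {
        client.add(batch);
        if (commitPerBatch) {
            // Make the batch durable before the tuples are acked.
            client.commit();
        }
        // With commitPerBatch == false the ack happens without an explicit
        // commit, so a worker failure at this point can lose the batch.
        ack.run();
    }
}
```

    With the flag on, the recorded sequence is add, commit, ack; with it off, 
the commit step is skipped and durability depends on whatever commit policy 
the server applies on its own.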


> Update SolrWriter to conform to new collection strategy
> -------------------------------------------------------
>
>                 Key: METRON-1448
>                 URL: https://issues.apache.org/jira/browse/METRON-1448
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Casey Stella
>            Priority: Major
>
> Currently the SolrWriter presumes a single collection to be written to.  The 
> new collection strategy for Solr implies a collection per sensor.  Also, 
> there are a few rough edges in the writer which could stand smoothing:
>  * By default, we use Solr's implicit commit mechanism, rather than 
> committing at the batch granularity.  This may result in lost data on worker 
> failure.
>  * We do not use the batch add API, but rather add message by message
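
The two changes described in the issue (a collection per sensor, and batch 
adds instead of message-by-message adds) can be sketched together as a 
grouping step. This is an illustration under stated assumptions, not the 
writer's actual code: `SensorMessage` and `CollectionBatcher` are hypothetical 
names, documents are represented as plain strings, and the collection name is 
assumed to equal the sensor name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical message type: a sensor name plus an already-serialized document.
class SensorMessage {
    final String sensor;
    final String doc;

    SensorMessage(String sensor, String doc) {
        this.sensor = sensor;
        this.doc = doc;
    }
}

class CollectionBatcher {
    // Groups a batch by sensor so that each collection can receive a single
    // bulk add rather than one add call per message.
    // Assumption: the collection name is the sensor name.
    static Map<String, List<String>> groupBySensor(List<SensorMessage> batch) {
        Map<String, List<String>> byCollection = new LinkedHashMap<>();
        for (SensorMessage m : batch) {
            byCollection.computeIfAbsent(m.sensor, k -> new ArrayList<>()).add(m.doc);
        }
        return byCollection;
    }
}
```

Each map entry would then correspond to one bulk add against one collection, 
followed by a commit if per-batch committing is enabled.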



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
