[jira] [Commented] (METRON-1829) Large Error Message Causes Slow Search Performance

ASF GitHub Bot (JIRA) Thu, 18 Oct 2018 09:14:13 -0700


    [ https://issues.apache.org/jira/browse/METRON-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655505#comment-16655505 ]


ASF GitHub Bot commented on METRON-1829:
----------------------------------------

Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1239#discussion_r226369605
  
    --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/BulkWriterComponent.java ---
    @@ -118,12 +118,15 @@ public void commit(BulkWriterResponse response) {
     
       public void error(String sensorType, Throwable e, Iterable<Tuple> tuples, MessageGetStrategy messageGetStrategy) {
         LOG.error(format("Failing %d tuple(s); sensorType=%s", Iterables.size(tuples), sensorType), e);
    -    MetronError error = new MetronError()
    -            .withSensorType(Collections.singleton(sensorType))
    -            .withErrorType(Constants.ErrorType.INDEXING_ERROR)
    -            .withThrowable(e);
    -    tuples.forEach(t -> error.addRawMessage(messageGetStrategy.get(t)));
    -    handleError(tuples, error);
    +    tuples.forEach(t -> {
    --- End diff --
    
    It seems like this will ack the same tuples repeatedly.  If 500 messages in a batch fail, we will ack all 500 of them 500 times.
    
    There is also the nuisance that we report the error to the collector for each and every failed message, instead of once for the batch.  There is only one `Throwable` to report, so we should report it just once.
    
    We may need something like this.
    
    ```suggestion
        // emit one error for each failed message
        tuples.forEach(t -> {
          MetronError error = new MetronError()
                  .withSensorType(Collections.singleton(sensorType))
                  .withErrorType(Constants.ErrorType.INDEXING_ERROR)
                  .withThrowable(e)
                  .addRawMessage(messageGetStrategy.get(t));
          collector.emit(Constants.ERROR_STREAM, new Values(error.getJSONObject()));
          collector.ack(t);
        });
    
        // there is only one error to report for all of the failed tuples
        collector.reportError(e);
      }
    ```
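    
    To make the cost concrete, here is a minimal, self-contained sketch. It does not use Storm or Metron classes: `CountingCollector` and the method names are hypothetical stand-ins that only count calls. The first method assumes the per-message loop hands the whole batch to a handler that acks every tuple, which is one reading of the diff, not something verified against the full change.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;

    public class AckCountSketch {

      /** Hypothetical stand-in for a Storm collector; it only counts calls. */
      static class CountingCollector {
        int acks = 0;
        int reportedErrors = 0;

        void ack(String tuple) { acks++; }

        void reportError(Throwable e) { reportedErrors++; }
      }

      /**
       * Assumed shape of the problem: for each failed message, a handler
       * acks the entire batch and reports the error.
       */
      static CountingCollector perMessageLoopOverWholeBatch(List<String> tuples, Throwable e) {
        CountingCollector collector = new CountingCollector();
        for (String ignored : tuples) {
          for (String t : tuples) {
            collector.ack(t);
          }
          collector.reportError(e);
        }
        return collector;
      }

      /** Shape of the suggestion: ack each tuple once, report the error once. */
      static CountingCollector ackOncePerTuple(List<String> tuples, Throwable e) {
        CountingCollector collector = new CountingCollector();
        for (String t : tuples) {
          collector.ack(t);
        }
        collector.reportError(e);
        return collector;
      }

      /** Builds a batch of n placeholder tuples. */
      static List<String> batch(int n) {
        List<String> tuples = new ArrayList<>();
        for (int i = 0; i < n; i++) {
          tuples.add("tuple-" + i);
        }
        return tuples;
      }

      public static void main(String[] args) {
        Throwable e = new RuntimeException("bulk write failed");
        CountingCollector before = perMessageLoopOverWholeBatch(batch(500), e);
        CountingCollector after = ackOncePerTuple(batch(500), e);
        // prints: before: acks=250000 reportError=500
        System.out.println("before: acks=" + before.acks + " reportError=" + before.reportedErrors);
        // prints: after: acks=500 reportError=1
        System.out.println("after: acks=" + after.acks + " reportError=" + after.reportedErrors);
      }
    }
    ```
    
    Under that assumption, a failed batch of 500 produces 250,000 acks and 500 error reports, versus 500 acks and a single error report with the suggested structure.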


> Large Error Message Causes Slow Search Performance
> --------------------------------------------------
>
>                 Key: METRON-1829
>                 URL: https://issues.apache.org/jira/browse/METRON-1829
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Ryan Merriman
>            Priority: Major
>
> Errors that occur during batch writes in the index topologies (batch and RA) 
> are written to Elasticsearch as a single, large error message, with a field 
> for each failed message. For example, if the batch size is 5000, a single 
> error message will be created with 5000 fields `raw_message_0`, 
> `raw_message_1`, ..., `raw_message_4999`. With such large messages, searching 
> the error index in Elasticsearch is excessively slow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
