[ 
https://issues.apache.org/jira/browse/NIFI-11480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Sampson updated NIFI-11480:
---------------------------------
    Description: 
https://github.com/apache/nifi/pull/6903 for NIFI-11111 introduced a 
[conversation|https://github.com/apache/nifi/pull/6903#issuecomment-1513872398] 
about outputting the response error details for Records that are not processed 
by Elasticsearch.

The same PR introduces a new {{elasticsearch.bulk.error}} attribute for the 
{{PutElasticsearchJson}} processor, but explains why [it's not so simple for 
PutElasticsearchRecord|https://github.com/apache/nifi/pull/6903#issuecomment-1514554132]
 due to input FlowFiles potentially containing many Records and there being no 
obvious way of expressing error details for all such Records in the single 
output flowfile.

One [suggested 
approach|https://github.com/apache/nifi/pull/6903#issuecomment-1517903668] 
would be to "partition" the output {{errors}} Records into multiple flowfiles, 
grouped by the error {{type}} provided by Elasticsearch. This {{type}} could 
then be added to the flowfile(s) as the {{elasticsearch.bulk.error}} attribute. 
Flows could then use {{RouteOnAttribute}} to handle certain Elasticsearch 
errors in particular ways. Leaving all error flowfiles in the same output 
queue avoids the problem of the [large (and changing) number of potential 
Elasticsearch error 
types|https://github.com/apache/nifi/pull/6903#issuecomment-1517863606].

Such output partitioning (if implemented) should be optional, driven by a 
processor property that maintains the current "all in one" flowfile output by 
default.
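
As a rough illustration of the partitioning idea (not the processor's actual implementation), the grouping could look something like the sketch below. It assumes the usual {{_bulk}} response shape, where {{items}} are returned in request order and each failed item carries an {{error}} object with a {{type}}. The function name, the {{partition}} flag (standing in for the proposed processor property) and the dict-based "flowfile" representation are all hypothetical.

```python
from collections import defaultdict


def partition_errors_by_type(records, bulk_response, partition=False):
    """Group failed records into flowfile-like dicts (hypothetical helper).

    records: input records, in the order they were sent to _bulk.
    bulk_response: parsed JSON body of the _bulk response.
    partition: stand-in for the proposed processor property; False keeps
        the current "all in one" errors output.
    """
    failed = []
    for record, item in zip(records, bulk_response.get("items", [])):
        # Each item has a single key naming the action (index, create, ...).
        result = next(iter(item.values()))
        error = result.get("error")
        if error is not None:
            failed.append((record, error.get("type", "unknown")))

    if not failed:
        return []

    if not partition:
        # Default behaviour: one errors flowfile, no per-type attribute.
        return [{"attributes": {}, "records": [r for r, _ in failed]}]

    # Partitioned behaviour: one errors flowfile per Elasticsearch error type.
    groups = defaultdict(list)
    for record, error_type in failed:
        groups[error_type].append(record)
    return [
        {"attributes": {"elasticsearch.bulk.error": t}, "records": rs}
        for t, rs in groups.items()
    ]
```

With partitioning enabled, every resulting flowfile would still go to the same {{errors}} relationship; only the attribute value differs, so no relationship per error type is needed.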

  was:
https://github.com/apache/nifi/pull/6903 for NIFI-11111 introduced a 
[conversation|https://github.com/apache/nifi/pull/6903#issuecomment-1513872398] 
about outputting the response error details for Records that are not processed 
by Elasticsearch.

The same PR introduces a new {{elasticsearch.bulk.error}} attribute for the 
{{PutElasticsearchJson}} processor, but explains why [it's not so simple for 
PutElasticsearchRecord|https://github.com/apache/nifi/pull/6903#issuecomment-1514554132]
 due to input FlowFiles potentially containing many Records and there being no 
obvious way of expressing error details for all such Records in the single 
output flowfile.

One [suggested 
approach|https://github.com/apache/nifi/pull/6903#issuecomment-1517903668] 
would be to "partition" the output {{errors}} Records into multiple flowfiles, 
grouped by the error {{type}} provided by Elasticsearch. This {{type}} could 
then be added to the flowfile(s) as the {{elasticsearch.bulk.error}} attribute. 
Flows could then use {{RouteOnAttribute}} to handle certain Elasticsearch 
errors in particular ways. Leaving all error flowfiles in the same output 
queue avoids the problem of the [large (and changing) number of potential 
Elasticsearch error 
types|https://github.com/apache/nifi/pull/6903#issuecomment-1517863606].


> PutElasticsearchRecord should have an option to output _bulk api response 
> errors as flowfile attributes
> -------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-11480
>                 URL: https://issues.apache.org/jira/browse/NIFI-11480
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Chris Sampson
>            Priority: Minor
>
> https://github.com/apache/nifi/pull/6903 for NIFI-11111 introduced a 
> [conversation|https://github.com/apache/nifi/pull/6903#issuecomment-1513872398]
>  about outputting the response error details for Records that are not 
> processed by Elasticsearch.
> The same PR introduces a new {{elasticsearch.bulk.error}} attribute for the 
> {{PutElasticsearchJson}} processor, but explains why [it's not so simple for 
> PutElasticsearchRecord|https://github.com/apache/nifi/pull/6903#issuecomment-1514554132]
>  due to input FlowFiles potentially containing many Records and there being 
> no obvious way of expressing error details for all such Records in the single 
> output flowfile.
> One [suggested 
> approach|https://github.com/apache/nifi/pull/6903#issuecomment-1517903668] 
> would be to "partition" the output {{errors}} Records into multiple 
> flowfiles, grouped by the error {{type}} provided by Elasticsearch. This 
> {{type}} could then be added to the flowfile(s) as the 
> {{elasticsearch.bulk.error}} attribute. Flows could then use 
> {{RouteOnAttribute}} to handle certain Elasticsearch errors in particular 
> ways. Leaving all error flowfiles in the same output queue avoids the problem 
> of the [large (and changing) number of potential Elasticsearch error 
> types|https://github.com/apache/nifi/pull/6903#issuecomment-1517863606].
> Such output partitioning (if implemented) should be optional, driven by a 
> processor property that maintains the current "all in one" flowfile output by 
> default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
