[ https://issues.apache.org/jira/browse/NIFI-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238451#comment-16238451 ]
Michael Moser commented on NIFI-4550:
-------------------------------------
Perhaps somewhat related to NIFI-1874?
> Add an InferCharacterSet processor
> ----------------------------------
>
> Key: NIFI-4550
> URL: https://issues.apache.org/jira/browse/NIFI-4550
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: Matt Burgess
> Priority: Minor
>
> Sometimes in a NiFi flow it is not known what character set an incoming flow
> file is using. This can make it difficult for downstream processing if the
> processors expect a particular charset (whether the user can configure it or
> not). There is a ConvertCharacterSet processor, but it expects an explicit
> value for Input Character Set, which might not be known.
> I propose an InferCharacterSet processor, which would presumably use some
> license-friendly third-party library (there is a discussion
> [here|https://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream])
> to guess the character set, perhaps adding it as an attribute for use
> downstream in ConvertCharacterSet.
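
To make the idea in the description concrete, here is a rough sketch of charset guessing. The issue proposes a third-party detection library; this sketch does not use one. As a deliberately cruder stand-in it relies only on the JDK and tries strict decoding against a short, ordered candidate list. The class name, method names, and candidate list are all hypothetical, and no NiFi API is used. A real processor would presumably write the result to a flow file attribute, which is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not part of NiFi and not the proposed processor.
public class CharsetGuessSketch {

    // Assumed candidate list, ordered from strictest to most permissive.
    // ISO-8859-1 accepts every byte sequence, so it acts as a catch-all.
    static final List<Charset> CANDIDATES = List.of(
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1);

    // Returns true if the bytes decode in the given charset without
    // any malformed or unmappable input.
    static boolean decodesCleanly(byte[] content, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the first candidate that decodes the content cleanly.
    static Optional<Charset> guess(byte[] content) {
        for (Charset candidate : CANDIDATES) {
            if (decodesCleanly(content, candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        byte[] sample = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(guess(sample).map(Charset::name).orElse("unknown"));
    }
}
```

This only reports the first charset in which the bytes happen to be valid, so it cannot tell apart encodings that accept the same byte sequences (for example the various single-byte ISO-8859 variants). A statistical detection library of the kind the description mentions would be needed for that.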