[ https://issues.apache.org/jira/browse/NIFI-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Handermann reassigned NIFI-4550:
--------------------------------------

    Assignee: endzeit

> Add an InferCharacterSet processor
> ----------------------------------
>
>                 Key: NIFI-4550
>                 URL: https://issues.apache.org/jira/browse/NIFI-4550
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: endzeit
>            Priority: Minor
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Sometimes in a NiFi flow it is not known what character set an incoming flow
> file is using. This can make it difficult for downstream processing if the
> processors expect a particular charset (whether the user can configure it or
> not). There is a ConvertCharacterSet processor, but it expects an explicit
> value for Input Character Set, when this might not be known.
> I propose an InferCharacterSet processor, which would presumably use some
> license-friendly third-party library (there is a discussion
> [here|https://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream])
> to guess the character set, perhaps adding it as an attribute for use
> downstream in ConvertCharacterSet.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
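
For illustration only, the kind of guess described in the issue could be sketched as follows. This is not the proposed processor and it does not use the third-party library the issue has in mind. It swaps in a simple JDK-only heuristic: check for a byte order mark, then try strict decoding against a short candidate list. The class name CharsetGuess, the method guess, and the candidate list are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper; not part of NiFi or of any charset detection library. */
final class CharsetGuess {

    // Assumed candidate order. ISO-8859-1 accepts every byte sequence,
    // so it only serves as a last resort and is never a confident answer.
    private static final List<Charset> CANDIDATES = List.of(
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1);

    /** Returns the name of the first candidate charset that decodes the bytes without errors. */
    static String guess(byte[] data) {
        // A byte order mark, when present, is the strongest hint available.
        // UTF-32 marks are deliberately not handled in this sketch.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";

        for (Charset candidate : CANDIDATES) {
            if (decodesCleanly(data, candidate)) {
                return candidate.name();
            }
        }
        return StandardCharsets.ISO_8859_1.name();
    }

    // Strict decode: malformed or unmappable input is reported instead of replaced.
    private static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(guess(utf8));
        System.out.println(guess(latin1));
    }
}
```

In a processor, the returned name could then be written to a flow file attribute and referenced from ConvertCharacterSet, as the issue suggests. The attribute name is not specified in the issue. A real implementation would more likely rely on a statistical detector, since strict decoding cannot tell single-byte encodings apart.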