[
https://issues.apache.org/jira/browse/NIFI-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985599#comment-14985599
]
Mark Payne commented on NIFI-1077:
----------------------------------
[~JPercivall] - looks good now! Verified that the processor is invalid with the
character set configured as "bogus", that it works with "utf-8", and that if I
use an attribute to reference a value of "utf-8" it works as planned. Using an
attribute that has a value of "bogus" did result in the FlowFile being rolled
back, which is expected. There was a thread on the mailing list recently about
how to handle that, but as of right now the general approach is to roll back
the session. Looks good overall. +1
Merged to master and pushed.
Thanks for knocking this out!
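
For readers following along, the literal-value check described above ("bogus" is rejected, "utf-8" is accepted) can be sketched with the standard java.nio.charset API. This is only an illustration, not the code from the attached patch; the class and method names (CharsetCheck, isSupportedCharset) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not taken from the NIFI-1077 patch.
public class CharsetCheck {

    /** Returns true if the running JVM can resolve the given character set name. */
    static boolean isSupportedCharset(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        try {
            // Charset.isSupported throws for syntactically illegal names,
            // and returns false for legal names the JVM does not know.
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSupportedCharset("utf-8")); // true
        System.out.println(isSupportedCharset("bogus")); // false
    }
}
```

A check like this only helps for literal values. When the value comes from an attribute it is not known until a FlowFile arrives, which is why the "bogus" attribute case above surfaces at run time as a rollback rather than as an invalid processor.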
> Allow ConvertCharacterSet to accept expression language
> -------------------------------------------------------
>
> Key: NIFI-1077
> URL: https://issues.apache.org/jira/browse/NIFI-1077
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Joseph Percivall
> Assignee: Joseph Percivall
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments:
> NIFI-1077_added_expression_validation_to_charset_validator.patch
>
>
> This issue arose from a user on the mailing list. It demonstrates the need to
> be able to use expression language to set the incoming (and potentially
> outgoing) character sets:
> I'm looking to process many files into common formats. The source files are
> coming in various character sets, mime types, and new line terminators.
> My thinking for a data flow was along these lines:
> GetFile (from many sub directories) ->
> ExecuteStreamCommand (file -i) ->
> ConvertCharacterSet (from previous command to utf8) ->
> ReplaceText (to change any \r\n into \n) ->
> PutFile (into a directory structure based on values found in the original
> file path and filename)
> Additional steps would be added for archiving a copy of the original,
> converting xml files, etc.
> Attempting to process these with NiFi leaves me confused as to how to process
> them within the tool. If I want to ConvertCharacterSet, I have to know the
> input type. I set up an ExecuteStreamCommand to run file -i
> ${absolute.path:append(${filename})} which returned the expected values. I
> don't see a way to turn these results into input for the processor, which
> doesn't accept expression language for that field.
> I also considered ConvertCSVToAvro as an interim step but noticed the same
> issue. Any suggestions what this dataflow should look like?
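
As a rough illustration of the step the user describes (turning the output of file -i into a character set name), the sketch below pulls the charset= value out of a line of output. The sample line and the names FileCharsetParser and extractCharset are assumptions made for illustration; they are not part of NiFi or of this issue.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
public class FileCharsetParser {

    // Matches the "charset=<name>" token, stopping at whitespace or ';'.
    private static final Pattern CHARSET = Pattern.compile("charset=([^\\s;]+)");

    /** Extracts the charset name from one line of `file -i` output, if present. */
    static Optional<String> extractCharset(String fileOutput) {
        if (fileOutput == null) {
            return Optional.empty();
        }
        Matcher m = CHARSET.matcher(fileOutput);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed shape of `file -i` output; the path is made up.
        String sample = "/data/in/report.txt: text/plain; charset=iso-8859-1";
        System.out.println(extractCharset(sample).orElse("unknown"));
    }
}
```

Note that file can report values such as charset=binary that a JVM will not recognize as a character set, so the extracted name would still need to be checked before it is used for conversion.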