I saw the patch you added for NIFI-1077. Thanks! Do you plan to add an issue for the ExecuteStreamCommand output, or should I be looking into NIFI-190 that Bryan mentioned?

On Tue, Oct 27, 2015 at 5:30 PM, Joe Percivall <[email protected]> wrote:

> No one responded with concerns regarding allowing expression language for
> the input/output character set, so I created a jira [1]. This use-case is
> something that should be easy for NiFi, and the flow for this use-case is
> definitely more of a hack job than it should be.
>
> Does anyone have objections to adding a configuration option to put the
> output of ExecuteStreamCommand into an attribute instead of the FlowFile
> contents?
>
> [1] https://issues.apache.org/jira/browse/NIFI-1077
>
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
>
> On Tuesday, October 27, 2015 5:15 PM, Charlie Frasure <[email protected]> wrote:
>
> Thank you both for the replies. I built a flow that adds the "fragment"
> attributes early on, and splits the feed after the ExecuteStream that
> identifies the character set. The character set payload goes through
> ExtractText to move it into an attribute and ReplaceText to delete the
> contents of the file. The two streams are then funneled to a MergeContent
> using Defragment, which results in the original data with an extra blank
> line and the character set attribute attached.
>
> I suppose at this point I could route based on attributes for each
> character set or call another ExecuteStream to iconv. This works, but
> seems a bit of a hack job. Any suggestions for improvement? Is this an
> expected use case for the tool?
>
> On Tue, Oct 27, 2015 at 10:45 AM, Bryan Bende <[email protected]> wrote:
>
> > One problem with the above flow is that ExecuteStreamCommand will replace
> > the contents of the FlowFile with the results of the command, so the
> > FlowFile will have the encoding value and no longer have the original
> > content.
> >
> > This could potentially be solved in the future with the "hold file"
> > processor [1], where the original file is held on one path while the same
> > file goes to ExecuteStreamCommand. After getting the encoding, it could be
> > extracted to an attribute and then trigger the original file for release,
> > copying over the encoding attribute.
> >
> > [1] https://issues.apache.org/jira/browse/NIFI-190
> >
> > On Tue, Oct 27, 2015 at 10:24 AM, Joe Percivall <[email protected]> wrote:
> >
> >> Hey Charlie,
> >>
> >> Sorry no one has followed up with you yet. One way I see around
> >> ConvertCharacterSet not supporting expression language is to route on
> >> attribute (assuming the character set is extracted to be an attribute) to
> >> different ConvertCharacterSet processors depending on the input character
> >> set.
> >>
> >> That being said, I don't see a reason why ConvertCharacterSet
> >> shouldn't support expression language. If no one has objections,
> >> I'll put in a ticket later today and knock it out real quick.
> >>
> >> Joe
> >> - - - - - -
> >> Joseph Percivall
> >> linkedin.com/in/Percivall
> >> e: [email protected]
> >>
> >> On Sunday, October 25, 2015 7:13 PM, Charlie Frasure <[email protected]> wrote:
> >>
> >> I'm looking to process many files into common formats. The source files
> >> are coming in various character sets, mime types, and new line terminators.
> >>
> >> My thinking for a data flow was along these lines:
> >>
> >> GetFile (from many sub directories) ->
> >> ExecuteStreamCommand (file -i) ->
> >> ConvertCharacterSet (from previous command to utf8) ->
> >> ReplaceText (to change any \r\n into \n) ->
> >> PutFile (into a directory structure based on values found in the
> >> original file path and filename)
> >>
> >> Additional steps would be added for archiving a copy of the original,
> >> converting xml files, etc.
> >>
> >> Attempting to process these with NiFi leaves me confused as to how to
> >> process within the tool. If I want to ConvertCharacterSet, I have to know
> >> the input type. I set up an ExecuteStreamCommand to run file -i
> >> ${absolute.path:append(${filename})}, which returned the expected values. I
> >> don't see a way to turn these results into input for the processor, which
> >> doesn't accept expression language for that field.
> >>
> >> I also considered ConvertCSVToAvro as an interim step but notice the
> >> same issue. Any suggestions what this dataflow should look like?
> >>
> >> Charlie
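
For anyone following the thread, the per-file steps of the flow quoted above (detect the character set with file -i, convert to UTF-8, change \r\n into \n) can be sketched outside NiFi roughly as below. This is only an illustration, not how the NiFi processors are implemented. The function names are made up for the example, and it assumes a file command that supports -i and reports a charset name that Python recognizes (values such as "binary" or "unknown-8bit" would need separate handling).

```python
import re
import subprocess


def parse_charset(file_output: str) -> str:
    # Pull the charset out of a line such as
    # "a.txt: text/plain; charset=iso-8859-1" (hypothetical helper).
    match = re.search(r"charset=([\w.:-]+)", file_output)
    if match is None:
        raise ValueError(f"no charset found in: {file_output!r}")
    return match.group(1)


def detect_charset(path: str) -> str:
    # Assumption: a `file` binary that accepts -i is on the PATH.
    result = subprocess.run(
        ["file", "-i", path], capture_output=True, text=True, check=True
    )
    return parse_charset(result.stdout)


def to_utf8_unix(data: bytes, charset: str) -> bytes:
    # Decode with the detected charset, turn CRLF into LF,
    # then re-encode as UTF-8.
    text = data.decode(charset)
    return text.replace("\r\n", "\n").encode("utf-8")
```

The point of the sketch is that the detected character set is carried alongside the original bytes rather than replacing them, which is what putting the ExecuteStreamCommand output into an attribute would allow inside the flow.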
