Charlie,

One thing to note when using the Correlation Attribute is the
<Maximum number of Bins> property. If the value of the Correlation
Attribute varies quite a bit, you could quickly fill up the default number
of bins (100). In that case, a FlowFile with a new correlation value has no
bin to go into, so instead of waiting for the timeout to occur, MergeContent
will immediately evict the oldest bin to make room.
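
To make the eviction behavior concrete, here is a minimal Python sketch of a
toy model. It is not NiFi's actual implementation: the class name
BinSimulator, its method, and the sample values are made up for illustration,
and it assumes the only rule is "when every bin is in use and a new
correlation value arrives, the oldest bin is forced out". Timeouts, size
thresholds, and everything else MergeContent considers are ignored.

```python
from collections import OrderedDict


class BinSimulator:
    """Toy model of correlation-based binning (hypothetical, not NiFi code)."""

    def __init__(self, max_bins):
        self.max_bins = max_bins
        # correlation value -> list of items, kept in bin-creation order
        self.bins = OrderedDict()
        # (correlation value, items) pairs, in the order they were evicted
        self.evicted = []

    def add(self, correlation_value, item):
        if correlation_value not in self.bins:
            if len(self.bins) >= self.max_bins:
                # No free bin: force out the oldest one, however few items it holds.
                oldest_key, oldest_items = self.bins.popitem(last=False)
                self.evicted.append((oldest_key, oldest_items))
            self.bins[correlation_value] = []
        self.bins[correlation_value].append(item)


# Three bins, four distinct correlation values, then "a" shows up again.
sim = BinSimulator(max_bins=3)
for value in ["a", "b", "c", "d", "a"]:
    sim.add(value, "file-for-" + value)

print([key for key, _ in sim.evicted])  # bins forced out early, oldest first
print(list(sim.bins))                   # bins still open
```

In this model the bin for "a" is pushed out after a single file, and when "a"
arrives again it starts a fresh bin and pushes out "b". If the real processor
behaves similarly, keeping the maximum number of bins above the number of
distinct correlation values in flight should avoid these early releases.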

Thanks
-Mark



> On Nov 30, 2015, at 3:05 PM, Charlie Frasure <[email protected]> wrote:
> 
> Joe,
> 
> Thanks for checking in.  I tried it again and noticed that the correlation 
> attribute in MergeContent doesn't accept expressions.  I was attempting to 
> combine multiple attributes to define a bin, so I moved that expression to an 
> earlier UpdateAttribute processor, which seemed to resolve my issue.
> 
> Now I'm dealing with bins being released before I think they should be, but
> it seems that other people have had the same problem and must have resolved
> it, so I'll poke at that a bit more before posting.
> 
> Thanks,
> Charlie
>  
> 
> 
> 
> On Mon, Nov 30, 2015 at 1:21 PM, Joe Percivall <[email protected]> wrote:
> Hello Charlie,
> 
> Sorry no one has gotten back to you yet; everyone is busy getting 0.4.0 
> finished up and, of course, with Thanksgiving. Have you made any more progress?
> 
> 
> Since it is a continuous task, it is well within NiFi's wheelhouse. In your 
> original message you mentioned that you already had them merged into a single 
> FlowFile but had trouble creating the path for a PutFile. Have you tried 
> using the Expression Language [1] to create the path? Assuming you have 
> attributes for the category and date, you should be able to write an 
> expression that evaluates to the path you need.
> 
> If you need help with creating the proper expression, just reply with the 
> attribute names for the category and dates and I'd be happy to help.
> 
> [1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
> 
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: [email protected]
> 
> 
> 
> 
> On Monday, November 23, 2015 11:37 AM, Charlie Frasure <[email protected]> wrote:
> 
> 
> 
> Joe,
> 
> This is a continuous task.  The main intent is to keep a version of the file 
> prior to conversions etc.  Ideally, it would be highly compressed, and easy 
> to locate.  Best case scenario, the archive files are the contents of highly 
> structured nested directories.  File sizes range from a few bytes to < 1GB.  
> It wouldn't have to run real time (updating archives seems to be a fairly 
> intensive task), but would probably run at least every few days.
> 
> Thanks,
> Charlie
> 
> 
> 
> 
> 
> On Mon, Nov 23, 2015 at 11:08 AM, Joe Witt <[email protected]> wrote:
> 
> Charlie,
> >
> >I can give some pointers on how to get in the ballpark with this, but I
> >want to make sure we have a good alignment of purpose here.  NiFi has
> >from time to time come up as an intuitive way to build an archive
> >management tool and it is always "not quite right" because of the
> >subtle differences between continuous streams of information and
> >ad-hoc sort of one-time tasks.
> >
> >Would this be a continuous task (always running) even if it is slow
> >(every few minutes, hours, days) or would it be a one-time thing to
> >move a bunch of data from one place to another?
> >
> >The difference sounds very minor but it will help me to understand how
> >best to respond.
> >
> >Thanks
> >Joe
> >
> >
> >On Mon, Nov 23, 2015 at 10:54 AM, Charlie Frasure <[email protected]> wrote:
> >> Use case: Archive and compress files by category and month, store like
> >> files in a common directory.
> >>
> >> I'm already processing the files, and have extracted the interesting
> >> attributes from each.  I ran them through MergeContent, but have not been
> >> able to produce a logical directory structure to store the results.  I would
> >> prefer something like archive/categoryA/201511/somefilename.tar.gz where
> >> somefilename is made up of all the categoryA files received in November
> >> 2015.
> >>
> >> I switched gears, and used PutFile to store the files in the preferred
> >> directory structure, but am at a loss as to how to archive them within
> >> their folders, given hundreds of dynamic categories and new dates added
> >> every month.
> >>
> >> I'm playing with MergeContent's Correlation Attribute Name, but am also
> >> considering trying the "Defragment" merge strategy by correlating the
> >> files earlier in the process.
> >>
> >> Any suggestions would be appreciated.
> >
> 
