JoeS

Sounds great.  I'd ignore my provenance comment, as that was really
more about how something external could keep tabs on progress, etc.
Mark Payne designed and built the List/Fetch HDFS one, so I'll defer to
him for the good bits.  But the logic you'll want to follow for saving
state is probably the same.

Mark - do you have the design of that thing documented anywhere?  It
is a good pattern to describe because it is effectively a model for
taking non-scalable dataflow interfaces and making them behave as if
they were.
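
As a rough illustration of the pattern only (this is not the actual
List/FetchHDFS code, which is not shown in this thread), here is a
minimal Python sketch. It assumes the listing step tracks the newest
modification time it has seen and saves it in a JSON file kept in a
separate state location outside the listed directory. The names
load_state, save_state, list_new_files, and fetch are hypothetical.

```python
import json
import os
from pathlib import Path


def load_state(state_file):
    """Return the last recorded modification time, or 0.0 if no state exists."""
    try:
        with open(state_file) as f:
            return json.load(f)["last_mtime"]
    except (FileNotFoundError, KeyError, ValueError):
        return 0.0


def save_state(state_file, last_mtime):
    """Write the state to a temp file, then rename it over the old state."""
    tmp = f"{state_file}.tmp"
    with open(tmp, "w") as f:
        json.dump({"last_mtime": last_mtime}, f)
    os.replace(tmp, state_file)


def list_new_files(root, state_file):
    """List files under root modified after the saved time, then advance it.

    Assumption: state_file lives outside root, so it is never listed itself.
    Files whose mtime equals the saved time are treated as already listed.
    """
    last = load_state(state_file)
    found = []
    newest = last
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            if mtime > last:
                found.append(path)
                newest = max(newest, mtime)
    if found:
        save_state(state_file, newest)
    return sorted(found)


def fetch(path):
    """Fetch step: read one listed file; any worker can do this given the path."""
    return Path(path).read_bytes()
```

The point of the split is that only the cheap listing step needs to
run in one place and remember state; the paths it emits can be handed
to any number of workers that call fetch independently.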

Thanks
JoeW

On Wed, Jul 29, 2015 at 6:07 AM, Joe Skora <[email protected]> wrote:
> Joe,
>
> I'm interested in working on List/FetchFile.  It seems like starting with
> [NIFI-631|https://issues.apache.org/jira/browse/NIFI-631] makes sense.
> I'll look at List/FetchHDFS, but is there any further detail on how this
> functionality should differ from GetFile?   As for keeping state,
> provenance was suggested, a separate state folder might work, or some file
> systems support additional state that might be usable.
>
> Regards,
> Joe
>
> On Tue, Jul 28, 2015 at 12:42 AM, Joe Witt <[email protected]> wrote:
>
>> Anup,
>>
>> The two tickets in question appear to be:
>> https://issues.apache.org/jira/browse/NIFI-631
>> https://issues.apache.org/jira/browse/NIFI-673
>>
>> Neither has been claimed yet.  Anybody interested in taking one
>> or both of these on?  It would be a lot like List/Fetch HDFS so you'll
>> have good examples to work from.
>>
>> Thanks
>> Joe
>>
>> On Tue, Jul 28, 2015 at 12:37 AM, Sethuram, Anup
>> <[email protected]> wrote:
>> > Can I expect this functionality in the upcoming releases of NiFi?
>> >
>> > On 13/07/15 9:13 am, "Sethuram, Anup" <[email protected]> wrote:
>> >
>> >>Where is this 1TB dataset living today?
>> >>[anup] Resides in a filesystem
>> >>
>> >>- What is the current nature of the dataset?  Is it already in large
>> >>bundles as files or is it a series of tiny messages, etc.?  Does it
>> >>need to be split/merged/etc.?
>> >>[anup] Archived files of size 3MB each collected over a period. Directory
>> >>(1TB) -> Sub-Directories  -> Files
>> >>
>> >>- What is the format of the data?  Is it something that can easily be
>> >>split/merged or will it require special processes to do so?
>> >>[anup] zip, tar formats.
>> >>
>> >>
>> >>
>> >>--
>> >>View this message in context:
>> >>http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Fetch-change-list-tp1351p2126.html
>> >>Sent from the Apache NiFi (incubating) Developer List mailing list
>> >>archive at Nabble.com.
>> >>
>> >>________________________________
>> >>The information contained in this message may be confidential and legally
>> >>protected under applicable law. The message is intended solely for the
>> >>addressee(s). If you are not the intended recipient, you are hereby
>> >>notified that any use, forwarding, dissemination, or reproduction of this
>> >>message is strictly prohibited and may be unlawful. If you are not the
>> >>intended recipient, please contact the sender by return e-mail and
>> >>destroy all copies of the original message.
>> >
>> >
>>
