Hi Koji, that seems like a pretty good idea, thanks for bringing it up! I wasn't
aware of nanofi but will definitely give it a shot. =)

Thanks

On Wed, Apr 10, 2019 at 22:38, Koji Kawamura <[email protected]>
wrote:

> Hi Eric,
>
> Although my knowledge of MiNiFi, Python and Go is limited, I wonder if
> the "nanofi" library can be used from the proprietary application so that
> it can fetch FlowFiles directly using the Site-to-Site protocol. That
> could be an interesting approach and would eliminate the need
> to store data on a local volume (mentioned in the possible approach
> A).
> https://github.com/apache/nifi-minifi-cpp/tree/master/nanofi
>
> The latest MiNiFi (C++) version 0.6.0 was released recently.
> https://cwiki.apache.org/confluence/display/MINIFI/Release+Notes
>
> Thanks,
> Koji
>
> On Thu, Apr 11, 2019 at 2:28 AM Eric Chaves <[email protected]> wrote:
> >
> > Hi Folks,
> >
> > My company is using nifi to run several data-flow processes, and now we
> have received a requirement to do some fairly complex ETL over large files. To
> process those files we have some proprietary applications (mostly written
> in Python or Go) that run as Docker containers.
> >
> > I don't think that porting those apps to nifi processors would produce a
> good result, given each app's complexity.
> >
> > Also, we would like to keep using the nifi queues so we can monitor overall
> progress as we already do (we run several other nifi flows), so for now we are
> discarding solutions that, for example, submit files to an external
> queue like SQS or RabbitMQ for consumption.
> >
> > So far we have come up with two solutions that would:
> >
> > 1. have a Kubernetes cluster of running jobs periodically querying the nifi
> queue for new flowfiles and pulling one when a file arrives.
> > 2. download the file content (which is already stored outside of nifi) and
> process it.
> > 3. submit the result back to nifi (using an HTTP listener processor) to
> trigger the subsequent nifi process.
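
A minimal sketch of step 3 above, assuming the "HTTP Listener processor" is NiFi's ListenHTTP and that it is reachable at the placeholder URL below. The host, port, `X-Job-Id` header and both function names are made up for illustration; the `contentListener` path is, as far as I know, ListenHTTP's default base path, but check it against your own processor configuration.

```python
import urllib.request

# Assumption (not from the thread): placeholder host and port; the
# "contentListener" base path should be checked against your ListenHTTP config.
LISTENER_URL = "http://nifi.example.internal:8081/contentListener"


def build_result_request(url, payload, attributes=None):
    """Build a POST request whose body is the processed result (bytes)."""
    headers = {"Content-Type": "application/octet-stream"}
    # Extra headers only turn into flowfile attributes if the listener is
    # configured to accept them; the names used here are hypothetical.
    headers.update(attributes or {})
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def submit_result(url, payload, attributes=None, timeout=30):
    """Send the result to the listener and return the HTTP status code."""
    request = build_result_request(url, payload, attributes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

An app container would call something like `submit_result(LISTENER_URL, result_bytes, {"X-Job-Id": "42"})` once it finishes a file. Building the request separately from sending it keeps the HTTP details easy to check without a running cluster.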
> >
> >
> > For steps 1 and 2, so far we are considering two possible approaches:
> >
> > A) use a minifi container together with the app container in a sidecar
> design. minifi would connect to our nifi cluster and handle downloading files
> to a local volume for the app container to process.
> >
> > B) use the nifi REST API to query and consume flowfiles on the queue
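
A rough sketch of the URLs approach B would touch, assuming the `flowfile-queues` endpoints of the NiFi REST API (a listing request is created with POST, polled with GET, and content is fetched per flowfile). The base URL, the connection id and the helper names are placeholders for illustration. As far as I can tell these endpoints list and read queued flowfiles but do not remove them from the queue, so "consume" would need something extra.

```python
from urllib.parse import quote


def _queue_url(api_base, connection_id):
    # api_base is something like "http://nifi.example.internal:8080/nifi-api"
    # (placeholder); connection_id is the id of the queue's connection.
    return f"{api_base.rstrip('/')}/flowfile-queues/{quote(connection_id, safe='')}"


def listing_requests_url(api_base, connection_id):
    """URL to POST to in order to start a listing of the queue."""
    return f"{_queue_url(api_base, connection_id)}/listing-requests"


def listing_status_url(api_base, connection_id, request_id):
    """URL to GET (poll) until the listing request reports it has finished."""
    return f"{listing_requests_url(api_base, connection_id)}/{quote(request_id, safe='')}"


def flowfile_content_url(api_base, connection_id, flowfile_uuid):
    """URL to GET the content of one queued flowfile."""
    return (
        f"{_queue_url(api_base, connection_id)}"
        f"/flowfiles/{quote(flowfile_uuid, safe='')}/content"
    )
```

Since the file content is already stored outside of nifi, a worker would probably only need the listing (for the flowfile attributes) and not the content endpoint.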
> >
> > One requirement is that, if needed, we would manually scale up the app
> cluster to have multiple containers consume more queued files in parallel.
> >
> > Do you guys recommend one over the other (or a third approach)? Any
> pitfalls you can foresee?
> >
> > Would be really glad to hear your thoughts on this matter.
> >
> > Best regards,
> >
> > Eric
>
