Hi Dweep,

I would like to add to Pierre Villard's insightful answer.
 2)  NiFi has at least 3 filesystem repositories (flowfile, content and
provenance), so the same record is written and read several times at
different stages of a single pipeline. This demands high IOPS. Scaling IOPS
vertically is very costly and sometimes hits a roadblock; clustered mode
handles this better by load balancing flowfiles across the nodes.
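
As a rough illustration of the point above, here is a back-of-the-envelope
sketch in Python. It is only a simplified model under stated assumptions:
the function names and all the numbers are hypothetical, and NiFi batches
and caches repository writes, so real figures will differ.

```python
import math


def required_iops(records_per_sec, stages, ops_per_stage):
    """Crude estimate of disk operations per second for one pipeline.

    Assumes (hypothetically) that every record costs `ops_per_stage`
    repository reads/writes at each of `stages` processing steps.
    """
    return records_per_sec * stages * ops_per_stage


def per_node_iops(total_iops, nodes):
    """IOPS each node must sustain if flowfiles are balanced evenly."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return total_iops / nodes


def nodes_needed(total_iops, iops_per_node):
    """Smallest cluster size whose combined disks cover the load."""
    return math.ceil(total_iops / iops_per_node)


if __name__ == "__main__":
    # Hypothetical workload: 5000 records/s, 4 stages, 3 disk ops per stage.
    total = required_iops(5000, 4, 3)
    print("total IOPS:", total)
    print("per node on 3 nodes:", per_node_iops(total, 3))
    print("nodes needed at 25000 IOPS each:", nodes_needed(total, 25000))
```

The idea is simply that the per-node requirement falls as nodes are added,
whereas a single machine has to absorb the whole load on its own disks.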

Regards,
Purushotham Pushpavanth



On Mon, 5 Aug 2019 at 15:37, Pierre Villard <[email protected]>
wrote:

> Hi Dweep,
>
> I'll let others chime in, but here are some answers to your questions:
>
> 1) Yes - NiFi supports a very fine-grained authorization model and
> several authentication mechanisms.
> Authentication:
> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
> Authorization:
> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization
>
> You can also find resources on the Internet on how to set up
> authentication & authorization.
>
> 2) I'd say that it depends on your requirements and on whether you need
> high availability. From a pure performance standpoint, vertical scaling is
> probably enough for your use case unless you have very large amounts of
> data. Clustering will help you achieve even better performance (millions of
> events per second), and will improve reliability in case of failure.
>
> 3) Yes, the data is persisted. There are some parameters that you can tune
> based on your tolerance for data loss.
> Example: nifi.flowfile.repository.always.sync - If set to true, any change
> to the repository will be synchronized to the disk, meaning that NiFi will
> ask the operating system not to cache the information. This is very
> expensive and can significantly reduce NiFi performance. However, if it is
> false, there could be the potential for data loss if either there is a
> sudden power loss or the operating system crashes. The default value is
> false.
>
> In other words, unless you have serious hardware/OS failures, you should
> not lose any data, and everything will be persisted and restored when NiFi
> restarts. If avoiding data loss is critical for your system, using a
> broker like Kafka, with its ability to replay events, could be a possible
> solution.
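
[Inline note on nifi.flowfile.repository.always.sync] To make the trade-off
described in point 3 concrete, here is a minimal Python sketch of what
"synchronized to the disk" means at the operating-system level. It is not
NiFi's code (NiFi is written in Java); append_record and its always_sync
flag are hypothetical names chosen only to mirror the property.

```python
import os


def append_record(path, data, always_sync):
    """Append bytes to a log file, optionally forcing them to stable storage.

    Without the sync step the data may sit in the OS cache for a while,
    which is fast but can be lost on sudden power loss or an OS crash.
    """
    with open(path, "ab") as f:
        f.write(data)
        f.flush()  # hand Python's own buffer over to the operating system
        if always_sync:
            # Ask the OS to write its cached data for this file to the
            # storage device before continuing. This is the expensive part.
            os.fsync(f.fileno())
```

Calling it with always_sync=True on every record is the safe but slow
variant; always_sync=False corresponds to the default behaviour described
above, where a clean NiFi restart is fine but a power cut may cost the most
recent writes.
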
>
> 4) I recommend this awesome post by Bryan:
> https://bryanbende.com/development/2016/09/15/apache-nifi-and-apache-kafka
>
> 5) There are some options available for the metrics. You can have a look
> at reporting tasks for this purpose. A set of articles you can read is
> available here:
> https://pierrevillard.com/2017/05/11/monitoring-nifi-introduction/
>
> Hope this helps!
> Pierre
>
>
>
>
>
> On Mon, 5 Aug 2019 at 07:11, Dweep Sharma <[email protected]>
> wrote:
>
>> Hi All,
>>
>> I have been using NiFi to set up some pipelines. Before I take on more
>> use cases with it, I need to understand a few capabilities:
>>
>> 1) Can we set up user authentication in front of the web application? If
>> yes, is there a way to have role-based access for process groups? I
>> would like certain teams to work only on specific groups and not control
>> all of them.
>>
>> 2) If the major use case only involves reading from RMQ and Kafka,
>> converting to Parquet and storing in S3, does it make sense to set up a
>> cluster, or is vertical scaling good enough?
>>
>> 3) Are the flow files in the queues (connections between processors)
>> persisted? Would a machine failure or restart cause a loss of data? For
>> instance, messages are dequeued from RMQ and then lost due to a failure.
>> What would be the best way to handle this? I think maintaining a low back
>> pressure threshold can help mitigate the loss.
>>
>> 4) Does the Kafka consumer consume all partitions by default, or is there
>> a way to control that?
>>
>> 5) Can we have some of the processor metrics pushed out as
>> notifications or alerts (flow file count in/out, errors, etc.)?
>>
>> It would be great if someone could share resources that address these.
>>
>> Thanks in advance.
>>
>> -Dweep
>>
>>
>>
>>
>
>
