Re: Possible Regression in PutAzureBlobStorage 1.12.0

2020-09-16 Thread Eric Secules
Hello,

I noticed that I may not have been configuring the processor correctly, though
nothing went wrong until now.

I had the following configured:
*Container Name*: "base_container/A/B"
*Blob*: ${filename}

I changed it to this and it works:
*Container Name*: "base_container"
*Blob*: "/A/B/"${filename}
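
In other words, it appears the Container Name should hold only the container
itself, and any pseudo-directories belong in the Blob name. A minimal sketch of
that split in Python; the helper name split_container_path is hypothetical (it
is not part of NiFi or the Azure SDK), and dropping the leading slash from the
blob name is an assumption:

```python
from typing import Tuple


def split_container_path(container_value: str, blob_value: str) -> Tuple[str, str]:
    """Split 'container/dir/dir' into a bare container name and a prefixed blob name."""
    # Everything before the first "/" is treated as the container name.
    container, _, prefix = container_value.strip("/").partition("/")
    blob = blob_value.lstrip("/")
    if prefix:
        # The remaining path segments become pseudo-directories of the blob.
        blob = f"{prefix.strip('/')}/{blob}"
    return container, blob


print(split_container_path("base_container/A/B", "my_test_blob.json"))
# prints ('base_container', 'A/B/my_test_blob.json')
```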

Thanks,
Eric

On Wed, Sep 16, 2020 at 1:12 PM Paul Kelly  wrote:

> Hi Eric,
>
> We also noticed an issue with PutAzureBlobStorage when upgrading to NiFi
> 1.12.0, and I believe it has to do with changes made for
> https://issues.apache.org/jira/browse/NIFI-6913, where the default
> behavior changed to check to see if a container exists before writing a
> file, and creating the container if it doesn't exist.  The SAS tokens we
> are using within our flows do not have permission to list containers, so
> this check is failing, and therefore the overall Put operation is failing.
> We are seeing a different error, but I suspect it is related since this is in
> your stack trace:
>
> at
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
> at
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
>
> which indicates that yours is also trying to create the container and
> failing due to this new check.  Do you have permission to list and create
> containers?
>
> I reported this issue in https://issues.apache.org/jira/browse/NIFI-7794.
> I plan to submit a fix for it by adding a property to the processor to
> determine if a container should be created if it doesn't exist, but if
> someone else beats me to it, that'd be great.
>
> Paul
>
> On Wed, Sep 16, 2020 at 7:10 PM Joey Frazee 
> wrote:
>
>> Eric, can you share any details about your config (e.g., what do you have
>> in the Blob property)? I tried the following scenarios in an upgrade to
>> 1.12.0 and main and they seem to work.
>>
>> Pre-existing object: A/B/test.json
>>
>> New object: A/B/${filename}.json
>>
>> New object with new pseudo-dirs:
>> ${random():mod(10):plus(1)}/${random():mod(10):plus(1)}/${filename}.json
>>
>> -joey
>>
>> On Sep 16, 2020, 11:02 AM -0700, Eric Secules ,
>> wrote:
>>
>> Hello everyone,
>>
>>
>> I was able to see why this is an issue: the blob is stored several layers
>> deep, at "my-container/A/B/my_test_blob.json".
>>
>>
>> -Eric
>>
>>
>> On Wed, Sep 16, 2020 at 10:49 AM Eric Secules  wrote:
>>
>>> Hello everyone,
>>>
>>> I tried upgrading to 1.12.0 and right away noticed that
>>> PutAzureBlobStorage is failing due to the following error. I don't think
>>> it's an issue with access because I can use a ListAzureBlobStorage on the
>>> same container and I haven't changed the permissions of the container. And
>>> I didn't change any parameters during the upgrade and it was working on
>>> 1.11.4.
>>>
>>> I am writing the blob to a container path that already exists,
>>> "my-container/A/B/my_test_blob.json":
>>>
>>> 2020-09-16 00:59:17,283 ERROR [Timer-Driven Process Thread-6]
>>> o.a.n.p.a.storage.PutAzureBlobStorage
>>> PutAzureBlobStorage[id=15404d54-bc14-350c-7847-521b765dd57f] Failed to put
>>> Azure blob my_test_blob.json: com.microsoft.azure.storage.StorageException:
>>> The requested URI does not represent any resource on the server.
>>> com.microsoft.azure.storage.StorageException: The requested URI does not
>>> represent any resource on the server.
>>> at
>>> com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
>>> at
>>> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:220)
>>> at
>>> com.microsoft.azure.storage.blob.CloudBlobContainer.exists(CloudBlobContainer.java:744)
>>> at
>>> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
>>> at
>>> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
>>> at
>>> org.apache.nifi.processors.azure.storage.PutAzureBlobStorage.onTrigger(PutAzureBlobStorage.java:100)
>>> at
>>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>> at
>>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
>>> at
>>> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
>>> at
>>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>>> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> at
>>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>> at
>>> 

Re: Possible Regression in PutAzureBlobStorage 1.12.0

2020-09-16 Thread Paul Kelly
Hi Eric,

We also noticed an issue with PutAzureBlobStorage when upgrading to NiFi
1.12.0, and I believe it has to do with changes made for
https://issues.apache.org/jira/browse/NIFI-6913, where the default behavior
changed to check to see if a container exists before writing a file, and
creating the container if it doesn't exist.  The SAS tokens we are using
within our flows do not have permission to list containers, so this check
is failing, and therefore the overall Put operation is failing.  We are
seeing a different error, but I suspect it is related since this is in your
stack trace:

at
com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
at
com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)

which indicates that yours is also trying to create the container and
failing due to this new check.  Do you have permission to list and create
containers?

I reported this issue in https://issues.apache.org/jira/browse/NIFI-7794.
I plan to submit a fix for it by adding a property to the processor to
determine if a container should be created if it doesn't exist, but if
someone else beats me to it, that'd be great.
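
To illustrate the idea, here is a rough Python sketch of making the container
check optional. Everything in it is a stand-in: FakeContainer is not the Azure
SDK, and the create_container flag is a hypothetical name for the proposed
property, not the one the eventual fix will necessarily use.

```python
class FakeContainer:
    """Stand-in for a blob container client; not the Azure SDK."""

    def __init__(self, can_check_existence: bool):
        self.can_check_existence = can_check_existence
        self.blobs = {}

    def create_if_not_exists(self) -> bool:
        # Simulate the existence check failing for a restricted SAS token.
        if not self.can_check_existence:
            raise PermissionError("token may not check or create containers")
        return False  # container already exists, nothing was created

    def upload(self, name: str, data: bytes) -> None:
        self.blobs[name] = data


def put_blob(container, name, data, create_container=False):
    """Upload a blob, touching the container itself only when asked to."""
    if create_container:
        container.create_if_not_exists()
    container.upload(name, data)


restricted = FakeContainer(can_check_existence=False)
put_blob(restricted, "A/B/test.json", b"{}")  # succeeds: the check is skipped
try:
    put_blob(restricted, "A/B/test.json", b"{}", create_container=True)
except PermissionError as err:
    print("put failed:", err)
```

With the flag off, a token that can only write blobs is enough; with it on, the
same token fails before any data is written, which matches what we are seeing.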

Paul

On Wed, Sep 16, 2020 at 7:10 PM Joey Frazee  wrote:

> Eric, can you share any details about your config (e.g., what do you have
> in the Blob property)? I tried the following scenarios in an upgrade to
> 1.12.0 and main and they seem to work.
>
> Pre-existing object: A/B/test.json
>
> New object: A/B/${filename}.json
>
> New object with new pseudo-dirs:
> ${random():mod(10):plus(1)}/${random():mod(10):plus(1)}/${filename}.json
>
> -joey
>
> On Sep 16, 2020, 11:02 AM -0700, Eric Secules , wrote:
>
> Hello everyone,
>
>
> I was able to see why this is an issue: the blob is stored several layers
> deep, at "my-container/A/B/my_test_blob.json".
>
>
> -Eric
>
>
> On Wed, Sep 16, 2020 at 10:49 AM Eric Secules  wrote:
>
>> Hello everyone,
>>
>> I tried upgrading to 1.12.0 and right away noticed that
>> PutAzureBlobStorage is failing due to the following error. I don't think
>> it's an issue with access because I can use a ListAzureBlobStorage on the
>> same container and I haven't changed the permissions of the container. And
>> I didn't change any parameters during the upgrade and it was working on
>> 1.11.4.
>>
>> I am writing the blob to a container path that already exists,
>> "my-container/A/B/my_test_blob.json":
>>
>> 2020-09-16 00:59:17,283 ERROR [Timer-Driven Process Thread-6]
>> o.a.n.p.a.storage.PutAzureBlobStorage
>> PutAzureBlobStorage[id=15404d54-bc14-350c-7847-521b765dd57f] Failed to put
>> Azure blob my_test_blob.json: com.microsoft.azure.storage.StorageException:
>> The requested URI does not represent any resource on the server.
>> com.microsoft.azure.storage.StorageException: The requested URI does not
>> represent any resource on the server.
>> at
>> com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
>> at
>> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:220)
>> at
>> com.microsoft.azure.storage.blob.CloudBlobContainer.exists(CloudBlobContainer.java:744)
>> at
>> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
>> at
>> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
>> at
>> org.apache.nifi.processors.azure.storage.PutAzureBlobStorage.onTrigger(PutAzureBlobStorage.java:100)
>> at
>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>> at
>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
>> at
>> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
>> at
>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>> at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at
>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by: java.lang.NullPointerException: null
>> at
>> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:202)
>> ... 16 common frames omitted

Re: Possible Regression in PutAzureBlobStorage 1.12.0

2020-09-16 Thread Joey Frazee
Eric, can you share any details about your config (e.g., what do you have in 
the Blob property)? I tried the following scenarios in an upgrade to 1.12.0 and 
main and they seem to work.

Pre-existing object: A/B/test.json

New object: A/B/${filename}.json

New object with new pseudo-dirs: 
${random():mod(10):plus(1)}/${random():mod(10):plus(1)}/${filename}.json
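
For reference, the last expression puts each blob under two pseudo-directories
numbered 1 through 10. A small Python emulation of what it evaluates to,
assuming random() yields a non-negative whole number as I understand the
Expression Language guide to describe; the function name is made up for
illustration:

```python
import random


def pseudo_dir_blob_name(filename: str, rng: random.Random) -> str:
    """Emulate ${random():mod(10):plus(1)}/${random():mod(10):plus(1)}/${filename}.json."""

    def segment() -> int:
        value = rng.randrange(2**63)  # stand-in for random()
        return value % 10 + 1         # mod(10) gives 0..9, plus(1) gives 1..10

    return f"{segment()}/{segment()}/{filename}.json"


print(pseudo_dir_blob_name("my_test_blob", random.Random()))
```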

-joey

On Sep 16, 2020, 11:02 AM -0700, Eric Secules , wrote:
> Hello everyone,
>
> I was able to see why this is an issue: the blob is stored several layers 
> deep, at "my-container/A/B/my_test_blob.json".
>
> -Eric
>
On Wed, Sep 16, 2020 at 10:49 AM Eric Secules  wrote:
> Hello everyone,
>
> I tried upgrading to 1.12.0 and right away noticed that PutAzureBlobStorage 
> is failing due to the following error. I don't think it's an issue with 
> access because I can use a ListAzureBlobStorage on the same container and I 
> haven't changed the permissions of the container. And I didn't change any 
> parameters during the upgrade and it was working on 1.11.4.
>
> I am writing the blob to a container path that already exists, 
> "my-container/A/B/my_test_blob.json":
>
> 2020-09-16 00:59:17,283 ERROR [Timer-Driven Process Thread-6] 
> o.a.n.p.a.storage.PutAzureBlobStorage 
> PutAzureBlobStorage[id=15404d54-bc14-350c-7847-521b765dd57f] Failed to put 
> Azure blob my_test_blob.json: com.microsoft.azure.storage.StorageException: 
> The requested URI does not represent any resource on the server.
> com.microsoft.azure.storage.StorageException: The requested URI does not 
> represent any resource on the server.
>         at 
> com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
>         at 
> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:220)
>         at 
> com.microsoft.azure.storage.blob.CloudBlobContainer.exists(CloudBlobContainer.java:744)
>         at 
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
>         at 
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
>         at 
> org.apache.nifi.processors.azure.storage.PutAzureBlobStorage.onTrigger(PutAzureBlobStorage.java:100)
>         at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>         at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
>         at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
>         at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>         at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
>         at 
> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:202)
>         ... 16 common frames omitted
>
> Thanks,
> Eric


Re: Possible Regression in PutAzureBlobStorage 1.12.0

2020-09-16 Thread Eric Secules
Hello everyone,

I was able to see why this is an issue: the blob is stored several layers
deep, at "my-container/A/B/my_test_blob.json".

-Eric

On Wed, Sep 16, 2020 at 10:49 AM Eric Secules  wrote:

> Hello everyone,
>
> I tried upgrading to 1.12.0 and right away noticed that
> PutAzureBlobStorage is failing due to the following error. I don't think
> it's an issue with access because I can use a ListAzureBlobStorage on the
> same container and I haven't changed the permissions of the container. And
> I didn't change any parameters during the upgrade and it was working on
> 1.11.4.
>
> I am writing the blob to a container path that already exists,
> "my-container/A/B/my_test_blob.json":
>
> 2020-09-16 00:59:17,283 ERROR [Timer-Driven Process Thread-6]
> o.a.n.p.a.storage.PutAzureBlobStorage
> PutAzureBlobStorage[id=15404d54-bc14-350c-7847-521b765dd57f] Failed to put
> Azure blob my_test_blob.json: com.microsoft.azure.storage.StorageException:
> The requested URI does not represent any resource on the server.
> com.microsoft.azure.storage.StorageException: The requested URI does not
> represent any resource on the server.
> at
> com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
> at
> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:220)
> at
> com.microsoft.azure.storage.blob.CloudBlobContainer.exists(CloudBlobContainer.java:744)
> at
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
> at
> com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
> at
> org.apache.nifi.processors.azure.storage.PutAzureBlobStorage.onTrigger(PutAzureBlobStorage.java:100)
> at
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
> at
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
> at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
> at
> com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:202)
> ... 16 common frames omitted
>
> Thanks,
> Eric
>


Possible Regression in PutAzureBlobStorage 1.12.0

2020-09-16 Thread Eric Secules
Hello everyone,

I tried upgrading to 1.12.0 and right away noticed that PutAzureBlobStorage
is failing due to the following error. I don't think it's an issue with
access because I can use a ListAzureBlobStorage on the same container and I
haven't changed the permissions of the container. And I didn't change any
parameters during the upgrade and it was working on 1.11.4.

I am writing the blob to a container path that already exists,
"my-container/A/B/my_test_blob.json":

2020-09-16 00:59:17,283 ERROR [Timer-Driven Process Thread-6]
o.a.n.p.a.storage.PutAzureBlobStorage
PutAzureBlobStorage[id=15404d54-bc14-350c-7847-521b765dd57f] Failed to put
Azure blob my_test_blob.json: com.microsoft.azure.storage.StorageException:
The requested URI does not represent any resource on the server.
com.microsoft.azure.storage.StorageException: The requested URI does not
represent any resource on the server.
at
com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
at
com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:220)
at
com.microsoft.azure.storage.blob.CloudBlobContainer.exists(CloudBlobContainer.java:744)
at
com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:354)
at
com.microsoft.azure.storage.blob.CloudBlobContainer.createIfNotExists(CloudBlobContainer.java:301)
at
org.apache.nifi.processors.azure.storage.PutAzureBlobStorage.onTrigger(PutAzureBlobStorage.java:100)
at
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
at
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: null
at
com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:202)
... 16 common frames omitted

Thanks,
Eric


Re: Securing NiFi behind a proxy (NGINX).

2020-09-16 Thread Kevin Doran
Hi,

Your understanding is correct: In order to get the connections you want, NGINX 
will have to be recognized by NiFi as an authorized proxy. The client 
certificate DN will be used for each request, provided NGINX terminates that 
TLS connection from the client and passes the DN of the certificate in the 
X-ProxiedEntitiesChain header to NiFi.
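
As a rough illustration of the header value, here is a small Python sketch. It
assumes the chain is written as a sequence of DNs, each wrapped in angle
brackets (please check the NiFi admin guide for the exact format in your
version); the helper name and the example DN are hypothetical.

```python
def proxied_entities_chain(*dns: str) -> str:
    """Wrap each DN in angle brackets and concatenate them in order."""
    return "".join(f"<{dn}>" for dn in dns)


# Hypothetical end-user DN taken from the client certificate.
headers = {"X-ProxiedEntitiesChain": proxied_entities_chain("CN=alice, OU=NiFi")}
print(headers)
# prints {'X-ProxiedEntitiesChain': '<CN=alice, OU=NiFi>'}
```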

There are a few examples here:
https://github.com/ijokarumawak/nifi-reverseproxy/tree/master/nginx

Here is an example of configuring NGINX to pass the client Cert DN to NiFi:
https://github.com/ijokarumawak/nifi-reverseproxy/blob/master/nginx/standalone-secure-http/nginx.conf

The FQDN of NGINX should match the external hostname of the machine (i.e., what 
the client uses to send requests).

Hope this helps,
Kevin



> On Sep 16, 2020, at 02:04, scotty  wrote:
> 
> Hi Vijay,
> 
> After realizing that the reverse proxy was the problem, I've got NiFi,
> standalone, secured with certificates by taking the reverse proxy out of
> the mix.
> 
> Is there an example somewhere of using an NGINX reverse proxy so that I
> can have the following setup?
> 
> client > https > NGINX > https > NiFi
> 
> My understanding is that NGINX needs a client certificate and that the FQDN
> of that certificate needs to be set up to proxy user requests in the NiFi UI.
> I've done both of these things as well as set up the nifi.web.proxy.host and
> nifi.web.proxy.context.path in the nifi.properties file.
> 
> Is there a specific FQDN that NGINX is supposed to have?
> 
> Thanks.
> 
> 
> 
> --
> Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/



Re: Performance of DistributeLoad - Batch Size

2020-09-16 Thread Mark Payne
I wasn’t expecting a bug report either :) Re the record stuff: I agree that the 
schema handling can be a bit complicated when you’re getting started.  
Especially if you’re not familiar with Avro and the schema format that it uses. 
But typically once you create a couple of schemas and configure a couple of 
record readers/writers, it starts to make a lot more sense.

Also of note, it’s gotten a *LOT* easier to handle, with the introduction of 
schema inference. If you don’t plan to use a schema registry outside of NiFi, 
you can usually just use a Schema Access Strategy of “Infer Schema” for Record 
Readers and a Schema Access Strategy of “Inherit Record Schema” for Record 
Writers. Most of the other schema-related properties can be ignored.

And there’s a PR up for NIFI-1121 [1], which is in review. That should also 
help to make the readers/writers much easier to configure by automatically 
hiding properties that are not relevant when configuring components. For 
example, if you choose a Schema Access Strategy of Infer Schema, there should 
be no need to ask you for the Schema Name and Schema Text, as those don’t 
really apply.

So I do think it’s worth taking the time to learn the Record stuff now - the 
performance difference is amazing, and flows are usually much more 
straightforward. But there’s more we’re doing to make it easier.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-1121

On Sep 15, 2020, at 9:48 PM, Ryan Hendrickson wrote:

Thanks Mark - I was not expecting a Bug report out of this!  I'll give the 0 
millis a try tomorrow and see what happens.  In fairness, your laptop is 
probably more powerful than the virtual CPUs I'm running on :-).

@Ryan I've got to learn the Record stuff better than I have now... It's the 
whole complicated schema thing that has kept me away for far too long...

Ryan

On Tue, Sep 15, 2020 at 7:04 PM Mark Payne wrote:
Hey Ryan,

I tried to replicate the behavior that you’re seeing. I wasn’t seeing behavior 
as slow as what you’re mentioning, but was definitely seeing significantly 
slower performance than I would have expected (reached about 1.5 million/5 mins 
on my laptop, would expect about 8-10 million/5 mins). I did some quick profiling 
and saw that it’s due to the NiFi session not handling a large number of 
Provenance Route events well. I created a Jira for this [1]. Interestingly, in 
the interim, you may get better performance by using a Run Duration of 0 millis 
instead of 1 second. That would end up being more expensive in other ways but 
would avoid the issue found in NIFI-7812. Hard to know for sure if it would 
help without trying it out to see.
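
To put those numbers on a per-second footing, a quick back-of-the-envelope
calculation in Python (the figures are the ones quoted above; the helper name
is just for illustration):

```python
def per_second(flowfiles_per_5_min: float) -> float:
    """Convert a FlowFiles-per-five-minutes figure to FlowFiles per second."""
    return flowfiles_per_5_min / (5 * 60)


for label, count in [
    ("observed", 1_500_000),
    ("expected (low)", 8_000_000),
    ("expected (high)", 10_000_000),
]:
    print(f"{label}: {per_second(count):,.0f} FlowFiles/sec")
```

That works out to roughly 5,000 FlowFiles/sec observed against an expected
27,000 to 33,000 FlowFiles/sec.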

Hope this helps!
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-7812



On Sep 15, 2020, at 5:42 PM, Ryan Hendrickson wrote:

Hi Mark,
   I'm using Next Available, and the Destination Queues are set with Zero (0) 
for Back Pressure and Size threshold, so the destinations should not fill up.

   I did switch to using RoundRobin and set it to a yield of 0.  That got me up 
to about 300,000 FlowFiles / 5 minutes.  I was hoping for something around 
1,000,000 FlowFiles / 5 minutes.

   The overall flow looks a bit like this: Large amount of FlowFiles -> 
DistributeLoad -> PutElasticsearchHttp.

Ryan

On Tue, Sep 15, 2020 at 4:55 PM Mark Payne wrote:
Ryan,

I presume you’re using the Round Robin strategy? Looks like that strategy will 
yield the processor if any destination is full. And it sounds like that will be 
very common in your case. I would recommend configuring the Processor and, in the 
Settings tab, setting the Yield Duration to “0 secs”. I suspect that will give you 
dramatically better performance.

Thanks
-Mark


> On Sep 15, 2020, at 4:41 PM, Ryan Hendrickson wrote:
>
> Hello,
> I've got 1 million plus FlowFiles (nothing I can do about the count) that 
> go to a DistributeLoad.  The DistributeLoad, with 2 threads and a run duration 
> of 1 sec, can only sustain ~200,000 FlowFiles / five minutes.
>
> Is there a better design pattern or a processor that takes a Batch Size to 
> split a Relationship into two or more?
>
> Thanks,
> Ryan





Re: Securing NiFi behind a proxy (NGINX).

2020-09-16 Thread scotty
Hi Vijay,

After realizing that the reverse proxy was the problem, I've got NiFi,
standalone, secured with certificates by taking the reverse proxy out of
the mix.

Is there an example somewhere of using an NGINX reverse proxy so that I
can have the following setup?
 
client > https > NGINX > https > NiFi

My understanding is that NGINX needs a client certificate and that the FQDN
of that certificate needs to be set up to proxy user requests in the NiFi UI.
I've done both of these things as well as set up the nifi.web.proxy.host and
nifi.web.proxy.context.path in the nifi.properties file.

Is there a specific FQDN that NGINX is supposed to have?

Thanks.



--
Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/