Thanks!

Let's move the discussion over to the PR.

Matt

> On Sep 10, 2016, at 12:14 AM, Peter Wicks (pwicks) <[email protected]> wrote:
> 
> Matt,
>  
> I’ve identified the source of the issue and created a patch, a unit test, and a PR. 
> The problem is in StandardFlowFileQueue, in writeSwapFilesIfNecessary. When it 
> calculates `numSwapFiles`, if the number of FlowFiles in the queue divides 
> evenly (in my case 100000/20000 = 5) and the Active Queue is empty, then 
> ALL files move to swap and none are left in Active.
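The arithmetic described above can be illustrated with a minimal sketch. This is not the NiFi source: the class, the method names, and the assumption that every whole swap file is written out are hypothetical, and the figures 100000 and 20000 are simply the ones quoted in the message.

```java
// Illustrative sketch only; NOT the NiFi implementation. All names are hypothetical.
class SwapSplitSketch {

    /** Number of whole swap files that can be filled from the queued FlowFiles. */
    static int numSwapFiles(int queuedCount, int filesPerSwapFile) {
        return queuedCount / filesPerSwapFile; // integer division drops any remainder
    }

    /** FlowFiles left over for the active queue once every whole swap file is written. */
    static int remainingActive(int queuedCount, int filesPerSwapFile) {
        return queuedCount - numSwapFiles(queuedCount, filesPerSwapFile) * filesPerSwapFile;
    }

    public static void main(String[] args) {
        int filesPerSwapFile = 20000; // divisor taken from the example in the message
        for (int queued : new int[] {100000, 100500}) {
            System.out.printf("queued=%d swapFiles=%d leftInActive=%d%n",
                    queued,
                    numSwapFiles(queued, filesPerSwapFile),
                    remainingActive(queued, filesPerSwapFile));
        }
        // prints:
        // queued=100000 swapFiles=5 leftInActive=0
        // queued=100500 swapFiles=5 leftInActive=500
    }
}
```

With an evenly divisible count nothing is left for the active queue, which matches the empty listing reported in this thread; with a remainder, the leftover FlowFiles would still be listable.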
>  
> https://github.com/apache/nifi/pull/1000
>  
> If you have a chance to take a look at the PR I’d appreciate it.
>  
> Thanks,
>   Peter
>  
> From: Matt Gilman [mailto:[email protected]] 
> Sent: Friday, September 09, 2016 5:00 PM
> To: [email protected]
> Subject: Re: Erroneous Queue has No FlowFiles message
>  
> Peter,
>  
> Thanks for the confirmation. I think there is some case we're hitting here 
> where some flowfiles are being swapped instead of added back to the active 
> queue. The queue listing only returns the top 100 entries in the active 
> queue. Haven't identified the case that's causing it yet but definitely have 
> a better idea what's going on now.
>  
> Thanks
>  
> Matt
> 
> On Sep 9, 2016, at 5:42 PM, Peter Wicks (pwicks) <[email protected]> wrote:
> 
> Matt,
>  
> I followed the swapping train of thought and debugged the code. When I debug 
> the code where it gets the files the `size` variable looks like this:
>  
> FlowFile Queue Size[ ActiveQueue=[0, 0 Bytes], Swap Queue=[100000, 26600000 
> Bytes], Swap Files=[10], Unacknowledged=[0, 0 Bytes] ]
>  
> But the List FlowFiles command only looks at the Active queue…
>  
> That looks like the root cause; what I don’t know is whether this is by design.
>  
> --Peter
>  
> From: Peter Wicks (pwicks) 
> Sent: Friday, September 09, 2016 3:28 PM
> To: '[email protected]' <[email protected]>
> Subject: RE: Erroneous Queue has No FlowFiles message
>  
> Matt,
>  
> You also asked in an earlier email if I could still reproduce it, and if so 
> to try enabling DEBUG level logging.  I am able to reproduce, so I enabled it:
>  
> 2016-09-09 21:27:28,352 DEBUG [List FlowFiles for Connection 
> 0f620e2d-0157-1000-4a1d-fd988c59e290] o.a.n.controller.StandardFlowFileQueue 
> FlowFileQueue[id=0f620e2d-0157-1000-4a1d-fd988c59e290] Acquired lock to 
> perform listing of FlowFiles
>  
> 2016-09-09 21:27:28,353 DEBUG [List FlowFiles for Connection 
> 0f620e2d-0157-1000-4a1d-fd988c59e290] o.a.n.controller.StandardFlowFileQueue 
> FlowFileQueue[id=0f620e2d-0157-1000-4a1d-fd988c59e290] Finished listing 
> FlowFiles for active queue with a total of 0 results
>  
> 2016-09-09 21:27:29,656 INFO [NiFi Web Server-112] 
> o.a.n.controller.StandardFlowFileQueue Canceling ListFlowFile Request with ID 
> 10d92339-0157-1000-42f4-464c37340fdb
>  
> Thanks,
>   Peter
>  
> From: Peter Wicks (pwicks) 
> Sent: Friday, September 09, 2016 3:15 PM
> To: [email protected]
> Subject: RE: Erroneous Queue has No FlowFiles message
>  
> Matt,
>  
> PutSQL is the end of the line, no downstream processors.
> Batch size is 1000, yes I have fragmented transactions set to false.
>  
> nifi.queue.swap.threshold=20000
>  
> --Peter
>  
>  
> From: Matt Gilman [mailto:[email protected]] 
> Sent: Friday, September 09, 2016 2:23 PM
> To: [email protected]
> Subject: Re: Erroneous Queue has No FlowFiles message
>  
> Peter,
>  
> Would you be able to share what you've configured for the batch size of 
> PutSQL (assuming that 'fragmented transactions' is disabled) and what your 
> swap threshold is configured to (nifi.queue.swap.threshold in 
> nifi.properties)?
>  
> Also, what is following the PutSQL? Had any of those connections exceeded 
> their configured back pressure threshold?
>  
> Thanks again.
>  
> Matt
>  
> On Fri, Sep 9, 2016 at 11:18 AM, Peter Wicks (pwicks) <[email protected]> 
> wrote:
> PutSQL.  The 100k FlowFiles are all SQL Insert queries with associated 
> attributes, generated by a JSONToSQL processor.
>  
> From: Matt Gilman [mailto:[email protected]] 
> Sent: Friday, September 09, 2016 8:51 AM
> To: [email protected]
> Subject: Re: Erroneous Queue has No FlowFiles message
>  
> Peter,
>  
> What is the processor downstream of the connection in question? Thanks.
>  
> Matt
>  
> On Fri, Sep 9, 2016 at 10:39 AM, Matt Gilman <[email protected]> wrote:
> Peter,
>  
> Thanks for the answers. Still not quite sure what's causing this and am 
> trying to narrow down the possible cause. Are you still able to replicate the 
> issue? If so, can you enable debug level logging for 
>  
> org.apache.nifi.controller.StandardFlowFileQueue
>  
> and see if there are any meaningful messages in the nifi-app.log?
>  
> Thanks!
>  
> Matt
>  
>  
> On Fri, Sep 9, 2016 at 9:52 AM, Peter Wicks (pwicks) <[email protected]> 
> wrote:
> Matt,
>  
> This is not a cluster.
> Yes, it’s secured. Kerberos.
>  
> The thing that gets me is I can list another queue on the same graph/same 
> processor group.
>  
> --Peter
>  
>  
> From: Matt Gilman [mailto:[email protected]] 
> Sent: Friday, September 09, 2016 5:25 AM
> To: [email protected]
> Subject: Re: Erroneous Queue has No FlowFiles message
>  
> Peter,
>  
> Thanks for the details! These will be very helpful in investigating what's 
> happening here. A couple of follow-up questions...
>  
> - Is this a cluster?
> - Is this instance secured?
>  
> Thanks
>  
> Matt
>  
> On Fri, Sep 9, 2016 at 12:13 AM, Peter Wicks (pwicks) <[email protected]> 
> wrote:
> Gunjan,
>  
> Thanks for the response. I included those messages to emphasize the 
> difference between a normal Queue List and mine.  In a normal queue list the 
> GET step includes a non-empty “flowFileSummaries” array, assuming there are 
> FlowFiles to show.
> When I list my other queue, the one with 23 FlowFiles in it, I get back an 
> array with 23 entries.  Based on the JSON I’m assuming that my queue with 
> 100,000 files in it should return 100, but instead I get 0.
>  
> Thanks,
>   Peter
>  
> From: Gunjan Dave [mailto:[email protected]] 
> Sent: Thursday, September 08, 2016 9:26 PM
> To: [email protected]
> Subject: Re: Erroneous Queue has No FlowFiles message
>  
> Hi Peter, once you post the request (your first step), you get a listing 
> request reference handle (a UUID) as part of the response. 
> This UUID is used to perform all the operations on the queue.
> This UUID is active until a DELETE request is sent. Once you delete the 
> active request, you get the message you mentioned in the logs; this is not an 
> issue. 
> If you check the developer panel in Chrome, you will see all 3 operations, 
> POST, GET, and DELETE, in succession.
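The three-step sequence described above can be sketched as follows. This is a rough illustration, not a tested client: the class and method names are hypothetical, the POST path is inferred from the "POST - Listing-requests" label and the `uri` values quoted elsewhere in this thread, and the requests are only built and printed, not sent. Actually sending them would need an `HttpClient` plus whatever credentials the secured instance requires.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch only. Builds (but does not send) the POST, GET and DELETE
// requests of a queue listing. Class and method names are hypothetical.
class ListingRequestSketch {

    /** Collection URI that a new listing request is POSTed to. */
    static URI listingRequestsUri(String baseUrl, String connectionId) {
        return URI.create(baseUrl + "/nifi-api/flowfile-queues/" + connectionId + "/listing-requests");
    }

    /** URI of one listing request, identified by the UUID returned from the POST. */
    static URI listingRequestUri(String baseUrl, String connectionId, String requestId) {
        return URI.create(listingRequestsUri(baseUrl, connectionId) + "/" + requestId);
    }

    /** Step 1: create the listing request. */
    static HttpRequest create(String baseUrl, String connectionId) {
        return HttpRequest.newBuilder(listingRequestsUri(baseUrl, connectionId))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Step 2: fetch the listing request (repeat until "finished" is true). */
    static HttpRequest poll(String baseUrl, String connectionId, String requestId) {
        return HttpRequest.newBuilder(listingRequestUri(baseUrl, connectionId, requestId))
                .GET()
                .build();
    }

    /** Step 3: delete the listing request; this is what logs the "Canceling" message. */
    static HttpRequest cancel(String baseUrl, String connectionId, String requestId) {
        return HttpRequest.newBuilder(listingRequestUri(baseUrl, connectionId, requestId))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        String base = "https://localhost:8443";
        String connectionId = "0bacce2d-0157-1000-1a6d-6e0fd84a6bd6"; // queue ID from this thread
        String requestId = "0cee44de-0157-1000-5668-6e93a465e227";    // listing request ID from this thread

        for (HttpRequest request : new HttpRequest[] {
                create(base, connectionId),
                poll(base, connectionId, requestId),
                cancel(base, connectionId, requestId)}) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Keeping the request construction separate from sending makes the URL layout easy to check without a running instance.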
> 
>  
> On Fri, Sep 9, 2016, 8:48 AM Peter Wicks (pwicks) <[email protected]> wrote:
> Running NiFi 1.0.0, I’m listing a queue that has 100k files queued. I’ve 
> stopped both the incoming and outgoing processors, so the files are just 
> hanging out in the queue, no possible motion.
>  
> I get the “The queue has no FlowFiles” message.  Here are the actual responses 
> from the REST calls:
>  
> POST - Listing-requests
> {"listingRequest":{"id":"0cee44de-0157-1000-5668-6e93a465e227","uri":"https://localhost:8443/nifi-api/flowfile-queues/0bacce2d-0157-1000-1a6d-6e0fd84a6bd6/listing-requests/0cee44de-0157-1000-5668-6e93a465e227","submissionTime":"09/09/2016 03:12:04.318 GMT+00:00","lastUpdated":"03:12:04 GMT+00:00","percentCompleted":0,"finished":false,"maxResults":100,"state":"Waiting for other queue requests to complete","queueSize":{"byteCount":25400000,"objectCount":100000},"sourceRunning":false,"destinationRunning":false}}
>  
> GET
> {"listingRequest":{"id":"0cee44de-0157-1000-5668-6e93a465e227","uri":"https://localhost:8443/nifi-api/flowfile-queues/0bacce2d-0157-1000-1a6d-6e0fd84a6bd6/listing-requests/0cee44de-0157-1000-5668-6e93a465e227","submissionTime":"09/09/2016 03:12:04.318 GMT+00:00","lastUpdated":"03:12:04 GMT+00:00","percentCompleted":100,"finished":true,"maxResults":100,"state":"Completed successfully","queueSize":{"byteCount":25400000,"objectCount":100000},"flowFileSummaries":[],"sourceRunning":false,"destinationRunning":false}}
>  
> DELETE
> {"listingRequest":{"id":"0cee44de-0157-1000-5668-6e93a465e227","uri":"https://localhost:8443/nifi-api/flowfile-queues/0bacce2d-0157-1000-1a6d-6e0fd84a6bd6/listing-requests/0cee44de-0157-1000-5668-6e93a465e227","submissionTime":"09/09/2016 03:12:04.318 GMT+00:00","lastUpdated":"03:12:04 GMT+00:00","percentCompleted":100,"finished":true,"maxResults":100,"state":"Completed successfully","queueSize":{"byteCount":25400000,"objectCount":100000},"sourceRunning":false,"destinationRunning":false}}
>  
> On a subsequent test (thus the difference in IDs) I checked the nifi-app.log 
> file and found this single message:
>  
> 2016-09-09 03:15:50,043 INFO [NiFi Web Server-828] 
> o.a.n.controller.StandardFlowFileQueue Canceling ListFlowFile Request with ID 
> 0cf1b178-0157-1000-9111-9b889415bcdc
>  
> Not clear why it was canceled.
>  
> I went up one step in the process, and that queue has 23 items in it. I was 
> able to list it without issue.
>  
> Any ideas why I can’t list the queue?
>  
> Thanks,
>   Peter Wicks
>  
>  
>  
>  
