Would this be as simple as adding a version field (or hash of this and some other fairly static information) to the NodeHeartbeat interface?
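A rough sketch of what I have in mind, under stated assumptions: `BuildFingerprint`, `of`, and `misaligned` are hypothetical names for illustration, not existing NiFi APIs, and how the value would be carried on NodeHeartbeat is left open. The idea is only that each node reports a hash of its version plus some fairly static details, and whoever receives the heartbeats flags any node whose value differs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of NiFi. Illustrates the version/hash idea only. */
final class BuildFingerprint {
    private BuildFingerprint() {}

    /** Hashes the version string plus other fairly static details (illustrative inputs). */
    static String of(String version, String... staticInfo) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(version.getBytes(StandardCharsets.UTF_8));
            for (String part : staticInfo) {
                digest.update((byte) 0); // separator so ("a", "bc") and ("ab", "c") differ
                digest.update(part.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Returns the IDs of nodes whose reported fingerprint differs from the expected one. */
    static List<String> misaligned(String expected, Map<String, String> reportedByNode) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> entry : reportedByNode.entrySet()) {
            if (!expected.equals(entry.getValue())) {
                out.add(entry.getKey());
            }
        }
        return out;
    }
}
```

In Kevin's case, the one node still on the old install would have reported a different fingerprint from the other eleven and could have been named directly, instead of being found by stopping nodes one by one.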

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

> On Jan 6, 2017, at 3:23 PM, Joe Witt <[email protected]> wrote:
>
> We are definitely going to make this case (misaligned node installs)
> easier to identify.
>
> On Fri, Jan 6, 2017 at 6:22 PM, Kevin Verhoeven
> <[email protected]> wrote:
>> Matt,
>>
>> Your instincts were spot on, one of the 12 nodes had not been properly
>> updated. I stopped the nodes one-by-one until I isolated the broken install
>> and then reinstalled 1.1.1 on that node. The cluster is now operating as
>> expected.
>>
>> Thanks so much for your input, I appreciate your help.
>>
>> Kevin
>>
>> From: Kevin Verhoeven [mailto:[email protected]]
>> Sent: Friday, January 6, 2017 11:57 AM
>> To: [email protected]
>> Subject: RE: NiFi UI fails to display Processor Group: NullPointerException
>>
>> Unfortunately, I don’t see the stack trace in the app log either. I see many
>> heartbeats from SocketProtocolListener and ClusterProtocolHeartbeater, but
>> when I cause the UI to fail I see no exceptions or errors to indicate there
>> was a problem.
>>
>> It makes sense that the GenerateFlowFile is causing this problem and knowing
>> that it was updated in 1.1.0 confirms this. I checked all nodes and I see
>> they are all running 1.1.1. So it appears that the update was successful on
>> all nodes. I might try to stop the nodes one-by-one and see if I can catch
>> different results. This seems like a long-shot, but I’ll try anything!
>>
>> Kevin
>>
>> From: Matt Gilman [mailto:[email protected]]
>> Sent: Friday, January 6, 2017 10:49 AM
>> To: [email protected]
>> Subject: Re: NiFi UI fails to display Processor Group: NullPointerException
>>
>> Can you check the app log for the stack trace? Being a cluster, if there was
>> a bug in the response merging logic it may be logged outside the user log.
>>
>> Also, is it possible that all the nodes did not get upgraded? I'm pretty
>> sure that GenerateFlowFile received some new properties in 1.1.0. I wonder
>> if there is an issue when the different nodes have different sets of
>> properties for the same component.
>>
>> Matt
>>
>> On Fri, Jan 6, 2017 at 1:25 PM, Kevin Verhoeven <[email protected]>
>> wrote:
>>
>> I’ll enable DEBUG level logging and see if I can capture anything.
>>
>> Navigating into the Processor Group still works in my 1.0.0 installation, it
>> is only when I updated a cluster to 1.1.1 that I started seeing this
>> behavior. I have two clusters, Dev and Prod and I updated Dev. Prod remains
>> 1.0.0.
>>
>> The problem also happens on new Processor Groups. Here’s my test:
>>
>> 1. Create new Processor Group.
>> 2. Browse into the Processor Group.
>> 3. Add UpdateAttribute Processor, works.
>> 4. Add GenerateFlowFile Processor, fails. (error is the same error as
>> above).
>>
>> At this point I cannot enter my new Processor Group.
>>
>> Strangely, my Processor Groups that do not have a GenerateFlowFile Processor
>> still work. So I might have just narrowed this down to one Processor,
>> GenerateFlowFile.
>>
>> Kevin
>>
>> From: Matt Gilman [mailto:[email protected]]
>> Sent: Friday, January 6, 2017 10:18 AM
>> To: [email protected]
>> Subject: Re: NiFi UI fails to display Processor Group: NullPointerException
>>
>> Thanks for the extra details. So it is the endpoint for loading the group
>> which contains the necessary configuration and statistics to generate the
>> graph. Can you try enabling DEBUG level logging for this package in your
>> conf/logback.xml?
>>
>> org.apache.nifi.web.api.config
>>
>> Does navigating into this Process Group still work in your 1.0.0
>> installation? Do you know if there is anything unique about the contents of
>> that group?
>> I'm just trying to find something that might help point us in
>> the correct direction so we can figure out a work around and get the issue
>> addressed for upcoming releases.
>>
>> Thanks
>>
>> Matt
>>
>> On Fri, Jan 6, 2017 at 1:01 PM, Kevin Verhoeven <[email protected]>
>> wrote:
>>
>> Thank you for your response, I really appreciate your help. I reviewed the
>> user log and did not see a stack trace after the NPE message. Looking
>> backward in the user log I see other log messages for the same thread and I
>> found the endpoint being requested:
>>
>> 2017-01-06 17:45:44,338 INFO [NiFi Web Server-3798]
>> org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous)
>> GET
>> http://servername:10000/nifi-api/flow/process-groups/e8a4ad16-1b4e-3afc-a8e0-0a2cdda95632
>> (source ip: ipaddress)
>>
>> …
>>
>> 2017-01-06 17:45:44,360 ERROR [NiFi Web Server-3798]
>> o.a.nifi.web.api.config.ThrowableMapper An unexpected error has occurred:
>> java.lang.NullPointerException. Returning Internal Server Error
>> response.
>> java.lang.NullPointerException: null
>>
>> Using Developer Tools in Chrome I compared the endpoint and it matches,
>> here’s what Chrome indicated:
>>
>> GET
>> http://servername:10000/nifi-api/flow/process-groups/e8a4ad16-1b4e-3afc-a8e0-0a2cdda95632
>> 500 (Internal Server Error)
>>
>> An unexpected error has occurred. Please check the logs for additional
>> details.
>>
>> send @ jquery-2.1.1.min.js:2
>> ajax @ jquery-2.1.1.min.js:2
>> u @ nf-canvas-all.js?1.1.1:45
>> (anonymous) @ nf-canvas-all.js?1.1.1:45
>> Deferred @ jquery-2.1.1.min.js:2
>> reload @ nf-canvas-all.js?1.1.1:45
>> enterGroup @ nf-canvas-all.js?1.1.1:4
>> (anonymous) @ nf-canvas-all.js?1.1.1:32
>> (anonymous) @ d3.min.js:1
>>
>> Response Headers:
>> HTTP/1.1 500 Internal Server Error
>> Date: Fri, 06 Jan 2017 17:54:07 GMT
>> Content-Type: text/plain
>> Transfer-Encoding: chunked
>> Server: Jetty(9.3.9.v20160517)
>>
>> Request Headers:
>> Accept: application/json, text/javascript, */*; q=0.01
>> Accept-Encoding: gzip, deflate, sdch
>> Accept-Language: en-US,en;q=0.8
>> Connection: keep-alive
>> Host: servername:10000
>> Referer: http://servername:10000/nifi/
>> User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML,
>> like Gecko) Chrome/55.0.2883.87 Safari/537.36
>> X-Requested-With: XMLHttpRequest
>>
>> There is one other error that Chrome points out to me, but it might not be
>> related. Kerberos returns an error:
>>
>> POST http://servername:10000/nifi-api/access/kerberos 409
>> (Conflict)
>>
>> Probably not related, but otherwise I do not see any other error that would
>> explain what is going on with the Process Group.
>>
>> Regards,
>>
>> Kevin
>>
>> From: Matt Gilman [mailto:[email protected]]
>> Sent: Thursday, January 5, 2017 5:14 PM
>> To: [email protected]
>> Subject: Re: NiFi UI fails to display Processor Group: NullPointerException
>>
>> Sorry for the inconvenience. Is there a stack trace listed in the user log?
>> A quick glance at the code and the error handler looks like it logs the
>> exception (which should include the stack trace) unconditionally.
>>
>> Also, you should be able to look backward from that line to look for other
>> log messages for the same thread 'NiFi Web Server-299' to see which endpoint
>> was being invoked. Sounds like the endpoint is going to be the one that
>> returns that nested Process Group but just to be sure. Additionally, you can
>> verify that by checking the developer tools in your browser to see which
>> request failed.
>>
>> If there is a stack trace following the NPE message, that would be helpful.
>> Thanks.
>>
>> Matt
>>
>> On Thu, Jan 5, 2017 at 7:47 PM, Kevin Verhoeven <[email protected]>
>> wrote:
>>
>> After updating from NiFi 1.0 to 1.1.1, I am unable to browse into one or
>> more of my Processor Groups. The UI seems to work until I double click on a
>> specific Processor Group, then the UI throws an error “An unexpected error
>> has occurred” and I find the following in the nifi-user.log log file:
>>
>> 2017-01-05 22:29:15,886 INFO [NiFi Web Server-19]
>> org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous)
>> GET
>> http://servername:10000/nifi-api/flow/process-groups/e8a4ad16-1b4e-3afc-a8e0-0a2cdda95632
>> (source ip: ipaddress)
>>
>> 2017-01-05 22:29:16,397 INFO [NiFi Web Server-305]
>> org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous)
>> GET http://servername:10000/nifi-api/flow/current-user (source ip:
>> ipaddress)
>>
>> …
>>
>> 2017-01-05 22:29:16,402 ERROR [NiFi Web Server-299]
>> o.a.nifi.web.api.config.ThrowableMapper An unexpected error has occurred:
>> java.lang.NullPointerException. Returning Internal Server Error response.
>>
>> java.lang.NullPointerException: null
>>
>> …
>>
>> 2017-01-05 22:29:16,408 INFO [NiFi Web Server-305]
>> org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous)
>> GET http://servername:10000/nifi-api/flow/controller/bulletins (source ip:
>> 10 ipaddress)
>>
>> 2017-01-05 22:29:16,821 INFO [NiFi Web Server-19]
>> org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous)
>> GET http://servername:10000/nifi-api/site-to-site (source ip: ipaddress)
>>
>> Any advice on how to proceed?
>>
>> Thanks,
>>
>> Kevin
