[GitHub] nifi pull request: [NIFI-1394] - Unit test creates resources in HO...
GitHub user smarthi opened a pull request:

    https://github.com/apache/nifi/pull/174

    [NIFI-1394] - Unit test creates resources in HOME directory

    Changed the DB_LOCATION to be "/tmp/test/h2"

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/smarthi/incubator-nifi NIFI-1394

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/174.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #174

----
commit 8c822c4560a2137a98dbe2ad384cc1018b096ff3
Author: smarthi
Date:   2016-01-16T06:48:06Z

    [NIFI-1394] - Unit test creates resources in HOME directory

----
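The pull request hard-codes the test database location under /tmp. As a rough alternative sketch (not the code in the pull request), a test could ask the JDK for a unique directory under java.io.tmpdir instead, which also works on platforms without /tmp. The class name TestDbLocation, the method newDbLocation, and the "nifi-test-h2-" prefix are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of NiFi or of PR #174.
final class TestDbLocation {

    // Creates a fresh, uniquely named directory under java.io.tmpdir and
    // returns the path "h2" inside it, which a test could use as DB_LOCATION.
    static Path newDbLocation() {
        try {
            Path dir = Files.createTempDirectory("nifi-test-h2-");
            // Best-effort cleanup: this only removes the directory if it is
            // empty when the JVM exits.
            dir.toFile().deleteOnExit();
            return dir.resolve("h2");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("DB_LOCATION = " + newDbLocation());
    }
}
```

Each call returns a different location, so tests running in parallel would not share database files.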
Re: NiFi 0.4.1 InvokeHttp processor POST error issue
Joe,

Awesome. +1 to "Use Chunked Encoding" as 'false' by default. I think that, for historical HTTP reasons, if you know the size of your payload you should explicitly assert it using content-length. Sending payloads with "normal" transfer encoding by default would be a great strategy.

From a pragmatic point of view, if you know the size of your payload, there's really no reason to use chunked encoding. Having this property exposed at all is just a nicety, I guess.

Adam

On Fri, Jan 15, 2016 at 1:58 PM, Joe Percivall <joeperciv...@yahoo.com.invalid> wrote:

> Adam,
>
> You are right and it looks like OkHttp could easily support it.
> If you look at this line of OkHttp [1], you'll see that if the
> contentLength is not set then it will use chunked and otherwise it won't.
> If we added a property to InvokeHttp in which the user chooses to have
> chunked or not, and then adjusted the contentLength method of the
> RequestBody accordingly, we could enable the option.
> I tried making a simple change in InvokeHttp to the RequestBody to just
> implement the contentLength method with the correct value, combined with
> updating the mime-type to "application/x-www-form-urlencoded" to send as
> the content-type, and I successfully ran the template. I've created a
> ticket [2] and will have a patch up for the change shortly.
> I'm going to assume that the default value for the "Use Chunked Encoding"
> property will be false.
>
> [1] https://github.com/square/okhttp/blob/parent-2.7.1/okhttp/src/main/java/com/squareup/okhttp/Call.java#L263
> [2] https://issues.apache.org/jira/browse/NIFI-1405
>
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: joeperciv...@yahoo.com
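The difference this thread turns on, a POST body framed by a known content-length versus one sent with chunked transfer encoding, can be sketched with the JDK's own HttpURLConnection, whose setFixedLengthStreamingMode(long) is the method the earlier InvokeHTTP is described as using. This is only an illustration under stated assumptions and is not InvokeHttp or OkHttp code: the class StreamingModeDemo and its post method are hypothetical, and the throwaway local server (the JDK's com.sun.net.httpserver.HttpServer) exists only to echo back which framing headers it received.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of NiFi.
final class StreamingModeDemo {

    // POSTs the body to a temporary local server and returns the server's
    // report of the Content-Length and Transfer-Encoding headers it saw.
    static String post(byte[] body, boolean chunked) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String length = exchange.getRequestHeaders().getFirst("Content-Length");
                String encoding = exchange.getRequestHeaders().getFirst("Transfer-Encoding");
                exchange.getRequestBody().readAllBytes(); // drain the request body
                byte[] reply = ("Content-Length=" + length + ", Transfer-Encoding=" + encoding)
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(reply);
                }
            });
            server.start();
            try {
                URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/");
                HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection(Proxy.NO_PROXY);
                conn.setRequestMethod("POST");
                conn.setDoOutput(true);
                conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
                if (chunked) {
                    // Length not declared up front: the body is streamed in chunks.
                    conn.setChunkedStreamingMode(0); // 0 selects the default chunk size
                } else {
                    // Length declared up front: the body is streamed without
                    // internal buffering and framed by Content-Length.
                    conn.setFixedLengthStreamingMode((long) body.length);
                }
                try (OutputStream out = conn.getOutputStream()) {
                    out.write(body);
                }
                try (InputStream in = conn.getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "a=1".getBytes(StandardCharsets.UTF_8);
        System.out.println("fixed length: " + post(body, false));
        System.out.println("chunked:      " + post(body, true));
    }
}
```

In both modes the client streams the body rather than buffering it, which is the point made in the thread: a known length does not require loading the payload into memory.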
Re: NiFi 0.4.1 InvokeHttp processor POST error issue
Adam,

You are right and it looks like OkHttp could easily support it.

If you look at this line of OkHttp [1], you'll see that if the contentLength is not set then it will use chunked and otherwise it won't. If we added a property to InvokeHttp in which the user chooses to have chunked or not, and then adjusted the contentLength method of the RequestBody accordingly, we could enable the option.

I tried making a simple change in InvokeHttp to the RequestBody to just implement the contentLength method with the correct value, combined with updating the mime-type to "application/x-www-form-urlencoded" to send as the content-type, and I successfully ran the template. I've created a ticket [2] and will have a patch up for the change shortly.

I'm going to assume that the default value for the "Use Chunked Encoding" property will be false.

[1] https://github.com/square/okhttp/blob/parent-2.7.1/okhttp/src/main/java/com/squareup/okhttp/Call.java#L263
[2] https://issues.apache.org/jira/browse/NIFI-1405

Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com

On Friday, January 15, 2016 9:24 AM, Adam Taft wrote:

Joe,

Just as a quick observation, this statement isn't completely accurate:

> "... and can stream the contents instead of loading into memory"

The original InvokeHTTP code (pre-okhttp) explicitly set the content-length header, because it was known (the flowfile payload content length is always known). This does not, however, imply that the entire contents were loaded into memory. The previous InvokeHTTP used #setFixedLengthStreamingMode(long), which is described as:

"This method is used to enable streaming of a HTTP request body without internal buffering, when the content length is known in advance." [1]

HttpURLConnection doesn't need to buffer if the length is known in advance. It's only when it doesn't know the length that it either needs to buffer to determine it or use chunked encoding.

I think it's important to be able to support non-chunked encoded POST requests. There are many "legacy" (or even "broken") web services that don't work with chunked encoding, obviously like in this case.

Unfortunately, I don't recall that okhttp has similar direct support for "fixed length streaming". It's probable that a custom implementation of okhttp.RequestBody would need to be created to support this. [2]

[1] https://docs.oracle.com/javase/8/docs/api/java/net/HttpURLConnection.html#setFixedLengthStreamingMode-long-
[2] http://square.github.io/okhttp/3.x/okhttp/okhttp3/RequestBody.html
Re: ListenLumberjack processor is working
Andre,

Very cool that you have made progress here. Being able to integrate with logstash will be very useful.

I think the refactoring I'm doing for the RELP stuff should help reduce the amount of code that had to be carried over from ListenSyslog. I'm happy to help you update your code once my changes are in. Sorry it hasn't gotten in sooner.

-Bryan

On Fri, Jan 15, 2016 at 8:39 AM, Andre wrote:

> Hey folks,
>
> I've managed to progress on ListenLumberjack. The code is a bit
> 'spaghettic' at the moment, with some serious amount of logger. enabled
> to allow some additional troubleshooting, but overall it "works".
>
> I am strongly considering refactoring the code as a whole once Bryan
> completes the ListenRELP processor.
>
> Functional code (I guess? :D ) should be available here:
>
> https://github.com/trixpan/nifi-lumberjack-bundle/
>
> Known issues:
> * If logstash-forwarder goes silent for too long the processor will raise a
>   Timeout. Couldn't find evidence of a keep alive within Lumberjack so I am
>   considering catching this error as debug.
> * I suspect the code may have some memory leaks.
> * Tests haven't been created yet. To be honest I never wrote unit tests in
>   my whole life so it will be another ride. :-)
>
> My results were the following:
>
> Single thread, 2 sec runs
> 2016/01/15 23:52:27.589014 Registrar: processing 4000 events
> 2016/01/15 23:52:29.169361 Registrar: processing 4000 events
> 2016/01/15 23:52:30.552031 Registrar: processing 4000 events
> 2016/01/15 23:52:32.998425 Registrar: processing 4000 events
> 2016/01/15 23:52:35.411438 Registrar: processing 4000 events
> 2016/01/15 23:52:37.062141 Registrar: processing 4000 events
> 2016/01/15 23:52:39.468577 Registrar: processing 4000 events
> 2016/01/15 23:52:40.940890 Registrar: processing 4000 events
> 2016/01/15 23:52:43.480875 Registrar: processing 4000 events
> 2016/01/15 23:52:45.026758 Registrar: processing 4000 events
>
> 4 threads, 2 sec runs
> 2016/01/15 23:56:03.376303 Registrar: processing 4000 events
> 2016/01/15 23:56:03.443074 Registrar: processing 4000 events
> 2016/01/15 23:56:03.471795 Registrar: processing 4000 events
> 2016/01/15 23:56:03.508283 Registrar: processing 4000 events
> 2016/01/15 23:56:03.534002 Registrar: processing 4000 events
> 2016/01/15 23:56:03.562387 Registrar: processing 4000 events
> 2016/01/15 23:56:03.587744 Registrar: processing 4000 events
> 2016/01/15 23:56:03.622716 Registrar: processing 4000 events
> 2016/01/15 23:56:03.649074 Registrar: processing 4000 events
> 2016/01/15 23:56:03.675780 Registrar: processing 4000 events
>
> Would anyone have a decent logstash testbed to put some extra pressure
> against the processor?
Re: NiFi 0.4.1 InvokeHttp processor POST error issue
Joe,

Just as a quick observation, this statement isn't completely accurate:

> "... and can stream the contents instead of loading into memory"

The original InvokeHTTP code (pre-okhttp) explicitly set the content-length header, because it was known (the flowfile payload content length is always known). This does not, however, imply that the entire contents were loaded into memory. The previous InvokeHTTP used #setFixedLengthStreamingMode(long), which is described as:

"This method is used to enable streaming of a HTTP request body without internal buffering, when the content length is known in advance." [1]

HttpURLConnection doesn't need to buffer if the length is known in advance. It's only when it doesn't know the length that it either needs to buffer to determine it or use chunked encoding.

I think it's important to be able to support non-chunked encoded POST requests. There are many "legacy" (or even "broken") web services that don't work with chunked encoding, obviously like in this case.

Unfortunately, I don't recall that okhttp has similar direct support for "fixed length streaming". It's probable that a custom implementation of okhttp.RequestBody would need to be created to support this. [2]

[1] https://docs.oracle.com/javase/8/docs/api/java/net/HttpURLConnection.html#setFixedLengthStreamingMode-long-
[2] http://square.github.io/okhttp/3.x/okhttp/okhttp3/RequestBody.html

On Thu, Jan 14, 2016 at 10:29 PM, Joe Percivall <joeperciv...@yahoo.com.invalid> wrote:

> Hello Evan,
>
> Glad to hear you're enjoying NiFi!
>
> I was able to replicate your results so I dug in a bit and noticed in
> Wireshark that the "Transfer-Encoding" header for InvokeHttp was set to
> "chunked". When I tried using the same flag for curl it failed so I'm
> relatively confident that is the problem. Currently InvokeHttp requires
> using the chunked encoding for POST (primarily because you don't need to
> know the content-length and can stream the contents instead of loading
> into memory).
>
> PostHttp does have a "Use Chunked Encoding" option which would solve your
> problem except that it doesn't work properly. PostHttp is using the
> "EntityTemplate" which streams the content so the content length will never
> be implemented and thus it will always use the chunked encoding. I created
> a ticket for it [1].
>
> Also as a note, when creating a template you have to either explicitly
> select the connections or not select anything and create a template for the
> whole canvas (your template didn't have any connections).
>
> [1] https://issues.apache.org/jira/browse/NIFI-1396
>
> Cheers,
> Joe
> - - - - - -
> Joseph Percivall
> linkedin.com/in/Percivall
> e: joeperciv...@yahoo.com
>
> On Thursday, January 14, 2016 8:07 PM, "yuchen@thomsonreuters.com" <
> yuchen@thomsonreuters.com> wrote:
>
> Hi Guys,
>
> Not sure if sending this email is the correct way to raise an issue; if
> not, let me know where to post the issue, thanks.
>
> We are using the NiFi InvokeHttp processor to do a POST to a webpage.
> URL:
> http://www.hkexnews.hk/listedco/listconews/advancedsearch/search_active_main.aspx
> Request header: Content-Type: application/x-www-form-urlencoded
> POST Data:
> txt_stock_code=24984&sel_DateOfReleaseFrom_y=2016&sel_DateOfReleaseFrom_m=01&sel_DateOfReleaseFrom_d=04&sel_DateOfReleaseTo_y=2016&sel_DateOfReleaseTo_m=01&sel_DateOfReleaseTo_d=11&IsFromNewList=False&sel_tier_1=-2&sel_tier_2_group=-2&sel_tier_2=-2
>
> To make sure the request header and request body are correct, we used
> Fiddler to compose the POST request, and the response shows that the
> request header and POST data are correct.
>
> The attached file is the template we are using. It works fine on version
> 0.3.0, but not on the latest version 0.4.1.
>
> So we suppose it is a potential defect of the InvokeHttp processor in this
> version.
> We checked the source code to try to locate the issue and found it is
> using com.squareup.okhttp.Request to do the request, so we did not dig
> any further into the issue.
> Currently we are using curl to do the POST as a workaround.
>
> Let me know your comments, thanks.
>
> Finally, NiFi is a great tool!!! You guys are awesome!!!
>
> Best Regards,
> Evan from Thomson Reuters
ListenLumberjack processor is working
Hey folks,

I've managed to progress on ListenLumberjack. The code is a bit 'spaghettic' at the moment, with some serious amount of logger. enabled to allow some additional troubleshooting, but overall it "works".

I am strongly considering refactoring the code as a whole once Bryan completes the ListenRELP processor.

Functional code (I guess? :D ) should be available here:

https://github.com/trixpan/nifi-lumberjack-bundle/

Known issues:
* If logstash-forwarder goes silent for too long the processor will raise a Timeout. Couldn't find evidence of a keep alive within Lumberjack so I am considering catching this error as debug.
* I suspect the code may have some memory leaks.
* Tests haven't been created yet. To be honest I never wrote unit tests in my whole life so it will be another ride. :-)

My results were the following:

Single thread, 2 sec runs
2016/01/15 23:52:27.589014 Registrar: processing 4000 events
2016/01/15 23:52:29.169361 Registrar: processing 4000 events
2016/01/15 23:52:30.552031 Registrar: processing 4000 events
2016/01/15 23:52:32.998425 Registrar: processing 4000 events
2016/01/15 23:52:35.411438 Registrar: processing 4000 events
2016/01/15 23:52:37.062141 Registrar: processing 4000 events
2016/01/15 23:52:39.468577 Registrar: processing 4000 events
2016/01/15 23:52:40.940890 Registrar: processing 4000 events
2016/01/15 23:52:43.480875 Registrar: processing 4000 events
2016/01/15 23:52:45.026758 Registrar: processing 4000 events

4 threads, 2 sec runs
2016/01/15 23:56:03.376303 Registrar: processing 4000 events
2016/01/15 23:56:03.443074 Registrar: processing 4000 events
2016/01/15 23:56:03.471795 Registrar: processing 4000 events
2016/01/15 23:56:03.508283 Registrar: processing 4000 events
2016/01/15 23:56:03.534002 Registrar: processing 4000 events
2016/01/15 23:56:03.562387 Registrar: processing 4000 events
2016/01/15 23:56:03.587744 Registrar: processing 4000 events
2016/01/15 23:56:03.622716 Registrar: processing 4000 events
2016/01/15 23:56:03.649074 Registrar: processing 4000 events
2016/01/15 23:56:03.675780 Registrar: processing 4000 events

Would anyone have a decent logstash testbed to put some extra pressure against the processor?
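For anyone wanting to compare the two runs numerically, a back-of-the-envelope rate can be derived from the first and last Registrar timestamps of a run. The sketch below is not part of the processor; the class RegistrarRate is hypothetical, and it rests on an assumption not stated in the message: that each "Registrar: processing 4000 events" line marks one completed batch of 4000 events, so ten lines span nine batch intervals.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of nifi-lumberjack-bundle.
final class RegistrarRate {

    // Timestamp layout used in the Registrar log lines above.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss.SSSSSS");

    // Estimates events per second between the first and last log line of a run.
    // ASSUMPTION: every line after the first accounts for one batch of
    // eventsPerLine events completed since the previous line.
    static double eventsPerSecond(String first, String last, int lines, int eventsPerLine) {
        Duration elapsed = Duration.between(
                LocalDateTime.parse(first, STAMP), LocalDateTime.parse(last, STAMP));
        long events = (long) (lines - 1) * eventsPerLine;
        return events / (elapsed.toNanos() / 1e9);
    }

    public static void main(String[] args) {
        System.out.printf("single thread: %.1f events/s%n", eventsPerSecond(
                "2016/01/15 23:52:27.589014", "2016/01/15 23:52:45.026758", 10, 4000));
        System.out.printf("4 threads:     %.1f events/s%n", eventsPerSecond(
                "2016/01/15 23:56:03.376303", "2016/01/15 23:56:03.675780", 10, 4000));
    }
}
```

The figures it prints are only as good as that assumption; buffering on the forwarder side could make a short burst look faster than the sustained rate.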