Re: Nifi errors - FetchFile and UnpackContent

2019-10-11 Thread Tomislav Novosel
Any more suggestions for this situation?

Thanks,
Tom
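
One way to test the theory that a file is read while it is still being
written or held by someone else is to check each archive outside NiFi
before it is unpacked. The following is a minimal sketch in Python, not part
of NiFi; the helper names size_is_stable and zip_is_complete are
hypothetical, and it assumes the script can read the same network share.

```python
import os
import time
import zipfile


def size_is_stable(path, wait_seconds=1.0):
    """Return True if the file size does not change across a short wait."""
    first = os.path.getsize(path)
    time.sleep(wait_seconds)
    return os.path.getsize(path) == first


def zip_is_complete(path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, EOFError, OSError):
        # A partial copy usually lacks the end-of-archive record and lands here;
        # a file that is missing or locked raises OSError.
        return False
```

If zip_is_complete returns False for a file that later passes, the file was
most likely incomplete or unavailable at the moment it was read, which would
match a retry succeeding without any change on the source side.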

On Thu, 3 Oct 2019 at 19:54, Tomislav Novosel wrote:

> Hi Jeff,
>
> None of this applies to the pipeline or the FetchFile processor.
> It is not a cluster; it runs on a single standalone NiFi instance.
> Completion Strategy is set to None, neither Delete File nor Move File.
>
> The only thing I can think of is that someone else is using the file at the
> same time, because that shared disk is also used by other people
> who are reading the files and doing some analysis.
>
> Could that also be the cause of the Truncated ZIP file error in the
> UnpackContent processor?
>
> I added a loopback on the failure relationship of those processors so that
> failed flowfiles are retried.
>
> Thanks.
> Tom
>
> On Thu, 3 Oct 2019 at 17:18, Jeff wrote:
>
>> Hello Tomislav,
>>
>> Are these processors running in a multi-node cluster?  Is FetchFile
>> downstream from a ListFile processor that is scheduled to run on all nodes
>> versus Primary Node only?  Is FetchFile's Completion Strategy set to "Move
>> File" or "Delete File"?  Typically, source processors should be scheduled
>> to run on the primary node, otherwise when reading from the same source
>> across multiple nodes, for example a shared network drive, each source
>> processor might pull the same data.  In a situation like this, the same
>> file could be listed by each node, and the FetchFile processor on each node
>> may attempt to fetch the same file.
>>
>> If you set the source processor to run on Primary Node only, you can
>> load-balance the connection between the source processor and FetchFile to
>> distribute the load of fetching the files across the cluster.
>>
>> On Thu, Oct 3, 2019 at 2:32 AM Tomislav Novosel wrote:
>>
>>> Hi all,
>>>
>>> I'm getting errors from the FetchFile and UnpackContent processors.
>>> I have a pipeline that fetches zip files as they continuously arrive on a
>>> shared network drive, with Minimum File Age set to 30 sec to avoid
>>> fetching a file before it is completely written to disk.
>>>
>>> Sometimes I get this error from FetchFile:
>>>
>>> FetchFile[id=c741187c-1172-1166-e752-1f79197a8029] Could not fetch file
>>> \\avl01\ATGRZ\TestFactory\02 Dep Service\01
>>> Processdata\Backup\dfs_atfexport\MANA38\ANA_12_BPE7347\ANA_12_BPE7347_TDL_HL_1\measurement_file.atf.zip
>>> from file system for
>>> StandardFlowFileRecord[uuid=e7a5e3c4-0981-4ff3-85ea-91e41f0c3c0e,claim=,offset=0,name=PEI_BPE7347_TDLHL1new_826_20191001161312.atf.zip,size=0]
>>> because the existence of the file cannot be verified; routing to failure
>>>
>>>
>>> And from UnpackContent sometimes I get this error:
>>>
>>>
>>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d] Unable to unpack
>>> StandardFlowFileRecord[uuid=4a019d58-fe45-4276-a161-e46cd8b1667c,claim=StandardContentClaim
>>> [resourceClaim=StandardResourceClaim[id=1570052741201-5000,
>>> container=default, section=904], offset=1651,
>>> length=28417768],offset=0,name=measurement.atf.zip,size=28417768] due to
>>> IOException thrown from
>>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>>> java.io.IOException: Truncated ZIP file; routing to failure:
>>>
>>> org.apache.nifi.processor.exception.ProcessException: IOException thrown
>>> from UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>>> java.io.IOException: Truncated ZIP file
>>>
>>>
>>> After getting this error from UnpackContent I tried to fetch the file again
>>> and unpack it. It went well, without any errors.
>>> So what do these errors mean? I spoke to colleagues who use these
>>> files on the source side, and they said the files are OK and not
>>> corrupted.
>>>
>>> Please help or give advice.
>>>
>>> Thanks in advance.
>>> Tom


Problem with Context Path Whitelisting

2019-10-11 Thread Swarup Karavadi
Greetings,

I have deployed a single node unsecured NiFi cluster (I say cluster because 
nifi.cluster.is.node is set to "true") as a stateful set on Kubernetes (AWS EKS 
to be specific). The NiFi cluster sits behind an Nginx ingress. I have 
configured the Nginx ingress to forward the appropriate headers to NiFi (when 
deployed behind a reverse proxy) as described in the documentation.

The path on the Nginx ingress which proxies traffic to the NiFi UI is 
"/pie/ip". This same path has been whitelisted by setting the 
"nifi.web.proxy.context.path" property to "/pie/ip". The way I am expecting 
this setup to work is that when users navigate to http://foo.com/pie/ip in the
browser, they are shown a simple HTML page with redirect info and then
automatically redirected to http://foo.com/pie/ip/nifi where they can view the
NiFi canvas. Instead, the users are being redirected to http://foo.com/nifi
which results in a 404 response because there is no '/nifi' path that has been
configured on the Nginx ingress.

I set the NiFi and Jetty Server log levels to DEBUG to understand what was 
happening under the hood and this is what I got - 

On Startup (when the SanitizeContextPathFilter is initialized) - 
2019-10-11 06:07:26,206 DEBUG [main] o.a.n.w.filter.SanitizeContextPathFilter SanitizeContextPathFilter received provided whitelisted context paths from NiFi properties: /pie/ip

On Request (when the actual request is made) - 
2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] org.apache.nifi.web.util.WebUtils Context path:
2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] org.apache.nifi.web.util.WebUtils On the request, the following context paths were parsed from headers:
    X-ProxyContextPath: /pie/ip
    X-Forwarded-Context: null
    X-Forwarded-Prefix: null
2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] org.apache.nifi.web.util.WebUtils Determined context path: /pie/ip
2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23] org.apache.nifi.web.util.WebUtils The provided context path [/pie/ip] was not whitelisted []
2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23] org.apache.nifi.web.util.WebUtils Error determining context path on JSP page: The provided context path [/pie/ip] was not whitelisted []
2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] o.a.n.w.filter.SanitizeContextPathFilter SanitizeContextPathFilter set contextPath:

You will notice from the above log entries that the path '/pie/ip' was
successfully whitelisted at startup. Yet, when handling the request, the
whitelisted context paths array is empty and this causes the wrong redirect to
happen in the browser - and I can't figure out why this is happening or how I
can fix it.
Has anyone come across this kind of problem before? Any help on this is much 
appreciated.

Cheers,
Swarup.
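
To make the symptom concrete, here is a small Python model of the behaviour
the log lines suggest. It is a simplified sketch inferred from the log output
only, not NiFi's actual implementation; the function names are hypothetical,
and it assumes the property value is a comma-separated list of paths.

```python
# Header names as they appear in the DEBUG log above, in the order logged.
PROXY_HEADERS = ("X-ProxyContextPath", "X-Forwarded-Context", "X-Forwarded-Prefix")


def parse_whitelist(raw_value):
    """Split a comma-separated property value into a list of context paths."""
    if not raw_value:
        return []
    return [p.strip() for p in raw_value.split(",") if p.strip()]


def determine_context_path(headers, whitelist):
    """Return the proxied context path, or "" when it is not whitelisted."""
    candidate = next((headers[h] for h in PROXY_HEADERS if headers.get(h)), "")
    if candidate and candidate not in whitelist:
        # Models the ERROR line: the path is rejected and "" is used instead.
        return ""
    return candidate


def redirect_target(context_path):
    """Build the path the browser is redirected to."""
    return f"{context_path}/nifi"


headers = {"X-ProxyContextPath": "/pie/ip"}
print(redirect_target(determine_context_path(headers, parse_whitelist("/pie/ip"))))  # /pie/ip/nifi
print(redirect_target(determine_context_path(headers, parse_whitelist(""))))         # /nifi
```

With an empty whitelist the model yields /nifi, which is the redirect I am
seeing, so the open question is why the list is empty at request time.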

Re: Problem with Context Path Whitelisting

2019-10-11 Thread Jeff
Swarup,

Agreed with Kevin, very nice write-up on the scenario!

Would you please provide the original request as sent by Nginx, along with
your configuration pertaining to NiFi in Nginx?  We can set up some test
cases to reproduce what's happening and get a JIRA filed if there's an edge
case not being handled by NiFi.

On Fri, Oct 11, 2019 at 9:30 AM Kevin Doran wrote:

> Swarup,
>
> First, thanks for the great email. Nice job troubleshooting this and
> sharing your findings with the community.
>
> I'm more familiar with how these types of things get configured on
> NiFi Registry than NiFi, so I'm not as much help as others. But I did
> take a look and one thing I noticed was a difference between the
> startup config and the per-request config.
>
> On Startup, the whitelisted context paths are coming from the
> ServletContext FilterConfig [1].
>
> During request handling, the whitelisted context paths are coming from
> the ApplicationContext, directly from NiFi Properties [2]
>
> [1]
> https://github.com/apache/nifi/blob/master/nifi-commons/nifi-web-utils/src/main/java/org/apache/nifi/web/filter/SanitizeContextPathFilter.java#L41
> [2]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ApplicationResource.java#L165
>
> Ultimately, my assumption is that both of these property values
> *should* be backed by the same nifi.properties file. But it appears
> something is happening in your case/environment/situation that is
> causing the ServletContext and ApplicationContext to get
> configured/initialized differently. This could be something specific
> to your environment or it could be uncovering an edge-case bug in
> NiFi.
>
> I think others on this mailing list who are more familiar with how the
> ServletContext gets set up in NiFi might be able to help further on
> this and determine if there is a solution/workaround or bug that needs
> patching.
>
> Thanks,
> Kevin
>
> On Fri, Oct 11, 2019 at 4:55 AM Swarup Karavadi wrote:
> >
> > Greetings,
> >
> > I have deployed a single node unsecured NiFi cluster (I say cluster
> because nifi.cluster.is.node is set to "true") as a stateful set on
> Kubernetes (AWS EKS to be specific). The NiFi cluster sits behind an Nginx
> ingress. I have configured the Nginx ingress to forward the appropriate
> headers to NiFi (when deployed behind a reverse proxy) as described in the
> documentation.
> >
> > The path on the Nginx ingress which proxies traffic to the NiFi UI is
> "/pie/ip". This same path has been whitelisted by setting the
> "nifi.web.proxy.context.path" property to "/pie/ip". The way I am expecting
> this setup to work is that when users navigate to http://foo.com/pie/ip
> in the browser, they are shown a simple HTML page with redirect info and
> then automatically redirected to http://foo.com/pie/ip/nifi where they
> can view the NiFi canvas. Instead, the users are being redirected to
> http://foo.com/nifi which results in a 404 response because there is no
> '/nifi' path that has been configured on the Nginx ingress.
> >
> > I set the NiFi and Jetty Server log levels to DEBUG to understand what
> was happening under the hood and this is what I got -
> >
> > On Startup (when the SanitizeContextPathFilter is initialized) -
> > 2019-10-11 06:07:26,206 DEBUG [main]
> o.a.n.w.filter.SanitizeContextPathFilter SanitizeContextPathFilter received
> provided whitelisted context paths from NiFi properties: /pie/ip
> >
> > On Request (when the actual request is made) -
> > 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23]
> org.apache.nifi.web.util.WebUtils Context path:
> > 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23]
> org.apache.nifi.web.util.WebUtils On the request, the following context
> paths were parsed from headers:
> >  X-ProxyContextPath: /pie/ip
> > X-Forwarded-Context: null
> > X-Forwarded-Prefix: null
> > 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23]
> org.apache.nifi.web.util.WebUtils Determined context path: /pie/ip
> > 2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23]
> org.apache.nifi.web.util.WebUtils The provided context path [/pie/ip] was
> not whitelisted []
> > 2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23]
> org.apache.nifi.web.util.WebUtils Error determining context path on JSP
> page: The provided context path [/pie/ip] was not whitelisted []
> > 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23]
> o.a.n.w.filter.SanitizeContextPathFilter SanitizeContextPathFilter set
> contextPath:
> >
> > You will notice from the above log entries that the path '/pie/ip' was
> successfully whitelisted. Yet, when handling the request, the whitelisted
> context paths array is empty and this causes the wrong redirect to happen
> on the browser - and I can't figure out why this is happening or how I can
> fix it. Has anyone come across this kind of problem before? Any help on
> this is much appreciated.
> >
> > Cheers,
> > Swarup.
>


Re: Problem with Context Path Whitelisting

2019-10-11 Thread Kevin Doran
Swarup,

First, thanks for the great email. Nice job troubleshooting this and
sharing your findings with the community.

I'm more familiar with how these types of things get configured on
NiFi Registry than NiFi, so I'm not as much help as others. But I did
take a look and one thing I noticed was a difference between the
startup config and the per-request config.

On Startup, the whitelisted context paths are coming from the
ServletContext FilterConfig [1].

During request handling, the whitelisted context paths are coming from
the ApplicationContext, directly from NiFi Properties [2]

[1] 
https://github.com/apache/nifi/blob/master/nifi-commons/nifi-web-utils/src/main/java/org/apache/nifi/web/filter/SanitizeContextPathFilter.java#L41
[2] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ApplicationResource.java#L165

Ultimately, my assumption is that both of these property values
*should* be backed by the same nifi.properties file. But it appears
something is happening in your case/environment/situation that is
causing the ServletContext and ApplicationContext to get
configured/initialized differently. This could be something specific
to your environment or it could be uncovering an edge-case bug in
NiFi.

I think others on this mailing list who are more familiar with how the
ServletContext gets set up in NiFi might be able to help further on
this and determine if there is a solution/workaround or bug that needs
patching.

Thanks,
Kevin
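
One quick check that may help narrow this down is to confirm what the
nifi.properties file inside the running pod actually contains. The sketch
below is a hypothetical helper in Python, not a NiFi tool; it handles only
plain key=value lines (no line continuations), and the file path in the
usage comment is an assumption about your layout.

```python
def read_property(properties_path, key):
    """Return the last value of `key` in a simple .properties file, or None."""
    found = None
    with open(properties_path, encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped or stripped.startswith(("#", "!")):
                continue  # skip blank lines and comments
            name, sep, value = stripped.partition("=")
            if sep and name.strip() == key:
                found = value.strip()  # later definitions override earlier ones
    return found


# Example usage (the path is an assumption; adjust it to your installation):
# print(read_property("conf/nifi.properties", "nifi.web.proxy.context.path"))
```

If this returns an empty string or None for nifi.web.proxy.context.path,
the value is being lost before NiFi starts rather than inside NiFi.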

On Fri, Oct 11, 2019 at 4:55 AM Swarup Karavadi wrote:
>
> Greetings,
>
> I have deployed a single node unsecured NiFi cluster (I say cluster because 
> nifi.cluster.is.node is set to "true") as a stateful set on Kubernetes (AWS 
> EKS to be specific). The NiFi cluster sits behind an Nginx ingress. I have 
> configured the Nginx ingress to forward the appropriate headers to NiFi (when 
> deployed behind a reverse proxy) as described in the documentation.
>
> The path on the Nginx ingress which proxies traffic to the NiFi UI is 
> "/pie/ip". This same path has been whitelisted by setting the 
> "nifi.web.proxy.context.path" property to "/pie/ip". The way I am expecting 
> this setup to work is that when users navigate to http://foo.com/pie/ip in 
> the browser, they are shown a simple HTML page with redirect info and then 
> automatically redirected to http://foo.com/pie/ip/nifi where they can view 
> the NiFi canvas. Instead, the users are being redirected to 
> http://foo.com/nifi which results in a 404 response because there is no 
> '/nifi' path that has been configured on the Nginx ingress.
>
> I set the NiFi and Jetty Server log levels to DEBUG to understand what was 
> happening under the hood and this is what I got -
>
> On Startup (when the SanitizeContextPathFilter is initialized) -
> 2019-10-11 06:07:26,206 DEBUG [main] o.a.n.w.filter.SanitizeContextPathFilter 
> SanitizeContextPathFilter received provided whitelisted context paths from 
> NiFi properties: /pie/ip
>
> On Request (when the actual request is made) -
> 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] 
> org.apache.nifi.web.util.WebUtils Context path:
> 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] 
> org.apache.nifi.web.util.WebUtils On the request, the following context paths 
> were parsed from headers:
>  X-ProxyContextPath: /pie/ip
> X-Forwarded-Context: null
> X-Forwarded-Prefix: null
> 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] 
> org.apache.nifi.web.util.WebUtils Determined context path: /pie/ip
> 2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23] 
> org.apache.nifi.web.util.WebUtils The provided context path [/pie/ip] was not 
> whitelisted []
> 2019-10-11 06:45:45,556 ERROR [NiFi Web Server-23] 
> org.apache.nifi.web.util.WebUtils Error determining context path on JSP page: 
> The provided context path [/pie/ip] was not whitelisted []
> 2019-10-11 06:45:45,556 DEBUG [NiFi Web Server-23] 
> o.a.n.w.filter.SanitizeContextPathFilter SanitizeContextPathFilter set 
> contextPath:
>
> You will notice from the above log entries that the path '/pie/ip' was 
> successfully whitelisted. Yet, when handling the request, the whitelisted 
> context paths array is empty and this causes the wrong redirect to happen on 
> the browser - and I can't figure out why this is happening or how I can fix 
> it. Has anyone come across this kind of problem before? Any help on this is 
> much appreciated.
>
> Cheers,
> Swarup.