[jira] [Commented] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-13 Thread Jon Shoemaker (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603647#comment-17603647
 ] 

Jon Shoemaker commented on NIFI-9878:
-

[~exceptionfactory] I have submitted a pull request.

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3, 1.17.0, 1.16.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments:
> 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch,
> image-2022-04-05-21-54-31-002.png, image-2022-04-05-21-55-16-221.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a DistributedCacheMapClient attempts to connect to a
> DistributedCacheMapServer but never receives the handshake response, the
> PutDistributedCacheMap processor will hang indefinitely. The handshake never
> times out.
> This can happen when a proxy allows the TCP connection to be established
> between client and server but fails to deliver handshake data to or from the
> DistributedCacheMapServer (for example, an unstable Istio service mesh
> between the two). It can also happen when a client is accidentally
> misconfigured to point to the wrong TCP endpoint (one that is not hosting a
> DistributedCacheMapServer).
> Steps to recreate:
> 1) Set up a PutDistributedCacheMap processor with a
> DistributedMapCacheClientService.
> 2) Configure the DistributedMapCacheClientService to point to a TCP server
> that is not a DistributedCacheMapServer (nc -lk 127.0.0.1 4457). This
> simulates a situation where the socket connection can be made but the server
> sends no handshake response (for example, the server is in a bad state and
> unable to respond, or a proxy is misbehaving).
> 3) Use GenerateFlowFile to trigger the PutDistributedCacheMap processor.
> 4) The processor hangs with no failure or success and has to be force
> terminated.
> !image-2022-04-05-21-54-31-002.png!
> !image-2022-04-05-21-55-16-221.png!
> The hang occurs at:
> CacheClientRequestHandler.java:92: handshakeHandler.waitHandshakeComplete();
>  
> Currently, the "connection timeout" parameter only bounds the
> establishment of the TCP socket connection, not the full application-layer
> connection.
> Suggestion:
> The handshake should also have a timeout, so that the client can handle a
> network outage in which the TCP connection can be created but the handshake
> data cannot be exchanged. Because the processor hangs, a dataflow has no way
> to handle this error.
>  
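
The suggestion above, a bounded wait instead of an unbounded one, can be sketched in plain Java. This is only an illustration under stated assumptions: HandshakeWaiter and markComplete are hypothetical names, and the two-argument waitHandshakeComplete is an assumed signature. It is not NiFi's handshake handler and not the attached patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandshakeTimeoutSketch {

    /** Hypothetical stand-in for a handshake handler; not NiFi's actual class. */
    static final class HandshakeWaiter {
        private final CountDownLatch completed = new CountDownLatch(1);

        /** Called by the I/O thread once the handshake response has arrived. */
        void markComplete() {
            completed.countDown();
        }

        /** Blocks until the handshake completes or the timeout elapses. */
        void waitHandshakeComplete(long timeout, TimeUnit unit)
                throws InterruptedException, TimeoutException {
            if (!completed.await(timeout, unit)) {
                throw new TimeoutException(
                        "Handshake not completed within " + timeout + " " + unit);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulates a peer that accepts the connection but never answers.
        HandshakeWaiter silentPeer = new HandshakeWaiter();
        try {
            silentPeer.waitHandshakeComplete(200, TimeUnit.MILLISECONDS);
            System.out.println("handshake completed");
        } catch (TimeoutException e) {
            System.out.println("handshake timed out");
        }

        // Simulates a healthy peer whose response arrives before the deadline.
        HandshakeWaiter healthyPeer = new HandshakeWaiter();
        new Thread(healthyPeer::markComplete).start();
        healthyPeer.waitHandshakeComplete(5, TimeUnit.SECONDS);
        System.out.println("handshake completed");
    }
}
```

With a bounded wait, the caller gets an exception it can turn into a failure outcome, rather than a thread that has to be force terminated.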



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-13 Thread Jon Shoemaker (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603602#comment-17603602
 ] 

Jon Shoemaker commented on NIFI-9878:
-

Due to refactoring of the DistributedCacheClient code, the patch, as written,
only works with 1.17 and 1.18.

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3, 1.17.0, 1.16.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: 
> 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch, 
> image-2022-04-05-21-54-31-002.png, image-2022-04-05-21-55-16-221.png
>


[jira] [Updated] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-13 Thread Jon Shoemaker (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Shoemaker updated NIFI-9878:

Affects Version/s: 1.17.0

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3, 1.17.0, 1.16.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: 
> 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch, 
> image-2022-04-05-21-54-31-002.png, image-2022-04-05-21-55-16-221.png
>


[jira] [Updated] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-12 Thread Jon Shoemaker (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Shoemaker updated NIFI-9878:

Affects Version/s: 1.16.3
   Status: Patch Available  (was: Open)

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.16.3, 1.15.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: 
> 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch, 
> image-2022-04-05-21-54-31-002.png, image-2022-04-05-21-55-16-221.png
>


[jira] [Updated] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-12 Thread Jon Shoemaker (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Shoemaker updated NIFI-9878:

Attachment: 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: 
> 0001-NIFI-9878-fix-for-hanging-client-thread-with-handsha.patch, 
> image-2022-04-05-21-54-31-002.png, image-2022-04-05-21-55-16-221.png
>


[jira] [Assigned] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-09-09 Thread Jon Shoemaker (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Shoemaker reassigned NIFI-9878:
---

Assignee: Jon Shoemaker

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3
> Reporter: Aaron Rich
> Assignee: Jon Shoemaker
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: image-2022-04-05-21-54-31-002.png, 
> image-2022-04-05-21-55-16-221.png
>


[jira] [Commented] (NIFI-9878) DistributedCacheMap Handshake failure, processor hang indefinitely.

2022-08-23 Thread Jon Shoemaker (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583714#comment-17583714
 ] 

Jon Shoemaker commented on NIFI-9878:
-

We are experiencing the same issue. In our scenario the handshake works
correctly most of the time, but occasionally the handshake response is never
received and the processor thread hangs until the processor is terminated.
Some of these stuck threads appear when the DistributedCacheServer is restarted.

> DistributedCacheMap Handshake failure, processor hang indefinitely.
> ---
>
> Key: NIFI-9878
> URL: https://issues.apache.org/jira/browse/NIFI-9878
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.3
> Reporter: Aaron Rich
> Priority: Major
> Labels: Handshake, distributed_cache
> Attachments: image-2022-04-05-21-54-31-002.png, 
> image-2022-04-05-21-55-16-221.png
>


[jira] [Created] (NIFI-9788) Update Apache Commons Codec to 1.15

2022-03-11 Thread Jon Shoemaker (Jira)
Jon Shoemaker created NIFI-9788:
---

 Summary: Update Apache Commons Codec to 1.15
 Key: NIFI-9788
 URL: https://issues.apache.org/jira/browse/NIFI-9788
 Project: Apache NiFi
  Issue Type: Task
Reporter: Jon Shoemaker






--
This message was sent by Atlassian Jira
(v8.20.1#820001)