[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system

2018-01-24 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337596#comment-16337596
 ] 

Vahram Aharonyan commented on GEODE-4322:
-

Hi [~bschuchardt],

Thanks for the info.

Actually yes, we are shutting down the whole cluster.

BTW, let's assume we have 2 locators (Loc-1 and Loc-2). If we stop Loc-1 and remove
its dat file before starting it again (while Loc-2 remains alive), will this cause
problems? Or should we remove the dat files from all locator nodes only
when the whole cluster is powered off?

Thanks,

Vahram.

> Locator fails to start with NPE during join to the distributed system
> -
>
> Key: GEODE-4322
> URL: https://issues.apache.org/jira/browse/GEODE-4322
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.2.0
>Reporter: Vahram Aharonyan
>Assignee: Bruce Schuchardt
>Priority: Major
>
> Found out that after setting security-udp-dhalgo=AES:128 in the properties files,
> the locator sometimes fails to come online with the following exception:
> [severe 2018/01/19 04:22:12.194 PST  tid=0x45] Exception in processing request from 10.144.248.41
> java.lang.RuntimeException: Not found public key for member 16nodedata6(d4b4f5d4-47d2-44b1-a07c-6a7f5755e52d:11493):10002
>  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:177)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.getPublicKey(JGroupsMessenger.java:1365)
>  at org.apache.geode.distributed.internal.membership.gms.locator.GMSLocator.processRequest(GMSLocator.java:271)
>  at org.apache.geode.distributed.internal.InternalLocator$PrimaryHandler.processRequest(InternalLocator.java:1256)
>  at org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:401)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
>  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
>  ... 7 more
> Please note that this issue is generally hit after a cluster restart. This is
> important, as during poweroff the locator can go offline first, and one of the other
> members will then become coordinator and update the view file accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system

2018-01-22 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335448#comment-16335448
 ] 

Vahram Aharonyan commented on GEODE-4322:
-

Hi [~bschuchardt],

Unfortunately we don't have a way to quickly test this with Geode 1.3. We have
setups only with 1.1.0 and 1.2.0. Do you know the status of the test
mentioned in GEODE-2542 on the Geode 1.3 branch? I see that the stack traces
from its failures are pretty similar to what we are observing.

Thanks,

Vahram.

> Locator fails to start with NPE during join to the distributed system
> -
>
> Key: GEODE-4322
> URL: https://issues.apache.org/jira/browse/GEODE-4322
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.2.0
>Reporter: Vahram Aharonyan
>Assignee: Bruce Schuchardt
>Priority: Major
>





[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system

2018-01-22 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334002#comment-16334002
 ] 

Vahram Aharonyan commented on GEODE-4322:
-

Hi [~amb],

This is a really important issue, as it is blocking us from encrypting inter-node
UDP communication. This means that all UDP communication within the cluster
goes as plain text. Can't we have this fixed sooner, as part of the upcoming
1.4.0? Or having a patch for older releases would be an option as well.

Thanks,

Vahram.

> Locator fails to start with NPE during join to the distributed system
> -
>
> Key: GEODE-4322
> URL: https://issues.apache.org/jira/browse/GEODE-4322
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.2.0
>Reporter: Vahram Aharonyan
>Priority: Major
>





[jira] [Updated] (GEODE-4322) Locator fails to start with NPE during join to the distributed system

2018-01-19 Thread Vahram Aharonyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahram Aharonyan updated GEODE-4322:

Affects Version/s: 1.2.0

> Locator fails to start with NPE during join to the distributed system
> -
>
> Key: GEODE-4322
> URL: https://issues.apache.org/jira/browse/GEODE-4322
> Project: Geode
>  Issue Type: Bug
>  Components: locator
>Affects Versions: 1.2.0
>Reporter: Vahram Aharonyan
>Priority: Major
> Fix For: 1.4.0
>
>





[jira] [Created] (GEODE-4322) Locator fails to start with NPE during join to the distributed system

2018-01-19 Thread Vahram Aharonyan (JIRA)
Vahram Aharonyan created GEODE-4322:
---

 Summary: Locator fails to start with NPE during join to the 
distributed system
 Key: GEODE-4322
 URL: https://issues.apache.org/jira/browse/GEODE-4322
 Project: Geode
  Issue Type: Bug
  Components: locator
Reporter: Vahram Aharonyan
 Fix For: 1.4.0


Found out that after setting security-udp-dhalgo=AES:128 in the properties files,
the locator sometimes fails to come online with the following exception:

[severe 2018/01/19 04:22:12.194 PST  tid=0x45] Exception in processing request from 10.144.248.41
java.lang.RuntimeException: Not found public key for member 16nodedata6(d4b4f5d4-47d2-44b1-a07c-6a7f5755e52d:11493):10002
 at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:177)
 at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.getPublicKey(JGroupsMessenger.java:1365)
 at org.apache.geode.distributed.internal.membership.gms.locator.GMSLocator.processRequest(GMSLocator.java:271)
 at org.apache.geode.distributed.internal.InternalLocator$PrimaryHandler.processRequest(InternalLocator.java:1256)
 at org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:401)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
 at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
 at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
 ... 7 more

Please note that this issue is generally hit after a cluster restart. This is
important, as during poweroff the locator can go offline first, and one of the other
members will then become coordinator and update the view file accordingly.
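
For readers who want to reproduce the configuration, the setting quoted in this report can be expressed as an ordinary property. The following is only a minimal sketch, assuming the property is supplied through a java.util.Properties object; the class UdpEncryptionProps and the method withUdpEncryption are illustrative names and not part of Geode.

```java
import java.util.Properties;

// Illustrative helper; the class and method names are hypothetical, not Geode API.
final class UdpEncryptionProps {
    // Property name exactly as quoted in the report above.
    static final String UDP_DHALGO = "security-udp-dhalgo";

    // Returns a copy of the given properties with the UDP encryption setting from the report added.
    static Properties withUdpEncryption(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty(UDP_DHALGO, "AES:128");
        return props;
    }

    public static void main(String[] args) {
        Properties props = withUdpEncryption(new Properties());
        System.out.println(UDP_DHALGO + "=" + props.getProperty(UDP_DHALGO));
    }
}
```

The same key and value would apply to every locator and server in the cluster, since the report describes the setting as being placed in the members' properties files.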





[jira] [Commented] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread

2017-10-27 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221904#comment-16221904
 ] 

Vahram Aharonyan commented on GEODE-3637:
-

Udo,

Is there a possibility that, after moving registerClient from the Acceptor thread to
ServerConnection.run, it is ServerConnection that will get stuck forever on a socket
read if the socket closes on the other side and the notification of that gets lost
due to a packet drop?

Also, initializeClientNofication is being called in ServerConnection only for the
Selector case; in the Acceptor thread it was generic and did not depend on
Selector mode. Is this an expected change?

Thanks,
Vahram.


> configureClientSSLSocket call can block Acceptor thread
> ---
>
> Key: GEODE-3637
> URL: https://issues.apache.org/jira/browse/GEODE-3637
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Vahram Aharonyan
>Assignee: Udo Kohlmeyer
>Priority: Critical
>
> In org.apache.geode.internal.net.SocketCreator#configureClientSSLSocket, the timeout
> for the Socket is configured before starting the SSL handshake only if the passed
> "timeout" argument is larger than 0.
> Issuing sslSocket.startHandshake without setting a timeout can result in
> the caller thread blocking, as in GEODE-2898 and GEODE-3023.
> Below is an example stack trace of a Handshaker thread that got stuck:
> "Handshaker /10.124.195.100:1 Thread 183" Id=526300 in RUNNABLE (running in native)
> Total blocked: 4   Total waited: 884
>   java.net.SocketInputStream.socketRead0(Native Method)
>   java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   java.net.SocketInputStream.read(SocketInputStream.java:171)
>   java.net.SocketInputStream.read(SocketInputStream.java:141)
>   sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>   sun.security.ssl.InputRecord.read(InputRecord.java:503)
>   sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
>   sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
>   org.apache.geode.internal.net.SocketCreator.configureClientSSLSocket(SocketCreator.java:1088)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:967)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:929)
>   org.apache.geode.internal.net.SocketCreator.connectForServer(SocketCreator.java:908)
>   org.apache.geode.internal.tcp.Connection.<init>(Connection.java:1306)
>   org.apache.geode.internal.tcp.Connection.createSender(Connection.java:1094)
>   org.apache.geode.internal.tcp.ConnectionTable.getOrderedAndOwned(ConnectionTable.java:553)
>   org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:664)
>   org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:1037)
>   org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:543)
>   org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:319)
>   org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:605)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1684)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1875)
>   org.apache.geode.distributed.internal.DistributionChannel.send(DistributionChannel.java:82)
>   org.apache.geode.distributed.internal.DistributionManager.sendOutgoing(DistributionManager.java:3416)
>   org.apache.geode.distributed.internal.DistributionManager.sendMessage(DistributionManager.java:3453)
>   org.apache.geode.distributed.internal.DistributionManager.putOutgoing(DistributionManager.java:1832)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.sendProfileUpdate(UpdateAttributesProcessor.java:162)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.distribute(UpdateAttributesProcessor.java:97)
>   org.apache.geode.internal.cache.DistributedRegion.initialized(DistributedRegion.java:1128)
>   org.apache.geode.internal.cache.LocalRegion.initialize(LocalRegion.java:2413)
>   org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1117)
>   org.apache.geode.internal.cache.HARegion.initialize(HARegion.java:345)
>   org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3308)
>   org.apache.geode.internal.cache.HARegion.getInstance(HARegion.java:265)
>   
> 
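
The quoted description says a socket timeout is applied only when the passed "timeout" argument is larger than 0, which leaves the handshake read unbounded otherwise. One possible way to bound it is sketched below. This is not Geode's SocketCreator code: the class HandshakeTimeouts, the Handshake interface, and the 60-second fallback value are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical helper, not Geode's SocketCreator.
final class HandshakeTimeouts {
    // Assumed fallback used when the caller passes no positive timeout.
    static final int DEFAULT_HANDSHAKE_TIMEOUT_MS = 60_000;

    // Picks the read timeout to use while the handshake is in progress.
    static int effectiveTimeout(int requestedMs) {
        return requestedMs > 0 ? requestedMs : DEFAULT_HANDSHAKE_TIMEOUT_MS;
    }

    // Stand-in for the blocking handshake call, e.g. sslSocket::startHandshake.
    interface Handshake {
        void run() throws IOException;
    }

    // Applies a bounded read timeout for the handshake, then restores the previous value.
    static void runWithTimeout(Socket socket, int requestedMs, Handshake handshake)
            throws IOException {
        int previous = socket.getSoTimeout();
        socket.setSoTimeout(effectiveTimeout(requestedMs));
        try {
            handshake.run();
        } finally {
            if (!socket.isClosed()) {
                socket.setSoTimeout(previous);
            }
        }
    }
}
```

With an SSLSocket named sslSocket, a call would look like HandshakeTimeouts.runWithTimeout(sslSocket, timeout, sslSocket::startHandshake). A handshake read that never returns then ends with a SocketTimeoutException instead of blocking the caller indefinitely.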

[jira] [Created] (GEODE-3904) Allow plugging of custom classLoaders runtime

2017-10-25 Thread Vahram Aharonyan (JIRA)
Vahram Aharonyan created GEODE-3904:
---

 Summary: Allow plugging of custom classLoaders runtime
 Key: GEODE-3904
 URL: https://issues.apache.org/jira/browse/GEODE-3904
 Project: Geode
  Issue Type: Improvement
  Components: core
Reporter: Vahram Aharonyan


Currently it is not possible to attach multiple ClassLoaders to the Geode ClassPath
once the whole distributed system or some of its members are already configured and
running.

This is important if there are dynamically loaded plugins in the system and
objects of their classes can be serialized/deserialized during some actions.
To successfully complete these operations, Geode needs to have the corresponding
ClassLoaders in its ClassPathLoader.
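
To illustrate why deserialization needs access to the plugin's ClassLoader, here is a minimal sketch that uses only the JDK. It is not Geode's ClassPathLoader and does not use any Geode API; the class PluggableLoaders and its methods are hypothetical names. It shows a registry to which loaders can be added at runtime and an ObjectInputStream that consults them when resolving classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration using only the JDK; this is not Geode's ClassPathLoader.
final class PluggableLoaders {
    // Loaders can be added while the process is already running.
    private static final List<ClassLoader> LOADERS = new CopyOnWriteArrayList<>();

    static void register(ClassLoader loader) {
        LOADERS.add(loader);
    }

    // Tries each registered loader in order, then falls back to the default resolution.
    static final class Input extends ObjectInputStream {
        Input(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            for (ClassLoader loader : LOADERS) {
                try {
                    return Class.forName(desc.getName(), false, loader);
                } catch (ClassNotFoundException e) {
                    // Not visible to this loader; try the next one.
                }
            }
            return super.resolveClass(desc);
        }
    }

    // Serializes a value and reads it back through the loader-aware stream.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new Input(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A plugin would call PluggableLoaders.register(pluginClassLoader) when it is loaded. Without such a hook, a class that is visible only to the plugin's loader fails to resolve during deserialization with a ClassNotFoundException.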







--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (GEODE-3563) SSL socket handling problems in TCPConduit run

2017-09-18 Thread Vahram Aharonyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahram Aharonyan updated GEODE-3563:

 Priority: Critical  (was: Major)
Fix Version/s: (was: 1.2.1)
   1.3.0

> SSL socket handling problems in TCPConduit run
> --
>
> Key: GEODE-3563
> URL: https://issues.apache.org/jira/browse/GEODE-3563
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Reporter: Vahram Aharonyan
>Priority: Critical
> Fix For: 1.3.0
>
>
> Here are two cases that seem to be problematic in the TCPConduit.run flow:
> 1. TCPConduit.run() performs no action when an SSLException is
> thrown from sslSocket.startHandshake(); as a result the socket remains open.
> The catch block at the end of configureServerSSLSocket() will just report a
> fatal error (even though it seems that this portion is going to be removed in 1.2.1
> according to GEODE-3393) and re-throw the exception.
> 2. The configureServerSSLSocket call is performed without setting a socket timeout
> beforehand. This can block the run thread if a read initiated
> from the SSL handshake flow does not return. Linking to similar issues
> observed with other acceptors previously: GEODE-2898, GEODE-3023.
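
The two cases above could be addressed along the following lines. This is a hedged sketch rather than the actual TCPConduit code: the class AcceptedSocketGuard, the Handshake interface, and the method handshakeOrClose are hypothetical names, and the handshake is passed in as a callback so that the cleanup logic is visible on its own.

```java
import java.io.IOException;
import java.net.Socket;
import javax.net.ssl.SSLException;

// Hypothetical sketch; names are illustrative and this is not TCPConduit code.
final class AcceptedSocketGuard {
    // Stand-in for the blocking handshake call, e.g. sslSocket::startHandshake.
    interface Handshake {
        void run() throws IOException;
    }

    // Returns true if the handshake completed; otherwise closes the socket and returns false.
    static boolean handshakeOrClose(Socket socket, int timeoutMs, Handshake handshake) {
        try {
            socket.setSoTimeout(timeoutMs); // case 2: bound blocking reads during the handshake
            handshake.run();
            return true;
        } catch (IOException e) {
            // SSLException and SocketTimeoutException are both IOExceptions.
            try {
                socket.close(); // case 1: do not leave the socket open after a failure
            } catch (IOException ignored) {
                // Nothing more to do for a socket that is being discarded.
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Socket socket = new Socket(); // unconnected socket, enough to show the cleanup path
        boolean ok = handshakeOrClose(socket, 5_000, () -> {
            throw new SSLException("simulated handshake failure");
        });
        System.out.println("handshake ok=" + ok + ", socket closed=" + socket.isClosed());
    }
}
```

Returning a boolean keeps the accept loop free to continue with the next connection; a real fix would presumably also log the failure.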





[jira] [Updated] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread

2017-09-18 Thread Vahram Aharonyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahram Aharonyan updated GEODE-3637:

Affects Version/s: 1.1.0
   1.2.0

> configureClientSSLSocket call can block Acceptor thread
> ---
>
> Key: GEODE-3637
> URL: https://issues.apache.org/jira/browse/GEODE-3637
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Vahram Aharonyan
>Priority: Critical
> Fix For: 1.3.0
>
>
> In org.apache.geode.internal.net.SocketCreator#configureClientSSLSocket, the timeout
> for the Socket is configured before starting the SSL handshake only if the passed
> "timeout" argument is larger than 0.
> Issuing sslSocket.startHandshake without setting a timeout can result in
> the caller thread blocking, as in GEODE-2898 and GEODE-3023.
> Below is an example stack trace of a Handshaker thread that got stuck:
> "Handshaker /10.124.195.100:1 Thread 183" Id=526300 in RUNNABLE (running in native)
> Total blocked: 4   Total waited: 884
>   java.net.SocketInputStream.socketRead0(Native Method)
>   java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   java.net.SocketInputStream.read(SocketInputStream.java:171)
>   java.net.SocketInputStream.read(SocketInputStream.java:141)
>   sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>   sun.security.ssl.InputRecord.read(InputRecord.java:503)
>   sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
>   sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
>   org.apache.geode.internal.net.SocketCreator.configureClientSSLSocket(SocketCreator.java:1088)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:967)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:929)
>   org.apache.geode.internal.net.SocketCreator.connectForServer(SocketCreator.java:908)
>   org.apache.geode.internal.tcp.Connection.<init>(Connection.java:1306)
>   org.apache.geode.internal.tcp.Connection.createSender(Connection.java:1094)
>   org.apache.geode.internal.tcp.ConnectionTable.getOrderedAndOwned(ConnectionTable.java:553)
>   org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:664)
>   org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:1037)
>   org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:543)
>   org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:319)
>   org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:605)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1684)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1875)
>   org.apache.geode.distributed.internal.DistributionChannel.send(DistributionChannel.java:82)
>   org.apache.geode.distributed.internal.DistributionManager.sendOutgoing(DistributionManager.java:3416)
>   org.apache.geode.distributed.internal.DistributionManager.sendMessage(DistributionManager.java:3453)
>   org.apache.geode.distributed.internal.DistributionManager.putOutgoing(DistributionManager.java:1832)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.sendProfileUpdate(UpdateAttributesProcessor.java:162)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.distribute(UpdateAttributesProcessor.java:97)
>   org.apache.geode.internal.cache.DistributedRegion.initialized(DistributedRegion.java:1128)
>   org.apache.geode.internal.cache.LocalRegion.initialize(LocalRegion.java:2413)
>   org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1117)
>   org.apache.geode.internal.cache.HARegion.initialize(HARegion.java:345)
>   org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3308)
>   org.apache.geode.internal.cache.HARegion.getInstance(HARegion.java:265)
>   org.apache.geode.internal.cache.ha.HARegionQueue.createHARegion(HARegionQueue.java:348)
>   org.apache.geode.internal.cache.ha.HARegionQueue.<init>(HARegionQueue.java:328)
>   org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.<init>(HARegionQueue.java:2199)
>   org.apache.geode.internal.cache.ha.HARegionQueue$DurableHARegionQueue.<init>(HARegionQueue.java:2450)
>   org.apache.geode.internal.cache.ha.HARegionQueue.getHARegionQueueInstance(HARegionQueue.java:2030)
>   
> 

[jira] [Updated] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread

2017-09-18 Thread Vahram Aharonyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahram Aharonyan updated GEODE-3637:

Priority: Critical  (was: Major)

> configureClientSSLSocket call can block Acceptor thread
> ---
>
> Key: GEODE-3637
> URL: https://issues.apache.org/jira/browse/GEODE-3637
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Reporter: Vahram Aharonyan
>Priority: Critical
> Fix For: 1.3.0
>
>

[jira] [Comment Edited] (GEODE-3563) SSL socket handling problems in TCPConduit run

2017-09-12 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163144#comment-16163144
 ] 

Vahram Aharonyan edited comment on GEODE-3563 at 9/12/17 3:46 PM:
--

Hi [~amb], we don't have a pull request created for this ticket yet. We have 
some thoughts on this, such as:

1. Setting a socket timeout before configuring the SSL socket, as was done in 
GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handling the SSLException and doing some cleanup work to close the socket 
in the run function.

Does this seem reasonable?

Thanks,
Vahram.


was (Author: vaharonyan):
Hi Anthony, we don't have a pull request created for this ticket yet. We have 
some thoughts on this, such as:

1. Setting a socket timeout before configuring the SSL socket, as was done in 
GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handling the SSLException and doing some cleanup work to close the socket 
in the run function.

Does this seem reasonable?

Thanks,
Vahram.

> SSL socket handling problems in TCPConduit run
> --
>
> Key: GEODE-3563
> URL: https://issues.apache.org/jira/browse/GEODE-3563
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
> Reporter: Vahram Aharonyan
> Fix For: 1.2.1
>
>
> Here are two cases that seem to be problematic in the TCPConduit.run flow:
> 1. TCPConduit.run() takes no action when an SSLException is thrown from 
> sslSocket.startHandshake(); as a result, the socket remains open. The catch 
> block at the end of configureServerSSLSocket() just reports a fatal error 
> (although it seems that this portion is going to be removed in 1.2.1 
> according to GEODE-3393) and re-throws the exception.
> 2. configureServerSSLSocket() is called without first setting a socket 
> timeout. This can block the run thread if a read initiated from the SSL 
> handshake flow never returns. Similar issues were observed previously with 
> other acceptors: GEODE-2898, GEODE-3023.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (GEODE-3563) SSL socket handling problems in TCPConduit run

2017-09-12 Thread Vahram Aharonyan (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163144#comment-16163144
 ] 

Vahram Aharonyan commented on GEODE-3563:
-

Hi Anthony, we don't have a pull request created for this ticket yet. We have 
some thoughts on this, such as:

1. Setting a socket timeout before configuring the SSL socket, as was done in 
GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handling the SSLException and doing some cleanup work to close the socket 
in the run function.

Does this seem reasonable?

Thanks,
Vahram.

> SSL socket handling problems in TCPConduit run
> --
>
> Key: GEODE-3563
> URL: https://issues.apache.org/jira/browse/GEODE-3563
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
> Reporter: Vahram Aharonyan
> Fix For: 1.2.1
>
>
> Here are two cases that seem to be problematic in the TCPConduit.run flow:
> 1. TCPConduit.run() takes no action when an SSLException is thrown from 
> sslSocket.startHandshake(); as a result, the socket remains open. The catch 
> block at the end of configureServerSSLSocket() just reports a fatal error 
> (although it seems that this portion is going to be removed in 1.2.1 
> according to GEODE-3393) and re-throws the exception.
> 2. configureServerSSLSocket() is called without first setting a socket 
> timeout. This can block the run thread if a read initiated from the SSL 
> handshake flow never returns. Similar issues were observed previously with 
> other acceptors: GEODE-2898, GEODE-3023.





[jira] [Created] (GEODE-3563) SSL socket handling problems in TCPConduit run

2017-09-06 Thread Vahram Aharonyan (JIRA)
Vahram Aharonyan created GEODE-3563:
---

 Summary: SSL socket handling problems in TCPConduit run
 Key: GEODE-3563
 URL: https://issues.apache.org/jira/browse/GEODE-3563
 Project: Geode
  Issue Type: Bug
  Components: client/server
Reporter: Vahram Aharonyan
 Fix For: 1.2.1


Here are two cases that seem to be problematic in the TCPConduit.run flow:

1. TCPConduit.run() takes no action when an SSLException is thrown from 
sslSocket.startHandshake(); as a result, the socket remains open. The catch 
block at the end of configureServerSSLSocket() just reports a fatal error 
(although it seems that this portion is going to be removed in 1.2.1 
according to GEODE-3393) and re-throws the exception.

2. configureServerSSLSocket() is called without first setting a socket 
timeout. This can block the run thread if a read initiated from the SSL 
handshake flow never returns. Similar issues were observed previously with 
other acceptors: GEODE-2898, GEODE-3023.
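
To illustrate the two cases above, here is a minimal sketch of the general 
pattern using only the standard javax.net.ssl API. It is not the Geode code 
and not a proposed patch: the class SslHandshakeSketch, the method 
handshakeOrClose and the timeout value are hypothetical names chosen for this 
example. It sets a read timeout before the handshake so the calling thread 
cannot block indefinitely, and closes the socket if the handshake fails.

```java
import java.io.IOException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical illustration only; none of these names exist in Geode.
class SslHandshakeSketch {

    /**
     * Runs the SSL handshake with a read timeout in place and closes the
     * socket if anything goes wrong, so a failed handshake does not leave
     * the socket open.
     */
    static void handshakeOrClose(SSLSocket socket, int timeoutMillis) throws IOException {
        try {
            // Case 2: bound blocking reads made during the handshake.
            socket.setSoTimeout(timeoutMillis);
            socket.startHandshake();
        } catch (IOException e) {
            // Case 1: SSLException and SocketTimeoutException are both
            // IOExceptions; release the socket before re-throwing.
            try {
                socket.close();
            } catch (IOException closeFailure) {
                e.addSuppressed(closeFailure);
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // An unconnected socket makes the handshake fail immediately,
        // which exercises the cleanup path without needing a network peer.
        SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault().createSocket();
        try {
            handshakeOrClose(socket, 2000);
        } catch (IOException expected) {
            System.out.println("handshake failed: " + expected);
        }
        System.out.println("socket closed: " + socket.isClosed());
    }
}
```

A caller that needs a different read timeout for normal traffic would likely 
reset it with setSoTimeout once the handshake has completed; that step is 
left out of the sketch.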


