[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system
[ https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337596#comment-16337596 ]

Vahram Aharonyan commented on GEODE-4322:
-----------------------------------------

Hi [~bschuchardt],

Thanks for the info. Yes, we are shutting down the whole cluster.

By the way, let's assume we have two locators (Loc-1 and Loc-2). If we stop Loc-1 and remove its dat file before starting it again (while Loc-2 remains alive), will this cause problems? Or should we remove the dat files from all locator nodes only when the whole cluster is powered off?

Thanks,
Vahram.

> Locator fails to start with NPE during join to the distributed system
> ---------------------------------------------------------------------
>
>                 Key: GEODE-4322
>                 URL: https://issues.apache.org/jira/browse/GEODE-4322
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.2.0
>            Reporter: Vahram Aharonyan
>            Assignee: Bruce Schuchardt
>            Priority: Major
>
> After setting security-udp-dhalgo=AES:128 in the properties files, the locator
> sometimes fails to come online with the following exception:
>
> [severe 2018/01/19 04:22:12.194 PST tid=0x45] Exception in processing request from 10.144.248.41
> java.lang.RuntimeException: Not found public key for member 16nodedata6(d4b4f5d4-47d2-44b1-a07c-6a7f5755e52d:11493):10002
>   at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:177)
>   at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.getPublicKey(JGroupsMessenger.java:1365)
>   at org.apache.geode.distributed.internal.membership.gms.locator.GMSLocator.processRequest(GMSLocator.java:271)
>   at org.apache.geode.distributed.internal.InternalLocator$PrimaryHandler.processRequest(InternalLocator.java:1256)
>   at org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:401)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
>   at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
>   ... 7 more
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
>   at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
>   ... 7 more
>
> Please note that this issue is generally hit after a cluster restart. This is
> important because, during power-off, the locator can go offline first, and one
> of the other members will then become coordinator and update the view file
> accordingly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system
[ https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335448#comment-16335448 ]

Vahram Aharonyan commented on GEODE-4322:
-----------------------------------------

Hi [~bschuchardt],

Unfortunately, we don't have a way to quickly test this with Geode 1.3. We only have setups with 1.1.0 and 1.2.0. Do you know the status of the test mentioned in GEODE-2542 on the Geode 1.3 branch? I see that the stack traces from its failures are quite similar to what we are observing.

Thanks,
Vahram.

> Locator fails to start with NPE during join to the distributed system
> ---------------------------------------------------------------------
>
>                 Key: GEODE-4322
>                 URL: https://issues.apache.org/jira/browse/GEODE-4322
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.2.0
>            Reporter: Vahram Aharonyan
>            Assignee: Bruce Schuchardt
>            Priority: Major
[jira] [Commented] (GEODE-4322) Locator fails to start with NPE during join to the distributed system
[ https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334002#comment-16334002 ]

Vahram Aharonyan commented on GEODE-4322:
-----------------------------------------

Hi [~amb],

This is a really important issue, as it blocks us from encrypting inter-node UDP communication. This means that all UDP communication within the cluster is sent as plain text. Could this be fixed sooner, in the upcoming 1.4.0? A patch for older releases would be an option as well.

Thanks,
Vahram.

> Locator fails to start with NPE during join to the distributed system
> ---------------------------------------------------------------------
>
>                 Key: GEODE-4322
>                 URL: https://issues.apache.org/jira/browse/GEODE-4322
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.2.0
>            Reporter: Vahram Aharonyan
>            Priority: Major
[jira] [Updated] (GEODE-4322) Locator fails to start with NPE during join to the distributed system
[ https://issues.apache.org/jira/browse/GEODE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vahram Aharonyan updated GEODE-4322:
------------------------------------
    Affects Version/s: 1.2.0

> Locator fails to start with NPE during join to the distributed system
> ---------------------------------------------------------------------
>
>                 Key: GEODE-4322
>                 URL: https://issues.apache.org/jira/browse/GEODE-4322
>             Project: Geode
>          Issue Type: Bug
>          Components: locator
>    Affects Versions: 1.2.0
>            Reporter: Vahram Aharonyan
>            Priority: Major
>             Fix For: 1.4.0
[jira] [Created] (GEODE-4322) Locator fails to start with NPE during join to the distributed system
Vahram Aharonyan created GEODE-4322:
------------------------------------

             Summary: Locator fails to start with NPE during join to the distributed system
                 Key: GEODE-4322
                 URL: https://issues.apache.org/jira/browse/GEODE-4322
             Project: Geode
          Issue Type: Bug
          Components: locator
            Reporter: Vahram Aharonyan
             Fix For: 1.4.0

After setting security-udp-dhalgo=AES:128 in the properties files, the locator sometimes fails to come online with the following exception:

[severe 2018/01/19 04:22:12.194 PST tid=0x45] Exception in processing request from 10.144.248.41
java.lang.RuntimeException: Not found public key for member 16nodedata6(d4b4f5d4-47d2-44b1-a07c-6a7f5755e52d:11493):10002
  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:177)
  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.getPublicKey(JGroupsMessenger.java:1365)
  at org.apache.geode.distributed.internal.membership.gms.locator.GMSLocator.processRequest(GMSLocator.java:271)
  at org.apache.geode.distributed.internal.InternalLocator$PrimaryHandler.processRequest(InternalLocator.java:1256)
  at org.apache.geode.distributed.internal.tcpserver.TcpServer.lambda$processRequest$0(TcpServer.java:401)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
  ... 7 more
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPeerEncryptor(GMSEncrypt.java:258)
  at org.apache.geode.distributed.internal.membership.gms.messenger.GMSEncrypt.getPublicKey(GMSEncrypt.java:175)
  ... 7 more

Please note that this issue is generally hit after a cluster restart. This is important because, during power-off, the locator can go offline first, and one of the other members will then become coordinator and update the view file accordingly.
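The trace above shows a key lookup for a member ending in a NullPointerException that is then wrapped in a RuntimeException. The report does not show the GMSEncrypt source, so the following is only a minimal sketch of the general pattern, under stated assumptions: PublicKeyRegistry, register and find are hypothetical names, not Geode APIs. It illustrates returning an explicit "no key known" result for a member (for example, one listed in a recovered view file that has not yet published a key) instead of dereferencing a missing entry.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of member public keys; not Geode's GMSEncrypt. */
final class PublicKeyRegistry {
  // Member id -> encoded public key bytes (both types are assumptions).
  private final Map<String, byte[]> keysByMember = new ConcurrentHashMap<>();

  /** Stores a defensive copy of the key published by a member. */
  void register(String memberId, byte[] publicKey) {
    Objects.requireNonNull(memberId, "memberId");
    Objects.requireNonNull(publicKey, "publicKey");
    keysByMember.put(memberId, publicKey.clone());
  }

  /** Returns the key if one is known, or an empty Optional instead of null. */
  Optional<byte[]> find(String memberId) {
    Objects.requireNonNull(memberId, "memberId");
    byte[] key = keysByMember.get(memberId);
    return key == null ? Optional.empty() : Optional.of(key.clone());
  }
}
```

With this shape, a caller handling a request for an unknown member can decide what to do (reject the request, ask the member to resend its key, and so on) rather than failing on a null dereference. Whether that matches the eventual fix for this ticket is not stated in the thread.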
[jira] [Commented] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread
[ https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221904#comment-16221904 ]

Vahram Aharonyan commented on GEODE-3637:
-----------------------------------------

Udo,

After moving registerClient from the Acceptor thread to ServerConnection.run, is it possible that ServerConnection itself will get stuck forever on a socket read if the socket is closed on the other side and the notification of that is lost due to a packet drop?

Also, initializeClientNofication is now called in ServerConnection only for the Selector case, whereas in the Acceptor thread it was generic and did not depend on Selector mode. Is this change expected?

Thanks,
Vahram.

> configureClientSSLSocket call can block Acceptor thread
> -------------------------------------------------------
>
>                 Key: GEODE-3637
>                 URL: https://issues.apache.org/jira/browse/GEODE-3637
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Vahram Aharonyan
>            Assignee: Udo Kohlmeyer
>            Priority: Critical
[jira] [Created] (GEODE-3904) Allow plugging of custom classLoaders runtime
Vahram Aharonyan created GEODE-3904:
------------------------------------

             Summary: Allow plugging of custom classLoaders runtime
                 Key: GEODE-3904
                 URL: https://issues.apache.org/jira/browse/GEODE-3904
             Project: Geode
          Issue Type: Improvement
          Components: core
            Reporter: Vahram Aharonyan

Currently it is not possible to attach additional ClassLoaders to the Geode ClassPath once the whole distributed system, or some of its members, is already configured and running. This matters when the system has dynamically loaded plugins and objects of their classes can be serialized or deserialized during some operations. To complete these operations successfully, Geode needs to have the corresponding ClassLoaders in its ClassPathLoader.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
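To make the request above concrete, here is a minimal Java sketch of a class loader that accepts additional delegate loaders at runtime. It is an illustration only: PluggableClassLoader, addDelegate and removeDelegate are hypothetical names, and nothing here is Geode's ClassPathLoader API. It relies only on the standard java.lang.ClassLoader contract, where loadClass asks the parent first and then falls back to findClass.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical loader that consults plugin loaders registered at runtime. */
final class PluggableClassLoader extends ClassLoader {
  // Copy-on-write list so lookups stay safe while plugins come and go.
  private final List<ClassLoader> delegates = new CopyOnWriteArrayList<>();

  PluggableClassLoader(ClassLoader parent) {
    super(parent);
  }

  /** Registers a plugin loader; the most recently added one is tried first. */
  void addDelegate(ClassLoader loader) {
    delegates.add(0, loader);
  }

  /** Unregisters a plugin loader, for example when a plugin is unloaded. */
  boolean removeDelegate(ClassLoader loader) {
    return delegates.remove(loader);
  }

  /** Called by loadClass only after the parent loader failed to find the class. */
  @Override
  protected Class<?> findClass(String name) throws ClassNotFoundException {
    for (ClassLoader delegate : delegates) {
      try {
        return delegate.loadClass(name);
      } catch (ClassNotFoundException ignored) {
        // Fall through and try the next registered loader.
      }
    }
    throw new ClassNotFoundException(name);
  }
}
```

A deserializer that resolves classes through such a loader would see plugin classes as soon as the plugin's loader is registered, without restarting the member. How this would be wired into Geode's serialization is left open by the ticket.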
[jira] [Updated] (GEODE-3563) SSL socket handling problems in TCPConduit run
[ https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vahram Aharonyan updated GEODE-3563:
------------------------------------
         Priority: Critical  (was: Major)
    Fix Version/s:     (was: 1.2.1)
                   1.3.0

> SSL socket handling problems in TCPConduit run
> ----------------------------------------------
>
>                 Key: GEODE-3563
>                 URL: https://issues.apache.org/jira/browse/GEODE-3563
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>            Reporter: Vahram Aharonyan
>            Priority: Critical
>             Fix For: 1.3.0
>
> Here are two cases that seem to be problematic in the TCPConduit.run flow:
> 1. TCPConduit.run() takes no action when an SSLException is thrown from
> sslSocket.startHandshake(); as a result, the socket remains open. The catch
> block at the end of configureServerSSLSocket() just reports a fatal error
> (although it seems this portion is going to be removed in 1.2.1 according to
> GEODE-3393) and rethrows the exception.
> 2. The configureServerSSLSocket call is performed without first setting a
> socket timeout. This can block the run thread if a read initiated from the
> SSL handshake flow does not return. Linking to similar issues observed
> previously with other acceptors: GEODE-2898, GEODE-3023.
[jira] [Updated] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread
[ https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vahram Aharonyan updated GEODE-3637:
------------------------------------
    Affects Version/s: 1.1.0
                       1.2.0

> configureClientSSLSocket call can block Acceptor thread
> -------------------------------------------------------
>
>                 Key: GEODE-3637
>                 URL: https://issues.apache.org/jira/browse/GEODE-3637
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Vahram Aharonyan
>            Priority: Critical
>             Fix For: 1.3.0
>
> In org.apache.geode.internal.net.SocketCreator#configureClientSSLSocket, the
> socket timeout is configured before starting the SSL handshake only if the
> passed "timeout" argument is larger than 0.
> Issuing sslSocket.startHandshake without setting a timeout can block the
> calling thread, as in GEODE-2898 and GEODE-3023.
> Below is an example stack trace of a Handshaker thread that got stuck:
>
> "Handshaker /10.124.195.100:1 Thread 183" Id=526300 in RUNNABLE (running in native)
> Total blocked: 4 Total waited: 884
>   java.net.SocketInputStream.socketRead0(Native Method)
>   java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   java.net.SocketInputStream.read(SocketInputStream.java:171)
>   java.net.SocketInputStream.read(SocketInputStream.java:141)
>   sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
>   sun.security.ssl.InputRecord.read(InputRecord.java:503)
>   sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
>   sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
>   sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
>   org.apache.geode.internal.net.SocketCreator.configureClientSSLSocket(SocketCreator.java:1088)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:967)
>   org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:929)
>   org.apache.geode.internal.net.SocketCreator.connectForServer(SocketCreator.java:908)
>   org.apache.geode.internal.tcp.Connection.<init>(Connection.java:1306)
>   org.apache.geode.internal.tcp.Connection.createSender(Connection.java:1094)
>   org.apache.geode.internal.tcp.ConnectionTable.getOrderedAndOwned(ConnectionTable.java:553)
>   org.apache.geode.internal.tcp.ConnectionTable.get(ConnectionTable.java:664)
>   org.apache.geode.internal.tcp.TCPConduit.getConnection(TCPConduit.java:1037)
>   org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:543)
>   org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:319)
>   org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:605)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1684)
>   org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1875)
>   org.apache.geode.distributed.internal.DistributionChannel.send(DistributionChannel.java:82)
>   org.apache.geode.distributed.internal.DistributionManager.sendOutgoing(DistributionManager.java:3416)
>   org.apache.geode.distributed.internal.DistributionManager.sendMessage(DistributionManager.java:3453)
>   org.apache.geode.distributed.internal.DistributionManager.putOutgoing(DistributionManager.java:1832)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.sendProfileUpdate(UpdateAttributesProcessor.java:162)
>   org.apache.geode.internal.cache.UpdateAttributesProcessor.distribute(UpdateAttributesProcessor.java:97)
>   org.apache.geode.internal.cache.DistributedRegion.initialized(DistributedRegion.java:1128)
>   org.apache.geode.internal.cache.LocalRegion.initialize(LocalRegion.java:2413)
>   org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1117)
>   org.apache.geode.internal.cache.HARegion.initialize(HARegion.java:345)
>   org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3308)
>   org.apache.geode.internal.cache.HARegion.getInstance(HARegion.java:265)
>   org.apache.geode.internal.cache.ha.HARegionQueue.createHARegion(HARegionQueue.java:348)
>   org.apache.geode.internal.cache.ha.HARegionQueue.<init>(HARegionQueue.java:328)
>   org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.<init>(HARegionQueue.java:2199)
>   org.apache.geode.internal.cache.ha.HARegionQueue$DurableHARegionQueue.<init>(HARegionQueue.java:2450)
>   org.apache.geode.internal.cache.ha.HARegionQueue.getHARegionQueueInstance(HARegionQueue.java:2030)
[jira] [Updated] (GEODE-3637) configureClientSSLSocket call can block Acceptor thread
[ https://issues.apache.org/jira/browse/GEODE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vahram Aharonyan updated GEODE-3637:
------------------------------------
    Priority: Critical  (was: Major)

> configureClientSSLSocket call can block Acceptor thread
> -------------------------------------------------------
>
>                 Key: GEODE-3637
>                 URL: https://issues.apache.org/jira/browse/GEODE-3637
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>            Reporter: Vahram Aharonyan
>            Priority: Critical
>             Fix For: 1.3.0
[jira] [Comment Edited] (GEODE-3563) SSL socket handling problems in TCPConduit run
[ https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163144#comment-16163144 ]

Vahram Aharonyan edited comment on GEODE-3563 at 9/12/17 3:46 PM:
------------------------------------------------------------------

Hi [~amb],

We haven't created a pull request for this ticket yet. We have some thoughts on it:
1. Set a timeout before configuring the SSL socket, as was done in GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handle the SSL exception and do some cleanup work to close the socket in the run function.

Does this seem reasonable?

Thanks,
Vahram.

was (Author: vaharonyan):
Hi Anthony,

We haven't created a pull request for this ticket yet. We have some thoughts on it:
1. Set a timeout before configuring the SSL socket, as was done in GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handle the SSL exception and do some cleanup work to close the socket in the run function.

Does this seem reasonable?

Thanks,
Vahram.

> SSL socket handling problems in TCPConduit run
> ----------------------------------------------
>
>                 Key: GEODE-3563
>                 URL: https://issues.apache.org/jira/browse/GEODE-3563
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>            Reporter: Vahram Aharonyan
>             Fix For: 1.2.1
[jira] [Commented] (GEODE-3563) SSL socket handling problems in TCPConduit run
[ https://issues.apache.org/jira/browse/GEODE-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163144#comment-16163144 ]

Vahram Aharonyan commented on GEODE-3563:
-----------------------------------------

Hi Anthony,

We don't have a pull request created for this ticket yet. We have some thoughts on this, such as:
1. Setting a timeout before configuring the SSL socket, as was done in GEODE-2898 and GEODE-3023, to avoid any blocking situation.
2. Handling the SSL exception and doing some cleanup work to close the socket in the run function.

Does this seem reasonable?

Thanks,
Vahram.

> SSL socket handling problems in TCPConduit run
> ----------------------------------------------
>
> Key: GEODE-3563
> URL: https://issues.apache.org/jira/browse/GEODE-3563
> Project: Geode
> Issue Type: Bug
> Components: client/server
> Reporter: Vahram Aharonyan
> Fix For: 1.2.1
>
>
> Here are two cases that seem to be problematic in the TCPConduit.run flow:
> 1. TCPConduit.run() takes no action when an SSLException is thrown from
> sslSocket.startHandshake(); as a result, the socket remains open. The catch
> block at the end of configureServerSSLSocket() just reports a fatal error
> (although it seems this portion is going to be removed in 1.2.1 according to
> GEODE-3393) and re-throws the exception.
> 2. The configureServerSSLSocket call is made without first setting a socket
> timeout. This can block the run thread if a read initiated from the SSL
> handshake flow never returns. Similar issues were observed previously with
> other acceptors: GEODE-2898, GEODE-3023.
[jira] [Created] (GEODE-3563) SSL socket handling problems in TCPConduit run
Vahram Aharonyan created GEODE-3563:
---------------------------------------

Summary: SSL socket handling problems in TCPConduit run
Key: GEODE-3563
URL: https://issues.apache.org/jira/browse/GEODE-3563
Project: Geode
Issue Type: Bug
Components: client/server
Reporter: Vahram Aharonyan
Fix For: 1.2.1

Here are two cases that seem to be problematic in the TCPConduit.run flow:
1. TCPConduit.run() takes no action when an SSLException is thrown from sslSocket.startHandshake(); as a result, the socket remains open. The catch block at the end of configureServerSSLSocket() just reports a fatal error (although it seems this portion is going to be removed in 1.2.1 according to GEODE-3393) and re-throws the exception.
2. The configureServerSSLSocket call is made without first setting a socket timeout. This can block the run thread if a read initiated from the SSL handshake flow never returns. Similar issues were observed previously with other acceptors: GEODE-2898, GEODE-3023.
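
For illustration only, the two cases in this ticket could be addressed along the following lines. This is a minimal sketch using the standard JSSE API, not the actual Geode change: the class name HandshakeGuard, the method handshakeOrClose, and the timeoutMillis parameter are all hypothetical and do not exist in TCPConduit.

```java
import java.io.IOException;
import javax.net.ssl.SSLSocket;

/** Hypothetical helper for illustration; not part of Geode. */
final class HandshakeGuard {
  private HandshakeGuard() {}

  /**
   * Runs the SSL handshake with a read timeout in place.
   * Returns true on success. On any I/O failure the socket is closed
   * and false is returned, so the caller never keeps a half-open socket.
   */
  static boolean handshakeOrClose(SSLSocket sslSocket, int timeoutMillis) {
    try {
      // Case 2: bound blocking reads so a silent peer cannot stall the thread.
      // The timeout stays set afterwards; a caller may want to reset it.
      sslSocket.setSoTimeout(timeoutMillis);
      sslSocket.startHandshake();
      return true;
    } catch (IOException e) {
      // Case 1: SSLException and SocketTimeoutException are both IOExceptions.
      try {
        sslSocket.close();
      } catch (IOException ignored) {
        // Nothing more can be done if close itself fails.
      }
      return false;
    }
  }
}
```

An accept loop could call such a helper right after wrapping the accepted socket and simply continue to the next connection when it returns false. Whether to return a boolean or to re-throw after closing is a design choice the ticket leaves open.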