Hi Mike,

Generally, establishing connections in parallel could make sense, but note that in most cases this would be a minor optimization, because:
- Under load, connections are established once and then reused. If you
  observe disconnections during the application lifetime under load, that
  should probably be addressed first.
- Actual communication is asynchronous; we use NIO for this. If a connection
  already exists, sendGeneric() basically just puts a message into a queue.

-Val

On Mon, May 22, 2017 at 7:04 PM, Michael Griggs <michael.gri...@gridgain.com> wrote:

> Hi Igniters,
>
> Whilst diagnosing a problem with a slow query, I became aware of a potential
> issue in the Ignite codebase. When executing a SQL query that is to run
> remotely, the IgniteH2Indexing#send() method is called, with a
> Collection<ClusterNode> as one of its parameters. This collection is
> iterated sequentially, and ctx.io().sendGeneric() is called synchronously
> for each node. This is inefficient if
>
> a) This is the first execution of a query, and thus TCP connections
>    have to be established
>
> b) The cost of establishing a TCP connection is high
>
> And optionally
>
> c) There are a large number of nodes in the cluster
>
> In my current situation, developers want to run test queries from their code
> running locally, but connected via VPN to their UAT server environment.
> The cost of opening a TCP connection is in the multiple seconds, as you can
> see from this Ignite log file snippet:
>
> 2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:56924, rmtAddr=/10.132.80.3:47100]
> 2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:56923, rmtAddr=/10.132.80.30:47102]
> 2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:56971, rmtAddr=/10.132.80.23:47101]
> 2017-05-22 18:30:03,183 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:56972, rmtAddr=/10.132.80.21:47100]
> 2017-05-22 18:30:06,039 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:56973, rmtAddr=/10.132.80.21:47103]
> 2017-05-22 18:30:10,828 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:57020, rmtAddr=/10.132.80.20:47100]
> 2017-05-22 18:30:13,060 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:57021, rmtAddr=/10.132.80.29:47103]
> 2017-05-22 18:30:22,144 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:57022, rmtAddr=/10.132.80.22:47103]
> 2017-05-22 18:30:26,513 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:57024, rmtAddr=/10.132.80.20:47101]
> 2017-05-22 18:30:28,526 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/7.1.14.242:57025, rmtAddr=/10.132.80.30:47103]
>
> Comparing the same code that is executed inside of the UAT environment (so
> not using the VPN):
>
> 2017-05-22 18:22:18,102 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.175.11.38:53288, rmtAddr=/10.175.11.58:47100]
> 2017-05-22 18:22:18,105 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.175.11.38:45890, rmtAddr=/10.175.11.54:47101]
> 2017-05-22 18:22:18,108 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/127.0.0.1:47582, rmtAddr=/127.0.0.1:47100]
> 2017-05-22 18:22:18,111 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/127.0.0.1:45240, rmtAddr=/127.0.0.1:47103]
> 2017-05-22 18:22:18,114 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.175.11.38:46280, rmtAddr=/10.175.11.15:47100]
> 2017-05-22 18:22:18,118 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.132.80.21:51476, rmtAddr=/10.132.80.29:47103]
> 2017-05-22 18:22:18,120 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.132.80.21:56274, rmtAddr=pocfd-master1/10.132.80.22:47103]
> 2017-05-22 18:22:18,124 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.132.80.21:53558, rmtAddr=pocfd-ignite1/10.132.80.20:47101]
> 2017-05-22 18:22:18,127 INFO [TcpCommunicationSpi] - Established outgoing communication connection [locAddr=/10.132.80.21:56216, rmtAddr=/10.132.80.30:47103]
>
> This is a design flaw in the Ignite code, as we are relying on the client's
> network behaving in a particular way (i.e., port opening being very fast).
> We should instead try to mask this potential slowness by establishing
> connections in parallel, and waiting on the results.
>
> I would like to hear others thoughts and comment before we open a JIRA to
> look at this.
>
> Regards
>
> Mike
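
For reference, the idea of "establishing connections in parallel, and waiting on the results" could look roughly like the sketch below. It is only an illustration and does not use any Ignite API: ParallelConnect, Connector and connectAll are hypothetical names, and plain strings stand in for ClusterNode. It assumes that opening a connection is a blocking call that may throw.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;

final class ParallelConnect {
    /** Hypothetical stand-in for "open a connection to one node" (blocking). */
    interface Connector {
        void connect(String node) throws Exception;
    }

    /**
     * Starts one connection attempt per node on the given pool, then waits
     * for all of them. Returns the nodes that failed, mapped to the cause.
     */
    static Map<String, Throwable> connectAll(Collection<String> nodes,
                                             Connector connector,
                                             ExecutorService pool) {
        // Phase 1: kick off every attempt without waiting on any of them.
        Map<String, CompletableFuture<Void>> attempts = new LinkedHashMap<>();
        for (String node : nodes) {
            attempts.put(node, CompletableFuture.runAsync(() -> {
                try {
                    connector.connect(node);
                } catch (Exception e) {
                    throw new CompletionException(e);
                }
            }, pool));
        }

        // Phase 2: wait on the results and collect failures per node.
        Map<String, Throwable> failures = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<Void>> e : attempts.entrySet()) {
            try {
                e.getValue().join();
            } catch (CompletionException ex) {
                failures.put(e.getKey(), ex.getCause());
            }
        }
        return failures;
    }
}
```

With a pool at least as large as the node list, the total wait is roughly the slowest single connection rather than the sum of all of them. As noted in the reply, this only helps on first contact; once connections exist, sends are just queued.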