Hi Mike,

Generally, establishing connections in parallel could make sense, but note
that in most cases this would be a minor optimization, because:

   - Under load, connections are established once and then reused. If you
   observe disconnections during the application lifetime under load, then
   that should probably be addressed first.
   - Actual communication is asynchronous; we use NIO for this. If a
   connection already exists, sendGeneric() basically just puts a message into
   a queue.
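
For reference, here is a minimal sketch of what "establish connections in
parallel and wait on the results" could look like. It is only an
illustration: ParallelConnectSketch, connectAll() and the connect function
are hypothetical names and do not exist in the Ignite codebase.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from the Ignite codebase. */
class ParallelConnectSketch {
    /**
     * Starts connect.apply(node) for every node on the given pool, then waits
     * for all of them. With enough threads in the pool, the total wait is
     * roughly the slowest single connection rather than the sum of all.
     */
    static <N, C> Map<N, C> connectAll(List<N> nodes, Function<N, C> connect, ExecutorService pool) {
        // Kick off every connection attempt before waiting on any of them.
        Map<N, CompletableFuture<C>> pending = new LinkedHashMap<>();
        for (N node : nodes)
            pending.put(node, CompletableFuture.supplyAsync(() -> connect.apply(node), pool));

        // Collect the results in the original node order.
        Map<N, C> result = new LinkedHashMap<>();
        for (Map.Entry<N, CompletableFuture<C>> e : pending.entrySet())
            result.put(e.getKey(), e.getValue().join()); // join() rethrows a failure as CompletionException

        return result;
    }
}
```

The caller would pass the node list, a function that opens one connection,
and a thread pool sized for the expected number of nodes. The gain only
shows up on the first send to each node; once connections exist, there is
nothing left to parallelize.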

-Val

On Mon, May 22, 2017 at 7:04 PM, Michael Griggs <michael.gri...@gridgain.com
> wrote:

> Hi Igniters,
>
>
>
> Whilst diagnosing a problem with a slow query, I became aware of a potential
> issue in the Ignite codebase.  When executing a SQL query that is to run
> remotely, the IgniteH2Indexing#send() method is called, with a
> Collection<ClusterNode> as one of its parameters.  This collection is
> iterated sequentially, and ctx.io().sendGeneric() is called synchronously
> for each node.  This is inefficient if
>
>
>
> a)       This is the first execution of a query, and thus TCP connections
> have to be established
>
> b)      The cost of establishing a TCP connection is high
>
>
>
> And optionally
>
>
>
> c)       There are a large number of nodes in the cluster
>
>
>
> In my current situation, developers want to run test queries from their code
> running locally, but connected via VPN to their UAT server environment. The
> cost of opening a TCP connection runs to multiple seconds, as you can see
> from this Ignite log file snippet:
>
> 2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:56924,
> rmtAddr=/10.132.80.3:47100]
>
> 2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:56923,
> rmtAddr=/10.132.80.30:47102]
>
> 2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:56971,
> rmtAddr=/10.132.80.23:47101]
>
> 2017-05-22 18:30:03,183 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:56972,
> rmtAddr=/10.132.80.21:47100]
>
> 2017-05-22 18:30:06,039 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:56973,
> rmtAddr=/10.132.80.21:47103]
>
> 2017-05-22 18:30:10,828 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:57020,
> rmtAddr=/10.132.80.20:47100]
>
> 2017-05-22 18:30:13,060 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:57021,
> rmtAddr=/10.132.80.29:47103]
>
> 2017-05-22 18:30:22,144 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:57022,
> rmtAddr=/10.132.80.22:47103]
>
> 2017-05-22 18:30:26,513 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:57024,
> rmtAddr=/10.132.80.20:47101]
>
> 2017-05-22 18:30:28,526 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/7.1.14.242:57025,
> rmtAddr=/10.132.80.30:47103]
>
>
>
> Compare this with the same code executed inside the UAT environment (so
> not using the VPN):
>
> 2017-05-22 18:22:18,102 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.175.11.38:53288,
> rmtAddr=/10.175.11.58:47100]
>
> 2017-05-22 18:22:18,105 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.175.11.38:45890,
> rmtAddr=/10.175.11.54:47101]
>
> 2017-05-22 18:22:18,108 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/127.0.0.1:47582,
> rmtAddr=/127.0.0.1:47100]
>
> 2017-05-22 18:22:18,111 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/127.0.0.1:45240,
> rmtAddr=/127.0.0.1:47103]
>
> 2017-05-22 18:22:18,114 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.175.11.38:46280,
> rmtAddr=/10.175.11.15:47100]
>
> 2017-05-22 18:22:18,118 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.132.80.21:51476,
> rmtAddr=/10.132.80.29:47103]
>
> 2017-05-22 18:22:18,120 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.132.80.21:56274,
> rmtAddr=pocfd-master1/10.132.80.22:47103]
>
> 2017-05-22 18:22:18,124 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.132.80.21:53558,
> rmtAddr=pocfd-ignite1/10.132.80.20:47101]
>
> 2017-05-22 18:22:18,127 INFO [TcpCommunicationSpi] - Established outgoing
> communication connection [locAddr=/10.132.80.21:56216,
> rmtAddr=/10.132.80.30:47103]
>
>
>
> This is a design flaw in the Ignite code, as we are relying on the client's
> network behaving in a particular way (i.e., port opening being very fast).
> We should instead try to mask this potential slowness by establishing
> connections in parallel, and waiting on the results.
>
>
>
> I would like to hear others' thoughts and comments before we open a JIRA to
> look at this.
>
>
>
> Regards
>
> Mike
>
>
