Re: Unintuitive error message when invalid marshaller files found

2017-09-18 Thread Michael Griggs
java.JavaLogger info
INFO: Command protocol successfully stopped: TCP binary
Sep 18, 2017 8:22:35 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Command protocol successfully stopped: Jetty REST
Disconnected from the target VM, address: '127.0.0.1:57778',
transport: 'socket'
Sep 18, 2017 8:22:35 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 

>>>
+-+
>>> Ignite ver.
2.1.4#20170830-sha1:e9d5598fb4fece26c20e5a690ebc4a76ecad795a stopped
OK
>>>
+-+
>>> Ignite instance name: evictionExampleCluster
>>> Grid uptime: 00:00:12:676

Exception in thread "main" class org.apache.ignite.IgniteException:
Failed to start processor: GridProcessorAdapter []
    at
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:966)
    at org.apache.ignite.Ignition.start(Ignition.java:350)
    at com.gridgain.proserv.ServerNode.run(ServerNode.java:26)
    at com.gridgain.proserv.ServerNode.main(ServerNode.java:21)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to
start processor: GridProcessorAdapter []
    at
org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1813)
    at
org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:946)
    at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1904)
    at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1646)
    at
org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1074)
    at
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:992)
    at
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:878)
    at
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:777)
    at
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:647)
    at
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:616)
    at org.apache.ignite.Ignition.start(Ignition.java:347)
    ... 2 more
Caused by: class org.apache.ignite.IgniteCheckedException: Reading
marshaller mapping from file 248380598.classname failed; last symbol
of file name is expected to be numeric.
    at
org.apache.ignite.internal.MarshallerMappingFileStore.getPlatformId(MarshallerMappingFileStore.java:186)
    at
org.apache.ignite.internal.MarshallerMappingFileStore.restoreMappings(MarshallerMappingFileStore.java:153)
    at
org.apache.ignite.internal.MarshallerContextImpl.onMarshallerProcessorStarted(MarshallerContextImpl.java:524)
    at
org.apache.ignite.internal.processors.marshaller.GridMarshallerMappingProcessor.start(GridMarshallerMappingProcessor.java:114)
    at
org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1810)
    ... 12 more
Caused by: java.lang.NumberFormatException: For input string: "e"
    at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Byte.parseByte(Byte.java:149)
    at java.lang.Byte.parseByte(Byte.java:175)
    at
org.apache.ignite.internal.MarshallerMappingFileStore.getPlatformId(MarshallerMappingFileStore.java:183)
    ... 16 more

- Original Message -
From: dev@ignite.apache.org
To:<dev@ignite.apache.org>
Cc:
Sent:Fri, 15 Sep 2017 09:57:27 -0700
Subject:Re: Unintuitive error message when invalid marshaller files
found

 Mike,

 Can you show the exception that is thrown?

 -Val

 On Fri, Sep 15, 2017 at 7:12 AM, Michael Griggs
<mich...@griggs.org.uk>
 wrote:

 > This afternoon I came across an unusual case where there were files
in
 > my work/marshaller folder with invalid filenames. It seems that the
 > valid format is -[0-9]+.classname[0-9]. However, I had files that
 > were in the format -[0-9]+.classname - i.e., no trailing zero.
Where
 > these files came from I'm not sure, perhaps a significantly older
 > version of Ignite?
 >
 > The error message could be improved, and unless there is an
 > outstanding JIRA I will open one to
 >
 > 1. Print the full file path, not just the filename - this will help
in
 > determining where the work/marshaller folder is located
 > 2. Suggesting to clear out the contents of the work/marshaller
folder
 > and restart
 >
 > Alternatively, can we just ignore files that do not end in [0-9] ?
 >
 > Regards
 > Mike
 >
 >
 >




Unintuitive error message when invalid marshaller files found

2017-09-15 Thread Michael Griggs
This afternoon I came across an unusual case where there were files in
my work/marshaller folder with invalid filenames.  It seems that the
valid format is -[0-9]+.classname[0-9].  However, I had files that
were in the format -[0-9]+.classname - i.e., no trailing zero.  Where
these files came from I'm not sure, perhaps a significantly older
version of Ignite?

The error message could be improved, and unless there is an
outstanding JIRA I will open one to 

1. Print the full file path, not just the filename - this will help in
determining where the work/marshaller folder is located
2. Suggest clearing out the contents of the work/marshaller folder
and restarting

Alternatively, can we just ignore files that do not end in [0-9] ?
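
A minimal Java sketch of both ideas, assuming the naming described above (a single trailing digit that encodes the platform id). The class MarshallerFileNames and its methods are hypothetical names used for illustration only; they are not part of Ignite's MarshallerMappingFileStore.

```java
import java.nio.file.Path;
import java.util.OptionalInt;

/** Hypothetical helper for illustration; not an Ignite class. */
final class MarshallerFileNames {
    private MarshallerFileNames() { }

    /**
     * Returns the platform id taken from the last character of the file name,
     * or empty if the name does not end in an ASCII digit (so the caller can skip the file).
     */
    static OptionalInt platformId(Path file) {
        Path namePath = file.getFileName();
        String name = namePath == null ? "" : namePath.toString();

        if (name.isEmpty())
            return OptionalInt.empty();

        char last = name.charAt(name.length() - 1);

        return last >= '0' && last <= '9' ? OptionalInt.of(last - '0') : OptionalInt.empty();
    }

    /** Builds an error message that shows the full path and suggests a remedy. */
    static String describeInvalid(Path file) {
        return "Invalid marshaller mapping file name (expected a trailing digit): "
            + file.toAbsolutePath()
            + ". Consider clearing the containing folder and restarting the node.";
    }
}
```

A caller could either skip files for which platformId() is empty, or fail with the text from describeInvalid().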

Regards
Mike




[jira] [Created] (IGNITE-5785) C# QuerySqlField attribute should provide access to Order parameter

2017-07-19 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-5785:
--

 Summary: C# QuerySqlField attribute should provide access to Order 
parameter
 Key: IGNITE-5785
 URL: https://issues.apache.org/jira/browse/IGNITE-5785
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Affects Versions: 2.1
Reporter: Michael Griggs
 Fix For: 2.2


https://apacheignite.readme.io/docs/indexes#section-group-indexes

{{order}} parameter should be accessible via Ignite.NET.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-5783) LINQ queries should provide the ability to generate the SQL query plan

2017-07-19 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-5783:
--

 Summary: LINQ queries should provide the ability to generate the 
SQL query plan
 Key: IGNITE-5783
 URL: https://issues.apache.org/jira/browse/IGNITE-5783
 Project: Ignite
  Issue Type: New Feature
  Components: platforms
Affects Versions: 2.0
Reporter: Michael Griggs
Priority: Minor
 Fix For: 2.2


At present, the only way to see the query plan generated by a LINQ query in C# 
is to:

# Call {{GetFieldsQuery()}}
# Prepend the string {{"explain "}} to the resulting string
# Execute the resulting query and retrieve the plan





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-5697) Web Console should be configurable for IPv4 connections

2017-07-05 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-5697:
--

 Summary: Web Console should be configurable for IPv4 connections
 Key: IGNITE-5697
 URL: https://issues.apache.org/jira/browse/IGNITE-5697
 Project: Ignite
  Issue Type: Bug
  Components: UI
Affects Versions: 2.0
Reporter: Michael Griggs
 Fix For: 2.1


When an IPv6 network interface is available, NodeJS will always prefer that to 
an IPv4 interface.  This causes problems if IPv6 is not fully operational on 
the network.  

This fix will add the ability to specify that IPv4 should be used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Writing a helper for KafkaStreamer

2017-06-06 Thread Michael Griggs
We made a change [1] that required users to re-write code that uses
KafkaStreamer to initialise tuple extractors.  The re-write is not
immediately obvious, and for simple use cases (streaming single tuples) it
is easy to write a helper function that will make the transition easier.  

 

I intend to raise a JIRA and submit a PR to implement this.

 

MG

 

 

[1] https://issues.apache.org/jira/browse/IGNITE-4140 

 

--

Michael Griggs

Consultant, EMEA

GridGain Systems

 



Re: Inefficient approach to executing remote SQL queries

2017-05-23 Thread Michael Griggs
The problem here is with the initial opening of connections. With a client
who connects and disconnects quickly, and frequently, a 30-second plus
connection time is not workable.

Mike

On 23 May 2017 6:51 pm, "Dmitriy Setrakyan" <dsetrak...@apache.org> wrote:

> Why do we turn off the connections, once established? Why not keep them
> open, until an endpoint explicitly closes them?
>
> On Tue, May 23, 2017 at 2:16 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > Michael,
> >
> > I see your point. I think it must not be too hard to start asynchronously
> > establishing connections to all the needed nodes.
> >
> > I've created respective issue in Jira:
> > https://issues.apache.org/jira/browse/IGNITE-5277
> >
> > Sergi
> >
> > 2017-05-23 11:56 GMT+03:00 Michael Griggs <michael.gri...@gridgain.com>:
> >
> > > Hi Val
> > >
> > > This is precisely my point: it's only a minor optimization until the
> > point
> > > when establishing each connection takes 3-4 seconds, and we establish
> 32
> > of
> > > them in sequence.  At that point it becomes a serious issue: the
> customer
> > > cannot run SQL queries from their development machines without them
> > timing
> > > out once out of every two or three runs.  These kind of problems
> > undermine
> > > confidence in Ignite.
> > >
> > > Mike
> > >
> > >
> > > -Original Message-
> > > From: Valentin Kulichenko [mailto:valentin.kuliche...@gmail.com]
> > > Sent: 22 May 2017 19:15
> > > To: dev@ignite.apache.org
> > > Subject: Re: Inefficient approach to executing remote SQL queries
> > >
> > > Hi Mike,
> > >
> > > Generally, establishing connections in parallel could make sense, but
> > note
> > > that in most this would be a minor optimization, because:
> > >
> > >- Under load connections are established once and then reused. If
> you
> > >observe disconnections during application lifetime under load, then
> > >probably this should be addressed first.
> > >- Actual communication is asynchronous, we use NIO for this. If
> > >connection already exists, sendGeneric() basically just puts a
> message
> > > into
> > >a queue.
> > >
> > > -Val
> > >
> > > On Mon, May 22, 2017 at 7:04 PM, Michael Griggs <
> > > michael.gri...@gridgain.com
> > > > wrote:
> > >
> > > > Hi Igniters,
> > > >
> > > >
> > > >
> > > > Whilst diagnosing a problem with a slow query, I became aware of a
> > > > potential issue in the Ignite codebase.  When executing a SQL query
> > > > that is to run remotely, the IgniteH2Indexing#send() method is
> called,
> > > > with a Collection as one of its parameters.  This
> > > > collection is iterated sequentially, and ctx.io().sendGeneric() is
> > > > called synchronously for each node.  This is inefficient if
> > > >
> > > >
> > > >
> > > > a)   This is the first execution of a query, and thus TCP
> > connections
> > > > have to be established
> > > >
> > > > b)  The cost of establishing a TCP connection is high
> > > >
> > > >
> > > >
> > > > And optionally
> > > >
> > > >
> > > >
> > > > c)   There are a large number of nodes in the cluster
> > > >
> > > >
> > > >
> > > > In my current situation, developers want to run test queries from
> > > > their code running locally, but connected via VPN to their UAT server
> > > > environment.
> > > > The
> > > > cost of opening a TCP connection is in the multiple seconds, as you
> > > > can see from this Ignite log file snippet:
> > > >
> > > > 2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established
> > > > outgoing communication connection [locAddr=/7.1.14.242:56924,
> > > > rmtAddr=/10.132.80.3:47100]
> > > >
> > > > 2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established
> > > > outgoing communication connection [locAddr=/7.1.14.242:56923,
> > > > rmtAddr=/10.132.80.30:47102]
> > > >
> > > > 2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established
> > > > outgoing communication connection [

RE: Inefficient approach to executing remote SQL queries

2017-05-23 Thread Michael Griggs
Hi Val

This is precisely my point: it's only a minor optimization until the point when 
establishing each connection takes 3-4 seconds, and we establish 32 of them in 
sequence.  At that point it becomes a serious issue: the customer cannot run 
SQL queries from their development machines without them timing out once out of 
every two or three runs.  These kinds of problems undermine confidence in 
Ignite.  

Mike


-Original Message-
From: Valentin Kulichenko [mailto:valentin.kuliche...@gmail.com] 
Sent: 22 May 2017 19:15
To: dev@ignite.apache.org
Subject: Re: Inefficient approach to executing remote SQL queries

Hi Mike,

Generally, establishing connections in parallel could make sense, but note that 
in most cases this would be a minor optimization, because:

   - Under load connections are established once and then reused. If you
   observe disconnections during application lifetime under load, then
   probably this should be addressed first.
   - Actual communication is asynchronous, we use NIO for this. If
   connection already exists, sendGeneric() basically just puts a message into
   a queue.

-Val

On Mon, May 22, 2017 at 7:04 PM, Michael Griggs <michael.gri...@gridgain.com
> wrote:

> Hi Igniters,
>
>
>
> Whilst diagnosing a problem with a slow query, I became aware of a 
> potential issue in the Ignite codebase.  When executing a SQL query 
> that is to run remotely, the IgniteH2Indexing#send() method is called, 
> with a Collection as one of its parameters.  This 
> collection is iterated sequentially, and ctx.io().sendGeneric() is 
> called synchronously for each node.  This is inefficient if
>
>
>
> a)   This is the first execution of a query, and thus TCP connections
> have to be established
>
> b)  The cost of establishing a TCP connection is high
>
>
>
> And optionally
>
>
>
> c)   There are a large number of nodes in the cluster
>
>
>
> In my current situation, developers want to run test queries from 
> their code running locally, but connected via VPN to their UAT server 
> environment.
> The
> cost of opening a TCP connection is in the multiple seconds, as you 
> can see from this Ignite log file snippet:
>
> 2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:56924, 
> rmtAddr=/10.132.80.3:47100]
>
> 2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:56923, 
> rmtAddr=/10.132.80.30:47102]
>
> 2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:56971, 
> rmtAddr=/10.132.80.23:47101]
>
> 2017-05-22 18:30:03,183 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:56972, 
> rmtAddr=/10.132.80.21:47100]
>
> 2017-05-22 18:30:06,039 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:56973, 
> rmtAddr=/10.132.80.21:47103]
>
> 2017-05-22 18:30:10,828 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:57020, 
> rmtAddr=/10.132.80.20:47100]
>
> 2017-05-22 18:30:13,060 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:57021, 
> rmtAddr=/10.132.80.29:47103]
>
> 2017-05-22 18:30:22,144 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:57022, 
> rmtAddr=/10.132.80.22:47103]
>
> 2017-05-22 18:30:26,513 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:57024, 
> rmtAddr=/10.132.80.20:47101]
>
> 2017-05-22 18:30:28,526 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/7.1.14.242:57025, 
> rmtAddr=/10.132.80.30:47103]
>
>
>
> Comparing the same code that is executed inside of the UAT environment 
> (so not using the VPN):
>
> 2017-05-22 18:22:18,102 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/10.175.11.38:53288, 
> rmtAddr=/10.175.11.58:47100]
>
> 2017-05-22 18:22:18,105 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/10.175.11.38:45890, 
> rmtAddr=/10.175.11.54:47101]
>
> 2017-05-22 18:22:18,108 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/127.0.0.1:47582, 
> rmtAddr=/127.0.0.1:47100]
>
> 2017-05-22 18:22:18,111 INFO [TcpCommunicationSpi] - Established 
> outgoing communication connection [locAddr=/127.0.0.1:45240, 
> rmtAddr=/127.0.0.1:47103]
>
> 2017-05-22 18:22:18,114 INFO [TcpCommunicationSpi] - Established 
> outgoing communic

Inefficient approach to executing remote SQL queries

2017-05-22 Thread Michael Griggs
Hi Igniters,

 

Whilst diagnosing a problem with a slow query, I became aware of a potential
issue in the Ignite codebase.  When executing a SQL query that is to run
remotely, the IgniteH2Indexing#send() method is called, with a
Collection as one of its parameters.  This collection is
iterated sequentially, and ctx.io().sendGeneric() is called synchronously
for each node.  This is inefficient if



a)   This is the first execution of a query, and thus TCP connections
have to be established

b)  The cost of establishing a TCP connection is high



And optionally

 

c)   There are a large number of nodes in the cluster

 

In my current situation, developers want to run test queries from their code
running locally, but connected via VPN to their UAT server environment.  The
cost of opening a TCP connection is in the multiple seconds, as you can see
from this Ignite log file snippet:

2017-05-22 18:29:48,908 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:56924,
rmtAddr=/10.132.80.3:47100]

2017-05-22 18:29:52,294 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:56923,
rmtAddr=/10.132.80.30:47102]

2017-05-22 18:29:58,659 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:56971,
rmtAddr=/10.132.80.23:47101]

2017-05-22 18:30:03,183 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:56972,
rmtAddr=/10.132.80.21:47100]

2017-05-22 18:30:06,039 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:56973,
rmtAddr=/10.132.80.21:47103]

2017-05-22 18:30:10,828 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:57020,
rmtAddr=/10.132.80.20:47100]

2017-05-22 18:30:13,060 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:57021,
rmtAddr=/10.132.80.29:47103]

2017-05-22 18:30:22,144 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:57022,
rmtAddr=/10.132.80.22:47103]

2017-05-22 18:30:26,513 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:57024,
rmtAddr=/10.132.80.20:47101]

2017-05-22 18:30:28,526 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/7.1.14.242:57025,
rmtAddr=/10.132.80.30:47103]

 

Comparing the same code that is executed inside of the UAT environment (so
not using the VPN):

2017-05-22 18:22:18,102 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.175.11.38:53288,
rmtAddr=/10.175.11.58:47100]

2017-05-22 18:22:18,105 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.175.11.38:45890,
rmtAddr=/10.175.11.54:47101]

2017-05-22 18:22:18,108 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/127.0.0.1:47582,
rmtAddr=/127.0.0.1:47100]

2017-05-22 18:22:18,111 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/127.0.0.1:45240,
rmtAddr=/127.0.0.1:47103]

2017-05-22 18:22:18,114 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.175.11.38:46280,
rmtAddr=/10.175.11.15:47100]

2017-05-22 18:22:18,118 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.132.80.21:51476,
rmtAddr=/10.132.80.29:47103]

2017-05-22 18:22:18,120 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.132.80.21:56274,
rmtAddr=pocfd-master1/10.132.80.22:47103]

2017-05-22 18:22:18,124 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.132.80.21:53558,
rmtAddr=pocfd-ignite1/10.132.80.20:47101]

2017-05-22 18:22:18,127 INFO [TcpCommunicationSpi] - Established outgoing
communication connection [locAddr=/10.132.80.21:56216,
rmtAddr=/10.132.80.30:47103]

 

This is a design flaw in the Ignite code, as we are relying on the client's
network behaving in a particular way (i.e., port opening being very fast).
We should instead try to mask this potential slowness by establishing
connections in parallel, and waiting on the results.
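
A minimal Java sketch of the idea, assuming the per-node work can be expressed as a blocking callback. The class ParallelConnect, the method connectAll and the connect callback are hypothetical stand-ins; this is not the code in IgniteH2Indexing#send() or sendGeneric().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Hypothetical illustration; not Ignite code. */
final class ParallelConnect {
    private ParallelConnect() { }

    /**
     * Starts the (possibly slow) connect callback for every node at once
     * and blocks until all of them have finished.
     */
    static <N> void connectAll(List<N> nodes, Consumer<N> connect) {
        if (nodes.isEmpty())
            return;

        // One thread per node, so no connection attempt waits behind another.
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());

        try {
            List<CompletableFuture<Void>> futs = new ArrayList<>();

            for (N node : nodes)
                futs.add(CompletableFuture.runAsync(() -> connect.accept(node), pool));

            // join() rethrows a failure from any callback as a CompletionException.
            CompletableFuture.allOf(futs.toArray(new CompletableFuture[0])).join();
        }
        finally {
            pool.shutdown();
        }
    }
}
```

With this shape the total wait is roughly the slowest single connection rather than the sum of all of them.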

 

I would like to hear others' thoughts and comments before we open a JIRA to
look at this.

 

Regards

Mike



[jira] [Created] (IGNITE-4878) IgniteH2Indexing can throw java.util.ConcurrentModificationException

2017-03-29 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-4878:
--

 Summary: IgniteH2Indexing can throw 
java.util.ConcurrentModificationException
 Key: IGNITE-4878
 URL: https://issues.apache.org/jira/browse/IGNITE-4878
 Project: Ignite
  Issue Type: Bug
Affects Versions: 1.9
Reporter: Michael Griggs
Assignee: Michael Griggs


From the Collections#synchronizedCollection method:

{noformat}
 * It is imperative that the user manually synchronize on the returned
 * collection when traversing it via {@link Iterator}, {@link Spliterator}
 * or {@link Stream}:
 * 
 *  Collection c = Collections.synchronizedCollection(myCollection);
 * ...
 *  synchronized (c) {
 *  Iterator i = c.iterator(); // Must be in the synchronized block
 *  while (i.hasNext())
 * foo(i.next());
 *  }
 * 
 * Failure to follow this advice may result in non-deterministic behavior.
{noformat}
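
A minimal, self-contained Java sketch of the pattern the Javadoc asks for. The class SyncIteration and its sum method are hypothetical illustrations and are not taken from IgniteH2Indexing.

```java
import java.util.Collection;

/** Hypothetical illustration; not Ignite code. */
final class SyncIteration {
    private SyncIteration() { }

    /**
     * Sums the elements while holding the collection's monitor, so that no other
     * thread can modify the collection in the middle of the traversal.
     * The argument is expected to be the wrapper returned by
     * Collections.synchronizedCollection(...).
     */
    static int sum(Collection<Integer> c) {
        int total = 0;

        synchronized (c) {
            // The enhanced for loop uses c.iterator(), so it must stay inside the block.
            for (int v : c)
                total += v;
        }

        return total;
    }
}
```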




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


RE: Distribution of keys to partitions

2017-03-15 Thread Michael Griggs
Have we ever heard of somebody needing to set the partition count to a 
non-power-of-two number?  Perhaps we could restrict the method so that it will 
only accept a power of two as the partition count?
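
A minimal Java sketch of such a check, assuming it would be applied wherever the partition count is set. The class PartitionCounts and its methods are hypothetical names; this is not an existing Ignite API.

```java
/** Hypothetical illustration; not Ignite code. */
final class PartitionCounts {
    private PartitionCounts() { }

    /** A positive int is a power of two exactly when it has a single bit set. */
    static boolean isPowerOfTwo(int parts) {
        return parts > 0 && (parts & (parts - 1)) == 0;
    }

    /** Returns the value unchanged, or rejects it if it is not a power of two. */
    static int requirePowerOfTwo(int parts) {
        if (!isPowerOfTwo(parts))
            throw new IllegalArgumentException("Partition count must be a power of two: " + parts);

        return parts;
    }
}
```

The same predicate could instead drive a warning rather than an exception.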

-Original Message-
From: Valentin Kulichenko [mailto:valentin.kuliche...@gmail.com] 
Sent: 15 March 2017 16:22
To: dev@ignite.apache.org
Subject: Re: Distribution of keys to partitions

Andrey,

Absolutely, your point is correct. I'm talking about default behavior which 
must be as effective as possible. In case we do this optimization, I would also 
show a warning if number of partitions is not a power of two.

-Val

On Wed, Mar 15, 2017 at 5:09 PM, Andrey Gura <ag...@apache.org> wrote:

> Anyway, we can't always use this optimization because it will not work 
> for non power of two values.
>
> On Wed, Mar 15, 2017 at 6:48 PM, Valentin Kulichenko 
> <valentin.kuliche...@gmail.com> wrote:
> > In 99% of cases number of partition is a power of two, because it's 
> > the default value. Almost no one changes it. If this change actually 
> > provides better distribution, it absolutely makes sense to do it.
> >
> > Michael, can you create a Jira ticket and put your findings there?
> >
> > -Val
> >
> > On Wed, Mar 15, 2017 at 3:58 PM, Andrey Gura <ag...@apache.org> wrote:
> >
> >> Michael,
> >>
> >> it makes sense only for cases when partitions count is power of two.
> >> Affinity function doesn't have this limitation.
> >>
> >> But, of course, we can check, that partitions count is power of two 
> >> and use optimized hash code calculation.
> >>
> >>
> >> On Wed, Mar 15, 2017 at 4:09 PM, Michael Griggs 
> >> <michael.gri...@gridgain.com> wrote:
> >> > Hi Igniters,
> >> >
> >> > Last week I was working with a group of Ignite users.  They are
> inserting
> >> > several million string keys in to a cache.  Each string key was 
> >> > approximately 22-characters in length.  When I exported the 
> >> > partition counts (via GG Visor) I was able to see an unusual 
> >> > periodicity in the number of keys allocated to partitions.  I charted 
> >> > this in Excel [1].
> >> > After further investigation, it appears that there is a 
> >> > relationship between the number of keys being inserted, the 
> >> > number of partitions assigned to the cache and amount of apparent 
> >> > periodicity: a small
> number
> >> of
> >> > partitions will cause periodicity to appear with lower numbers of
> keys.
> >> >
> >> > The RendezvousAffinityFunction#partition function performs a 
> >> > simple calculation of key hashcode modulo partition-count:
> >> >
> >> > U.safeAbs(key.hashCode() % parts)
> >> >
> >> >
> >> > Digging further I was led to the fact that this is how the Java
> HashMap
> >> > *used* to behave [2], but was upgraded around Java 1.4 to perform 
> >> > the
> >> > following:
> >> >
> >> > key.hashCode() & (parts - 1)
> >> >
> >> > which performs more efficiently.  It was then updated further to 
> >> > do
> the
> >> > following:
> >> >
> >> > (h = key.hashCode()) ^ (h >>> 16);
> >> >
> >> > with the bit-shift performed to
> >> >
> >> > incorporate impact of the highest bits that would otherwise never 
> >> > be used in index calculations because of table bounds
> >> >
> >> >
> >> > When using this function, rather than our 
> >> > RendezvousAffinityFunction#partition implementation, I also saw a 
> >> > significant decrease in the periodicity and a better distribution 
> >> > of
> keys
> >> > amongst partitions [3].
> >> >
> >> > I would like to suggest that we adopt this modified hash function
> inside
> >> > RendezvousAffinityFunction.
> >> >
> >> > Regards
> >> > Mike
> >> >
> >> >
> >> > [1]: https://i.imgur.com/0FtCZ2A.png
> >> > [2]:
> >> > https://www.quora.com/Why-does-Java-use-a-mediocre-
> >> hashCode-implementation-for-strings
> >> > [3]: https://i.imgur.com/8ZuCSA3.png
> >>
>



[jira] [Created] (IGNITE-4828) Improve the distribution of keys within partitions

2017-03-15 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-4828:
--

 Summary: Improve the distribution of keys within partitions
 Key: IGNITE-4828
 URL: https://issues.apache.org/jira/browse/IGNITE-4828
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 1.9
Reporter: Michael Griggs
 Fix For: 2.0


An issue has been found when inserting several million string keys into a 
cache.  Each string key was approximately 22 characters in length.  When I 
exported the partition counts (via GG Visor) I was able to see an unusual 
periodicity in the number of keys allocated to partitions.  I charted this in 
Excel (1).

After further investigation, it appears that there is a relationship
between the number of keys being inserted, the number of partitions
assigned to the cache and the amount of apparent periodicity: a small number
of partitions will cause periodicity to appear with a lower number of keys.

The {{RendezvousAffinityFunction#partition}} function performs a simple
calculation of key hashcode modulo partition-count:

{{U.safeAbs(key.hashCode() % parts)}}

Digging further I was led to the fact that this is how the Java HashMap
*used* to behave (2), but was upgraded around Java 1.4 to perform the
following:

{{key.hashCode() & (parts - 1)}}

which performs more efficiently.  It was then updated further to do the
following:

{{(h = key.hashCode()) ^ (h >>> 16);}}

with the bit-shift performed to
bq. incorporate impact of the highest bits that would otherwise never be used 
in index calculations because of table bounds

When using this function, rather than our
{{RendezvousAffinityFunction#partition}} implementation, I also saw a
significant decrease in the periodicity and a better distribution of keys
amongst partitions (3). 

(1):  https://i.imgur.com/0FtCZ2A.png
(2):  
https://www.quora.com/Why-does-Java-use-a-mediocre-hashCode-implementation-for-strings
(3):  https://i.imgur.com/8ZuCSA3.png



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Distribution of keys to partitions

2017-03-15 Thread Michael Griggs
Hi Igniters,

Last week I was working with a group of Ignite users.  They were inserting
several million string keys into a cache.  Each string key was
approximately 22 characters in length.  When I exported the partition
counts (via GG Visor) I was able to see an unusual periodicity in the
number of keys allocated to partitions.  I charted this in Excel [1].
After further investigation, it appears that there is a relationship
between the number of keys being inserted, the number of partitions
assigned to the cache and amount of apparent periodicity: a small number of
partitions will cause periodicity to appear with lower numbers of keys.

The RendezvousAffinityFunction#partition function performs a simple
calculation of key hashcode modulo partition-count:

U.safeAbs(key.hashCode() % parts)


Digging further I was led to the fact that this is how the Java HashMap
*used* to behave [2], but was upgraded around Java 1.4 to perform the
following:

key.hashCode() & (parts - 1)

which performs more efficiently.  It was then updated further to do the
following:

(h = key.hashCode()) ^ (h >>> 16);

with the bit-shift performed to

incorporate impact of the highest bits that would otherwise
never be used in index calculations because of table bounds


When using this function, rather than our
RendezvousAffinityFunction#partition implementation, I also saw a
significant decrease in the periodicity and a better distribution of keys
amongst partitions [3].

I would like to suggest that we adopt this modified hash function inside
RendezvousAffinityFunction.
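
A minimal Java sketch that puts the two mappings side by side, assuming a power-of-two partition count for the masked variant. The class PartitionHash and its methods are hypothetical, and safeAbs here is only a local stand-in for U.safeAbs; none of this is the actual RendezvousAffinityFunction code.

```java
/** Hypothetical illustration; not Ignite code. */
final class PartitionHash {
    private PartitionHash() { }

    /** The mapping quoted above: absolute value of hash code modulo partition count. */
    static int modulo(Object key, int parts) {
        return safeAbs(key.hashCode() % parts);
    }

    /**
     * HashMap-style mapping: fold the upper 16 bits into the lower 16, then mask.
     * Only valid when parts is a power of two.
     */
    static int spread(Object key, int parts) {
        int h = key.hashCode();

        return (h ^ (h >>> 16)) & (parts - 1);
    }

    /** Local stand-in for U.safeAbs: never returns a negative value. */
    private static int safeAbs(int i) {
        i = Math.abs(i);

        return i < 0 ? 0 : i;
    }
}
```

For example, with 1024 partitions the Integer key 65536 (only bit 16 set) lands in partition 0 under modulo but in partition 1 under spread, because the shift brings the high bit into the masked range.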

Regards
Mike


[1]: https://i.imgur.com/0FtCZ2A.png
[2]:
https://www.quora.com/Why-does-Java-use-a-mediocre-hashCode-implementation-for-strings
[3]: https://i.imgur.com/8ZuCSA3.png


[jira] [Created] (IGNITE-2173) Some log.debug() calls in CacheAbstractJdbcStore.java are not protected with if (log.isDebugEnabled())

2015-12-15 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-2173:
--

 Summary: Some log.debug() calls in CacheAbstractJdbcStore.java are 
not protected with if (log.isDebugEnabled())
 Key: IGNITE-2173
 URL: https://issues.apache.org/jira/browse/IGNITE-2173
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 1.5
Reporter: Michael Griggs
Priority: Minor


e.g., line 1029

{code}
log.debug("Write entries to db one by one using update and insert statements 
[cache name=" +
{code}

A side-effect of this is WARNing messages in your log file when these 
statements are executed:

{code}
2015-12-15 16:19:26 WARN  CacheJdbcPojoStore:463 - Logging at DEBUG level 
without checking if DEBUG level is enabled: 
{code}
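
A minimal Java sketch of the guarded form. The Log interface below is a hypothetical stand-in with just the two methods named in the summary, and logWrite is an illustrative method, not the code in CacheAbstractJdbcStore.

```java
/** Hypothetical illustration; not Ignite code. */
final class GuardedLogging {
    private GuardedLogging() { }

    /** Minimal stand-in for a logger that exposes a debug-level check. */
    interface Log {
        boolean isDebugEnabled();

        void debug(String msg);
    }

    /** Builds and emits the message only when DEBUG is enabled. */
    static void logWrite(Log log, String cacheName) {
        if (log.isDebugEnabled())
            log.debug("Write entries to db one by one using update and insert statements " +
                "[cache name=" + cacheName + ']');
    }
}
```

The guard also avoids the cost of the string concatenation when DEBUG is off.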




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (IGNITE-1774) REST API should be implemented using Jersey

2015-10-22 Thread Michael Griggs (JIRA)
Michael Griggs created IGNITE-1774:
--

 Summary: REST API should be implemented using Jersey
 Key: IGNITE-1774
 URL: https://issues.apache.org/jira/browse/IGNITE-1774
 Project: Ignite
  Issue Type: Improvement
Affects Versions: ignite-1.4
Reporter: Michael Griggs


Jersey+Jetty is a well-established method for implementing RESTful interfaces.  

The REST API should implement non-modifying cache methods (e.g., get()) via 
HTTP GET methods, and modifying cache methods (e.g., put(), replace()) via HTTP 
POST methods.  This allows the JSON data to be correctly formatted in the body 
of the message, instead of needing to be URL-encoded on the URI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)