Hey Stanislav,
I was able to get into a situation like the one described before, but the
server nodes are not under pressure any more. The cache state seems to stay
corrupted. I restarted the client, and even then the client does not see the
whole cache content (although the SQL query does) and gets "not found" errors
when trying to load objects whose keys were returned by the SQL query.
I am able to reproduce the situation if load is applied to the servers
while one of them is inserting new objects with the LoadCache method. I
switched to REPLICATED mode in the meantime and the problem still exists.

Current situation:
I have 138285 objects (SQLLine "SELECT COUNT(*) FROM [cache]") which 
corresponds to the object count read from the database to fill the cache.
The SQL query being executed on the client is "select count(*) FROM [cache] 
WHERE Field1 IN 
 AND Field2 IN (3,1,4,7);"
This query consistently returns 120284 hits in SQLLine. On the client node I
get the same result, at least most of the time. Sometimes the client gets
only 3753 as the result. The client runs the query every 10 to 15 seconds.
Now about the numbers that visor (and `cache.GetSize()` on the client) report: 
72935 (split up into 1866 on the first and 71069 on the second node).

Therefore I have 65350 objects that are seen by SQL (every time when using
SQLLine, most of the time by the client) but are not seen by visor or by
`cache.GetSize()` executed on the client.
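
For reference, the arithmetic behind these figures can be checked with a few lines of Python. This is only a sanity check on the numbers quoted in this message; the variable names are made up for illustration and nothing here talks to Ignite.

```python
# Counts quoted in this message (no Ignite calls involved).
sql_total = 138285          # SELECT COUNT(*) FROM [cache] via SQLLine
node_sizes = [1866, 71069]  # per-node sizes reported by visor

visible = sum(node_sizes)      # total reported by visor / cache.GetSize()
missing = sql_total - visible  # seen by SQL but not by GetSize()

print(visible, missing)  # prints "72935 65350"
```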

I will extract the configuration and some log data as soon as possible. I'll 
also try to replace LoadCache with a DataStreamer. Maybe it helps.


-----Original Message-----
From: Stanislav Lukyanov <stanlukya...@gmail.com> 
Sent: 13 April 2018 18:26
To: user@ignite.apache.org
Subject: Re: Discrepancy between cache.Size() / Visorcmd and SQL query resultset


Could you please share your configuration files, cache configurations, logs 
from all nodes and the code snippets you use to do the queries (visor commands, 
SQL, etc)?


Bellenger, Dominique wrote
> Hey Igniters,
> I've the following setup (Ignite .NET 2.4): 2 Server nodes, 1 client 
> node doing SQL-queries on the cache periodically (every 20 Seconds in my 
> case).
> The cache is filled with 110_000 entries from a database, using 
> "LoadCache" method. Key is a string representation of a number, 
> nothing fancy here.
> Situation: Both server nodes are put under pressure by doing 
> affinity-run compute jobs on both nodes, affecting all cache entries 
> (read, change, put every entry).
> I made the following observations:
>   1.  Visorcmd showed that the entries were distributed like 60_000 on 
> one node and 34_000 on the other. The same sum (94_000) was shown on 
> the client side on every periodic "tick" when calling "GetSize" on the 
> cache instance 
> (https://github.com/apache/ignite/blob/master/modules/platforms/dotnet/Apache.Ignite.Core/Cache/ICache.cs#L685).
>      *   Why are there entries missing? Running SELECT Count(*) on the
> Cache with SQLLine reports back 110_000 entries.
>      *   Why are the entries not distributed 50/50 (or nearly 50/50)?
>   2.  On the client, the SQL query invoked on every "tick" returned 
> sometimes 110_000 entries, sometimes 60_000 or 34_000. There was no 
> error or warning in the client or server log about failing SQL queries.
>      *   In a partitioned cache both servers do a query and the results
> are merged, if I understood correctly. It seems to me that one of the 
> servers sometimes returns an empty result set and therefore the client 
> gets a result set that is too small. Question is: why does this happen 
> even without a warning on the server nodes about a failing query?
>   3.  In that situation the client is not able to load a specific 
> entry from the cache multiple times using TryGet(TK key, out TV value) 
> (https://github.com/apache/ignite/blob/master/modules/platforms/dotnet/Apache.Ignite.Core/Cache/ICache.cs#L297).
> Those entries definitely exist in the cache.
>   4.  In that situation on one of the two server nodes I get errors that 
> an entry could not be loaded (as in 3, but on the affinity server node!).
> In my understanding the compute jobs shall get executed on the primary 
> node for the given key. And this node is not able to load an entry by 
> that key (when under heavy CPU pressure)?
> Something is strange here. Any ideas?
> Cheers,
> Dome

Sent from: http://apache-ignite-users.70518.x6.nabble.com/
