Re: Killing a node under load stalls the grid with ignite 1.7

2016-11-06 Thread Vladislav Pyatkov
Hi,

It should not be done in the CacheStore implementation, but if you do not want to
rewrite the logic, do it asynchronously.
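
A minimal sketch of what "do it asynchronously" could look like: the method that plays the role of writeAll() only hands the batch to a separate single-threaded executor, and the trade lookup happens on that worker thread instead of on the write-behind flusher thread. Everything here is an assumption for illustration: AsyncOrderWriter is a hypothetical class, a plain Map stands in for the trade cache, and a list of strings stands in for the database.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not an Ignite class. */
class AsyncOrderWriter {
    /** Stand-in for the trade cache (assumption: keyed by order id). */
    private final Map<Integer, String> tradeCache;
    /** Stand-in for the database: statements in the order they were executed. */
    private final List<String> db = new CopyOnWriteArrayList<>();
    /** One worker thread, so batches are processed in submission order. */
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    AsyncOrderWriter(Map<Integer, String> tradeCache) {
        this.tradeCache = tradeCache;
    }

    /** Plays the role of writeAll(): enqueues the batch and returns immediately. */
    void writeAll(Map<Integer, String> orders) {
        // Copy the batch, since the caller may reuse the map after this call returns.
        Map<Integer, String> batch = new LinkedHashMap<>(orders);
        worker.submit(() -> {
            for (Map.Entry<Integer, String> e : batch.entrySet()) {
                db.add("ORDER " + e.getKey() + " " + e.getValue());
                // The lookup runs on the worker thread, not on the caller's thread.
                String trade = tradeCache.get(e.getKey());
                if (trade != null)
                    db.add("TRADE " + e.getKey() + " " + trade);
            }
        });
    }

    /** Stops accepting work, waits for queued batches, and returns what was written. */
    List<String> drain() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(10, TimeUnit.SECONDS))
                throw new IllegalStateException("worker did not finish in time");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return db;
    }
}
```

The trade-off of this design is that writeAll() returns before anything is persisted, so a failure on the worker thread is not reported back to the caller and would need its own handling.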

On Thu, Nov 3, 2016 at 6:35 PM, bintisepaha  wrote:

> The problem is that when I am in write-behind for an order, how do I access
> the trade object? It is only present in the cache. At that point I need to
> access the trade cache, and that is causing issues.



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-11-03 Thread bintisepaha
The problem is that when I am in write-behind for an order, how do I access
the trade object? It is only present in the cache. At that point I need to
access the trade cache, and that is causing issues.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8695.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-31 Thread Vladislav Pyatkov
Hi,

I mean that if you need to create the Order entry before the Trade, you can do it in
the CacheStore implementation, but do not use IgniteCache for this. Just write
inserts for both tables.

Why did this approach not work for you?

On Mon, Oct 31, 2016 at 4:55 PM, bintisepaha  wrote:

> Hi Vladislav,
>
> What you are describing above is not clear to me at all.
> Could you please elaborate?
>
> Thanks,
> Binti
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-
> grid-with-ignite-1-7-tp8130p8630.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-31 Thread Vladislav Pyatkov
Hi,

In the write-behind handler you need to write to the database only (you can
fill several tables if needed, for example "Order" and then "Trade").
A cache that has read-through on the "Trade" table will always read the value from
the database as long as the cache entry does not exist.
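
As a rough illustration of writing to the database only, parents before children: the sketch below assumes that each order value handed to the store already carries the trade data it needs, so no cache is consulted. OrderRow, TradeRow, SqlExecutor, OrderThenTradeWriter, and the ORDERS/TRADES table and column names are all hypothetical; SqlExecutor stands in for a JDBC connection so the sketch runs without a driver.

```java
import java.util.Collection;
import java.util.List;

/** Hypothetical child row. */
final class TradeRow {
    final int tradeId;

    TradeRow(int tradeId) {
        this.tradeId = tradeId;
    }
}

/** Hypothetical parent row that carries its child rows (an assumption of this sketch). */
final class OrderRow {
    final int orderId;
    final List<TradeRow> trades;

    OrderRow(int orderId, List<TradeRow> trades) {
        this.orderId = orderId;
        this.trades = trades;
    }
}

/** Hypothetical stand-in for a JDBC connection. */
interface SqlExecutor {
    void execute(String sql, Object... args);
}

/** Writes a batch to the database only; it never reads from or writes to a cache. */
final class OrderThenTradeWriter {
    private final SqlExecutor db;

    OrderThenTradeWriter(SqlExecutor db) {
        this.db = db;
    }

    void writeAll(Collection<OrderRow> orders) {
        // Parents first, so every ORDERS row exists before a TRADES row references it.
        for (OrderRow o : orders)
            db.execute("INSERT INTO ORDERS (ORDER_ID) VALUES (?)", o.orderId);
        // Children second; the foreign key on ORDER_ID can now be satisfied.
        for (OrderRow o : orders)
            for (TradeRow t : o.trades)
                db.execute("INSERT INTO TRADES (TRADE_ID, ORDER_ID) VALUES (?, ?)", t.tradeId, o.orderId);
    }
}
```

In a real store both loops would typically run inside one database transaction, so that a failed trade insert does not leave a partially written batch.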

On Thu, Oct 27, 2016 at 6:03 PM, bintisepaha  wrote:

> Yes, I think you are right. Is there any setting that we can use in
> write-behind that will not lock the entries?
> Our use case is as follows:
>
> Parent table - Order (Order Cache)
> Child Table - Trade (Trade Cache)
>
> We only have write-behind on the Order cache, and when writing it we write
> both the order and trade tables. So we query the trade cache from the order
> cache store's writeAll(), which is causing the above issue. We need to do this
> because we cannot write a trade to the database without writing its order,
> due to foreign key constraints and data integrity.
>
> Do you have any recommendations to solve this problem? We cannot use
> write-through. How do we make sure two tables are written in order if they
> are in separate caches?
>
> Thanks,
> Binti



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-27 Thread bintisepaha
Yes, I think you are right. Is there any setting that we can use in
write-behind that will not lock the entries?
Our use case is as follows:

Parent table - Order (Order Cache)
Child Table - Trade (Trade Cache)

We only have write-behind on the Order cache, and when writing it we write
both the order and trade tables. So we query the trade cache from the order
cache store's writeAll(), which is causing the above issue. We need to do this
because we cannot write a trade to the database without writing its order,
due to foreign key constraints and data integrity.

Do you have any recommendations to solve this problem? We cannot use
write-through. How do we make sure two tables are written in order if they
are in separate caches?

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8557.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-25 Thread Vladislav Pyatkov
Hi,

An incorrect CacheStore implementation is the most probable reason, because
the stored entry is locked. You need to avoid locking one entry while another
is being stored.

The code needs to be rewritten and re-checked; I think the issue will then be resolved.

On Tue, Oct 25, 2016 at 1:05 AM, bintisepaha  wrote:

> Hi, actually we use a lot of caches from cache store writeAll().
> For confirming if that is the cause of the grid stall, we would have to
> completely change our design.
>
> Can someone confirm that this is the cause of the grid stall? Does calling
> cache.get() from a cache store and then killing or bringing up nodes lead to
> a stall?
>
> When the grid is stalled, we see a node blocked on the flusher thread while
> doing a cache.get(). If we kill that node, the grid starts functioning again.
> But we would like to understand whether we are using write-behind incorrectly,
> or whether there are settings for rebalancing or write-behind that might save
> us from something like this.
>
> Thanks,
> Binti



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-24 Thread bintisepaha
Hi, actually we use a lot of caches from cache store writeAll().
For confirming if that is the cause of the grid stall, we would have to
completely change our design. 

Can someone confirm that this is the cause of the grid stall? Does calling
cache.get() from a cache store and then killing or bringing up nodes lead to
a stall?

When the grid is stalled, we see a node blocked on the flusher thread while
doing a cache.get(). If we kill that node, the grid starts functioning again.
But we would like to understand whether we are using write-behind incorrectly,
or whether there are settings for rebalancing or write-behind that might save
us from something like this.

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8449.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-21 Thread Vladislav Pyatkov
Hi,

Yes, please attach new dumps (taken without putting into a cache from the cache store).
That will narrow down the search for the cause.

On Fri, Oct 21, 2016 at 3:54 PM, bintisepaha  wrote:

> This was done to optimize our writes to the DB. On every save, we do not
> want to delete and insert records, so we do a digest comparison. Do you think
> this causes an issue? How does the cache store handle transactions or locks?
> When a node dies while a flusher thread is doing write-behind, how does that
> affect data rebalancing?
>
> If you could answer the above questions, it will give us more clarity.
>
> We are removing it now, but killing a node is still stalling the cluster.
> Will send the latest thread dumps to you today.
>
> Thanks,
> Binti



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-21 Thread bintisepaha
This was done to optimize our writes to the DB. On every save, we do not
want to delete and insert records, so we do a digest comparison. Do you think
this causes an issue? How does the cache store handle transactions or locks?
When a node dies while a flusher thread is doing write-behind, how does that
affect data rebalancing?
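
One possible way to keep a digest comparison without calling into an Ignite cache from the store is to hold the digests in a structure local to the store. The sketch below is only an illustration under that assumption; LocalDigestFilter and its methods are hypothetical names, the record is assumed to be available as a serialized string, and nothing in this thread confirms that the problem was resolved this way.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper that lives inside the store; it is not an Ignite class. */
final class LocalDigestFilter {
    /** Digest of the last successfully written record, per key. */
    private final Map<Integer, byte[]> lastWritten = new ConcurrentHashMap<>();

    /** True if the record is new or differs from what was last marked as written. */
    boolean isChanged(int key, String serializedRecord) {
        byte[] prev = lastWritten.get(key);
        return prev == null || !Arrays.equals(prev, sha256(serializedRecord));
    }

    /** Call only after the database write for this record has succeeded. */
    void markWritten(int key, String serializedRecord) {
        lastWritten.put(key, sha256(serializedRecord));
    }

    private static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

Digests kept this way are per node and are lost on restart or when a key moves to another node. In that case the record is simply written once more, so the database write has to tolerate rewriting an unchanged record.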

If you could answer the above questions, it will give us more clarity. 

We are removing it now, but killing a node is still stalling the cluster.
Will send the latest thread dumps to you today.

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8405.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-18 Thread Vladislav Pyatkov
Hi,

I have just seen:

  at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1214)
  at com.tudor.datagridI.server.writebehind.BasePersistentService.replaceDigest(BasePersistentService.java:302)
  at com.tudor.datagridI.server.writebehind.JdbcTradeOrderPersistentService.replaceDigest(JdbcTradeOrderPersistentService.java:291)
  at com.tudor.datagridI.server.writebehind.JdbcTradeOrderPersistentService.writeTradeOrders(JdbcTradeOrderPersistentService.java:74)
  at com.tudor.datagridI.server.cachestore.springjdbc.TradeOrderCacheStore.writeAll(TradeOrderCacheStore.java:238)
  at org.apache.ignite.internal.processors.cache.store.GridCacheWriteBehindStore.updateStore(GridCacheWriteBehindStore.java:685

Why are you updating cache entries from the cache store?

On Tue, Oct 18, 2016 at 4:41 AM, bintisepaha  wrote:

> This is a sample cache config. We have the same issue with on-heap settings
> too.
> Do you need something else?
>
> [Spring XML cache configuration; the markup was partially stripped in the
> archive. Surviving fragments, in their original order:]
>
> class="org.apache.ignite.configuration.CacheConfiguration">
> value="FULL_SYNC" />
> class="org.apache.ignite.cache.QueryEntity">
> value="com.tudor.datagridI.client.data.trading.OrderKey" />
> value="com.tudor.datagridI.client.data.trading.TradeOrder" />
> key="traderId" value="java.lang.Integer" />
> key="orderId" value="java.lang.Integer" />
> key="insIid" value="java.lang.Integer" />
> key="settlement" value="java.util.Date" />
> key="clearAgent" value="java.lang.String" />
> key="strategy" value="java.lang.String" />
> value="java.lang.Integer" />
> key="pvDate" value="java.util.Date" />
> key="linkId" value="java.lang.Integer" />
> class="org.apache.ignite.cache.QueryIndex">
> traderId
> orderId
> SORTED
> name="name" value="tradeOrder_key_index" />
> class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
> value="true" />
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-
> grid-with-ignite-1-7-tp8130p8334.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-14 Thread bintisepaha
Hi, I don't have a simple working example. But even under some load it's a
very reproducible problem. We had the same issue with Ignite 1.5.0-final as
well. We never used 1.6 as much, and now have the same issue with 1.7.0.

If you are able to reproduce it on your end, it will be really helpful.

Where do you see a lock between GridCacheWriteBehindStore and
GridCachePartitionExchangeManager?

Thanks,
Binti



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8302.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-14 Thread Vladislav Pyatkov
Hi Binti,

I cannot reproduce this issue.
Could you please provide the cache configuration?

On Thu, Oct 13, 2016 at 6:48 PM, vdpyatkov  wrote:

> Hi Binti,
>
> This looks like a lock between GridCacheWriteBehindStore and
> GridCachePartitionExchangeManager.
>
> Could you provide a working example of this?
> If not, I will try to reproduce it tomorrow.



-- 
Vladislav Pyatkov


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-13 Thread vdpyatkov
Hi Binti,

This looks like a lock between GridCacheWriteBehindStore and
GridCachePartitionExchangeManager.

Could you provide a working example of this?
If not, I will try to reproduce it tomorrow.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8273.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Killing a node under load stalls the grid with ignite 1.7

2016-10-08 Thread bintisepaha
Hi, could someone please look at this and respond?

Thanks, 
Binti 



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130p8158.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Killing a node under load stalls the grid with ignite 1.7

2016-10-06 Thread bintisepaha
Hi, we are using Ignite 1.7. Under some load, when caches are being updated
and write-behind is moving along, if we just kill a node, the entire grid
stalls. I am attaching thread dumps taken when the partitioned caches were in
FULL_SYNC mode and also when all were in FULL_ASYNC mode. It looks like it has
something to do with the exchange worker. We have a failureDetectionTimeout of
30 seconds on the server nodes. This is to keep the grid from stalling when we
have long major GC pauses. Even with all the G1GC settings, we are unable to
avoid major GCs, so we had to work around this with a longer
failureDetectionTimeout.

DevDump06Oct2016.zip

  

When there is no load, killing a node does not stall the grid.

On the client node when the grid stalls, we see this being logged
continuously.

U.warn(log, "Failed to wait for partition map exchange [" +
    "topVer=" + exchFut.topologyVersion() +
    ", node=" + cctx.localNodeId() + "]. " +
    "Dumping pending objects that might be the cause: ");

Thanks,
Binti




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Killing-a-node-under-load-stalls-the-grid-with-ignite-1-7-tp8130.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.