Re: And again... Failed to get page IO instance (page content is corrupted)

2018-07-05 Thread Olexandr K
Hi guys,

Are you still planning to release 2.6.0 as a patch release for 2.5.0?

BR, Oleksandr

On Fri, Jun 29, 2018 at 11:43 AM, Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Hi Oleg,
>
> Yes, page corruption issues shouldn't happen when persistence is
> disabled.
> Please let us know if you face one.
>
>
> On Fri, Jun 29, 2018 at 1:56 AM Olexandr K 
> wrote:
>
>> Hi Andrey,
>>
>> Thanks for clarifying this.
>> We have just a single persistent cache, and I reworked the code to get rid
>> of the expiration policy.
>> All our non-persistent caches have an expiration policy, but this should not
>> be a problem, right?
>>
>> BR, Oleksandr
>>
>> On Thu, Jun 28, 2018 at 8:37 PM, Andrey Mashenkov <
>> andrey.mashen...@gmail.com> wrote:
>>
>>> Hi Oleg,
>>>
>>> The issue you mentioned, IGNITE-8659 [1], is caused by IGNITE-5874 [2],
>>> which will not be part of the ignite-2.6 release.
>>> For now, 'ExpiryPolicy with persistence' is totally broken, and all its
>>> fixes are planned for the next 2.7 release.
>>>
>>>
>>> [1] https://issues.apache.org/jira/browse/IGNITE-8659
>>> [2] https://issues.apache.org/jira/browse/IGNITE-5874
>>>
>>> On Tue, Jun 26, 2018 at 11:26 PM Olexandr K <
>>> olexandr.kundire...@gmail.com> wrote:
>>>
>>>> Hi Andrey,
>>>>
>>>> I see Fix version 2.7 in Jira:
>>>> https://issues.apache.org/jira/browse/IGNITE-8659
>>>> This is a critical bug... bouncing a server node at the wrong time
>>>> causes a catastrophe.
>>>> This means no availability, in fact - I had to clean the data folders to
>>>> start my cluster after that.
>>>>
>>>> BR, Oleksandr
>>>>
>>>>
>>>> On Fri, Jun 22, 2018 at 4:06 PM, Andrey Mashenkov <
>>>> andrey.mashen...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We've found and fixed a few issues related to ExpiryPolicy usage.
>>>>> Most likely, your issue is [1], and the fix is planned for the Ignite 2.6 release.
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-8659
>>>>>
>>>>>
>>>>> On Fri, Jun 22, 2018 at 8:43 AM Olexandr K <
>>>>> olexandr.kundire...@gmail.com> wrote:
>>>>>
>>>>>> Hi Team,
>>>>>>
>>>>>> Issue is still there in 2.5.0
>>>>>>
>>>>>> Steps to reproduce:
>>>>>> 1) start 2 servers + 2 clients topology
>>>>>> 2) start load testing on client nodes
>>>>>> 3) stop server 1
>>>>>> 4) start server 1
>>>>>> 5) stop server 1 again when rebalancing is in progress
>>>>>> => data got corrupted here, see the error below
>>>>>> => we were not able to restart the Ignite cluster after that and had to
>>>>>> clean up the data folders...
>>>>>>
>>>>>> 2018-06-21 11:28:01.684 [ttl-cleanup-worker-#43] ERROR  - Critical
>>>>>> system error detected. Will be handled accordingly to configured handler
>>>>>> [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler,
>>>>>> failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class
>>>>>> o.a.i.IgniteException: Runtime failure on bounds: [lower=null,
>>>>>> upper=PendingRow [
>>>>>> org.apache.ignite.IgniteException: Runtime failure on bounds:
>>>>>> [lower=null, upper=PendingRow []]
>>>>>> at org.apache.ignite.internal.processors.cache.persistence.
>>>>>> tree.BPlusTree.find(BPlusTree.java:971)
>>>>>> ~[ignite-core-2.5.0.jar:2.5.0]
>>>>>> at org.apache.ignite.internal.processors.cache.persistence.
>>>>>> tree.BPlusTree.find(BPlusTree.java:950)
>>>>>> ~[ignite-core-2.5.0.jar:2.5.0]
>>>>>> at org.apache.ignite.internal.processors.cache.
>>>>>> IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1024)
>>>>>> ~[ignite-core-2.5.0.jar:2.5.0]
>>>>>> at org.apache.ignite.internal.processors.cache.
>>>>>> GridCacheTtlManager.expire(GridCacheTtlManager.java:197)
>>>>>> ~[ignite-core-2.5.0.jar:2.5.0]
>>>>>> at org.apache.ignite.internal.processors.cache.
>>>>>>
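
For reference, a minimal sketch of the rework described earlier in this
thread - dropping the ExpiryPolicy from the persistence-backed cache and
keeping expiry only on the in-memory caches (the cache names and region
name here are illustrative):

import java.util.UUID;
import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;
import org.apache.ignite.configuration.CacheConfiguration;

class CacheConfigs {
    // Persistent cache: no expiry policy until the IGNITE-8659 fixes land in 2.7.
    static CacheConfiguration<UUID, UUID> durable() {
        CacheConfiguration<UUID, UUID> ccfg = new CacheConfiguration<>("refreshTokenCache");
        ccfg.setDataRegionName("auth_durable_region"); // persistence-enabled region
        return ccfg;
    }

    // In-memory cache: an expiry policy is safe here.
    static CacheConfiguration<String, String> inMemory() {
        CacheConfiguration<String, String> ccfg = new CacheConfiguration<>("sessionCache");
        ccfg.setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(Duration.ONE_HOUR));
        return ccfg;
    }
}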

Re: And again... Failed to get page IO instance (page content is corrupted)

2018-06-26 Thread Olexandr K
Hi Andrey,

I see Fix version 2.7 in Jira:
https://issues.apache.org/jira/browse/IGNITE-8659
This is a critical bug... bouncing a server node at the wrong time causes
a catastrophe.
This means no availability, in fact - I had to clean the data folders to start my
cluster after that.

BR, Oleksandr


On Fri, Jun 22, 2018 at 4:06 PM, Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Hi,
>
> We've found and fixed a few issues related to ExpiryPolicy usage.
> Most likely, your issue is [1], and the fix is planned for the Ignite 2.6 release.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8659
>
>
> On Fri, Jun 22, 2018 at 8:43 AM Olexandr K 
> wrote:
>
>> Hi Team,
>>
>> Issue is still there in 2.5.0
>>
>> Steps to reproduce:
>> 1) start 2 servers + 2 clients topology
>> 2) start load testing on client nodes
>> 3) stop server 1
>> 4) start server 1
>> 5) stop server 1 again when rebalancing is in progress
>> => data got corrupted here, see the error below
>> => we were not able to restart the Ignite cluster after that and had to
>> clean up the data folders...
>>
>> 2018-06-21 11:28:01.684 [ttl-cleanup-worker-#43] ERROR  - Critical system
>> error detected. Will be handled accordingly to configured handler
>> [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler,
>> failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class
>> o.a.i.IgniteException: Runtime failure on bounds: [lower=null,
>> upper=PendingRow [
>> org.apache.ignite.IgniteException: Runtime failure on bounds:
>> [lower=null, upper=PendingRow []]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> tree.BPlusTree.find(BPlusTree.java:971) ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> tree.BPlusTree.find(BPlusTree.java:950) ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.
>> IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1024)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.
>> GridCacheTtlManager.expire(GridCacheTtlManager.java:197)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.
>> GridCacheSharedTtlCleanupManager$CleanupWorker.body(
>> GridCacheSharedTtlCleanupManager.java:137) [ignite-core-2.5.0.jar:2.5.0]
>> at 
>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>> [ignite-core-2.5.0.jar:2.5.0]
>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
>> Caused by: java.lang.IllegalStateException: Item not found: 2
>> at org.apache.ignite.internal.processors.cache.persistence.
>> tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:341)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> tree.io.AbstractDataPageIO.getDataOffset(AbstractDataPageIO.java:450)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> tree.io.AbstractDataPageIO.readPayload(AbstractDataPageIO.java:492)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:150)
>> ~[ignite-core-2.5.0.jar:2.5.0]
>> at org.apache.ignite.internal.processors.cache.persistence.
>> CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
>> ~[ignite-core-2.5.0.j
>>
>> BR, Oleksandr
>>
>> On Thu, Jun 14, 2018 at 2:51 PM, Olexandr K <
>> olexandr.kundire...@gmail.com> wrote:
>>
>>> Upgraded to 2.5.0 and didn't get such an error so far...
>>> Thanks!
>>>
>>> On Wed, Jun 13, 2018 at 4:58 PM, dkarachentsev <
>>> dkarachent...@gridgain.com> wrote:
>>>
>>>> It would be better to upgrade to 2.5, where it is fixed.
>>>> But if you want to work around this issue in your version, you need to add
>>>> the ignite-indexing dependency to your classpath and configure SQL indexes.
>>>> For example [1], just modify it to work with Spring in XML:
>>>>
>>>> <property name="indexedTypes">
>>>>     <list>
>>>>         <value>org.your.KeyObject</value>
>>>>         <value>org.your.ValueObject</value>
>>>>     </list>
>>>> </property>
>>>>
>>>> [1]
>>>> https://apacheignite-sql.readme.io/docs/schema-and-indexes#section-registering-indexed-types
>>>>
>>>> Thanks!
>>>> -Dmitry
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>>
>>>
>>>
>>
>
> --
> Best regards,
> Andrey V. Mashenkov
>


Re: And again... Failed to get page IO instance (page content is corrupted)

2018-06-21 Thread Olexandr K
Hi Team,

Issue is still there in 2.5.0

Steps to reproduce:
1) start 2 servers + 2 clients topology
2) start load testing on client nodes
3) stop server 1
4) start server 1
5) stop server 1 again when rebalancing is in progress
=> data got corrupted here, see the error below
=> we were not able to restart the Ignite cluster after that and had to
clean up the data folders...

2018-06-21 11:28:01.684 [ttl-cleanup-worker-#43] ERROR  - Critical system
error detected. Will be handled accordingly to configured handler
[hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler,
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class
o.a.i.IgniteException: Runtime failure on bounds: [lower=null,
upper=PendingRow [
org.apache.ignite.IgniteException: Runtime failure on bounds: [lower=null,
upper=PendingRow []]
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:971)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:950)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1024)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:197)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:137)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[ignite-core-2.5.0.jar:2.5.0]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
Caused by: java.lang.IllegalStateException: Item not found: 2
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:341)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset(AbstractDataPageIO.java:450)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.readPayload(AbstractDataPageIO.java:492)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:150)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
~[ignite-core-2.5.0.j

BR, Oleksandr

On Thu, Jun 14, 2018 at 2:51 PM, Olexandr K 
wrote:

> Upgraded to 2.5.0 and didn't get such an error so far...
> Thanks!
>
> On Wed, Jun 13, 2018 at 4:58 PM, dkarachentsev  > wrote:
>
>> It would be better to upgrade to 2.5, where it is fixed.
>> But if you want to work around this issue in your version, you need to add
>> the ignite-indexing dependency to your classpath and configure SQL indexes.
>> For example [1], just modify it to work with Spring in XML:
>>
>> <property name="indexedTypes">
>>     <list>
>>         <value>org.your.KeyObject</value>
>>         <value>org.your.ValueObject</value>
>>     </list>
>> </property>
>>
>> [1]
>> https://apacheignite-sql.readme.io/docs/schema-and-indexes#section-registering-indexed-types
>>
>> Thanks!
>> -Dmitry
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>
>


Re: No writes to WAL files for the whole day?

2018-06-21 Thread Olexandr K
Yes, you are right: the hashes for 00.wal and 01.wal are different before/after
the load test
(the last-modified-time is still showing June 15th).

BEFORE

PS F:\ignite-wal\V_HP_LK_DCN01> get-filehash .\*

Algorithm   Hash
---------   ----
SHA256      40A373D58ADF4C9E15B3F0969ED3B5E4D52AC24BC807AD9C8EE4F92982B71925
SHA256      DE2B96517256C844524CD9A1C4ED8EDE18FA92440C348A762DE96A3CA6AAFC89
SHA256      9077E6ED38B77E142C65AD4125717A7630D90FC402F0CDFBC7FBF95660C21016
...

PS F:\ignite-wal\V_HP_LK_DCN01> ls

Mode    LastWriteTime        Length    Name
----    -------------        ------    ----
-a--- 6/15/2018   2:21 PM   67108864 .wal
-a--- 6/15/2018   3:27 PM   67108864 0001.wal
-a--- 6/15/2018   3:44 PM   67108864 0002.wal
...

AFTER

PS F:\ignite-wal\V_HP_LK_DCN01> get-filehash .\*

Algorithm   Hash
---------   ----
SHA256      B7CA93C0FEEE94CD60C468F615E2C16B009DF13AD3F799DAC5A8CAB1A6E3438D
SHA256      E3290E24F2C7F9967A6FD800466ECE6382BBC969274C6C1B49A1119C2B3ECE29
SHA256      9077E6ED38B77E142C65AD4125717A7630D90FC402F0CDFBC7FBF95660C21016
...

PS F:\ignite-wal\V_HP_LK_DCN01> ls

Mode    LastWriteTime        Length    Name
----    -------------        ------    ----
-a--- 6/15/2018   2:21 PM   67108864 .wal
-a--- 6/15/2018   3:27 PM   67108864 0001.wal
-a--- 6/15/2018   3:44 PM   67108864 0002.wal
...


On Wed, Jun 20, 2018 at 1:12 PM, dkarachentsev 
wrote:

> I suppose that is an issue with updating timestamps, rather than with WAL writes.
> Try to run a load test and compare the hash sums of the files before and
> after the test. Also check whether the WAL history grows.
>
> Thanks!
> -Dmitry
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: "Connect timed out" errors during cluster restart

2018-06-20 Thread Olexandr K
got it, thanks

On Wed, Jun 20, 2018 at 11:57 AM, dkarachentsev 
wrote:

> Hi Oleksandr,
>
> It's OK for discovery, and this message is printed only in debug mode:
>
> if (log.isDebugEnabled())
> log.error("Exception on direct send: " +
> e.getMessage(),
> e);
>
> Just turn off debug logging for discovery package:
> org.apache.ignite.spi.discovery.tcp.
>
> Thanks!
> -Dmitry
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: No writes to WAL files for the whole day?

2018-06-20 Thread Olexandr K
The WAL mode is the default one; all caches are FULL_ASYNC.

...and now I see one of my Ignite servers updated its WAL files, but the other
one still shows the 19th as the last-modification-time.
It looks like the WAL files are updated once per day.

Here is my data storage config:

[data storage configuration XML stripped by the mailing list archive]
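
A minimal sketch of pinning the WAL mode and paths explicitly instead of
relying on the defaults (the paths are illustrative, matching the listings
in this thread):

import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.WALMode;

IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storage = new DataStorageConfiguration();
storage.setWalMode(WALMode.LOG_ONLY);       // written on commit; fsync deferred to checkpoints
storage.setWalPath("F:/ignite-wal");        // illustrative
storage.setWalArchivePath("F:/ignite-wal-archive");
cfg.setDataStorageConfiguration(storage);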

On Wed, Jun 20, 2018 at 11:51 AM, dkarachentsev 
wrote:

> Hi,
>
> What is your configuration? Check WAL mode and path to persistence.
>
> Thanks!
> -Dmitry
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


No writes to WAL files for the whole day?

2018-06-20 Thread Olexandr K
Hi Ignite team,

I noticed that nothing is written into WAL files even on Ignite restart.

My testing steps:

1) bounce application and ignite cluster
2) perform load testing
3) bounce application and ignite cluster
4) check ignite files:

the data files have a recent modification time - OK

the latest WAL file was modified yesterday... Why? Is this expected / explainable?

PS F:\ignite-wal\V_HP_LK_DCN01> ls
Mode    LastWriteTime        Length    Name
----    -------------        ------    ----
-a--- 6/15/2018   2:21 PM   67108864 .wal
-a--- 6/15/2018   3:27 PM   67108864 0001.wal
-a--- 6/15/2018   3:44 PM   67108864 0002.wal
-a--- 6/16/2018   1:18 AM   67108864 0003.wal
-a--- 6/18/2018   6:35 PM   67108864 0004.wal
-a--- 6/18/2018   9:30 PM   67108864 0005.wal
-a--- 6/19/2018  12:05 AM   67108864 0006.wal
-a--- 6/19/2018  12:49 AM   67108864 0007.wal
-a--- 6/19/2018   1:42 AM   67108864 0008.wal
-a--- 6/19/2018   2:50 PM   67108864 0009.wal

BR, Oleksandr


"Connect timed out" errors during cluster restart

2018-06-19 Thread Olexandr K
 Hi Igniters,

I'm getting "connect timed out" errors on each cluster restart.
The errors are logged ~10 times before cluster activation.
Everything works fine after that.
They look like false alarms... it seems nodes are trying to connect to
each other when they are not UP yet.
Why is this logged at ERROR level?

I have SSL=on and Authentication=on
Topology: 2 server nodes
Ignite version: 2.5.0
OS: Windows Server 2012 R2

I have a custom discovery SPI implementation, but it just injects auth
credentials on local node init, nothing more (it extends TcpDiscoverySpi).

Sample error:

2018-06-19 19:43:29.347 [main] ERROR com.xxx.lk.ignite.AuthTcpDiscoverySpi
- Exception on direct send: connect timed out
java.net.SocketTimeoutException: connect timed out
at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
~[?:1.8.0_162]
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
~[?:1.8.0_162]
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
~[?:1.8.0_162]
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
~[?:1.8.0_162]
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172) ~[?:1.8.0_162]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_162]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_162]
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:673)
~[?:1.8.0_162]
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1450)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1413)
~[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.sendMessageDirectly(ServerImpl.java:1199)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.sendJoinRequestMessage(ServerImpl.java:1046)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:890)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1720)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1033)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
[ignite-core-2.5.0.jar:2.5.0]
at
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693)
[ignite-core-2.5.0.jar:2.5.0]
at org.apache.ignite.Ignition.start(Ignition.java:352)
[ignite-core-2.5.0.jar:2.5.0]
at com.xxx.lk.ignite.WindowsService.start(WindowsService.java:45)
[lk.ignite_auth-1.0.0.jar:?]
2018-06-19 19:43:29.362 [main] DEBUG com.xxx.lk.ignite.AuthTcpDiscoverySpi
- Failed to send join request message [addr=v-hp-lk-dcn02.xxxgroup.tek.loc/
10.2.0.252:47502, msg=connect timed out]


[ignite-server.xml]

[Spring XML configuration stripped by the mailing list archive; the surviving
discovery address list is:]

v-hp-lk-dcn01.xxxgroup.tek.loc:47500..47504
v-hp-lk-dcn02.xxxgroup.tek.loc:47500..47504

BR, Oleksandr


Re: Baseline topology issue when restarting server nodes one by one

2018-06-14 Thread Olexandr K
No, it is reconnecting fine.
I just wondered why the client node was kicked out of the cluster while we
still had one of the server nodes available.
It is clear after your explanation.

Just curious - why is the client node connected to one server node and not to
both?
All my caches are partitioned, and the client should go to server A or B
depending on which partition is resolved for the key, right?
Did you mean that if client C is using server A as a gateway, all traffic
goes through server A?
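
For what it's worth, the key-to-primary-node mapping can be checked from
code via the affinity API - a sketch, assuming an Ignite instance named
`ignite` and an illustrative cache name:

import java.util.UUID;
import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.ClusterNode;

class WhereIsMyKey {
    static void printPrimary(Ignite ignite, UUID key) {
        // Affinity routes each key to its primary/backup server nodes;
        // the discovery "gateway" connection is a separate concern.
        ClusterNode primary = ignite.affinity("refreshTokenCache").mapKeyToNode(key);
        System.out.println("Primary node: " + primary.consistentId());
    }
}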


On Wed, Jun 13, 2018 at 4:43 PM, Stanislav Lukyanov 
wrote:

> On the client disconnection:
>
> The client is connected to one of the server nodes, using it as a sort of
> gateway to the cluster.
>
> If the gateway server fails, the client is supposed to attempt to reconnect
> to the other IPs.
>
> Right now I can’t say for sure whether it does so indefinitely, or has
> some timeout for it.
>
> Perhaps you’ve changed networkTimeout or something, and after that it
> doesn’t reconnect?
>
>
>
> On the IDs in Visor:
>
> Somewhat confusingly, ConsistentID is NOT NodeID.
>
> Visor shows NodeID and not ConsistentID.
>
>
>
> NodeID is regenerated each time a node restarts and it is always a UUID.
>
> ConsistentID is a UUID by default, but it doesn’t have to be the same as
> NodeID and doesn’t have to have a form of UUID – any string works (and even
> any Object with an idempotent toString(), but tsss - don’t tell anyone!).
>
>
>
> Stan
>
>
>
> *From: *Olexandr K 
> *Sent: *13 June 2018 0:58
> *To: *user@ignite.apache.org
> *Subject: *Re: Baseline topology issue when restarting server nodes one
> by one
>
>
>
> I configured a ConsistentId equal to the hostname for each node, and this issue
> is not reproduced anymore.
>
>
>
> One more strange behaviour I noticed is that one of the client nodes gets
> disconnected after one of the server nodes goes down.
>
> I have reconnect logic in place, so it comes back later, but is such
> behaviour expected?
>
> Not sure whether it is related to consistent IDs, but I didn't see it
> earlier...
>
>
>
> BTW, after configuring consistent IDs I see them in "control.bat
> --baseline" output only.
>
> Visor output and server logs still show generated IDs
>
> That looks confusing...
>
>
>
> [control.bat --baseline output]
>
> Cluster state: active
> Current topology version: 7
> Baseline nodes:
> ConsistentID=V-HP-LK-DCN01, STATE=ONLINE
> ConsistentID=V-HP-LK-DCN02, STATE=ONLINE
>
>
>
> [visor top output]
>
> 9871EAFF(@n0) | Server
>
> BBA63A1F(@n2) | Server
>
> 1DEDB701(@n1) | Client
>
> 5931AF53(@n3) | Client
>
>
>
> [server log extracts]
>
> logs\v-hp-lk-dcn01\ignite.log:383:>>> Local node
> [ID=9871EAFF-73AF-4E2E-99A7-8F5DF58A3C40, order=1, clientMode=false]
> logs\v-hp-lk-dcn02\ignite.log:274:>>> Local node
> [ID=BBA63A1F-559E-461C-B7ED-B10CE3DE33CC, order=7, clientMode=false]
>
>
>
>
>
> On Tue, Jun 12, 2018 at 9:48 PM, Olexandr K 
> wrote:
>
> Hi, Dmitry
>
>
>
> server nodes start with ignite-server.xml and client nodes with
> ignite-client.xml
>
> server node hosts: v-hp-lk-dcn01, v-hp-lk-dcn02
>
> [ignite-server.xml and ignite-client.xml Spring configurations stripped by
> the mailing list archive; surviving fragments show an SSL keystore
> (s-hp-fs01\\dev$\\config\\keystore.jks), log4j2 logging, an
> "auth_durable_region" data region, and FULL_ASYNC cache write mode; the
> message is truncated here]
Re: And again... Failed to get page IO instance (page content is corrupted)

2018-06-14 Thread Olexandr K
Upgraded to 2.5.0 and didn't get such an error so far...
Thanks!

On Wed, Jun 13, 2018 at 4:58 PM, dkarachentsev 
wrote:

> It would be better to upgrade to 2.5, where it is fixed.
> But if you want to work around this issue in your version, you need to add
> the ignite-indexing dependency to your classpath and configure SQL indexes.
> For example [1], just modify it to work with Spring in XML:
>
> <property name="indexedTypes">
>     <list>
>         <value>org.your.KeyObject</value>
>         <value>org.your.ValueObject</value>
>     </list>
> </property>
>
> [1]
> https://apacheignite-sql.readme.io/docs/schema-and-indexes#section-registering-indexed-types
>
> Thanks!
> -Dmitry
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: And again... Failed to get page IO instance (page content is corrupted)

2018-06-13 Thread Olexandr K
I'm using the key/value API.
Should I define an index for the key explicitly?
This sounds strange... Can you give a sample of how I can do this in XML,
please?

Here is one of my cache configurations.
Actually, I'm storing a UUID per UUID here: IgniteCache<UUID, UUID>

[cache configuration XML stripped by the mailing list archive]
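
A sketch of the workaround Andrey describes, in code rather than XML (the
cache name is illustrative, and ignite-indexing must be on the classpath;
the method is setIndexedTypes):

import java.util.UUID;
import org.apache.ignite.configuration.CacheConfiguration;

CacheConfiguration<UUID, UUID> ccfg = new CacheConfiguration<>("refreshTokenCache");
// Registering the key/value classes as indexed types creates an SQL index,
// which forces the safe (non-instant) partition eviction path.
ccfg.setIndexedTypes(UUID.class, UUID.class);
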
On Wed, Jun 13, 2018 at 12:18 PM, Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Hi,
>
> Possibly, it is a bug in a partition eviction optimization. Ignite can skip
> the partition eviction procedure and remove a partition instantly if there
> are no indexes.
>
> If so, you can try the latest ignite-2.5 version [1], or
> as a workaround you can add an index by configuring a QueryEntity or by
> calling cacheCfg.setIndexedTypes().
>
>
> [1] https://ignite.apache.org/download.cgi#binaries
>
> On Wed, Jun 13, 2018 at 1:22 AM, Oleks K 
> wrote:
>
>> Hi guys,
>>
>> I got similar errors in 2.4.0
>>
>> First:
>>
>> org.apache.ignite.IgniteException: Runtime failure on bounds:
>> [lower=null,
>> upper=PendingRow []]
>>   --> Caused by: java.lang.IllegalStateException: Failed to get page IO
>> instance (page content is corrupted)
>>
>> Then lots of:
>>
>> org.apache.ignite.IgniteException: Runtime failure on bounds
>>   --> Caused by: java.lang.IllegalStateException: Item not found: 3
>>
>> This was reproduced when I started and stopped server nodes under load
>> Topology: 2 server and 2 client nodes
>> Java: 1.8.0_162
>> OS: Windows Server 2012 R2 6.3 amd64
>>
>> Cache config:
>> <property name="dataRegionName" value="auth_durable_region"/>
>> <property name="writeSynchronizationMode" value="FULL_ASYNC"/>
>> [remaining cache configuration XML stripped by the mailing list archive]
>>
>> Ignite team, can you comment on this please?
>> How critical is the issue? What is the impact?
>> Any workarounds? Fix planned?
>>
>> 2018-06-13 00:22:30.978 [exchange-worker-#42] INFO
>> org.apache.ignite.internal.processors.cache.distributed.dht.
>> preloader.GridDhtPartitionDemander
>> - Starting rebalancing [mode=ASYNC,
>> fromNode=bdddfe24-aab3-46fa-9452-efe933783adb, partitionsCount=787,
>> topology=AffinityTopologyVersion [topVer=5, minorTopVer=0], updateSeq=12]
>> 2018-06-13 00:22:31.594 [ttl-cleanup-worker-#52] ERROR
>> org.apache.ignite.internal.processors.cache.GridCacheSharedT
>> tlCleanupManager
>> - Runtime error caught during grid runnable execution: GridWorker
>> [name=ttl-cleanup-worker, igniteInstanceName=null, finished=false,
>> hashCode=473353699, interrupted=false, runner=ttl-cleanup-worker-#52]
>> org.apache.ignite.IgniteException: Runtime failure on bounds:
>> [lower=null,
>> upper=PendingRow []]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree
>> .BPlusTree.find(BPlusTree.java:963)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree
>> .BPlusTree.find(BPlusTree.java:942)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.IgniteCacheOffhe
>> apManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:974)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.GridCacheTtlMana
>> ger.expire(GridCacheTtlManager.java:197)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.GridCacheSharedT
>> tlCleanupManager$CleanupWorker.body(GridCacheSh
>> aredTtlCleanupManager.java:129)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWo
>> rker.java:110)
>> [ignite-core-2.4.0.jar:2.4.0]
>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
>> Caused by: java.lang.IllegalStateException: Failed to get page IO
>> instance
>> (page content is corrupted)
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree
>> .io.IOVersions.forVersion(IOVersions.java:83)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree
>> .io.IOVersions.forPage(IOVersions.java:95)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.Cach
>> eDataRowAdapter.initFromLink(CacheDataRowAdapter.java:148)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.Cach
>> eDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.tree.PendingRow.
>> initKey(PendingRow.java:72)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.tree.PendingEntr
>> iesTree.getRow(PendingEntriesTree.java:118)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.tree.PendingEntr
>> 

Re: Baseline topology issue when restarting server nodes one by one

2018-06-12 Thread Olexandr K
I configured a ConsistentId equal to the hostname for each node, and this issue
is not reproduced anymore.

One more strange behaviour I noticed is that one of the client nodes gets
disconnected after one of the server nodes goes down.
I have reconnect logic in place, so it comes back later, but is such
behaviour expected?
Not sure whether it is related to consistent IDs, but I didn't see it
earlier...

BTW, after configuring consistent IDs I see them in "control.bat
--baseline" output only.
Visor output and server logs still show generated IDs
That looks confusing...

[control.bat --baseline output]
Cluster state: active
Current topology version: 7
Baseline nodes:
ConsistentID=V-HP-LK-DCN01, STATE=ONLINE
ConsistentID=V-HP-LK-DCN02, STATE=ONLINE

[visor top output]
9871EAFF(@n0) | Server
BBA63A1F(@n2) | Server
1DEDB701(@n1) | Client
5931AF53(@n3) | Client

[server log extracts]
logs\v-hp-lk-dcn01\ignite.log:383:>>> Local node
[ID=9871EAFF-73AF-4E2E-99A7-8F5DF58A3C40, order=1, clientMode=false]
logs\v-hp-lk-dcn02\ignite.log:274:>>> Local node
[ID=BBA63A1F-559E-461C-B7ED-B10CE3DE33CC, order=7, clientMode=false]


On Tue, Jun 12, 2018 at 9:48 PM, Olexandr K 
wrote:

> Hi, Dmitry
>
> server nodes start with ignite-server.xml and client nodes with
> ignite-client.xml
> server node hosts: v-hp-lk-dcn01, v-hp-lk-dcn02
>
> [ignite-server.xml and ignite-client.xml Spring configurations stripped by
> the mailing list archive; surviving fragments show an SSL keystore
> (s-hp-fs01\\dev$\\config\\keystore.jks), log4j2 logging, an
> "auth_durable_region" data region, FULL_ASYNC cache write mode, and
> discovery addresses v-hp-lk-dcn01:47500..47504, v-hp-lk-dcn02:47500..47504;
> the message is truncated here]

Re: Baseline topology issue when restarting server nodes one by one

2018-06-12 Thread Olexandr K
Hi, Dmitry

server nodes start with ignite-server.xml and client nodes with
ignite-client.xml
server node hosts: v-hp-lk-dcn01, v-hp-lk-dcn02


[ignite-server.xml and ignite-client.xml Spring configurations stripped by
the mailing list archive; the surviving discovery address lists are
v-hp-lk-dcn01:47500..47504 and v-hp-lk-dcn02:47500..47504]

On Tue, Jun 12, 2018 at 7:03 PM, dkarachentsev 
wrote:

> Hi,
>
> What IgniteConfiguration do you use? Could you please share it?
>
> Thanks!
> -Dmitry
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Baseline topology issue when restarting server nodes one by one

2018-06-12 Thread Olexandr K
Hi Ignite team,

I'm facing a baseline topology issue.
Here are my testing steps:

1) start 2 server (A, B) and 2 client nodes (C, D)
2) ensure baseline topology consists of 2 server nodes
3) stop server node A
4) start server node A
5) stop server node B
... oops, it cannot be started anymore

ERROR: Caused by: org.apache.ignite.spi.IgniteSpiException: Node with set
up BaselineTopology is not allowed to join cluster without one:
c53de0cb-32de-4d9c-be08-a3da7fc35e6f

... but c53de0cb-32de-4d9c-be08-a3da7fc35e6f is actually node B.
It looks to have started with a different ID and cannot join the cluster.

Is it possible to set consistent IDs equal to the host names? How can I do
this?
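
A minimal sketch of what was eventually done in this thread - setting the
consistent ID to the host name (any stable Serializable value works):

import java.net.InetAddress;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class StartWithConsistentId {
    public static void main(String[] args) throws Exception {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // The host name keeps "control.bat --baseline" output human-readable.
        cfg.setConsistentId(InetAddress.getLocalHost().getHostName());
        Ignition.start(cfg);
    }
}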

Ignite version: 2.4.0

control.bat --baseline
Cluster state: active
Current topology version: 10

Baseline nodes:
ConsistentID=c53de0cb-32de-4d9c-be08-a3da7fc35e6f, STATE=OFFLINE
ConsistentID=f62816f4-2889-4e2e-85d4-515daed9cb4c, STATE=ONLINE


Re: Cache operations hanging for a minute when one of server nodes goes down

2018-06-12 Thread Olexandr K
Hi Stan,

I spent half a day but was not able to find such a balanced configuration.

I followed your second piece of advice, and everything looks good now:

"I'd suggest to use an ExecutorService to call put()/putAsync(), getting a
cancelable Future from the start."

I'm doing all Ignite calls via a dedicated thread pool and controlling the
max call time via the future.
Calls still hang for 5-60 seconds after server nodes go up/down, but this
happens in the cache pool and does not affect the whole system.

I'm just handling this as cache misses on the application side, as in the
sketch below.
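
A minimal sketch of that pattern - a dedicated pool plus a bounded wait,
with a timeout handled as a cache miss (the pool size and the 2-second
budget are illustrative):

import java.util.UUID;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.ignite.IgniteCache;

class BoundedCacheCalls {
    private final ExecutorService cachePool = Executors.newFixedThreadPool(16);

    /** Returns the cached value, or null (a miss) if the call exceeds the budget. */
    UUID getOrMiss(IgniteCache<UUID, UUID> cache, UUID key) {
        Future<UUID> fut = cachePool.submit(() -> cache.get(key));
        try {
            return fut.get(2, TimeUnit.SECONDS); // bound the wait instead of blocking forever
        } catch (TimeoutException | ExecutionException e) {
            fut.cancel(true);
            return null; // handled as a cache miss by the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            fut.cancel(true);
            return null;
        }
    }
}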

Thanks


On Mon, Jun 11, 2018 at 7:27 PM, Stanislav Lukyanov 
wrote:

> When a node joins, the cluster needs to perform the partition map exchange
> process. If this process takes too long, the cluster may become
> unresponsive.
>
> Looks like this is what happened in your case. You can check how long an
> exchange took by looking for “Started exchange” and “Finished exchange” in
> the logs – I assume it’s around 20 seconds.
>
>
>
> Debugging hanged partition map exchange issues may be pretty tricky.
>
> My best guess so far is that the reduced timeouts you’ve set resulted in
> failed network IO (e.g. instead of waiting for a message for 10s and
> getting it on the first try, you retry every 3s until a fast enough
> delivery happens – which might be the 10th or 20th attempt).
>
> Try changing timeouts back and see how long your exchanges take on a node
> join. Perhaps some value will be low enough to detect node failures and
> high enough to allow regular operations to pass.
>
> If that doesn’t help, please share the full logs from all nodes.
>
>
>
> Thanks,
>
> Stan
>
>
>
> *From: *Olexandr K 
> *Sent: *11 June 2018 14:24
> *To: *user@ignite.apache.org
> *Subject: *Re: Cache operations hanging for a minute when one of server
> nodesgoesdown
>
>
>
> Hi Stan,
>
>
>
> I tried to decrease the network/failure timeouts, and it worked fine when the
> node stopped.
>
> Unfortunately, I got lots of hung calls when it started again.
>
> At that time all cache calls got stuck for 25-30 seconds.
>
> Is this expected? I thought rebalancing should occur in the background and the
> node should join the cluster when it is 100% ready, no?
>
> See some log extracts below.
>
>
>
>
>
>
>
>
> -- CACHE CALL STARTED: cache.put()
>
>
>
> 83837380032 2018-06-11 13:51:54.777 [https-jsse-nio-8080-exec-4] INFO
> com.xxx.lk.backend.cache.impl.RefreshTokenCache - Store:
> d289a0a3-bca6-49a4-ae9d-9568517d656e
>
>
>
> -- MANY WARNINGS BEFORE CALL COMPLETED
>
>
> 2018-06-11 13:51:55.058 [grid-timeout-worker-#4] DEBUG
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor -
> Timeout has occurred [obj=CancelableTask 
> [id=fe9e57ee361-ba3899f9-a10b-4498-98e6-ed3b65dfc3f8,
> endTime=1528714315053, period=3000, cancel=false, task=org.apache.ignite.
> internal.processors.query.GridQueryProcessor$2@2649758c], process=true]
> 2018-06-11 13:51:55.355 [grid-timeout-worker-#4] DEBUG
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor -
> Timeout has occurred [obj=CancelableTask 
> [id=ee9e57ee361-ba3899f9-a10b-4498-98e6-ed3b65dfc3f8,
> endTime=1528714315351, period=2000, cancel=false, task=org.apache.ignite.
> internal.processors.query.h2.IgniteH2Indexing$13@110ac52b], process=true]
> 2018-06-11 13:51:56.339 [nio-acceptor-#5] DEBUG org.apache.ignite.spi.
> communication.tcp.TcpCommunicationSpi - Balancing data [min0=0, minIdx=0,
> max0=-1, maxIdx=-1]
>
> 2018-06-11 13:51:57.074 [exchange-worker-#18] WARN
> org.apache.ignite.internal.diagnostic - Failed to wait for partition map
> exchange [topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0],
> node=ba482197-64cc-4d84-81f7-2b58f0c66a0c]. Dumping pending objects that
> might be the cause:
>  2018-06-11 13:51:57.074 [exchange-worker-#18] WARN
> org.apache.ignite.internal.diagnostic - Ready affinity version:
> AffinityTopologyVersion [topVer=5, minorTopVer=0]
>  2018-06-11 13:51:57.074 [exchange-worker-#18] WARN
> org.apache.ignite.internal.diagnostic - Last exchange future:
> GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent
> [evtNode=TcpDiscoveryNode [id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f,
> addrs=[10.2.0.163, 127.0.0.1, 30.251.106.199], sockAddrs=[v-hp-lk-dcn01.
> xxxgroup.tek.loc/10.2.0.163:47500, /127.0.0.1:47500, /30.251.106.199:47500],
> discPort=47500, order=6, intOrder=5, lastExchangeTime=1528714313636,
> loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=6,
> nodeId8=ba482197, msg=Node joined: TcpDiscoveryNode
> [id=f5fbdfd8-8df1-4222-b4e5-2d12f42dd95f, addrs=[10.2.0.163, 127.0.0.1,
> 30.251

Re: Cache operations hanging for a minute when one of server nodes goes down

2018-06-11 Thread Olexandr K
...I'd suggest to use an ExecutorService to call put()/putAsync(), getting a cancelable
> Future from the start.
>
>
>
> Thanks,
>
> Stan
>
>
>
> *From: *Olexandr K 
> *Sent: *11 June 2018 2:44
> *To: *user@ignite.apache.org
> *Subject: *Cache operations hanging for a minute when one of server nodes
> goesdown
>
>
>
> Hi Igniters,
>
>
>
> I'm testing our system for availability.
>
> It uses Ignite as a persistent key/value cache.
>
>
>
> Here is my test:
>
> 1) start 2 server and 2 client nodes
>
> 2) run heavy load on client nodes (some application logic which causes
> cache calls)
>
> 3) stop 1 server node
>
>
>
> Here I expect all in-progress cache operations targeted at server node 1
> to fail fast.
>
> What I don't want is to hang all my processing threads for a significant
> time.
>
> Unfortunately, it works exactly that way: I'm constantly getting my threads
> blocked for 20-80 seconds.
>
> Eventually putAsync() completes successfully, but I'd prefer the cache operation
> to fail fast. I don't want to hang all processing threads for a minute
> because of the cache.
>
>
>
> It works the same for put() and putAsync() calls.
>
>
>
> As I see in the code, it can be fixed by calling future.get(timeout)
> instead of future.get() in TcpCommunicationSpi.
>
> The timeout should be configurable.
>
>
>
> TcpCommunicationSpi (line: 2799)
>
> private GridCommunicationClient reserveClient(ClusterNode node, int connIdx) {
>     ...
>     client = fut.get();
>
>
>
> Does it make sense from your point of view?
>
>
>
> Here is my thread  dump:
>
>
>
> threads=[
>   {
> threadName=https-jsse-nio-8080-exec-20,
> threadId=102,
> blockedTime=-1,
> blockedCount=0,
> waitedTime=-1,
> waitedCount=5,
> lockName=null,
> lockOwnerId=-1,
> lockOwnerName=null,
> inNative=false,
> suspended=false,
> threadState=WAITING,
> stackTrace=[
>   {
> methodName=park,
> fileName=Unsafe.java,
> lineNumber=-2,
> className=sun.misc.Unsafe,
> nativeMethod=true
>   },
>   {
> methodName=park,
> fileName=LockSupport.java,
> lineNumber=304,
> className=java.util.concurrent.locks.LockSupport,
> nativeMethod=false
>   },
>   {
> methodName=get0,
> fileName=GridFutureAdapter.java,
> lineNumber=177,
> className=org.apache.ignite.internal.util.future.
> GridFutureAdapter,
> nativeMethod=false
>   },
>   {
> methodName=get,
> fileName=GridFutureAdapter.java,
> lineNumber=140,
> className=org.apache.ignite.internal.util.future.
> GridFutureAdapter,
> nativeMethod=false
>   },
>   {
> methodName=reserveClient,
> fileName=TcpCommunicationSpi.java,
> lineNumber=2799,
> className=org.apache.ignite.spi.communication.tcp.
> TcpCommunicationSpi,
> nativeMethod=false
>   },
>
> 
>
>   {
> methodName=putAsync,
> fileName=IgniteCacheProxyImpl.java,
> lineNumber=1035,
> className=org.apache.ignite.internal.processors.cache.
> IgniteCacheProxyImpl,
> nativeMethod=false
>   },
>   {
> methodName=putAsync,
> fileName=GatewayProtectedCacheProxy.java,
> lineNumber=900,
> className=org.apache.ignite.internal.processors.cache.
> GatewayProtectedCacheProxy,
> nativeMethod=false
>   },
>
>
>
> Sample cache config:
>
>
>
> <property name="dataRegionName" value="auth_durable_region"/>
> <property name="writeSynchronizationMode" value="FULL_ASYNC"/>
> [remaining cache configuration XML stripped by the mailing list archive]
>
>
>
> Ignite version: 2.4.0
>
> OS: Windows Server 2012 R2
>
>
>
> BR, Oleksandr
>
>
>


Cache operations hanging for a minute when one of server nodes goes down

2018-06-10 Thread Olexandr K
Hi Igniters,

I'm testing our system for availability.
It uses Ignite as a persistent key/value cache.

Here is my test:
1) start 2 server and 2 client nodes
2) run heavy load on client nodes (some application logic which causes cache
calls)
3) stop 1 server node

Here I expect all in-progress cache operations targeted at server node 1 to
fail fast.
What I don't want is to hang all my processing threads for a significant
time.
Unfortunately, it works exactly that way: I'm constantly getting my threads
blocked for 20-80 seconds.
Eventually putAsync() completes successfully, but I'd prefer the cache
operation to fail fast. I don't want to hang all processing threads for a
minute because of the cache.

It works the same for put() and putAsync() calls.

As I see in the code, it can be fixed by calling future.get(timeout)
instead of future.get() in TcpCommunicationSpi.
The timeout should be configurable.

TcpCommunicationSpi (line: 2799)

private GridCommunicationClient reserveClient(ClusterNode node, int connIdx) {
    ...
    client = fut.get();

Does it make sense from your point of view?

Here is my thread  dump:

threads=[
  {
threadName=https-jsse-nio-8080-exec-20,
threadId=102,
blockedTime=-1,
blockedCount=0,
waitedTime=-1,
waitedCount=5,
lockName=null,
lockOwnerId=-1,
lockOwnerName=null,
inNative=false,
suspended=false,
threadState=WAITING,
stackTrace=[
  {
methodName=park,
fileName=Unsafe.java,
lineNumber=-2,
className=sun.misc.Unsafe,
nativeMethod=true
  },
  {
methodName=park,
fileName=LockSupport.java,
lineNumber=304,
className=java.util.concurrent.locks.LockSupport,
nativeMethod=false
  },
  {
methodName=get0,
fileName=GridFutureAdapter.java,
lineNumber=177,
className=org.apache.ignite.internal.util.future.GridFutureAdapter,
nativeMethod=false
  },
  {
methodName=get,
fileName=GridFutureAdapter.java,
lineNumber=140,
className=org.apache.ignite.internal.util.future.GridFutureAdapter,
nativeMethod=false
  },
  {
methodName=reserveClient,
fileName=TcpCommunicationSpi.java,
lineNumber=2799,

className=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi,
nativeMethod=false
  },

  {
methodName=putAsync,
fileName=IgniteCacheProxyImpl.java,
lineNumber=1035,

className=org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl,
nativeMethod=false
  },
  {
methodName=putAsync,
fileName=GatewayProtectedCacheProxy.java,
lineNumber=900,

className=org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy,
nativeMethod=false
  },

Sample cache config:

<property name="dataRegionName" value="auth_durable_region"/>
<property name="writeSynchronizationMode" value="FULL_ASYNC"/>
[remaining cache configuration XML stripped by the mailing list archive]

Ignite version: 2.4.0
OS: Windows Server 2012 R2

BR, Oleksandr


Re: control.bat authentication support

2018-05-22 Thread Olexandr K
Hi Stan,

Actually, I see some support in CommandHandler.java, which is called from
control.bat.

At line 608, where the configuration is created:

GridClientConfiguration cfg = new GridClientConfiguration();

we just need to add something like:

cfg.setSecurityCredentialsProvider(new SecurityCredentialsProvider() {
    @Override
    public SecurityCredentials credentials() throws IgniteCheckedException {
        // load them from control.bat command-line options or env variables
        return creds;
    }
});
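
And a sketch of how `creds` could be populated there (the environment
variable names are hypothetical):

import org.apache.ignite.plugin.security.SecurityCredentials;

String user = System.getenv("IGNITE_CONTROL_USER"); // hypothetical variable
String pwd  = System.getenv("IGNITE_CONTROL_PWD");  // hypothetical variable
SecurityCredentials creds = new SecurityCredentials(user, pwd);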

BR, Oleksandr

On Mon, May 21, 2018 at 11:53 PM, Stanislav Lukyanov <stanlukya...@gmail.com
> wrote:

> Hi,
>
>
>
> Ignite doesn’t provide built-in support for authentication, so the
> built-in control.bat/sh also don’t have stubs for that.
>
> So yes, I guess you need to write your own tool.
>
>
>
> A tool like that would be pretty simple though – just start a client node,
> parse command line arguments and
>
> map them to the methods of the ignite.cluster().
>
>
>
> Stan
>
>
>
> *From: *Olexandr K <olexandr.kundire...@gmail.com>
> *Sent: *17 мая 2018 г. 17:09
> *To: *user@ignite.apache.org
> *Subject: *control.bat authentication support
>
>
>
> Hi guys,
>
>
>
> I configured Ignite user/password authentication by adding a custom plugin.
>
> It works fine in server/client nodes and Visor, but I can't find any auth
> support in control.bat.
>
> I checked its source code and don't see any place where I can provide
> credentials.
>
>
>
> Should I write my own control tool or how can I solve this?
>
>
>
> BR, Oleksandr
>
>
>


control.bat authentication support

2018-05-17 Thread Olexandr K
Hi guys,

I configured Ignite user/password authentication by adding a custom plugin.
It works fine in server/client nodes and Visor, but I can't find any auth
support in control.bat.
I checked its source code and don't see any place where I can provide
credentials.

Should I write my own control tool or how can I solve this?

BR, Oleksandr


Re: Which ports does ignite cluster need to run normally?

2018-05-07 Thread Olexandr K
Hi Val, Ignite team

Reviewed all opened Ignite ports after your comments.
Everything is clear except loopback-related ports.

Here is what I see in Resource Monitor (Windows Server 2012 R2)
(prunsrv.exe is common-daemon java service wrapper running JVM 1.8 inside)

I don't understand what port 62219 is used for.
I exprimented with Ignite restarts and see that this port is chosen
dynamically.
Also, if we have port 62219 opened, we also observe number of loopback
connections starting from port 6
What are all these loopback connections used for?
Can we disable this or at least configure to use static ports?



Listening ports

Image        PID   Address           Port   Protocol  Firewall Status
prunsrv.exe  2220  IPv4 unspecified  62219  TCP       Allowed, restricted - ???
prunsrv.exe  2220  IPv4 unspecified  47500  TCP       Allowed, restricted - Ignite discovery
prunsrv.exe  2220  IPv4 unspecified  47100  TCP       Allowed, restricted - Ignite communication
prunsrv.exe  2220  IPv4 unspecified  11211  TCP       Allowed, restricted - HTTP internal
prunsrv.exe  2220  IPv4 unspecified  10800  TCP       Allowed, restricted - ODBC
prunsrv.exe  2220  IPv4 unspecified  9020   TCP       Allowed, restricted - JMX

TCP Connections

Image        PID   Local Address  Local Port  Remote Address  Remote Port  Packet Loss (%)  Latency (ms)
prunsrv.exe  2220  server1_host   47500       visor_host      57550        0                10
prunsrv.exe  2220  server1_host   50607       server2_host    47100        0                1
prunsrv.exe  2220  server1_host   62275       server2_host    47500        0                1

prunsrv.exe  2220  IPv4 loopback  62230       IPv4 loopback   62231        0                0
prunsrv.exe  2220  IPv4 loopback  62251       IPv4 loopback   62250        -                -
prunsrv.exe  2220  IPv4 loopback  62250       IPv4 loopback   62251        -                -
prunsrv.exe  2220  IPv4 loopback  62249       IPv4 loopback   62248        -                -



On Thu, Apr 26, 2018 at 8:42 PM, vkulichenko 
wrote:

> Hi,
>
> The configuration is fine and it does eliminate ranges so that node always
> binds to 47500. The only drawback is that if 47500 is not available for
> whatever reason, node would not start.
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Strange cluster activation behaviour in ignite 2.4.0

2018-04-26 Thread Olexandr K
yes, everything is clear now
thanks!

On Thu, Apr 26, 2018 at 2:05 PM, slava.koptilin 
wrote:

> Hi,
>
> 1) start 2 server nodes
> ^ the cluster was started at the first time. There was no baseline
> topology yet.
>
> 2) control.bat --state => Cluster is inactive
> 3) control.bat --activate => Cluster activated
> ^ the first activation of the cluster, it forms baseline topology
>
> 4) stop both nodes
> 5) start them again
> ^ The baseline already exists and Ignite can automatically activate the
> cluster.
>
> 6) control.bat --state => Cluster is active
> ^ Yes, that is expected behavior.
>
> Please take a look at this page:
> https://apacheignite.readme.io/docs/cluster-activation#
> section-baseline-topology
>
> > Note that the baseline topology is not set when the cluster is started
> for
> > the first time;
> > that's the only time when a manual intervention is needed.
> > So, once all the nodes that should belong to the baseline topology are up
> > and running,
> > you can set the topology from code, Ignite Web Console or command line
> > tool,
> > and let Ignite handle the automatic cluster activation routines going
> > forward.
>
> I hope this helps.
>
> Thanks,
> Slava.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Strange cluster activation behaviour in ignite 2.4.0

2018-04-26 Thread Olexandr K
But I didn't set up the baseline topology.

The doc says: "To form the baseline topology from a set of nodes, use the
./control.sh --baseline set command along with a list of the nodes'
consistent IDs:"

Is it auto-resolved after the first cluster activation, so that we don't need
to use "control.bat --baseline" at all?
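
For reference, a sketch of setting the baseline from code instead of
control.bat (assuming an Ignite instance named `ignite`):

import org.apache.ignite.Ignite;

class BaselineHelper {
    static void pinBaseline(Ignite ignite) {
        // Pin the baseline to the server nodes currently in the cluster.
        ignite.cluster().setBaselineTopology(ignite.cluster().forServers().nodes());
    }
}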

On Thu, Apr 26, 2018 at 1:06 PM, slava.koptilin 
wrote:

> Hello Oleks,
>
> The cluster can be automatically activated once all the nodes of the
> baseline topology have joined after a cluster restart. This is expected
> behaviour.
>
> [1]
> https://apacheignite.readme.io/docs/cluster-activation#
> section-automatic-activation
>
> Thanks,
> Slava.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Strange cluster activation behaviour in ignite 2.4.0

2018-04-26 Thread Olexandr K
Hi Igniters,

I observe strange cluster activation behaviour in Ignite 2.4.0:

1) start 2 server nodes
2) control.bat --state => Cluster is inactive
3) control.bat --activate => Cluster activated
4) stop both nodes
5) start them again
6) control.bat --state => Cluster is active

Q: Why was my cluster auto-activated on restart? Is this expected?

I didn't run any "control.bat --baseline" command.

OS: Windows Server 2012 R2

[Spring XML configuration stripped by the mailing list archive; the surviving
discovery address list is:]

host1:47500..47504
host2:47500..47504


Re: Which ports does ignite cluster need to run normally?

2018-04-26 Thread Olexandr K
Thanks, Val.

One more question on this:

If I want to minimize the number of potentially used ports (because of our
security guys),
is it OK to have a configuration like the one below?
Here we have 3 server nodes, each running on a separate server; they always
use the same port 47500 - no ranges.
As I understand it, if we later start one more node at host4:47500, it will
join the cluster if at least one of the pre-configured nodes is up, right?
Do you see any drawback in such a configuration?

[Spring XML configuration stripped by the mailing list archive; the surviving
discovery address list is:]

host1:47500
host2:47500
host3:47500
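
A sketch of the same setup in code - a fixed discovery port with no port
range (host names as in the question):

import java.util.Arrays;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

class FixedPortDiscovery {
    static IgniteConfiguration config() {
        TcpDiscoverySpi discovery = new TcpDiscoverySpi();
        discovery.setLocalPort(47500);
        discovery.setLocalPortRange(0); // bind to exactly 47500; startup fails if it is taken

        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList("host1:47500", "host2:47500", "host3:47500"));
        discovery.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discovery);
        return cfg;
    }
}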

On Sat, Apr 21, 2018 at 9:49 PM, vkulichenko 
wrote:

> The only place where Ignite uses UDP is multicast IP finder [1]. Default
> number is 47400.
>
> [1]
> https://apacheignite.readme.io/docs/cluster-config#
> multicast-based-discovery
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Which ports does ignite cluster need to run normally?

2018-04-20 Thread Olexandr K
What about UDP ports? What are they used for in Ignite?

On Thu, Apr 19, 2018 at 10:40 PM, vkulichenko  wrote:

> Most of these seem to be ephemeral ports assigned to discovery and
> communication clients when they connect to the well-known configured ports
> on the server side.
> That would always happen for any TCP connection.
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Failed to lock keys (all partition nodes left the grid)

2018-04-05 Thread Olexandr K
Here is the client code which fails:

public void storeRefreshToken(UUID uid, UUID refreshTokenID) {
    IgniteCache<UUID, LinkedList<UUID>> cache = getRefreshTokenCache();

    try (Transaction tx = startTransaction()) {
        LinkedList<UUID> ids = cache.get(uid); // getting the error here

On Thu, Apr 5, 2018 at 11:22 PM, Olexandr K <olexandr.kundire...@gmail.com>
wrote:

> Hi team,
>
> I'm getting a strange exception when trying to save a key/value pair.
> When I tested locally, everything was fine.
> Now I've deployed the application to the server cluster and am stuck with
> this error.
>
> What does it mean? Both of my Ignite server nodes are UP.
>
> Here is the configuration for this cache:
>
> <property name="writeSynchronizationMode" value="FULL_SYNC"/>
> [remaining cache configuration XML stripped by the mailing list archive]
>
> Here is my topology
>
> visor> top
> Hosts: 4
> 35CD3AD1(@n2) | Client | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
>   IPs:  0:0:0:0:0:0:0:1, 10.2.0.225, 127.0.0.1, 30.251.106.197
>   MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:B9
> 230F83B8(@n3) | Client | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.13 %
>   IPs:  0:0:0:0:0:0:0:1, 10.2.0.250, 127.0.0.1, 30.251.106.11
>   MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:35
> DF9FD2A4(@n0) | Server | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
>   IPs:  0:0:0:0:0:0:0:1, 10.2.0.163, 127.0.0.1, 30.251.106.199
>   MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:B7
> 965AD7CD(@n1) | Server | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
>   IPs:  0:0:0:0:0:0:0:1, 10.2.0.252, 127.0.0.1, 30.251.106.90
>   MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:BB
>
> REQ_002 2018-04-05 23:08:05.259 [async-worker-2] ERROR
> com.xxx.backend.async.AsyncExecutor - Failed to lock keys (all partition
> nodes left the grid).
> org.apache.ignite.cache.CacheServerNotFoundException: Failed to lock keys
> (all partition nodes left the grid).
> at org.apache.ignite.internal.processors.cache.GridCacheUtils.
> convertToCacheException(GridCacheUtils.java:1282)
> ~[ignite-core-2.4.0.jar:2.4.0]
> at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.
> cacheException(IgniteCacheProxyImpl.java:1673)
> ~[ignite-core-2.4.0.jar:2.4.0]
> at org.apache.ignite.internal.processors.cache.
> IgniteCacheProxyImpl.get(IgniteCacheProxyImpl.java:852)
> ~[ignite-core-2.4.0.jar:2.4.0]
> at org.apache.ignite.internal.processors.cache.
> GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:676)
> ~[ignite-core-2.4.0.jar:2.4.0]
> at 
> com.xxx.backend.cache.impl.DurableCacheIgnite.storeRefreshToken(DurableCacheIgnite.java:268)
> ~[lk.b

Failed to lock keys (all partition nodes left the grid)

2018-04-05 Thread Olexandr K
Hi team,

I'm getting a strange exception when trying to save a key/value pair.
When I tested locally, everything was fine.
Now I've deployed the application to the server cluster and am stuck with
this error.

What does it mean? Both of my Ignite server nodes are UP.

Here is the configuration for this cache:

<property name="writeSynchronizationMode" value="FULL_SYNC"/>
[remaining cache configuration XML stripped by the mailing list archive]

Here is my topology

visor> top
Hosts: 4
35CD3AD1(@n2) | Client | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
  IPs:  0:0:0:0:0:0:0:1, 10.2.0.225, 127.0.0.1, 30.251.106.197
  MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:B9
230F83B8(@n3) | Client | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.13 %
  IPs:  0:0:0:0:0:0:0:1, 10.2.0.250, 127.0.0.1, 30.251.106.11
  MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:35
DF9FD2A4(@n0) | Server | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
  IPs:  0:0:0:0:0:0:0:1, 10.2.0.163, 127.0.0.1, 30.251.106.199
  MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:B7
965AD7CD(@n1) | Server | Windows Server 2012 R2 amd64 6.3 | 4 CPUs | CPU load 0.00 %
  IPs:  0:0:0:0:0:0:0:1, 10.2.0.252, 127.0.0.1, 30.251.106.90
  MACs: 00:00:00:00:00:00:00:E0, 00:50:56:25:00:BB

REQ_002 2018-04-05 23:08:05.259 [async-worker-2] ERROR com.xxx.backend.async.AsyncExecutor - Failed to lock keys (all partition nodes left the grid).
org.apache.ignite.cache.CacheServerNotFoundException: Failed to lock keys (all partition nodes left the grid).
        at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1282) ~[ignite-core-2.4.0.jar:2.4.0]
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:1673) ~[ignite-core-2.4.0.jar:2.4.0]
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.get(IgniteCacheProxyImpl.java:852) ~[ignite-core-2.4.0.jar:2.4.0]
        at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:676) ~[ignite-core-2.4.0.jar:2.4.0]
        at com.xxx.backend.cache.impl.DurableCacheIgnite.storeRefreshToken(DurableCacheIgnite.java:268) ~[lk.backend-1.0.0.jar:?]
        at com.xxx.cache.impl.DurableCacheIgnite$$FastClassBySpringCGLIB$$fa2c0515.invoke() ~[lk.backend-1.0.0.jar:?]
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) ~[spring-core-5.0.4.RELEASE.jar:5.0.4.RELEASE]
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:747) ~[spring-aop-5.0.4.RELEASE.jar:5.0.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[spring-aop-5.0.4.RELEASE.jar:5.0.4.RELEASE]
        at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89) ~[spring-aop-5.0.4.RELEASE.jar:5.0.4.RELEASE]
        at com.xxx.backend.logging.ComponentLogger.logTimeMethod(ComponentLogger.java:34) ~[lk.backend-1.0.0.jar:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_162]
   

Re: How to reconnect Ignite client after server node bounce?

2018-03-06 Thread Olexandr K
Yes, it is reproducible

See client thread dump attached

The server node is just logging metrics, nothing more:

[13:03:10,830][INFO][grid-timeout-worker-#23][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=d0ae4736, uptime=00:15:00.098]
^-- H/N/C [hosts=1, nodes=2, CPUs=2]
^-- CPU [cur=0.5%, avg=1.02%, GC=0%]
^-- PageMemory [pages=0]
^-- Heap [used=67MB, free=72.64%, comm=245MB]
^-- Non heap [used=43MB, free=97.12%, comm=44MB]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=6, qSize=0]
^-- Outbound messages queue [size=0]
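
As the log line itself notes, this periodic metrics block can be silenced; a
minimal sketch, assuming programmatic configuration rather than the poster's
XML:

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class QuietMetricsSketch {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // 0 disables the periodic "Metrics for local node" log output.
        cfg.setMetricsLogFrequency(0);
        Ignition.start(cfg);
    }
}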


On Tue, Mar 6, 2018 at 12:29 PM, Stanislav Lukyanov <stanlukya...@gmail.com>
wrote:

> Hi Olexandr,
>
>
>
> Is it reproducible? Can you share a full thread dump?
>
> Also, are there any related messages in the logs (make sure you’re running
> with `java -DIGNITE_QUIET=false` or with `ignite.sh -v`)?
>
>
>
> Thanks,
>
> Stan
>
>
>
> *From: *Olexandr K <olexandr.kundire...@gmail.com>
> *Sent: *March 6, 2018, 12:48
> *To: *user@ignite.apache.org
> *Subject: *How to reconnect Ignite client after server node bounce?
>
>
>
> Hi Team,
>
> I tried the following scenario:
>
> 1) start local ignite server node
>
> 2) start client node and put some key/value pairs
>
> 3) bounce server node
>
> 4) put some more key/value pairs from client node
>
>
>
> at step (4) the Ignite put(..) operation just hung
>
> I expected the Ignite client to take care of auto-reconnection or at least
> fail fast...
>
> How should I handle this?
>
>
>
> here is the hung thread's stack:
>
> Daemon Thread [async-worker-2] (Suspended)
> Unsafe.park(boolean, long) line: not available [native method]
> LockSupport.park() line: 304
> GridFutureAdapter$ChainFuture<R,T>(GridFutureAdapter).get0(boolean) line: 177
> GridFutureAdapter$ChainFuture<R,T>(GridFutureAdapter).get() line: 140
> GridCacheAdapter$22.op(GridNearTxLocal) line: 2342
> GridCacheAdapter$22.op(GridNearTxLocal) line: 2340
> GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).syncOp(GridCacheAdapter<K,SyncOp<>>) line: 4040
> GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put0(K, V, CacheEntryPredicate) line: 2340
> GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put(K, V, CacheEntryPredicate) line: 2321
> GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put(K, V) line: 2298
> IgniteCacheProxyImpl<K,V>.put(K, V) line: 1005
> GatewayProtectedCacheProxy<K,V>.put(K, V) line: 872
> PersonRegistrationController.lambda$2(String, String, String, String) line: 106
> ...
>
> (I'm using ignite 2.3.0)
>
>
>
> BR, Oleksandr
>
>
>
2018-03-06 13:00:12
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.161-b12 mixed mode):

"async-worker-6" #123 daemon prio=5 os_prio=0 tid=0x011fd000 nid=0x5166 
waiting on condition [0x7f06d75f4000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2342)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2340)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4040)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2340)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2321)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2298)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1005)
at 
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:872)
at 
com.company.lk.backend.ctl.person.PersonRegistrationController.lambda$2(PersonRegistrationController.java:107)
at 
com.company.lk.backend.ctl.person.PersonRegistrationController$$Lambda$332/1484018980.get(Unknown
 Source)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.

Re: How to configure write through persistence?

2018-03-06 Thread Olexandr K
Thanks, Dmitry

I started the server node via bin/ignite.sh but didn't override DEFAULT_CONFIG
to point to my configuration.

As a result, the client node was using my config while the server node used
the default config.

Persistence is working after fixing this.


On Tue, Mar 6, 2018 at 11:55 AM, Dmitry Pavlov <dpavlov@gmail.com>
wrote:

> Hi Oleksandr,
>
> Could you please check the Ignite logs for messages showing where the
> persistence directory is located?
>
> When Ignite is started from code, it is possible that Ignite cannot locate
> IGNITE_WORK or IGNITE_HOME and creates its work files in a temp dir. The
> start scripts always provide this setting, but in a developer's environment
> it may be missing.
>
> Setting the work/home/persistenceStoreDir will probably help to locate the
> data files in the configured place, so inserted data will always be loaded.
>
> Sincerely,
> Dmitriy Pavlov
>
> Tue, Mar 6, 2018 at 12:42, Olexandr K <olexandr.kundire...@gmail.com>:
>
>> Hi Team,
>>
>> I'm trying to configure a persistent cache.
>>
>> Specifically, I need to cache key-value pairs, and they should also be
>> stored in persistent storage.
>>
>> Here is my configuration:
>>
>> [ignite.xml]
>> ...
>> [configuration XML stripped by the mailing-list archive; only a
>> writeSynchronizationMode=FULL_SYNC fragment survives]
>>
>> Here is my client code:
>>
>> Ignition.setClientMode(true);
>> ignite = Ignition.start("classpath:///ignite.xml");
>> ignite.active(true);
>> ...
>> ignite.getSessionCache().put(key, value);
>>
>> I cannot get my data back after bouncing the Ignite node.
>> What am I missing here?
>>
>> BR, Oleksandr
>>
>


How to reconnect Ignite client after server node bounce?

2018-03-06 Thread Olexandr K
Hi Team,

I tried the following scenario:

1) start local ignite server node
2) start client node and put some key/value pairs
3) bounce server node
4) put some more key/value pairs from client node

at step (4) the Ignite put(..) operation just hung

I expected the Ignite client to take care of auto-reconnection or at least
fail fast...
How should I handle this?

here is the hung thread's stack:

Daemon Thread [async-worker-2] (Suspended)
Unsafe.park(boolean, long) line: not available [native method]
LockSupport.park() line: 304
GridFutureAdapter$ChainFuture<R,T>(GridFutureAdapter).get0(boolean) line: 177
GridFutureAdapter$ChainFuture<R,T>(GridFutureAdapter).get() line: 140
GridCacheAdapter$22.op(GridNearTxLocal) line: 2342
GridCacheAdapter$22.op(GridNearTxLocal) line: 2340
GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).syncOp(GridCacheAdapter<K,SyncOp<>>) line: 4040
GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put0(K, V, CacheEntryPredicate) line: 2340
GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put(K, V, CacheEntryPredicate) line: 2321
GridDhtColocatedCache<K,V>(GridCacheAdapter<K,V>).put(K, V) line: 2298
IgniteCacheProxyImpl<K,V>.put(K, V) line: 1005
GatewayProtectedCacheProxy<K,V>.put(K, V) line: 872
PersonRegistrationController.lambda$2(String, String, String, String) line: 106
...

(I'm using ignite 2.3.0)

BR, Oleksandr
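
For anyone hitting the same hang: one common pattern is to catch the
disconnect exception and wait on its reconnect future. A minimal sketch,
assuming the put fails with a disconnect rather than hanging silently (the
helper name is hypothetical):

import javax.cache.CacheException;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteClientDisconnectedException;

public class ReconnectSketch {
    // Hypothetical helper: wait for the client to rejoin, then retry once.
    static void putWithReconnect(IgniteCache<String, String> cache,
                                 String key, String val) {
        try {
            cache.put(key, val);
        } catch (CacheException e) {
            if (e.getCause() instanceof IgniteClientDisconnectedException) {
                IgniteClientDisconnectedException cause =
                        (IgniteClientDisconnectedException) e.getCause();
                cause.reconnectFuture().get(); // blocks until reconnected
                cache.put(key, val);           // single retry
            } else
                throw e;
        }
    }
}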


How to configure write through persistence?

2018-03-06 Thread Olexandr K
Hi Team,

I'm trying to configure a persistent cache.

Specifically, I need to cache key-value pairs, and they should also be
stored in persistent storage.

Here is my configuration:

[ignite.xml]
...
[configuration XML stripped by the mailing-list archive; only a
writeSynchronizationMode=FULL_SYNC fragment survives in the quoted reply]
Here is my client code:

Ignition.setClientMode(true);
ignite = Ignition.start("classpath:///ignite.xml");
ignite.active(true);
...
ignite.getSessionCache().put(key, value);

I cannot get my data back after bouncing the Ignite node.
What am I missing here?

BR, Oleksandr
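
Since the archive stripped the XML above, here is a minimal programmatic
sketch of enabling Ignite native persistence, assuming Ignite 2.3+ (the
cache name is hypothetical; the poster's actual settings are unknown):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistenceSketch {
    public static void main(String[] args) {
        // Enable native persistence for the default data region (Ignite 2.3+).
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);
        ignite.active(true); // a cluster with persistence starts inactive

        // "sessionCache" is a hypothetical name; entries survive restarts.
        ignite.getOrCreateCache("sessionCache").put("key", "value");
    }
}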


Re: Recommended storage configuration

2018-02-27 Thread Olexandr K
Yeah, I just wanted to confirm I'm not missing anything obvious.

Thanks!

On Tue, Feb 27, 2018 at 12:40 AM, vkulichenko  wrote:

> Oleksandr,
>
> Generally, this heavily depends on your use case, workload, amount of data,
> etc. I don't see any issues with your configuration in particular, but you
> need to run your tests to figure out if it provides results that you
> expect.
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Recommended storage configuration

2018-02-26 Thread Olexandr K
Hi guys,

What is the recommended hardware for an Ignite node if it is used as a
distributed cache and persistent store?

I see you mentioned that separate SSDs are recommended for WAL, index and
data:
https://apacheignite.readme.io/docs/durable-memory-tuning

What can you say about the following minimal Ignite node configuration?

CPU 8 cores
RAM 64 GB
2x SSD 32 GB (index and WAL)
1x SSD 128 GB (data)
SATA drive for OS
NFS drive for Ignite logs

BR, Oleksandr
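
To make the WAL/index/data split concrete, a minimal sketch of pointing the
WAL and data files at separate drives (all paths are hypothetical; assuming
the Ignite 2.3+ DataStorageConfiguration API):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class StoragePathsSketch {
    public static void main(String[] args) {
        // Separate drives for WAL and data, as the durable-memory tuning
        // guide recommends. All paths below are hypothetical examples.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        storageCfg.setWalPath("/mnt/ssd-wal/wal");               // small, fast SSD
        storageCfg.setWalArchivePath("/mnt/ssd-wal/wal-archive");
        storageCfg.setStoragePath("/mnt/ssd-data/storage");      // larger SSD: data + index

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg); // node writes WAL and data to the paths above
    }
}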


Re: Production ready on Windows? Free vs Commercial difference?

2018-02-23 Thread Olexandr K
Thank you!

I'm taking Ignite for our project then

Will come with more concrete/technical questions soon )

On Fri, Feb 23, 2018 at 1:23 AM, vkulichenko 
wrote:

> Hi Oleksandr,
>
> 1. Yes, Ignite is production ready.
> 2. I doubt there is a fully fledged example for this, but I don't see any
> reason why it wouldn't be possible. You can run anything you want within a
> service.
> 3. Ignite is always and fully free. If you're asking about commercial
> products built on top of Ignite, please contact the vendor companies that
> provide them.
> 4. I doubt that would be possible, as in that case you have to operate with
> objects, and therefore write code. But you can emulate such a workload with
> simple SQL queries that select/update by primary key.
> 5. Basically it's a tool to route requests from a legacy thin client. Not
> sure if it would be useful for the new client protocol, but for now I think
> you can ignore it.
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
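
Val's point 4 can be made concrete with a small sketch that emulates
key-value put/get with SQL by primary key (table, cache, and key names are
hypothetical):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class KeyValueViaSql {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Any cache instance can issue SQL; "dummy" is a hypothetical name.
            IgniteCache<?, ?> sql = ignite.getOrCreateCache("dummy");

            sql.query(new SqlFieldsQuery(
                "CREATE TABLE IF NOT EXISTS PUBLIC.kv (k VARCHAR PRIMARY KEY, v VARCHAR)")).getAll();

            // "put": upsert by primary key.
            sql.query(new SqlFieldsQuery(
                "MERGE INTO PUBLIC.kv (k, v) VALUES (?, ?)").setArgs("key1", "value1")).getAll();

            // "get": select by primary key.
            List<List<?>> rows = sql.query(new SqlFieldsQuery(
                "SELECT v FROM PUBLIC.kv WHERE k = ?").setArgs("key1")).getAll();

            System.out.println(rows); // [[value1]]
        }
    }
}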


Production ready on Windows? Free vs Commercial difference?

2018-02-22 Thread Olexandr K
Hi Ignite team,

I'm evaluating Ignite as a distributed cache + key-value store for our
client.

I have read most of the documentation but wanted to clarify some points:

1) Is Ignite production-ready for Windows Java 8 deployments? (Our client
has a Windows cluster ready and a devops team in place.)
2) Our system consists of Spring Boot powered microservices which will use
Ignite as cache/storage. Is it possible to deploy Spring Boot apps as
Ignite services? Do you have any example of that?
3) What are the differences between the free and commercial versions? Just
advanced monitoring/management tools, or are there volume/CPU-count/etc.
limitations in the free one?
4) Do you have a utility to play with adding/removing key-value pairs in
Ignite storage, like sqlline.sh for SQL?
5) igniterouter.sh - what is it used for? I didn't find any docs for this
utility.

Thanks,
Oleksandr