[jira] [Created] (IGNITE-10852) [Documentation] - Add details on to public API behaviour

2018-12-29 Thread Alexander Gerus (JIRA)
Alexander Gerus created IGNITE-10852:


 Summary: [Documentation] - Add details on to public API behaviour
 Key: IGNITE-10852
 URL: https://issues.apache.org/jira/browse/IGNITE-10852
 Project: Ignite
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.7, 2.6, 2.5, 2.4
Reporter: Alexander Gerus


The current public API documentation has some specification gaps. When a method does 
not execute successfully, it is not clear what user code should do.

A good practice is to describe all API exceptions that user code can handle, together 
with the recommended actions.
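
For illustration, a hedged sketch of the kind of exception documentation the ticket 
asks for, using a hypothetical key-value API (the interface, exception list and 
recommendations below are examples only, not an actual Ignite contract):

{code:java}
import javax.cache.CacheException;

public interface KeyValueApi<K, V> {
    /**
     * Stores the given value under the given key.
     *
     * @param key Key (must not be {@code null}).
     * @param val Value (must not be {@code null}).
     * @throws IllegalArgumentException If the key or value is {@code null};
     *      this indicates a bug in the calling code and should not be retried.
     * @throws CacheException If the operation could not be completed, e.g. the
     *      cluster is inactive or the topology changed; the caller may retry
     *      after checking the cluster state.
     */
    void put(K key, V val) throws CacheException;
}
{code}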



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-6477) Add cache index metric to represent index size

2018-12-10 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-6477:

Affects Version/s: 2.5

> Add cache index metric to represent index size
> --
>
> Key: IGNITE-6477
> URL: https://issues.apache.org/jira/browse/IGNITE-6477
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 1.8, 1.9, 2.0, 2.1, 2.5
>Reporter: Alexander Belyak
>Priority: Minor
>  Labels: iep-29
>
> Now we can't estimate the space used by a particular cache index. Let's add it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-6477) Add cache index metric to represent index size

2018-12-10 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-6477:

Labels: iep-29  (was: )

> Add cache index metric to represent index size
> --
>
> Key: IGNITE-6477
> URL: https://issues.apache.org/jira/browse/IGNITE-6477
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 1.8, 1.9, 2.0, 2.1
>Reporter: Alexander Belyak
>Priority: Minor
>  Labels: iep-29
>
> Now we can't estimate the space used by a particular cache index. Let's add it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-10385) NPE in CachePartitionPartialCountersMap.toString

2018-11-22 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-10385:
-
Priority: Blocker  (was: Major)

> NPE in CachePartitionPartialCountersMap.toString
> 
>
> Key: IGNITE-10385
> URL: https://issues.apache.org/jira/browse/IGNITE-10385
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.4
>Reporter: Anton Kurbanov
>Priority: Blocker
>
> {noformat}
> Failed to reinitialize local partitions (preloading will be stopped)
> org.apache.ignite.IgniteException: null
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1032)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:868)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.managers.communication.GridIoMessage.toString(GridIoMessage.java:358)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_171]
> at java.lang.StringBuilder.append(StringBuilder.java:131) ~[?:1.8.0_171]
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2653)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2586)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1642)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1714)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1160)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendLocalPartitions(GridDhtPartitionsExchangeFuture.java:1399)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendPartitions(GridDhtPartitionsExchangeFuture.java:1506)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1139)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:703)
>  [ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2379)
>  [ignite-core-2.4.10.jar:2.4.10]
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) 
> [ignite-core-2.4.10.jar:2.4.10]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Caused by: org.apache.ignite.IgniteException
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1032)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:830)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:787)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:889)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsSingleMessage.toString(GridDhtPartitionsSingleMessage.java:551)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_171]
> at 
> org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:943)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at 
> org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1009)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> ... 16 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.CachePartitionPartialCountersMap.toString(CachePartitionPartialCountersMap.java:231)
>  ~[ignite-core-2.4.10.jar:2.4.10]
> at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_171]
> at 

[jira] [Updated] (IGNITE-9525) Ignite + Informatica Integration

2018-11-02 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9525:

Attachment: Ignite_Informatica_Integration.pdf

> Ignite + Informatica Integration
> 
>
> Key: IGNITE-9525
> URL: https://issues.apache.org/jira/browse/IGNITE-9525
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Prachi Garg
>Assignee: Pavel Vinokurov
>Priority: Major
> Fix For: 2.7
>
> Attachments: Ignite_Informatica_Integration.pdf
>
>
> Mentioned in https://cwiki.apache.org/confluence/display/IGNITE/Required+Docs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9676) Ignite as storage in Spring Session

2018-09-24 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9676:

Fix Version/s: 2.7

> Ignite as storage in Spring Session
> ---
>
> Key: IGNITE-9676
> URL: https://issues.apache.org/jira/browse/IGNITE-9676
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Anton Kurbanov
>Assignee: Anton Kurbanov
>Priority: Minor
> Fix For: 2.7
>
>
> Implement a repository backed by Ignite for session clustering with Spring 
> Session.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-8879) Blinking baseline node sometimes unable to connect to cluster

2018-09-12 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-8879:

Priority: Critical  (was: Major)

> Blinking baseline node sometimes unable to connect to cluster
> -
>
> Key: IGNITE-8879
> URL: https://issues.apache.org/jira/browse/IGNITE-8879
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.5
>Reporter: Dmitry Sherstobitov
>Assignee: Vladislav Pyatkov
>Priority: Critical
> Attachments: IGNITE-8879.zip
>
>
> Almost the same scenario as in IGNITE-8874, but the node left the baseline while 
> blinking.
> All caches have 2 backups.
>  4 nodes in the cluster.
>  # Start the cluster, load data
>  # Start transactional loading (8 threads, 100 ops/second put/get in each op)
>  # Repeat 10 times: kill one node, remove it from the baseline, start the node again 
> (*with no LFS clean*), wait for rebalance
>  # Check idle_verify, check for data corruption
>  
> At some point the killed node is unable to start and join the cluster because of an 
> error.
> (Attachment info: grid.1.node2.X.log - blinking node logs, X - iteration 
> counter from step 3)
> {code:java}
> 080ee8-END.bin]
> [2018-06-26 19:01:43,039][INFO ][main][PageMemoryImpl] Started page memory 
> [memoryAllocated=100.0 MiB, pages=24800, tableSize=1.9 MiB, 
> checkpointBuffer=100.0 MiB]
> [2018-06-26 19:01:43,039][INFO ][main][GridCacheDatabaseSharedManager] 
> Checking memory state [lastValidPos=FileWALPointer [idx=0, fileOff=583691, 
> len=119], lastMarked=FileWALPointer [idx=0, fileOff=583691, len=119], 
> lastCheckpointId=7fca4dbb-8f01-4b63-95e2-43283b080ee8]
> [2018-06-26 19:01:43,050][INFO ][main][GridCacheDatabaseSharedManager] Found 
> last checkpoint marker [cpId=7fca4dbb-8f01-4b63-95e2-43283b080ee8, 
> pos=FileWALPointer [idx=0, fileOff=583691, len=119]]
> [2018-06-26 19:01:43,082][INFO ][main][FileWriteAheadLogManager] Stopping WAL 
> iteration due to an exception: EOF at position [100] expected to read [1] 
> bytes, ptr=FileWALPointer [idx=0, fileOff=100, len=0]
> [2018-06-26 19:01:43,219][WARN ][main][FileWriteAheadLogManager] WAL segment 
> tail is reached. [ Expected next state: {Index=19,Offset=794017}, Actual 
> state : {Index=3602879702215753728,Offset=775434544} ]
> [2018-06-26 19:01:43,243][INFO ][main][GridCacheDatabaseSharedManager] 
> Applying lost cache updates since last checkpoint record 
> [lastMarked=FileWALPointer [idx=0, fileOff=583691, len=119], 
> lastCheckpointId=7fca4dbb-8f01-4b63-95e2-43283b080ee8]
> [2018-06-26 19:01:43,246][INFO ][main][FileWriteAheadLogManager] Stopping WAL 
> iteration due to an exception: EOF at position [100] expected to read [1] 
> bytes, ptr=FileWALPointer [idx=0, fileOff=100, len=0]
> [2018-06-26 19:01:43,336][WARN ][main][FileWriteAheadLogManager] WAL segment 
> tail is reached. [ Expected next state: {Index=19,Offset=794017}, Actual 
> state : {Index=3602879702215753728,Offset=775434544} ]
> [2018-06-26 19:01:43,336][INFO ][main][GridCacheDatabaseSharedManager] 
> Finished applying WAL changes [updatesApplied=0, time=101ms]
> [2018-06-26 19:01:43,450][INFO 
> ][main][GridSnapshotAwareClusterStateProcessorImpl] Restoring history for 
> BaselineTopology[id=4]
> [2018-06-26 19:01:43,454][ERROR][main][IgniteKernal] Exception during start 
> processors, node will be stopped and close connections
> class org.apache.ignite.IgniteCheckedException: Failed to start processor: 
> GridProcessorAdapter []
> at 
> org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1769)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1001)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
> at 
> org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1071)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:957)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:856)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:726)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:695)
> at org.apache.ignite.Ignition.start(Ignition.java:352)
> at 
> org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301)
> Caused by: class org.apache.ignite.IgniteCheckedException: Restoring of 
> BaselineTopology history has failed, expected history item not found for id=1
> at 
> org.apache.ignite.internal.processors.cluster.BaselineTopologyHistory.restoreHistory(BaselineTopologyHistory.java:54)
> at 
> 

[jira] [Created] (IGNITE-9495) Update version for org.apache.lucene lucene-queryparser : 5.5.2

2018-09-07 Thread Alexander Gerus (JIRA)
Alexander Gerus created IGNITE-9495:
---

 Summary: Update version for org.apache.lucene lucene-queryparser : 
5.5.2
 Key: IGNITE-9495
 URL: https://issues.apache.org/jira/browse/IGNITE-9495
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.6, 2.5, 2.4
Reporter: Alexander Gerus


Update the version of org.apache.lucene.
Current version: lucene-queryparser : 5.5.2
New version: later than 7.1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9295) Add Warning message for multiple data streamers

2018-08-16 Thread Alexander Gerus (JIRA)
Alexander Gerus created IGNITE-9295:
---

 Summary: Add Warning message for multiple data streamers
 Key: IGNITE-9295
 URL: https://issues.apache.org/jira/browse/IGNITE-9295
 Project: Ignite
  Issue Type: Improvement
Reporter: Alexander Gerus
 Fix For: 2.7


DataStreamer is designed to allocate as many resources as are available. If a user 
starts more than one instance per cache, it can cause a significant slowdown of the 
streaming due to the resulting resource contention.

The proposal is to add a warning message to the application log when two or more data 
streamers are started for the same cache:

Warning text: “DataStreamer is already running. For best performance please use a 
single instance”

The warning should be printed only once, when the case is first detected.
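
A minimal sketch of the warn-once behaviour proposed above, assuming a hypothetical 
per-cache counter inside the streamer code (StreamerWarning, onStreamerStart and the 
registry maps are illustrative names, not Ignite internals):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class StreamerWarning {
    /** Live streamer count per cache name (illustrative registry). */
    private static final ConcurrentMap<String, AtomicInteger> activePerCache = new ConcurrentHashMap<>();

    /** Guards so the warning is logged at most once per cache. */
    private static final ConcurrentMap<String, AtomicBoolean> warned = new ConcurrentHashMap<>();

    static void onStreamerStart(String cacheName) {
        int cnt = activePerCache.computeIfAbsent(cacheName, k -> new AtomicInteger()).incrementAndGet();

        // Warn only the first time a second streamer appears for this cache.
        if (cnt > 1 && warned.computeIfAbsent(cacheName, k -> new AtomicBoolean()).compareAndSet(false, true))
            System.err.println("DataStreamer is already running. For best performance please use a single instance");
    }
}
{code}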



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8987) Ignite hangs during getting of atomic structure after autoactivation

2018-08-09 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-8987:
---

Assignee: Roman Guseinov  (was: Alexey Goncharuk)

> Ignite hangs during getting of atomic structure after autoactivation
> 
>
> Key: IGNITE-8987
> URL: https://issues.apache.org/jira/browse/IGNITE-8987
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Andrey Aleksandrov
>Assignee: Roman Guseinov
>Priority: Major
> Fix For: 2.7
>
> Attachments: reproducer.java
>
>
> I investigated the use cases with autoactivation and creation of the 
> IgniteAtomicSequence. It hangs in the awaitInitialization() method when it is 
> called right after the last node from the BLT has started.
> Steps to reproduce:
> First iteration:
>  
> Do the following in one thread:
> 1) Start server 1
> 2) Start server 2
> 3) Activate the cluster 
> 4) Create the IgniteAtomicSequence using the following code:
> IgniteAtomicSequence igniteAtomicSequence = ignite.atomicSequence(
>  "TestName",
>  atomicConfiguration,
>  10,
>  true);
> Second iteration:
> 1) Start server 1
> 2) Start server 2 (autoactivation will be started)
> 3) Get the IgniteAtomicSequence using the following code:
> IgniteAtomicSequence igniteAtomicSequence = ignite.atomicSequence(
>  "TestName",
>  10,
>  true); // could be false because TestName was already created in iteration 1
> In this case, we hang in the awaitInitialization() call inside 
> DataStructureProcessor.getAtomic().
> If I add a sleep between steps 2 and 3 in the second iteration, everything 
> works. Looks like we have a race here.
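
A minimal illustration of the workaround mentioned above, assuming a fixed pause is 
acceptable for the test (the 5-second value is arbitrary; the pause only hides the 
race described in the ticket, it does not fix it):

{code:java}
// Give autoactivation time to finish before requesting the data structure
// (the ticket notes that a sleep between steps 2 and 3 avoids the hang).
Thread.sleep(5_000);

IgniteAtomicSequence igniteAtomicSequence = ignite.atomicSequence(
    "TestName",
    10,
    true);
{code}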



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9196) SQL: Memory leak in MapNodeResults

2018-08-06 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9196:

Priority: Blocker  (was: Major)

> SQL: Memory leak in MapNodeResults
> --
>
> Key: IGNITE-9196
> URL: https://issues.apache.org/jira/browse/IGNITE-9196
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.6
>Reporter: Denis Mekhanikov
>Priority: Blocker
>
> When the size of a SQL query result set is a multiple of {{Query#pageSize}}, the 
> {{MapQueryResult}} is never closed and removed from the {{MapNodeResults#res}} 
> collection.
> The following code leads to an OOME when run with a 1 GB heap:
> {code:java}
> public class MemLeakRepro {
>     public static void main(String[] args) {
>         Ignition.start(getConfiguration("server"));
>         try (Ignite client = Ignition.start(getConfiguration("client").setClientMode(true))) {
>             IgniteCache<Integer, Person> cache = startPeopleCache(client);
>             int pages = 10;
>             int pageSize = 1024;
>             for (int i = 0; i < pages * pageSize; i++) {
>                 Person p = new Person("Person #" + i, 25);
>                 cache.put(i, p);
>             }
>             for (int i = 0; i < 1_000_000; i++) {
>                 if (i % 1000 == 0)
>                     System.out.println("Select iteration #" + i);
>                 Query<List<?>> qry = new SqlFieldsQuery("select * from people");
>                 qry.setPageSize(pageSize);
>                 QueryCursor<List<?>> cursor = cache.query(qry);
>                 cursor.getAll();
>                 cursor.close();
>             }
>         }
>     }
>     private static IgniteConfiguration getConfiguration(String instanceName) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration();
>         igniteCfg.setIgniteInstanceName(instanceName);
>         TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
>         discoSpi.setIpFinder(new TcpDiscoveryVmIpFinder(true));
>         igniteCfg.setDiscoverySpi(discoSpi);
>         return igniteCfg;
>     }
>     private static IgniteCache<Integer, Person> startPeopleCache(Ignite node) {
>         CacheConfiguration<Integer, Person> cacheCfg = new CacheConfiguration<>("cache");
>         QueryEntity qe = new QueryEntity(Integer.class, Person.class);
>         qe.setTableName("people");
>         cacheCfg.setQueryEntities(Collections.singleton(qe));
>         cacheCfg.setSqlSchema("PUBLIC");
>         return node.getOrCreateCache(cacheCfg);
>     }
>     public static class Person {
>         @QuerySqlField
>         private String name;
>         @QuerySqlField
>         private int age;
>         public Person(String name, int age) {
>             this.name = name;
>             this.age = age;
>         }
>     }
> }
> {code}
>  
> At the same time it works perfectly fine when there are, for example, 
> {{pages * pageSize - 1}} records in the cache instead.
> The reason is that the {{MapQueryResult#fetchNextPage(...)}} method doesn't 
> return true when the result set size is a multiple of the page size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-9178:
---

Assignee: Pavel Vinokurov

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Major
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.
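
For context, a hedged sketch of how user code normally observes these events through 
the standard Ignite events API (the listener below is illustrative; note that 
EVT_CACHE_REBALANCE_PART_DATA_LOST must also be enabled via 
IgniteConfiguration#setIncludeEventTypes for it to be delivered):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.events.CacheRebalancingEvent;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class PartitionLossListener {
    public static void register(Ignite ignite) {
        IgnitePredicate<Event> lsnr = evt -> {
            CacheRebalancingEvent e = (CacheRebalancingEvent)evt;

            System.out.println("Lost partition " + e.partition() + " of cache " + e.cacheName());

            return true; // Keep listening for further events.
        };

        // The ticket describes a scenario where this event is never fired even though
        // partitions were actually lost.
        ignite.events().localListen(lsnr, EventType.EVT_CACHE_REBALANCE_PART_DATA_LOST);
    }
}
{code}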



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-5103) TcpDiscoverySpi ignores maxMissedClientHeartbeats property

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-5103:

Priority: Critical  (was: Major)

> TcpDiscoverySpi ignores maxMissedClientHeartbeats property
> --
>
> Key: IGNITE-5103
> URL: https://issues.apache.org/jira/browse/IGNITE-5103
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 1.9
>Reporter: Valentin Kulichenko
>Assignee: Evgenii Zhuravlev
>Priority: Critical
> Fix For: 2.7
>
> Attachments: TcpDiscoveryClientSuspensionSelfTest.java
>
>
> The test scenario is the following:
> * Start one or more servers.
> * Start a client node.
> * Suspend the client process using the {{-SIGSTOP}} signal.
> * Wait for {{maxMissedClientHeartbeats*heartbeatFrequency}}.
> * The client node is expected to be removed from the topology, but the server nodes 
> don't do that.
> Attached is a unit test reproducing the same behaviour by stopping the heartbeat 
> sender thread on the client.
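
For reference, a minimal configuration sketch of the two settings involved, assuming 
the heartbeat-based discovery API of the 1.x line this ticket was reported against 
(the concrete values are illustrative):

{code:java}
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

public class ClientHeartbeatConfig {
    public static IgniteConfiguration configure() {
        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();

        // Heartbeats are sent every 2 seconds; after 5 missed heartbeats the client is
        // expected to be dropped from the topology - the behaviour this ticket reports as broken.
        discoSpi.setHeartbeatFrequency(2000);
        discoSpi.setMaxMissedClientHeartbeats(5);

        return new IgniteConfiguration().setDiscoverySpi(discoSpi);
    }
}
{code}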



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9178) Partition lost event are not triggered if multiple nodes left cluster

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9178:

Priority: Critical  (was: Major)

> Partition lost event are not triggered if multiple nodes left cluster
> -
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Pavel Vinokurov
>Assignee: Pavel Vinokurov
>Priority: Critical
>
> If multiple nodes leave the cluster simultaneously, their partitions are removed 
> from GridDhtPartitionTopologyImpl#node2part without being added to leftNode2Part 
> in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost 
> partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9068) Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed inside guard()/unguard()

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9068:

Priority: Critical  (was: Major)

> Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed 
> inside guard()/unguard()
> -
>
> Key: IGNITE-9068
> URL: https://issues.apache.org/jira/browse/IGNITE-9068
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, managed services
>Affects Versions: 2.5
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Lantukh
>Priority: Critical
>  Labels: test
> Fix For: 2.7
>
> Attachments: GridServiceDeadlockTest.java, MyService.java
>
>
> When addMeta is called in, e.g., service deployment, it is executed inside 
> guard()/unguard().
> If the node is stopped at this point, Ignite.stop() will hang.
> Consider the following thread dump:
> {code}
> "Thread-1" #57 prio=5 os_prio=0 tid=0x7f7780005000 nid=0x7f26 runnable 
> [0x7f766cbef000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0005cb7b0468> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
>   at 
> org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
>   at 
> org.apache.ignite.internal.GridKernalGatewayImpl.tryWriteLock(GridKernalGatewayImpl.java:143)
> // Waiting for lock to cancel futures of BinaryMetadataTransport
>   at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2171)
>   at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2094)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2545)
>   - locked <0x0005cb423f00> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2508)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.run(IgnitionEx.java:2033)
> "test-runner-#1%service.GridServiceDeadlockTest%" #13 prio=5 os_prio=0 
> tid=0x7f77b87d5800 nid=0x7eb8 waiting on condition [0x7f778cdfc000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> // May never return if there's discovery problems
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:463)
>   at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$2.addMeta(CacheObjectBinaryProcessorImpl.java:188)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:802)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:761)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.descriptorForClass(BinaryContext.java:627)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:174)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:157)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:144)
>   at 
> org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:254)
>   at 
> org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:82)
>   at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:58)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10069)
>   at 
> org.apache.ignite.internal.processors.service.GridServiceProcessor.prepareServiceConfigurations(GridServiceProcessor.java:570)
>   at 
> org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:622)
>   at 
> 

[jira] [Updated] (IGNITE-9184) Cluster hangs during concurrent node restart and continues query registration

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9184:

Ignite Flags:   (was: Docs Required)

> Cluster hangs during concurrent node restart and continues query registration
> -
>
> Key: IGNITE-9184
> URL: https://issues.apache.org/jira/browse/IGNITE-9184
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.6
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Govorukhin
>Priority: Blocker
> Fix For: 2.7
>
> Attachments: StressTest.java, logs, stacktrace
>
>
> Please check the attached test case and stack trace.
> I can see the "Failed to wait for initial partition map exchange" message.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-9184) Cluster hangs during concurrent node restart and continues query registration

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-9184:
---

Assignee: Dmitriy Govorukhin

> Cluster hangs during concurrent node restart and continues query registration
> -
>
> Key: IGNITE-9184
> URL: https://issues.apache.org/jira/browse/IGNITE-9184
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.6
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Govorukhin
>Priority: Blocker
> Fix For: 2.7
>
> Attachments: StressTest.java, logs, stacktrace
>
>
> Please check the attached test case and stack trace.
> I can see the "Failed to wait for initial partition map exchange" message.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-9184) Cluster hangs during concurrent node restart and continues query registration

2018-08-03 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-9184:

Priority: Blocker  (was: Critical)

> Cluster hangs during concurrent node restart and continues query registration
> -
>
> Key: IGNITE-9184
> URL: https://issues.apache.org/jira/browse/IGNITE-9184
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.6
>Reporter: Mikhail Cherkasov
>Priority: Blocker
> Fix For: 2.7
>
> Attachments: StressTest.java, logs, stacktrace
>
>
> Please check the attached test case and stack trace.
> I can see the "Failed to wait for initial partition map exchange" message.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9112) Pre-touch for Ignite off-heap memory

2018-07-27 Thread Alexander Gerus (JIRA)
Alexander Gerus created IGNITE-9112:
---

 Summary: Pre-touch for Ignite off-heap memory
 Key: IGNITE-9112
 URL: https://issues.apache.org/jira/browse/IGNITE-9112
 Project: Ignite
  Issue Type: New Feature
Affects Versions: 2.6, 2.5, 2.4
Reporter: Alexander Gerus


At the moment Ignite off-heap memory is allocated in the operating system's virtual 
memory, not in physical memory: it is recorded in an internal data structure to 
prevent it from being used by any other process. Not even a single page is allocated 
in physical memory until it is actually accessed. When Ignite needs memory, the 
operating system allocates pages on demand.

The proposal is to add an option to Ignite that touches every single byte of the 
maximum off-heap region with a '0', so that the memory is allocated in physical 
memory in addition to being reserved in the internal data structure (virtual 
memory). A similar option is available in the JVM: {{-XX:+AlwaysPreTouch}}.
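
A minimal sketch of the pre-touch idea described above (not Ignite's implementation; 
the region size and the 4 KiB page size are illustrative assumptions):

{code:java}
import java.nio.ByteBuffer;

public class PreTouchSketch {
    public static void main(String[] args) {
        int regionSize = 256 * 1024 * 1024; // 256 MiB off-heap region (illustrative).
        int pageSize = 4096;                // Assume 4 KiB OS pages.

        // Allocating the direct buffer only reserves virtual memory.
        ByteBuffer region = ByteBuffer.allocateDirect(regionSize);

        // Writing one byte per page forces the OS to back the whole range with physical
        // memory, which is what -XX:+AlwaysPreTouch does for the Java heap.
        for (int off = 0; off < regionSize; off += pageSize)
            region.put(off, (byte)0);
    }
}
{code}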



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-9068) Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed inside guard()/unguard()

2018-07-26 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-9068:
---

Assignee: Pavel Kovalenko  (was: Ilya Lantukh)

> Node fails to stop when CacheObjectBinaryProcessor.addMeta() is executed 
> inside guard()/unguard()
> -
>
> Key: IGNITE-9068
> URL: https://issues.apache.org/jira/browse/IGNITE-9068
> Project: Ignite
>  Issue Type: Bug
>  Components: binary, managed services
>Affects Versions: 2.5
>Reporter: Ilya Kasnacheev
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: test
> Attachments: GridServiceDeadlockTest.java
>
>
> When addMeta is called in, e.g., service deployment, it is executed inside 
> guard()/unguard().
> If the node is stopped at this point, Ignite.stop() will hang.
> Consider the following thread dump:
> {code}
> "Thread-1" #57 prio=5 os_prio=0 tid=0x7f7780005000 nid=0x7f26 runnable 
> [0x7f766cbef000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0005cb7b0468> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
>   at 
> org.apache.ignite.internal.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
>   at 
> org.apache.ignite.internal.GridKernalGatewayImpl.tryWriteLock(GridKernalGatewayImpl.java:143)
> // Waiting for lock to cancel futures of BinaryMetadataTransport
>   at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2171)
>   at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2094)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2545)
>   - locked <0x0005cb423f00> (a 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2508)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.run(IgnitionEx.java:2033)
> "test-runner-#1%service.GridServiceDeadlockTest%" #13 prio=5 os_prio=0 
> tid=0x7f77b87d5800 nid=0x7eb8 waiting on condition [0x7f778cdfc000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> // May never return if there's discovery problems
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:463)
>   at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$2.addMeta(CacheObjectBinaryProcessorImpl.java:188)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.registerUserClassDescriptor(BinaryContext.java:802)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.registerClassDescriptor(BinaryContext.java:761)
>   at 
> org.apache.ignite.internal.binary.BinaryContext.descriptorForClass(BinaryContext.java:627)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:174)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:157)
>   at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:144)
>   at 
> org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:254)
>   at 
> org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:82)
>   at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:58)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10069)
>   at 
> org.apache.ignite.internal.processors.service.GridServiceProcessor.prepareServiceConfigurations(GridServiceProcessor.java:570)
>   at 
> org.apache.ignite.internal.processors.service.GridServiceProcessor.deployAll(GridServiceProcessor.java:622)
>   at 
> 

[jira] [Updated] (IGNITE-8828) Detecting and stopping unresponsive nodes during Partition Map Exchange

2018-07-25 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-8828:

Ignite Flags: Docs Required

> Detecting and stopping unresponsive nodes during Partition Map Exchange
> ---
>
> Key: IGNITE-8828
> URL: https://issues.apache.org/jira/browse/IGNITE-8828
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Sergey Chugunov
>Assignee: Ilya Lantukh
>Priority: Major
>  Labels: iep-25
>   Original Estimate: 264h
>  Remaining Estimate: 264h
>
> During the PME process the coordinator (1) gathers local partition maps from all 
> nodes and (2) sends the calculated full partition map back to all nodes in the 
> topology.
> However, if one or more nodes fail to send their local information at step 1 for any 
> reason, the PME process hangs, blocking all operations. The only solution is to 
> manually identify and stop the nodes which failed to send their info to the coordinator.
> This should be done by the coordinator itself: if it does not receive local 
> partition maps from some nodes in time, it should check that stopping these 
> nodes won't lead to data loss and then stop them forcibly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-8908) NPE on discovery message processing

2018-07-02 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-8908:

Priority: Critical  (was: Major)

> NPE on discovery message processing
> ---
>
> Key: IGNITE-8908
> URL: https://issues.apache.org/jira/browse/IGNITE-8908
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Reporter: Mikhail Cherkasov
>Priority: Critical
> Attachments: ContinuousQueryTask.txt, RegisterTask.txt, 
> ServiceTask.txt
>
>
> To reproduce the problem we do the following steps:
> 1) Start 4 server nodes.
> 2) Start client nodes: ServiceTask, RegisterTask, ContinuousQueryTask.
> 3) Restart 3 of the 4 server nodes.
> The following exception is observed in the logs:
> [2018-07-02 10:15:48,199][ERROR]tcp-disco-msg-worker-#2 Failed to notify 
> direct custom event listener: MetadataUpdateAcceptedMessage 
> [id=cae4cd95461-6f8d75e0-424e-4a8f-8f20-c34a9d55e44f, typeId=-372239526, 
> acceptedVer=1, duplicated=false]
> java.lang.NullPointerException: null
> at 
> org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport$MetadataUpdateAcceptedListener.onCustomEvent(BinaryMetadataTransport.java:451)
>  ~[ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.internal.processors.cache.binary.BinaryMetadataTransport$MetadataUpdateAcceptedListener.onCustomEvent(BinaryMetadataTransport.java:418)
>  ~[ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:695)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:577)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:5453)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5279)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5313)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2739)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2531)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6730)
>  [ignite-core-2.x.x.jar:2.x.x]
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2614)
>  [ignite-core-2.x.x.jar:2.x.x]
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
> [ignite-core-2.x.x.jar:2.x.x]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8676) Possible data loss after stoping/starting several nodes at the same time

2018-06-26 Thread Alexander Gerus (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523436#comment-16523436
 ] 

Alexander Gerus commented on IGNITE-8676:
-

Cannot be reproduced after the following fixes:
 * https://issues.apache.org/jira/browse/IGNITE-8339
 * https://issues.apache.org/jira/browse/IGNITE-8122 
 * https://issues.apache.org/jira/browse/IGNITE-8405 

 

> Possible data loss after stoping/starting several nodes at the same time
> 
>
> Key: IGNITE-8676
> URL: https://issues.apache.org/jira/browse/IGNITE-8676
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.4
>Reporter: Andrey Aleksandrov
>Assignee: Stanislav Lukyanov
>Priority: Critical
> Fix For: 2.6
>
> Attachments: DataLossTest.zip, Ignite8676Test.java, 
> image-2018-06-01-12-34-54-320.png, image-2018-06-01-13-12-47-218.png, 
> image-2018-06-01-13-15-17-437.png
>
>
> Steps to reproduce:
> 1) Start 3 data nodes (DN1, DN2, DN3) with a configuration that contains a 
> cache with a node filter for only these three nodes and 1 backup (see the 
> configuration in the attachment).
>  2) Activate the cluster. Now you should have 3 nodes in the BLT.
>  3) Start a new server node (SN). Now you should have 3 nodes in the BLT and 1 node 
> not in the baseline.
>  4) Using some node, load about 1 (or more) entities into the cache.
>  5) Check that the number of primary partitions equals the number of backup partitions.
> !image-2018-06-01-12-34-54-320.png!
>  6) Now stop DN3 and SN. After that, start them at the same time.
>  7) When DN3 and SN are online, check whether the number of primary partitions 
> (PN) equals the number of backup partitions (BN).
> 7.1) If PN == BN => go to step 6)
>  7.2) If PN != BN => go to step 8)
>  
> !image-2018-06-01-13-12-47-218.png!
> 8) Deactivate the cluster with control.sh.
>  9) Activate the cluster with control.sh.
> Now you should see the data loss.
> !image-2018-06-01-13-15-17-437.png!
> Notes:
>  1) Stops/starts should be done at the same time.
>  2) Consistent IDs for the nodes should be constant.
> Now you should see the data loss.
> Also, I provide a reproducer that can often (not always) reproduce this issue. 
> Clear the working directory and restart the reproducer if there is no data loss 
> in the current iteration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (IGNITE-8676) Possible data loss after stoping/starting several nodes at the same time

2018-06-26 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus resolved IGNITE-8676.
-
Resolution: Cannot Reproduce

> Possible data loss after stoping/starting several nodes at the same time
> 
>
> Key: IGNITE-8676
> URL: https://issues.apache.org/jira/browse/IGNITE-8676
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.4
>Reporter: Andrey Aleksandrov
>Assignee: Stanislav Lukyanov
>Priority: Critical
> Fix For: 2.6
>
> Attachments: DataLossTest.zip, Ignite8676Test.java, 
> image-2018-06-01-12-34-54-320.png, image-2018-06-01-13-12-47-218.png, 
> image-2018-06-01-13-15-17-437.png
>
>
> Steps to reproduce:
> 1) Start 3 data nodes (DN1, DN2, DN3) with a configuration that contains a 
> cache with a node filter for only these three nodes and 1 backup (see the 
> configuration in the attachment).
>  2) Activate the cluster. Now you should have 3 nodes in the BLT.
>  3) Start a new server node (SN). Now you should have 3 nodes in the BLT and 1 node 
> not in the baseline.
>  4) Using some node, load about 1 (or more) entities into the cache.
>  5) Check that the number of primary partitions equals the number of backup partitions.
> !image-2018-06-01-12-34-54-320.png!
>  6) Now stop DN3 and SN. After that, start them at the same time.
>  7) When DN3 and SN are online, check whether the number of primary partitions 
> (PN) equals the number of backup partitions (BN).
> 7.1) If PN == BN => go to step 6)
>  7.2) If PN != BN => go to step 8)
>  
> !image-2018-06-01-13-12-47-218.png!
> 8) Deactivate the cluster with control.sh.
>  9) Activate the cluster with control.sh.
> Now you should see the data loss.
> !image-2018-06-01-13-15-17-437.png!
> Notes:
>  1) Stops/starts should be done at the same time.
>  2) Consistent IDs for the nodes should be constant.
> Now you should see the data loss.
> Also, I provide a reproducer that can often (not always) reproduce this issue. 
> Clear the working directory and restart the reproducer if there is no data loss 
> in the current iteration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (IGNITE-8676) Possible data loss after stoping/starting several nodes at the same time

2018-06-26 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus updated IGNITE-8676:

Comment: was deleted

(was: Assigned on Stan as solution for the issue is known and should be merged 
to affected 2.4 master)

> Possible data loss after stoping/starting several nodes at the same time
> 
>
> Key: IGNITE-8676
> URL: https://issues.apache.org/jira/browse/IGNITE-8676
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.4
>Reporter: Andrey Aleksandrov
>Assignee: Stanislav Lukyanov
>Priority: Critical
> Fix For: 2.6
>
> Attachments: DataLossTest.zip, Ignite8676Test.java, 
> image-2018-06-01-12-34-54-320.png, image-2018-06-01-13-12-47-218.png, 
> image-2018-06-01-13-15-17-437.png
>
>
> Steps to reproduce:
> 1) Start 3 data nodes (DN1, DN2, DN3) with a configuration that contains a 
> cache with a node filter for only these three nodes and 1 backup (see the 
> configuration in the attachment).
>  2) Activate the cluster. Now you should have 3 nodes in the BLT.
>  3) Start a new server node (SN). Now you should have 3 nodes in the BLT and 1 node 
> not in the baseline.
>  4) Using some node, load about 1 (or more) entities into the cache.
>  5) Check that the number of primary partitions equals the number of backup partitions.
> !image-2018-06-01-12-34-54-320.png!
>  6) Now stop DN3 and SN. After that, start them at the same time.
>  7) When DN3 and SN are online, check whether the number of primary partitions 
> (PN) equals the number of backup partitions (BN).
> 7.1) If PN == BN => go to step 6)
>  7.2) If PN != BN => go to step 8)
>  
> !image-2018-06-01-13-12-47-218.png!
> 8) Deactivate the cluster with control.sh.
>  9) Activate the cluster with control.sh.
> Now you should see the data loss.
> !image-2018-06-01-13-15-17-437.png!
> Notes:
>  1) Stops/starts should be done at the same time.
>  2) Consistent IDs for the nodes should be constant.
> Now you should see the data loss.
> Also, I provide a reproducer that can often (not always) reproduce this issue. 
> Clear the working directory and restart the reproducer if there is no data loss 
> in the current iteration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8676) Possible data loss after stoping/starting several nodes at the same time

2018-06-26 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-8676:
---

Assignee: Stanislav Lukyanov

Assigned on Stan as solution for the issue is known and should be merged to 
affected 2.4 master

> Possible data loss after stoping/starting several nodes at the same time
> 
>
> Key: IGNITE-8676
> URL: https://issues.apache.org/jira/browse/IGNITE-8676
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.4
>Reporter: Andrey Aleksandrov
>Assignee: Stanislav Lukyanov
>Priority: Critical
> Fix For: 2.6
>
> Attachments: DataLossTest.zip, Ignite8676Test.java, 
> image-2018-06-01-12-34-54-320.png, image-2018-06-01-13-12-47-218.png, 
> image-2018-06-01-13-15-17-437.png
>
>
> Steps to reproduce:
> 1) Start 3 data nodes (DN1, DN2, DN3) with a configuration that contains a 
> cache with a node filter for only these three nodes and 1 backup (see the 
> configuration in the attachment).
>  2) Activate the cluster. Now you should have 3 nodes in the BLT.
>  3) Start a new server node (SN). Now you should have 3 nodes in the BLT and 1 node 
> not in the baseline.
>  4) Using some node, load about 1 (or more) entities into the cache.
>  5) Check that the number of primary partitions equals the number of backup partitions.
> !image-2018-06-01-12-34-54-320.png!
>  6) Now stop DN3 and SN. After that, start them at the same time.
>  7) When DN3 and SN are online, check whether the number of primary partitions 
> (PN) equals the number of backup partitions (BN).
> 7.1) If PN == BN => go to step 6)
>  7.2) If PN != BN => go to step 8)
>  
> !image-2018-06-01-13-12-47-218.png!
> 8) Deactivate the cluster with control.sh.
>  9) Activate the cluster with control.sh.
> Now you should see the data loss.
> !image-2018-06-01-13-15-17-437.png!
> Notes:
>  1) Stops/starts should be done at the same time.
>  2) Consistent IDs for the nodes should be constant.
> Now you should see the data loss.
> Also, I provide a reproducer that can often (not always) reproduce this issue. 
> Clear the working directory and restart the reproducer if there is no data loss 
> in the current iteration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8740) Support reuse of already initialized Ignite in IgniteSpringBean

2018-06-21 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-8740:
---

Assignee: Amir Akhmedov  (was: Alexander Gerus)

> Support reuse of already initialized Ignite in IgniteSpringBean
> ---
>
> Key: IGNITE-8740
> URL: https://issues.apache.org/jira/browse/IGNITE-8740
> Project: Ignite
>  Issue Type: Improvement
>  Components: spring
>Affects Versions: 2.4
>Reporter: Ilya Kasnacheev
>Assignee: Amir Akhmedov
>Priority: Blocker
> Fix For: 2.6
>
>
> See 
> http://apache-ignite-users.70518.x6.nabble.com/IgniteSpringBean-amp-Ignite-SpringTransactionManager-broken-with-2-4-td21667.html#a21724
>  (there's a patch available)
> The idea is to introduce a workaround for users hit by IGNITE-6555, which 
> unfortunately broke some scenarios.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (IGNITE-8740) Support reuse of already initialized Ignite in IgniteSpringBean

2018-06-21 Thread Alexander Gerus (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Gerus reassigned IGNITE-8740:
---

Assignee: Alexander Gerus  (was: Amir Akhmedov)

> Support reuse of already initialized Ignite in IgniteSpringBean
> ---
>
> Key: IGNITE-8740
> URL: https://issues.apache.org/jira/browse/IGNITE-8740
> Project: Ignite
>  Issue Type: Improvement
>  Components: spring
>Affects Versions: 2.4
>Reporter: Ilya Kasnacheev
>Assignee: Alexander Gerus
>Priority: Blocker
> Fix For: 2.6
>
>
> See 
> http://apache-ignite-users.70518.x6.nabble.com/IgniteSpringBean-amp-Ignite-SpringTransactionManager-broken-with-2-4-td21667.html#a21724
>  (there's a patch available)
> The idea is to introduce a workaround for users hit by IGNITE-6555, which 
> unfortunately broke some scenarios.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-8524) Document consistency check utilities

2018-05-22 Thread Alexander Gerus (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482680#comment-16482680
 ] 

Alexander Gerus edited comment on IGNITE-8524 at 5/22/18 9:58 AM:
--

[~ivan.glukos], Hi Ivan, do you have any forecasted date for the task to be 
completed?

Many thanks


was (Author: agerus):
[~ivan.glukos], Hi Ivan, do you have any forecasted date for the task to be 
completed? Our client is waiting for this spec.

Many thanks

> Document consistency check utilities
> 
>
> Key: IGNITE-8524
> URL: https://issues.apache.org/jira/browse/IGNITE-8524
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Denis Magda
>Assignee: Ivan Rakov
>Priority: Critical
> Fix For: 2.5
>
>
> Ignite 2.5 will ship with special consistency check utilities that, for 
> instance, ensure that the data stays consistent across backups, that indexes are 
> correct, and many other things. More details can be taken from here:
> * https://issues.apache.org/jira/browse/IGNITE-8277
> * https://issues.apache.org/jira/browse/IGNITE-7467
> Here is an empty page that was created for the documentation: 
> https://apacheignite.readme.io/docs/consistency-check-utilities



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-8530) Exchange hangs during start/stop stress test

2018-05-22 Thread Alexander Gerus (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-8530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482566#comment-16482566
 ] 

Alexander Gerus edited comment on IGNITE-8530 at 5/22/18 9:55 AM:
--

[~akalashnikov] Can you please help with the analysis of this issue?


was (Author: agerus):
[~akalashnikov] Can you please help with analysis for the issue. It is really 
critical for multiple clients

> Exchange hangs during start/stop stress test
> 
>
> Key: IGNITE-8530
> URL: https://issues.apache.org/jira/browse/IGNITE-8530
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.4
>Reporter: Mikhail Cherkasov
>Assignee: Anton Kalashnikov
>Priority: Major
> Attachments: LocalRunner.java, Main2.java
>
>
> Please see the attached test: it first starts N_CORES*2+2 nodes and then 
> starts N_CORES*2 threads, each running a while(true) loop that stops and starts 
> nodes with a small random pause.
> After a couple of minutes it hangs with "Failed to wait for partition map exchange".
>  
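
A hedged sketch of the start/stop stress loop described above (the attached 
LocalRunner.java/Main2.java are the actual test; the class below only illustrates the 
pattern, and node configuration details are omitted):

{code:java}
import java.util.concurrent.ThreadLocalRandom;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class StartStopStress {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // Static part of the topology: N_CORES*2+2 nodes.
        for (int i = 0; i < cores * 2 + 2; i++)
            Ignition.start(new IgniteConfiguration().setIgniteInstanceName("static-" + i));

        // N_CORES*2 threads, each restarting its own node in a loop with a small random pause.
        for (int t = 0; t < cores * 2; t++) {
            final int id = t;

            new Thread(() -> {
                while (true) {
                    Ignite node = Ignition.start(
                        new IgniteConfiguration().setIgniteInstanceName("restarting-" + id));

                    try {
                        Thread.sleep(ThreadLocalRandom.current().nextLong(500));
                    }
                    catch (InterruptedException e) {
                        return;
                    }

                    node.close();
                }
            }).start();
        }
    }
}
{code}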



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8524) Document consistency check utilities

2018-05-21 Thread Alexander Gerus (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482680#comment-16482680
 ] 

Alexander Gerus commented on IGNITE-8524:
-

[~ivan.glukos], Hi Ivan, do you have any forecasted date for the task to be 
completed? Our client is waiting for this spec.

Many thanks

> Document consistency check utilities
> 
>
> Key: IGNITE-8524
> URL: https://issues.apache.org/jira/browse/IGNITE-8524
> Project: Ignite
>  Issue Type: Task
>  Components: documentation
>Reporter: Denis Magda
>Assignee: Ivan Rakov
>Priority: Critical
> Fix For: 2.5
>
>
> Ignite 2.5 will ship with special consistency check utilities that, for 
> instance, ensure that the data stays consistent across backups, that indexes are 
> correct, and many other things. More details can be taken from here:
> * https://issues.apache.org/jira/browse/IGNITE-8277
> * https://issues.apache.org/jira/browse/IGNITE-7467
> Here is an empty page that was created for the documentation: 
> https://apacheignite.readme.io/docs/consistency-check-utilities



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8530) Exchange hangs during start/stop stress test

2018-05-21 Thread Alexander Gerus (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-8530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482566#comment-16482566
 ] 

Alexander Gerus commented on IGNITE-8530:
-

[~akalashnikov] Can you please help with analysis for the issue. It is really 
critical for multiple clients

> Exchange hangs during start/stop stress test
> 
>
> Key: IGNITE-8530
> URL: https://issues.apache.org/jira/browse/IGNITE-8530
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Affects Versions: 2.4
>Reporter: Mikhail Cherkasov
>Assignee: Anton Kalashnikov
>Priority: Major
> Attachments: LocalRunner.java, Main2.java
>
>
> Please see the attached test: it first starts N_CORES*2+2 nodes and then 
> starts N_CORES*2 threads, each running a while(true) loop that stops and starts 
> nodes with a small random pause.
> After a couple of minutes it hangs with "Failed to wait for partition map exchange".
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)