[jira] [Updated] (IGNITE-13105) Spring data 2.0 IgniteRepositoryQuery#transformQueryCursor contains cursor leak

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov updated IGNITE-13105:
--
Issue Type: Bug  (was: Improvement)

> Spring data 2.0 IgniteRepositoryQuery#transformQueryCursor contains cursor 
> leak
> ---
>
> Key: IGNITE-13105
> URL: https://issues.apache.org/jira/browse/IGNITE-13105
> Project: Ignite
>  Issue Type: Bug
>  Components: springdata
>Affects Versions: 2.8.1
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Critical
> Fix For: 2.9
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This code produces a cursor leak in RunningQueryManager if the result set 
> contains one or more rows:
> {code}
> case ONE_VALUE:
>     Iterator iter = qryIter.iterator();
>
>     if (iter.hasNext())
>         return iter.next().get(0);
>
>     return null;
> {code}
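The leak happens because the query cursor behind qryIter is never closed on this path. Below is a minimal sketch of the fix shape, not the committed patch: it uses a simplified stand-in Cursor class instead of Ignite's actual QueryCursor (which does extend Iterable and AutoCloseable), and extracts the single value inside try-with-resources so the cursor is always closed.

```java
import java.util.Iterator;
import java.util.List;

public class CursorLeakFix {
    /** Simplified stand-in for Ignite's QueryCursor (which extends Iterable and AutoCloseable). */
    static class Cursor<T> implements Iterable<T>, AutoCloseable {
        private final List<T> rows;

        boolean closed;

        Cursor(List<T> rows) { this.rows = rows; }

        @Override public Iterator<T> iterator() { return rows.iterator(); }

        @Override public void close() { closed = true; }
    }

    /**
     * ONE_VALUE extraction that always releases the cursor: try-with-resources
     * guarantees close() runs whether a row was found or not.
     */
    static <T> T oneValue(Cursor<T> cur) {
        try (Cursor<T> c = cur) {
            Iterator<T> it = c.iterator();

            return it.hasNext() ? it.next() : null;
        }
    }
}
```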



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (IGNITE-10914) Web Console: Account email should be unique

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-10914.
-

> Web Console: Account email should be unique
> ---
>
> Key: IGNITE-10914
> URL: https://issues.apache.org/jira/browse/IGNITE-10914
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Affects Versions: 2.7
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Major
> Fix For: 2.8
>
> Attachments: screenshot-1.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I found that if a user clicks "Sign up" several times, several identical 
> accounts are created, but only one should be allowed.
>  
> To fix this we need to add a unique index on the "email" field and prepare a 
> migration that removes possible duplicates.
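The deduplication step of such a migration can be sketched as follows. This is an illustrative sketch only: the Account type below is hypothetical, and the real Web Console migration operates on its own account store. The idea is to keep the first account seen per email and drop the rest, so the unique index can then be created safely.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AccountDedup {
    /** Hypothetical minimal account shape; the real account store has more fields. */
    static final class Account {
        final String id;
        final String email;

        Account(String id, String email) {
            this.id = id;
            this.email = email;
        }
    }

    /** Keeps the first account per email so a unique index on "email" can be created safely. */
    static List<Account> dedupeByEmail(List<Account> accounts) {
        Set<String> seen = new HashSet<>();
        List<Account> kept = new ArrayList<>();

        for (Account acc : accounts) {
            // Set.add returns false for an email that was already kept -> duplicate to drop.
            if (seen.add(acc.email))
                kept.add(acc);
        }

        return kept;
    }
}
```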





[jira] [Closed] (IGNITE-11166) Web Agent: Host name verifier should be disabled in case of "-Dtrust.all=true".

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-11166.
-

> Web Agent: Host name verifier should be disabled in case of 
> "-Dtrust.all=true".
> ---
>
> Key: IGNITE-11166
> URL: https://issues.apache.org/jira/browse/IGNITE-11166
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Affects Versions: 2.7
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Critical
> Fix For: 2.8
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is a regression after IGNITE-9845.





[jira] [Closed] (IGNITE-11338) Web Console: Current cache "jump" after edit on "Basic configuration" screen

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-11338.
-

> Web Console: Current cache "jump" after edit on "Basic configuration" screen
> 
>
> Key: IGNITE-11338
> URL: https://issues.apache.org/jira/browse/IGNITE-11338
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 4h 19m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> # Create a new cluster.
> # Add 3 caches (just click 3 times on "Add new cache").
> # Click on the first cache (named "Cache") and append a letter "x" to 
> its name.
> # The cache will "jump" to the end of the list.
> # The current "editable" item becomes the cache named "Cache1".





[jira] [Updated] (IGNITE-13104) Spring data 2.0 IgniteRepositoryImpl#deleteAllById contains wrong code

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov updated IGNITE-13104:
--
Issue Type: Bug  (was: Improvement)

> Spring data 2.0 IgniteRepositoryImpl#deleteAllById contains wrong code
> --
>
> Key: IGNITE-13104
> URL: https://issues.apache.org/jira/browse/IGNITE-13104
> Project: Ignite
>  Issue Type: Bug
>  Components: springdata
>Affects Versions: 2.8.1
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code}
> /** {@inheritDoc} */
> @Override public void deleteAllById(Iterable ids) {
>     if (ids instanceof Set)
>         cache.removeAll((Set)ids);
>
>     if (ids instanceof Collection)
>         cache.removeAll(new HashSet<>((Collection)ids));
>
>     TreeSet keys = new TreeSet<>();
>     for (ID id : ids)
>         keys.add(id);
>
>     cache.removeAll(keys);
> }
> {code}
> As you can see, cache.removeAll may be executed THREE times in some situations.
> Also, this method can throw a ClassCastException if the ids collection 
> contains objects that do not implement the Comparable interface.
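A possible shape of the fix (a sketch, not the committed patch): return after the first matching branch so removeAll runs exactly once, and collect keys into a HashSet, which does not require elements to implement Comparable. The KeyRemover interface below is a hypothetical stand-in for the cache.removeAll call:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class DeleteAllByIdFix {
    /** Hypothetical stand-in for the cache.removeAll(Set) call. */
    interface KeyRemover<ID> {
        void removeAll(Set<? extends ID> keys);
    }

    /** Removes the given ids, invoking removeAll exactly once per call. */
    static <ID> void deleteAllById(Iterable<? extends ID> ids, KeyRemover<ID> cache) {
        if (ids instanceof Set) {
            cache.removeAll((Set<? extends ID>)ids);

            return; // Early return so the following branches are not executed too.
        }

        if (ids instanceof Collection) {
            cache.removeAll(new HashSet<ID>((Collection<? extends ID>)ids));

            return;
        }

        // HashSet instead of TreeSet: ids need not implement Comparable.
        Set<ID> keys = new HashSet<>();

        for (ID id : ids)
            keys.add(id);

        cache.removeAll(keys);
    }
}
```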





[jira] [Closed] (IGNITE-10609) Failed to remove cache from service configuration.

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-10609.
-

> Failed to remove cache from service configuration.
> --
>
> Key: IGNITE-10609
> URL: https://issues.apache.org/jira/browse/IGNITE-10609
> Project: Ignite
>  Issue Type: Bug
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Pavel Konstantinov
>Priority: Major
> Fix For: 2.8
>
>
> # Add a service configuration for some cluster with caches.
> # Select any cache.
> There is no way to remove the cache selection from the service 
> configuration.





[jira] [Closed] (IGNITE-10552) Web Agent: Improve logging when cluster topology changed

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-10552.
-

> Web Agent: Improve logging when cluster topology changed
> 
>
> Key: IGNITE-10552
> URL: https://issues.apache.org/jira/browse/IGNITE-10552
> Project: Ignite
>  Issue Type: Task
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Pavel Konstantinov
>Priority: Major
> Fix For: 2.8
>
>
> Web Agent can be configured with several REST endpoint URLs in order to 
> try connecting to the next address when the current one is no longer available.





[jira] [Closed] (IGNITE-10433) Web Console: "Import models" dialog doesn't unsubscribe from watching agent after closing

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-10433.
-

> Web Console: "Import models" dialog doesn't unsubscribe from watching agent 
> after closing
> -
>
> Key: IGNITE-10433
> URL: https://issues.apache.org/jira/browse/IGNITE-10433
> Project: Ignite
>  Issue Type: Task
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Pavel Konstantinov
>Priority: Major
> Fix For: 2.8
>
>
> Noticed the following behaviour:
>  # Open the configuration page or the cluster edit page.
>  # Click import while the agent is not running.
>  # Click the *Back* button to close the *Connection to Ignite Web Agent is not 
> established* dialog.
>  # Start and stop the web agent.
> The *Connection to Ignite Web Agent is not established* dialog is shown again.





[jira] [Closed] (IGNITE-10245) o.a.i.internal.util.nio.ssl.GridNioSslFilter failed with Assertion if invalid SSL Cipher suite name specified

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-10245.
-

> o.a.i.internal.util.nio.ssl.GridNioSslFilter failed with Assertion if invalid 
> SSL Cipher suite name specified
> -
>
> Key: IGNITE-10245
> URL: https://issues.apache.org/jira/browse/IGNITE-10245
> Project: Ignite
>  Issue Type: Task
>Reporter: Alexey Kuznetsov
>Assignee: Ryabov Dmitrii
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is related to IGNITE-10189.
> In case of an invalid cipher suite name, GridNioSslFilter fails with an assertion 
> in the org.apache.ignite.internal.util.nio.ssl.GridNioSslFilter#sslHandler method.
> Needs investigation and a fix.
>  
> See test: ClientSslParametersTest.testNonExistentCipherSuite()





[jira] [Closed] (IGNITE-9887) A lot of exceptions on cluster deactivation.

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-9887.


> A lot of exceptions on cluster deactivation.
> 
>
> Key: IGNITE-9887
> URL: https://issues.apache.org/jira/browse/IGNITE-9887
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Alexey Kuznetsov
>Assignee: Vladimir Ozerov
>Priority: Blocker
> Fix For: 2.7
>
>
> Start a node with caches configured via config.
> Activate the cluster.
> Deactivate the cluster - the following exception is thrown for almost all caches.
> {code}
> [2018-10-15 
> 10:31:56,715][ERROR][exchange-worker-#44%tester%][IgniteH2Indexing] Failed to 
> drop schema on cache stop (will ignore): c_partitioned
> class org.apache.ignite.internal.processors.query.IgniteSQLException: Failed 
> to execute statement: DROP SCHEMA IF EXISTS "c_partitioned"
>   at 
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSystemStatement(IgniteH2Indexing.java:763)
>   at 
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.dropSchema(IgniteH2Indexing.java:715)
>   at 
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.unregisterCache(IgniteH2Indexing.java:3577)
>   at 
> org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop0(GridQueryProcessor.java:1693)
>   at 
> org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop(GridQueryProcessor.java:885)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCache(GridCacheProcessor.java:1375)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStop(GridCacheProcessor.java:2336)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCacheStopRequestOnExchangeDone(GridCacheProcessor.java:2488)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2557)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1921)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3265)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:2973)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1321)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:789)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2657)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2529)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.h2.jdbc.JdbcSQLException: Невозможно удалить "c_partitioned", 
> пока существует зависимый объект "SLEEP, SQUARE, UPPERCASE, _LENGTH_, _CUBE_"
> Cannot drop "c_partitioned" because "SLEEP, SQUARE, UPPERCASE, _LENGTH_, 
> _CUBE_" depends on it; SQL statement:
> DROP SCHEMA IF EXISTS "c_partitioned" [90107-197]
>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:357)
>   at org.h2.message.DbException.get(DbException.java:179)
>   at org.h2.command.ddl.DropSchema.update(DropSchema.java:59)
>   at org.h2.command.CommandContainer.update(CommandContainer.java:102)
>   at org.h2.command.Command.executeUpdate(Command.java:261)
>   at 
> org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:169)
>   at org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:126)
>   at 
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSystemStatement(IgniteH2Indexing.java:758)
>   ... 17 more
> {code}





[jira] [Closed] (IGNITE-8554) Cache metrics: expose metrics with rebalance info about keys

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-8554.


> Cache metrics: expose metrics with rebalance info about keys
> 
>
> Key: IGNITE-8554
> URL: https://issues.apache.org/jira/browse/IGNITE-8554
> Project: Ignite
>  Issue Type: Improvement
>  Components: cache
>Reporter: Alexey Kuznetsov
>Assignee: Ilya Lantukh
>Priority: Major
> Fix For: 2.7
>
>
> In order to show info about rebalance progress, we need to expose the 
> estimatedRebalancingKeys and rebalancedKeys metrics.





[jira] [Closed] (IGNITE-9839) Web Console: update to RxJS 6

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-9839.


> Web Console: update to RxJS 6
> -
>
> Key: IGNITE-9839
> URL: https://issues.apache.org/jira/browse/IGNITE-9839
> Project: Ignite
>  Issue Type: Task
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Pavel Konstantinov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 7h 7m
>  Remaining Estimate: 0h
>
> Since RxJS 6 is required by the latest version of Angular, we won't be able to 
> proceed with the UI framework migration without it. To do: update import paths 
> and convert to the pipe API.





[jira] [Closed] (IGNITE-8518) Web Console: Auto focus "Confirm" button in Confirmation dialog

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-8518.


> Web Console: Auto focus "Confirm" button in Confirmation dialog
> ---
>
> Key: IGNITE-8518
> URL: https://issues.apache.org/jira/browse/IGNITE-8518
> Project: Ignite
>  Issue Type: Improvement
>  Components: wizards
>Reporter: Alexey Kuznetsov
>Assignee: Pavel Konstantinov
>Priority: Major
> Fix For: 2.8
>
>
> This will allow confirming the dialog from the keyboard.





[jira] [Closed] (IGNITE-6464) ignite.active() == true, but ignite.utilityCache() may return null

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-6464.


> ignite.active() == true, but ignite.utilityCache() may return null
> --
>
> Key: IGNITE-6464
> URL: https://issues.apache.org/jira/browse/IGNITE-6464
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Alexey Kuznetsov
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.4
>
>






[jira] [Closed] (IGNITE-5539) MemoryMetrics.getTotalAllocatedPages return 0 when persistence is enabled

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-5539.


> MemoryMetrics.getTotalAllocatedPages return 0 when persistence is enabled
> -
>
> Key: IGNITE-5539
> URL: https://issues.apache.org/jira/browse/IGNITE-5539
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.1
>Reporter: Alexey Kuznetsov
>Assignee: Sergey Chugunov
>Priority: Major
>  Labels: iep-6
>
> In memory-only mode the metrics show non-zero values.
> With persistence enabled they show zero.





[jira] [Closed] (IGNITE-5095) NULL strings in REST-HTTP should be serialized as null

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-5095.


> NULL strings in REST-HTTP should be serialized as null
> --
>
> Key: IGNITE-5095
> URL: https://issues.apache.org/jira/browse/IGNITE-5095
> Project: Ignite
>  Issue Type: Task
>  Components: clients
>Reporter: Alexey Kuznetsov
>Assignee: Sergey Kozlov
>Priority: Major
> Fix For: 2.0
>
>






[jira] [Closed] (IGNITE-4350) Cache JDBC POJO store: improve default data transformation

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-4350.


> Cache JDBC POJO store: improve default data transformation
> --
>
> Key: IGNITE-4350
> URL: https://issues.apache.org/jira/browse/IGNITE-4350
> Project: Ignite
>  Issue Type: Task
>  Components: cache
>Affects Versions: 1.7
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Major
> Fix For: 2.0
>
>
> Improve the JdbcTypesDefaultTransformer logic for the case when a database 
> column is declared as some TYPE1 while the same column is declared as some 
> TYPE2 in the POJO store. We could try to handle such cases out of the box.





[jira] [Closed] (IGNITE-3964) SQL: implement support for custom table name

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-3964.


> SQL: implement support for custom table name
> 
>
> Key: IGNITE-3964
> URL: https://issues.apache.org/jira/browse/IGNITE-3964
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Affects Versions: 1.6
>Reporter: Alexey Kuznetsov
>Assignee: Andrey Mashenkov
>Priority: Major
> Fix For: 1.9
>
>
> We have the ability to specify aliases for columns via 
> org.apache.ignite.cache.QueryEntity#getAliases.
> But what about an alias for the table name? This could be useful when moving a 
> legacy application with a lot of SQL statements to Ignite.





[jira] [Closed] (IGNITE-1178) NPE in GridCacheProcessor.onKernalStop()

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-1178.


> NPE in GridCacheProcessor.onKernalStop()
> 
>
> Key: IGNITE-1178
> URL: https://issues.apache.org/jira/browse/IGNITE-1178
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 1.1.4
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Kuznetsov
>Priority: Minor
> Fix For: 2.0
>
>
> If a user starts a node with incorrectly configured cache type metadata for 
> JdbcPojoStore, an NPE is raised in onKernalStop. See the stacktrace:
> {code}
> org.apache.ignite.IgniteCheckedException: Failed to initialize property
> 'exceptionOid' for key class 'class
> org.apache.ignite.examples.algofusion.ExceptionMasterKey' and value class
> 'class org.apache.ignite.examples.algofusion.ExceptionMaster'. Make sure
> that one of these classes contains respective getter method or field.
> at
> org.apache.ignite.internal.processors.query.GridQueryProcessor.buildClassProperty(GridQueryProcessor.java:1342)
> at
> org.apache.ignite.internal.processors.query.GridQueryProcessor.processClassMeta(GridQueryProcessor.java:1148)
> at
> org.apache.ignite.internal.processors.query.GridQueryProcessor.initializeCache(GridQueryProcessor.java:149)
> at
> org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart(GridQueryProcessor.java:249)
> at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:922)
> at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:779)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:829)
> at
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1538)
> at
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1405)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:931)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:477)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:458)
> at org.apache.ignite.Ignition.start(Ignition.java:321)
> at org.apache.ignite.examples.algofusion.AlgoDB.main(AlgoDB.java:89)
> [15:45:29,007][ERROR][main][IgniteKernal] Failed to pre-stop processor:
> GridProcessorAdapter []
> java.lang.NullPointerException
> at
> org.apache.ignite.internal.processors.cache.GridCacheEventManager.isRecordable(GridCacheEventManager.java:342)
> at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:1089)
> at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:896)
> at 
> org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1706)
> at 
> org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1650)
> at 
> org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:852)
> at
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1538)
> at
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1405)
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:931)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:477)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:458)
> at org.apache.ignite.Ignition.start(Ignition.java:321)
> at org.apache.ignite.examples.algofusion.AlgoDB.main(AlgoDB.java:89)
> [15:45:29] Ignite node stopped wih ERRORS [uptime=00:00:02:515]
> {code}





[jira] [Closed] (IGNITE-3592) Provide some kind of pluggable compression SPI support

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-3592.


> Provide some kind of pluggable compression SPI support
> --
>
> Key: IGNITE-3592
> URL: https://issues.apache.org/jira/browse/IGNITE-3592
> Project: Ignite
>  Issue Type: Task
>  Components: cache
>Reporter: Alexey Kuznetsov
>Assignee: Vyacheslav Daradur
>Priority: Major
> Fix For: 2.3
>
>
> It may be useful in some cases to compress data stored in a cache.
> In order to give the SQL engine access to compressed data, this support 
> should be implemented at the ignite-core level.
> See the discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html





[jira] [Closed] (IGNITE-1152) Distribution of backup partitions is not uniform

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-1152.


> Distribution of backup partitions is not uniform
> 
>
> Key: IGNITE-1152
> URL: https://issues.apache.org/jira/browse/IGNITE-1152
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 1.1.4
>Reporter: Alexey Kuznetsov
>Assignee: Alexey Goncharuk
>Priority: Minor
> Attachments: CacheBackupPartitionsTest.java
>
>
> I started 4 nodes with a partitioned cache with 1 backup
> and found that primary partitions are distributed more or less uniformly, but 
> backup partitions are not:
> Node n1: : pri=244, bak=367
> Node n2: : pri=260, bak=590
> Node n3: : pri=244, bak=367
> Node n4: : pri=260, bak=590
> Code to test this issue is attached.





[jira] [Closed] (IGNITE-844) Create tests for queries configured via CacheTypeMetadata

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-844.
---

> Create tests for queries configured via CacheTypeMetadata
> -
>
> Key: IGNITE-844
> URL: https://issues.apache.org/jira/browse/IGNITE-844
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Affects Versions: sprint-1
>Reporter: Alexey Kuznetsov
>Priority: Major
>






[jira] [Closed] (IGNITE-755) Add example for Query + CacheTypeMetadata

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-755.
---

> Add example for Query + CacheTypeMetadata
> -
>
> Key: IGNITE-755
> URL: https://issues.apache.org/jira/browse/IGNITE-755
> Project: Ignite
>  Issue Type: Task
>Reporter: Alexey Kuznetsov
>Priority: Major
> Attachments: 
> 0001-IGNITE-755-Implemented-example-of-cache-type-metadat.patch
>
>
> Check content for page 
> https://dash.readme.io/project/apacheignite/v1.3/suggested/update/556e697efc3aa80d00e1a8f2





[jira] [Closed] (IGNITE-735) Need to add support for dynamic indexes

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-735.
---

> Need to add support for dynamic indexes
> ---
>
> Key: IGNITE-735
> URL: https://issues.apache.org/jira/browse/IGNITE-735
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Reporter: Alexey Kuznetsov
>Assignee: Sergei Vladykin
>Priority: Major
> Fix For: 2.0
>
>
> We should support dynamic index creation and deletion.





[jira] [Closed] (IGNITE-369) Cache manager should switch cache statisticsEnabled property globaly

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-369.
---

Feature released.

> Cache manager should switch cache statisticsEnabled property globaly
> 
>
> Key: IGNITE-369
> URL: https://issues.apache.org/jira/browse/IGNITE-369
> Project: Ignite
>  Issue Type: Task
>Affects Versions: sprint-2
>Reporter: Alexey Kuznetsov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-6
> Fix For: 2.4
>
>
> Also, new nodes joining the grid should be taken care of:
> a new node could have a statisticsEnabled value opposite to that of the nodes 
> already in the grid.





[jira] [Resolved] (IGNITE-186) Access to IgniteFS via Visorcmd

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov resolved IGNITE-186.
-
Resolution: Won't Fix

IGFS was dropped from Ignite.

> Access to IgniteFS via Visorcmd
> ---
>
> Key: IGNITE-186
> URL: https://issues.apache.org/jira/browse/IGNITE-186
> Project: Ignite
>  Issue Type: Task
>  Components: UI
>Affects Versions: sprint-1
>Reporter: Alexey Kuznetsov
>Priority: Major
>
> Now IgniteFS is part of Ignite-fabric and doesn't need hadoop libs anymore.
> But hadoop provided access to IgniteFS via hadoop fs. 
> I suggest introducing the following:
> 1. Add support for filesystem commands in visorcmd, similar to hadoop fs.
> 2. Add a command-line option to visorcmd to execute one command and exit, 
> for instance:
> bin/visorcmd.sh -e mkdir /123





[jira] [Closed] (IGNITE-186) Access to IgniteFS via Visorcmd

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-186.
---

> Access to IgniteFS via Visorcmd
> ---
>
> Key: IGNITE-186
> URL: https://issues.apache.org/jira/browse/IGNITE-186
> Project: Ignite
>  Issue Type: Task
>  Components: UI
>Affects Versions: sprint-1
>Reporter: Alexey Kuznetsov
>Priority: Major
>
> Now IgniteFS is part of Ignite-fabric and doesn't need hadoop libs anymore.
> But hadoop provided access to IgniteFS via hadoop fs. 
> I suggest introducing the following:
> 1. Add support for filesystem commands in visorcmd, similar to hadoop fs.
> 2. Add a command-line option to visorcmd to execute one command and exit, 
> for instance:
> bin/visorcmd.sh -e mkdir /123





[jira] [Closed] (IGNITE-184) visorcmd: improve auto-calculation of width of table columns

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov closed IGNITE-184.
---

> visorcmd: improve auto-calculation of width of table columns
> 
>
> Key: IGNITE-184
> URL: https://issues.apache.org/jira/browse/IGNITE-184
> Project: Ignite
>  Issue Type: Task
>  Components: UI
>Affects Versions: sprint-1
>Reporter: Alexey Kuznetsov
>Priority: Major
> Attachments: gg-8723.png
>
>
> I suggest using the following logic:
> if a parameter value is a set of values separated by spaces, then print each 
> value on a new line.
> Use the terminal width to determine the maximum width of table columns.
> See the attached screenshot.





[jira] [Resolved] (IGNITE-184) visorcmd: improve auto-calculation of width of table columns

2020-06-16 Thread Alexey Kuznetsov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kuznetsov resolved IGNITE-184.
-
Resolution: Won't Fix

VisorCMD will be replaced by control.sh soon.

> visorcmd: improve auto-calculation of width of table columns
> 
>
> Key: IGNITE-184
> URL: https://issues.apache.org/jira/browse/IGNITE-184
> Project: Ignite
>  Issue Type: Task
>  Components: UI
>Affects Versions: sprint-1
>Reporter: Alexey Kuznetsov
>Priority: Major
> Attachments: gg-8723.png
>
>
> I suggest using the following logic:
> if a parameter value is a set of values separated by spaces, then print each 
> value on a new line.
> Use the terminal width to determine the maximum width of table columns.
> See the attached screenshot.





[jira] [Commented] (IGNITE-13033) Java thin client: Service invocation

2020-06-16 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138018#comment-17138018
 ] 

Ignite TC Bot commented on IGNITE-13033:


{panel:title=Branch: [pull/7908/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5380075&buildTypeId=IgniteTests24Java8_RunAll]

> Java thin client: Service invocation
> 
>
> Key: IGNITE-13033
> URL: https://issues.apache.org/jira/browse/IGNITE-13033
> Project: Ignite
>  Issue Type: New Feature
>  Components: thin client
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-46
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Provide an API to invoke Ignite Services from the Java thin client.
> Protocol changes and all implementation details described in IEP-46.





[jira] [Commented] (IGNITE-13051) Optimized affinity switch on baseline node left is broken for client topology and MVCC

2020-06-16 Thread Maksim Timonin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137944#comment-17137944
 ] 

Maksim Timonin commented on IGNITE-13051:
-

[~ascherbakov] Hi! Did you have a chance to check the PR?

> Optimized affinity switch on baseline node left is broken for client topology 
> and MVCC
> --
>
> Key: IGNITE-13051
> URL: https://issues.apache.org/jira/browse/IGNITE-13051
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexey Scherbakov
>Assignee: Maksim Timonin
>Priority: Critical
> Fix For: 2.9
>
> Attachments: reproducer.patch, Снимок экрана 2020-06-03 в 
> 19.41.53.png, Снимок экрана 2020-06-03 в 19.42.44.png
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> If a node contains only client cache topology with MVCC enabled, PME will 
> hang after changes introduced in IGNITE-12617.
> Reproduced by 
> CachePartitionLossWithRestartsTest.testPartitionLossDetectionOnClientTopology[0
>  false false -1] and enabled MVCC.





[jira] [Created] (IGNITE-13157) Upgrade build process for Javadocs

2020-06-16 Thread Mauricio Stekl (Jira)
Mauricio Stekl created IGNITE-13157:
---

 Summary: Upgrade build process for Javadocs
 Key: IGNITE-13157
 URL: https://issues.apache.org/jira/browse/IGNITE-13157
 Project: Ignite
  Issue Type: Improvement
  Components: documentation
Reporter: Mauricio Stekl
 Attachments: gsc_mobile_errors.png

I am reaching out to you all to see if you could help fix some issues with 
mobile usability in the Javadocs sections of the Ignite website.
 
I know you might think most users will not read the API docs on their 
smartphones, and that is true. But as you can see in the attached screenshot 
gsc_mobile_errors.png, Google itself reports mobile-usability errors through 
Google Search Console (GSC). This obviously hurts our search performance, as 
Google gives more and more priority to mobile-friendly websites.
For the Apache Ignite website, the largest part of the site is affected: only 
around 30 pages are mobile friendly, while 1000+ pages are not.
 
Specifically for the Javadocs, the main problem is that they use old HTML 
frames for layout and navigation, among other markup bad practices, which 
makes it impossible to update the CSS to be responsive on small screens. From 
what I was able to find [[1]|https://bugs.openjdk.java.net/browse/JDK-8215599] 
[[2]|https://bugs.openjdk.java.net/browse/JDK-8196202], starting with JDK 9 
there is a '--no-frames' option that fixes this problem, and with version 11 
this option is enabled by default, along with other improvements. You can see 
how the new layout looks here: 
[https://docs.oracle.com/en/java/javase/11/docs/api/index.html]
 
Would it be possible to upgrade Java to version 11 only for building the 
Javadocs? This would be a great starting point for fixing these problems. 
Later we could update the .css to adjust font sizes and other details.





[jira] [Resolved] (IGNITE-12003) Java thin client fails to get object with compact footer from cache

2020-06-16 Thread Ivan Pavlukhin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Pavlukhin resolved IGNITE-12003.
-
Resolution: Duplicate

> Java thin client fails to get object with compact footer from cache
> ---
>
> Key: IGNITE-12003
> URL: https://issues.apache.org/jira/browse/IGNITE-12003
> Project: Ignite
>  Issue Type: Bug
>  Components: thin client
>Affects Versions: 2.7.5
>Reporter: Ivan Pavlukhin
>Priority: Major
> Attachments: ThinClientCompactFooterDeserializationProblem.java
>
>
> Experiment:
> 1. Start a server.
> 2. Start a thin client configured to use the compact footer. Put some Pojo 
> into a cache.
> 3. Start another client (compact footer enabled). Try to read the Pojo saved 
> in the previous step (the Pojo class is available).
> Result -- {{BinaryObjectException("Cannot find metadata for object with 
> compact footer: " + typeId)}}.
> See attached reproducer [^ThinClientCompactFooterDeserializationProblem.java].





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:03 PM:
-

[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test that fails in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test which fail in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.
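The worst-case arithmetic in the description can be sketched in plain Java. This is a minimal sketch for illustration only: the `connCheckInterval` method and the `/ 3` divisor are assumptions, not the actual patch.

```java
/** Sketch of the worst-case detection delay and a timeout-derived check interval. */
public class DetectionDelay {
    /** Fixed interval used before the fix (mirrors ServerImpl.CON_CHECK_INTERVAL). */
    static final long CON_CHECK_INTERVAL = 500;

    /** Worst case: a failure right after a check waits a full interval plus the timeout. */
    static long worstCaseDelay(long failureDetectionTimeout) {
        return CON_CHECK_INTERVAL + failureDetectionTimeout;
    }

    /** Hypothetical fix: derive the check interval from the timeout instead of a constant. */
    static long connCheckInterval(long failureDetectionTimeout) {
        return Math.min(CON_CHECK_INTERVAL, failureDetectionTimeout / 3);
    }

    public static void main(String[] args) {
        // With failureDetectionTimeout = 300 ms, the old worst case is 800 ms,
        // while a timeout-derived interval would check every 100 ms.
        System.out.println(worstCaseDelay(300));     // prints 800
        System.out.println(connCheckInterval(300));  // prints 100
    }
}
```

This illustrates why a 500 ms constant defeats a 300 ms failure detection timeout: the interval alone can exceed the timeout the user configured.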





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:03 PM:
-

[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* A test that fails in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test which fail in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:02 PM:
-

[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test that fails in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:02 PM:
-

[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test that fails in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

# JmhNodeFailureDetection. Not an ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

# A test which fail in unfixed code.
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:01 PM:
-

[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:text}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:text}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:text}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:01 PM:
-

[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:text}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout, 
> and the node ping routine is duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent 
> message. The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection check interval depend on the failure detection 
> timeout (FDT). The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, 
> this will become even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period expires. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not worry the user with “Node seems disconnected” when everything is 
> OK. Once we do fixes 1 and 3, this will become even more useless. 
> The node may log on INFO: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.





[jira] [Comment Edited] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin edited comment on IGNITE-13012 at 6/16/20, 5:01 PM:
-

[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because 
we have to start/wait/fail a node, the detection time is only a small piece of 
each run, so the fixed/not-fixed results are close. I made my own runs and 
collected the timings to prepare the output.

You can find this in the output of the fix (for example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}


was (Author: vladsz83):
[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not ordinary JMH, I believe. Because we have to 
start/wait/fail node, the detection time is only small peice of each run. So, 
fixed/not-fixes results are close. I made own runs and collected timings to 
prepare the output. 

You can find in the output of the fix (example):
{code:java}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark  Mode  Cnt   Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   10,954  
ops/min
{code}

vs not-fixed:

{code:java}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark  Mode  Cnt  Score   Error
Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt   5,276  
ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is: 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout. 
> The node ping routine is also duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent message. 
> The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection-check interval depend on the failure detection timeout (FDT). 
> The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, this 
> becomes even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period is exhausted. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not alarm the user with “Node seems disconnected” when everything is OK. 
> Once we do fixes 1 and 3, this message becomes even more useless. 
> The node may log at INFO level: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.
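As a rough illustration of fix 2 above (deriving the connection-check interval from the failure detection timeout instead of the hard-coded 500 ms constant), a sketch might look like the following. The 1/3 ratio and the lower/upper bounds are hypothetical choices for illustration, not the actual patch:

```java
/**
 * Simplified sketch: derive the interval between connection-check messages
 * from the configured failure detection timeout instead of a constant.
 * The 1/3 ratio and the bounds below are hypothetical, not Ignite's actual values.
 */
public class ConnCheckIntervalSketch {
    /** Lower bound so the ring is not flooded with pings (ms). */
    private static final long MIN_INTERVAL = 50;

    /** Upper bound matching the old hard-coded constant (ms). */
    private static final long MAX_INTERVAL = 500;

    /** Returns the connection-check interval, in milliseconds. */
    public static long connCheckInterval(long failureDetectionTimeout) {
        // A fraction of the FDT leaves room for several checks before the timeout fires.
        long interval = failureDetectionTimeout / 3;

        return Math.min(MAX_INTERVAL, Math.max(MIN_INTERVAL, interval));
    }
}
```

With such a scheme a node configured with a 300 ms failure detection timeout would check the connection every 100 ms, so a failure is detected within roughly one FDT rather than FDT plus a fixed 500 ms.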



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136824#comment-17136824
 ] 

Vladimir Steshin commented on IGNITE-13012:
---

[~avinogradov], I've put the patch. It creates:

* JmhNodeFailureDetection. Not an ordinary JMH benchmark, I believe: because we 
have to start/wait/fail a node, the detection time is only a small piece of each 
run, so the fixed/not-fixed results are close. I made my own runs and collected 
the timings to prepare the output. 

Example output with the fix:
{code:text}
Detection delay: 294. Failure detection timeout: 300
Total detection delay: 5477

# Run complete. Total time: 00:01:23
Benchmark                                          Mode  Cnt    Score  Error  Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt        10,954         ops/min
{code}

vs not-fixed:

{code:text}
Detection delay: 539. Failure detection timeout: 300
Total detection delay: 11370

# Run complete. Total time: 00:01:41

Benchmark                                          Mode  Cnt   Score  Error  Units
JmhNodeFailureDetection.measureTotalForTheOutput  thrpt         5,276         ops/min
{code}

* 
{code:java}TcpDiscoveryNetworkIssuesTest.testNodeFailureDetectedWithinConfiguredTimeout(){code}

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is: 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout. 
> The node ping routine is also duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent message. 
> The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection-check interval depend on the failure detection timeout (FDT). 
> The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, this 
> becomes even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period is exhausted. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not alarm the user with “Node seems disconnected” when everything is OK. 
> Once we do fixes 1 and 3, this message becomes even more useless. 
> The node may log at INFO level: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13012) Fix failure detection timeout. Simplify node ping routine.

2020-06-16 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13012:
--
Attachment: IGNITE-13012-patch.patch

> Fix failure detection timeout. Simplify node ping routine.
> --
>
> Key: IGNITE-13012
> URL: https://issues.apache.org/jira/browse/IGNITE-13012
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-13012-patch.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> A connection failure may not be detected within 
> IgniteConfiguration.failureDetectionTimeout. The actual worst-case delay is: 
> ServerImpl.CON_CHECK_INTERVAL + IgniteConfiguration.failureDetectionTimeout. 
> The node ping routine is also duplicated.
> We should fix:
> 1. The failure detection timeout should take into account the last sent message. 
> The current ping is bound to its own timer:
> {code:java}ServerImpl.RingMessageWorker.lastTimeConnCheckMsgSent{code}
> This is odd because any discovery message checks the connection. 
> 2. Make the connection-check interval depend on the failure detection timeout (FDT). 
> The current value is a constant:
> {code:java}static int ServerImpl.CON_CHECK_INTERVAL = 500{code}
> 3. Remove the additional, quickened connection checking. Once we do fix 1, this 
> becomes even more useless.
> Although TCP discovery has a connection-check period, it may send a ping 
> before this period is exhausted. This premature node ping relies on the time of 
> any sent or even any received message. 
> 4. Do not alarm the user with “Node seems disconnected” when everything is OK. 
> Once we do fixes 1 and 3, this message becomes even more useless. 
> The node may log at INFO level: “Local node seems to be disconnected from topology …” 
> whereas it is not actually disconnected at all.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12903:
-
Fix Version/s: 2.9

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples, ml
>Reporter: Taras Ledkov
>Assignee: Alexey Zinoviev
>Priority: Major
> Fix For: 2.9
>
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to load the initial data into the cluster.
> This must be changed because the function is disabled by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Alexey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136757#comment-17136757
 ] 

Alexey Zinoviev commented on IGNITE-12903:
--

[~tledkov-gridgain] great, I'd prefer the third option, I suppose, as in the 
other examples.

I will wait for the implementation.

I reassigned the ticket to myself and will fix it for 2.9.

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples, ml
>Reporter: Taras Ledkov
>Assignee: Alexey Zinoviev
>Priority: Major
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to load the initial data into the cluster.
> This must be changed because the function is disabled by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Taras Ledkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136745#comment-17136745
 ] 

Taras Ledkov commented on IGNITE-12903:
---

[~zaleslaw], there is no well-implemented CSV import in Ignite right now.
The CSVREAD command is implemented for thin JDBC, but the CSV parser there is 
implemented incorrectly.
So to fix the examples I see several ways:
- wait for a good implementation of CSV import functionality (at least for thin 
JDBC, see IGNITE-12852);
- parse the CSV files in the example's code and use the SQL/cache API to insert data;
- populate the cache manually, not from CSV.
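The second option (parsing the CSV files in the example's own code and feeding the rows through the cache API) could be sketched roughly as follows. A plain Map stands in for IgniteCache here, and the column layout (an integer key followed by numeric features) is an assumption for illustration:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch: parse CSV rows in example code and insert them via a cache-like API.
 * A Map stands in for IgniteCache; with Ignite this would be cache.put(key, row).
 * The column layout (key, feature1, feature2, ...) is a hypothetical assumption.
 */
public class CsvToCacheSketch {
    /** Parses each "key, v1, v2, ..." line and stores the feature vector under the key. */
    public static Map<Integer, double[]> load(List<String> csvLines) {
        Map<Integer, double[]> cache = new LinkedHashMap<>();

        for (String line : csvLines) {
            if (line.trim().isEmpty())
                continue; // skip blank lines

            String[] cols = line.split(",");

            int key = Integer.parseInt(cols[0].trim());

            double[] row = new double[cols.length - 1];
            for (int i = 1; i < cols.length; i++)
                row[i - 1] = Double.parseDouble(cols[i].trim());

            cache.put(key, row); // IgniteCache#put in the real example
        }

        return cache;
    }
}
```

This keeps the examples independent of the CSVREAD SQL function, at the cost of a few lines of parsing code in each example.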

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples, ml
>Reporter: Taras Ledkov
>Assignee: Alexey Zinoviev
>Priority: Major
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to load the initial data into the cluster.
> This must be changed because the function is disabled by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (IGNITE-13151) Checkpointer code refactoring

2020-06-16 Thread Anton Kalashnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov reassigned IGNITE-13151:
--

Assignee: Anton Kalashnikov

> Checkpointer code refactoring
> -
>
> Key: IGNITE-13151
> URL: https://issues.apache.org/jira/browse/IGNITE-13151
> Project: Ignite
>  Issue Type: Sub-task
>  Components: persistence
>Reporter: Sergey Chugunov
>Assignee: Anton Kalashnikov
>Priority: Major
>
> The checkpointer is at the center of the Ignite persistence subsystem; the 
> more people from the community understand it, the more stable and efficient 
> it becomes.
> However, for now the checkpointer code sits inside the 
> GridCacheDatabaseSharedManager class and is entangled with this higher-level 
> and more general component.
> To take a step toward a more modular checkpointer we need to do two things:
>  # Move the checkpointer code out of the database manager into a separate class.
>  # Create a well-defined checkpointer API that will allow us to create new 
> checkpointer implementations in the future. An example is the new 
> checkpointer implementation needed for the defragmentation feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13109) Skip metastorage entries that can not be unmarshalled

2020-06-16 Thread Amelchev Nikita (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amelchev Nikita updated IGNITE-13109:
-
Description: 
We need to skip metastorage entries that cannot be unmarshalled (created by an 
old cluster version). Otherwise, nodes can't join the first started node:

{noformat}
[SEVERE][main][PersistenceBasicCompatibilityTest1] Got exception while starting 
(will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Failed to start manager: 
GridManagerAdapter [enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:2035)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1314)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2063)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1116)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:636)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:562)
at org.apache.ignite.Ignition.start(Ignition.java:328)
at 
org.apache.ignite.testframework.junits.multijvm.IgniteNodeRunner.main(IgniteNodeRunner.java:74)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI: 
TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, 
marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@77b14724], 
reconCnt=10, reconDelay=2000, maxAckTimeout=60, soLinger=5, 
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, 
skipAddrsRandomization=false]
at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:948)
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:2030)
... 8 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal 
key=ignite.testOldKey
at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2009)
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1116)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:427)
at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2111)
at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
... 10 more
{noformat}


  was:
Need to skip metastorage entries that can not be unmarshalled (created by the 
old cluster). It leads that nodes can't join to the first started node:

{noformat}
[SEVERE][main][PersistenceBasicCompatibilityTest1] Got exception while starting 
(will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Failed to start manager: 
GridManagerAdapter [enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:2035)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1314)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2063)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1116)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:636)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:562)
at org.apache.ignite.Ignition.start(Ignition.java:328)
at 
org.apache.ignite.testframework.junits.multijvm.IgniteNodeRunner.main(IgniteNodeRunner.java:74)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI: 
TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, 
marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@77b14724], 
reconCnt=10, reconDelay=2000, maxAckTimeout=60, soLinger=5, 
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, 
skipAddrsRandomization=false]
at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:948)
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:2030)
... 8 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal 
key=ignite.testOldClusterTag
at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2009)
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1116)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:427)
at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2111)
at 

[jira] [Updated] (IGNITE-13156) Continuous query filter deployment hungs discovery thread

2020-06-16 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13156:
---
Description: 
A continuous query starts with a custom discovery event. The handler of the 
event is executed synchronously in the discovery thread. Even worse, the 
message itself is mutable, and it blocks the ring.

Inside the handler there is a p2p resource request to another node, which 
can be quite time-consuming. And after 
https://issues.apache.org/jira/browse/IGNITE-12438 or similar tasks this could 
even lead to a deadlock.

All IO operations must be removed from discovery handlers.

This scenario is reproduced in 
GridP2PContinuousDeploymentSelfTest#testServerJoinWithP2PClassDeployedInCluster
{code:java}
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2231)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentCommunication.sendResourceRequest(GridDeploymentCommunication.java:456)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendResourceRequest(GridDeploymentClassLoader.java:793)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.getResourceAsStreamEx(GridDeploymentClassLoader.java:745)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.checkLoadRemoteClass(GridDeploymentPerVersionStore.java:729)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.getDeployment(GridDeploymentPerVersionStore.java:314)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getGlobalDeployment(GridDeploymentManager.java:498)
 at 
org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:416)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1423)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:670)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:533)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2635)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2673)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) 
at java.lang.Thread.run(Thread.java:748)
{code}

  was:
Continuous query starts with a custom discovery event. Handler of the event is 
executed in discovery thread synchronously. Even worse is the fact that message 
itself is mutable and it blocks the ring.

Inside of the handler there is a is p2p resource request from other node, which 
can be pretty time consuming. And after 
https://issues.apache.org/jira/browse/IGNITE-12438 or similar tasks this could 
even lead to a deadlock.

All IO operations must be removed from discovery handlers.
{code:java}
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2231)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentCommunication.sendResourceRequest(GridDeploymentCommunication.java:456)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendResourceRequest(GridDeploymentClassLoader.java:793)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.getResourceAsStreamEx(GridDeploymentClassLoader.java:745)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.checkLoadRemoteClass(GridDeploymentPerVersionStore.java:729)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.getDeployment(GridDeploymentPerVersionStore.java:314)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getGlobalDeployment(GridDeploymentManager.java:498)
 at 
org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:416)
 at 
org.apache.ignite.internal.proces

[jira] [Commented] (IGNITE-13033) Java thin client: Service invocation

2020-06-16 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136657#comment-17136657
 ] 

Pavel Tupitsyn commented on IGNITE-13033:
-

[~alex_pl] Looks good to me, thank you. Please see a couple of minor comments on 
GitHub.

> Java thin client: Service invocation
> 
>
> Key: IGNITE-13033
> URL: https://issues.apache.org/jira/browse/IGNITE-13033
> Project: Ignite
>  Issue Type: New Feature
>  Components: thin client
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-46
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Provide an API to invoke Ignite Services from java thin client.
> Protocol changes and all implementation details described in IEP-46.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13156) Continuous query filter deployment hungs discovery thread

2020-06-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13156:
--

 Summary: Continuous query filter deployment hungs discovery thread
 Key: IGNITE-13156
 URL: https://issues.apache.org/jira/browse/IGNITE-13156
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov


A continuous query starts with a custom discovery event. The handler of the 
event is executed synchronously in the discovery thread. Even worse, the 
message itself is mutable, and it blocks the ring.

Inside the handler there is a p2p resource request to another node, which 
can be quite time-consuming. And after 
https://issues.apache.org/jira/browse/IGNITE-12438 or similar tasks this could 
even lead to a deadlock.

All IO operations must be removed from discovery handlers.
{code:java}
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2231)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentCommunication.sendResourceRequest(GridDeploymentCommunication.java:456)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendResourceRequest(GridDeploymentClassLoader.java:793)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.getResourceAsStreamEx(GridDeploymentClassLoader.java:745)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.checkLoadRemoteClass(GridDeploymentPerVersionStore.java:729)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.getDeployment(GridDeploymentPerVersionStore.java:314)
 at 
org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getGlobalDeployment(GridDeploymentManager.java:498)
 at 
org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:416)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1423)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
 at 
org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:670)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:533)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2635)
 at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2673)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) 
at java.lang.Thread.run(Thread.java:748)
{code}
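One possible direction for the fix, sketched below purely as an illustration (not the actual Ignite change, and all names are hypothetical): hand the blocking p2p resource request off to a dedicated executor so the discovery thread only schedules the work and returns immediately:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Sketch: move blocking p2p class deployment out of the discovery thread.
 * The discovery handler submits the work to a separate pool and completes
 * asynchronously, so the ring is never blocked on network IO.
 * All class and method names here are hypothetical.
 */
public class AsyncP2PDeploymentSketch {
    /** Dedicated pool for blocking peer-to-peer resource requests. */
    private final ExecutorService p2pPool = Executors.newSingleThreadExecutor();

    /** Called from the discovery thread: schedules the work and returns at once. */
    public CompletableFuture<String> onCustomEvent(String clsName) {
        return CompletableFuture.supplyAsync(() -> requestResource(clsName), p2pPool);
    }

    /** Stand-in for the blocking p2p resource request to another node. */
    private String requestResource(String clsName) {
        return "deployed:" + clsName;
    }

    /** Stops the pool on node shutdown. */
    public void stop() {
        p2pPool.shutdown();
    }
}
```

The real fix would also need to deal with ordering guarantees of discovery events, which this sketch ignores.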



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13086) Improve current page replacement mechanism.

2020-06-16 Thread Stanilovsky Evgeny (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136618#comment-17136618
 ] 

Stanilovsky Evgeny commented on IGNITE-13086:
-

[~alex_pl] I found that the approach from [1] is close to mine, without 
additional relocations; it looks like we need to merge [1]. Thanks!
 !screenshot-2.png! 

[1] https://github.com/apache/ignite/pull/7919

> Improve current page replacement mechanism.
> ---
>
> Key: IGNITE-13086
> URL: https://issues.apache.org/jira/browse/IGNITE-13086
> Project: Ignite
>  Issue Type: Improvement
>  Components: persistence
>Affects Versions: 2.8.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Attachments: 8.7-fix-replacement400_rand_512val_5touch_oldts.log, 
> 8.7-replacement400_rand_512val_5touch_oldts.log, 
> IgnitePdsPageReplacementTestToYard.java, replacement_64_new.jfr.zip, 
> replacement_64_old.jfr.zip, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It is experimentally proven that the current page replacement functionality has 
> problems with replacement candidate computation. The current implementation obtains 5 
> random pages and makes further decisions based on these pages' last-touch 
> timestamps and some inner flags; however, there are still cases when this page 
> set can simply be nullified by the inner logic. All improvements need to be 
> proven, for example, by a simple scenario: 
> 1. Put some data until the EVT_PAGE_REPLACEMENT_STARTED event is triggered.
> 2. Put 2 times more data than was loaded in step 1.
> 3. Execute a full scan (through ScanQuery) to emulate processing of old/cold 
> data.
> 4. Start processing only the pages that can fit into the current memory region.
> 5. Measure the "replacedPages" metric.
> (I attach the code mentioned above.)
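The 5-random-pages candidate selection the description criticizes can be illustrated roughly as follows. This is a simplified model for discussion, not Ignite's actual code; the sample size of 5 comes from the description, everything else is hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

/**
 * Simplified model of the criticized scheme: sample a few random pages and
 * evict the one with the oldest last-touch timestamp. Not Ignite's actual
 * implementation; inner flags and the nullification paths are omitted.
 */
public class ReplacementCandidateSketch {
    /** Number of randomly sampled candidate pages (from the description). */
    static final int SAMPLE = 5;

    /** Samples SAMPLE random page indexes out of pageCnt pages. */
    public static int[] samplePages(int pageCnt) {
        int[] idx = new int[SAMPLE];
        for (int i = 0; i < SAMPLE; i++)
            idx[i] = ThreadLocalRandom.current().nextInt(pageCnt);
        return idx;
    }

    /** Returns the sampled page index with the smallest (oldest) last-touch timestamp. */
    public static int pickVictim(long[] lastTouchTs, int[] sampledIdx) {
        int victim = sampledIdx[0];
        for (int idx : sampledIdx)
            if (lastTouchTs[idx] < lastTouchTs[victim])
                victim = idx;
        return victim;
    }
}
```

Because the decision depends only on a handful of random samples, a scan workload that touches every page can make all sampled timestamps equally fresh and defeat the heuristic, which is the behavior the scenario above is designed to expose.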



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13086) Improve current page replacement mechanism.

2020-06-16 Thread Stanilovsky Evgeny (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanilovsky Evgeny updated IGNITE-13086:

Attachment: screenshot-2.png

> Improve current page replacement mechanism.
> ---
>
> Key: IGNITE-13086
> URL: https://issues.apache.org/jira/browse/IGNITE-13086
> Project: Ignite
>  Issue Type: Improvement
>  Components: persistence
>Affects Versions: 2.8.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Attachments: 8.7-fix-replacement400_rand_512val_5touch_oldts.log, 
> 8.7-replacement400_rand_512val_5touch_oldts.log, 
> IgnitePdsPageReplacementTestToYard.java, replacement_64_new.jfr.zip, 
> replacement_64_old.jfr.zip, screenshot-1.png, screenshot-2.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It is experimentally proven that the current page replacement functionality has 
> problems with replacement candidate computation. The current implementation obtains 5 
> random pages and makes further decisions based on these pages' last-touch 
> timestamps and some inner flags; however, there are still cases when this page 
> set can simply be nullified by the inner logic. All improvements need to be 
> proven, for example, by a simple scenario: 
> 1. Put some data until the EVT_PAGE_REPLACEMENT_STARTED event is triggered.
> 2. Put 2 times more data than was loaded in step 1.
> 3. Execute a full scan (through ScanQuery) to emulate processing of old/cold 
> data.
> 4. Start processing only the pages that can fit into the current memory region.
> 5. Measure the "replacedPages" metric.
> (I attach the code mentioned above.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12994) Move binary metadata to PDS storage folder

2020-06-16 Thread Vyacheslav Koptilin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136612#comment-17136612
 ] 

Vyacheslav Koptilin commented on IGNITE-12994:
--

Hello [~sdanilov], [~sergey-chugunov],

Merged to master. Thank you for your efforts!

> Move binary metadata to PDS storage folder
> --
>
> Key: IGNITE-12994
> URL: https://issues.apache.org/jira/browse/IGNITE-12994
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Semyon Danilov
>Assignee: Semyon Danilov
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Move binary metadata to PDS storage folder
> Motivation: in a K8s deployment, the disk attached to a container may not be 
> accessible from other containers or from outside of K8s. If support needs to 
> drop everything from persistence except the data, there will be no way to 
> recover, because binary metadata is required to process the PDS files.
> The PDS files currently required to restore data are:
>  * {WORK_DIR}/db/\{nodeId}
>  * {WORK_DIR}/db/wal/\{nodeId}
>  * {WORK_DIR}/db/wal/archive/\{nodeId}
> Proposed implementation:
> If binary metadata is already located at \{WORK_DIR}/db, then start using it.
> If no metadata is detected at the new path, then:
>  # Check the previous location
>  # Copy the metadata files to the new location
>  # Delete the previous binary metadata





[jira] [Commented] (IGNITE-12994) Move binary metadata to PDS storage folder

2020-06-16 Thread Sergey Chugunov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136562#comment-17136562
 ] 

Sergey Chugunov commented on IGNITE-12994:
--

[~sdanilov], as far as I can see, the comments about your change were either 
addressed or answered on the dev list. Overall, the patch looks good to me; we 
are clear to proceed with merging it to master.

[~Pavlukhin], indeed, the idea of moving binary metadata to the metastorage 
emerged right when the metastorage was introduced. But with such a change we 
would previously have lost any procedure to recreate binary metadata (e.g. to 
replace a field type or make another incompatible change). Once IGNITE-13154 is 
implemented, we'll be able to start working on that.

> Move binary metadata to PDS storage folder
> --
>
> Key: IGNITE-12994
> URL: https://issues.apache.org/jira/browse/IGNITE-12994
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Semyon Danilov
>Assignee: Semyon Danilov
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Move binary metadata to PDS storage folder
> Motivation: In a K8s deployment, a disk attached to a container cannot be 
> accessed from other containers or from outside of K8s. If support needs to 
> drop all persistence files except the data, there will be no way to recover, 
> because binary metadata is required to process the PDS files.
> Current PDS files that are required to restore data:
>  * {WORK_DIR}/db/\{nodeId}
>  * {WORK_DIR}/db/wal/\{nodeId}
>  * {WORK_DIR}/db/wal/archive/\{nodeId}
> Proposed implementation:
> If binary metadata is already located at \{WORK_DIR}/db, start using it.
> If no metadata is detected at the new path, then:
>  # Check previous location
>  # Copy metadata files to a new location
>  # Delete previous binary metadata





[jira] [Commented] (IGNITE-13147) Avoid DHT topology map updates before it's initialization

2020-06-16 Thread Ivan Rakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136543#comment-17136543
 ] 

Ivan Rakov commented on IGNITE-13147:
-

[~ascherbakov] LGTM, please proceed to merge.

> Avoid DHT topology map updates before it's initialization
> -
>
> Key: IGNITE-13147
> URL: https://issues.apache.org/jira/browse/IGNITE-13147
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.8
>Reporter: Alexey Scherbakov
>Assignee: Alexey Scherbakov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This can happen if a partition state is restored from the persistent store 
> during logical recovery, and it can cause an NPE on older versions due to 
> illegal access to an uninitialized topology:
> {noformat}
> [ERROR][exchange-worker-#41][GridDhtPartitionsExchangeFuture] Failed to 
> reinitialize local partitions (rebalancing will be stopped): 
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=102, 
> minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=d159da1c-6f70-4ef2-bfa4-4feb64b829de, consistentId=T1SivatheObserver, 
> addrs=ArrayList [10.44.166.91, 127.0.0.1], sockAddrs=HashSet 
> [/127.0.0.1:56500, clrv052580.ic.ing.net/10.44.166.91:56500], 
> discPort=56500, order=102, intOrder=60, lastExchangeTime=1586354937705, 
> loc=true,, isClient=false], topVer=102, msgTemplate=null, nodeId8=d159da1c, 
> msg=null, type=NODE_JOINED, tstamp=1586354901638], nodeId=d159da1c, 
> evt=NODE_JOINED]
> java.lang.NullPointerException: null
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentV2.getIds(GridAffinityAssignmentV2.java:211)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2554)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:714)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$beforeExchange$38edadb$1(GridCacheDatabaseSharedManager.java:1514)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10790)
>   at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.lang.Thread.run(Unknown Source) [?:1.8.0_241]
> {noformat}





[jira] [Created] (IGNITE-13155) Snapshot creation throws NPE on an in-memory cluster

2020-06-16 Thread Maxim Muzafarov (Jira)
Maxim Muzafarov created IGNITE-13155:


 Summary: Snapshot creation throws NPE on an in-memory cluster
 Key: IGNITE-13155
 URL: https://issues.apache.org/jira/browse/IGNITE-13155
 Project: Ignite
  Issue Type: Bug
Reporter: Maxim Muzafarov
Assignee: Maxim Muzafarov
 Fix For: 2.9


Snapshot creation throws NPE on an in-memory cluster.

{code}
Error stack trace:
class org.apache.ignite.internal.client.GridClientException: Failed to handle 
request: [req=EXE, 
taskName=org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask, 
params=[VisorTaskArgument [debug=false]], err=Failed to reduce job results due 
to undeclared user exception 
[task=org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask@4d45b97f,
 err=class org.apache.ignite.IgniteException: null], trace=class 
org.apache.ignite.IgniteCheckedException: Failed to reduce job results due to 
undeclared user exception 
[task=org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask@4d45b97f,
 err=class org.apache.ignite.IgniteException: null]
at 
org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7566)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:260)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:172)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at 
org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler$2.apply(GridTaskCommandHandler.java:263)
at 
org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler$2.apply(GridTaskCommandHandler.java:257)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
at 
org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:257)
at 
org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsync(GridTaskCommandHandler.java:163)
at 
org.apache.ignite.internal.processors.rest.GridRestProcessor.handleRequest(GridRestProcessor.java:325)
at 
org.apache.ignite.internal.processors.rest.GridRestProcessor.access$100(GridRestProcessor.java:104)
at 
org.apache.ignite.internal.processors.rest.GridRestProcessor$2.body(GridRestProcessor.java:179)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: class org.apache.ignite.compute.ComputeUserUndeclaredException: 
Failed to reduce job results due to undeclared user exception 
[task=org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask@4d45b97f,
 err=class org.apache.ignite.IgniteException: null]
at 
org.apache.ignite.internal.processors.task.GridTaskWorker.reduce(GridTaskWorker.java:1184)
at 
org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:974)
at 
org.apache.ignite.internal.processors.task.GridTaskWorker.processDelayedResponses(GridTaskWorker.java:711)
at 
org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:542)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at 
org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:830)
at 
org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:555)
at 
org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:535)
at 
org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler.handleAsyncUnsafe(GridTaskCommandHandler.java:227)
... 8 more
Caused by: class org.apache.ignite.IgniteException: null
at 
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1086)
at 
org.apache.ignite.internal.util.future.IgniteFutureImpl.convertException(IgniteFutureImpl.java:168)
at 
org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:137)
at 
org.apache.ignite.internal.processors.cache.persistence.snapshot.SnapshotMXBeanImpl.createSnapshot(SnapshotMXBeanImpl.java:43)
at 
org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask$VisorSnapshotCreateJob.run(VisorSnapshotCreateTask.java:57)
at 
org.apache.ignite.internal.visor.snapshot.VisorSnapshotCreateTask$VisorSnapshotCr

[jira] [Assigned] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev reassigned IGNITE-12903:


Assignee: Alexey Zinoviev  (was: Taras Ledkov)

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples
>Reporter: Taras Ledkov
>Assignee: Alexey Zinoviev
>Priority: Major
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to initially load data into the cluster.
> They must be changed because this function is disabled by default.





[jira] [Updated] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12903:
-
Component/s: ml

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples, ml
>Reporter: Taras Ledkov
>Assignee: Alexey Zinoviev
>Priority: Major
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to initially load data into the cluster.
> They must be changed because this function is disabled by default.





[jira] [Commented] (IGNITE-12903) Fix ML + SQL examples

2020-06-16 Thread Alexey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136508#comment-17136508
 ] 

Alexey Zinoviev commented on IGNITE-12903:
--

[~tledkov-gridgain] What is the best way to fix this? Should we enable the 
CSVREAD function manually (could you suggest how, here in the comments), or is 
it better to populate the cache manually rather than from CSV? What do you 
think?
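For the second option, the CSV load could be done in a few lines of plain Java that parse the file and feed rows into the cache. This is only a sketch: the `ManualCsvLoad` class and the "id, feature..." column layout are invented for illustration, and a `Map` stands in for `IgniteCache#put` to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: populate a key/value store from CSV lines manually
// instead of relying on H2's CSVREAD function. The first column is the key,
// the remaining columns are numeric features; a Map stands in for IgniteCache.
public class ManualCsvLoad {
    static Map<Integer, double[]> load(List<String> csvLines) {
        Map<Integer, double[]> cache = new LinkedHashMap<>();

        for (String line : csvLines) {
            String[] cols = line.split(",");

            int key = Integer.parseInt(cols[0].trim());
            double[] row = new double[cols.length - 1];

            for (int i = 1; i < cols.length; i++)
                row[i - 1] = Double.parseDouble(cols[i].trim());

            cache.put(key, row);
        }

        return cache;
    }
}
```

In a real example the loop body would call `cache.put(key, row)` on an `IgniteCache`, which avoids enabling CSVREAD at all.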

> Fix ML + SQL examples
> -
>
> Key: IGNITE-12903
> URL: https://issues.apache.org/jira/browse/IGNITE-12903
> Project: Ignite
>  Issue Type: Task
>  Components: examples
>Reporter: Taras Ledkov
>Assignee: Taras Ledkov
>Priority: Major
>
> The examples
> {{DecisionTreeClassificationTrainerSQLInferenceExample}}
> {{DecisionTreeClassificationTrainerSQLTableExample}}
> use the CSVREAD function to initially load data into the cluster.
> They must be changed because this function is disabled by default.





[jira] [Updated] (IGNITE-13154) Introduce the ability for a user to manage binary types

2020-06-16 Thread Taras Ledkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taras Ledkov updated IGNITE-13154:
--
Summary: Introduce the ability for a user to manage binary types  (was: 
Introduce aility to manage manage binary types by the users)

> Introduce the ability for a user to manage binary types
> ---
>
> Key: IGNITE-13154
> URL: https://issues.apache.org/jira/browse/IGNITE-13154
> Project: Ignite
>  Issue Type: Improvement
>  Components: binary
>Reporter: Taras Ledkov
>Assignee: Taras Ledkov
>Priority: Major
> Fix For: 2.9
>
>
> We need a way to change schema (including unsupported changes, such as column 
> type change) without cluster restart. This is for the case when all data 
> associated with the binary type has been removed, so removing the old schema 
> is safe.
> Now users must stop the cluster and remove the folder with binary metadata 
> manually.
> The proposed way is to introduce internal API to manage binary types and 
> public command line interface (via control.sh). That way one can remove a 
> cache from the cluster, then unregister corresponding binary types, then 
> launch a new version of an application that would register the new schema and 
> reload the data.
> *The current implementation has restrictions:*
> - All cluster nodes must support the remove-type feature.
> - The cluster must not contain data with the type to remove.
> - No operation that updates the type may be running on the cluster for the 
> type to remove (examples: put into cache, BinaryObjectBuilder#build).
> - Client nodes process metadata operations asynchronously, so the type may be 
> removed at a client with some delay after the remove-type operation is 
> completed.
> - If a node that contains the old version of the type joins the cluster where 
> the type was removed, the type is propagated back to cluster metadata 
> (because metadata tombstones are not supported).
> - A node that contains the old version of the type cannot join the cluster 
> where the type was removed and then updated to a new version (because 
> versioned metadata tombstones are not supported).
> So, the user scenario looks like:
> # Make sure that all server nodes support the remove-type feature.
> # Remove the caches containing data with the type to remove.
> # Stop the client nodes with the older version.
> # Stop all operations with the type to remove (don't create binary objects, 
> don't run compute jobs with the type to remove).
> # Remove the type on the stable topology (the production destination 
> topology).
> # Wait some delay (depends on the cluster size and client count).
> # Perform operations with the new version of the type.
> *Proposed command line interface*
> New commands (all commands are _experimental_ ):
> - {{--meta list}} prints info about all available binary types:
> {{typeId=, typeName=, fields=, 
> schemas=, isEnum=}}
> - {{\-\-meta details (\-\- typeId | \-\-typeName )}} prints 
> detailed info about the specified type. The type may be specified by type 
> name or type ID.
> output example:
> {code}
> typeId=0x1FBFBC0C (532659212)
> typeName=TypeName1
> Fields:
>   name=fld3, type=long[], fieldId=0x2FFF95 (3145621)
>   name=fld2, type=double, fieldId=0x2FFF94 (3145620)
>   name=fld1, type=Date, fieldId=0x2FFF93 (3145619)
>   name=fld0, type=int, fieldId=0x2FFF92 (3145618)
> Schemas:
>   schemaId=0x6C5CC179 (1818018169), fields=[fld0]
>   schemaId=0x70E46431 (1894016049), fields=[fld0, fld1, fld2, fld3]
> {code}
> - {{\-\-meta remove (\-\- typeId | \-\-typeName ) [\-\-out 
> ]}} removes metadata for the specified type from the cluster and saves the 
> removed metadata to the specified file. If the file name isn't specified, the 
> output file name is: {{.bin}}
> The command requires confirmation.
> *N.B.*: All sessions of thin clients (ODBC, JDBC, thin client) are closed 
> (to clean up the local cache of the binary metadata).
> - {{\-\-meta update \-\-in ]}} updates cluster metadata from the 
> specified file (the file name is required).
> The command requires confirmation.





[jira] [Created] (IGNITE-13154) Introduce aility to manage manage binary types by the users

2020-06-16 Thread Taras Ledkov (Jira)
Taras Ledkov created IGNITE-13154:
-

 Summary: Introduce aility to manage manage binary types by the 
users
 Key: IGNITE-13154
 URL: https://issues.apache.org/jira/browse/IGNITE-13154
 Project: Ignite
  Issue Type: Improvement
  Components: binary
Reporter: Taras Ledkov
Assignee: Taras Ledkov
 Fix For: 2.9


We need a way to change schema (including unsupported changes, such as column 
type change) without cluster restart. This is for the case when all data 
associated with the binary type has been removed, so removing the old schema is 
safe.

Now users must stop the cluster and remove the folder with binary metadata 
manually.

The proposed way is to introduce internal API to manage binary types and public 
command line interface (via control.sh). That way one can remove a cache from 
the cluster, then unregister corresponding binary types, then launch a new 
version of an application that would register the new schema and reload the 
data.

*The current implementation has restrictions:*
- All cluster nodes must support the remove-type feature.
- The cluster must not contain data with the type to remove.
- No operation that updates the type may be running on the cluster for the 
type to remove (examples: put into cache, BinaryObjectBuilder#build).
- Client nodes process metadata operations asynchronously, so the type may be 
removed at a client with some delay after the remove-type operation is 
completed.
- If a node that contains the old version of the type joins the cluster where 
the type was removed, the type is propagated back to cluster metadata (because 
metadata tombstones are not supported).
- A node that contains the old version of the type cannot join the cluster 
where the type was removed and then updated to a new version (because 
versioned metadata tombstones are not supported).

So, the user scenario looks like:
# Make sure that all server nodes support the remove-type feature.
# Remove the caches containing data with the type to remove.
# Stop the client nodes with the older version.
# Stop all operations with the type to remove (don't create binary objects, 
don't run compute jobs with the type to remove).
# Remove the type on the stable topology (the production destination topology).
# Wait some delay (depends on the cluster size and client count).
# Perform operations with the new version of the type.

*Proposed command line interface*
New commands (all commands are _experimental_ ):
- {{--meta list}} prints info about all available binary types:
{{typeId=, typeName=, fields=, schemas=, 
isEnum=}}
- {{\-\-meta details (\-\- typeId | \-\-typeName )}} prints detailed 
info about the specified type. The type may be specified by type name or type 
ID.
output example:
{code}
typeId=0x1FBFBC0C (532659212)
typeName=TypeName1
Fields:
  name=fld3, type=long[], fieldId=0x2FFF95 (3145621)
  name=fld2, type=double, fieldId=0x2FFF94 (3145620)
  name=fld1, type=Date, fieldId=0x2FFF93 (3145619)
  name=fld0, type=int, fieldId=0x2FFF92 (3145618)
Schemas:
  schemaId=0x6C5CC179 (1818018169), fields=[fld0]
  schemaId=0x70E46431 (1894016049), fields=[fld0, fld1, fld2, fld3]
{code}
- {{\-\-meta remove (\-\- typeId | \-\-typeName ) [\-\-out 
]}} removes metadata for the specified type from the cluster and saves the 
removed metadata to the specified file. If the file name isn't specified, the 
output file name is: {{.bin}}
The command requires confirmation.
*N.B.*: All sessions of thin clients (ODBC, JDBC, thin client) are closed 
(to clean up the local cache of the binary metadata).
- {{\-\-meta update \-\-in ]}} updates cluster metadata from the 
specified file (the file name is required).
The command requires confirmation.
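The type and field IDs in the output example (typeId=0x1FBFBC0C for TypeName1, fieldId=0x2FFF95 for fld3) appear to follow Ignite's default ID mapping, which hashes the lower-cased type or field name (as in BinaryBasicIdMapper). A minimal sketch of that hash, reproducing the values shown:

```java
// Sketch of the default lower-case name hash that Ignite's basic ID mapper
// applies to type and field names. For the output example above it yields
// lowerCaseHashCode("TypeName1") = 0x1FBFBC0C (532659212) and
// lowerCaseHashCode("fld3") = 0x2FFF95 (3145621).
public class BinaryIdHash {
    static int lowerCaseHashCode(String name) {
        int h = 0;

        // Standard 31-based string hash over the lower-cased characters,
        // so "TypeName1" and "typename1" map to the same ID.
        for (int i = 0; i < name.length(); i++)
            h = 31 * h + Character.toLowerCase(name.charAt(i));

        return h;
    }
}
```

This is why the {{\-\-meta details}} command can accept either a type name or a type ID: the ID is a deterministic function of the name.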






[jira] [Updated] (IGNITE-13144) Improve ClusterState enum and other minor improvements for the cluster read-only mode.

2020-06-16 Thread Sergey Antonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Antonov updated IGNITE-13144:

Description: 
We have the static method {{boolean ClusterState#active(ClusterState)}}. It 
would be better to have an instance method {{boolean active()}}. 
Also, we have the static method ClusterState#lesserOf. It is needed only for 
internal purposes, so we should move it to an internal package.
We introduced new compute requests for getting/changing the cluster state from 
a client node as internal classes of GridClusterStateProcessor. We should move 
them to separate public classes for maintainability purposes.

  was:
We have {{boolean ClusterState#active(ClusterState)}} static method. It would 
be better to have an instance method {{boolean active()}}. 
Also, we have ClusterState#lesserOf static method. It needs for internal 
purposes. So we must move this method in an internal package.
We introduced new compute requests for getting/change cluster state from the 
client node as internal classes of GridClusterStateProcessor. We should move 
them to separate public classes for the maintainability purposes.


> Improve ClusterState enum and other minor improvements for the cluster 
> read-only mode.
> --
>
> Key: IGNITE-13144
> URL: https://issues.apache.org/jira/browse/IGNITE-13144
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Sergey Antonov
>Assignee: Sergey Antonov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have the static method {{boolean ClusterState#active(ClusterState)}}. It 
> would be better to have an instance method {{boolean active()}}. 
> Also, we have the static method ClusterState#lesserOf. It is needed only for 
> internal purposes, so we should move it to an internal package.
> We introduced new compute requests for getting/changing the cluster state 
> from a client node as internal classes of GridClusterStateProcessor. We 
> should move them to separate public classes for maintainability purposes.
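The proposed instance method is a small change; a sketch of what it could look like follows. The enum constants mirror Ignite's public ClusterState, but the method body (treating the read-only state as "active") is an assumption of this sketch, not the confirmed implementation.

```java
// Sketch of the proposed instance method on the ClusterState enum.
// The constants mirror org.apache.ignite.cluster.ClusterState; counting
// ACTIVE_READ_ONLY as "active" is an assumption of this sketch.
public enum ClusterState {
    INACTIVE,
    ACTIVE,
    ACTIVE_READ_ONLY;

    /** @return {@code true} if the cluster is active, including read-only mode. */
    public boolean active() {
        return this != INACTIVE;
    }
}
```

Callers then write {{state.active()}} instead of the static {{ClusterState.active(state)}}, which reads more naturally and cannot be passed a mismatched argument.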





[jira] [Updated] (IGNITE-13144) Improve ClusterState enum and other minor improvements for the cluster read-only mode.

2020-06-16 Thread Sergey Antonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Antonov updated IGNITE-13144:

Description: 
We have the static method {{boolean ClusterState#active(ClusterState)}}. It 
would be better to have an instance method {{boolean active()}}. 
Also, we have the static method ClusterState#lesserOf. It is needed only for 
internal purposes, so we should move it to an internal package.
We introduced new compute requests for getting/changing the cluster state from 
a client node as internal classes of GridClusterStateProcessor. We should move 
them to separate public classes for maintainability purposes.

  was:
We have {{boolean ClusterState#active(ClusterState)}} static method. It would 
be better to have an instance method {{boolean active()}}. 
Also, we have ClusterState#lesserOf static method. It needs for internal 
purposes. So we must move this method in an internal package.


> Improve ClusterState enum and other minor improvements for the cluster 
> read-only mode.
> --
>
> Key: IGNITE-13144
> URL: https://issues.apache.org/jira/browse/IGNITE-13144
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Sergey Antonov
>Assignee: Sergey Antonov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have the static method {{boolean ClusterState#active(ClusterState)}}. It 
> would be better to have an instance method {{boolean active()}}. 
> Also, we have the static method ClusterState#lesserOf. It is needed only for 
> internal purposes, so we should move it to an internal package.
> We introduced new compute requests for getting/changing the cluster state 
> from a client node as internal classes of GridClusterStateProcessor. We 
> should move them to separate public classes for maintainability purposes.





[jira] [Updated] (IGNITE-13144) Improve ClusterState enum and other minor improvements for the cluster read-only mode.

2020-06-16 Thread Sergey Antonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Antonov updated IGNITE-13144:

Summary: Improve ClusterState enum and other minor improvements for the 
cluster read-only mode.  (was: Improve ClusterState enum)

> Improve ClusterState enum and other minor improvements for the cluster 
> read-only mode.
> --
>
> Key: IGNITE-13144
> URL: https://issues.apache.org/jira/browse/IGNITE-13144
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Sergey Antonov
>Assignee: Sergey Antonov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have the static method {{boolean ClusterState#active(ClusterState)}}. It 
> would be better to have an instance method {{boolean active()}}. 
> Also, we have the static method ClusterState#lesserOf. It is needed only for 
> internal purposes, so we should move it to an internal package.





[jira] [Created] (IGNITE-13153) Fix CommandHandler user attributes

2020-06-16 Thread Sergei Ryzhov (Jira)
Sergei Ryzhov created IGNITE-13153:
--

 Summary: Fix CommandHandler user attributes
 Key: IGNITE-13153
 URL: https://issues.apache.org/jira/browse/IGNITE-13153
 Project: Ignite
  Issue Type: Task
Reporter: Sergei Ryzhov
Assignee: Sergei Ryzhov


Fix CommandHandler user attributes for the certificate used when 
CommandHandler is used with SSL.





[jira] [Assigned] (IGNITE-13147) Avoid DHT topology map updates before it's initialization

2020-06-16 Thread Alexey Scherbakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Scherbakov reassigned IGNITE-13147:
--

Assignee: Alexey Scherbakov

> Avoid DHT topology map updates before it's initialization
> -
>
> Key: IGNITE-13147
> URL: https://issues.apache.org/jira/browse/IGNITE-13147
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.8
>Reporter: Alexey Scherbakov
>Assignee: Alexey Scherbakov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This can happen if a partition state is restored from the persistent store 
> during logical recovery, and it can cause an NPE on older versions due to 
> illegal access to an uninitialized topology:
> {noformat}
> [ERROR][exchange-worker-#41][GridDhtPartitionsExchangeFuture] Failed to 
> reinitialize local partitions (rebalancing will be stopped): 
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=102, 
> minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=d159da1c-6f70-4ef2-bfa4-4feb64b829de, consistentId=T1SivatheObserver, 
> addrs=ArrayList [10.44.166.91, 127.0.0.1], sockAddrs=HashSet 
> [/127.0.0.1:56500, clrv052580.ic.ing.net/10.44.166.91:56500], 
> discPort=56500, order=102, intOrder=60, lastExchangeTime=1586354937705, 
> loc=true,, isClient=false], topVer=102, msgTemplate=null, nodeId8=d159da1c, 
> msg=null, type=NODE_JOINED, tstamp=1586354901638], nodeId=d159da1c, 
> evt=NODE_JOINED]
> java.lang.NullPointerException: null
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentV2.getIds(GridAffinityAssignmentV2.java:211)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2554)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:714)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$beforeExchange$38edadb$1(GridCacheDatabaseSharedManager.java:1514)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10790)
>   at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.lang.Thread.run(Unknown Source) [?:1.8.0_241]
> {noformat}





[jira] [Commented] (IGNITE-13147) Avoid DHT topology map updates before it's initialization

2020-06-16 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136383#comment-17136383
 ] 

Alexey Scherbakov commented on IGNITE-13147:


The fix also includes an optimization for late affinity switch time and 
reduces the number of partition state messages sent.

> Avoid DHT topology map updates before it's initialization
> -
>
> Key: IGNITE-13147
> URL: https://issues.apache.org/jira/browse/IGNITE-13147
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.8
>Reporter: Alexey Scherbakov
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This can happen if a partition state is restored from the persistent store 
> during logical recovery, and it can cause an NPE on older versions due to 
> illegal access to an uninitialized topology:
> {noformat}
> [ERROR][exchange-worker-#41][GridDhtPartitionsExchangeFuture] Failed to 
> reinitialize local partitions (rebalancing will be stopped): 
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=102, 
> minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=d159da1c-6f70-4ef2-bfa4-4feb64b829de, consistentId=T1SivatheObserver, 
> addrs=ArrayList [10.44.166.91, 127.0.0.1], sockAddrs=HashSet 
> [/127.0.0.1:56500, clrv052580.ic.ing.net/10.44.166.91:56500], 
> discPort=56500, order=102, intOrder=60, lastExchangeTime=1586354937705, 
> loc=true,, isClient=false], topVer=102, msgTemplate=null, nodeId8=d159da1c, 
> msg=null, type=NODE_JOINED, tstamp=1586354901638], nodeId=d159da1c, 
> evt=NODE_JOINED]
> java.lang.NullPointerException: null
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentV2.getIds(GridAffinityAssignmentV2.java:211)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2554)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:714)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$beforeExchange$38edadb$1(GridCacheDatabaseSharedManager.java:1514)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10790)
>   at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
> ~[?:1.8.0_241]
>   at java.lang.Thread.run(Unknown Source) [?:1.8.0_241]
> {noformat}


