[jira] [Commented] (IGNITE-11829) Distribute joins fail if number of tables > 7

2019-05-24 Thread Stanislav Lukyanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847997#comment-16847997
 ] 

Stanislav Lukyanov commented on IGNITE-11829:
-

The workaround for this issue is to group some of the JOINs into a subquery:
{code}
SELECT *
FROM public.PERSON P1
JOIN public.PERSON P2 ON P1.ID = P2.ID
JOIN public.PERSON P3 ON P1.ID = P3.ID
JOIN public.PERSON P4 ON P1.ID = P4.ID
JOIN public.PERSON P5 ON P1.ID = P5.ID
JOIN public.PERSON P6 ON P1.ID = P6.ID
JOIN (
select P7.ID as ID, P7.NAME as P7NAME, P8.NAME as P8NAME
FROM public.PERSON P7
JOIN public.PERSON P8 ON P7.ID = P8.ID
) P78 ON P1.ID = P78.ID
{code}
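
For queries that are generated in code, the same regrouping can be applied mechanically. Below is a minimal Java sketch, not part of Ignite: the class {{JoinGrouping}}, the method {{build}}, the alias {{SUB}} and the limit parameter are hypothetical names chosen for illustration. It assumes every alias is joined on ID as in the query above, that the subquery only needs to expose ID, and that the limit applies per query level (which is what the workaround suggests).

```java
import java.util.StringJoiner;

/** Hypothetical helper (not an Ignite API): regroups a long self-join chain into a subquery. */
final class JoinGrouping {
    /**
     * Builds "SELECT * FROM table P1 JOIN table P2 ON P1.ID = P2.ID ...".
     * If n exceeds maxTopLevel, the trailing tables are wrapped in one subquery
     * so that the outer query references at most maxTopLevel sources.
     */
    static String build(String table, int n, int maxTopLevel) {
        if (n < 1 || maxTopLevel < 2)
            throw new IllegalArgumentException("n >= 1 and maxTopLevel >= 2 required");

        // Number of plain tables kept at the top level (one slot is reserved for the subquery).
        int flat = n <= maxTopLevel ? n : maxTopLevel - 1;

        StringJoiner sql = new StringJoiner("\n");
        sql.add("SELECT *");
        sql.add("FROM " + table + " P1");

        for (int i = 2; i <= flat; i++)
            sql.add("JOIN " + table + " P" + i + " ON P1.ID = P" + i + ".ID");

        if (flat < n) {
            int first = flat + 1;

            // The remaining tables are joined among themselves inside the subquery.
            sql.add("JOIN (");
            sql.add("SELECT P" + first + ".ID AS ID");
            sql.add("FROM " + table + " P" + first);

            for (int i = first + 1; i <= n; i++)
                sql.add("JOIN " + table + " P" + i + " ON P" + first + ".ID = P" + i + ".ID");

            sql.add(") SUB ON P1.ID = SUB.ID");
        }

        return sql.toString();
    }
}
```

Calling {{JoinGrouping.build("public.PERSON", 8, 7)}} keeps P1 to P6 at the top level and moves P7 and P8 into the subquery, the same shape as the query above. The sketch does not split the subquery further, so a very long chain would still need nested grouping.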

> Distribute joins fail if number of tables > 7
> -
>
> Key: IGNITE-11829
> URL: https://issues.apache.org/jira/browse/IGNITE-11829
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Affects Versions: 2.7
>Reporter: Stanislav Lukyanov
>Priority: Major
>  Labels: newbie
>
> Distributed joins fail with ArrayIndexOutOfBounds when the total number of 
> tables is > 7.
> Example:
> {code}
> try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
>     IgniteCache cache = ignite.createCache("foo");
>     cache.query(new SqlFieldsQuery("CREATE TABLE Person(ID INTEGER PRIMARY KEY, NAME VARCHAR(100));"));
>     cache.query(new SqlFieldsQuery("INSERT INTO Person(ID, NAME) VALUES (1, 'Ed'), (2, 'Ann'), (3, 'Emma');"));
>     cache.query(new SqlFieldsQuery("SELECT *\n" +
>         "FROM PERSON P1\n" +
>         "JOIN PERSON P2 ON P1.ID = P2.ID\n" +
>         "JOIN PERSON P3 ON P1.ID = P3.ID\n" +
>         "JOIN PERSON P4 ON P1.ID = P4.ID\n" +
>         "JOIN PERSON P5 ON P1.ID = P5.ID\n" +
>         "JOIN PERSON P6 ON P1.ID = P6.ID\n" +
>         "JOIN PERSON P7 ON P1.ID = P7.ID\n" +
>         "JOIN PERSON P8 ON P1.ID = P8.ID").setDistributedJoins(true).setEnforceJoinOrder(false));
> }
> {code}
> throws
> {code}
> Exception in thread "main" javax.cache.CacheException: General error: "java.lang.ArrayIndexOutOfBoundsException" [5-197]
>   at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:832)
>   at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:765)
>   at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.query(GatewayProtectedCacheProxy.java:403)
>   at org.apache.ignite.examples.ExampleNodeStartup.main(ExampleNodeStartup.java:60)
> Caused by: class org.apache.ignite.internal.processors.query.IgniteSQLException: General error: "java.lang.ArrayIndexOutOfBoundsException" [5-197]
>   at org.apache.ignite.internal.processors.query.h2.QueryParser.parseH2(QueryParser.java:454)
>   at org.apache.ignite.internal.processors.query.h2.QueryParser.parse0(QueryParser.java:156)
>   at org.apache.ignite.internal.processors.query.h2.QueryParser.parse(QueryParser.java:121)
>   at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.querySqlFields(IgniteH2Indexing.java:1191)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor$3.applyx(GridQueryProcessor.java:2261)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor$3.applyx(GridQueryProcessor.java:2257)
>   at org.apache.ignite.internal.util.lang.IgniteOutClosureX.apply(IgniteOutClosureX.java:53)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:2767)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor.lambda$querySqlFields$1(GridQueryProcessor.java:2277)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuerySafe(GridQueryProcessor.java:2297)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor.querySqlFields(GridQueryProcessor.java:2250)
>   at org.apache.ignite.internal.processors.query.GridQueryProcessor.querySqlFields(GridQueryProcessor.java:2177)
>   at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:817)
>   ... 3 more
> Caused by: org.h2.jdbc.JdbcSQLException: General error: "java.lang.ArrayIndexOutOfBoundsException" [5-197]
>   at org.h2.message.DbException.getJdbcSQLException(DbException.java:357)
>   at org.h2.message.DbException.get(DbException.java:168)
>   at org.h2.message.DbException.convert(DbException.java:307)
>   at org.h2.message.DbException.toSQLException(DbException.java:280)
>   at 

[jira] [Created] (IGNITE-11873) Enabling SQL On-heap Row Cache results in row cache being inconsistent with off-heap storage

2019-05-24 Thread Joel Lang (JIRA)
Joel Lang created IGNITE-11873:
--

 Summary: Enabling SQL On-heap Row Cache results in row cache being 
inconsistent with off-heap storage
 Key: IGNITE-11873
 URL: https://issues.apache.org/jira/browse/IGNITE-11873
 Project: Ignite
  Issue Type: Bug
  Components: cache, persistence, sql
Affects Versions: 2.7
Reporter: Joel Lang
 Attachments: entry1.png, entry2.png

When enabling the SQL On-heap Row Cache feature on a persistent, atomic, 
replicated cache, I found that after a number of queries and updates, averaging 
from 40 to 60 updates, the on-heap cache will become inconsistent with the 
off-heap storage. This manifests on a single, non-clustered Ignite node that I 
test with.

Specifically I would query a cache using SQL for a specific entry, but when 
updating the entry using a normal put() on the cache, the entry would not be 
changed from the perspective of the next SQL query. This causes the business 
code to not behave as expected.

When examining the state of the cache from DBeaver using a select query, I've 
found that the problem row in question is duplicated in the query results, and 
out of order despite ordering the results by key:

!entry1.png!

Restarting Ignite to clear the on-heap cache reveals the actual row:

!entry2.png!

When looking at the state of H2RowCache from a heap dump, I found that there 
were two different instances of GridH2KeyValueRowOnheap containing two 
different instances of the cache value in different states: the one I'm seeing 
and the one I'm trying to update it to.

As a side effect of all of this, the ModifyingEntryProcessor always fails on 
that row because "entryVal" is never equal to "val" when checked in the 
process() method.

If more information is needed to reproduce I can try to make a simple example 
next week after the holiday.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-10663) Implement cache mode allows reads with consistency check and fix

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847802#comment-16847802
 ] 

Ignite TC Bot commented on IGNITE-10663:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET{color} [[tests 
4|https://ci.ignite.apache.org/viewLog.html?buildId=3934727]]
* exe: CacheParityTest.TestCache

{color:#d04437}Platform .NET (Core Linux){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=3934728]]
* dll: CacheParityTest.TestCache

{color:#d04437}SPI (URI Deploy){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=3934673]]
* IgniteUriDeploymentTestSuite: 
GridUriDeploymentSimpleSelfTest.testSimpleRedeployTwoTasks

{color:#d04437}[Check Code Style]{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=3934755]]

{color:#d04437}Basic 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=3934684]]
* IgniteBasicTestSuite: GridVersionSelfTest.testVersions

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3934756&buildTypeId=IgniteTests24Java8_RunAll]

> Implement cache mode allows reads with consistency check and fix
> 
>
> Key: IGNITE-10663
> URL: https://issues.apache.org/jira/browse/IGNITE-10663
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Labels: iep-31
> Fix For: 2.8
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> The main idea is to provide a special "read from cache" mode that reads a 
> value from the primary and all backups and checks that the values are the same.
> If the values differ, they should be fixed according to the appropriate 
> strategy.
> ToDo list:
> 1) {{cache.withConsistency().get(key)}} should guarantee values will be 
> checked across the topology and fixed if necessary.
> 2) LWW (Last Write Wins) strategy should be used for validation.
> 3) Since LWW or any other strategy does not guarantee that the correct value 
> will be chosen, we have to record an event that contains all values and the 
> chosen one.
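
The check described in the ToDo list can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the IGNITE-10663 implementation: the types {{LwwCheck}}, {{Copy}} and {{ViolationEvent}} are hypothetical, values are simplified to strings, and each copy is assumed to carry a comparable version number where a higher version means a later write. Writing the chosen value back to the other nodes (the "fix" step) is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch of a Last Write Wins check; none of these types are Ignite classes. */
final class LwwCheck {
    /** A value copy read from one node, tagged with an assumed version (higher = later write). */
    static final class Copy {
        final String node;
        final String val;
        final long ver;

        Copy(String node, String val, long ver) {
            this.node = node;
            this.val = val;
            this.ver = ver;
        }
    }

    /** Record of a detected inconsistency: every copy that was read plus the chosen one. */
    static final class ViolationEvent {
        final List<Copy> all;
        final Copy chosen;

        ViolationEvent(List<Copy> all, Copy chosen) {
            this.all = all;
            this.chosen = chosen;
        }
    }

    /** Events recorded so far (stands in for a real event log). */
    final List<ViolationEvent> events = new ArrayList<>();

    /** Returns the value to use; on a mismatch picks the highest version and records an event. */
    String resolve(List<Copy> copies) {
        if (copies.isEmpty())
            throw new IllegalArgumentException("no copies");

        Copy newest = copies.stream().max(Comparator.comparingLong((Copy c) -> c.ver)).get();

        boolean same = copies.stream()
            .allMatch(c -> c.ver == newest.ver && Objects.equals(c.val, newest.val));

        // LWW cannot prove that the newest copy is the correct one, so keep all candidates.
        if (!same)
            events.add(new ViolationEvent(new ArrayList<>(copies), newest));

        return newest.val;
    }
}
```

Keeping all copies in the event lets an operator review cases where the latest write was not the right one.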





[jira] [Commented] (IGNITE-11256) Implement read-only mode for grid

2019-05-24 Thread Alexei Scherbakov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847688#comment-16847688
 ] 

Alexei Scherbakov commented on IGNITE-11256:


[~antonovsergey93]

I left some minor comments under PR.

 

> Implement read-only mode for grid
> -
>
> Key: IGNITE-11256
> URL: https://issues.apache.org/jira/browse/IGNITE-11256
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexei Scherbakov
>Assignee: Sergey Antonov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Should be triggered from control.sh utility.
> Useful for maintenance work, for example checking partition consistency 
> (idle_verify)





[jira] [Commented] (IGNITE-11644) Get rid of old exchange protocol

2019-05-24 Thread Amelchev Nikita (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847650#comment-16847650
 ] 

Amelchev Nikita commented on IGNITE-11644:
--

[~agoncharuk], Hi. Can this be merged to 2.8? 

> Get rid of old exchange protocol
> 
>
> Key: IGNITE-11644
> URL: https://issues.apache.org/jira/browse/IGNITE-11644
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Priority: Major
>
> Old (non-merging exchange protocol) is not used anymore and should be removed 
> from the code to clean it up.





[jira] [Comment Edited] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

2019-05-24 Thread Amelchev Nikita (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847456#comment-16847456
 ] 

Amelchev Nikita edited comment on IGNITE-9913 at 5/24/19 2:14 PM:
--

I investigate the issue about MOVING partitions - should we allow lightweight 
PME if the cluster has those partitions or not. 
[Dev-list 
discussion.|http://apache-ignite-developers.2346864.n4.nabble.com/Lightweight-version-of-partitions-map-exchange-td41551.html]


was (Author: nsamelchev):
I investigate the issue about MOVING partitions - should we allow lightweight 
PME if the cluster has those partitions. 
[Dev-list 
discussion.|http://apache-ignite-developers.2346864.n4.nabble.com/Lightweight-version-of-partitions-map-exchange-td41551.html]

> Prevent data updates blocking in case of backup BLT server node leave
> -
>
> Key: IGNITE-9913
> URL: https://issues.apache.org/jira/browse/IGNITE-9913
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Ivan Rakov
>Assignee: Amelchev Nikita
>Priority: Major
> Fix For: 2.8
>
> Attachments: 9913_yardstick.png, master_yardstick.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ignite cluster performs distributed partition map exchange when any server 
> node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all 
> partitions are assigned according to the baseline topology and server node 
> leaves, there's no actual need to perform distributed PME: every cluster node 
> is able to recalculate new affinity assignments and partition states locally. 
> If we implement such lightweight PME and handle mapping and lock requests 
> on new topology version correctly, updates won't be stopped (except updates 
> of partitions that lost their primary copy).
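
The local recalculation mentioned in the description can be illustrated with a small Java sketch. It is a simplification under stated assumptions, not Ignite's affinity code: the class {{LocalAffinityRecalc}} and its {{Result}} type are hypothetical, an assignment is modelled as a map from partition id to an ordered owner list whose first entry is the primary, and the node that left is simply dropped from each list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: every node can run this locally with the same input and get the same output. */
final class LocalAffinityRecalc {
    /** New assignment plus the partitions whose primary copy was on the node that left. */
    static final class Result {
        final Map<Integer, List<String>> assignment = new TreeMap<>();
        final Set<Integer> lostPrimary = new TreeSet<>();
    }

    /** Removes the left node from every owner list; the first owner is treated as the primary. */
    static Result onNodeLeft(Map<Integer, List<String>> assignment, String leftNode) {
        Result res = new Result();

        for (Map.Entry<Integer, List<String>> e : assignment.entrySet()) {
            List<String> owners = new ArrayList<>(e.getValue());

            // Only these partitions would need their updates paused.
            if (!owners.isEmpty() && owners.get(0).equals(leftNode))
                res.lostPrimary.add(e.getKey());

            owners.remove(leftNode);

            res.assignment.put(e.getKey(), owners);
        }

        return res;
    }
}
```

Because the computation is deterministic and depends only on the previous assignment and the id of the node that left, no distributed exchange is needed for the nodes to agree on the result.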





[jira] [Updated] (IGNITE-11872) Add EmptyLineSeparator to codestyle checker

2019-05-24 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11872:
-
Ignite Flags:   (was: Docs Required)

> Add EmptyLineSeparator to codestyle checker
> ---
>
> Key: IGNITE-11872
> URL: https://issues.apache.org/jira/browse/IGNITE-11872
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.7
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
>  Labels: checkstyle
> Fix For: 2.8
>
>
> We should add the following check:
> {code}
> 
> {code}
> and fix all current errors





[jira] [Updated] (IGNITE-11872) Add EmptyLineSeparator to codestyle checker

2019-05-24 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11872:
-
Labels: checkstyle  (was: )

> Add EmptyLineSeparator to codestyle checker
> ---
>
> Key: IGNITE-11872
> URL: https://issues.apache.org/jira/browse/IGNITE-11872
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.7
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
>  Labels: checkstyle
> Fix For: 2.8
>
>
> We should add the following check:
> {code}
> 
> {code}
> and fix all current errors





[jira] [Created] (IGNITE-11872) Add EmptyLineSeparator to codestyle checker

2019-05-24 Thread Nikolay Izhikov (JIRA)
Nikolay Izhikov created IGNITE-11872:


 Summary: Add EmptyLineSeparator to codestyle checker
 Key: IGNITE-11872
 URL: https://issues.apache.org/jira/browse/IGNITE-11872
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.7
Reporter: Nikolay Izhikov
Assignee: Nikolay Izhikov
 Fix For: 2.8


We should add the following check:
{code}

{code}

and fix all current errors





[jira] [Commented] (IGNITE-11756) SQL: implement a table row count statistics for the local queries

2019-05-24 Thread Roman Kondakov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847524#comment-16847524
 ] 

Roman Kondakov commented on IGNITE-11756:
-

[~amashenkov], [~Pavlukhin], patch is ready for merge. Tests look good. Code 
style is broken in the master branch.

> SQL: implement a table row count statistics for the local queries
> -
>
> Key: IGNITE-11756
> URL: https://issues.apache.org/jira/browse/IGNITE-11756
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Roman Kondakov
>Assignee: Roman Kondakov
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Row count statistics should help the H2 optimizer select a better query 
> execution plan. Currently the row count supplied to the H2 engine is hardcoded 
> to 1 (see {{org.h2.index.Index#getRowCountApproximation}}). As a 
> first step we can provide the actual table size for local queries. To 
> avoid counting the size on each invocation we can cache the row count and 
> invalidate it in some cases:
>  * Rebalancing
>  * Multiple updates (after the initial loading)
>  * On timeout (e.g. 1 minute)
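
The caching scheme with the three invalidation triggers listed above could look roughly like the Java sketch below. It is an assumption-laden illustration, not the patch attached to IGNITE-11756: the class {{CachedRowCount}}, its constructor parameters and the update threshold are hypothetical, and the clock is injected so that the timeout can be exercised without waiting.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch, not Ignite or H2 code: caches a table's row count and invalidates it. */
final class CachedRowCount {
    private final LongSupplier counter; // Expensive real count, e.g. a scan of the local table.
    private final LongSupplier clock;   // Current time in milliseconds (injectable for tests).
    private final long ttlMs;           // Timeout after which the cached value is stale.
    private final int maxUpdates;       // Number of updates after which the value is stale.

    private long cached;
    private long cachedAt;
    private int updates;
    private boolean valid;

    CachedRowCount(LongSupplier counter, LongSupplier clock, long ttlMs, int maxUpdates) {
        this.counter = counter;
        this.clock = clock;
        this.ttlMs = ttlMs;
        this.maxUpdates = maxUpdates;
    }

    /** Returns the cached count, recomputing it if it was invalidated or has expired. */
    synchronized long get() {
        long now = clock.getAsLong();

        if (!valid || now - cachedAt >= ttlMs) {
            cached = counter.getAsLong();
            cachedAt = now;
            updates = 0;
            valid = true;
        }

        return cached;
    }

    /** Called on each row update; invalidates once enough updates have accumulated. */
    synchronized void onUpdate() {
        if (++updates >= maxUpdates)
            valid = false;
    }

    /** Called when rebalancing changes the local data set. */
    synchronized void onRebalance() {
        valid = false;
    }
}
```

An optimizer hook would call {{get()}} on every planning request, while the storage layer would call {{onUpdate()}} and {{onRebalance()}}; both call sites are assumptions of this sketch.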





[jira] [Commented] (IGNITE-11756) SQL: implement a table row count statistics for the local queries

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847521#comment-16847521
 ] 

Ignite TC Bot commented on IGNITE-11756:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}[Check Code Style]{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=3931520]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3925742&buildTypeId=IgniteTests24Java8_RunAll]

> SQL: implement a table row count statistics for the local queries
> -
>
> Key: IGNITE-11756
> URL: https://issues.apache.org/jira/browse/IGNITE-11756
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Roman Kondakov
>Assignee: Roman Kondakov
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Row count statistics should help the H2 optimizer select a better query 
> execution plan. Currently the row count supplied to the H2 engine is hardcoded 
> to 1 (see {{org.h2.index.Index#getRowCountApproximation}}). As a 
> first step we can provide the actual table size for local queries. To 
> avoid counting the size on each invocation we can cache the row count and 
> invalidate it in some cases:
>  * Rebalancing
>  * Multiple updates (after the initial loading)
>  * On timeout (e.g. 1 minute)





[jira] [Commented] (IGNITE-11869) control.sh idle_verify/validate_indexes shouldn't throw GridNotIdleException, if user pages wasn't modified in checkpoint.

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847499#comment-16847499
 ] 

Ignite TC Bot commented on IGNITE-11869:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}ZooKeeper (Discovery) 2{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=3924907]]
* ZookeeperDiscoverySpiTestSuite2: 
GridCommandHandlerTest.testIdleVerifyCheckCrcFailsOnNotIdleCluster

{color:#d04437}Basic 3{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=3932781]]
* IgniteBasicWithPersistenceTestSuite: 
GridCommandHandlerTest.testIdleVerifyCheckCrcFailsOnNotIdleCluster

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3924983&buildTypeId=IgniteTests24Java8_RunAll]

> control.sh idle_verify/validate_indexes shouldn't throw GridNotIdleException, 
> if user pages wasn't modified in checkpoint.
> --
>
> Key: IGNITE-11869
> URL: https://issues.apache.org/jira/browse/IGNITE-11869
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8
>Reporter: Sergey Antonov
>Assignee: Sergey Antonov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We shouldn't throw GridNotIdleException, if checkpoint contains dirty pages 
> related to ignite-sys-cache (system background activities) only.





[jira] [Updated] (IGNITE-11871) [ML] IP resolver in TensorFlow cluster manager doesn't work properly

2019-05-24 Thread Anton Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Dmitriev updated IGNITE-11871:

Description: 
TensorFlow cluster manager requires NodeId to be resolved into IP address or 
hostname to pass the address/name to TensorFlow worker. Currently, it uses 
strategy "return first" and returns the first available address/name. As a 
result of that, in the case when the server has more than one interface cluster 
resolver might work incorrectly and return different addresses/names for the 
same server.

To fix this problem we need to update 
[TensorFlowServerAddressSpec|https://github.com/apache/ignite/blob/master/modules/tensorflow/src/main/java/org/apache/ignite/tensorflow/cluster/spec/TensorFlowServerAddressSpec.java]
 so that it returns the same address/name for the same server all the time. If 
a server has multiple network interfaces we need to find a "GCD", a network 
with all Ignite nodes.

  was:
TensorFlow cluster manager requires NodeId to be resolved into IP address or 
hostname to pass the address/name to TensorFlow worker. Currently, it uses 
strategy "return first" and returns the first available address/name. As a 
result of that, in the case when the server has more than one interface cluster 
resolver might work incorrectly and return different addresses/names for the 
same server.

To fix this problem we need to update TensorFlowServerAddressSpec so that it 
returns the same address/name for the same server all the time. If a server has 
multiple network interfaces we need to find a "GCD", a network with all Ignite 
nodes.


> [ML] IP resolver in TensorFlow cluster manager doesn't work properly
> 
>
> Key: IGNITE-11871
> URL: https://issues.apache.org/jira/browse/IGNITE-11871
> Project: Ignite
>  Issue Type: Bug
>  Components: ml
>Affects Versions: 2.7
>Reporter: Anton Dmitriev
>Assignee: Anton Dmitriev
>Priority: Major
>
> TensorFlow cluster manager requires NodeId to be resolved into IP address or 
> hostname to pass the address/name to TensorFlow worker. Currently, it uses 
> strategy "return first" and returns the first available address/name. As a 
> result of that, in the case when the server has more than one interface 
> cluster resolver might work incorrectly and return different addresses/names 
> for the same server.
> To fix this problem we need to update 
> [TensorFlowServerAddressSpec|https://github.com/apache/ignite/blob/master/modules/tensorflow/src/main/java/org/apache/ignite/tensorflow/cluster/spec/TensorFlowServerAddressSpec.java]
>  so that it returns the same address/name for the same server all the time. 
> If a server has multiple network interfaces we need to find a "GCD", a 
> network with all Ignite nodes.





[jira] [Updated] (IGNITE-11871) [ML] IP resolver in TensorFlow cluster manager doesn't work properly

2019-05-24 Thread Anton Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Dmitriev updated IGNITE-11871:

Ignite Flags:   (was: Docs Required)

> [ML] IP resolver in TensorFlow cluster manager doesn't work properly
> 
>
> Key: IGNITE-11871
> URL: https://issues.apache.org/jira/browse/IGNITE-11871
> Project: Ignite
>  Issue Type: Bug
>  Components: ml
>Affects Versions: 2.7
>Reporter: Anton Dmitriev
>Assignee: Anton Dmitriev
>Priority: Major
>
> TensorFlow cluster manager requires NodeId to be resolved into IP address or 
> hostname to pass the address/name to TensorFlow worker. Currently, it uses 
> strategy "return first" and returns the first available address/name. As a 
> result of that, in the case when the server has more than one interface 
> cluster resolver might work incorrectly and return different addresses/names 
> for the same server.
> To fix this problem we need to update TensorFlowServerAddressSpec so that it 
> returns the same address/name for the same server all the time. If a server 
> has multiple network interfaces we need to find a "GCD", a network with all 
> Ignite nodes.





[jira] [Created] (IGNITE-11871) [ML] IP resolver in TensorFlow cluster manager doesn't work properly

2019-05-24 Thread Anton Dmitriev (JIRA)
Anton Dmitriev created IGNITE-11871:
---

 Summary: [ML] IP resolver in TensorFlow cluster manager doesn't 
work properly
 Key: IGNITE-11871
 URL: https://issues.apache.org/jira/browse/IGNITE-11871
 Project: Ignite
  Issue Type: Bug
  Components: ml
Affects Versions: 2.7
Reporter: Anton Dmitriev
Assignee: Anton Dmitriev


TensorFlow cluster manager requires NodeId to be resolved into IP address or 
hostname to pass the address/name to TensorFlow worker. Currently, it uses 
strategy "return first" and returns the first available address/name. As a 
result of that, in the case when the server has more than one interface cluster 
resolver might work incorrectly and return different addresses/names for the 
same server.

To fix this problem we need to update TensorFlowServerAddressSpec so that it 
returns the same address/name for the same server all the time. If a server has 
multiple network interfaces we need to find a "GCD", a network with all Ignite 
nodes.
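
One possible reading of the "GCD" idea is sketched below in Java. This is not the TensorFlowServerAddressSpec code, and it rests on simplifying assumptions that are not in the issue: addresses are IPv4 dotted strings, a "network" is approximated by the first three octets (a /24 mask), and the names {{CommonNetworkResolver}}, {{network}} and {{resolve}} are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: picks one stable address per node from a network shared by all nodes. */
final class CommonNetworkResolver {
    /** Network key of an IPv4 address under the assumed /24 mask: its first three octets. */
    static String network(String addr) {
        return addr.substring(0, addr.lastIndexOf('.'));
    }

    /**
     * Maps every node id to an address in a network that all nodes belong to.
     * Sorted collections make the choice independent of the input order.
     */
    static Map<String, String> resolve(Map<String, List<String>> nodeAddrs) {
        Set<String> common = null;

        // Intersect the sets of networks seen on each node.
        for (List<String> addrs : nodeAddrs.values()) {
            Set<String> nets = new TreeSet<>();

            for (String a : addrs)
                nets.add(network(a));

            if (common == null)
                common = nets;
            else
                common.retainAll(nets);
        }

        if (common == null || common.isEmpty())
            throw new IllegalStateException("No network is shared by all nodes");

        // Smallest shared network in TreeSet order, so the result is deterministic.
        String net = common.iterator().next();

        Map<String, String> res = new TreeMap<>();

        for (Map.Entry<String, List<String>> e : nodeAddrs.entrySet()) {
            String best = null;

            for (String a : e.getValue()) {
                if (network(a).equals(net) && (best == null || a.compareTo(best) < 0))
                    best = a;
            }

            res.put(e.getKey(), best);
        }

        return res;
    }
}
```

Unlike a "return first" strategy, the result here does not depend on the order in which interfaces are enumerated, so repeated calls return the same address for the same server.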





[jira] [Updated] (IGNITE-11870) [ML] Changes required to support ML Python API

2019-05-24 Thread Yury Babak (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Babak updated IGNITE-11870:

Ignite Flags:   (was: Docs Required)

> [ML] Changes required to support ML Python API
> --
>
> Key: IGNITE-11870
> URL: https://issues.apache.org/jira/browse/IGNITE-11870
> Project: Ignite
>  Issue Type: Improvement
>  Components: ml
>Reporter: Anton Dmitriev
>Assignee: Anton Dmitriev
>Priority: Major
>  Labels: ml
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To support ML Python API we need to change the existing API of the ML module.





[jira] [Commented] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

2019-05-24 Thread Amelchev Nikita (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847456#comment-16847456
 ] 

Amelchev Nikita commented on IGNITE-9913:
-

I investigate the issue about MOVING partitions - should we allow lightweight 
PME if the cluster has those partitions. 
[Dev-list 
discussion.|http://apache-ignite-developers.2346864.n4.nabble.com/Lightweight-version-of-partitions-map-exchange-td41551.html]

> Prevent data updates blocking in case of backup BLT server node leave
> -
>
> Key: IGNITE-9913
> URL: https://issues.apache.org/jira/browse/IGNITE-9913
> Project: Ignite
>  Issue Type: Improvement
>  Components: general
>Reporter: Ivan Rakov
>Assignee: Amelchev Nikita
>Priority: Major
> Fix For: 2.8
>
> Attachments: 9913_yardstick.png, master_yardstick.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ignite cluster performs distributed partition map exchange when any server 
> node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all 
> partitions are assigned according to the baseline topology and server node 
> leaves, there's no actual need to perform distributed PME: every cluster node 
> is able to recalculate new affinity assignments and partition states locally. 
> If we implement such lightweight PME and handle mapping and lock requests 
> on new topology version correctly, updates won't be stopped (except updates 
> of partitions that lost their primary copy).





[jira] [Commented] (IGNITE-5714) Implementation of suspend/resume for pessimistic transactions

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847439#comment-16847439
 ] 

Ignite TC Bot commented on IGNITE-5714:
---

{panel:title=-- Run :: All: No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3924867&buildTypeId=IgniteTests24Java8_RunAll]

> Implementation of suspend/resume for pessimistic transactions
> -
>
> Key: IGNITE-5714
> URL: https://issues.apache.org/jira/browse/IGNITE-5714
> Project: Ignite
>  Issue Type: Sub-task
>  Components: general
>Reporter: Alexey Kuznetsov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Labels: iep-34
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Support transaction suspend()/resume() operations for pessimistic 
> transactions. Resume can be called in another thread.
>_+But there is a problem+_: Imagine we start a pessimistic transaction in 
> thread T1 and then perform a put operation, which leads to sending 
> GridDistributedLockRequest to another node. The lock request contains the thread id 
> of the transaction. Then we call suspend and resume in another thread, and we 
> must also send messages to the other nodes to change the thread id. 
> This seems like a complicated task. It's better to get rid of sending the thread id to the 
> nodes.
> We can use the transaction xid on other nodes instead of the thread id. The xid is sent 
> to nodes in GridDistributedLockRequest#nearXidVer.
>_+Proposed solution+_: On remote nodes, instead of the thread id of the near 
> transaction (GridDistributedLockRequest#threadId), use its xid 
> (GridDistributedLockRequest#nearXidVer).
> Remove usages of the near transaction's thread id on remote nodes.
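
The effect of the proposed solution can be illustrated with a minimal Java sketch of a lock table on a remote node. It is not Ignite code: the class {{XidLockTable}} is hypothetical, keys are simplified to strings, and a {{UUID}} stands in for the near transaction's xid.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical remote-node lock table; owners are transaction ids rather than thread ids. */
final class XidLockTable {
    private final Map<String, UUID> owners = new ConcurrentHashMap<>();

    /** Acquires the key for the transaction; re-entrant for the same xid from any thread. */
    boolean lock(String key, UUID xid) {
        UUID prev = owners.putIfAbsent(key, xid);

        return prev == null || prev.equals(xid);
    }

    /** Releases the key only if it is currently held by the given transaction. */
    boolean unlock(String key, UUID xid) {
        return owners.remove(key, xid);
    }
}
```

Since ownership is keyed by xid, a transaction that is suspended in thread T1 and resumed in another thread still owns its locks on the remote node, and no extra message is needed to announce the thread change.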





[jira] [Updated] (IGNITE-4781) Web Console: Revise the Queries Screen

2019-05-24 Thread Vica Abramova (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vica Abramova updated IGNITE-4781:
--
Description: (was: Mockup: https://zpl.io/adzyZ12)

> Web Console: Revise the Queries Screen
> --
>
> Key: IGNITE-4781
> URL: https://issues.apache.org/jira/browse/IGNITE-4781
> Project: Ignite
>  Issue Type: Task
>  Components: UI, wizards
>Reporter: Vica Abramova
>Assignee: Ilya Borisov
>Priority: Major
>






[jira] [Commented] (IGNITE-11708) Unable to run tests in IgniteConfigVariationsAbstractTest subclasses

2019-05-24 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847363#comment-16847363
 ] 

Ivan Pavlukhin commented on IGNITE-11708:
-

[~ivanan.fed] I see a problem in 
{{IgniteCacheConfigVariationsAbstractTest#beforeTestsStarted}}: by design it 
prestarts a cluster, and to do so properly it requires _testsCfg_ (an instance 
variable) to be already initialized. It seems that this worked with the previous 
rules stacking approach. I do not have any good ideas so far and will 
think about it. For now, a couple of raw ideas:
* Consider using a static variable instead of _testsCfg_.
* Do not initialize _testsCfg_ with a default value, to avoid skipping the real 
config that is expected to be injected.

> Unable to run tests in IgniteConfigVariationsAbstractTest subclasses
> 
>
> Key: IGNITE-11708
> URL: https://issues.apache.org/jira/browse/IGNITE-11708
> Project: Ignite
>  Issue Type: Bug
>Reporter: Ivan Fedotov
>Assignee: Ivan Fedotov
>Priority: Major
>  Labels: iep30
> Attachments: read_through_eviction_self_test.patch, 
> tx_out_test_fixed.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> It seems that tests in classes that extend 
> IgniteConfigVariationsAbstractTest cannot be started with the JUnit4 @Test 
> annotation. 
> This is easy to check: if an exception is thrown in any test method, nothing 
> happens.
> The reason may be the rule chain in the IgniteConfigVariationsAbstractTest class [1]; 
> maybe it breaks the existing test workflow.
> [1] 
> https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/IgniteConfigVariationsAbstractTest.java#L62





[jira] [Updated] (IGNITE-11865) FailureProcessor treats tcp-comm-worker as blocked when it works on reestablishing connect to failed client node

2019-05-24 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11865:

Ignite Flags:   (was: Docs Required)

> FailureProcessor treats tcp-comm-worker as blocked when it works on 
> reestablishing connect to failed client node
> 
>
> Key: IGNITE-11865
> URL: https://issues.apache.org/jira/browse/IGNITE-11865
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.7
> Reporter: Sergey Chugunov
> Assignee: Sergey Chugunov
> Priority: Minor
> Fix For: 2.8
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a client node fails, the tcp-comm-worker thread on the server keeps 
> trying to reestablish the connection to the client until the failed node is 
> removed from the topology (on expiration of clientFailureDetectionTimeout).
> As the tcp-comm-worker thread doesn't update its heartbeat from internal 
> loops, FailureProcessor considers it blocked and prints a misleading message 
> to the logs along with a full thread dump.
> To avoid polluting the logs with unnecessary messages, we need to teach 
> tcp-comm-worker to update its heartbeat timestamp in FailureProcessor.
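
The heartbeat mechanism in the description above can be sketched as follows. This is an illustration only: HeartbeatWorker, Watchdog, reconnectLoop and looksBlocked are hypothetical names, not the Ignite worker or FailureProcessor API, and the clock is simulated with plain millisecond values so the result is deterministic. The sketch assumes a watchdog that treats a worker as blocked when its last heartbeat is older than a threshold.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical worker that records a heartbeat timestamp; not an Ignite class. */
final class HeartbeatWorker {
    /** Time of the last heartbeat, in simulated milliseconds. */
    private final AtomicLong heartbeatTs = new AtomicLong();

    HeartbeatWorker(long nowMs) {
        heartbeatTs.set(nowMs);
    }

    void updateHeartbeat(long nowMs) {
        heartbeatTs.set(nowMs);
    }

    long heartbeatTs() {
        return heartbeatTs.get();
    }

    /**
     * Simulates a reconnect loop in which every failed attempt takes attemptMs.
     * Returns the simulated time at which the loop ends.
     */
    long reconnectLoop(long startMs, int attempts, long attemptMs, boolean updateInLoop) {
        long now = startMs;

        for (int i = 0; i < attempts; i++) {
            now += attemptMs; // One failed connection attempt.

            if (updateInLoop)
                updateHeartbeat(now); // The worker is alive, so say so.
        }

        return now;
    }
}

/** Hypothetical watchdog check based only on heartbeat age. */
final class Watchdog {
    static boolean looksBlocked(HeartbeatWorker w, long nowMs, long thresholdMs) {
        return nowMs - w.heartbeatTs() > thresholdMs;
    }

    public static void main(String[] args) {
        HeartbeatWorker silent = new HeartbeatWorker(0);
        long end = silent.reconnectLoop(0, 10, 1_000, false);
        System.out.println("no updates in loop, looks blocked: " + looksBlocked(silent, end, 5_000));

        HeartbeatWorker reporting = new HeartbeatWorker(0);
        end = reporting.reconnectLoop(0, 10, 1_000, true);
        System.out.println("updates in loop, looks blocked: " + looksBlocked(reporting, end, 5_000));
    }
}
```

In both runs the worker does the same amount of work; only the run that reports from inside the loop avoids being classified as blocked.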





[jira] [Created] (IGNITE-11870) [ML] Changes required to support ML Python API

2019-05-24 Thread Anton Dmitriev (JIRA)
Anton Dmitriev created IGNITE-11870:
---

Summary: [ML] Changes required to support ML Python API
Key: IGNITE-11870
URL: https://issues.apache.org/jira/browse/IGNITE-11870
Project: Ignite
Issue Type: Improvement
Components: ml
Reporter: Anton Dmitriev
Assignee: Anton Dmitriev


To support the ML Python API, we need to change the existing API of the ML module.





[jira] [Commented] (IGNITE-11865) FailureProcessor treats tcp-comm-worker as blocked when it works on reestablishing connect to failed client node

2019-05-24 Thread Sergey Chugunov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847313#comment-16847313
 ] 

Sergey Chugunov commented on IGNITE-11865:
--

[~DmitriyGovorukhin],

The Scala failure on TC looks unrelated to this code change. Could you please 
find a moment to review it?






[jira] [Updated] (IGNITE-11865) FailureProcessor treats tcp-comm-worker as blocked when it works on reestablishing connect to failed client node

2019-05-24 Thread Sergey Chugunov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-11865:
-
Reviewer: Dmitriy Govorukhin






[jira] [Commented] (IGNITE-11865) FailureProcessor treats tcp-comm-worker as blocked when it works on reestablishing connect to failed client node

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847311#comment-16847311
 ] 

Ignite TC Bot commented on IGNITE-11865:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Scala (Visor Console){color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=3922845]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3913074&buildTypeId=IgniteTests24Java8_RunAll]






[jira] [Commented] (IGNITE-11670) Java thin client: Queries are inconsistent in case of failover

2019-05-24 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847292#comment-16847292
 ] 

Ignite TC Bot commented on IGNITE-11670:


{panel:title=-- Run :: Basic Tests: No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: Basic Tests* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3924513&buildTypeId=IgniteTests24Java8_RunBasicTests]

> Java thin client: Queries are inconsistent in case of failover
> --
>
> Key: IGNITE-11670
> URL: https://issues.apache.org/jira/browse/IGNITE-11670
> Project: Ignite
> Issue Type: Bug
> Components: thin client
> Affects Versions: 2.7
> Reporter: Aleksey Plekhanov
> Assignee: Aleksey Plekhanov
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a thin client fails over and switches to a new server, open cursors 
> become inconsistent and silently return the wrong result.
> Reproducer:
> {code:java}
> public void testQueryFailover() throws Exception {
>     try (LocalIgniteCluster cluster = LocalIgniteCluster.start(1);
>          IgniteClient client = Ignition.startClient(new ClientConfiguration()
>              .setAddresses(cluster.clientAddresses().iterator().next()))
>     ) {
>         // Look up the MBean that can drop all thin client connections.
>         ObjectName mbeanName = U.makeMBeanName(Ignition.allGrids().get(0).name(), "Clients",
>             ClientListenerProcessor.class.getSimpleName());
>         ClientProcessorMXBean mxBean = MBeanServerInvocationHandler.newProxyInstance(
>             ManagementFactory.getPlatformMBeanServer(), mbeanName, ClientProcessorMXBean.class, true);
>         ClientCache<Integer, String> cache = client.createCache("cache");
>         cache.put(0, "0");
>         cache.put(1, "1");
>         // A page size of 1 forces a second page request after the connections are dropped.
>         Query<Cache.Entry<Integer, String>> qry = new ScanQuery<Integer, String>().setPageSize(1);
>         try (QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry)) {
>             int cnt = 0;
>             for (Iterator<Cache.Entry<Integer, String>> it = cur.iterator(); it.hasNext(); it.next()) {
>                 cnt++;
>                 if (cnt == 1)
>                     mxBean.dropAllConnections(); // Force a failover in the middle of the iteration.
>             }
>             assertEquals(2, cnt);
>         }
>     }
> }
> {code}


