Hello

2019-08-22 Thread Amit Chavan
Hello,

My name is Amit Chavan and I am just starting to use Ignite and play around
with it. I really appreciate the work this community has done on the project.
I also want to help contribute to the project. Is the Jira board the best
place to look for issues to work on?

Thanks,
Amit


Re: Do I have to use --illegal-access=permit for Java thin client and JDBC with JDK 9/10/11.

2019-08-22 Thread Denis Magda
OK, I updated the docs to say:

"4. Add the following VM options to your Java applications. That's not
needed if you use Java thin clients or Ignite JDBC."
https://apacheignite.readme.io/docs/getting-started#section-running-ignite-with-java-9-10-11
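
For reference, here is a minimal Java thin client snippet of the kind discussed
in this thread; per the conclusion above it should run on JDK 9/10/11 without
any extra VM options. The address and cache name below are just placeholders.

    import org.apache.ignite.Ignition;
    import org.apache.ignite.client.ClientCache;
    import org.apache.ignite.client.IgniteClient;
    import org.apache.ignite.configuration.ClientConfiguration;

    public class ThinClientExample {
        public static void main(String[] args) throws Exception {
            // Connect to a server node's thin-client port (10800 by default).
            ClientConfiguration cfg = new ClientConfiguration()
                .setAddresses("127.0.0.1:10800");

            try (IgniteClient client = Ignition.startClient(cfg)) {
                ClientCache<Integer, String> cache = client.getOrCreateCache("test-cache");

                cache.put(1, "one");
                System.out.println(cache.get(1));
            }
        }
    }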

-
Denis


On Thu, Aug 22, 2019 at 9:30 AM Denis Mekhanikov 
wrote:

> Denis,
>
> I didn’t find any usages of JDK internals in the implementation of the
> thin clients.
> It would be nice to verify in tests that thin clients can work without
> these flags.
>
> Do our Java 9/10/11 tests include thin client testing? If so, do these
> tests include these flags?
>
> Denis
> On 15 Aug 2019, 11:09 +0300, Denis Magda , wrote:
> > Denis,
> >
> > Does it mean we don't need to pass any flags from this list [1] at all
> for
> > the JDBC and thin clients?
> >
> > [1]
> >
> https://apacheignite.readme.io/docs/getting-started#section-running-ignite-with-java-9-10-11
> >
> > -
> > Denis
> >
> >
> > On Wed, Aug 14, 2019 at 5:56 PM Denis Mekhanikov 
> > wrote:
> >
> > > Hi!
> > >
> > > There are two JDK internal things that are used by Ignite: Unsafe and
> > > sun.nio.ch package.
> > > None of these things are used by thin clients. So, it’s fine to use
> thin
> > > clients without additional flags.
> > >
> > > Denis
> > >
> > > > On 13 Aug 2019, at 23:01, Shane Duan  wrote:
> > > >
> > > > Hi Igniter,
> > > >
> > > > I understand that --illegal-access=permit is required for JDK
> 9/10/11 on
> > > > Ignite server. But do I have to include this JVM parameter for Ignite
> > > Java
> > > > thin client and JDBC client? I tried some simple test without it and
> it
> > > > seems working fine...
> > > >
> > > >
> > > > Thanks,
> > > > Shane
> > >
> > >
>


Re: Node failure with "Failed to write buffer." error

2019-08-22 Thread Denis Magda
Ivan, Alex Goncharuk,

The exception trace is not helpful; it's not obvious what the cause might be or
how to address it. How do we tackle these problems?

Ibrahim, please attach all the log files for a detailed look.
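
For anyone decoding the failure-handler dump in the trace below: the handler
named in the log can be configured explicitly, roughly as in the sketch below.
This only mirrors what the log shows and does not address the underlying
ClosedChannelException.

    import java.util.EnumSet;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.failure.FailureType;
    import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

    public class FailureHandlerConfig {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Matches the handler in the log: tryStop=false, timeout=0.
            StopNodeOrHaltFailureHandler hnd = new StopNodeOrHaltFailureHandler(false, 0);

            // The same failure types the log shows as ignored.
            hnd.setIgnoredFailureTypes(EnumSet.of(
                FailureType.SYSTEM_WORKER_BLOCKED,
                FailureType.SYSTEM_CRITICAL_OPERATION_TIMEOUT));

            cfg.setFailureHandler(hnd);

            Ignite ignite = Ignition.start(cfg);
        }
    }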

-
Denis


On Thu, Aug 22, 2019 at 3:08 AM ihalilaltun 
wrote:

> Hi folks,
>
> We have been experiencing node failures with the error "Failed to write
> buffer." recently. Any ideas or optimizations to avoid this error and the
> node failures?
>
> Thanks...
>
> [2019-08-22T01:20:55,916][ERROR][wal-write-worker%null-#221][] Critical
> system error detected. Will be handled accordingly to configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED,
> SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext
> [type=CRITICAL_ERROR, err=class
> o.a.i.i.processors.cache.persistence.StorageException: Failed to write
> buffer.]]
> org.apache.ignite.internal.processors.cache.persistence.StorageException:
> Failed to write buffer.
> at
>
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3484)
> [ignite-core-2.7.5.jar:2.7.5]
> at
>
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.body(FileWriteAheadLogManager.java:3301)
> [ignite-core-2.7.5.jar:2.7.5]
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> [ignite-core-2.7.5.jar:2.7.5]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
> Caused by: java.nio.channels.ClosedChannelException
> at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
> ~[?:1.8.0_201]
> at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:253)
> ~[?:1.8.0_201]
> at
>
> org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.position(RandomAccessFileIO.java:48)
> ~[ignite-core-2.7.5.jar:2.7.5]
> at
>
> org.apache.ignite.internal.processors.cache.persistence.file.FileIODecorator.position(FileIODecorator.java:41)
> ~[ignite-core-2.7.5.jar:2.7.5]
> at
>
> org.apache.ignite.internal.processors.cache.persistence.file.AbstractFileIO.writeFully(AbstractFileIO.java:111)
> ~[ignite-core-2.7.5.jar:2.7.5]
> at
>
> org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3477)
> ~[ignite-core-2.7.5.jar:2.7.5]
> ... 3 more
> [2019-08-22T01:20:55,921][WARN
> ][wal-write-worker%null-#221][FailureProcessor] No deadlocked threads
> detected.
> [2019-08-22T01:20:56,347][WARN
> ][wal-write-worker%null-#221][FailureProcessor] Thread dump at 2019/08/22
> 01:20:56 UTC
>
>
> *Ignite version*: 2.7.5
> *Cluster size*: 16
> *Client size*: 22
> *Cluster OS version*: Centos 7
> *Cluster Kernel version*: 4.4.185-1.el7.elrepo.x86_64
> *Java version* :
> java version "1.8.0_201"
> Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
> Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
>
> Current disk sizes;
> Screen_Shot_2019-08-22_at_12.png
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t2515/Screen_Shot_2019-08-22_at_12.png>
>
> Ignite and gc logs;
> ignite-9.zip
> 
> Ignite configuration file;
> default-config.xml
> <
> http://apache-ignite-users.70518.x6.nabble.com/file/t2515/default-config.xml>
>
>
>
>
> -
> İbrahim Halil Altun
> Senior Software Engineer @ Segmentify
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: [VOTE] Release Apache Ignite 2.7.6-rc1

2019-08-22 Thread Denis Magda
+1

-
Denis


On Thu, Aug 22, 2019 at 10:12 AM Dmitriy Pavlov  wrote:

> Dear Community,
>
> I have uploaded release candidate to
> https://dist.apache.org/repos/dist/dev/ignite/2.7.6-rc1/
> https://dist.apache.org/repos/dist/dev/ignite/packages_2.7.6-rc1/
>
> The following staging can be used for any dependent project for testing:
> https://repository.apache.org/content/repositories/orgapacheignite-1466/
>
> This is the second maintenance release for 2.7.x with a number of fixes.
>
> Tag name is 2.7.6-rc1:
>
> https://gitbox.apache.org/repos/asf?p=ignite.git;a=tag;h=refs/tags/2.7.6-rc1
>
> 2.7.6 changes:
>  * Ignite work directory is now set to the current user's home directory by
> default, native persistence files will not be stored in the Temp directory
> anymore
>  * Fixed a bug that caused a SELECT query with an equality predicate on a
> part of the primary compound key to return a single row even if the query
> matched multiple rows
>  * Fixed an issue that could cause data corruption during checkpointing
>  * Fixed an issue where a row size was calculated incorrectly for shared
> cache groups, which caused a tree corruption
>  * Reduced java heap footprint by optimizing GridDhtPartitionsFullMessage
> maps in exchange history
>  * .NET: Native persistence now works with a custom affinity function
>  * Fixed an issue where an outdated node with a destroyed cache caused the
> cluster to hang
>  * Fixed a bug that made it impossible to change the inline_size property
> of an existing index after it was dropped and recreated with a different
> value
>
> RELEASE NOTES:
>
> https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob;f=RELEASE_NOTES.txt;hb=ignite-2.7.6
>
> Complete list of closed issues:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%202.7.6
>
> DEVNOTES
>
> https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob_plain;f=DEVNOTES.txt;hb=ignite-2.7.6
>
> The vote is formal, see voting guidelines
> https://www.apache.org/foundation/voting.html
>
> +1 - to accept Apache Ignite 2.7.6-rc1
> 0 - don't care either way
> -1 - DO NOT accept Apache Ignite 2.7.6-rc1 (explain why)
>
> See notes on how to verify release here
> https://www.apache.org/info/verification.html
> and
>
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-P5.VotingonReleaseandReleaseVerification
>
> This vote will be open for at least 3 days till Sun Aug 25, 18:00 UTC.
>
> https://www.timeanddate.com/countdown/to?year=2019=8=25=18=0=0=utc-1
>
> Best Regards,
> Dmitriy Pavlov
>


[DISCUSSION] Release Apache Ignite 2.7.6-rc1

2019-08-22 Thread Dmitriy Pavlov
Hi Igniters,


Feel free to use this thread for all non-voting discussions or questions
related to release 2.7.6.



Please cast your vote in the vote thread after checking the release:

https://www.apache.org/info/verification.html

The vote thread is here:

https://lists.apache.org/thread.html/dab6624041507ddebe8b2d5fffdaff13bbe52d1fb481ac0d702cb4f1@%3Cdev.ignite.apache.org%3E



You don't need to be a PMC member or committer to vote on a release; each vote
counts.

Sincerely,
Dmitriy Pavlov


[VOTE] Release Apache Ignite 2.7.6-rc1

2019-08-22 Thread Dmitriy Pavlov
Dear Community,

I have uploaded release candidate to
https://dist.apache.org/repos/dist/dev/ignite/2.7.6-rc1/
https://dist.apache.org/repos/dist/dev/ignite/packages_2.7.6-rc1/

The following staging repository can be used by any dependent project for testing:
https://repository.apache.org/content/repositories/orgapacheignite-1466/

This is the second maintenance release for 2.7.x with a number of fixes.

Tag name is 2.7.6-rc1:
https://gitbox.apache.org/repos/asf?p=ignite.git;a=tag;h=refs/tags/2.7.6-rc1

2.7.6 changes:
 * Ignite work directory is now set to the current user's home directory by
default; native persistence files will not be stored in the temp directory
anymore
 * Fixed a bug that caused a SELECT query with an equality predicate on a
part of the primary compound key to return a single row even if the query
matched multiple rows
 * Fixed an issue that could cause data corruption during checkpointing
 * Fixed an issue where a row size was calculated incorrectly for shared
cache groups, which caused a tree corruption
 * Reduced java heap footprint by optimizing GridDhtPartitionsFullMessage
maps in exchange history
 * .NET: Native persistence now works with a custom affinity function
 * Fixed an issue where an outdated node with a destroyed cache caused the
cluster to hang
 * Fixed a bug that made it impossible to change the inline_size property
of an existing index after it was dropped and recreated with a different
value

RELEASE NOTES:
https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob;f=RELEASE_NOTES.txt;hb=ignite-2.7.6

Complete list of closed issues:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%202.7.6

DEVNOTES
https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob_plain;f=DEVNOTES.txt;hb=ignite-2.7.6

The vote is formal, see voting guidelines
https://www.apache.org/foundation/voting.html

+1 - to accept Apache Ignite 2.7.6-rc1
0 - don't care either way
-1 - DO NOT accept Apache Ignite 2.7.6-rc1 (explain why)

See notes on how to verify release here
https://www.apache.org/info/verification.html
and
https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-P5.VotingonReleaseandReleaseVerification

This vote will be open for at least 3 days till Sun Aug 25, 18:00 UTC.
https://www.timeanddate.com/countdown/to?year=2019=8=25=18=0=0=utc-1

Best Regards,
Dmitriy Pavlov


Re: Do I have to use --illegal-access=permit for Java thin client and JDBC with JDK 9/10/11.

2019-08-22 Thread Denis Mekhanikov
Denis,

I didn’t find any usages of JDK internals in the implementation of the thin 
clients.
It would be nice to verify in tests that thin clients can work without these 
flags.

Do our Java 9/10/11 tests include thin client testing? If so, do these tests 
include these flags?

Denis
On 15 Aug 2019, 11:09 +0300, Denis Magda , wrote:
> Denis,
>
> Does it mean we don't need to pass any flags from this list [1] at all for
> the JDBC and thin clients?
>
> [1]
> https://apacheignite.readme.io/docs/getting-started#section-running-ignite-with-java-9-10-11
>
> -
> Denis
>
>
> On Wed, Aug 14, 2019 at 5:56 PM Denis Mekhanikov 
> wrote:
>
> > Hi!
> >
> > There are two JDK internal things that are used by Ignite: Unsafe and
> > sun.nio.ch package.
> > None of these things are used by thin clients. So, it’s fine to use thin
> > clients without additional flags.
> >
> > Denis
> >
> > > On 13 Aug 2019, at 23:01, Shane Duan  wrote:
> > >
> > > Hi Igniter,
> > >
> > > I understand that --illegal-access=permit is required for JDK 9/10/11 on
> > > Ignite server. But do I have to include this JVM parameter for Ignite
> > Java
> > > thin client and JDBC client? I tried some simple test without it and it
> > > seems working fine...
> > >
> > >
> > > Thanks,
> > > Shane
> >
> >


Re: Community week status: Aug 13 – Aug 20

2019-08-22 Thread Garrett Alley
Thanks, Denis! We'll take a look at those doc issues and create tickets
soon.

-g-

===

Garrett Alley
Documentation
GridGain Systems


On Thu, Aug 22, 2019 at 8:30 AM Denis Mekhanikov 
wrote:

> Hi everyone!
>
> The following threads created last week address potential issues in the
> product.
>
> *Code issues:*
> *Incomplete results of SQL queries with equality conditions on a primary
> key.* JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-12068
> StackOverflow question:
> https://stackoverflow.com/questions/57479636/simple-select-query-missing-rows-in-ignite
> The issue has already been fixed.
>
> *IgniteQueue.removeAll throws NPE.* Currently it's unclear why it happens.
> Investigation is needed.
> StackOverflow question:
> https://stackoverflow.com/questions/57473783/ignite-2-5-ignitequeue-removeall-throwing-npe
> Userlist thread:
> http://apache-ignite-users.70518.x6.nabble.com/IgniteQueue-removeAll-throwing-NPE-td29072.html
>
> *Unclear error message in SQL.*
> The following query results in an unclear error message:
>
> SELECT
> CASE WHEN YEAR(A1) = 2016 THEN SUM(A2) END  FROM
>
> Error: Failed to run reduce query locally.
> StackOverflow question:
> https://stackoverflow.com/questions/57472293/ignite-failed-to-run-reduce-query-locally
>
> *Async continuous queries in .NET generate Java threads that never finish.*
> JIRA: https://issues.apache.org/jira/browse/IGNITE-9638
> StackOverflow:
> https://stackoverflow.com/questions/57513576/apache-ignite-spawns-too-much-threads
>
> *Hanging cache operations when async/await is used in .NET*
> JIRA issue: https://issues.apache.org/jira/browse/IGNITE-12033
> Userlist thread:
> http://apache-ignite-users.70518.x6.nabble.com/Replace-or-Put-after-PutAsync-causes-Ignite-to-hang-td27871.html
>
> *Metadata is stored in a work directory which is cleared on restarts.*
> This issue has a long history, and users keep suffering from it.
> StackOverflow:
> https://stackoverflow.com/questions/57529702/ignite-persisting-a-set-cannot-find-metadata-for-object-with-compact-footer
>
> *SQL create schema support*
> Userlist thread:
> http://apache-ignite-users.70518.x6.nabble.com/Will-ignite-support-the-CREATE-SCHEMA-statement-td29078.html
>
> *Documentation:*
> *Possible too long JVM pause.*
> The error message appears in logs pretty often when people have issues with
> garbage collection, and it doesn't read like proper English. Should we fix
> the message?
> In any case, we should mention this message on a page about GC tuning.
> StackOverflow question:
> https://stackoverflow.com/questions/57541462/possible-too-long-jvm-pause
>
> *Run a script in SQLLine.* It would be nice to mention this ability in
> the documentation.
> StackOverflow question:
> https://stackoverflow.com/questions/57483908/shell-script-to-connect-to-ignite
>
> *Backup filter vs Node filter. Rack awareness.*
> The user in the thread below confused the backup filter in
> RendezvousAffinityFunction with the node filter in CacheConfiguration.
> Also, the rack-awareness topic was brought up in that thread. I noticed that
> we don't have it documented. It would be a nice thing to mention in the docs.
> Userlist thread:
> http://apache-ignite-users.70518.x6.nabble.com/Cache-spreading-to-new-nodes-td29063.html
>
> Denis
>
>


Re: Ignite pod keeps crashing and failed to recover the node

2019-08-22 Thread Denis Magda
Hello,

As I can see, community members have stepped in and are ready to help with this
problem in this discussion:
http://apache-ignite-users.70518.x6.nabble.com/One-of-Ignite-pod-keeps-crashing-and-not-joining-the-cluster-td29091.html#a29105

Please also check out this response, which clarifies why the data-loading
method you use is not optimal:
https://stackoverflow.com/questions/56778778/apache-ignite-inserts-extremely-slow/56795152#56795152
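
For reference, the usual bulk-loading pattern in Ignite batches updates instead
of committing one put at a time. A minimal Java sketch follows (the cache name
and key range are illustrative; a similar batched approach applies from other
clients):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteDataStreamer;
    import org.apache.ignite.Ignition;

    public class BulkLoadExample {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                ignite.getOrCreateCache("inventoryCache");

                // The streamer batches and routes updates to data nodes instead of
                // doing one round trip per cache.put() call.
                try (IgniteDataStreamer<Integer, String> streamer =
                         ignite.dataStreamer("inventoryCache")) {
                    for (int i = 0; i < 1_000_000; i++)
                        streamer.addData(i, "row-" + i);
                } // close() flushes the remaining buffered entries.
            }
        }
    }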

-
Denis


On Tue, Aug 20, 2019 at 10:25 AM radha jai  wrote:

> Ignite has been deployed on Kubernetes; there are 3 replicas of the server
> pod. The pods were up and running fine for 9 days. We have created 180
> inventory tables and 204 transactional tables. The data has been inserted
> using the PyIgnite client via the cache.put() method. This is a very slow
> operation because PyIgnite is very slow. Each insert is committed one at a
> time, so it is not able to do bulk-style inserts. PyIgnite was inserting
> about 20 of the inventory tables simultaneously (20 different
> threads/processes).
>
> The cluster did not stay stable: after 9 days one of the pods crashed and
> failed to recover. Below is the error log:
> {"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"ERROR","system":"ignite-service","time":"2019-08-16T17:13:34,769Z","logger":"GridCachePartitionExchangeManager","timezone":"UTC","log":"Failed
> to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
> [reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null,
> startCaches=[BgwService]] java.lang.NullPointerException| at
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)|
> at
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)|
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)|
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)|
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)|
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)|
> at java.lang.Thread.run(Thread.java:748)"}
> {"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"WARN","system":"ignite-service","time":"2019-08-16T17:13:36,724Z","logger":"GridCacheDatabaseSharedManager","timezone":"UTC","log":"Ignite
> node stopped in the middle of checkpoint. Will restore memory state and
> finish checkpoint on node start."}
>
> The error report file and ignite-config.xml has been attached for your
> info.
>
> Heap Memory and RAM Configurations are as below on each of the ignite
> server container:
>
> Heap Memory: 32gb
>
> RAM: 64GB
>
> Default memory region:
>
> cpu: 4
>
> Persistence volume
>
> wal_storage_size: 10GB
>
> persistence_storage_size: 10GB
>
>
> Thanks
>
> With Regards
>
> Radha
>


Re: Asynchronous registration of binary metadata

2019-08-22 Thread Denis Mekhanikov
Alexey,

Making only one node write metadata to disk synchronously is a possible and
easy-to-implement solution, but it still has a few drawbacks:

• Discovery will still be blocked on one node. This is better than blocking all
nodes one by one, but a disk write may take an indefinite amount of time, so
discovery may still be affected.
• There is an unlikely but unpleasant case:
1. The coordinator writes metadata synchronously to disk and finalizes the
metadata registration. Other nodes do it asynchronously, so the actual fsync to
disk may be delayed.
2. A transaction is committed.
3. The cluster is shut down before all nodes finish their fsync of the metadata.
4. Nodes are started again one by one.
5. Before the previous coordinator is started again, a read operation tries to
read data that uses the metadata that wasn't fsynced anywhere except on the
coordinator, which is still down.
6. An error about unknown metadata is generated.

In the scheme that Sergey and I proposed, this situation isn't possible, since
the data won't be written to disk until the metadata fsync is finished. Every
mapped node will wait on a future until the metadata is written to disk before
performing any cache changes.
What do you think about such a fix?
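
To make the proposed flow concrete, here is a rough sketch in plain Java. It is
not actual Ignite code; the class and method names are made up purely for
illustration. Metadata writes are handed off to a dedicated writer thread, and
any operation that depends on a piece of metadata waits on its future before
touching cache data:

    import java.util.Map;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Hypothetical sketch of asynchronous metadata registration.
    class AsyncMetadataWriter {
        private final ExecutorService writer = Executors.newSingleThreadExecutor();
        private final Map<Integer, CompletableFuture<Void>> futs = new ConcurrentHashMap<>();

        /** Called from the discovery thread: schedules the write and returns immediately. */
        CompletableFuture<Void> register(int typeId, byte[] marshalledMeta) {
            return futs.computeIfAbsent(typeId, id ->
                CompletableFuture.runAsync(() -> writeAndFsync(id, marshalledMeta), writer));
        }

        /** Called from a striped-pool thread before applying a cache update that needs the type. */
        void awaitWritten(int typeId) {
            CompletableFuture<Void> fut = futs.get(typeId);

            if (fut != null)
                fut.join(); // Block until the metadata is durable on this node.
        }

        private void writeAndFsync(int typeId, byte[] data) {
            // Write the metadata file and fsync it here.
        }
    }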

Denis
On 22 Aug 2019, 12:44 +0300, Alexei Scherbakov , 
wrote:
> Denis Mekhanikov,
>
> I think at least one node (coordinator for example) still should write
> metadata synchronously to protect from a scenario:
>
> tx creating new metadata is commited <- all nodes in grid are failed
> (powered off) <- async writing to disk is completed
>
> where <- means "happens before"
>
> All other nodes could write asynchronously, by using separate thread or not
> doing fsync( same effect)
>
>
>
> Wed, 21 Aug 2019 at 19:48, Denis Mekhanikov :
>
> > Alexey,
> >
> > I’m not suggesting to duplicate anything.
> > My point is that the proper fix will be implemented in a relatively
> > distant future. Why not improve the existing mechanism now instead of
> > waiting for the proper fix?
> > If we don’t agree on doing this fix in master, I can do it in a fork and
> > use it in my setup. So please let me know if you see any other drawbacks in
> > the proposed solution.
> >
> > Denis
> >
> > > On 21 Aug 2019, at 15:53, Alexei Scherbakov <
> > alexey.scherbak...@gmail.com> wrote:
> > >
> > > Denis Mekhanikov,
> > >
> > > If we are still talking about "proper" solution the metastore (I've meant
> > > of course distributed one) is the way to go.
> > >
> > > It has a contract to store cluster wide metadata in most efficient way
> > and
> > > can have any optimization for concurrent writing inside.
> > >
> > > I'm against creating some duplicating mechanism as you suggested. We do
> > not
> > > need another copy/paste code.
> > >
> > > Another possibility is to carry metadata along with appropriate request
> > if
> > > it's not found locally but this is a rather big modification.
> > >
> > >
> > >
> > > Tue, 20 Aug 2019 at 17:26, Denis Mekhanikov :
> > >
> > > > Eduard,
> > > >
> > > > Usages will wait for the metadata to be registered and written to disk.
> > No
> > > > races should occur with such flow.
> > > > Or do you have some specific case on your mind?
> > > >
> > > > I agree, that using a distributed meta storage would be nice here.
> > > > But this way we will kind of move to the previous scheme with a
> > replicated
> > > > system cache, where metadata was stored before.
> > > > Will scheme with the metastorage be different in any way? Won’t we
> > decide
> > > > to move back to discovery messages again after a while?
> > > >
> > > > Denis
> > > >
> > > >
> > > > > On 20 Aug 2019, at 15:13, Eduard Shangareev <
> > eduard.shangar...@gmail.com>
> > > > wrote:
> > > > >
> > > > > Denis,
> > > > > How would we deal with races between registration and metadata usages
> > > > with
> > > > > such fast-fix?
> > > > >
> > > > > I believe, that we need to move it to distributed metastorage, and
> > await
> > > > > registration completeness if we can't find it (wait for work in
> > > > progress).
> > > > > Discovery shouldn't wait for anything here.
> > > > >
> > > > > On Tue, Aug 20, 2019 at 11:55 AM Denis Mekhanikov <
> > dmekhani...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Sergey,
> > > > > >
> > > > > > Currently metadata is written to disk sequentially on every node. 
> > > > > > Only
> > > > one
> > > > > > node at a time is able to write metadata to its storage.
> > > > > > Slowness accumulates when you add more nodes. A delay required to
> > write
> > > > > > one piece of metadata may be not that big, but if you multiply it by
> > say
> > > > > > 200, then it becomes noticeable.
> > > > > > But If we move the writing out from discovery threads, then nodes 
> > > > > > will
> > > > be
> > > > > > doing it in parallel.
> > > > > >
> > > > > > I think, it’s better to block some threads from a striped pool for a
> > > > > > little while rather than blocking 

Re: Release process wiki update

2019-08-22 Thread Nikolay Izhikov
Dmitriy, Thanks!

On Thu, 22/08/2019 at 15:09 +0300, Dmitriy Pavlov wrote:
> Hi Igniters,
> 
> A number of updates were applied to the process description. Feedback is as
> always welcomed.
> 
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process
> 
> I can't name it final cause I find some issues from time to time. But it is
> near to be a full description.
> 
> Sincerely,
> Dmitriy Pavlov
> 
> Tue, 18 Jun 2019 at 20:34, Denis Magda :
> 
> > Dmitriy,
> > 
> > Thanks for keeping us in the loop. Let us know once the final steps are
> > over. If any help is needed, call for it.
> > 
> > -
> > Denis
> > 
> > 
> > On Tue, Jun 18, 2019 at 7:07 AM Dmitriy Pavlov  wrote:
> > 
> > > Hi Ignite developers,
> > > 
> > > Thanks to Pavel T. for providing an initial description on how to upload
> > > NuGet packages.
> > > 
> > > I've added details about how to set it up:
> > > 
> > > 
> > 
> > https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-6.3.13.UploadNuGetpackagestonuget.org
> > > 
> > > Thanks also to Alexander Shapkin, who helped me with uploading packages.
> > > 
> > > Sincerely
> > > Dmitriy Pavlov
> > > 
> > > Tue, 11 Jun 2019 at 19:43, Dmitriy Pavlov :
> > > 
> > > > Hi Ignite developers,
> > > > 
> > > > I've done several more updates in the doc.
> > > > 
> > > > At the end of the page, the problems list was added:
> > > > 
> > 
> > https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-Notice
> > > > 
> > > > Should you have any additional information related to these issues,
> > > 
> > > please
> > > > let me know (both private and public). I will do my absolute best to
> > 
> > add
> > > > this information to the page. Alternatively, you can contribute updates
> > > > directly to the wiki (committer role or wiki edit permission required
> > 
> > ).
> > > > 
> > > > Sincerely,
> > > > Dmitriy Pavlov
> > > > 
> > > > Fri, 7 Jun 2019 at 20:15, Dmitriy Pavlov :
> > > > 
> > > > > Hi Igniters,
> > > > > 
> > > > > I've started the process of updating
> > > > > https://cwiki.apache.org/confluence/display/IGNITE/Release+Process
> > 
> > and
> > > > > merging other pages and sources into one single page.
> > > > > 
> > > > > The document is not final, I will add changes as soon as I do steps.
> > > > > 
> > > > > Feedback is, as always, welcomed!
> > > > > 
> > > > > Sincerely
> > > > > Dmitriy Pavlov
> > > > > 




Re: Release process wiki update

2019-08-22 Thread Dmitriy Pavlov
Hi Igniters,

A number of updates were applied to the process description. Feedback is, as
always, welcome.

https://cwiki.apache.org/confluence/display/IGNITE/Release+Process

I can't call it final because I still find issues from time to time, but it is
close to being a full description.

Sincerely,
Dmitriy Pavlov

Tue, 18 Jun 2019 at 20:34, Denis Magda :

> Dmitriy,
>
> Thanks for keeping us in the loop. Let us know once the final steps are
> over. If any help is needed, call for it.
>
> -
> Denis
>
>
> On Tue, Jun 18, 2019 at 7:07 AM Dmitriy Pavlov  wrote:
>
> > Hi Ignite developers,
> >
> > Thanks to Pavel T. for providing an initial description on how to upload
> > NuGet packages.
> >
> > I've added details about how to set it up:
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-6.3.13.UploadNuGetpackagestonuget.org
> >
> > Thanks also to Alexander Shapkin, who helped me with uploading packages.
> >
> > Sincerely
> > Dmitriy Pavlov
> >
> > Tue, 11 Jun 2019 at 19:43, Dmitriy Pavlov :
> >
> > > Hi Ignite developers,
> > >
> > > I've done several more updates in the doc.
> > >
> > > At the end of the page, the problems list was added:
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-Notice
> > >
> > > Should you have any additional information related to these issues,
> > please
> > > let me know (both private and public). I will do my absolute best to
> add
> > > this information to the page. Alternatively, you can contribute updates
> > > directly to the wiki (committer role or wiki edit permission required
> ).
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > Fri, 7 Jun 2019 at 20:15, Dmitriy Pavlov :
> > >
> > >> Hi Igniters,
> > >>
> > >> I've started the process of updating
> > >> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process
> and
> > >> merging other pages and sources into one single page.
> > >>
> > >> The document is not final, I will add changes as soon as I do steps.
> > >>
> > >> Feedback is, as always, welcomed!
> > >>
> > >> Sincerely
> > >> Dmitriy Pavlov
> > >>
> > >
> >
>


Re: Asynchronous registration of binary metadata

2019-08-22 Thread Alexei Scherbakov
Denis Mekhanikov,

I think at least one node (the coordinator, for example) should still write
metadata synchronously to protect against the following scenario:

the tx creating new metadata is committed <- all nodes in the grid fail
(powered off) <- the async write to disk completes

where <- means "happens before".

All other nodes could write asynchronously, by using a separate thread or by
not doing fsync (same effect).



Wed, 21 Aug 2019 at 19:48, Denis Mekhanikov :

> Alexey,
>
> I’m not suggesting to duplicate anything.
> My point is that the proper fix will be implemented in a relatively
> distant future. Why not improve the existing mechanism now instead of
> waiting for the proper fix?
> If we don’t agree on doing this fix in master, I can do it in a fork and
> use it in my setup. So please let me know if you see any other drawbacks in
> the proposed solution.
>
> Denis
>
> > On 21 Aug 2019, at 15:53, Alexei Scherbakov <
> alexey.scherbak...@gmail.com> wrote:
> >
> > Denis Mekhanikov,
> >
> > If we are still talking about "proper" solution the metastore (I've meant
> > of course distributed one) is the way to go.
> >
> > It has a contract to store cluster wide metadata in most efficient way
> and
> > can have any optimization for concurrent writing inside.
> >
> > I'm against creating some duplicating mechanism as you suggested. We do
> not
> > need another copy/paste code.
> >
> > Another possibility is to carry metadata along with appropriate request
> if
> > it's not found locally but this is a rather big modification.
> >
> >
> >
> > Tue, 20 Aug 2019 at 17:26, Denis Mekhanikov :
> >
> >> Eduard,
> >>
> >> Usages will wait for the metadata to be registered and written to disk.
> No
> >> races should occur with such flow.
> >> Or do you have some specific case on your mind?
> >>
> >> I agree, that using a distributed meta storage would be nice here.
> >> But this way we will kind of move to the previous scheme with a
> replicated
> >> system cache, where metadata was stored before.
> >> Will scheme with the metastorage be different in any way? Won’t we
> decide
> >> to move back to discovery messages again after a while?
> >>
> >> Denis
> >>
> >>
> >>> On 20 Aug 2019, at 15:13, Eduard Shangareev <
> eduard.shangar...@gmail.com>
> >> wrote:
> >>>
> >>> Denis,
> >>> How would we deal with races between registration and metadata usages
> >> with
> >>> such fast-fix?
> >>>
> >>> I believe, that we need to move it to distributed metastorage, and
> await
> >>> registration completeness if we can't find it (wait for work in
> >> progress).
> >>> Discovery shouldn't wait for anything here.
> >>>
> >>> On Tue, Aug 20, 2019 at 11:55 AM Denis Mekhanikov <
> dmekhani...@gmail.com
> >>>
> >>> wrote:
> >>>
>  Sergey,
> 
>  Currently metadata is written to disk sequentially on every node. Only
> >> one
>  node at a time is able to write metadata to its storage.
>  Slowness accumulates when you add more nodes. A delay required to
> write
>  one piece of metadata may be not that big, but if you multiply it by
> say
>  200, then it becomes noticeable.
>  But If we move the writing out from discovery threads, then nodes will
> >> be
>  doing it in parallel.
> 
>  I think, it’s better to block some threads from a striped pool for a
>  little while rather than blocking discovery for the same period, but
>  multiplied by a number of nodes.
> 
>  What do you think?
> 
>  Denis
> 
> > On 15 Aug 2019, at 10:26, Sergey Chugunov  >
>  wrote:
> >
> > Denis,
> >
> > Thanks for bringing this issue up, decision to write binary metadata
> >> from
> > discovery thread was really a tough decision to make.
> > I don't think that moving metadata to metastorage is a silver bullet
> >> here
> > as this approach also has its drawbacks and is not an easy change.
> >
> > In addition to workarounds suggested by Alexei we have two choices to
> > offload write operation from discovery thread:
> >
> > 1. Your scheme with a separate writer thread and futures completed
> >> when
> > write operation is finished.
> > 2. PME-like protocol with obvious complications like failover and
> > asynchronous wait for replies over communication layer.
> >
> > Your suggestion looks easier from code complexity perspective but in
> my
> > view it increases chances to get into starvation. Now if some node
> >> faces
> > really long delays during write op it is gonna be kicked out of
> >> topology
>  by
> > discovery protocol. In your case it is possible that more and more
>  threads
> > from other pools may stuck waiting on the operation future, it is
> also
>  not
> > good.
> >
> > What do you think?
> >
> > I also think that if we want to approach this issue systematically,
> we
>  need
> > to do a deep analysis of metastorage option as well and to finally
> >> choose
> > 

[jira] [Created] (IGNITE-12095) .NET: Remove empty tracing-related interfaces from public API

2019-08-22 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-12095:
---

 Summary: .NET: Remove empty tracing-related interfaces from public 
API
 Key: IGNITE-12095
 URL: https://issues.apache.org/jira/browse/IGNITE-12095
 Project: Ignite
  Issue Type: Bug
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn


Remove IMetricExporterSpi and ITracingSpi. They bring zero value and can 
confuse users.

They were added because of failing API Parity tests. We should disable those 
tests to avoid confusing other team members, and run them manually when needed 
to check for new APIs.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IGNITE-12094) Scan Query scheduled to be executed on wrong node

2019-08-22 Thread Dmitry Sterinzat (Jira)
Dmitry Sterinzat created IGNITE-12094:
-

 Summary: Scan Query scheduled to be executed on wrong node
 Key: IGNITE-12094
 URL: https://issues.apache.org/jira/browse/IGNITE-12094
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 2.7.5
Reporter: Dmitry Sterinzat


When a Scan query on a replicated cache is executed from a client node and the
topology is unstable (for example, 2 new server nodes come up at the same time),
it is possible to get an empty result:

 

 
{code:java}
// IgniteCacheProxyImpl#projection

// here we return a cluster group with a random cluster node
return ctx.kernalContext().grid().cluster().forDataNodes(ctx.name()).forRandom();
{code}
 
{code:java}
// GridCacheQueryAdapter#executeScanQuery

// Affinity nodes snapshot.
Collection<ClusterNode> nodes = new ArrayList<>(nodes());
...
if (nodes.isEmpty()) {
    if (part != null && forceLocal)
        throw new IgniteCheckedException("No queryable nodes for partition " + part
            + " [forced local query=" + this + "]");

    return new GridEmptyCloseableIterator();
}

// GridCacheQueryAdapter#nodes(final GridCacheContext cctx,
//     @Nullable final ClusterGroup prj, @Nullable final Integer part)

final AffinityTopologyVersion topVer = cctx.affinity().affinityTopologyVersion();

// This collection doesn't contain the randomly selected node because the
// AffinityTopologyVersion is the previous one.
Collection<ClusterNode> affNodes = CU.affinityNodes(cctx, topVer);
...
return F.view(affNodes, new P1<ClusterNode>() {
    @Override public boolean apply(ClusterNode n) {
        return cctx.discovery().cacheAffinityNode(n, cctx.name()) &&
            (prj == null || prj.node(n.id()) != null) &&
            (part == null || owners.contains(n));
    }
});
}


{code}
In our case the nodes collection is empty, because the randomly selected node is
not part of the current topology version (the cache hasn't started on it yet),
so we get a GridEmptyCloseableIterator (the query isn't actually executed).
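
A minimal sketch of the affected call pattern (the cache name and the exact
setup are illustrative): a client node runs a ScanQuery against a replicated
cache while new server nodes are joining.

{code:java}
// Hypothetical reproducer sketch: run from a client node while server nodes join.
Ignition.setClientMode(true);

try (Ignite client = Ignition.start()) {
    IgniteCache<Integer, String> cache = client.cache("replicatedCache");

    // With an unstable topology this may return an empty list even though data exists,
    // because the randomly chosen data node is not part of the affinity snapshot yet.
    List<Cache.Entry<Integer, String>> rows =
        cache.query(new ScanQuery<Integer, String>()).getAll();

    System.out.println("Rows: " + rows.size());
}
{code}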

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)