Re: Use AutoService library to generate Java SPI files in Ignite 3

2022-12-07 Thread Ivan Bessonov
Hello,

I'm all in for this change, thank you for the PR!

+1

On Wed, Dec 7, 2022 at 12:15, Вячеслав Коптилин wrote:

> Hi Aleksandr,
>
> This suggestion seems useful to me.
> As Aleksandr pointed out, this is a compile-time dependency, so it doesn't
> look risky.
>
> +1
>
> Thanks,
> Slava.
>
>
> On Wed, Dec 7, 2022 at 10:20, Aleksandr Polovtsev wrote:
>
> > Dear Igniters! In Ignite 3, we have a bunch of classes that utilize the
> > Java SPI (ConfigurationModule and MetricsExporter to name a few). For every
> > interface implementation we need to manually create a file in the META-INF
> > folder. This step can be automated by the AutoService library [1].
> >
> > I can see the following pros and cons of using this approach:
> > 1. Pros:
> >   * Less manual boilerplate,
> >   * This is a compile-time only dependency (an annotation and an annotation
> >     processor),
> >   * Fewer files to maintain and update/move when corresponding interfaces
> >     change.
> > 2. Cons:
> >   * A new dependency will be introduced and it looks like the community
> >     doesn't like that.
> >
> > I've created a PR [2] with a demonstration of how this library can be used
> > in the existing code base.
> >
> > [1] https://github.com/google/auto/tree/main/service
> > [2] https://github.com/apache/ignite-3/pull/1415
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
> >
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Apache Ignite 2.14 RELEASE [Time, Scope, Manager]

2022-10-04 Thread Ivan Bessonov
Hi Taras,

If there are enough improvements for 2.14.1, I don't mind it. Otherwise,
there's no point in making a release for this specific bug.

On Fri, Sep 30, 2022 at 14:42, Taras Ledkov wrote:

> Hi, Ivan.
>
> My apologies.
> A blocking performance issue has been fixed [1]. And the release candidate
> is verified and ready to be voted on. What do you think about the 2.14.1
> release, or can the fix wait until 2.15?
> I'm going to start VOTE just now.
>
> [1]. https://issues.apache.org/jira/browse/IGNITE-17764
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Apache Ignite 2.14 RELEASE [Time, Scope, Manager]

2022-09-30 Thread Ivan Bessonov
Hello Igniters,

I know I'm late, but is there a chance to add one more fix to the scope?
It is very local and fixes an issue, recently brought to us on the user list,
where PDS defragmentation simply doesn't work on Windows because we don't
close file descriptors properly.

This code will not affect anything other than the defragmentation process
itself.

Please reply. Thank you in advance!

[1] https://issues.apache.org/jira/browse/IGNITE-17761
[2] https://github.com/apache/ignite/pull/10267

On Fri, Sep 23, 2022 at 11:43, Taras Ledkov wrote:

> Dear Ignite Community!
>
> Release 2.14 has been delayed due to performance issues (compared with
> 2.13) for IgniteSqlQueryBenchmark in in-memory mode. We're looking into it.
>
> Any suggestions and support are welcome.
>


-- 
Sincerely yours,
Ivan Bessonov


Re: [DISCUSSION] BPlusTree design: links and papers

2022-07-19 Thread Ivan Bessonov
Hello, Nikolay,

as far as I know, the following paper is usually referred to as part of the
original inspiration:
https://www.csd.uoc.gr/~hy460/pdf/p650-lehman.pdf

I also heard rumors that the code was partially inspired by the PostgreSQL
implementation.

I don't know any other sources except for the code itself. Let's wait for
Sergi's reply in case
you contact him.

On Tue, Jul 19, 2022 at 17:20, Denis Magda wrote:

> Hi Nikolay,
>
> Send a note to Sergi who implemented the tree for Ignite:
> https://www.linkedin.com/in/sergi-vladykin-a4781536/
>
> --
> Denis
>
> On Tue, Jul 19, 2022 at 9:42 AM Николай Ижиков 
> wrote:
>
> > Hello, Igniters and especially Ignite veterans.
> >
> >
> > While studying the Ignite B+ tree, I'm wondering: what was the initial
> > design of this data structure?
> > Is it based on some paper or inspired by some existing implementation?
> > What should I study besides the source code to better understand the Ignite
> > B+ tree and how it can be improved?
> >
> >
>


-- 
Sincerely yours,
Ivan Bessonov


Re: [DISCUSSION] Error handling in Ignite 3

2022-03-23 Thread Ivan Bessonov
> > > > > > > >
> > > > > > > > > Exception hierarchy is not required when using error codes and is
> > > > > > > > > applicable only to the Java API, so I would avoid spending effort on it.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >   - Why no nested exceptions? Sometimes an error handler is interested
> > > > > > > > > >     only in high-level exceptions (like Invalid Configuration), and
> > > > > > > > > >     sometimes details are needed (like specific configuration parser
> > > > > > > > > >     exceptions).
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Nested exceptions are not forbidden. They can provide additional details
> > > > > > > > > on the error for debugging purposes, but are not strictly required,
> > > > > > > > > because the error code + message should provide enough information to the
> > > > > > > > > user.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >   - For async methods returning a Future we may have a universal rule on
> > > > > > > > > >     how to handle exceptions. For example, we may specify that any async
> > > > > > > > > >     method can throw only invalid argument exceptions. All other errors are
> > > > > > > > > >     reported via the exceptionally(IgniteException -> {}) callback even if
> > > > > > > > > >     the async method was executed synchronously.
> > > > > > > > >
> > > > > > > >
> > > > > > > > This is ok to me.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Tue, Apr 13, 2021 at 12:08, Alexei Scherbakov <
> > > > > > > > > alexey.scherbak...@gmail.com
> > > > > > > > > >:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > > I would like to start the discussion about error handling in Ignite 3 and
> > > > > > > > > > > how we can improve it compared to Ignite 2.
> > > > > > > > > > >
> > > > > > > > > > > The error handling in Ignite 2 was not very good because of the generic
> > > > > > > > > > > CacheException thrown on almost any occasion, having a deeply nested root
> > > > > > > > > > > cause and often containing no useful information on further steps to fix
> > > > > > > > > > > the issue.
> > > > > > > > > > >
> > > > > > > > > > > I aim to fix it by introducing some rules on error handling.
> > > > > > > > > >
> > > > > > > > > > *Public exception structure.*
> > > > > > > > > >
> > > > > > > > > > > A public exception must have an error code, a cause, and an action.
> > > > > > > > > > >
> > > > > > > > > > > * The code - the combination of a 2-byte scope id and a 2-byte error number
> > > > > > > > > > >   within the module. This allows up to 2^16 errors for each scope, which
> > > > > > > > > > >   should be enough. The error code string representation can look like
> > > > > > > > > > >   RFT-0001 or TBL-0001.
> > > > > > > > > > > * The cause - a short string description of an issue, readable by the user.
> > > > > > > > > > >   This can have dynamic parameters depending on the error type for a better
> > > > > > > > > > >   user experience, like "Can't write a snapshot, no space left on device {0}".
> > > > > > > > > > > * The action - steps for a user to resolve the error situation, described in
> > > > > > > > > > >   the documentation in the corresponding error section, for example "Clean
> > > > > > > > > > >   up disk space and retry the operation".
> > > > > > > > > >
> > > > > > > > > > > Common errors should have their own scope, for example IGN-0001
> > > > > > > > > > >
> > > > > > > > > > > All public methods throw only unchecked
> > > > > > > > > > > org.apache.ignite.lang.IgniteException containing the aforementioned fields.
> > > > > > > > > > > Each public method must have a section in the javadoc with a list of all
> > > > > > > > > > > possible error codes for this method.
> > > > > > > > > > >
> > > > > > > > > > > A good example with a similar structure can be found here [1]
> > > > > > > > > >
> > > > > > > > > > *Async timeouts.*
> > > > > > > > > >
> > > > > > > > > > > Because almost all API methods in Ignite 3 are async, they all will have a
> > > > > > > > > > > configurable default timeout and can complete with a timeout error if a
> > > > > > > > > > > computation is not finished in time, for example if a response has not
> > > > > > > > > > > been received yet.
> > > > > > > > > > > I suggest completing the async op future with TimeoutException in this
> > > > > > > > > > > case to make it on par with synchronous execution using future.get, which
> > > > > > > > > > > will throw java.util.concurrent.TimeoutException on timeout.
> > > > > > > > > > > For reference, see java.util.concurrent.CompletableFuture#orTimeout
> > > > > > > > > > > No special error code should be used for this scenario.
> > > > > > > > > >
> > > > > > > > > > *Internal exceptions hierarchy.*
> > > > > > > > > >
> > > > > > > > > > > All internal exceptions should extend
> > > > > > > > > > > org.apache.ignite.internal.lang.IgniteInternalException for unchecked
> > > > > > > > > > > exceptions and
> > > > > > > > > > > org.apache.ignite.internal.lang.IgniteInternalCheckedException for checked
> > > > > > > > > > > exceptions.
> > > > > > > > > >
> > > > > > > > > > Thoughts ?
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > >
> > https://docs.oracle.com/cd/B10501_01/server.920/a96525/preface.htm
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > > Alexei Scherbakov
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best regards,
> > > > > > > > > Alexey
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Alexei Scherbakov
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Vladislav Pyatkov
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best regards,
> > > > > Alexei Scherbakov
> > > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Andrey V. Mashenkov
> > > >
> > >
> > >
> > > --
> > >
> > > Best regards,
> > > Alexei Scherbakov
> > >
> >
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>
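
The error code layout proposed in the quoted message (a 2-byte scope id plus a 2-byte error number, rendered like TBL-0001) can be sketched roughly as follows. This is only an illustration, not Ignite code: the class ErrorCodeSketch, its method names, and the numeric scope id 2 used for TBL are all assumptions made for the example.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the proposed layout; not part of the Ignite API. */
final class ErrorCodeSketch {
    private ErrorCodeSketch() {
    }

    /** Packs a 2-byte scope id (high half) and a 2-byte error number (low half) into one int. */
    static int pack(int scopeId, int errorNumber) {
        if (scopeId < 0 || scopeId > 0xFFFF || errorNumber < 0 || errorNumber > 0xFFFF) {
            throw new IllegalArgumentException("Both parts must fit into 2 bytes");
        }
        return (scopeId << 16) | errorNumber;
    }

    /** Extracts the scope id from the high 16 bits. */
    static int scopeId(int code) {
        return code >>> 16;
    }

    /** Extracts the error number from the low 16 bits. */
    static int errorNumber(int code) {
        return code & 0xFFFF;
    }

    /** Renders the code as SCOPE-NNNN; the scope name is passed in, since no registry is assumed. */
    static String format(String scopeName, int code) {
        return String.format(Locale.ROOT, "%s-%04d", scopeName, errorNumber(code));
    }

    public static void main(String[] args) {
        int code = pack(2, 1); // scope id 2 standing in for "TBL" is an assumption
        System.out.println(format("TBL", code));
    }
}
```

With 16 bits for the error number, each scope has room for 2^16 distinct errors, which matches the estimate in the proposal.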


-- 
Sincerely yours,
Ivan Bessonov
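
The async timeout rule from the quoted proposal refers to java.util.concurrent.CompletableFuture#orTimeout. A minimal sketch of that behavior, assuming Java 9 or later, is below. The class AsyncTimeoutSketch and its methods are hypothetical names used only for illustration; they are not Ignite API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration of completing an async operation with TimeoutException. */
final class AsyncTimeoutSketch {
    private AsyncTimeoutSketch() {
    }

    /** Simulates an async operation whose response never arrives, bounded by a timeout. */
    static CompletableFuture<String> neverAnswered(long timeoutMillis) {
        return new CompletableFuture<String>().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns the cause the future failed with, or null if it completed normally. */
    static Throwable failureOf(CompletableFuture<?> fut) {
        try {
            fut.join();
            return null;
        } catch (CompletionException e) {
            // join() wraps the original failure, so the TimeoutException is the cause.
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        Throwable failure = failureOf(neverAnswered(50));
        System.out.println(failure instanceof TimeoutException);
    }
}
```

A caller that blocks with a timed future.get sees the same exception type, which is the parity the proposal asks for.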


Re: Apache Ignite 2.12 RELEASE [Time, Scope, Manager]

2021-11-25 Thread Ivan Bessonov
> > >>>>>>>>>>>>>>>>>>> I have prepared a fix for sqlline.sh -e:
> > >>>>>>>>>>>>>>>>>>> https://github.com/apache/ignite/pull/9536
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If somebody reviews it, we can put it into 2.12.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>> Ilya Kasnacheev
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thu, Oct 28, 2021 at 10:39, Nikita Amelchev <
> > >>>>>>>>>>>>>> namelc...@apache.org>:
> > >>>>>>>>>>>>>>>>>>>> Maksim, ok.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Ivan,
> > >>>>>>>>>>>>>>>>>>>> The minor issues can be considered if they will be
> > >>>>>>>>>> merged
> > >>>>>>>>>>>>>> before
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>> code freeze.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Wed, Oct 27, 2021 at 14:55, Pavel Tupitsyn <
> > >>>>>>>>>>>>>> ptupit...@apache.org
> > >>>>>>>>>>>>>>>>> :
> > >>>>>>>>>>>>>>>>>>>>> Ivan, every additional ticket potentially moves the
> > >>>>>>>>>> release
> > >>>>>>>>>>>>>> date
> > >>>>>>>>>>>>>>>>>>> further.
> > >>>>>>>>>>>>>>>>>>>>> Also, AFAIK, Igor is on vacation and can't do a
> > >>>>>>>>>> review this
> > >>>>>>>>>>>>>> week.
> > >>>>>>>>>>>>>>>>>>>>> On Wed, Oct 27, 2021 at 2:32 PM Ivan Daschinsky <
> > >>>>>>>>>>>>>>>> ivanda...@gmail.com>
> > >>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>> Pavel, but what is the problem to include a minor
> > >>>>>>>>>>>>>> refactoring?
> > >>>>>>>>>>>>>>>>>>>> Especially
> > >>>>>>>>>>>>>>>>>>>>>> before code freeze?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Wed, Oct 27, 2021 at 14:25, Pavel Tupitsyn <
> > >>>>>>>>>>>>>>>> ptupit...@apache.org>:
> > >>>>>>>>>>>>>>>>>>>>>>> Ivan, IGNITE-15806 seems to be a minor
> > >>>>>>>>>> refactoring, not
> > >>>>>>>>>>>>>> sure
> > >>>>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>>>>>>>> inflate the scope with this.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 27, 2021 at 2:13 PM Ivan Daschinsky
> > >>>>>>>>>> <
> > >>>>>>>>>>>>>>>>>>> ivanda...@gmail.com
> > >>>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Nikita, can we add
> > >>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-15806
> > >>>>>>>>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>>>>>>>> ticket for release?
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Wed, Oct 27, 2021 at 13:40, Maksim Timonin <
> > >>>>>>>>>>>>>>>>>>>> timonin.ma...@gmail.com>:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> I'd like to include the ticket in 2.12.
> > >>>>>>>>>> There was a
> > >>>>>>>>>>>>>>>> performance
> > >>>>>>>>>>>>>>>>>>>>>> issue,
> > >>>>>>>>>>>>>>>>>>>>>>> I
> > >>>>>>>>>>>>>>>>>>>>>>>>> spent some time trying to figure it out. I
> > >>>>>>>>>> prepared
> > >>>>>>>>>>>>>> some
> > >>>>>>>>>>>>>>>> fixes
> > >>>>>>>>>>>>>>>>>>>> and I
> > >>>>>>>>>>>>>>>>>>>>>> am
> > >>>>>>>>>>>>>>>>>>>>>>>>> going to submit a patch this week.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> I think this issue is critical to include
> > >>>>>>>>>> to 2.12
> > >>>>>>>>>>>>>>>> together with
> > >>>>>>>>>>>>>>>>>>>> basic
> > >>>>>>>>>>>>>>>>>>>>>>>>> IndexQuery functionality, because this
> > >>>>>>>>>> ticket
> > >>>>>>>>>>>>>> provides
> > >>>>>>>>>>>>>>>> expected
> > >>>>>>>>>>>>>>>>>>>>>>> behavior
> > >>>>>>>>>>>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>>>>>>> querying indexes - sorted over all nodes
> > >>>>>>>>>> results.
> > >>>>>>>>>>>>>>>>>>>>>>>>> - IGNITE-15530: IndexQuery has to use
> > >>>>>>>>>> MergeSort
> > >>>>>>>>>>>>>> reducer
> > >>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-15530
> > >>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 27, 2021 at 1:23 PM Nikita
> > >>>>>>>>>> Amelchev <
> > >>>>>>>>>>>>>>>>>>>>>> namelc...@apache.org>
> > >>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hello.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Code freeze is planned for Friday,
> > >>>>>>>>>> October 29.
> > >>>>>>>>>>>>>>>>>>>>>>>>>> There are 40 unresolved issues [1] in the
> > >>>>>>>>>> release
> > >>>>>>>>>>>>>>>> scope. I
> > >>>>>>>>>>>>>>>>>>>> suggest
> > >>>>>>>>>>>>>>>>>>>>>>>>>> bumping up the fix version if they will
> > >>>>>>>>>> not be
> > >>>>>>>>>>>>>>>> resolved till
> > >>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>> code
> > >>>>>>>>>>>>>>>>>>>>>>>>>> freeze.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Please, feel free to write if some of
> > >>>>>>>>>> them are
> > >>>>>>>>>>>>>>>> critical to be
> > >>>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> 2.12.
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Also, please, pay attention to unresolved
> > >>>>>>>>>>>>>> documentation
> > >>>>>>>>>>>>>>>>>>>> issues. [2]
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Sergey, Petr, thanks for helping with TC.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>>
> https://issues.apache.org/jira/issues/?jql=(project%20%3D%20%27Ignite%27%20AND%20fixVersion%20is%20not%20empty%20AND%20fixVersion%20in%20(%272.12%27))%20AND%20(component%20is%20EMPTY%20OR%20component%20not%20in%20(documentation))%20%20and%20status%20not%20in%20(%27CLOSED%27%2C%20%27RESOLVED%27)%20ORDER%20BY%20priority%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20
> > >>>>>>>>>>>>>>>>>>>>>>>>>> [2]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20%27Ignite%27%20AND%20fixVersion%20is%20not%20empty%20AND%20fixVersion%20in%20(%272.12%27)%20AND%20status%20NOT%20IN%20(Resolved%2C%20Closed)%20AND%20component%20in%20(documentation)%20ORDER%20BY%20priority%20%20%20%20%20%20%20%20%20
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Tue, Oct 26, 2021 at 12:50, Petr Ivanov <
> > >>>>>>>>>>>>>>>>>>> mr.wei...@gmail.com
> > >>>>>>>>>>>>>>>>>>>>> :
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, Sergey.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> I've also reconfigured nightly builds
> > >>>>>>>>>> for
> > >>>>>>>>>>>>>> ignite-2.12
> > >>>>>>>>>>>>>>>>>>> branch.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 22 Oct 2021, at 17:32, Сергей
> > >>>>>>>>>> Утцель <
> > >>>>>>>>>>>>>>>>>>> utt...@gmail.com>
> > >>>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> I configured TC bot to 'ignite-2.12'
> > >>>>>>>>>> instead
> > >>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>> 'ignite-2.11'
> > >>>>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Best wishes,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Amelchev Nikita
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>> Sincerely yours, Ivan Daschinskiy
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>> Sincerely yours, Ivan Daschinskiy
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>> Best wishes,
> > >>>>>>>>>>>>>>>>>>>> Amelchev Nikita
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>> Best wishes,
> > >>>>>>>>>>>>>>>> Amelchev Nikita
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>> Sincerely yours, Ivan Daschinskiy
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> Best wishes,
> > >>>>>>>>>>>>>> Amelchev Nikita
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>> Sincerely yours, Ivan Daschinskiy
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> --
> > >>>>>>>>>>> Best wishes,
> > >>>>>>>>>>> Amelchev Nikita
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Best wishes,
> > >>>>>> Amelchev Nikita
> > >>>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Sincerely yours, Ivan Daschinskiy
> > >
> > >
> > >
> > > --
> > > Best wishes,
> > > Amelchev Nikita
> >
>
>
> --
> Best wishes,
> Amelchev Nikita
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Ban Java Streams usage in Ignite 3 code

2021-09-08 Thread Ivan Bessonov
Hello Igniters,

I object, banning streams is overkill. I would argue that most of the code
is not on hot paths and that allocations in TLAB don't create much pressure
on GC.

Streams must be used cautiously, and developers should know whether they
write hot methods or not. And if methods are not hot, code simplicity must be
the first priority. I don't want Ignite 3 code to look like Ignite 2 code,
where people would iterate over Lists using explicit access by indexes,
because it saves them a single Iterator allocation. That's absurd.

On Wed, Sep 8, 2021 at 11:43, Pavel Tupitsyn wrote:

> Igniters,
>
> Java streams are known to be slower and cause more GC pressure than an
> equivalent loop.
> Below is a simple filter/map/reduce scenario (code [1]):
>
>  * Benchmark                                        Mode  Cnt     Score     Error   Units
>
>  * StreamVsLoopBenchmark.loopSum                   thrpt    3  7987.016 ± 293.013  ops/ms
>  * StreamVsLoopBenchmark.loopSum:·gc.alloc.rate    thrpt    3    ≈ 10⁻⁴            MB/sec
>  * StreamVsLoopBenchmark.loopSum:·gc.count         thrpt    3       ≈ 0            counts
>
>  * StreamVsLoopBenchmark.streamSum                 thrpt    3  1060.244 ±  36.485  ops/ms
>  * StreamVsLoopBenchmark.streamSum:·gc.alloc.rate  thrpt    3   315.819 ±  10.844  MB/sec
>  * StreamVsLoopBenchmark.streamSum:·gc.count       thrpt    3    55.000            counts
>
> Loop is several times faster and does not allocate at all.
>
> 1. Performance is one of the most important features of our product.
> 2. Most of our APIs will be on the hot path.
>
> One can argue about performance differences in real-world scenarios,
> but increasing GC pressure just to make the code a little bit nicer is
> unacceptable.
>
> I propose to ban streams usage in the codebase (except for the tests).
>
> Thoughts, objections?
>
> [1] https://gist.github.com/ptupitsyn/5934bbbf8f92ac4937e534af9386da97
>


-- 
Sincerely yours,
Ivan Bessonov
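
For readers without access to the linked gist, the two styles under discussion look roughly like the sketch below. It is an assumed filter/map/reduce example (summing the squares of even numbers) and not the actual benchmark code; the class and method names are hypothetical, and the sketch makes no performance claim of its own.

```java
import java.util.List;

/** Hypothetical side-by-side of a stream pipeline and the equivalent loop. */
final class StreamVsLoopSketch {
    private StreamVsLoopSketch() {
    }

    /** Stream version: filter even values, square them, reduce with sum. */
    static long streamSum(List<Integer> values) {
        return values.stream()
            .mapToLong(Integer::longValue)
            .filter(v -> v % 2 == 0)
            .map(v -> v * v)
            .sum();
    }

    /** Loop version of the same computation. */
    static long loopSum(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            if (v % 2 == 0) {
                sum += (long) v * v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4);
        // Both approaches compute the same result; they differ in allocation and readability.
        System.out.println(streamSum(values) == loopSum(values));
    }
}
```

The disagreement in the thread is not about correctness, since both forms return the same value, but about whether the stream form is acceptable outside hot paths.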


Re: Problem with dropping the index

2021-07-08 Thread Ivan Bessonov
> >>  >> Step 2: Removing the index from the cache configuration and persisting
> >>  >> it.
> >>  >>
> >>  >> Problems:
> >>  >>
> >>  >> 1) We add and immediately delete the index, a checkpoint does not
> >>  >> happen, and the node crashes. After the restart,
> >>  >> DurableBackgroundCleanupIndexTreeTask will not be able to complete and
> >>  >> will periodically restart, because it saves
> >>  >> DurableBackgroundCleanupIndexTreeTask#rootPages (root pages of index
> >>  >> trees) that have never appeared;
> >>  >>
> >>  >> 2) The node crashes after a DurableBackgroundCleanupIndexTreeTask is
> >>  >> added. After the restart, the task will clean the index trees, and
> >>  >> there will be errors when using the index;
> >>  >>
> >>  >> 3) etc.
> >>  >>
> >>  >> Suggested solution:
> >>  >>
> >>  >> Rename the root index trees, write a logical record about this to
> >>  >> the WAL, and do this at the first start of
> >>  >> DurableBackgroundCleanupIndexTreeTask.
> >>  >> Thus, if we find the renamed root pages in task 1, we can clear the
> >>  >> index trees to the end; otherwise the task can be completed.
> >>  >> Also, if we find that the renamed pages are present and step 2 has not
> >>  >> been completed, then we can start rebuilding the indexes.
> >>  >>
> >>  >> WDYT?
> >>  >
> >>  > --
> >>  >
> >>  > Best regards,
> >>  > Alexei Scherbakov
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Defrag?

2021-06-25 Thread Ivan Bessonov
Hello Ryan,

defragmentation is already implemented, your link was correct. The only
documentation that I know of is located here: [1]

It's an offline process that you trigger on an online node. There's also a
"--defragmentation status" command to check the process for completion.
You should manually restart the node when defragmentation is completed.

That's a very quick introduction from me, please ask any specific questions.
I think we'll find a way to run it on your data without issues.

[1]
https://www.gridgain.com/docs/latest/administrators-guide/defragmentation

On Fri, Jun 25, 2021 at 04:11, Denis Magda wrote:

> Ignite fellows,
>
> I remember some of us worked on the persistence defragmentation features.
> Has it been merged?
>
> @Valentin Kulichenko  probably you know the
> latest state.
>
> -
> Denis
>
> On Thu, Jun 24, 2021 at 11:59 AM Ilya Kasnacheev <
> ilya.kasnach...@gmail.com>
> wrote:
>
> > Hello!
> >
> > You can probably drop the entire cache and then re-populate it via
> > loadCache(), etc.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > On Wed, Jun 23, 2021 at 21:47, Ryan Trollip wrote:
> >
> >> Thanks, Ilya, we may have to consider moving back to non-native storage
> >> and caching more selectively as the performance degrades when there is a
> >> lot of write/delete activity or tables with large amounts of rows. This is
> >> with SQL with indexes and the use of query plans etc.
> >>
> >> Is there any easy way to rebuild the entire native database after hours?
> >> e.g. with a batch run on the weekends?
> >>
> >> On Wed, Jun 23, 2021 at 7:39 AM Ilya Kasnacheev <
> >> ilya.kasnach...@gmail.com> wrote:
> >>
> >>> Hello!
> >>>
> >>> I don't think there's anything ready to use, but "killing performance"
> >>> from fragmentation is also not something reported too often.
> >>>
> >>> Regards,
> >>> --
> >>> Ilya Kasnacheev
> >>>
> >>>
> >>> On Wed, Jun 16, 2021 at 04:39, Ryan Trollip wrote:
> >>>
> >>>> We see continual very large growth to data with ignite native. We have
> >>>> a very chatty use case that's creating and deleting stuff often. The data
> >>>> on disk just keeps growing at an explosive rate. So much so we ported this
> >>>> to a DB to see the difference and the DB is much smaller. I was searching
> >>>> to see if someone has the same issue. This is also killing performance.
> >>>>
> >>>> Found this:
> >>>>
> >>>>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
> >>>>
> >>>> Apparently, there is no auto-rebalancing of pages? or cleanup of pages?
> >>>>
> >>>> Has anyone implemented a workaround to rebuild the cache and indexes
> >>>> say on a weekly basis to get it to behave reasonably?
> >>>>
> >>>> Thanks
> >>>>
> >>>
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-14447) Invalid meta page can be used after index re-creation

2021-03-31 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14447:
--

 Summary: Invalid meta page can be used after index re-creation
 Key: IGNITE-14447
 URL: https://issues.apache.org/jira/browse/IGNITE-14447
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.11


Consider the following scenario:
 * A user creates index "A"

 * Ignite allocates page 0x1234 as the index meta page and writes it to the 
index roots tree

 * Index is populated, query entity is written on disk

 * Checkpoint is triggered and the index pages (including root) are written to 
disk

 * User drops the index

 * The tree is deallocated, the meta page is removed from the roots tree, query 
entity without the index is written to disk. No logical record is written for 
the roots tree.

 * Node crashes without checkpoint being marked

 * Node restarts. Since the query entity does not contain the index "A", the 
index tree is not created

 * User deletes some entries, then attempts to create the index "A" again

 * Since the node did not trigger checkpoint before the crash and no logical 
record was written, the root tree contains obsolete tree with links pointing to 
non-existing data (namely, index "A" still refers to page 0x1234)

 * Depending on allocation pattern and enabled assertions flag, the node will 
either fail with an assertion, or will crash the JVM

Fundamentally, the issue is caused by inconsistency between index roots tree 
and query entity. Ideally, we should move cache configuration to page memory 
subsystem, but this may be a big change.

We should check whether writing a logical record on index drop that will run 
the index cleanup on recovery mitigates the issue (in other words, the index 
cleanup persistent task should be triggered even if no checkpoint was marked 
after query entity is persisted).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-14442) IgniteRunner fails with NPE after REST module was broken by incompatible changes.

2021-03-30 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14442:
--

 Summary: IgniteRunner fails with NPE after REST module was broken 
by incompatible changes.
 Key: IGNITE-14442
 URL: https://issues.apache.org/jira/browse/IGNITE-14442
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-14372) Fix REST json configuration update requests

2021-03-22 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14372:
--

 Summary: Fix REST json configuration update requests
 Key: IGNITE-14372
 URL: https://issues.apache.org/jira/browse/IGNITE-14372
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov








[jira] [Created] (IGNITE-14371) Fix REST json representation for configuration

2021-03-22 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14371:
--

 Summary: Fix REST json representation for configuration
 Key: IGNITE-14371
 URL: https://issues.apache.org/jira/browse/IGNITE-14371
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


REST code is completely broken, it's time to fix it, partially at least.





[jira] [Created] (IGNITE-14302) Generated configuration classes break PMD suite in REST module

2021-03-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14302:
--

 Summary: Generated configuration classes break PMD suite in REST 
module
 Key: IGNITE-14302
 URL: https://issues.apache.org/jira/browse/IGNITE-14302
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


https://ci.ignite.apache.org/buildConfiguration/ignite3_Tests_SanityChecks_Pmd?branch=pull%2F65=overview=builds#all-projects





[jira] [Created] (IGNITE-14279) Introduce "sendWithResponse" into network API

2021-03-04 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14279:
--

 Summary: Introduce "sendWithResponse" into network API
 Key: IGNITE-14279
 URL: https://issues.apache.org/jira/browse/IGNITE-14279
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 3.0.0-alpha2


{noformat}
/**
 * Asynchronously sends a message with the same guarantees as {@link #send(NetworkMember, Object)} and
 * returns a response (RPC style).
 *
 * @param member Network member which should receive the message.
 * @param msg A message.
 * @param timeout Timeout for waiting for the response, in milliseconds.
 * @param <T> Expected response type.
 * @return A future holding the response, or an error if the expected response was not received.
 */
<T> CompletableFuture<T> sendWithResponse(NetworkMember member, Object msg, long timeout);
{noformat}
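
How a caller might use such an RPC-style method can be sketched with an in-process stand-in. Everything below is hypothetical and only illustrates the request/response contract described in the Javadoc: MemberSketch replaces the real NetworkMember type, the handler function replaces actual network transport, and the use of CompletableFuture#orTimeout for the timeout is an assumption about one possible implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical stand-in for a cluster member; a null handler models a member that never replies. */
final class MemberSketch {
    final String name;
    final Function<Object, Object> handler;

    MemberSketch(String name, Function<Object, Object> handler) {
        this.name = name;
        this.handler = handler;
    }
}

/** Hypothetical in-process sketch of the proposed call; no real networking is involved. */
final class NetworkSketch {
    private NetworkSketch() {
    }

    /** Delivers the message asynchronously and fails the future if no response arrives in time. */
    @SuppressWarnings("unchecked")
    static <T> CompletableFuture<T> sendWithResponse(MemberSketch member, Object msg, long timeout) {
        CompletableFuture<T> response = new CompletableFuture<>();
        if (member.handler != null) {
            CompletableFuture.runAsync(() -> {
                try {
                    response.complete((T) member.handler.apply(msg));
                } catch (Throwable t) {
                    response.completeExceptionally(t);
                }
            });
        }
        return response.orTimeout(timeout, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        MemberSketch echo = new MemberSketch("echo", m -> "pong:" + m);
        String reply = NetworkSketch.<String>sendWithResponse(echo, "ping", 1000).join();
        System.out.println(reply);
    }
}
```

Returning a future rather than blocking lets the caller decide whether to chain callbacks or wait, which fits the RPC-style wording of the ticket.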





[jira] [Created] (IGNITE-14230) Port DynamicConfiguration to new underlying configuration framework.

2021-02-24 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14230:
--

 Summary: Port DynamicConfiguration to new underlying configuration 
framework.
 Key: IGNITE-14230
 URL: https://issues.apache.org/jira/browse/IGNITE-14230
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-14194) Multiple storages support for configuration

2021-02-17 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14194:
--

 Summary: Multiple storages support for configuration
 Key: IGNITE-14194
 URL: https://issues.apache.org/jira/browse/IGNITE-14194
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Currently we have a single hardcoded storage, we should fix it.





[jira] [Created] (IGNITE-14193) Initialize configuration tree with default values on first start

2021-02-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14193:
--

 Summary: Initialize configuration tree with default values on 
first start
 Key: IGNITE-14193
 URL: https://issues.apache.org/jira/browse/IGNITE-14193
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov


Conceptually we have the following picture: every possible configuration has a 
non-null value. The problem is choosing the exact moment at which values not 
initialized by the user are saved.

This routine must be part of the node lifecycle, of course, but its implementation is 
not trivial and is used exclusively in the lifecycle, which means that it can't 
be implemented as a part of some other, more abstract task.





[jira] [Created] (IGNITE-14145) ConfigurationUtil should be moved to internal package, visitor should be refactored.

2021-02-09 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14145:
--

 Summary: ConfigurationUtil should be moved to internal package, 
visitor should be refactored.
 Key: IGNITE-14145
 URL: https://issues.apache.org/jira/browse/IGNITE-14145
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


See the comment in IGNITE-14121. Also, we should add return values to the 
configuration visitor and split Config(root=true) from Config(root=false) for 
simplicity.
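
A minimal sketch of what "return values in the configuration visitor" could look like. All names here (`ConfigVisitor`, `ConfigNode`, `LeafCounter`) are hypothetical and are not taken from the Ignite code base; the point is only that each visit method returns a value that the caller can aggregate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical visitor whose visit methods return a value of type R. */
interface ConfigVisitor<R> {
    R visitLeaf(String key, Object value);

    R visitInner(String key, ConfigNode node);
}

/** Hypothetical inner node: children are either nested ConfigNode instances or leaf values. */
final class ConfigNode {
    private final Map<String, Object> children = new LinkedHashMap<>();

    ConfigNode put(String key, Object child) {
        children.put(key, child);
        return this;
    }

    Map<String, Object> children() {
        return children;
    }

    /** Dispatches to the matching visit method and passes its result back to the caller. */
    static <R> R visit(String key, Object child, ConfigVisitor<R> visitor) {
        return child instanceof ConfigNode
            ? visitor.visitInner(key, (ConfigNode) child)
            : visitor.visitLeaf(key, child);
    }
}

/** Example visitor: counts leaves without keeping any mutable state in the visitor itself. */
final class LeafCounter implements ConfigVisitor<Integer> {
    @Override public Integer visitLeaf(String key, Object value) {
        return 1;
    }

    @Override public Integer visitInner(String key, ConfigNode node) {
        int sum = 0;

        for (Map.Entry<String, Object> e : node.children().entrySet())
            sum += ConfigNode.visit(e.getKey(), e.getValue(), this);

        return sum;
    }
}
```

With a `void` visitor the same traversal would need a mutable counter field; returning the value keeps the visitor stateless and reusable.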





[jira] [Created] (IGNITE-14121) Implement ability to generate configuration trees from arbitrary sources

2021-02-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14121:
--

 Summary: Implement ability to generate configuration trees from 
arbitrary sources
 Key: IGNITE-14121
 URL: https://issues.apache.org/jira/browse/IGNITE-14121
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Prototype is already present here: 
[https://github.com/apache/ignite-3/pull/34/files]
Now we need to adapt it to the current configuration code and implement automatic 
generation of the construction methods' implementations.





[jira] [Created] (IGNITE-14102) Create escaping and searching util methods for configuration framework

2021-01-29 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14102:
--

 Summary: Create escaping and searching util methods for 
configuration framework
 Key: IGNITE-14102
 URL: https://issues.apache.org/jira/browse/IGNITE-14102
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Right off the bat, I can think of two useful things to do:
 * escaping / unescaping;
 * a replacement for BaseSelectors#find that'll work on new trees.
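
A minimal sketch of the escaping half of this task, assuming configuration paths are dot-separated and that a backslash is used as the escape character. Both assumptions, as well as the `ConfigKeys` class name, are illustrative and not taken from the ticket.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helpers for embedding arbitrary keys into a dot-separated path. */
final class ConfigKeys {
    private ConfigKeys() {
    }

    /** Escapes '\' and '.' so that the key can be used as a single path segment. */
    static String escape(String key) {
        return key.replace("\\", "\\\\").replace(".", "\\.");
    }

    /** Reverses escape(): drops each escaping backslash and keeps the character after it. */
    static String unescape(String key) {
        StringBuilder sb = new StringBuilder(key.length());

        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);

            if (c == '\\' && i + 1 < key.length())
                c = key.charAt(++i);

            sb.append(c);
        }

        return sb.toString();
    }

    /** Splits a path on unescaped dots and unescapes every segment. */
    static List<String> split(String path) {
        List<String> keys = new ArrayList<>();
        StringBuilder cur = new StringBuilder();

        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);

            if (c == '\\' && i + 1 < path.length())
                cur.append(path.charAt(++i));
            else if (c == '.') {
                keys.add(cur.toString());
                cur.setLength(0);
            }
            else
                cur.append(c);
        }

        keys.add(cur.toString());

        return keys;
    }
}
```

A search utility for the new trees could then walk the tree segment by segment over the result of `split`, which is why the two items are natural to implement together.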





[jira] [Created] (IGNITE-14087) Implement code generation for interfaces introduced in IGNITE-14062

2021-01-28 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14087:
--

 Summary: Implement code generation for interfaces introduced in 
IGNITE-14062
 Key: IGNITE-14087
 URL: https://issues.apache.org/jira/browse/IGNITE-14087
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-14062) Create basic classes and interfaces for traversable configuration tree.

2021-01-26 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-14062:
--

 Summary: Create basic classes and interfaces for traversable 
configuration tree.
 Key: IGNITE-14062
 URL: https://issues.apache.org/jira/browse/IGNITE-14062
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Prototype code is presented in this PR: 
https://github.com/apache/ignite-3/pull/34





[jira] [Created] (IGNITE-13986) Proof of concept - SWIM group membership protocol for discovery

2021-01-13 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13986:
--

 Summary: Proof of concept - SWIM group membership protocol for 
discovery
 Key: IGNITE-13986
 URL: https://issues.apache.org/jira/browse/IGNITE-13986
 Project: Ignite
  Issue Type: New Feature
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


IEP-61 mentions that the discovery protocol will be updated. We need to 
experiment with the options mentioned there to conclude whether they match our 
needs:

[http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]

[https://github.com/scalecube/scalecube-cluster]





Re: [ANNOUNCE] Welcome Ivan Bessonov as a new committer

2021-01-12 Thread Ivan Bessonov
Thank you all!

It's funny that @WithSystemProperty is mentioned, but I'm glad that
development became easier with it =)

Tue, 12 Jan 2021 at 15:12, ткаленко кирилл :

> Ivan, congratulations!
>
>
> 12.01.2021, 14:36, "Вячеслав Коптилин" :
> > Hello,
> >
> > My congratulations, Ivan! May the Force be with you!
> >
> > Thanks,
> > S.
> >
> > Tue, 12 Jan 2021 at 14:27, Ivan Pavlukhin :
> >
> >>  Ivan, congratulations!
> >>
> >>  2021-01-12 13:36 GMT+03:00, Kseniya Romanova <
> romanova.ks@gmail.com>:
> >>  > Ivan, my congratulations!
> >>  >
> >>  > Tue, 12 Jan 2021 at 13:32, Andrey Gura :
> >>  >
> >>  >> Igniters,
> >>  >>
> >>  >> The Apache Ignite Project Management Committee (PMC) has invited
> Ivan
> >>  >> Bessonov to become a new committer and are happy to announce that he
> >>  >> has accepted.
> >>  >>
> >>  >> Ivan has a lot of contributions to the Apache Ignite project. He
> >>  >> implemented Distributed Meta Storage feature which is one of key
> >>  >> features in the project. Also he contributed non trivial
> >>  >> improvements to such important parts of the project as
> Communication,
> >>  >> Discovery, Page Memory and Native Persistence. Another simple, but
> >>  >> very useful contribution is the widely used @WithSystemProperty
> >>  >> annotation for Apache Ignite test framework
> >>  >>
> >>  >> Being a committer enables easier contribution to the project since
> there
> >>  >> is
> >>  >> no need to go via the patch submission process. This should enable
> >>  better
> >>  >> productivity.
> >>  >>
> >>  >> Please join me in welcoming Ivan, and congratulating him on the new
> role
> >>  >> in
> >>  >> the Apache Ignite Community.
> >>  >>
> >>  >> Best Regards,
> >>  >> Andrey Gura
> >>  >>
> >>  >
> >>
> >>  --
> >>
> >>  Best regards,
> >>  Ivan Pavlukhin
>


-- 
Sincerely yours,
Ivan Bessonov


Re: SHA-512 for Maven deployment

2020-12-27 Thread Ivan Bessonov
Hi,

I've never done this before, but it seems like we need maven-gpg-plugin for
it [1].

Algorithm configuration would look like this:

--digest-algo=SHA512


Maybe this will help.

[1]
http://maven.apache.org/plugins-archives/maven-gpg-plugin-LATEST/sign-mojo.html
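
For checking a published artifact by hand, a SHA-512 checksum can also be computed with the JDK alone. The sketch below is illustrative only: the `Sha512` class is a hypothetical helper, and this is not how the Maven plugins produce their checksum files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that computes lowercase hex SHA-512 checksums. */
final class Sha512 {
    private Sha512() {
    }

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-512");
        }
        catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available in this JRE", e);
        }
    }

    /** Checksum of an in-memory byte array. */
    static String hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    /** Checksum of a file, streamed in chunks so that large artifacts are not loaded into memory. */
    static String ofFile(Path file) {
        MessageDigest md = newDigest();

        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];

            for (int n; (n = in.read(buf)) != -1; )
                md.update(buf, 0, n);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return toHex(md.digest());
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);

        for (byte b : bytes)
            sb.append(String.format("%02x", b));

        return sb.toString();
    }
}
```

The result of `Sha512.ofFile(...)` can be compared against the contents of a `.sha512` file published next to the artifact.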

Mon, 28 Dec 2020 at 01:25, Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> Igniters,
>
> I've been preparing the 3.0.0-alpha1 release and got confused about the
> requirements for checksums in Maven deployments. The Apache instruction [1]
> states that MD5 is deprecated and SHA1 should be avoided in favor of
> SHA-256 or SHA-512. However, it looks like we are still using the MD5/SHA1
> combination (at least that's what the staging for 2.9.1 [2] contains).
>
> On top of that, I can't find an easy way to switch to another checksum -
> Maven deploy plugin [3] creates MD5 and SHA1 files automatically and
> doesn't seem to have any options to tweak this behavior.
>
> That said, I have two questions:
>
>1. Are we required to use SHA512 or MD5/SHA1 is OK for now?
>2. Is there a painless way to include SHA512 in addition to MD5/SHA1?
>
> Can anyone shed some light on this?
>
> [1] https://infra.apache.org/release-signing.html#basic-facts
> [2]
>
> https://repository.apache.org/content/repositories/orgapacheignite-1490/org/apache/ignite/ignite-core/2.9.1/
> [3] https://maven.apache.org/plugins/maven-deploy-plugin/deploy-mojo.html
>
> -Val
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13833) PersistenceBasicCompatibilityTest lacks recent releases

2020-12-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13833:
--

 Summary: PersistenceBasicCompatibilityTest lacks recent releases
 Key: IGNITE-13833
 URL: https://issues.apache.org/jira/browse/IGNITE-13833
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-13832) disco-notifier-worker handles IgniteInterruptedCheckedException incorrectly

2020-12-09 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13832:
--

 Summary: disco-notifier-worker handles 
IgniteInterruptedCheckedException incorrectly
 Key: IGNITE-13832
 URL: https://issues.apache.org/jira/browse/IGNITE-13832
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


DiscoveryMessageNotifierWorker#body handles InterruptedException correctly, but 
if it catches IgniteInterruptedCheckedException, it follows a different code path, 
which is incorrect. I believe both kinds of interruption should be handled in the 
same way.

 
{code:java}
[org.gridgain:gridgain-compatibility] [2020-04-13 
08:19:15,109][ERROR][disco-notifier-worker-#69754%top2_node_rcv%][root] 
Critical system error detected. Will be handled accordingly to configured 
handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class 
o.a.i.IgniteException: Failed to wait for handling disconnect event.]]
[08:19:15]W: [org.gridgain:gridgain-compatibility] class 
org.apache.ignite.IgniteException: Failed to wait for handling disconnect event.
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.awaitDisconnectEvent(GridDiscoveryManager.java:3128)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.access$6400(GridDiscoveryManager.java:2793)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:868)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:519)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2686)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2724)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
java.lang.Thread.run(Thread.java:748)
[08:19:15]W: [org.gridgain:gridgain-compatibility] Caused by: class 
org.apache.ignite.internal.IgniteInterruptedCheckedException: Got interrupted 
while waiting for future to complete.
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:185)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.awaitDisconnectEvent(GridDiscoveryManager.java:3125)
[08:19:15]W: [org.gridgain:gridgain-compatibility]  ... 7 more
{code}
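
A minimal sketch of the proposed behavior, in which both kinds of interruption take the same path. `InterruptedCheckedException` is a hypothetical stand-in for IgniteInterruptedCheckedException and `NotifierWorker` is illustrative; this is not the actual DiscoveryMessageNotifierWorker code.

```java
/** Hypothetical stand-in for Ignite's IgniteInterruptedCheckedException. */
class InterruptedCheckedException extends Exception {
    InterruptedCheckedException(String msg) {
        super(msg);
    }
}

/** Illustrative worker body that treats both interruption signals identically. */
final class NotifierWorker {
    /** A unit of work that may be interrupted in either of the two forms. */
    interface Task {
        void run() throws InterruptedException, InterruptedCheckedException;
    }

    private NotifierWorker() {
    }

    /** Returns true if the task completed, false if it was interrupted in either form. */
    static boolean runTask(Task task) {
        try {
            task.run();

            return true;
        }
        catch (InterruptedException | InterruptedCheckedException e) {
            // Same handling for both: restore the interrupt flag and stop quietly,
            // instead of escalating one of them as a critical failure.
            Thread.currentThread().interrupt();

            return false;
        }
    }
}
```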





Re: [DISCUSSION] Java 11 for Ignite 3.0 development

2020-12-08 Thread Ivan Bessonov
This is an awesome idea.

Honestly, I can't come up with strong technical arguments for Java 11 as a
source level, since I haven't had the chance to work with it long enough, but
it feels like the right time to move to a "modern" technology. Subjectively, I
can say that Java 11 has a lot of good optimizations and Ignite should run
better on it. So it makes no sense to compile for 8 but recommend 11, you know.

Tue, 8 Dec 2020 at 21:16, Данилов Семён :

> +1 for sure. AFAIK, the only thing holding us back from using Java 11 is
> the dominance of Java 8, but I'm sure that by the time Ignite 3 is GA,
> there will be much fewer Java 8 users if any significant number at all. By
> the by, Ignite's sources need minimal effort to be able to be compiled with
> Java 11 as a target.
>
> 08.12.2020, 15:00, "Nikolay Izhikov" :
> > +1 for using java 11.
> >
> >>  On 8 Dec 2020, at 13:18, ткаленко кирилл wrote:
> >>
> >>  +1
> >>
> >>  08.12.2020, 12:48, "Philipp Masharov" :
> >>>  Hello!
> >>>
> >>>  Andrey's arguments are solid.
> >>>
> >>>  On Tue, Dec 8, 2020 at 12:23 PM Pavel Tupitsyn 
> wrote:
> >>>
> >>>>   +1, Java 11 seems to be the only right choice at the moment.
> >>>>
> >>>>   On Tue, Dec 8, 2020 at 12:08 PM Alexey Zinoviev <
> zaleslaw@gmail.com>
> >>>>   wrote:
> >>>>
> >>>>   > I totally support Java 11 for development. It's time to go forward
> >>>>   >
> >>>>   > Tue, 8 Dec 2020 at 11:40, Andrey Gura :
> >>>>   >
> >>>>   > > Igniters,
> >>>>   > >
> >>>>   > > We already had some discussion about using modern Java versions
> for
> >>>>   > > Ignite 3.0 development [1] but we still don't have consensus.
> >>>>   > > As I see from this discussion the strongest argument for Java
> 11 is
> >>>>   > > the fact that Java 11 is the latest LTS release which will have
> >>>>   > > premier support until September 2023. So I don't see any reason
> for
> >>>>   > > preferring any other version of Java at this moment.
> >>>>   > >
> >>>>   > > The purpose of this thread is to gather opinions about using
> Java 11
> >>>>   > > in the Ignite 3.0 project and, eventually, reach a consensus on
> this.
> >>>>   > >
> >>>>   > > I want to share my several arguments in favor of abandoning
> Java 8 and
> >>>>   > > preferring Java 11:
> >>>>   > >
> >>>>   > > * Java 8 has gone through the End of Public Updates process for
> legacy
> >>>>   > > releases. So it doesn't make sense to start new development on
> Java 8.
> >>>>   > >
> >>>>   > > * Java 9+ brings Jigsaw modularization which allows us to have
> more
> >>>>   > > fine-grained structure of Ignite modules and APIs in the future.
> >>>>   > >
> >>>>   > > * Ignite actively uses Unsafe functionality which, firstly,
> isn't
> >>>>   > > public, and secondly, leads to problems with running Ignite
> under Java
> >>>>   > > 9+ (modularization which requires dozens of command-line
> options in
> >>>>   > > order to forcibly export corresponding packages) and GraalVM.
> Such a
> >>>>   > > situation could be described as bad user experience and we
> should fix
> >>>>   > > it. Var handles [2] could be used for solving described
> problems.
> >>>>   > >
> >>>>   > > * Java 9+ introduces Flight Recorder API [3] which could be
> used in
> >>>>   > > the Ignite project for lightweight profiling of internal
> processes.
> >>>>   > >
> >>>>   > > Please, share your opinions, objections and ideas about this
> topic. I
> >>>>   > > hope we will not have serious disagreements and the consensus
> will be
> >>>>   > > reached quickly.
> >>>>   > >
> >>>>   > >
> >>>>   > > 1.
> >>>>   > >
> >>>>   >
> >>>>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-Ignite-3-0-development-approach-tp49922p50295.html
> >>>>   > > 2.
> >>>>   > >
> >>>>   >
> >>>>
> https://docs.oracle.com/javase/9/docs/api/java/lang/invoke/VarHandle.html
> >>>>   > > 3.
> >>>>   > >
> >>>>   >
> >>>>
> https://docs.oracle.com/en/java/javase/11/docs/api/jdk.jfr/jdk/jfr/FlightRecorder.html
> >>>>   > >
> >>>>   >
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Re[2]: [DISCUSSION] Modules organization in Ignite 3

2020-12-08 Thread Ivan Bessonov
The conversation has shifted in an unintended direction, but I agree.

I think that if an API can (or will) be changed, then it should be deprecated.
For that we can introduce @IgniteDeprecated, which will contain the Ignite
version in which the API is planned to be removed. Otherwise it's either stable
or experimental. Having officially "unstable" features doesn't sound good for
product reputation.
As for the modularization - I'm all for this idea. If we don't force
ourselves to
organize code properly then we'll end up with the same problems as we have
in the current code base. And this way there's a hope of having good tests
that can be completed in minutes, not hours. At least new ones.

BTW, did we have any discussions about dependency injection and all this
stuff?
Seems like a related topic to me.

Wed, 9 Dec 2020 at 09:47, Zhenya Stanilovsky :

>
>
> Hello Nikolay, if I found the proposed feature structure in some
> project, I would prefer to choose a different one )
>
> >
> >>
> >>>Hello, Alexey.
> >>>
> >>>Think we can extend our @IgniteExperimental annotation.
> >>>
> >>>`@IgniteExperimental` - mark features that are truly experimental and
> can be completely removed in future releases.
> >>>`@NotRecommended` - mark features that widely adopted by the users but
> implemented wrong or have known issues that can’t be fixed.
> >>>`@NotStable` - mark features supported by community but API not stable
> and can be reworked in the next release.
> >>>`@Stable` - mark features that are completely OK and here to stay.
> >>>
> >>>We should output notes about these annotations in the JavaDoc, also.
> >>>What do you think?
> >>>
> >>>
> >>>> On 8 Dec 2020, at 12:49, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >>>>
> >>>> Igniters,
> >>>>
> >>>> I want to tackle the topic of modules structure in Ignite 3. So far,
> the
> >>>> modules in Ignite are mostly defined intuitively which leads to some
> >>>> complications:
> >>>>
> >>>> - Ignite public API is separated from the rest of the code only by
> >>>> package name. This leads to private classes leaking to public API
> which is
> >>>> very hard to catch even during the review process (we missed a bunch
> of
> >>>> such leaks for new metrics API [1] and I remember this happening for
> almost
> >>>> every SPI)
> >>>> - Classes from 'internal' packages are considered to be 'free for
> grabs'
> >>>> in every place of the code. This leads to tight coupling and
> abstraction
> >>>> leakage in the code. An example of such a case - an often cast of
> >>>> WALPointer to FileWALPointer, so that the community decided to get
> rid of
> >>>> the WALPointer interface altogether [2]
> >>>> - Overall code complexity. Because of the lack of inter-module
> >>>> interaction rules, we are free to add new methods and callbacks to any
> >>>> class, which leads to duplicating entities and verbose interfaces. A
> good
> >>>> example of this is the clear duplication of methods in
> >>>> IgniteCacheOffheapManager and IgniteCacheOffheapManager.DataStore [3]
> >>>>
> >>>> I think we need to work out some rules that will help us define and
> control
> >>>> both Ignite public API and module internal API which still defines a
> clear
> >>>> contract for other modules. Some ideas:
> >>>>
> >>>> - Perhaps we can move all user public classes and interfaces to an
> >>>> Ignite-API module which will have no dependencies on implementation
> >>>> modules. This will prevent private classes from leaking to the API
> module.
> >>>> - We need to somehow define which classes from a module are exposed to
> >>>> other modules, and which classes are left for module-private usage.
> Maybe
> >>>> Java's jigsaw will help us here, but maybe we will be ok with just
> more
> >>>> strict java access modifiers usage :) The idea here is that a module
> should
> >>>> never touch a dependent module's private classes, ever. The exported
> >>>> classes and interfaces are still free to be modified between
> releases, as
> >>>> long as it is not a user public API.
> >>>> - A module should be logically complete, thus it may be beneficial if
> >>>> module name matches with the code package it provides (e.g.
> configuration
> >>>> -> org.apache.ignite.configuration, replication ->
> >>>> org.apache.ignite.replication, raft->org.apache.ignite.raft, etc)
> >>>>
> >>>> Any other principles/rules we can apply to make the code structure
> more
> >>>> concise? Thoughts?
> >>>>
> >>>> --AG
> >>>>
> >>>> [1]  https://issues.apache.org/jira/browse/IGNITE-12552
> >>>> [2]  https://issues.apache.org/jira/browse/IGNITE-13513
> >>>> [3]  https://issues.apache.org/jira/browse/IGNITE-13220
> >>
> >>
> >>
> >>



-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13823) WAL iterators require WRITE permissions

2020-12-07 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13823:
--

 Summary: WAL iterators require WRITE permissions
 Key: IGNITE-13823
 URL: https://issues.apache.org/jira/browse/IGNITE-13823
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


org.apache.ignite.internal.processors.cache.persistence.wal.FileDescriptor#toIO 
uses the default permissions, i.e. "CREATE, READ, WRITE", so WAL iterators need 
WRITE access even though they only read.
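
A minimal sketch of the read-only alternative, using plain NIO rather than Ignite's FileIO abstractions. The `ReadOnlySegment` class is hypothetical; the only point is that `StandardOpenOption.READ` alone is enough to read a segment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical reader that opens a segment file with READ permission only. */
final class ReadOnlySegment {
    private ReadOnlySegment() {
    }

    /** Reads the whole file; works even if the process has no write access to it. */
    static byte[] readAll(Path segment) {
        try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());

            // Keep reading until the buffer is full or the end of the file is reached.
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // No-op: read() advances the buffer position.
            }

            return buf.array();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```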





[jira] [Created] (IGNITE-13814) Long restorePartitionStates triggers FailureHandler on node startup

2020-12-04 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13814:
--

 Summary: Long restorePartitionStates triggers FailureHandler on 
node startup
 Key: IGNITE-13814
 URL: https://issues.apache.org/jira/browse/IGNITE-13814
 Project: Ignite
  Issue Type: Bug
 Environment: {noformat}
Thread [name="sys-stripe-4-#5%EPE_CLUSTER_PERF%", id=24, state=WAITING, 
blockCnt=4, waitCnt=70836]
at java.base@11.0.8/jdk.internal.misc.Unsafe.park(Native Method)
at 
java.base@11.0.8/java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
at 
app//o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:186)
at 
app//o.a.i.i.util.future.GridFutureAdapter.getUninterruptibly(GridFutureAdapter.java:154)
at 
app//o.a.i.i.processors.cache.persistence.file.AsyncFileIO.read(AsyncFileIO.java:128)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO$2.run(AbstractFileIO.java:89)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO.fully(AbstractFileIO.java:52)
at 
app//o.a.i.i.processors.cache.persistence.file.AbstractFileIO.readFully(AbstractFileIO.java:87)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStore.readWithFailover(FilePageStore.java:794)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:418)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:519)
at 
app//o.a.i.i.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:503)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:874)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:700)
at 
app//o.a.i.i.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:689)
at 
app//o.a.i.i.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:157)
at 
app//o.a.i.i.processors.cache.persistence.freelist.PagesList.init(PagesList.java:274)
at 
app//o.a.i.i.processors.cache.persistence.freelist.AbstractFreeList.(AbstractFreeList.java:390)
at 
app//o.a.i.i.processors.cache.persistence.freelist.CacheFreeList.(CacheFreeList.java:57)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore$1.(GridCacheOffheapManager.java:1806)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init0(GridCacheOffheapManager.java:1805)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.init(GridCacheOffheapManager.java:2130)
at 
app//o.a.i.i.processors.cache.persistence.GridCacheOffheapManager.restorePartitionStates(GridCacheOffheapManager.java:544)
at 
app//o.a.i.i.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle.lambda$restorePartitionStates$0(GridCacheProcessor.java:5253)
at 
app//o.a.i.i.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle$$Lambda$633/0x000800717040.run(Unknown
 Source)
at 
app//o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
at app//o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
at java.base@11.0.8/java.lang.Thread.run(Thread.java:834){noformat}
In this case, warm-up is on, but the client also reports this happening without 
warm-up. I don't think that restoring partition states should trigger the FH. It may 
take a lot of time with PDS. Also, why do we run it in the striped pool? Let's 
imagine two large caches get the same stripe - the restore time doubles.
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


The following would be printed to log:
{noformat}
[2020-10-30T17:32:26,190][WARN ][grid-timeout-worker-#22%EPE_CLUSTER_PERF%][] 
Possible failure suppressed accordingly to a configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class 
o.a.i.IgniteException: GridWorker [name=sys-stripe-4, 
igniteInstanceName=EPE_CLUSTER_PERF, finished=false, 
heartbeatTs=1604104192954]]]
org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-4, 
igniteInstanceName=EPE_CLUSTER_PERF, finished=false, heartbeatTs=1604104192954]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1859)
 [ignite-core-8.7.28.jar:8.7.28]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1854)
 [ignite-core-8.7.28.jar:8.7.28]
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
 [ignite-core-

[jira] [Created] (IGNITE-13813) SKIP_GARBAGE WAL compression doesn't work for binary recovery

2020-12-04 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13813:
--

 Summary: SKIP_GARBAGE WAL compression doesn't work for binary 
recovery
 Key: IGNITE-13813
 URL: https://issues.apache.org/jira/browse/IGNITE-13813
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to apply page snapshot

at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$14(GridCacheDatabaseSharedManager.java:2419)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApplyPage$18(GridCacheDatabaseSharedManager.java:2603)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApply$19(GridCacheDatabaseSharedManager.java:2641)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AssertionError: 4096
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyPageSnapshot(GridCacheDatabaseSharedManager.java:2671)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$14(GridCacheDatabaseSharedManager.java:2412)
... 5 more{noformat}





[jira] [Created] (IGNITE-13812) CheckpointEntry is read from WAL right after its creation.

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13812:
--

 Summary: CheckpointEntry is read from WAL right after its creation.
 Key: IGNITE-13812
 URL: https://issues.apache.org/jira/browse/IGNITE-13812
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
[2020-07-31 16:33:15,545][INFO ][pitr-ctx-exec-#304][WalStateManager] WAL 
logging disabled
[2020-07-31 16:33:15,545][INFO 
][db-checkpoint-thread-#152][GridCacheDatabaseSharedManager] Checkpoint 
finished [cpId=e1a57b48-1610-4280-a3e2-4d808a5f0343, pages=64, 
markPos=FileWALPointer [idx=5, fileOff=45749881, len=186791], 
walSegmentsCleared=0, walSegmentsCovered=[], markDuration=49ms, pagesWrite=0ms, 
fsync=5ms, total=79ms]
[2020-07-31 16:33:15,546][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Start apply segment idx=1
[2020-07-31 16:33:16,012][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Segment idx=1 applied
[2020-07-31 16:33:16,373][INFO ][pitr-ctx-exec-#304][GridRecoveryProcessor] 
Segment idx=2 applied
[2020-07-31 16:33:16,553][ERROR][db-checkpoint-thread-#152][root] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=CRITICAL_ERROR, 
err=java.lang.ClassCastException: class 
o.a.i.i.pagemem.wal.record.MemoryRecoveryRecord cannot be cast to class 
o.a.i.i.pagemem.wal.record.CheckpointRecord 
(o.a.i.i.pagemem.wal.record.MemoryRecoveryRecord and 
o.a.i.i.pagemem.wal.record.CheckpointRecord are in unnamed module of loader 
'app')]]
java.lang.ClassCastException: class 
org.apache.ignite.internal.pagemem.wal.record.MemoryRecoveryRecord cannot be 
cast to class org.apache.ignite.internal.pagemem.wal.record.CheckpointRecord 
(org.apache.ignite.internal.pagemem.wal.record.MemoryRecoveryRecord and 
org.apache.ignite.internal.pagemem.wal.record.CheckpointRecord are in unnamed 
module of loader 'app')
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.initIfNeeded(CheckpointEntry.java:353)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.access$300(CheckpointEntry.java:245)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.initIfNeeded(CheckpointEntry.java:124)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.groupState(CheckpointEntry.java:106)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.addCpToEarliestCpMap(CheckpointHistory.java:246)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.addCheckpoint(CheckpointHistory.java:179)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:4221)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3732)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3621)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.base/java.lang.Thread.run(Thread.java:834){noformat}





[jira] [Created] (IGNITE-13811) ServerImpl#pingNode(InetSocketAddress, UUID, UUID) fails to ping nodes with unresolved addresses

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13811:
--

 Summary: ServerImpl#pingNode(InetSocketAddress, UUID, UUID) fails 
to ping nodes with unresolved addresses
 Key: IGNITE-13811
 URL: https://issues.apache.org/jira/browse/IGNITE-13811
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


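The wrong key is removed from the map: the future is stored under the original address, but addr may be replaced with a resolved address before the removal.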
{code:java}
pingMap.putIfAbsent(addr, fut)
{code}
{code:java}
if (addr.isUnresolved())
    addr = new InetSocketAddress(InetAddress.getByName(addr.getHostName()), addr.getPort());
{code}
{code:java}
boolean b = pingMap.remove(addr, fut);

assert b;
{code}
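
The three fragments above can be put together in a minimal, self-contained sketch. This is not the actual ServerImpl code: the class name PingMapSketch, the helper resolved(), the use of Object instead of the real future type, and the loopback bytes passed to InetAddress.getByAddress (used only to avoid a DNS lookup) are all assumptions made for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PingMapSketch {
    /** Hypothetical helper: builds a resolved copy of addr without a DNS lookup. */
    static InetSocketAddress resolved(InetSocketAddress addr) {
        try {
            // The loopback bytes are a placeholder; only the "resolved" state matters here.
            InetAddress ip = InetAddress.getByAddress(addr.getHostName(), new byte[] {127, 0, 0, 1});

            return new InetSocketAddress(ip, addr.getPort());
        }
        catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reported flow: put under addr, reassign addr, then remove by the reassigned addr. */
    public static boolean removeByReassignedKey(InetSocketAddress addr) {
        ConcurrentMap<InetSocketAddress, Object> pingMap = new ConcurrentHashMap<>();
        Object fut = new Object();

        pingMap.putIfAbsent(addr, fut);

        if (addr.isUnresolved())
            addr = resolved(addr);

        // An unresolved and a resolved InetSocketAddress are not equal, so nothing is removed.
        return pingMap.remove(addr, fut);
    }

    /** One possible variant: remember the key that was actually put into the map. */
    public static boolean removeByOriginalKey(InetSocketAddress addr) {
        ConcurrentMap<InetSocketAddress, Object> pingMap = new ConcurrentHashMap<>();
        Object fut = new Object();
        InetSocketAddress key = addr;

        pingMap.putIfAbsent(key, fut);

        if (addr.isUnresolved())
            addr = resolved(addr);

        return pingMap.remove(key, fut);
    }

    public static void main(String[] args) {
        InetSocketAddress addr = InetSocketAddress.createUnresolved("node-1.example", 47500);

        System.out.println("removed by reassigned key: " + removeByReassignedKey(addr));
        System.out.println("removed by original key:   " + removeByOriginalKey(addr));
    }
}
```

For an unresolved address, the first method is expected to return false (which is what trips the assertion), while the second returns true.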





[jira] [Created] (IGNITE-13808) Control.sh validate_indexes throws CorruptedTreeException and fails server node during check

2020-12-03 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13808:
--

 Summary: Control.sh validate_indexes throws CorruptedTreeException and fails server node during check
 Key: IGNITE-13808
 URL: https://issues.apache.org/jira/browse/IGNITE-13808
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


A CorruptedTreeException thrown during the validate_indexes command invokes the failure handler and stops the server node:
{code:java}
[21:44:26,257][WARNING][pool-5-thread-2][ValidateIndexesClosure] Current progress of ValidateIndexesClosure: checked integrity of 1 index partitions of 14 cache groups
[21:44:26,852][SEVERE][pool-5-thread-16][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], msg=Runtime failure on bounds: [lower=null, upper=null
class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], msg=Runtime failure on bounds: [lower=null, upper=null]]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:5126)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1029)
    at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.find(H2TreeIndex.java:243)
    at org.apache.ignite.internal.visor.verify.ValidateIndexesClosure.processIndex(ValidateIndexesClosure.java:651)
    at org.apache.ignite.internal.visor.verify.ValidateIndexesClosure.access$200(ValidateIndexesClosure.java:93)
    at org.apache.ignite.internal.visor.verify.ValidateIndexesClosure$4.call(ValidateIndexesClosure.java:631)
    at org.apache.ignite.internal.visor.verify.ValidateIndexesClosure$4.call(ValidateIndexesClosure.java:629)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Item not found: 11
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:987)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1014)
    ... 9 more
Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Item not found: 11
    at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:203)
    at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:104)
    at org.apache.ignite.internal.processors.query.h2.database.H2RowFactory.getRow(H2RowFactory.java:61)
    at org.apache.ignite.internal.processors.query.h2.database.H2Tree.createRowFromLink(H2Tree.java:246)
    at org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(H2ExtrasLeafIO.java:126)
    at org.apache.ignite.internal.processors.query.h2.database.io.H2ExtrasLeafIO.getLookupRow(H2ExtrasLeafIO.java:36)
    at org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:264)
    at org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:56)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(BPlusTree.java:4808)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.init(BPlusTree.java:4710)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.access$5000(BPlusTree.java:4646)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:976)
    ... 10 more
Caused by: java.lang.IllegalStateException: Item not found: 11
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:341)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset

[jira] [Created] (IGNITE-13802) GridCacheOffheapManager#addPartitions ignores candidate pages count for index partition

2020-12-02 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13802:
--

 Summary: GridCacheOffheapManager#addPartitions ignores candidate pages count for index partition
 Key: IGNITE-13802
 URL: https://issues.apache.org/jira/browse/IGNITE-13802
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


It also marks the page as dirty despite doing nothing with it.





[jira] [Created] (IGNITE-13795) java.nio.file.InvalidPathException: Illegal char <:> at lock page on windows

2020-12-02 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13795:
--

 Summary: java.nio.file.InvalidPathException: Illegal char <:> at lock page on windows
 Key: IGNITE-13795
 URL: https://issues.apache.org/jira/browse/IGNITE-13795
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{code:java}
Exception in thread "Thread-1" java.nio.file.InvalidPathException: Illegal char <:> at index 109: C:\BuildAgent\work\d501ae8146bd8253\i2test\var\suite-thin_clients\art-gg-ult\work\diagnostic\page_lock_dump_0:0:0:0:0:0:0:1,127.0.0.1,172.23.240.1,172.25.2.217:47500_2020_06_22_17_24_06_377
    at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
    at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
    at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
    at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
    at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
    at java.io.File.toPath(File.java:2234)
    at org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.saveToFile(ToFileDumpProcessor.java:69)
    at org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.dumpprocessors.ToFileDumpProcessor.toFileDump(ToFileDumpProcessor.java:53)
    at org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.PageLockTrackerManager.onHangThreads(PageLockTrackerManager.java:123)
    at org.apache.ignite.internal.processors.cache.persistence.diagnostic.pagelocktracker.SharedPageLockTracker$TimeOutWorker.run(SharedPageLockTracker.java:385)
{code}
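
The file name in the trace embeds node addresses (0:0:0:0:0:0:0:1,...:47500), and ':' is not allowed in Windows file names. One possible way to avoid this, sketched below, is to replace such characters before the file is created. This is only an illustration: the class DumpFileNames, the helper sanitizeFileName and the choice of '_' as the replacement are assumptions, not the fix that was applied in Ignite.

```java
public class DumpFileNames {
    /** Characters that Windows rejects in file names (control characters are handled separately). */
    private static final String ILLEGAL = "<>:\"/\\|?*";

    /** Hypothetical helper: replaces every character that is illegal on Windows with '_'. */
    public static String sanitizeFileName(String name) {
        StringBuilder sb = new StringBuilder(name.length());

        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);

            sb.append(ILLEGAL.indexOf(c) >= 0 || c < 32 ? '_' : c);
        }

        return sb.toString();
    }

    public static void main(String[] args) {
        // A shortened version of the file name from the stack trace above.
        String raw = "page_lock_dump_0:0:0:0:0:0:0:1,127.0.0.1:47500_2020_06_22_17_24_06_377";

        System.out.println(sanitizeFileName(raw));
    }
}
```

Only the file name should be passed to such a helper, not the full path, since the drive separator in "C:\" is a legitimate colon.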





[jira] [Created] (IGNITE-13786) PDS defragmentation can inflate index size

2020-12-01 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13786:
--

 Summary: PDS defragmentation can inflate index size
 Key: IGNITE-13786
 URL: https://issues.apache.org/jira/browse/IGNITE-13786
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


For huge caches, defragmentation can lead to a bigger index size.

The reason is that we only append new data to index trees and never insert into
the middle, which leads to under-utilization of B+Tree page space.
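
The effect can be illustrated with a deliberately simplified model. It assumes that a full leaf page is split in half when the next key arrives, and it only counts keys per leaf; it is not Ignite's BPlusTree, and the class name, the method, and the "dense" comparison mode are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendOnlyFillSketch {
    /**
     * Appends keyCnt ascending keys and returns the average leaf utilization (0..1).
     * splitInHalf = true: a full leaf keeps half of its keys, the rest move to a new leaf.
     * splitInHalf = false: a full leaf is left intact and a new leaf is started (dense packing, for comparison).
     */
    public static double utilization(int keyCnt, int leafCapacity, boolean splitInHalf) {
        List<Integer> leaves = new ArrayList<>();
        leaves.add(0);

        for (int i = 0; i < keyCnt; i++) {
            int last = leaves.size() - 1;
            int cnt = leaves.get(last);

            if (cnt < leafCapacity)
                leaves.set(last, cnt + 1); // Ascending keys always land in the rightmost leaf.
            else if (splitInHalf) {
                int left = leafCapacity / 2;

                leaves.set(last, left);              // The left half is never touched again.
                leaves.add(leafCapacity - left + 1); // The right half plus the new key.
            }
            else
                leaves.add(1);
        }

        long total = 0;

        for (int c : leaves)
            total += c;

        return (double)total / ((long)leaves.size() * leafCapacity);
    }

    public static void main(String[] args) {
        System.out.println("split in half: " + utilization(10_000, 100, true));
        System.out.println("dense packing: " + utilization(10_000, 100, false));
    }
}
```

In this model every leaf except the rightmost one stays about half full, because nothing is ever inserted into the middle to fill the left halves.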





[jira] [Created] (IGNITE-13743) Defragmentation JMX API for schedule/cancel/status

2020-11-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13743:
--

 Summary: Defragmentation JMX API for schedule/cancel/status
 Key: IGNITE-13743
 URL: https://issues.apache.org/jira/browse/IGNITE-13743
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Semyon Danilov








[jira] [Created] (IGNITE-13742) Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime

2020-11-20 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13742:
--

 Summary: Fix failed WalModeChangeAdvancedSelfTest.testMaintenanceIsSkippedIfWasFixedManuallyOnDowntime
 Key: IGNITE-13742
 URL: https://issues.apache.org/jira/browse/IGNITE-13742
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov


https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=5803772702668480758=testDetails





Re: [REVIEW REQUEST] IEP-47 Native Persistence Defragmentation, core logic

2020-11-17 Thread Ivan Bessonov
But maybe I just don't know the release date. In short, right now
defragmentation is my top priority.

On Tue, Nov 17, 2020 at 15:18, Ivan Bessonov wrote:

> Denis,
>
> chances that feature will be fully complete is a bit low. We still make
> adjustments to the API
> and we need a few optimizations so that it would work faster.
>
> On Thu, Nov 12, 2020 at 19:11, Denis Magda wrote:
>
>> Ivan,
>>
>> Nice! Is the plan to get it added to Ignite 2.10?
>>
>> -
>> Denis
>>
>>
>> On Thu, Nov 12, 2020 at 7:11 AM Ivan Bessonov 
>> wrote:
>>
>> > Hi Igniters,
>> >
>> > Core functionality of defragmentation is finally implemented in [1].
>> > There's no public API in it
>> > for now, patch is already very big and had to be split into smaller
>> tasks
>> > (that consist mostly of refactoring).
>> >
>> > Code is a little rough right now, I'm gonna go through all the remaining
>> > TODO, but you can already
>> > start reviewing it. PR is here: [2].
>> >
>> > First control.sh commands are here, but I don't have TC test results
>> yet:
>> > [3].
>> > There will be more API related issues later, but now I'd like to polish
>> > core classes.
>> >
>> > Please leave your thoughts here and in PR.
>> >
>> > Thank you!
>> >
>> > [0]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
>> > [1] https://issues.apache.org/jira/browse/IGNITE-13190
>> > [2] https://github.com/apache/ignite/pull/7984/files
>> > [3] https://issues.apache.org/jira/browse/IGNITE-13697
>> >
>> > --
>> > Sincerely yours,
>> > Ivan Bessonov
>> >
>>
>
>
> --
> Sincerely yours,
> Ivan Bessonov
>


-- 
Sincerely yours,
Ivan Bessonov


Re: [REVIEW REQUEST] IEP-47 Native Persistence Defragmentation, core logic

2020-11-17 Thread Ivan Bessonov
Denis,

the chances that the feature will be fully complete by then are rather low.
We are still adjusting the API, and we need a few optimizations to make it
work faster.

On Thu, Nov 12, 2020 at 19:11, Denis Magda wrote:

> Ivan,
>
> Nice! Is the plan to get it added to Ignite 2.10?
>
> -
> Denis
>
>
> On Thu, Nov 12, 2020 at 7:11 AM Ivan Bessonov 
> wrote:
>
> > Hi Igniters,
> >
> > Core functionality of defragmentation is finally implemented in [1].
> > There's no public API in it
> > for now, patch is already very big and had to be split into smaller tasks
> > (that consist mostly of refactoring).
> >
> > Code is a little rough right now, I'm gonna go through all the remaining
> > TODO, but you can already
> > start reviewing it. PR is here: [2].
> >
> > First control.sh commands are here, but I don't have TC test results yet:
> > [3].
> > There will be more API related issues later, but now I'd like to polish
> > core classes.
> >
> > Please leave your thoughts here and in PR.
> >
> > Thank you!
> >
> > [0]
> >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
> > [1] https://issues.apache.org/jira/browse/IGNITE-13190
> > [2] https://github.com/apache/ignite/pull/7984/files
> > [3] https://issues.apache.org/jira/browse/IGNITE-13697
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
> >
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13709) Control.sh API - status

2020-11-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13709:
--

 Summary: Control.sh API - status
 Key: IGNITE-13709
 URL: https://issues.apache.org/jira/browse/IGNITE-13709
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[REVIEW REQUEST] IEP-47 Native Persistence Defragmentation, core logic

2020-11-12 Thread Ivan Bessonov
Hi Igniters,

The core functionality of defragmentation is finally implemented in [1].
There's no public API in it for now: the patch is already very big and had
to be split into smaller tasks (which consist mostly of refactoring).

The code is a little rough right now, and I'm going to go through all the
remaining TODOs, but you can already start reviewing it. The PR is here: [2].

The first control.sh commands are here, but I don't have TC test results yet:
[3].
There will be more API-related issues later, but for now I'd like to polish
the core classes.

Please leave your thoughts here and in PR.

Thank you!

[0]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
[1] https://issues.apache.org/jira/browse/IGNITE-13190
[2] https://github.com/apache/ignite/pull/7984/files
[3] https://issues.apache.org/jira/browse/IGNITE-13697

-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13697) Control.sh API - schedule & cancel

2020-11-12 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13697:
--

 Summary: Control.sh API - schedule & cancel
 Key: IGNITE-13697
 URL: https://issues.apache.org/jira/browse/IGNITE-13697
 Project: Ignite
  Issue Type: Sub-task
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov








[jira] [Created] (IGNITE-13597) Execution timeout in PDS 2

2020-10-20 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13597:
--

 Summary: Execution timeout in PDS 2 
 Key: IGNITE-13597
 URL: https://issues.apache.org/jira/browse/IGNITE-13597
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Pds2/5677092?buildTab=log=3





Re: Ignite compilation fails in IntelliJ IDEA (IgniteLinkTaglet)

2020-08-10 Thread Ivan Bessonov
Hi Pavel,

this issue is unrelated to your problem, but yes, it won't allow you to
save the changes. This sucks. You should remove one of those modules in your
settings: they point to the same pom.xml, which means that the same module is
listed twice in your settings.

On Mon, Aug 10, 2020 at 11:24, Pavel Tupitsyn wrote:

> Ivan,
>
> Thanks for the suggestion. Unfortunately, it does not help -
> Idea does not let me apply the changes:
>
> `Content root "/home/pavel/w/ignite" is defined for modules "apache-ignite"
> and "ignite".
> Two modules in a project cannot share the same content root.`
>
> Clearly I'm doing something wrong - maybe I'm importing the project into
> Idea in a wrong way?
> Or should I use a different JDK? Which version is best for Ignite
> development right now?
> (I'm using OpenJDK 8 just out of habit)
>
>
> On Mon, Aug 10, 2020 at 10:02 AM Ivan Bessonov 
> wrote:
>
> > Hi Pavel,
> >
> > please go to "Project Structure | Project Settings | Modules",
> > find module "ignite-tools", open tab "Sources" and mark folder
> > "src/main/java11" as "excluded". Should help.
> >
> > This happens from time to time if you switch from a very old branch
> > (like "ignite-2.5") to a fresh branch like "master".
> >
> > On Sun, Aug 9, 2020 at 21:18, Pavel Tupitsyn wrote:
> >
> > > Igniters,
> > >
> > > The project does not seem to compile in IDEA:
> > > there are two IgniteLinkTaglet versions for Java 8 and Java 9+,
> > > and both files get picked up by the IDE for some reason, resulting
> > > in build errors.
> > >
> > > I've done all the usual things (fresh clone, invalidate caches).
> > > java-8 profile is enabled, java-9+ disabled, only JDK 8 is installed.
> > > Maven build is fine, only IDEA gives me errors.
> > >
> > > I've seen some people just delete one of the IgniteLinkTaglet files,
> > > this works for me too but is quite inconvenient.
> > > Is there any trick to this?
> > >
> >
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
> >
>


-- 
Sincerely yours,
Ivan Bessonov


Re: Ignite compilation fails in IntelliJ IDEA (IgniteLinkTaglet)

2020-08-10 Thread Ivan Bessonov
Hi Pavel,

please go to "Project Structure | Project Settings | Modules",
find module "ignite-tools", open tab "Sources" and mark folder
"src/main/java11" as "excluded". Should help.

This happens from time to time if you switch from a very old branch
(like "ignite-2.5") to a fresh branch like "master".

On Sun, Aug 9, 2020 at 21:18, Pavel Tupitsyn wrote:

> Igniters,
>
> The project does not seem to compile in IDEA:
> there are two IgniteLinkTaglet versions for Java 8 and Java 9+,
> and both files get picked up by the IDE for some reason, resulting
> in build errors.
>
> I've done all the usual things (fresh clone, invalidate caches).
> java-8 profile is enabled, java-9+ disabled, only JDK 8 is installed.
> Maven build is fine, only IDEA gives me errors.
>
> I've seen some people just delete one of the IgniteLinkTaglet files,
> this works for me too but is quite inconvenient.
> Is there any trick to this?
>


-- 
Sincerely yours,
Ivan Bessonov


Re: PDS suites fail with exit code 137

2020-07-27 Thread Ivan Bessonov
Hi Ivan P.,

I configured it for both PDS (Indexing) and PDS 4 (as requested by Nikita
Tolstunov). It totally worked: not a single 137 since then.
The occasional 130 will be fixed in [1]; it has a different problem behind it.

Now I'm trying to find someone who knows the TC configuration better and
can propagate the setting to all suites. Also, I don't have access to the
agents, so "jemalloc" is definitely not an option for me specifically.

[1] https://issues.apache.org/jira/browse/IGNITE-13266

On Sun, Jul 26, 2020 at 17:36, Ivan Pavlukhin wrote:

> Ivan B.,
>
> I noticed that you were able to configure environment variables for
> PDS (Indexing). Do field experiments show that the suggested approach
> fixes the problem?
>
> Interesting stuff with jemalloc. It might be useful to file a ticket.
>
> 2020-07-23 16:07 GMT+03:00, Ivan Daschinsky :
> >>
> >> About "jemalloc" - it's also an option, but it also requires
> >> reconfiguring
> >> suites on
> >> TC, maybe in a more complicated way. It requires additional
> installation,
> >> right?
> >> Can we stick to the solution that I already tested or should we update
> TC
> >> agents? :)
> >
> >
> > Yes, if you want to use jemalloc, you should install it and configure a
> > specific env variable.
> > This is just an option to consider, nothing more. I suppose that your
> > approach is may be the
> > best variant right now.
> >
> >
> > On Thu, Jul 23, 2020 at 15:28, Ivan Bessonov wrote:
> >
> >> >
> >> > glibc allocator uses arenas for minimize contention between threads
> >>
> >>
> >> I understand it the same way. I did testing with running of Indexing
> >> suite
> >> locally
> >> and periodically executing "pmap ", it showed that the number of
> >> 64mb
> >> arenas grows constantly and never shrinks. By the middle of the suite
> the
> >> amount
> >> of virtual memory was close to 50 Gb and used physical memory was at
> >> least
> >> 6-7 Gb, if I recall it correctly. I have only 8 cores BTW, so it should
> >> be
> >> worse on TC.
> >> It means that there is enough contention somewhere in tests.
> >>
> >> About "jemalloc" - it's also an option, but it also requires
> >> reconfiguring
> >> suites on
> >> TC, maybe in a more complicated way. It requires additional
> installation,
> >> right?
> >> Can we stick to the solution that I already tested or should we update
> TC
> >> agents? :)
> >>
> >> On Thu, Jul 23, 2020 at 15:02, Ivan Daschinsky wrote:
> >>
> >> > AFAIK, glibc allocator uses arenas for minimize contention between
> >> threads
> >> > when they trying to access
> >> > or free preallocated bit of memory. But seems that we
> >> > use -XX:+AlwaysPreTouch, so heap is allocated
> >> > and committed at start time. We allocate memory for durable memory in
> >> > one
> >> > thread.
> >> > So I think there will be not so much contention between threads for
> >> native
> >> > memory pools.
> >> >
> >> > Also, there is another approach -- try to use jemalloc.
> >> > This allocator shows better result than default glibc malloc in our
> >> > scenarios. (memory consumption) [1]
> >> >
> >> > [1] --
> >> >
> >> >
> >>
> http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/
> >> >
> >> >
> >> >
> >> > On Thu, Jul 23, 2020 at 14:19, Ivan Bessonov wrote:
> >> >
> >> > > Hello Ivan,
> >> > >
> >> > > It feels like the problem is more about new starting threads rather
> >> than
> >> > > the
> >> > > allocation of offheap regions. Plus I'd like to see results soon,
> >> > > your
> >> > > proposal is
> >> > > a major change for Ignite that can't be implemented fast enough.
> >> > >
> >> > > Anyway, I think this makes sense, considering that one day Unsafe
> >> > > will
> >> be
> >> > > removed. But I wouldn't think about it right now, maybe as a
> separate
> >> > > proposal...
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Jul 23, 2020 at 13:40, Ivan Daschinsky wrote:
> >> > >
> >> >

Re: PDS suites fail with exit code 137

2020-07-23 Thread Ivan Bessonov
>
> glibc allocator uses arenas for minimize contention between threads


I understand it the same way. I tested this by running the Indexing suite
locally and periodically executing "pmap <pid>". It showed that the number of
64 MB arenas grows constantly and never shrinks. By the middle of the suite,
the amount of virtual memory was close to 50 GB and used physical memory was
at least 6-7 GB, if I recall correctly. I have only 8 cores, BTW, so it should
be worse on TC.
It means that there is enough contention somewhere in the tests.

About "jemalloc": it's also an option, but it also requires reconfiguring
suites on TC, maybe in a more complicated way. It requires additional
installation, right?
Can we stick to the solution that I already tested, or should we update the
TC agents? :)

On Thu, Jul 23, 2020 at 15:02, Ivan Daschinsky wrote:

> AFAIK, glibc allocator uses arenas for minimize contention between threads
> when they trying to access
> or free preallocated bit of memory. But seems that we
> use -XX:+AlwaysPreTouch, so heap is allocated
> and committed at start time. We allocate memory for durable memory in one
> thread.
> So I think there will be not so much contention between threads for native
> memory pools.
>
> Also, there is another approach -- try to use jemalloc.
> This allocator shows better result than default glibc malloc in our
> scenarios. (memory consumption) [1]
>
> [1] --
>
> http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/
>
>
>
> On Thu, Jul 23, 2020 at 14:19, Ivan Bessonov wrote:
>
> > Hello Ivan,
> >
> > It feels like the problem is more about new starting threads rather than
> > the
> > allocation of offheap regions. Plus I'd like to see results soon, your
> > proposal is
> > a major change for Ignite that can't be implemented fast enough.
> >
> > Anyway, I think this makes sense, considering that one day Unsafe will be
> > removed. But I wouldn't think about it right now, maybe as a separate
> > proposal...
> >
> >
> >
> > On Thu, Jul 23, 2020 at 13:40, Ivan Daschinsky wrote:
> >
> > > Ivan, I think that we should use mmap/munmap to allocate huge chunks of
> > > memory.
> > >
> > > I've experimented with JNA and invoke mmap/munmap with it and it works
> > > fine.
> > > May be we can create module (similar to direct-io) that use mmap/munap
> on
> > > platforms, that support them
> > > and fallback to Unsafe if not?
> > >
> > > On Thu, Jul 23, 2020 at 13:31, Ivan Bessonov wrote:
> > >
> > > > Hello Igniters,
> > > >
> > > > I'd like to discuss the current issue with "out of memory" fails on
> > > > TeamCity. Particularly suites [1]
> > > > and [2], they have quite a lot of "Exit code 137" failures.
> > > >
> > > > I investigated the "PDS (Indexing)" suite under [3]. There's another
> > > > similar issue as well: [4].
> > > > I came to the conclusion that the main problem is inside the default
> > > memory
> > > > allocator (malloc).
> > > > Let me explain the way I see it right now:
> > > >
> > > > "malloc" is allowed to allocate (for internal usages) up to 8 *
> (number
> > > of
> > > > cores) blocks called
> > > > ARENA, 64 mb each. This may happen when a program creates/stops
> threads
> > > > frequently and
> > > > allocates a lot of memory all the time, which is exactly what our
> tests
> > > do.
> > > > Given that TC agents
> > > > have 32 cores, 8 * 32 * 64 mb gives 16 gigabytes, that's like the
> whole
> > > > amount of RAM on the
> > > > single agent.
> > > >
> > > > The total amount of arenas can be manually lowered by setting
> > > > the MALLOC_ARENA_MAX
> > > > environment variable to 4 (or other small value). I tried it locally
> > and
> > > in
> > > > PDS (Indexing) suite
> > > > settings on TC, results look very promising: [5]
> > > >
> > > > It is said that changing this variable may lead to some performance
> > > > degradation, but it's hard to tell whether we have it or not, because
> > the
> > > > suite usually failed before it was completed.
> > > >
> > > > So, I have two questions right now:
> > > >
> > > > - can those of you, who are into hardcore Linux and C, confirm that
> the
> > &g

Re: PDS suites fail with exit code 137

2020-07-23 Thread Ivan Bessonov
Hello Ivan,

It feels like the problem is more about newly started threads than about the
allocation of offheap regions. Plus, I'd like to see results soon, and your
proposal is a major change for Ignite that can't be implemented fast enough.

Anyway, I think this makes sense, considering that one day Unsafe will be
removed. But I wouldn't think about it right now; maybe as a separate
proposal...



On Thu, Jul 23, 2020 at 13:40, Ivan Daschinsky wrote:

> Ivan, I think that we should use mmap/munmap to allocate huge chunks of
> memory.
>
> I've experimented with JNA and invoke mmap/munmap with it and it works
> fine.
> May be we can create module (similar to direct-io) that use mmap/munap on
> platforms, that support them
> and fallback to Unsafe if not?
>
> On Thu, Jul 23, 2020 at 13:31, Ivan Bessonov wrote:
>
> > Hello Igniters,
> >
> > I'd like to discuss the current issue with "out of memory" fails on
> > TeamCity. Particularly suites [1]
> > and [2], they have quite a lot of "Exit code 137" failures.
> >
> > I investigated the "PDS (Indexing)" suite under [3]. There's another
> > similar issue as well: [4].
> > I came to the conclusion that the main problem is inside the default
> memory
> > allocator (malloc).
> > Let me explain the way I see it right now:
> >
> > "malloc" is allowed to allocate (for internal usages) up to 8 * (number
> of
> > cores) blocks called
> > ARENA, 64 mb each. This may happen when a program creates/stops threads
> > frequently and
> > allocates a lot of memory all the time, which is exactly what our tests
> do.
> > Given that TC agents
> > have 32 cores, 8 * 32 * 64 mb gives 16 gigabytes, that's like the whole
> > amount of RAM on the
> > single agent.
> >
> > The total amount of arenas can be manually lowered by setting
> > the MALLOC_ARENA_MAX
> > environment variable to 4 (or other small value). I tried it locally and
> in
> > PDS (Indexing) suite
> > settings on TC, results look very promising: [5]
> >
> > It is said that changing this variable may lead to some performance
> > degradation, but it's hard to tell whether we have it or not, because the
> > suite usually failed before it was completed.
> >
> > So, I have two questions right now:
> >
> > - can those of you, who are into hardcore Linux and C, confirm that the
> > solution can help us? Experiments show that it completely solves the
> > problem.
> > - can you please point me to a person who usually does TC maintenance?
> I'm
> > not entirely sure
> > that I can propagate this environment variable to all suites by myself,
> > which is necessary to
> > avoid occasional error 137 (resulted from the same problem) in future. I
> > just don't know all the
> > details about suites structure.
> >
> > Thank you!
> >
> > [1]
> >
> >
> https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing=buildTypeHistoryList=failed_IgniteTests24Java8=%3Cdefault%3E
> > [2]
> >
> >
> https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Pds4=buildTypeHistoryList_IgniteTests24Java8=%3Cdefault%3E=failed
> > [3] https://issues.apache.org/jira/browse/IGNITE-13266
> > [4] https://issues.apache.org/jira/browse/IGNITE-13263
> > [5]
> >
> >
> https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing=buildTypeHistoryList_IgniteTests24Java8=pull%2F8051%2Fhead
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
> >
>
>
> --
> Sincerely yours, Ivan Daschinskiy
>


-- 
Sincerely yours,
Ivan Bessonov


PDS suites fail with exit code 137

2020-07-23 Thread Ivan Bessonov
Hello Igniters,

I'd like to discuss the current issue with "out of memory" failures on
TeamCity, particularly in suites [1] and [2]: they have quite a lot of
"Exit code 137" failures.

I investigated the "PDS (Indexing)" suite under [3]. There's another
similar issue as well: [4].
I came to the conclusion that the main problem is inside the default memory
allocator (malloc).
Let me explain the way I see it right now:

"malloc" is allowed to allocate (for internal use) up to 8 * (number of
cores) blocks called arenas, 64 MB each. This may happen when a program
creates and stops threads frequently and allocates a lot of memory all the
time, which is exactly what our tests do. Given that TC agents have 32 cores,
8 * 32 * 64 MB gives 16 gigabytes, which is about the whole amount of RAM on
a single agent.
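
The arithmetic above can be written down as a tiny calculation. The constants are the ones quoted in this message (8 arenas per core, 64 MB per arena); the class and method names are hypothetical, and the result is an upper bound on what the arenas may reserve, not a measurement.

```java
public class ArenaBudget {
    /** Arena size in megabytes, as described in the message. */
    public static final long ARENA_SIZE_MB = 64;

    /** Maximum number of arenas per core, as described in the message. */
    public static final int ARENAS_PER_CORE = 8;

    /** Upper bound, in megabytes, on the memory that arenas may reserve on a machine with the given core count. */
    public static long worstCaseArenaMb(int cores) {
        return ARENAS_PER_CORE * (long)cores * ARENA_SIZE_MB;
    }

    public static void main(String[] args) {
        long mb = worstCaseArenaMb(32); // A TC agent with 32 cores.

        System.out.println(mb + " MB = " + (mb / 1024) + " GB");
    }
}
```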

The total number of arenas can be lowered manually by setting the
MALLOC_ARENA_MAX environment variable to 4 (or another small value). I tried
it locally and in the PDS (Indexing) suite settings on TC, and the results
look very promising: [5]
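
On TC the variable was set in the suite settings. For a local experiment, the same variable can be passed to a child process from Java, as in the minimal sketch below. The class name, the helper withArenaCap and the "java -version" command are placeholders chosen for illustration; only ProcessBuilder and its environment() map are standard API.

```java
import java.util.Arrays;
import java.util.List;

public class ArenaCappedLauncher {
    /** Hypothetical helper: prepares a child process that sees MALLOC_ARENA_MAX=arenaMax. */
    public static ProcessBuilder withArenaCap(List<String> command, int arenaMax) {
        ProcessBuilder pb = new ProcessBuilder(command);

        // The environment map is inherited from the current process and can be modified before start().
        pb.environment().put("MALLOC_ARENA_MAX", Integer.toString(arenaMax));

        return pb;
    }

    public static void main(String[] args) {
        // Placeholder command; a real run would launch the test JVM here and call start().
        ProcessBuilder pb = withArenaCap(Arrays.asList("java", "-version"), 4);

        System.out.println("MALLOC_ARENA_MAX=" + pb.environment().get("MALLOC_ARENA_MAX"));
    }
}
```

With the cap set to 4 and 64 MB arenas, the arenas can reserve at most 4 * 64 MB = 256 MB, regardless of the number of cores.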

It is said that changing this variable may lead to some performance
degradation, but it's hard to tell whether we have it or not, because the
suite usually failed before it was completed.

So, I have two questions right now:

- Can those of you who are into hardcore Linux and C confirm that this
solution can help us? Experiments show that it completely solves the
problem.
- Can you please point me to the person who usually does TC maintenance? I'm
not entirely sure that I can propagate this environment variable to all
suites by myself, which is necessary to avoid occasional error 137 (resulting
from the same problem) in the future. I just don't know all the details of
the suite structure.

Thank you!

[1]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing=buildTypeHistoryList=failed_IgniteTests24Java8=%3Cdefault%3E
[2]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Pds4=buildTypeHistoryList_IgniteTests24Java8=%3Cdefault%3E=failed
[3] https://issues.apache.org/jira/browse/IGNITE-13266
[4] https://issues.apache.org/jira/browse/IGNITE-13263
[5]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing=buildTypeHistoryList_IgniteTests24Java8=pull%2F8051%2Fhead

-- 
Sincerely yours,
Ivan Bessonov


Re: Re[2]: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-07-23 Thread Ivan Bessonov
Hi guys,

Regarding Denis's question: [1] ("inverse TCP connection establishment") is
already in master. I think we should port it to 2.9; that would be a good
thing.

"Serverless functions" [2] support is code complete in a private branch.
It's safe to say that the issue will be completed next week (I need to run
all tests and pass the review, which will take some time). If we're not in
a hurry, then it might be worth waiting. Are you OK with this estimate?

[1] https://issues.apache.org/jira/browse/IGNITE-12438
[2] https://issues.apache.org/jira/browse/IGNITE-13013

Wed, Jul 22, 2020 at 18:19, Denis Magda :

> Sharing a correct link for the Web Console task:
> https://issues.apache.org/jira/browse/IGNITE-13038
>
> -
> Denis
>
>
> On Wed, Jul 22, 2020 at 7:59 AM Denis Magda  wrote:
>
> > Hi Alex,
> >
> > Thanks for wrapping this up and sharing the progress.
> >
> > I've continued the discussion in the Hadoop thread. Let's take a couple
> > of days to solve all open questions. Personally, I don't see any reason
> > to put the merge off to Ignite 3.0.
> >
> > Also, I would try to deliver the following two changes in Ignite 2.9:
> >
> >- Communication SPI changes [1] and serverless functions support.
> >@Ivan Bessonov, the first is completed but not merged. The second
> >should be already solved too. Could you please shed some light on
> >this?
> >- Phasing out Web Console [3]. It's ready for the review and I believe
> >that it can be merged quickly. @Alexey Kuznetsov, could you please
> >share your thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-12438
> > [2] https://issues.apache.org/jira/browse/IGNITE-13013
> > [3] https://ggsystems.atlassian.net/browse/IGN-15304
> >
> > -
> > Denis
> >
> >
> > On Wed, Jul 22, 2020 at 12:22 AM Alex Plehanov 
> > wrote:
> >
> >> Guys,
> >>
> >> We are in code-freeze phase now. I've moved almost all non-blocker
> >> unresolved tickets from 2.9 to the next release. If you think that
> >> some ticket is a blocker and should be included into 2.9 release, please
> >> write a note in this thread.
> >>
> >> There are some tickets with "blocker" priority targeted to 2.9, some of
> >> them in "open" state and still unassigned, and I'm not sure we need all
> >> of these tickets in 2.9:
> >>
> >> IGNITE-13006 [1] (Apache Ignite spring libs upgrade from version 4x to
> >> spring 5.2 version or later) - Is it really a blocker for 2.9 release?
> >> If yes, can somebody help with resolving this ticket?
> >>
> >> IGNITE-11942 [2] (IGFS and Hadoop Accelerator Discontinuation) - ticket
> >> in "Patch available" state. There is a thread on dev-list related to
> >> this ticket ([6]), but as far as I understand we still don't have
> >> consensus about version for this patch (2.9, 2.10, 3.0).
> >>
> >> IGNITE-12489 [3] (Error during purges by expiration: Unknown page type)
> >> - perhaps issue is already resolved by some related tickets, there is
> >> still no reproducer, no additional details and no work in progress. I
> >> propose to move this ticket to the next release.
> >>
> >> IGNITE-12911 [4] (B+Tree Corrupted exception when using a key extracted
> >> from a BinaryObject value object --- and SQL enabled) - ticket in "Patch
> >> available" state, but there is no activity since May 2020. Anton
> >> Kalashnikov, Ilya Kasnacheev, do we have any updates on this ticket? Is
> >> it still in progress?
> >>
> >> IGNITE-12553 [5] ([IEP-35] public Java metric API) - since the new
> >> metrics framework is already released in 2.8 and it's still marked with
> >> @IgniteExperimental annotation, I think this ticket is not a blocker. I
> >> propose to change the ticket priority and move it to the next release.
> >>
> >>
> >> [1]: https://issues.apache.org/jira/browse/IGNITE-13006
> >> [2]: https://issues.apache.org/jira/browse/IGNITE-11942
> >> [3]: https://issues.apache.org/jira/browse/IGNITE-12489
> >> [4]: https://issues.apache.org/jira/browse/IGNITE-12911
> >> [5]: https://issues.apache.org/jira/browse/IGNITE-12553
> >> [6]: http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Complete-Discontinuation-of-IGFS-and-Hadoop-Accelerator-td42282.html
> >>
> >> Fri, Jul 17, 2020 at 11:50, Ale

[jira] [Created] (IGNITE-13266) PDS (Indexing) fails with 'Exit code 137"

2020-07-17 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13266:
--

 Summary: PDS (Indexing) fails with 'Exit code 137"
 Key: IGNITE-13266
 URL: https://issues.apache.org/jira/browse/IGNITE-13266
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_PdsIndexing_IgniteTests24Java8=%3Cdefault%3E=buildTypeHistoryList]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re[2]: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-07-15 Thread Ivan Bessonov
Guys,

can you please backport https://issues.apache.org/jira/browse/IGNITE-13246
to ignite-2.9? Alexey Kuznetsov and I really want these new events in the
release.

This time I prepared a PR with resolved conflicts:
https://github.com/apache/ignite/pull/8042

Thank you!

Tue, Jul 14, 2020 at 19:39, Zhenya Stanilovsky :

>
>
>
> Alex, I also suggest merging
> https://issues.apache.org/jira/browse/IGNITE-13229 too; it addresses a
> GridClient leak and helps prevent further OOMs on TC.
>
> >Ivan,
> >
> >It was already in release scope as discussed in this thread.
> >
> >Tue, Jul 14, 2020 at 14:31, Ivan Rakov < ivan.glu...@gmail.com >:
> >
> >> Hi,
> >>
> >> We are still waiting for a final review of Tracing functionality [1]
> >> until the end of tomorrow (July 15).
> >> We anticipate that it will be merged to Ignite master no later than July
> >> 16.
> >>
> >> Sorry for being a bit late here. Alex P., can you include [1] to the
> >> release scope?
> >>
> >> [1]:  https://issues.apache.org/jira/browse/IGNITE-13060
> >>
> >> --
> >> Best Regards,
> >> Ivan Rakov
> >>
> >> On Tue, Jul 14, 2020 at 6:16 AM Alexey Kuznetsov < akuznet...@gridgain.com >
> >> wrote:
> >>
> >>> Alex,
> >>>
> >>> Can you cherry-pick to Ignite 2.9 this issue:
> >>>  https://issues.apache.org/jira/browse/IGNITE-13246 ?
> >>>
> >>> This issue is about BASELINE events and it is very useful for
> >>> notifying external tools about changes in baseline.
> >>>
> >>> Thank you!
> >>>
> >>> ---
> >>> Alexey Kuznetsov
> >>>
> >>
>
>
>
>



-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13246) Implement EVT_BASELINE_XXX events

2020-07-13 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13246:
--

 Summary: Implement EVT_BASELINE_XXX events
 Key: IGNITE-13246
 URL: https://issues.apache.org/jira/browse/IGNITE-13246
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


In order to notify external tools we need events EVT_BASELINE_CHANGED, 
EVT_BASELINE_AUTO_ADJUST_ENABLED_CHANGED and 
EVT_BASELINE_AUTO_ADJUST_AWAITING_TIME_CHANGED to correctly update baseline 
info on UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[Check Code Style] is broken in master

2020-07-10 Thread Ivan Bessonov
Hi Igniters,

today I wanted to run tests on one of my PRs and found out that master is
broken [1].
The change that broke it was merged 4 hours ago, and the "TC Bot" visa was
presumably issued for an obsolete commit [2].

Sergey Kalashnikov, Aleksey Plekhanov, please be careful next time.

Can someone please fix it? I'm not a committer, otherwise I would do it
myself.

The fix is simple: the
org.apache.ignite.internal.processors.query.SqlNotNullKeyValueFieldTest
class needs its imports reorganized.

I guess this applies to the 2.9 release branch as well.

Thank you!

[1]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_CheckCodeStyle=buildTypeStatusDiv_IgniteTests24Java8=%3Cdefault%3E
[2] https://issues.apache.org/jira/browse/IGNITE-13142

-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13242) LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal fails

2020-07-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13242:
--

 Summary: 
LocalWalModeChangeDuringRebalancingSelfTest.testDataClearedAfterRestartWithDisabledWal
 fails
 Key: IGNITE-13242
 URL: https://issues.apache.org/jira/browse/IGNITE-13242
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-5966400795288779246=testDetails]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13235) Deadlock in IgniteServiceProcessor

2020-07-09 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13235:
--

 Summary: Deadlock in IgniteServiceProcessor
 Key: IGNITE-13235
 URL: https://issues.apache.org/jira/browse/IGNITE-13235
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


{code:java}
"main" #1 prio=5 os_prio=0 tid=0x7ff9ac00f000 nid=0x86d in Object.wait() 
[0x7ff9b418b000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.ignite.internal.util.worker.GridWorker.join(GridWorker.java:242)
    - locked <0x000776ee2028> (a java.lang.Object)
    at org.apache.ignite.internal.util.IgniteUtils.join(IgniteUtils.java:5009)
    at org.apache.ignite.internal.processors.service.ServiceDeploymentManager.stopProcessing(ServiceDeploymentManager.java:145)
    at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.stopProcessor(IgniteServiceProcessor.java:261)
    at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.onKernalStop(IgniteServiceProcessor.java:248)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2466)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2414)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2577)
    - locked <0x000776424138> (a org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2540)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:333)
    at org.apache.ignite.Ignition.stop(Ignition.java:221)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1225)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1268)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1246)
    at org.apache.ignite.events.ClusterActivationStartedEventTest.afterTest(ClusterActivationStartedEventTest.java:41)
    at org.apache.ignite.testframework.junits.GridAbstractTest.cleanUpTestEnviroment(GridAbstractTest.java:701)
    at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:2165)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$600(GridAbstractTest.java:172)
    at org.apache.ignite.testframework.junits.GridAbstractTest$2.evaluate(GridAbstractTest.java:207)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$methodStatement$1(SystemPropertiesRule.java:109)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$6/167185492.evaluate(Unknown Source)
    at org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.apache.ignite.testframework.junits.GridAbstractTest.evaluateInsideFixture(GridAbstractTest.java:2669)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$500(GridAbstractTest.java:172)
    at org.apache.ignite.testframework.junits.GridAbstractTest$BeforeFirstAndAfterLastTestRule$1.evaluate(GridAbstractTest.java:2649)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule.lambda$classStatement$0(SystemPropertiesRule.java:93)
    at org.apache.ignite.testframework.junits.SystemPropertiesRule$$Lambda$2/1879492184.evaluate(Unknown Source)
    at org.apache.ignite.testframework.junits.DelegatingJUnitStatement.evaluate(DelegatingJUnitStatement.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.junit.IdeaTestRunner$Repeat

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
Ivan,

Currently we have no requirement to keep all possible connections open.
Every node can have an arbitrary number of connections to every other node
(it's configurable with the "connectionsPerNode" setting).

Also, we can't expect that a client would magically open a connection when
we need it; that's the main issue. Changing this approach is out of scope
and I can't guarantee that it can or will be implemented this way.


Mon, Jun 29, 2020 at 17:30, Ivan Pavlukhin :

> Ivan,
>
> It seems that if a server notices that an existing connection to a
> client cannot be used anymore then the server can expect that the
> client will establish a new one. Is it just out of current iteration
> scope? Or are there still other fundamental problems?
>
> 2020-06-29 16:32 GMT+03:00, Ivan Bessonov :
> > Hi Ivan,
> >
> > sure, TCP connections are lazy. So, if a connection is not already opened
> > then node (trying to send a message) will initiate connection opening.
> > It's also possible that the opened connection is spontaneously closed for
> > some reason. Otherwise you are right, everything is as you described.
> >
> > There's also a tie breaker when two nodes connect to each other at the
> > same time. Only one of them will succeed and it depends on internal
> > discovery order, which you can't control basically.
> >
> > Mon, Jun 29, 2020 at 16:01, Ivan Pavlukhin :
> >
> >> Hi Ivan,
> >>
> >> Sorry for a possibly naive question. As I understand we are talking
> >> about order of establishing client-server connections. And I suppose
> >> that in some environments (e.g. cloud) servers cannot directly
> >> establish connections with clients. But TCP connections are
> >> bidirectional and we still can send messages in both directions. Could
> >> you please provide an example case in which servers have to initiate
> >> new connections to clients?
> >>
> >> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
> >> > Hi igniters, Hi Raymond,
> >> >
> >> > that was a really good point. I will try to address it as much as I can.
> >> >
> >> > First of all, this new mode will be configurable for now. As Val
> >> > suggested, "TcpCommunicationSpi#forceClientToServerConnections" will be a
> >> > new setting to trigger this behavior. Disabled by default.
> >> >
> >> > About issues with K8S deployments - I'm not an expert, but from what I've
> >> > heard, sometimes servers and client nodes are not in the same
> >> > environments. For example, there is an Ignite cluster and user tries to
> >> > start client node in isolated K8S pod. In this case clients cannot
> >> > properly resolve their own addresses and send it to servers, making it
> >> > impossible for servers to connect to such clients. Or, in other words,
> >> > clients are used as if they were thin.
> >> >
> >> > In your case everything is fine, clients and servers share the same
> >> > network and can resolve each other's addresses.
> >> >
> >> > Now, CQ issue [1]. You can pass a custom event filter when you register a
> >> > new continuous query. But, depending on the setup, the class of this
> >> > filter may not be in the classpath of the server node that holds the data
> >> > and invokes that filter.
> >> > There are two solutions to the problem:
> >> > - server fails to resolve class name and fails to register CQ;
> >> > - or server can have p2p deployment enabled. Let's assume that it was a
> >> > client node that requested CQ. In this case the server will try to
> >> > download "class" file directly from the node that sent the filter object
> >> > in the first place. Due to a poor design decision it will be done
> >> > synchronously while registering the query, and query registration is
> >> > happening in "discovery" thread. In normal circumstances the server will
> >> > load the class and finish query registration, it's just a little bit
> >> > slow.
> >> >
> >> > Second case is not compatible with a new
> >> > &quo

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
Hi Ivan,

sure, TCP connections are lazy. So, if a connection is not already open,
then the node that is trying to send a message will initiate one.
It's also possible that an open connection is spontaneously closed for
some reason. Otherwise you are right, everything is as you described.

There's also a tie breaker for when two nodes connect to each other at the
same time. Only one of them will succeed, and it depends on the internal
discovery order, which you basically can't control.
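
To illustrate what such a tie breaker can look like, here is a minimal sketch. The actual rule used by TcpCommunicationSpi is not described in this thread; the sketch simply assumes that the node with the lower discovery order keeps the connection it initiated. All names are hypothetical.

```java
/** Illustrative tie breaker for two nodes that connect to each other at the same time. */
public class ConnectionTieBreaker {
    /**
     * Decides whether the local node keeps the connection it initiated.
     * Assumption (not taken from Ignite): the node with the lower discovery order wins.
     */
    static boolean keepOutgoing(long localOrder, long remoteOrder) {
        if (localOrder == remoteOrder)
            throw new IllegalArgumentException("Discovery order must be unique per node");

        return localOrder < remoteOrder;
    }

    public static void main(String[] args) {
        long a = 3, b = 7;

        // Both sides evaluate the same rule, so exactly one of them keeps its outgoing connection.
        System.out.println("A keeps outgoing: " + keepOutgoing(a, b));
        System.out.println("B keeps outgoing: " + keepOutgoing(b, a));
    }
}
```

The point of the design is that both nodes reach the same decision without exchanging extra messages, because the rule depends only on values both of them already know.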

Mon, Jun 29, 2020 at 16:01, Ivan Pavlukhin :

> Hi Ivan,
>
> Sorry for a possibly naive question. As I understand we are talking
> about order of establishing client-server connections. And I suppose
> that in some environments (e.g. cloud) servers cannot directly
> establish connections with clients. But TCP connections are
> bidirectional and we still can send messages in both directions. Could
> you please provide an example case in which servers have to initiate
> new connections to clients?
>
> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov :
> > Hi igniters, Hi Raymond,
> >
> > that was a really good point. I will try to address it as much as I can.
> >
> > First of all, this new mode will be configurable for now. As Val
> > suggested, "TcpCommunicationSpi#forceClientToServerConnections" will be a
> > new setting to trigger this behavior. Disabled by default.
> >
> > About issues with K8S deployments - I'm not an expert, but from what I've
> > heard, sometimes servers and client nodes are not in the same
> > environments. For example, there is an Ignite cluster and user tries to
> > start client node in isolated K8S pod. In this case clients cannot
> > properly resolve their own addresses and send it to servers, making it
> > impossible for servers to connect to such clients. Or, in other words,
> > clients are used as if they were thin.
> >
> > In your case everything is fine, clients and servers share the same
> > network and can resolve each other's addresses.
> >
> > Now, CQ issue [1]. You can pass a custom event filter when you register a
> > new continuous query. But, depending on the setup, the class of this
> > filter may not be in the classpath of the server node that holds the data
> > and invokes that filter.
> > There are two solutions to the problem:
> > - server fails to resolve class name and fails to register CQ;
> > - or server can have p2p deployment enabled. Let's assume that it was a
> > client node that requested CQ. In this case the server will try to
> > download "class" file directly from the node that sent the filter object
> > in the first place. Due to a poor design decision it will be done
> > synchronously while registering the query, and query registration is
> > happening in "discovery" thread. In normal circumstances the server will
> > load the class and finish query registration, it's just a little bit
> > slow.
> >
> > Second case is not compatible with a new "forceClientToServerConnections"
> > setting. I'm not sure that I need to go into all technical details, but
> > the result of such procedure is a cluster that cannot process any
> > discovery messages during TCP connection timeout, we're talking about
> > tens of seconds or maybe even several minutes depending on the settings
> > and the environment. All this time the server will be in a "deadlock"
> > state inside of the "discovery" thread. It means that some cluster
> > operations will be unavailable during this period, like new node joining
> > or starting a new cache. Node failures will not be processed properly as
> > well. For me it's hard to predict real behavior until we reproduce the
> > situation in a live environment. I saw this in tests only.
> >
> > I hope that my message clarifies the situation, or at least doesn't cause
> > more confusion. These changes will not affect your infrastructure or your
> > Ignite installations, they are aimed at adding more flexibility to other
> > ways of using Ignite.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-13156
> >
> >
> >
> > Sat, Jun 27, 2020 at 09:54, Raymond Wilson  >:
> >
> >> I have just caught up with this discussion and wanted to outline a set
> of
> >> use
> >> cases we have that rely on server nodes communicating with client nodes.
> >>
> >> Firstly, I'd like to conf

Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-29 Thread Ivan Bessonov
 just doesn't work on K8s, which
> does
> not agree with our experience of it working. I'd also like to understand
> better the bounds of the issue with CQ: When does it not work and what are
> the symptoms we would see if there was an issue with the way we are using
> it, or the K8s infrastructure we deploy to?
>
> Thanks,
> Raymond.
>
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>


-- 
Sincerely yours,
Ivan Bessonov


Re: [DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-06-16 Thread Ivan Bessonov
our code for
> > more
> > > communication connection failures.
> > >
> > > > So to me the least risky decision is not to delete new configuration
> > > > but leave it with experimental status. If we find out that direct
> > > > request (server -> router server -> target client) implementation
> > > > works well and doesn't bring much complexity in failover scenarios
> > > > we'll remove that configuration and prohibit servers to open
> > > > connections to clients by default.
> > > >
> > > > Side note: there are rare but yet possible scenarios where client node
> > > > needs to open communication connection to other client node. If we let
> > > > clients not to publish their addresses these scenarios will stop
> > > > working without additional logic like sending data through router
> > > > node. As far as I know client-client connectivity is involved in p2p
> > > > class deployment scenarios, does anyone know about other cases?
> > >
> > > --
> > > Thanks,
> > > Sergey Chugunov
> > >
> > > On Wed, Jun 3, 2020 at 5:37 PM Denis Magda  wrote:
> > >
> > > > Ivan,
> > > >
> > > > > It feels like Val is driving us in the right direction. Is there any
> > > > > reason for keeping the current logic when servers can open
> > > > > connections to clients?
> > > >
> > > > -
> > > > Denis
> > > >
> > > >
> > > > On Thu, May 21, 2020 at 4:48 PM Valentin Kulichenko <
> > > > valentin.kuliche...@gmail.com> wrote:
> > > >
> > > > > Ivan,
> > > > >
> > > > > > Have you considered eliminating server to client connections
> > > > > > altogether? Or, at the very least making the "client to server
> > > > > > only" mode the default one?
> > > > > >
> > > > > > All the suggested names are confusing and not intuitive, and I
> > > > > > doubt we will be able to find a good one. A server initiating a
> > > > > > TCP connection with a client is confusing in the first place and
> > > > > > creates a usability issue. We now want to solve it by introducing
> > > > > > an additional configuration parameter, and therefore additional
> > > > > > complexity. I don't think this is the right approach.
> > > > > >
> > > > > > What are the drawbacks of permanently switching to
> > > > > > client-to-server connections? Is there any value provided by the
> > > > > > server-to-client option?
> > > > > >
> > > > > > As for pair connections, I'm not sure I understand why there is a
> > > > > > limitation. As far as I know, the idea behind this feature is that
> > > > > > we maintain two connections between two nodes instead of one, so
> > > > > > that every connection is used for communication in a single
> > > > > > direction only. Why does it matter which node initiates the
> > > > > > connection? Why can't one of the nodes (e.g., a client) initiate
> > > > > > both connections, and then use them accordingly? Correct me if I'm
> > > > > > wrong, but I don't see why we can't do this.
> > > > > -Val
> > > > >
> > > > > > On Thu, May 21, 2020 at 1:58 PM Denis Magda wrote:
> > > > >
> > > > > > Ivan,
> > > > > >
> > > > > > > Considering that the setting controls the way a communication SPI
> > > > > > > connection is open I would add the new parameter to
> > > > > > > CommunicationSpi interface naming it as follows:
> > > > > > >
> > > > > > > >
> > > > > > > > CommunicationSpi.connectionInitiationMode
> > > > > > > > {
> > > > > > > >     BIDIRECTIONAL,     // both clients and servers initiate a connection initiation procedure
> > > > > > > >     CLIENTS_TO_SERVERS // servers cannot open a connection to clients, only clients can do that
> > > > > > > > }
> > > > > >
> > > > > >
> > > > > > The problem with

[jira] [Created] (IGNITE-13156) Continuous query filter deployment hungs discovery thread

2020-06-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13156:
--

 Summary: Continuous query filter deployment hungs discovery thread
 Key: IGNITE-13156
 URL: https://issues.apache.org/jira/browse/IGNITE-13156
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov


Continuous query starts with a custom discovery event. The handler of the event
is executed in the discovery thread synchronously. Even worse is the fact that
the message itself is mutable and it blocks the ring.

Inside the handler there is a p2p resource request to another node, which
can be pretty time consuming. And after
https://issues.apache.org/jira/browse/IGNITE-12438 or similar tasks this could
even lead to a deadlock.

All IO operations must be removed from discovery handlers.
{code:java}
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2099)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2231)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentCommunication.sendResourceRequest(GridDeploymentCommunication.java:456)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendResourceRequest(GridDeploymentClassLoader.java:793)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.getResourceAsStreamEx(GridDeploymentClassLoader.java:745)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.checkLoadRemoteClass(GridDeploymentPerVersionStore.java:729)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore.getDeployment(GridDeploymentPerVersionStore.java:314)
    at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getGlobalDeployment(GridDeploymentManager.java:498)
    at org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:416)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1423)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:670)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:533)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2635)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2673)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
{code}
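
The general pattern behind "no IO in discovery handlers" can be sketched as follows. This is not the Ignite fix and does not use Ignite classes: the single-threaded executor merely stands in for the discovery thread, and all names are hypothetical. The handler does only cheap bookkeeping on that thread and hands the slow part (for example, remote class loading) to a separate pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Illustrative pattern: keep a single-threaded event loop free of blocking IO. */
public class OffloadingHandler {
    /** Stands in for the discovery thread: it must never block. */
    private final ExecutorService discovery = Executors.newSingleThreadExecutor();

    /** Separate pool for slow work such as remote class loading. */
    private final ExecutorService io = Executors.newCachedThreadPool();

    /**
     * Handles an event on the "discovery" thread, but runs the blocking part
     * (slowLoad) on the IO pool, so that later discovery events are not delayed.
     */
    CompletableFuture<String> onCustomEvent(Supplier<String> slowLoad) {
        return CompletableFuture
            .supplyAsync(() -> "accepted", discovery) // Cheap bookkeeping only.
            .thenCompose(ack -> CompletableFuture.supplyAsync(slowLoad, io));
    }

    /** Stops both executors. */
    void shutdown() {
        discovery.shutdown();
        io.shutdown();
    }

    public static void main(String[] args) {
        OffloadingHandler h = new OffloadingHandler();

        try {
            String res = h.onCustomEvent(() -> "filter class loaded").join();

            System.out.println(res);
        }
        finally {
            h.shutdown();
        }
    }
}
```

The discovery stand-in only submits the slow task and returns immediately, so a stalled resource request can delay the one query registration that needs it without blocking every other discovery message.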



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-06-10 Thread Ivan Bessonov
Hello,

Sorry for the delay. Sergey Chugunov (sergey.chugu...@gmail.com) just
replied to the main conversation about "communication via discovery" [1].
We work on it together and have recently found one hard-to-fix scenario; a
detailed description is provided in Sergey's reply.

In short, July 10th looks realistic only if we introduce the new behavior
in its current implementation, with a new setting and IgniteExperimental
status. The blocker here is the current implementation of the Continuous
Query protocol, which in some cases (described at the end) initiates a TCP
connection right from the discovery thread, which obviously leads to a
deadlock. We haven't estimated the effort needed to redesign the CQ
protocol, but it is definitely a risk, and fixing it isn't feasible with a
code freeze on July 10th.
So my verdict: we can include this new feature in the 2.9 scope as
experimental and with a highlighted limitation on CQ usage. Is that OK?

CQ limitation: the server needs to open a communication connection to the
client if, during CQ registration, the client tries to p2p deploy a new
class that is not available on the server classpath.
In other cases registration of CQ should be fine.

[1]
http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-New-Ignite-settings-for-IGNITE-12438-and-IGNITE-13013-td47586.html

Tue, Jun 9, 2020 at 19:36, Ivan Rakov :

> Hi,
>
> Indeed, the tracing feature is almost ready. Discovery, communication and
> transactions tracing will be introduced, as well as an option to configure
> tracing in runtime. Right now we are working on final performance
> optimizations, but it's very likely that we'll complete this activity
> before the code freeze date.
> Let's include tracing to the 2.9 release scope.
>
> More info:
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-48%3A+Tracing
> https://issues.apache.org/jira/browse/IGNITE-13060
>
> --
> Best Regards,
> Ivan Rakov
>
> On Sat, Jun 6, 2020 at 4:30 PM Denis Magda  wrote:
>
> > Hi folks,
> >
> > The timelines proposed by Alex Plekhanov sounds reasonable to me. I'd
> > like only to hear inputs of @Ivan Rakov, who is about to finish with the
> > tracing support, and @Ivan Bessonov, who is fixing a serious limitation
> > for K8 deployments [1]. Most likely, both features will be ready by the
> > code freeze date (July 10), but the guys should know it better.
> >
> > [1]
> > http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-New-Ignite-settings-for-IGNITE-12438-and-IGNITE-13013-td47586.html
> >
> > -
> > Denis
> >
> >
> > On Wed, Jun 3, 2020 at 4:45 AM Alex Plehanov 
> > wrote:
> >
> >> Hello Igniters,
> >>
> >> AI 2.8.1 is finally released and, as we discussed here [1], it's time to
> >> start the discussion about the 2.9 release.
> >>
> >> I want to propose myself to be the release manager of the 2.9 release.
> >>
> >> As for the release time, I agree with Maxim that we should deliver
> >> features as frequently as possible. If some feature doesn't fit into the
> >> release dates, we should include it in the next release and schedule
> >> that release earlier rather than postpone the current one.
> >>
> >> I propose the following dates for 2.9 release:
> >>
> >> Scope Freeze: June 26, 2020
> >> Code Freeze: July 10, 2020
> >> Voting Date: July 31, 2020
> >> Release Date: August 7, 2020
> >>
> >> WDYT?
> >>
> >> [1] :
> >>
> >>
> http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Releases-Plan-td47360.html#a47575
> >>
> >
>


-- 
Sincerely yours,
Ivan Bessonov


Re: [DISCUSSION] IEP-47 Native persistence defragmentation

2020-06-02 Thread Ivan Bessonov
Hello Anton,

I'd like to address your last message. First of all, it was already
partially discussed in this thread: [1]. To reiterate: the expected
performance degradation will be significant. There's no way we can throttle
it, because the free/reuse lists would have to be kept sorted all the time,
and these are highly optimized data structures.

More than that, "dummy" updates clash with data access, which is a very
dangerous thing to do. And these updates don't save you from the situation
when the last pages in the file are not data pages but tree pages, for
example. Those are much harder to move: not only must you update all links
to them, you must also do it efficiently, without blocking the tree too
much. I can think of many other examples.
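
To make the cost argument concrete, here is a minimal Java sketch that
contrasts an unordered free list with one that always hands out the page
closest to the file beginning. The class names (LifoFreeList, SortedFreeList)
are hypothetical, and the sketch only models page indexes in memory; it is
not how Ignite's free/reuse lists are actually implemented.

```java
import java.util.ArrayDeque;
import java.util.TreeSet;

/** Illustration only: two ways to hand out free page indexes. */
final class FreeListSketch {
    /** Unordered list: O(1) operations, returns the most recently freed page. */
    static final class LifoFreeList {
        private final ArrayDeque<Long> pages = new ArrayDeque<>();

        void release(long pageIdx) {
            pages.push(pageIdx);
        }

        long take() {
            return pages.pop();
        }
    }

    /** Sorted list: always returns the lowest page index, at O(log n) per operation. */
    static final class SortedFreeList {
        private final TreeSet<Long> pages = new TreeSet<>();

        void release(long pageIdx) {
            pages.add(pageIdx);
        }

        long take() {
            // pollFirst() removes and returns the smallest element.
            return pages.pollFirst();
        }
    }

    public static void main(String[] args) {
        LifoFreeList lifo = new LifoFreeList();
        SortedFreeList sorted = new SortedFreeList();

        for (long idx : new long[] {7, 2, 5}) {
            lifo.release(idx);
            sorted.release(idx);
        }

        System.out.println("LIFO list returns page " + lifo.take());
        System.out.println("Sorted list returns page " + sorted.take());
    }
}
```

Even in this toy form the sorted variant pays an ordering cost on every
release and take, and a persistent, concurrent version would have to keep
that ordering on disk as well.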

*Easy to implement/understand*
 - no, it's not easy at all; defragmentation under load is very challenging
   to implement.

*Why are we going to implement distributed system defragmentation in the old
(offline) way?*
 - because it's easier and safer, and it won't introduce any performance
   degradation.

[1]
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html

On Tue, Jun 2, 2020 at 14:17, Anton Vinogradov wrote:

> Folks,
>
> A modern OS never asks you to schedule defragmentation and turn your PC
> off; it performs it while you're browsing.
> Why are we going to implement distributed system defragmentation in the old
> (offline) way?
>
> All you need is to implement free/reuse-list sorting. They should provide
> pages closest to the file beginning.
> So, every insert/update will automatically defragment the entry.
> Also, a special process should iterate over the partitions in a reverse way
> just performing dummy updates.
> The partition file may be safely truncated after the iterator.
>
> Pros:
> - Your system is still operating (no downtime)
> - Defragmentation can be performed partially
> - Defragmentation can be scheduled to periods of inactivity or performed on
> a regular basis
> - SQL will not be broken (no reason to recalculate the whole index, it will
> be recalculated in a regular way on every entry update)
> - Topology changes allowed
> - Easy to implement/understand
>
> Cons:
> - Performance degradation (solvable by throttling)
>
> On Mon, Jun 1, 2020 at 4:04 PM Sergey Chugunov 
> wrote:
>
> > Hi Ivan,
> >
> > I have an idea about suggested maintenance mode.
> >
> > First of all, I agree with your ideas about discovery restrictions: node
> > should not join topology when performing defragmentation.
> >
> > At the same time, I haven't heard of any requests for this mode from
> > users, so we don't know much about possible requirements.
> > So I suggest implementing it in a pragmatic way: instead of inventing
> > user scenarios (unknown in reality), let's develop minimal yet
> > well-designed functionality that suits our case. If we constrain our
> > implementation with a reasonable set of restrictions, that's OK.
> >
> > So my idea is the following: to move a node to maintenance mode, the user
> > has to send a special command to the node (e.g. with a new command in the
> > control.sh utility or via the JMX interface). The node saves the
> > maintenance request in local metastorage and waits for a restart. The user
> > has to restart that node manually in order to finish moving it to
> > maintenance mode.
> >
> > When the node restarts and finds the maintenance request, it creates a
> > special type of discovery SPI that will not try to join the topology at
> > all, yet the node is able to start all necessary components and APIs such
> > as the REST processor or the JMX interface.
> >
> > When in maintenance, we'll be able to do defragmentation safely and
> > remove the maintenance request from metastorage only when it is completed
> > (with all fault-tolerance logic in mind).
> >
> > As we don't have a mechanism (like a watcher) to perform a "safe restart"
> > (by safe I mean an Ignite restart without an OS-level process restart), we
> > cannot finish maintenance mode without another manual restart, but I think
> > it is a reasonable restriction, as maintenance mode shouldn't be an
> > everyday routine and will be used quite rarely.
> >
> > What do you think about it?
> >
> > On Tue, May 26, 2020 at 5:58 PM Ivan Bessonov 
> > wrote:
> >
> > > Hello Igniters,
> > >
> > > I'd like to discuss this new IEP with you: [1]. The main idea is to
> have
> > a
> > > procedure that relocates
> > > pages to the top of the file as compact as possible which allows us to
> > > trim the file and increase its
> > > fill-factor

[DISCUSSION] IEP-47 Native persistence defragmentation

2020-05-26 Thread Ivan Bessonov
Hello Igniters,

I'd like to discuss this new IEP with you: [1]. The main idea is to have a
procedure that relocates pages to the top of the file as compactly as
possible, which allows us to trim the file and increase its fill factor. It
will be configured manually and executed after the restart, but before the
node joins the topology (meaning that no load is possible during
defragmentation). It is described in detail in the IEP itself, please read
it. This topic was also previously discussed here on the dev list in [2].
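
As a rough illustration of the relocate-then-trim idea, here is a minimal
Java sketch. It models a partition file as an array of page slots (null means
a free slot) and is only an assumption-laden toy: the names CompactionSketch
and compact are hypothetical, and the real procedure in the IEP works on
offheap pages, links and indexes rather than on strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: compacting live pages towards the beginning of a file. */
final class CompactionSketch {
    /**
     * Moves live pages to the lowest free slots, preserving their order.
     *
     * @param slots Page slots of the file; {@code null} marks a free slot.
     * @param relocations Receives old index -> new index for every moved page,
     *      which is what a real implementation would use to update links.
     * @return Number of live pages, i.e. the file length in pages after trimming.
     */
    static int compact(String[] slots, Map<Integer, Integer> relocations) {
        int write = 0;

        for (int read = 0; read < slots.length; read++) {
            if (slots[read] == null)
                continue;

            if (read != write) {
                slots[write] = slots[read];
                slots[read] = null;

                relocations.put(read, write);
            }

            write++;
        }

        return write;
    }

    public static void main(String[] args) {
        String[] file = {"A", null, "B", null, null, "C"};
        Map<Integer, Integer> moved = new HashMap<>();

        int newLen = compact(file, moved);

        System.out.println("Pages before: " + file.length + ", after trim: " + newLen);
        System.out.println("Relocations (old -> new): " + moved);
    }
}
```

Everything after the returned length is free, so the file can be truncated
there; the relocation map hints at why tree pages are harder to move than
data pages, since every link to a moved page has to be rewritten.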

Here I would like to list a few points that are not yet clear and require
your opinion.

 - what are your overall thoughts on the IEP? Any concerns?

 - maintenance mode - how do we communicate with a node that's not in the
   topology? What are the options? As far as I know, we currently have no
   tools like this.

 - checkpointer refactoring - these changes will involve intensive writing
   of pages to the storage. If we're going to reuse the offheap page model
   to perform defragmentation, then the checkpointing mechanism will have to
   be adapted in some form. Are you fine with this? Or do we need a separate
   discussion?

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-47%3A+Native+persistence+defragmentation
[2]
http://apache-ignite-developers.2346864.n4.nabble.com/How-to-free-up-space-on-disc-after-removing-entries-from-IgniteCache-with-enabled-PDS-td39839.html


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13062) DistributedMetaStoragePersistentTest.testJoinNodeWithLongerHistory failed

2020-05-22 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13062:
--

 Summary: DistributedMetaStoragePersistentTest.testJoinNodeWithLongerHistory failed
 Key: IGNITE-13062
 URL: https://issues.apache.org/jira/browse/IGNITE-13062
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


The reason is a race between the transition future and the cluster state that
this future modifies.
{code:java}
java.lang.AssertionError
    at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveStateAsync(GridClusterStateProcessor.java:333)
    at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:295)
    at org.apache.ignite.internal.IgniteKernal.checkClusterState(IgniteKernal.java:4074)
    at org.apache.ignite.internal.IgniteKernal.internalCache(IgniteKernal.java:2692)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest$1.call(GridCommonAbstractTest.java:398)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest$1.call(GridCommonAbstractTest.java:394)
    at org.apache.ignite.testframework.junits.GridAbstractTest.executeOnLocalOrRemoteJvm(GridAbstractTest.java:2036)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.nearEnabled(GridCommonAbstractTest.java:393)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.dht(GridCommonAbstractTest.java:337)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.awaitPartitionMapExchange(GridCommonAbstractTest.java:775)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.awaitPartitionMapExchange(GridCommonAbstractTest.java:577)
    at org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.awaitPartitionMapExchange(GridCommonAbstractTest.java:562)
    at org.apache.ignite.internal.processors.metastorage.DistributedMetaStoragePersistentTest.testJoinNodeWithLongerHistory(DistributedMetaStoragePersistentTest.java:179)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2127)
    at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13058) GridCommandHandlerTest.testKillHangingRemoteTransactions failed

2020-05-22 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13058:
--

 Summary: GridCommandHandlerTest.testKillHangingRemoteTransactions failed
 Key: IGNITE-13058
 URL: https://issues.apache.org/jira/browse/IGNITE-13058
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


The test may fail if not all clients have completed the local cache start routine.
{code:java}
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertNotNull(Assert.java:712)
    at org.junit.Assert.assertNotNull(Assert.java:722)
    at org.apache.ignite.testframework.junits.JUnitAssertAware.assertNotNull(JUnitAssertAware.java:178)
    at org.apache.ignite.util.GridCommandHandlerTest.testKillHangingRemoteTransactions(GridCommandHandlerTest.java:1044)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2127)
    at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSSION] New Ignite settings for IGNITE-12438 and IGNITE-13013

2020-05-21 Thread Ivan Bessonov
Hello Igniters,

I'd like to discuss with you changes related to [1] and [2]. Both issues
are mostly the same so
let's discuss the core idea.

*Motivation.*

There are certain environments that don't allow Ignite server nodes to open
TCP connections to
thick clients, e.g. K8S, AWS Lambda or Azure Functions. To operate in such
environments, the
server needs a way to request a client to open an "inverse" communication
connection to it.

I've prepared a PR (still in progress) that introduces a new mechanism for
opening connections and the related configuration.

*Main idea*

This mechanism is called "communication via discovery" or "inverse
connection". It works as follows:
 - a server that needs to connect to an "unreachable" thick client sends a
   specific Discovery message (inverse communication request) to that client;
 - the client node, upon receiving the request, opens a communication
   connection to that server;
 - the server sees the connection opened by the client and proceeds with its
   task (the one that required opening a connection to the client).
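
The three steps above can be sketched in plain Java as follows. This is a
simplified in-memory model, not the code from the PR: all class and method
names (InverseConnectionRequest, requestInverseConnection, and so on) are
hypothetical, and a direct method call stands in for the discovery ring.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Illustration only: in-memory model of the "inverse connection" handshake. */
final class InverseConnectionSketch {
    /** Hypothetical discovery message asking a client to connect back to a server. */
    static final class InverseConnectionRequest {
        final UUID serverId;

        InverseConnectionRequest(UUID serverId) {
            this.serverId = serverId;
        }
    }

    static final class Server {
        final UUID id = UUID.randomUUID();

        /** Clients that currently have an open communication connection to this server. */
        private final Set<UUID> connectedClients = new HashSet<>();

        /** Step 1: the client is unreachable, so ask it (via discovery) to connect. */
        void requestInverseConnection(Client client) {
            client.onDiscoveryMessage(new InverseConnectionRequest(id), this);
        }

        /** Step 3: the server sees the connection opened by the client. */
        void onConnectionAccepted(UUID clientId) {
            connectedClients.add(clientId);
        }

        boolean canSendTo(Client client) {
            return connectedClients.contains(client.id);
        }
    }

    static final class Client {
        final UUID id = UUID.randomUUID();

        /** Step 2: upon receiving the request, open a connection to that server. */
        void onDiscoveryMessage(InverseConnectionRequest req, Server server) {
            if (req.serverId.equals(server.id))
                server.onConnectionAccepted(id);
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        Client client = new Client();

        System.out.println("Connected before request: " + server.canSendTo(client));

        server.requestInverseConnection(client);

        System.out.println("Connected after request: " + server.canSendTo(client));
    }
}
```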

The working name of the new configuration parameter for this feature is
environmentType. It is an enum with two values (again, working names):
STANDALONE (default) and VIRTUALIZED.
It is used as a hint to the server to speed up connection establishment:
when the server sees a client with the VIRTUALIZED environmentType, it
doesn't try to open a connection to it and sends an inverse communication
request right away.
If environmentType is STANDALONE, the server tries to open a connection in
the regular way (iterating over all client addresses) and sends the request
only if all attempts have failed.
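
The decision logic described above could look roughly like this in Java. The
enum values are the working names from this message; everything else
(chooseAction, the Action enum, the tryConnect callback) is a hypothetical
stand-in for the real communication SPI code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Illustration only: how a server might use the environmentType hint. */
final class EnvironmentTypeSketch {
    /** Working names from the proposal. */
    enum EnvironmentType { STANDALONE, VIRTUALIZED }

    /** Hypothetical outcome of the server's decision. */
    enum Action { DIRECT_CONNECTION, INVERSE_REQUEST }

    /**
     * @param clientType Environment type reported by the client.
     * @param clientAddrs All known addresses of the client.
     * @param tryConnect Attempts a direct connection, returns {@code true} on success.
     */
    static Action chooseAction(
        EnvironmentType clientType,
        List<String> clientAddrs,
        Predicate<String> tryConnect
    ) {
        // VIRTUALIZED: don't waste time on direct attempts.
        if (clientType == EnvironmentType.VIRTUALIZED)
            return Action.INVERSE_REQUEST;

        // STANDALONE: regular way first, iterating over all client addresses.
        for (String addr : clientAddrs) {
            if (tryConnect.test(addr))
                return Action.DIRECT_CONNECTION;
        }

        // All direct attempts failed: fall back to the inverse request.
        return Action.INVERSE_REQUEST;
    }

    public static void main(String[] args) {
        List<String> addrs = Arrays.asList("10.0.0.1:47100", "10.0.0.2:47100");

        System.out.println(chooseAction(EnvironmentType.VIRTUALIZED, addrs, a -> true));
        System.out.println(chooseAction(EnvironmentType.STANDALONE, addrs, a -> false));
    }
}
```

The sketch also shows why the hint matters: for a STANDALONE client behind
NAT the server still ends up with an inverse request, but only after paying
for a failed attempt on every address.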

There is a concern about the naming of the configuration parameter, as it
captures only one use case: when we deal with some kind of virtualized
environment like K8S.
Here are other options that came up in private discussions:
- connectionMode - ALWAYS_INITIATE / INITIATE_OR_ACCEPT
- networkConnectivity - REACHABLE_ALWAYS / REACHABLE_ONE_WAY
- communicationViaDiscovery - ALWAYS / FALLBACK
- isReachableFromAllNodes (true/false flag)
- initiateConnectionsOnThisNode (true/false flag).

*Limitations*

The feature cannot be used along with the pairedConnection setting, as this
setting implies establishing connections in both directions. Also, the
current implementation supports opening only client-to-server connections.
Other types of connections, like client-to-client or server-to-server, will
be implemented in separate tickets.

[1] https://issues.apache.org/jira/browse/IGNITE-12438
[2] https://issues.apache.org/jira/browse/IGNITE-13013

-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-13050) ClusterGroup that is recomputed on topology change

2020-05-21 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13050:
--

 Summary: ClusterGroup that is recomputed on topology change
 Key: IGNITE-13050
 URL: https://issues.apache.org/jira/browse/IGNITE-13050
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Currently, ClusterGroup comes in two flavors:
one is a static set of UUIDs which will not change; the other is a predicate
that is recomputed over ALL nodes on EVERY operation. This has bitten our
client, because recomputing the ClusterGroup happens in the tcp-communication
thread, clogging it and delaying every operation in the cluster. This is a
major problem.

It would be nice to have a ClusterGroup with a predicate that is recomputed
once per topology affinity change. Bonus points if it precisely tracks the
current topology with zero delay or overrun.

It would also be nice to upgrade the firstNode/lastNode predicates to that
mechanism, since they are currently static: the topology changes but the
firstNode/lastNode projections don't, so they may point to an absent node.
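
A minimal sketch of the requested behavior, under the assumption that every
topology change is identified by a monotonically growing version number. The
class CachedGroupSketch and its methods are hypothetical and do not use the
real ClusterGroup API; nodes are modeled as arbitrary objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Illustration only: a predicate-based group recomputed once per topology version. */
final class CachedGroupSketch<N> {
    private final Predicate<N> filter;

    /** Topology version the cached result was computed for. */
    private long cachedTopVer = -1;

    private List<N> cachedNodes = Collections.emptyList();

    /** Exposed only to show how often the predicate pass actually runs. */
    int recomputations;

    CachedGroupSketch(Predicate<N> filter) {
        this.filter = filter;
    }

    /** Returns the filtered nodes, re-evaluating the predicate only on a new version. */
    synchronized List<N> nodes(long topVer, Collection<N> allNodes) {
        if (topVer != cachedTopVer) {
            List<N> res = new ArrayList<>();

            for (N node : allNodes) {
                if (filter.test(node))
                    res.add(node);
            }

            cachedNodes = Collections.unmodifiableList(res);
            cachedTopVer = topVer;
            recomputations++;
        }

        return cachedNodes;
    }

    public static void main(String[] args) {
        CachedGroupSketch<String> servers = new CachedGroupSketch<>(n -> n.startsWith("srv"));

        List<String> top1 = Arrays.asList("srv-1", "client-1");

        servers.nodes(1, top1);
        servers.nodes(1, top1); // Same version: served from the cache.

        List<String> top2 = Arrays.asList("srv-1", "srv-2", "client-1");

        System.out.println(servers.nodes(2, top2));
        System.out.println("Recomputations: " + servers.recomputations);
    }
}
```

With this shape the per-operation cost in the communication thread is a
version comparison instead of a pass over all nodes.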



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13048) WAL FSYNC mode doesn't work with disabled archiver

2020-05-21 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-13048:
--

 Summary: WAL FSYNC mode doesn't work with disabled archiver
 Key: IGNITE-13048
 URL: https://issues.apache.org/jira/browse/IGNITE-13048
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
Caused by: org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize WAL log segment (WAL segment size change is not supported in 'DEFAULT' WAL mode) [filePath=/home/vsisko/gridgain/backend-work/work/db/wal/web_console_data/0001.wal, fileSize=24313258, configSize=10]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkFiles(FileWriteAheadLogManager.java:2427) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkOrPrepareFiles(FileWriteAheadLogManager.java:1404) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.start0(FileWriteAheadLogManager.java:435) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:60) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:841) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1717) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1020) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2039) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1731) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1157) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:677) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:602) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.Ignition.start(Ignition.java:322) ~[ignite-core-8.7.5.jar:8.7.5]
    at org.apache.ignite.console.config.GridConfiguration.igniteInstance(GridConfiguration.java:38) ~[classes/:?]
    at org.apache.ignite.console.config.GridConfiguration$$EnhancerBySpringCGLIB$$b50da981.CGLIB$igniteInstance$0() ~[classes/:?]
    at org.apache.ignite.console.config.GridConfiguration$$EnhancerBySpringCGLIB$$b50da981$$FastClassBySpringCGLIB$$d486ae88.invoke() ~[classes/:?]
    at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228) ~[spring-core-4.3.23.RELEASE.jar:4.3.23.RELEASE]
    at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:358) ~[spring-context-4.3.23.RELEASE.jar:4.3.23.RELEASE]
    at org.apache.ignite.console.config.GridConfiguration$$EnhancerBySpringCGLIB$$b50da981.igniteInstance() ~[classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_222]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_222]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_222]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_222]
    at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:162) ~[spring-beans-4.3.23.RELEASE.jar:4.3.23.RELEASE]
    at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:588) ~[spring-beans-4.3.23.RELEASE.jar:4.3.23.RELEASE]
    ... 83 more
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Can I get write permissions for Apache Wiki?

2020-05-15 Thread Ivan Bessonov
Hello Igniters,

I'm planning to describe a new IEP about Native Persistence Defragmentation
(as it's called in [1]) but have no permissions to do so. Could someone
please give me access?

IEP discussion will start shortly after. Thank you!

[1] https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+Roadmap

-- 
Sincerely yours,
Ivan Bessonov


Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-22 Thread Ivan Bessonov
Hi Igniters,

I'm continuing with the IGNITE-12756 PR. It turned out that we need 3 more
cherry-picks to avoid massive code changes. Tests started failing after my
initial attempt to create that PR.

So, in total I have a PR with 6 commits in it. Some of them fix components
and their tests; others are required for the new tests to compile and run
without changes in the 2.8.1 branch.
RunAll is in progress now, and I'll reply with news if I have any.

Andrey Gura, Nikolay Izhikov, are you OK with 3 more commits in the scope?
If not, I won't be able to port IGNITE-12756 to 2.8.1 properly by myself;
I guess it would be easier to reimplement it from scratch.

[1] https://issues.apache.org/jira/browse/IGNITE-12735
[2] https://issues.apache.org/jira/browse/IGNITE-12568
[3] https://issues.apache.org/jira/browse/IGNITE-12756
[4] https://issues.apache.org/jira/browse/IGNITE-12285
[5] https://issues.apache.org/jira/browse/IGNITE-12668
[6] https://issues.apache.org/jira/browse/IGNITE-12682
[7] https://github.com/apache/ignite/pull/7708


On Tue, Apr 21, 2020 at 19:36, Pavel Pereslegin wrote:

> Hello, Nikolay.
>
> This has been fixed by increasing the test timeout in IGNITE-12683 [1][2].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12683
> [2]
> https://github.com/apache/ignite/commit/bf394a77e1de6432e493eb2818243a82b577f11e
>
> On Tue, Apr 21, 2020 at 18:28, Nikolay Izhikov wrote:
> >
> > Hello, Igniters.
> >
> > While cherry-picking to 2.8.1 I found flaky tests regarding BPlusTree
> > [1], [2].
> >
> > All failures of these tests relate to PRs based on 2.8.
> > It seems we have a fix in master but don't have it in 2.8 and 2.8.1.
> >
> > Can someone suggest which commit fixes these tests?
> >
> > [1]
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8==testDetails=-7895536196794411367=TEST_STATUS_DESC_IgniteTests24Java8=__all_branches__=50
> >
> > [2]
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8==testDetails=-3485591522651012009=TEST_STATUS_DESC_IgniteTests24Java8=__all_branches__=50
> >
> >
> > > On Apr 21, 2020, at 17:49, Ivan Bessonov wrote:
> > >
> > > Sure, here's PR with 3 cherry-picked commits that I mentioned:
> > > https://github.com/apache/ignite/pull/7708
> > >
> > > On Tue, Apr 21, 2020 at 17:17, Nikolay Izhikov wrote:
> > >
> > >> OK then, let's include it in 2.8.1.
> > >>
> > >> Ivan, can you please prepare a PR in the ignite-2.8.1 branch that
> > >> contains cherry-picks for all required commits?
> > >>
> > >>> On Apr 21, 2020, at 17:01, Andrey Gura wrote:
> > >>>
> > >>> Hi
> > >>>
> > >>>> IGNITE-12735 - Metric exporter implementation could lead to
> > >> NullPointerException from gauge which invoke communication
> > >>>> IGNITE-12568 - MessageFactory implementations refactoring
> > >>>
> > >>>> Personally, I’m against any refactoring improvements in bug fix
> release.
> > >>>> So, I propose to exclude IGNITE-12756 from 2.8.1
> > >>>
> > >>>> Andrey, what do you think as a committer of this improvements?
> > >>>
> > >>> Mainly, IGNITE-12756 brings some improvements related to TCP
> > >>> communication metrics (performance, memory footprint, IgniteSpiContext
> > >>> improved in order to provide the ability to implement metrics-related
> > >>> SPIs without using internal APIs, code improvements).
> > >>>
> > >>> But! It also fixes potential NPEs which can be thrown on node start.
> > >>> So it would be great to include this fix in the 2.8.1 release.
> > >>> On Tue, Apr 21, 2020 at 11:12 AM Nikolay Izhikov <
> nizhi...@apache.org>
> > >> wrote:
> > >>>>
> > >>>>> I've cherry-picked IGNITE-12734 to 2.8.1 branch.
> > >>>>
> > >>>>
> > >>>> Thanks!
> > >>>>
> > >>>>> considering commit "683f22e64f IGNITE-12756 TcpCommunication SPI
> > >> metrics
> > >>>>>> improvement" - it depends
> > >>>>>> on https://issues.apache.org/jira/browse/IGNITE-12735 and
> > >>>>>> https://issues.apache.org/jira/browse/IGNITE-12568,
> > >>>>>> both were targeted to 2.9, but this has to be changed probably.
> > >>>>

Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-21 Thread Ivan Bessonov
Sure, here's PR with 3 cherry-picked commits that I mentioned:
https://github.com/apache/ignite/pull/7708

On Tue, Apr 21, 2020 at 17:17, Nikolay Izhikov wrote:

> OK then, let's include it in 2.8.1.
>
> Ivan, can you please prepare a PR in the ignite-2.8.1 branch that contains
> cherry-picks for all required commits?
>
> > On Apr 21, 2020, at 17:01, Andrey Gura wrote:
> >
> > Hi
> >
> >> IGNITE-12735 - Metric exporter implementation could lead to
> NullPointerException from gauge which invoke communication
> >> IGNITE-12568 - MessageFactory implementations refactoring
> >
> >> Personally, I’m against any refactoring improvements in bug fix release.
> >> So, I propose to exclude IGNITE-12756 from 2.8.1
> >
> >> Andrey, what do you think as a committer of this improvements?
> >
> > Mainly, IGNITE-12756 brings some improvements related to TCP
> > communication metrics (performance, memory footprint, IgniteSpiContext
> > improved in order to provide the ability to implement metrics-related
> > SPIs without using internal APIs, code improvements).
> >
> > But! It also fixes potential NPEs which can be thrown on node start.
> > So it would be great to include this fix in the 2.8.1 release.
> >
> > On Tue, Apr 21, 2020 at 11:12 AM Nikolay Izhikov 
> wrote:
> >>
> >>> I've cherry-picked IGNITE-12734 to 2.8.1 branch.
> >>
> >>
> >> Thanks!
> >>
> >>> considering commit "683f22e64f IGNITE-12756 TcpCommunication SPI
> metrics
> >>>> improvement" - it depends
> >>>> on https://issues.apache.org/jira/browse/IGNITE-12735 and
> >>>> https://issues.apache.org/jira/browse/IGNITE-12568,
> >>>> both were targeted to 2.9, but this has to be changed probably.
> >>
> >> IGNITE-12735 - Metric exporter implementation could lead to
> NullPointerException from gauge which invoke communication
> >> IGNITE-12568 - MessageFactory implementations refactoring
> >>
> >> Ivan,
> >> Personally, I'm against any refactoring improvements in a bug fix release.
> >> So, I propose to exclude IGNITE-12756 from 2.8.1.
> >>
> >> Andrey, what do you think as the committer of these improvements?
> >>
> >>
> >>> On Apr 21, 2020, at 10:44, Alex Plehanov wrote:
> >>>
> >>> Nikolay,
> >>>
> >>> I've cherry-picked IGNITE-12734 to 2.8.1 branch.
> >>>
> >>> On Tue, Apr 21, 2020 at 10:02, Ivan Bessonov wrote:
> >>>
> >>>> Hello, Igniters,
> >>>>
> >>>> considering commit "683f22e64f IGNITE-12756 TcpCommunication SPI
> metrics
> >>>> improvement" - it depends
> >>>> on https://issues.apache.org/jira/browse/IGNITE-12735 and
> >>>> https://issues.apache.org/jira/browse/IGNITE-12568,
> >>>> both were targeted to 2.9, but this has to be changed probably.
> >>>>
> >>>> There might be other issues that I didn't find, we should probably ask
> >>>> Andrey Gura about it, he is the author of
> >>>> those changes.
> >>>>
> >>>> So, release scope should be expanded a little bit, are you ok with it?
> >>>>
> >>>> On Mon, Apr 20, 2020 at 19:24, Nikolay Izhikov wrote:
> >>>>
> >>>>> Hello, Igniters.
> >>>>>
> >>>>> I perform cherry-pick for most commits targeted for 2.8.1 release.
> >>>>>
> >>>>> TC bot results -
> >>>>>
> >>>>
> https://mtcga.gridgain.com/pr.html?serverId=apache=IgniteTests24Java8_RunAll=pull%2F7690%2Fhead=Latest=pull%2F7102%2Fhead
> >>>>>
> >>>>> I need help with cherry picking the following commits:
> >>>>>
> >>>>> - 4e6cd2ce04 IGNITE-12759 Getting a SecurityContext from
> >>>>> GridSecurityProcessor - Fixes #7523.
> >>>>> - 60ebc23f99 IGNITE-12848 fix H2Connection leaks on INSERT (#7649)
> >>>>> - 0b185b192f IGNITE-12800  SQL: local queries cursors must be closed
> or
> >>>>> full read to unlock the GridH2Table. (#7551)
> >>>>> - bcaae8deef IGNITE-12734 Fixed scan query over evicted partition -
> Fixes
> >>>>> #7494.
> >>>>> - 683f22e64f IGNITE-12756 TcpCommunication SPI metrics improvement
> >>>>>
> >>>>> Denis Garus,
> >>>>> Taras Ledkov,
> >

Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-21 Thread Ivan Bessonov
Hello, Igniters,

Regarding commit "683f22e64f IGNITE-12756 TcpCommunication SPI metrics
improvement": it depends
on https://issues.apache.org/jira/browse/IGNITE-12735 and
https://issues.apache.org/jira/browse/IGNITE-12568,
both of which were targeted at 2.9, but this probably has to change.

There might be other issues that I didn't find. We should probably ask
Andrey Gura about it, since he is the author of those changes.

So, the release scope should be expanded a little bit. Are you OK with that?

On Mon, Apr 20, 2020 at 19:24, Nikolay Izhikov wrote:

> Hello, Igniters.
>
> I have cherry-picked most commits targeted for the 2.8.1 release.
>
> TC bot results -
> https://mtcga.gridgain.com/pr.html?serverId=apache=IgniteTests24Java8_RunAll=pull%2F7690%2Fhead=Latest=pull%2F7102%2Fhead
>
> I need help with cherry-picking the following commits:
>
> - 4e6cd2ce04 IGNITE-12759 Getting a SecurityContext from
> GridSecurityProcessor - Fixes #7523.
> - 60ebc23f99 IGNITE-12848 fix H2Connection leaks on INSERT (#7649)
> - 0b185b192f IGNITE-12800  SQL: local queries cursors must be closed or
> full read to unlock the GridH2Table. (#7551)
> - bcaae8deef IGNITE-12734 Fixed scan query over evicted partition - Fixes
> #7494.
> - 683f22e64f IGNITE-12756 TcpCommunication SPI metrics improvement
>
> Denis Garus,
> Taras Ledkov,
> Alexey Plekhanov,
> Ivan Bessonov,
>
> Can you help me with the above commits and cherry-pick your contributions
> into the ignite-2.8.1 branch?
>
> CHERRY PICKED:
>
> + d0c155fe43 IGNITE-12772 Fixed JmxMetricExporterSpi uses log method which
> must not be used in production code (#7604)
> + 00cb1ad7a3 IGNITE-12764: Thin JDBC streaming fails BatchUpdateException
> if function is used (#7615)
> + b8167296b1 IGNITE-12805 Fixed NullPointerException when memory restore
> is in progress. Fixes #7562
> + f57509e8e6 IGNITE-12828 Fixes NPE during CQ registration with failed
> deployment. (#7620)
> + 826aad8890 IGNITE-12726 Long keys support for metastorage. (#7606)
> + 2b1d2b4dec IGNITE-12859: Support of .Net service call with the Timestamp
> and Guid params. (#7618)
> + 6f3515686f IGNITE-12769: histogramNames cache in MetricRegistryMBean
> removed. (#7549)
> + 795617fc94 IGNITE-12774 Handle "Too many open files" exception - Fixes
> #7516.
> + 3928bb3a13 IGNITE-12799 Fixed wrong SpEL expression.
> + ef4f67e351 IGNITE-12743 Java thin client: Fixed thread shutdown on
> client close when partition awareness is enabled - Fixes #7522.
> + e389bb8f55 IGNITE-12728 Collect time statistics on cache#putAllAsync -
> Fixes #7483.
> + 90951c6b2e IGNITE-12711 Fixed tests memory usage. - Fixes #7469.
> + f32b44deb1 IGNITE-12590: Fix (remove) check KEY at the MERGE command
> (#7321)
> + 59917f0731 IGNITE-12624 Java thin client: typeId generation for system
> types fixed - Fixes #7416.
> + cc6f4d7814 IGNITE-12671 Ignoring single messages during PME can prevent
> late affinity switch. - Fixes #7425.
> + 02ac292662 IGNITE-11798 Fix memory leak on unstable topology caused by
> partition reservation (#7251)
> + 8ed8576544 IGNITE-12665: SQL: Fix potential race on MapResult close.
> + 465cc444d0 IGNITE-12582 Add Spring EL support in Spring Data. - Fixes
> #7411.
> + 100101ccce IGNITE-12605: Reset initial update counter value before
> clearing a partition (#7341)
> + e17887bfbf IGNITE-12654 Some of rentingFutures in
> GridDhtPartitionTopologyImpl may accumulate a huge number of eviction
> callbacks - Fixes #7399.
> + bdbe6a79d0 IGNITE-12651 Non-comparable keys for eviction policy cause
> failure handle and node shutdown - Fixes #7397.
> + 56a515db6d IGNITE-12631 Incorrect rewriting wal record type in
> marshalled mode during iteration - Fixes #7371.
> + 14dd160f90 IGNITE-12621 Node leave may cause NullPointerException during
> IO message processing if security is enabled - Fixes #7366.
> + 67ac1d5d68 IGNITE-12636 Full rebalance instead of a historical one -
> Fixes #7379.
> + 4433485d74 IGNTIE-12468 Java thin client: deserialization of arrays,
> collections and maps fixed - Fixes #7320.
> + 0dfd98388e IGNITE-12618 Affinity cache for version of last server event
> can be wiped from history - Fixes #7359.
> + e2c597fff1 IGNITE-12013 NullPointerException is thrown by
> ExchangeLatchManager during cache creation - Fixes #7335.
> + cebc76f1f5d7d077093ddaecbabcd8db2106c60a - IGNITE-12545 Introduce
> listener interface for components to react to partition map exchange events
> + a9278eedf7 IGNITE-11797 Fixed partition consistency issues for atomic
> and mixed tx-atomic cache groups. - Fixes #7315.
> + e160c8f231 IGNITE-12557 Fix possible IgniteOOM during cache destroy. -
> Fixes #7298.
> + 41ed3294ec IGNITE-12567 H2Tree goes into illegal state when non-indexed
> columns are dropped - Fix

[jira] [Created] (IGNITE-12885) Checkpoint thread executes partitions fsync in single thread

2020-04-10 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12885:
--

 Summary: Checkpoint thread executes partitions fsync in single 
thread
 Key: IGNITE-12885
 URL: https://issues.apache.org/jira/browse/IGNITE-12885
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


It should use "asyncRunner" if one is configured; this will speed up
checkpoints.
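
A minimal sketch of the idea, not the actual checkpointer code: the class,
the method and the use of Runnable tasks are hypothetical, and only the
"asyncRunner" name comes from the description above. Partition sync tasks go
to the configured executor and are awaited; without an executor they run on
the calling thread.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical sketch; names and structure are not taken from Ignite. */
class PartitionFsyncSketch {
    /**
     * Runs every partition sync task. If asyncRunner is configured, tasks are
     * submitted to it and awaited; otherwise they run on the calling thread.
     */
    static void syncAll(List<Runnable> partSyncTasks, Executor asyncRunner) {
        if (asyncRunner == null) {
            // Single-threaded path, as described in the ticket.
            partSyncTasks.forEach(Runnable::run);

            return;
        }

        CompletableFuture<?>[] futs = partSyncTasks.stream()
            .map(task -> CompletableFuture.runAsync(task, asyncRunner))
            .toArray(CompletableFuture[]::new);

        // Waits for all tasks; a task failure surfaces as CompletionException.
        CompletableFuture.allOf(futs).join();
    }
}
```

The checkpoint still finishes only after every partition is synced, so the
ordering guarantees stay the same; only the fsync calls overlap.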



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12877) "restorePartitionStates" always logs all meta pages into WAL

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12877:
--

 Summary: "restorePartitionStates" always logs all meta pages into 
WAL
 Key: IGNITE-12877
 URL: https://issues.apache.org/jira/browse/IGNITE-12877
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
2020-01-31T21:09:27,203 [INFO ][main][org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager] - Finished applying WAL changes [updatesApplied=11897, time=183531 ms]
2020-01-31T21:09:27,203 [INFO ][main][org.apache.ignite.internal.processors.cache.GridCacheProcessor] - Restoring partition state for local groups.
2020-01-31T21:17:49,692 [INFO ][main][org.apache.ignite.internal.processors.cache.GridCacheProcessor] - Finished restoring partition state for local groups [groupsProcessed=32, partitionsProcessed=9310, time=502498ms]
{noformat}
The main issue is that
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager#updateState
unconditionally returns true. "stateId" is almost never equal to "-1".

UPDATE: that wasn’t the only problem, please look in the fix itself for more 
details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12875) Implement "EVT_CLUSTER_STATE_CHANGE_STARTED" event

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12875:
--

 Summary: Implement "EVT_CLUSTER_STATE_CHANGE_STARTED" event
 Key: IGNITE-12875
 URL: https://issues.apache.org/jira/browse/IGNITE-12875
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12874) Possible NPE in GridDiscoveryManager#cacheGroupAffinityNode

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12874:
--

 Summary: Possible NPE in 
GridDiscoveryManager#cacheGroupAffinityNode
 Key: IGNITE-12874
 URL: https://issues.apache.org/jira/browse/IGNITE-12874
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


If "grpId" is invalid, the method throws an NPE instead of returning false.
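
A minimal sketch of the intended behavior, assuming the method looks a group
up by ID and then dereferences the result. The class, the map and the
predicate are hypothetical stand-ins, not GridDiscoveryManager internals.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical sketch; not the GridDiscoveryManager implementation. */
class GroupAffinitySketch {
    /** Assumed registry: cache group ID -> node filter. */
    private final Map<Integer, Predicate<String>> grpFilters = new ConcurrentHashMap<>();

    void registerGroup(int grpId, Predicate<String> nodeFilter) {
        grpFilters.put(grpId, nodeFilter);
    }

    boolean cacheGroupAffinityNode(String nodeName, int grpId) {
        Predicate<String> filter = grpFilters.get(grpId);

        // Unknown group: answer "not an affinity node" instead of dereferencing null.
        return filter != null && filter.test(nodeName);
    }
}
```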



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12839) IGNITE-12789 broke WALRecordSerializationTest

2020-03-26 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12839:
--

 Summary: IGNITE-12789 broke WALRecordSerializationTest
 Key: IGNITE-12839
 URL: https://issues.apache.org/jira/browse/IGNITE-12839
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-3192056576753991319=%3Cdefault%3E=testDetails]

 

Sorry, too bad that I skipped it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12789) Tracking page repairing has no WAL record associated with it

2020-03-16 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12789:
--

 Summary: Tracking page repairing has no WAL record associated with 
it
 Key: IGNITE-12789
 URL: https://issues.apache.org/jira/browse/IGNITE-12789
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO#resetCorruptFlag(long)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


MetaStorage key length limitations and Cache Metrics configuration

2020-02-28 Thread Ivan Bessonov
Hello Igniters,

we have an issue in the master branch and in the upcoming 2.8 release that is
related to the new metrics functionality implemented in [1]. You can't use the
new "configureHistogramMetric" and "configureHitRateMetric" configuration
methods on caches with long names. By my estimate, a cache with 30
characters in its name will shut down your whole cluster via the failure
handler if you try to change its metrics configuration using one of those
methods.

Initially we wanted to merge [2] to show a valid error message instead of
failing
the cluster, but it wasn't in plans for 2.8 because we didn't know that it
clashes
with [1].

I created issue [3] with plans of removing MetaStorage key length
limitations, but
it requires some thoughtful MetaStorageTree reworkings. I mean that it
can't be
done in only a few days.

What do you think? Does this issue affect 2.8 release? AFAIK new metrics are
experimental and they can have some known issues. Feel free to ask me for
more
details if it's needed.


[1] https://issues.apache.org/jira/browse/IGNITE-11987
[2] https://issues.apache.org/jira/browse/IGNITE-12721
[3] https://issues.apache.org/jira/browse/IGNITE-12726

-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-12726) Cache names can't be used as part of DistributedMetaStorage keys

2020-02-28 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12726:
--

 Summary: Cache names can't be used as part of 
DistributedMetaStorage keys
 Key: IGNITE-12726
 URL: https://issues.apache.org/jira/browse/IGNITE-12726
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


The issue was discovered during the implementation of IGNITE-12721. Here's a
short version of the description:
 * local MetaStorage can't handle keys that have more than 64 bytes in their 
"byte[]" representation. Since DistributedMetaStorage uses it and adds some 
specific prefixes on top, we have a strict limit on the key length.

Just to be clear - it just won't work, IGNITE-12721 only adds a valid exception 
and meaningful error message to the API.
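
A minimal sketch of the kind of check this implies. The 64-byte limit comes
from the description above; the class, the prefix argument, the UTF-8
encoding and the exception type are assumptions made for illustration, not
the code from IGNITE-12721.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; not the MetaStorage implementation. */
class MetaStorageKeySketch {
    /** Limit quoted in the ticket: keys longer than 64 bytes are not handled. */
    static final int MAX_KEY_BYTES = 64;

    /** Encoded length of prefix + key (UTF-8 is an assumption). */
    static int encodedLength(String prefix, String key) {
        return (prefix + key).getBytes(StandardCharsets.UTF_8).length;
    }

    /** Rejects keys that would not fit, with a readable message. */
    static void validateKey(String prefix, String key) {
        int len = encodedLength(prefix, key);

        if (len > MAX_KEY_BYTES) {
            throw new IllegalArgumentException(
                "Key is too long [bytes=" + len + ", max=" + MAX_KEY_BYTES + ']');
        }
    }
}
```

Because the prefix counts against the same limit, the usable length for the
caller's part of the key is always smaller than 64 bytes.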

 

Recently IGNITE-11987 from [IEP-35] has been merged to master and the 2.8
release branch, and it does exactly what's written in the title: it adds the
cache name as a part of the key. So, if you use a long cache name in, for
example, the test called
"org.apache.ignite.internal.metric.MetricsConfigurationTest#testConfigRemovedOnCacheRemove",
you'll get AssertionErrors in the log. By "long" I mean about 50 symbols. This
should not happen.

 

I see two options here:
 * leave everything as it is and change keys format;
 * modify MetaStorage so that it can handle longer keys. I prefer this one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]

2020-02-11 Thread Ivan Bessonov
Hello Igniters,

I'd like to add one more fix to the release: [1]
It adds versioning to internal classes of distributed metastorage component.
Without this fix it would be much harder to update these classes without
breaking binary compatibility.

[1] https://issues.apache.org/jira/browse/IGNITE-12638

On Wed, 5 Feb 2020 at 22:33, Maxim Muzafarov wrote:

> Ivan,
>
> > Should not we state in release notes what new experimental API was added?
>
> I think we should. Will do.
> Just not to miss anything that we should mark with
> @IgniteExperimental: Consistency Check [1], Monitoring [2] anything
> else?
>
> > As Flink integration was moved to external repository how Ignite 2.8
> users will be able to use that integration?
>
> Since ignite-extension has a separate release cycle (right?), it is
> better to release ignite-extension rather than cherry-pick this change
> back to 2.8. I also think it is not a blocker for the release, but we
> should do our best make the first ignite-extension release as earlier
> as possible.
>
>
> [1] https://issues.apache.org/jira/browse/IGNITE-10663
> [2] https://issues.apache.org/jira/browse/IGNITE-11848
>
> On Wed, 5 Feb 2020 at 22:07, Ivan Pavlukhin  wrote:
> >
> > Maxim,
> >
> > A couple of questions:
> > 1. We added an annotation to designate experimental API. Should not we
> > state in release notes what new experimental API was added? Perhaps in
> > a separate block.
> > 2. As Flink integration was moved to external repository how Ignite
> > 2.8 users will be able to use that integration?
> >
> > On Wed, 5 Feb 2020 at 21:21, Maxim Muzafarov wrote:
> > >
> > > Igniters,
> > >
> > >
> > > I've prepared RELEASE_NOTES pull-request [1] to the 2.8 release.
> > >
> > > Currently, IEP-35 monitoring issues are not included in this PR. Will
> > > do it soon.
> > > Please, take a look.
> > >
> > >
> > > [1] https://github.com/apache/ignite/pull/7367/files
> > >
> > > On Mon, 3 Feb 2020 at 14:38, Maxim Muzafarov 
> wrote:
> > > >
> > > > Igniters,
> > > >
> > > >
> > > > Let me share the current status of the release.
> > > >
> > > > 1.
> > > > Waiting for the issues [1] [2] (discussed previously this thread) to
> > > > be tested by TC.Bot and merged to the 2.8 release branch.
> > > >
> > > > 2.
> > > > Only 2 release BLOCKER issues left. I'm planning to move these issues
> > > > to 2.8.1 release.
> > > > The issue [4] (Error during purges by expiration: Unknown page type)
> > > > will be covered by [1] [2].
> > > > The issue [3] (Apache Ignite Cluster(Amazon S3 Based Discovery) Nodes
> > > > getting down) probably require additional info to reproduce the
> issue.
> > > >
> > > > 3.
> > > > A potential performance drop on `putAll` operations on an in-memory
> > > > cluster (see [5] for details).
> > > > I'll try to reproduce in another test environment.
> > > >
> > > >
> > > > Will keep you posted.
> > > >
> > > >
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-12593
> > > > [2] https://issues.apache.org/jira/browse/IGNITE-12594
> > > > [3] https://issues.apache.org/jira/browse/IGNITE-12398
> > > > [4] https://issues.apache.org/jira/browse/IGNITE-12489
> > > > [5]
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#ApacheIgnite2.8-Benchmarks(LATEST)
> > > >
> > > > On Thu, 30 Jan 2020 at 15:02, Alexey Goncharuk
> > > >  wrote:
> > > > >
> > > > > Sounds good, will do!
> >
> >
> >
> > --
> > Best regards,
> > Ivan Pavlukhin
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-12638) Classes persisted by DistributedMetaStorage are not IgniteDTO

2020-02-07 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12638:
--

 Summary: Classes persisted by DistributedMetaStorage are not 
IgniteDTO
 Key: IGNITE-12638
 URL: https://issues.apache.org/jira/browse/IGNITE-12638
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.8






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12576) [IEP-35] TCP communication metrics use node ID instead of consistent ID

2020-01-24 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12576:
--

 Summary: [IEP-35] TCP communication metrics use node ID instead of 
consistent ID
 Key: IGNITE-12576
 URL: https://issues.apache.org/jira/browse/IGNITE-12576
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


TcpCommunicationMetricsListener uses nodeId for metrics name. consistentId 
should be used instead.

Also all metrics for registry should be created at once before registry added 
to GridMetricManager. There is no need in separate initialization for sent and 
received counters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]

2020-01-23 Thread Ivan Bessonov
; > > > > > > > >
> > > > > > > > > >> Maxim,
> > > > > > > > > >>
> > > > > > > > > >> Conflicts in pr [1] are resolved. TC Run all is started.
> > > > > > > > > >>
> > > > > > > > > >> [1]  https://github.com/apache/ignite/pull/7238
> > > > > > > > > >>
> > > > > > > > > > >> On Fri, 17 Jan 2020 at 16:04, Sergey Antonov <
> > > > > > > > antonovserge...@gmail.com
> > > > > > > > > >:
> > > > > > > > > >>
> > > > > > > > > >>> Maxim,
> > > > > > > > > >>>
> > > > > > > > > >>> I will do that on monday (20/01).
> > > > > > > > > >>>
> > > > > > > > > > >>> On Fri, 17 Jan 2020 at 13:08, Maxim Muzafarov <
> > > > > mmu...@apache.org
> > > > > > >:
> > > > > > > > > >>>
> > > > > > > > > >>>> Sergey,
> > > > > > > > > >>>>
> > > > > > > > > >>>>
> > > > > > > > > >>>> Can you, please, resolve the PR conflicts [1] [2]?
> > > > > > > > > >>>>
> > > > > > > > > >>>> [1]  https://github.com/apache/ignite/pull/7238
> > > > > > > > > >>>> [2]
> https://issues.apache.org/jira/browse/IGNITE-11256
> > > > > > > > > >>>>
> > > > > > > > > >>>> On Thu, 16 Jan 2020 at 16:59, Ilya Kasnacheev <
> > > > > > > > > ilya.kasnach...@gmail.com >
> > > > > > > > > >>>> wrote:
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > Hello!
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > I have bumped beanutils and re-ran Cassandra Store
> > > tests.
> > > > > Can
> > > > > > you
> > > > > > > > > >>>> please
> > > > > > > > > >>>> > comment on the ticket?
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > I think that fixing ZooKeeper is too much effort
> > > (there's
> > > > > > chaos
> > > > > > > > with
> > > > > > > > > >>>> > jackson vs. jackson-asl), maybe it should be split
> up
> > > as a
> > > > > > > > separate
> > > > > > > > > >>>> ticket
> > > > > > > > > >>>> > to be done later.
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > Regards,
> > > > > > > > > >>>> > --
> > > > > > > > > >>>> > Ilya Kasnacheev
> > > > > > > > > >>>> >
> > > > > > > > > >>>> >
> > > > > > > > > > >>>> > On Wed, 15 Jan 2020 at 18:31, Vladimir Pligin <
> > > > > > > > vova199...@yandex.ru
> > > > > > > > > >:
> > > > > > > > > >>>> >
> > > > > > > > > >>>> > > Thanks, Ilya. It would be really great to have
> your
> > > patch
> > > > > > > > included
> > > > > > > > > >>>> into 2.8
> > > > > > > > > >>>> > > scope.
> > > > > > > > > >>>> > > I'd like to give my two cent as well. For example
> we
> > > have
> > > > > > > > > vulnerable
> > > > > > > > > >>>> > > dependencies here:
> > > > > > > > > >>>> > > modules/cassandra/store/pom.xml -
> commons-beanutils
> > > > > > > > > >>>> > > modules/zookeeper/pom.xml - transitive Jackson
> from
> > > > > Curator
> > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > I'd suggest to upgrade
> > > > > commons-beanutils:commons-beanutils
> > > > > > to
> > > > > > > > > 1.9.4
> > > > > > > > > >>>> and
> > > > > > > > > >>>> > > override
> com.fasterxml.jackson.core:jackson-databind
> > > to
> > > > > our
> > > > > > > > common
> > > > > > > > > >>>> jackson
> > > > > > > > > >>>> > > version from other modules.
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > >
> > > > > > > > > >>>> > > --
> > > > > > > > > >>>> > > Sent from:
> > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > > > > > > >>>> > >
> > > > > > > > > >>>>
> > > > > > > > > >>>
> > > > > > > > > >>>
> > > > > > > > > >>> --
> > > > > > > > > >>> BR, Sergey Antonov
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> --
> > > > > > > > > >> BR, Sergey Antonov
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > >
> >
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-12542) Some tests failed after due to incompatible changes in IGNITE-12108 and IGNITE-11987

2020-01-15 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12542:
--

 Summary: Some tests failed after due to incompatible changes in 
IGNITE-12108 and IGNITE-11987
 Key: IGNITE-12542
 URL: https://issues.apache.org/jira/browse/IGNITE-12542
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


[https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_ComputeGrid?branch=%3Cdefault%3E=overview=builds]

 

[https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Basic1?branch=%3Cdefault%3E=overview=builds]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12515) GridMultinodeRedeploySharedModeSelfTest.testSharedMode fails sometimes

2019-12-30 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12515:
--

 Summary: GridMultinodeRedeploySharedModeSelfTest.testSharedMode 
fails sometimes
 Key: IGNITE-12515
 URL: https://issues.apache.org/jira/browse/IGNITE-12515
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


Exception in 
org.apache.ignite.internal.managers.deployment.GridDeploymentPerVersionStore#searchDeploymentCache



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12514) WAL don't flush several last records in LOG-ONLY/FSYNC mode if flush ptr=null

2019-12-30 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12514:
--

 Summary: WAL don't flush several last records in LOG-ONLY/FSYNC 
mode if flush ptr=null
 Key: IGNITE-12514
 URL: https://issues.apache.org/jira/browse/IGNITE-12514
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


In the current implementation, the last flush pointer is thread-local. If one
thread adds new records and another thread calls wal.flush(null), the flush
may not cover the records added by the first thread, because with a null
flush pointer a thread flushes only up to the last record that was added in
the current thread.
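
A minimal sketch that reproduces the described effect in isolation. Every
name here is hypothetical and the real WAL manager is far more involved;
the sketch only contrasts resolving a null pointer from a thread-local
value with resolving it from a shared "last logged" value.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not FileWriteAheadLogManager. */
class FlushPointerSketch {
    /** Index of the last record logged by any thread. */
    private final AtomicLong lastLogged = new AtomicLong(-1);

    /** Index of the last record logged by the current thread. */
    private final ThreadLocal<Long> lastLoggedInThread = new ThreadLocal<>();

    /** Highest record index known to be flushed. */
    private final AtomicLong flushedUpTo = new AtomicLong(-1);

    long log() {
        long ptr = lastLogged.incrementAndGet();

        lastLoggedInThread.set(ptr);

        return ptr;
    }

    /** Behavior from the ticket: null means "last record of this thread". */
    void flushThreadLocal(Long ptr) {
        Long target = ptr != null ? ptr : lastLoggedInThread.get();

        if (target != null)
            flushedUpTo.accumulateAndGet(target, Math::max);
    }

    /** Alternative: null means "last record logged by any thread". */
    void flushGlobal(Long ptr) {
        long target = ptr != null ? ptr : lastLogged.get();

        flushedUpTo.accumulateAndGet(target, Math::max);
    }

    long flushedUpTo() {
        return flushedUpTo.get();
    }
}
```

If one thread logs three records and a different thread then calls
flushThreadLocal(null), nothing is flushed, while flushGlobal(null) covers
all three.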



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]

2019-12-27 Thread Ivan Bessonov
> > to
> > > > > avoid shipping default links to outdated packages.
> > > > >
> > > > > I think this is one of things that are very hard to do between
> > > releases,
> > > > so
> > > > > I think this dependencies bumping should be a part of a formal
> > > > > release/testing cycle, and then be backported to master.
> > > > >
> > > > > I could volunteer to do that myself, if we agree to merge these
> version
> > > > > upgrades to ignite-2.8 and then re-test.
> > > > >
> > > > > Regards,
> > > > > --
> > > > > Ilya Kasnacheev
> > > > >
> > > > >
> > > > > > On Tue, 24 Dec 2019 at 13:22, Zhenya Stanilovsky
> > > >  > > > > >:
> > > > >
> > > > > >
> > > > > > Igniters, i`l try to compare 2.8 release candidate vs 2.7.6,
> > > > > > last sha 2.8 was build from :  9d114f3137f92aebc2562a
> > > > > > i use yardstick benchmarks, 4 bare machine with:  2x Xeon X5570
> 96Gb
> > > > 512GB
> > > > > > SSD 2048GB HDD 10GB/s
> > > > > > 1 for  client (driver) and 3 for servers.
> > > > > > this mappings for graphs and real yardstick tests:
> > > > > >
> > > > > > atomic-put: IgnitePutBenchmark
> > > > > > sql-merge-query: IgniteSqlMergeQueryBenchmark
> > > > > > atomic-get: IgniteGetBenchmark
> > > > > > tx-get: IgniteGetTxBenchmark
> > > > > > tx-put: IgnitePutTxBenchmark
> > > > > > atomic-put-all-bs-10: IgnitePutAllBenchmark
> > > > > > tx-put-all-bs-10: IgnitePutAllTxBenchmark
> > > > > >
> > > > > > cacheMode — partitioned
> > > > > > CacheWriteSynchronizationMode.FULL_SYNC
> > > > > > 1 backup
> > > > > >
> > > > > > 1. wal = log_only 2. wal = none 3. persistence disabled.
> > > > > > Thanks Maxim for wiki page [1]
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#ApacheIgnite2.8-Benchmarks
> > > > > >
> > > > > > do we need some bisect or other work here ?
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > >--- Forwarded message ---
> > > > > > >From: "Maxim Muzafarov" < mmu...@apache.org >
> > > > > > >To:  dev@ignite.apache.org
> > > > > > >Cc:
> > > > > > >Subject: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]
> > > > > > >Date: Fri, 20 Sep 2019 14:44:31 +0300
> > > > > > >
> > > > > > >Igniters,
> > > > > > >
> > > > > > >
> > > > > > >It's almost a year has passed since the last major Apache
> Ignite 2.7
> > > > > > >has been released. We've accumulated a lot of performance
> > > improvements
> > > > > > >and a lot of new features which are waiting for their release
> date.
> > > > > > >Here is my list of the most interesting things from my point
> since
> > > the
> > > > > > >last major release:
> > > > > > >
> > > > > > >Service Grid,
> > > > > > >Monitoring,
> > > > > > >Recovery Read
> > > > > > >BLT auto-adjust,
> > > > > > >PDS compression,
> > > > > > >WAL page compression,
> > > > > > >Thin client: best effort affinity,
> > > > > > >Thin client: transactions support (not yet)
> > > > > > >SQL query history
> > > > > > >SQL statistics
> > > > > > >
> > > > > > >I think we should no longer wait and freeze the master branch
> > > anymore
> > > > > > >and prepare the next major release by the end of the year.
> > > > > > >
> > > > > > >
> > > > > > >I propose to discuss Time, Scope of Apache Ignite 2.8 release
> and
> > > also
> > > > > > >I want to propose myself to be the release manager of the
> planning
> > > > > > >release.
> > > > > > >
> > > > > > >Scope Freeze: November 4, 2019
> > > > > > >Code Freeze: November 18, 2019
> > > > > > >Voting Date: December 10, 2019
> > > > > > >Release Date: December 17, 2019
> > > > > > >
> > > > > > >
> > > > > > >WDYT?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
>


-- 
Sincerely yours,
Ivan Bessonov


[jira] [Created] (IGNITE-12506) Deadlock in DistributedMetaStoragePersistentTest.testUnstableTopology

2019-12-27 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12506:
--

 Summary: Deadlock in 
DistributedMetaStoragePersistentTest.testUnstableTopology
 Key: IGNITE-12506
 URL: https://issues.apache.org/jira/browse/IGNITE-12506
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


{code:java}
"wal-file-archiver%metastorage.DistributedMetaStoragePersistentTest4-#51609%metastorage.DistributedMetaStoragePersistentTest4%@88463"
 prio=5 tid=0xf889 nid=NA waiting for monitor entry
  java.lang.Thread.State: BLOCKED
 waiting for 
dms-writer-thread-#51614%metastorage.DistributedMetaStoragePersistentTest4%@88460
 to release lock on <0x159fe> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver$2.apply(FileWriteAheadLogManager.java:2042)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver$2.apply(FileWriteAheadLogManager.java:2040)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkFiles(FileWriteAheadLogManager.java:2538)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.access$3000(FileWriteAheadLogManager.java:157)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.allocateRemainingFiles(FileWriteAheadLogManager.java:2032)
  at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1806)
  at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
  at java.lang.Thread.run(Thread.java:748)
{code}
{code:java}
 
"dms-writer-thread-#51614%metastorage.DistributedMetaStoragePersistentTest4%@88460"
 prio=5 tid=0xf88e nid=NA waiting java.lang.Thread.State: WAITING blocks 
wal-file-archiver%metastorage.DistributedMetaStoragePersistentTest4-#51609%metastorage.DistributedMetaStoragePersistentTest4%@88463
 at java.lang.Object.wait(Object.java:-1) at 
java.lang.Object.wait(Object.java:502) at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentCurrentStateStorage.nextAbsoluteSegmentIndex(SegmentCurrentStateStorage.java:107)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentAware.nextAbsoluteSegmentIndex(SegmentAware.java:66)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.nextAbsoluteSegmentIndex(FileWriteAheadLogManager.java:1918)
 - locked <0x159fe> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.access$1100(FileWriteAheadLogManager.java:1687)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.pollNextFile(FileWriteAheadLogManager.java:1575)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.initNextWriteHandle(FileWriteAheadLogManager.java:1387)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.rollOver(FileWriteAheadLogManager.java:1258)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:875)
 at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:796)
 at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList$WriteRowHandler.addRow(AbstractFreeList.java:207)
 at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList$WriteRowHandler.run(AbstractFreeList.java:158)
 at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList$WriteRowHandler.run(AbstractFreeList.java:138)
 at 
org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:292)
 at 
org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:318)
 at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:516)
 at 
org.apache.ignite.internal.processors.cache.persistence.metastorage.MetastorageRowStore.addRow(MetastorageRowStore.java:72)
 at 
org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage.writeRaw(MetaStorage.java:419)
 - locked <0x159ff> (a 
org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage)
 at 
org.apache.ignite.internal.processors.cache.persistence.metastorage.

[jira] [Created] (IGNITE-12499) Node took a long time to start after kill

2019-12-26 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12499:
--

 Summary: Node took a long time to start after kill
 Key: IGNITE-12499
 URL: https://issues.apache.org/jira/browse/IGNITE-12499
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


Test scenario:
1) Start 4 node cluster
2) Activate
3) Load 1k rows to each cache
4) Stop node
5) Return it back without index.bin files
6) Wait until start

Somehow the first node takes about 1166 seconds to start ("Waiting for
topology snapshot: server(s) 4/4, client(s) 0/*, timeout 1166/1800 sec").

[10:47:21,360][INFO][main][G] Node started : [stage="Configure system pool" (129 ms),stage="Start managers" (440 ms),stage="Configure binary metadata" (86 ms),stage="Start processors" (39341 ms),stage="Start 'GridGain' plugin" (16 ms),stage="Init and start regions" (210 ms),stage="Restore binary memory" (228224 ms),stage="Restore logical state" (859694 ms),stage="Finish recovery" (8938 ms),stage="Join topology" (6024 ms),stage="Await transition" (16 ms),stage="Await exchange" (14855 ms),stage="Total time" (1157973 ms)]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12491) Eliminate contention on ConcurrentHashMap.size()

2019-12-24 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12491:
--

 Summary: Eliminate contention on ConcurrentHashMap.size()
 Key: IGNITE-12491
 URL: https://issues.apache.org/jira/browse/IGNITE-12491
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


Methods that invoke checkpointReadLock/checkpointReadUnlock spend a lot of
time calculating the number of dirty pages. This becomes noticeable when
there are hundreds of regions.
Any persistent operation then costs hundreds of size() invocations on
ConcurrentHashMap.
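
One possible way to avoid the repeated size() calls, shown as a sketch under
assumptions: the Region class, the per-region set of page IDs and the shared
LongAdder are all hypothetical, and this is not necessarily the fix that was
applied. The counter is updated when a page changes state, so reading the
total no longer walks every region.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; not Ignite's page memory code. */
class DirtyPagesCounterSketch {
    /** Stand-in for a data region that tracks its dirty page IDs. */
    static final class Region {
        final Set<Long> dirtyPages = ConcurrentHashMap.newKeySet();
    }

    private final List<Region> regions = new CopyOnWriteArrayList<>();

    /** Total dirty pages across all regions, maintained incrementally. */
    private final LongAdder totalDirty = new LongAdder();

    Region addRegion() {
        Region r = new Region();

        regions.add(r);

        return r;
    }

    void markDirty(Region r, long pageId) {
        // Count the page only if it was not dirty already.
        if (r.dirtyPages.add(pageId))
            totalDirty.increment();
    }

    void markClean(Region r, long pageId) {
        if (r.dirtyPages.remove(pageId))
            totalDirty.decrement();
    }

    /** Pattern from the ticket: one size() call per region on every check. */
    long dirtyPagesBySumming() {
        long sum = 0;

        for (Region r : regions)
            sum += r.dirtyPages.size();

        return sum;
    }

    /** Reads the maintained counter instead of visiting every region. */
    long dirtyPages() {
        return totalDirty.sum();
    }
}
```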



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12488) Fix JavaDocs in DistributedMetaStorage

2019-12-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12488:
--

 Summary: Fix JavaDocs in DistributedMetaStorage
 Key: IGNITE-12488
 URL: https://issues.apache.org/jira/browse/IGNITE-12488
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


Some information is obsolete after 
https://issues.apache.org/jira/browse/IGNITE-12109



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12487) Inconsistent GridIoManager API for sendToGridTopic(Collection nodes) and sendToGridTopic(UUID nodeId)

2019-12-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12487:
--

 Summary: Inconsistent GridIoManager API for 
sendToGridTopic(Collection nodes) and sendToGridTopic(UUID nodeId)
 Key: IGNITE-12487
 URL: https://issues.apache.org/jira/browse/IGNITE-12487
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


The method
{{ctx.io().sendToGridTopic(Collection nodes, )}}
throws the exception "Internal Ignite code should never call the method with
local node in a node list."

At the same time,
{{ctx.io().sendToGridTopic(((IgniteEx)ignite).localNode().id(), ...)}}
works without any exception.

From my point of view, we should not throw an exception.
Processing messages in a common listener is much more convenient than writing
the same code twice, once for remote nodes and once for the local one.
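
A minimal sketch of the proposed behavior, assuming the collection-based
send could simply hand the message to the local listener when the local node
is in the list. The class, the constructor arguments and the String message
type are hypothetical; this is not the GridIoManager API.

```java
import java.util.Collection;
import java.util.UUID;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

/** Hypothetical sketch; not GridIoManager. */
class TopicSendSketch {
    private final UUID locNodeId;

    /** Listener that handles messages addressed to the local node. */
    private final Consumer<String> locLsnr;

    /** Transport used for remote nodes. */
    private final BiConsumer<UUID, String> remoteSnd;

    TopicSendSketch(UUID locNodeId, Consumer<String> locLsnr, BiConsumer<UUID, String> remoteSnd) {
        this.locNodeId = locNodeId;
        this.locLsnr = locLsnr;
        this.remoteSnd = remoteSnd;
    }

    /** Sends to every node; the local node is served by the same listener. */
    void sendToTopic(Collection<UUID> nodeIds, String msg) {
        for (UUID id : nodeIds) {
            if (locNodeId.equals(id))
                locLsnr.accept(msg);
            else
                remoteSnd.accept(id, msg);
        }
    }
}
```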



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12486) Truncation of archived WAL segments doesn't work

2019-12-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12486:
--

 Summary: Truncation of archived WAL segments doesn't work
 Key: IGNITE-12486
 URL: https://issues.apache.org/jira/browse/IGNITE-12486
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


Index calculation is wrong in FileWriteAheadLogManager#rollOver.

It leads to unexpected and faulty WAL segments truncation and data corruption 
as a result.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12485) DiscoveryEvent make event message lazy initialization

2019-12-23 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12485:
--

 Summary: DiscoveryEvent make event message lazy initialization
 Key: IGNITE-12485
 URL: https://issues.apache.org/jira/browse/IGNITE-12485
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov
 Fix For: 2.9


In GridDiscoveryManager$DiscoveryWorker#recordEvent() we set a message on each
event: "msg " + clusterNode.
Invoking toString() on a ClusterNode implementation can be expensive.
I think the event message could be generated lazily from the event type and
the cluster node.
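
A minimal sketch of lazy message generation, assuming the event can hold a
supplier instead of a prebuilt string. The class name and the Supplier-based
constructor are hypothetical and are not the DiscoveryEvent API.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; not DiscoveryEvent. */
class LazyEventMessageSketch {
    /** Builds the message on demand, e.g. from event type and node. */
    private final Supplier<String> msgSupplier;

    /** Cached message; null until first requested. */
    private volatile String msg;

    LazyEventMessageSketch(Supplier<String> msgSupplier) {
        this.msgSupplier = msgSupplier;
    }

    String message() {
        String m = msg;

        if (m == null) {
            // Under a race the supplier may run more than once; the result is the same.
            m = msgSupplier.get();
            msg = m;
        }

        return m;
    }
}
```

With this shape, the toString() cost is paid only by code that actually
reads the message.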



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

