from:"Dmitriy Govorukhin"

Re: [ANNOUNCE] New committer: Vyacheslav Koptilin

2020-02-18 Thread Dmitriy Govorukhin

My congratulations, Slava!

On Tue, Feb 18, 2020 at 11:33 PM Andrey Kuznetsov  wrote:

> Congratulations, Slava!
>
> On Tue, Feb 18, 2020 at 10:20 PM Dmitriy Pavlov wrote:
>
> > Hello Ignite Community,
> >
> > The Project Management Committee (PMC) for Apache Ignite has invited
> > Vyacheslav Koptilin to become a committer and we are pleased to announce
> > that he has accepted.
> >
> > Vyacheslav investigated and fixed a number of non-trivial issues in the
> > Ignite Native persistent store, was a reviewer of Read Repair (ex.
> > Consistency Check).
> >
> > Being a committer enables easier contribution to the project since there is
> > no need to go via the patch submission process. This should enable better
> > productivity.
> >
> > Vyacheslav, thanks for supporting the community and keep the pace!
> >
> > Best Regards,
> > Dmitriy Pavlov
> > on behalf of Apache Ignite PMC
> >
>
>
> --
> Best regards,
>   Andrey Kuznetsov.
>

Re: [VOTE] Allow or prohibit a joint use of @deprecated and @IgniteExperimental

2020-02-10 Thread Dmitriy Govorukhin

-1 Prohibit

On Mon, Feb 10, 2020 at 12:58 PM Pavel Tupitsyn 
wrote:

> -1 Prohibit
>
> On Mon, Feb 10, 2020 at 12:41 PM Zhenya Stanilovsky
>  wrote:
>
> >
> > -1, sounds confusing. I won't use a deprecated API, and
> > @IgniteExperimental is something unknown with an undefined «time for
> > support».
> >
> >
> >
> > >Dear Apache Ignite community,
> > >
> > >We would like to conduct a formal vote on the subject of whether to allow
> > >or prohibit a joint existence of @deprecated annotation for an old API
> > >and @IgniteExperimental [1] for a new (replacement) API. The result of this
> > >vote will be formalized as an Apache Ignite development rule to be used in
> > >future.
> > >
> > >The discussion thread where you can address all non-vote messages is [2].
> > >
> > >The votes are:
> > >*[+1 Allow]* Allow to deprecate the old APIs even when new APIs are marked
> > >with @IgniteExperimental to explicitly notify users that the old APIs will
> > >be removed in the next major release AND new APIs are available.
> > >*[-1 Prohibit]* Never deprecate the old APIs unless the new APIs are stable
> > >and released without @IgniteExperimental. The old APIs javadoc may be
> > >updated with a reference to new APIs to encourage users to evaluate new
> > >APIs. The deprecation and new API release may happen simultaneously if the
> > >new API is not marked with @IgniteExperimental or the annotation is removed
> > >in the same release.
> > >
> > >Neither of the choices prohibits deprecation of an API without a
> > >replacement if community decides so.
> > >
> > >The vote will hold for 72 hours and will end on February 13th 2020 08:00
> > >UTC:
> > >
> >
> https://www.timeanddate.com/countdown/to?year=2020=2=13=8=0=0=utc-1
> > >
> > >All votes count, there is no binding/non-binding status for this.
> > >
> > >[1]
> > >
> >
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/lang/IgniteExperimental.java
> > >[2]
> > >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-Public-API-deprecation-rules-td45647.html
> > >
> > >Thanks,
> > >--AG
> > >
> >
> >
> >
> >
>

Re: [VOTE] Apache Ignite PMC Chair

2019-11-02 Thread Dmitriy Govorukhin

+1 for Dmitry Pavlov

On Sat, Nov 2, 2019 at 12:26 AM Alexey Goncharuk wrote:

> +1 for Alexey Goncharuk (binding)
>
> On Mon, Oct 28, 2019 at 9:06 PM Denis Magda wrote:
>
> > Ignite community,
> >
> > Please cast a vote for one of the following candidates:
> >
> >- Alexey Goncharuk
> >- Dmitry Pavlov
> >- Nikolay Izhikov
> >- Pavel Tupitsyn
> >
> > Everybody is encouraged to take part in the vote, however, remember that
> > only the binding votes of PMC members result in the Chair election. The
> > candidates can vote for themselves. The vote will end on November 1st,
> > 11:00 AM Pacific Time. More details about the next steps are here:
> > https://www.apache.org/dev/pmc.html#newchair
> >
> > -
> > Denis
> >
>

Re: Re[2]: Apache Ignite 2.7.6 (Time, Scope, and Release manager)

2019-09-05 Thread Dmitriy Govorukhin

Hi Igniters,

I finished work on https://issues.apache.org/jira/browse/IGNITE-12127 and the fix
is already in master and ignite-2.7.6.

On Wed, Sep 4, 2019 at 2:22 PM Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com> wrote:

> Hi Alexey,
>
> I think that I will finish work on the fix tomorrow. The fix is already
> complete, but I need to get a visa from the TC bot.
>
> On Mon, Sep 2, 2019 at 8:27 PM Alexey Goncharuk <
> alexey.goncha...@gmail.com> wrote:
>
>> Folks, it looks like I was overly optimistic with the estimates for the
>> mentioned two tickets.
>>
>> Dmitriy, Anton,
>> Can you share your vision when the issues will be fixed? Perhaps, it makes
>> sense to release 2.7.6 with the already fixed issues and schedule 2.7.7?
>> Neither of them is a regression, so it's ok to release 2.7.6 as it is now.
>>
>> Thoughts?
>>
>> On Sat, Aug 31, 2019 at 11:37 AM Alexey Goncharuk wrote:
>>
>> > Yes, my bad, forgot to include the link. That's the one.
>> >
>> > On Fri, Aug 30, 2019 at 3:01 PM Maxim Muzafarov wrote:
>> >
>> >> Alexey,
>> >>
>> >> Is the issue [1] related to this [2] discussion on the user-list?
>> >> If yes, I think it is very important to include these fixes in 2.7.6.
>> >>
>> >> [1] https://issues.apache.org/jira/browse/IGNITE-12127
>> >> [2]
>> >>
>> http://apache-ignite-users.70518.x6.nabble.com/Node-failure-with-quot-Failed-to-write-buffer-quot-error-td29100.html
>> >>
>> >> On Fri, 30 Aug 2019 at 14:26, Alexei Scherbakov
>> >>  wrote:
>> >> >
>> >> > Alexey,
>> >> >
>> >> > Looks like important fixes, better to include them.
>> >> >
>> >> > On Fri, Aug 30, 2019 at 12:51 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
>> >> >
>> >> > > Igniters,
>> >> > >
>> >> > > Given that the RC1 vote did not succeed and we are still waiting for a few
>> >> > > minor fixes, may I suggest including these two tickets in the 2.7.6 scope?
>> >> > >
>> >> > > https://issues.apache.org/jira/browse/IGNITE-12127
>> >> > > https://issues.apache.org/jira/browse/IGNITE-12128
>> >> > >
>> >> > > The first one has been already reported on the dev-list [1], the second one
>> >> > > may cause a state when an Ignite node cannot start on existing persisted
>> >> > > data. Looking at the tickets, the fixes should be reasonably easy, so it
>> >> > > should not shift 2.7.6 release timeline much.
>> >> > >
>> >> > > Thoughts?
>> >> > >
>> >> > > On Wed, Aug 28, 2019 at 3:25 PM Nikolay Izhikov wrote:
>> >> > >
>> >> > > > Separate repos for different Spark versions sound like a good idea to me.
>> >> > > > Anyway, can you help with the Spark version migration for now?
>> >> > > >
>> >> > > > On Wed, 28/08/2019 at 15:20 +0300, Alexey Zinoviev wrote:
>> >> > > > > Maybe the best solution today is to add a sub-module for each new
>> >> > > > > version of Spark (Spark-2.3, Spark-2.4), or a separate repository with
>> >> > > > > modules for each version, or a separate repository with different
>> >> > > > > branches, as in https://github.com/datastax/spark-cassandra-connector
>> >> > > > >
>> >> > > > > These are 3 ways to support different versions, each with a different
>> >> > > > > cost of support.
>> >> > > > >
>> >> > > > > In the case of a separate repository I could help, for example.
>> >> > > > >
>> >> > > > > On Wed, Aug 28, 2019 at 2:57 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
>> >> > > > >
>> >> > > > > > Hello, Alexey.
>> >> > > > > >
>> >> > > > > > > But the
>> >> > > > > > > compatibility with Spark 2.3 will be broken, isn't it?
>> >> > > > > >
>> >> > > > > > Yes.
>> >> >

Re: Re[2]: Apache Ignite 2.7.6 (Time, Scope, and Release manager)

2019-09-04 Thread Dmitriy Govorukhin

Hi Alexey,

I think that I will finish work on the fix tomorrow. The fix is already
complete, but I need to get a visa from the TC bot.

On Mon, Sep 2, 2019 at 8:27 PM Alexey Goncharuk 
wrote:

> Folks, it looks like I was overly optimistic with the estimates for the
> mentioned two tickets.
>
> Dmitriy, Anton,
> Can you share your vision when the issues will be fixed? Perhaps, it makes
> sense to release 2.7.6 with the already fixed issues and schedule 2.7.7?
> Neither of them is a regression, so it's ok to release 2.7.6 as it is now.
>
> Thoughts?
>
> On Sat, Aug 31, 2019 at 11:37 AM Alexey Goncharuk wrote:
>
> > Yes, my bad, forgot to include the link. That's the one.
> >
> > On Fri, Aug 30, 2019 at 3:01 PM Maxim Muzafarov wrote:
> >
> >> Alexey,
> >>
> >> Is the issue [1] related to this [2] discussion on the user-list?
> >> If yes, I think it is very important to include these fixes in 2.7.6.
> >>
> >> [1] https://issues.apache.org/jira/browse/IGNITE-12127
> >> [2]
> >>
> http://apache-ignite-users.70518.x6.nabble.com/Node-failure-with-quot-Failed-to-write-buffer-quot-error-td29100.html
> >>
> >> On Fri, 30 Aug 2019 at 14:26, Alexei Scherbakov
> >>  wrote:
> >> >
> >> > Alexey,
> >> >
> >> > Looks like important fixes, better to include them.
> >> >
> >> > On Fri, Aug 30, 2019 at 12:51 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >> >
> >> > > Igniters,
> >> > >
> >> > > Given that the RC1 vote did not succeed and we are still waiting for a few
> >> > > minor fixes, may I suggest including these two tickets in the 2.7.6 scope?
> >> > >
> >> > > https://issues.apache.org/jira/browse/IGNITE-12127
> >> > > https://issues.apache.org/jira/browse/IGNITE-12128
> >> > >
> >> > > The first one has been already reported on the dev-list [1], the second one
> >> > > may cause a state when an Ignite node cannot start on existing persisted
> >> > > data. Looking at the tickets, the fixes should be reasonably easy, so it
> >> > > should not shift 2.7.6 release timeline much.
> >> > >
> >> > > Thoughts?
> >> > >
> >> > > On Wed, Aug 28, 2019 at 3:25 PM Nikolay Izhikov wrote:
> >> > >
> >> > > > Separate repos for different Spark versions sound like a good idea to me.
> >> > > > Anyway, can you help with the Spark version migration for now?
> >> > > >
> >> > > > On Wed, 28/08/2019 at 15:20 +0300, Alexey Zinoviev wrote:
> >> > > > > Maybe the best solution today is to add a sub-module for each new
> >> > > > > version of Spark (Spark-2.3, Spark-2.4), or a separate repository with
> >> > > > > modules for each version, or a separate repository with different
> >> > > > > branches, as in https://github.com/datastax/spark-cassandra-connector
> >> > > > >
> >> > > > > These are 3 ways to support different versions, each with a different
> >> > > > > cost of support.
> >> > > > >
> >> > > > > In the case of a separate repository I could help, for example.
> >> > > > >
> >> > > > > On Wed, Aug 28, 2019 at 2:57 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> >> > > > >
> >> > > > > > Hello, Alexey.
> >> > > > > >
> >> > > > > > > But the
> >> > > > > > > compatibility with Spark 2.3 will be broken, isn't it?
> >> > > > > >
> >> > > > > > Yes.
> >> > > > > >
> >> > > > > > > Do you have any
> >> > > > > > > plans to support the different version of Spark without losing your
> >> > > > > > > unique expertise in Spark-Ignite integration?
> >> > > > > >
> >> > > > > > What do you mean by "my unique expertise"? :)
> >> > > > > >
> >> > > > > > How do you see support of several Spark versions?
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, 28/08/2019 at 14:29 +0300, Alexey Zinoviev wrote:
> >> > > > > > > Dear Nikolay Izhikov
> >> > > > > > > Are you going to update the Ignite-Spark integration for
> >> Spark 2.4.
> >> > > > But
> >> > > > > >
> >> > > > > > the
> >> > > > > > > compatibility with Spark 2.3 will be broken, isn't it? Do
> you
> >> have
> >> > > > any
> >> > > > > > > plans to support the different version of Spark without
> >> loosing
> >> > > your
> >> > > > > >
> >> > > > > > unique
> >> > > > > > > expertise in Spark-Ignite integration?
> >> > > > > > >
> >> > > > > > > On Thu, Aug 15, 2019 at 2:54 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> >> > > > > > >
> >> > > > > > > > Hello, Igniters.
> >> > > > > > > >
> >> > > > > > > > I tried to upgrade the Spark version but failed.
> >> > > > > > > >
> >> > > > > > > > It seems the internal Spark APIs (External Catalog, SQL planner) that
> >> > > > > > > > we use changed a lot.
> >> > > > > > > > So it will take some time to upgrade the version.
> >> > > > > > > >
> >> > > > > > > > For now, I am working hard to complete the second phase of IEP-35, so
> >> > > > > > > > I am postponing the Spark version upgrade to Ignite 2.8.
> >> > > > > > > >
> >> > > > > > > > On Thu, 15/08/2019 at 14:51 +0300, Dmitriy Pavlov wrote:
> >> > > > > > > > > Hi Denis,
> >> > > > > > > > >
> >> > >

[jira] [Created] (IGNITE-12128) Potentially pds corruption on a failed node during checkpoint

2019-08-30 Thread Dmitriy Govorukhin (Jira)

Dmitriy Govorukhin created IGNITE-12128:
---

 Summary: Potentially pds corruption on a failed node during 
checkpoint
 Key: IGNITE-12128
 URL: https://issues.apache.org/jira/browse/IGNITE-12128
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


There is a case when we start a checkpoint but have not yet created the CP file 
marker, while PageMemory may already start to flush dirty pages from the 
checkpoint pages to the page store. If the node crashes at this moment, we can 
get an inconsistent state, because the checkpoint marker has not been written 
to disk yet but some pages for this checkpoint already have been. If we try to 
recover from this state, we can get any sort of corruption problem. Recovery 
logic may not recognize that the crash happened during a checkpoint, because we 
did not write the file marker when we started the checkpoint but did write some 
pages for it.
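The ordering problem described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyCheckpoint`, `writeStartMarker` and `flushPage` are hypothetical names, not Ignite classes or methods, and the ticket does not say the actual fix looks like this. The sketch only shows the invariant that a page write for a checkpoint is refused until the start marker has been recorded.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Ignite code): pages may reach the page store only after the start marker. */
class ToyCheckpoint {
    /** True once the checkpoint start marker has been recorded. */
    private boolean markerWritten;

    /** Page ids that have been written to the (simulated) page store. */
    private final List<Integer> pageStore = new ArrayList<>();

    /** Records the start marker; a real store would also have to make it durable. */
    void writeStartMarker() {
        markerWritten = true;
    }

    /** Writes a checkpoint page, refusing to run ahead of the marker. */
    void flushPage(int pageId) {
        if (!markerWritten)
            throw new IllegalStateException("Page " + pageId + " flushed before checkpoint marker");

        pageStore.add(pageId);
    }

    /** Recovery-side check: written pages are always covered by a marker. */
    boolean crashDuringCheckpointDetectable() {
        return markerWritten || pageStore.isEmpty();
    }

    List<Integer> pageStore() {
        return pageStore;
    }
}
```

With this guard, a crash at any point leaves either no checkpoint pages in the store or a marker that tells recovery a checkpoint was in progress.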



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Created] (IGNITE-12127) WAL writer may close file IO with unflushed changes when MMAP is disabled

2019-08-30 Thread Dmitriy Govorukhin (Jira)

Dmitriy Govorukhin created IGNITE-12127:
---

 Summary: WAL writer may close file IO with unflushed changes when 
MMAP is disabled
 Key: IGNITE-12127
 URL: https://issues.apache.org/jira/browse/IGNITE-12127
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


Most likely the issue manifests itself as the following critical error:
{code}
2019-08-27 14:52:31.286 ERROR 26835 --- [wal-write-worker%null-#447] ROOT : 
Critical system error detected. Will be handled accordingly to configured 
handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, 
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
o.a.i.i.processors.cache.persistence.StorageException: Failed to write buffer.]]
org.apache.ignite.internal.processors.cache.persistence.StorageException: 
Failed to write buffer.
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3444)
 [ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.body(FileWriteAheadLogManager.java:3249)
 [ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) 
[ignite-core-2.5.7.jar!/:2.5.7]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201]
Caused by: java.nio.channels.ClosedChannelException: null
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110) 
~[na:1.8.0_201]
at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:253) 
~[na:1.8.0_201]
at 
org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.position(RandomAccessFileIO.java:48)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.file.FileIODecorator.position(FileIODecorator.java:41)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.file.AbstractFileIO.writeFully(AbstractFileIO.java:111)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3437)
 [ignite-core-2.5.7.jar!/:2.5.7]
... 3 common frames omitted
{code}

It appears that the following sequence is possible:
 * Thread A attempts to log a large record which does not fit the segment, 
{{addRecord}} fails and thread A starts segment rollover. It successfully 
runs {{flushOrWait(null)}} and gets de-scheduled before adding the switch 
segment record
 * Thread B attempts to log another record, which fits exactly till the end of 
the current segment. The record is added to the buffer
 * Thread A resumes and fails to add the switch segment record. No flush is 
performed and the thread immediately proceeds to wal-writer close
 * WAL writer thread wakes up, sees that there is a CLOSE request, closes the 
file IO and immediately proceeds to write unflushed changes, causing the 
exception.

Unconditional flush after switch segment record write should fix the issue.

Besides the bug itself, I suggest the following changes to the 
{{FileWriteHandleImpl}} ({{FileWriteAheadLogManager}} in earlier versions):
 * There is an {{fsync(filePtr)}} call inside {{close()}}; however, {{fsync()}} 
checks the {{stop}} flag (which is set inside {{close}}) and returns 
immediately after {{flushOrWait()}} if the flag is set - this is very 
confusing. After all, the {{close()}} itself explicitly calls {{force}} after 
flush
 * There is an ignored IO exception in mmap mode - this should be propagated to 
the failure handler
 * In WAL writer, we check for file CLOSE and then attempt to write to 
(possibly) the same write handle - write should be always before close
 * In WAL writer, there are racy reads of current handle - it would be better 
if we read the current handle once and then operate on it during the whole loop 
iteration
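The last two suggestions can be illustrated with a toy loop iteration. This is a sketch under assumptions, not the real {{FileWriteHandleImpl}} or WAL writer code: `ToyWalWriter`, `Handle`, `pending` and `closeRequested` are hypothetical names. It shows the writer reading the current handle once per iteration and draining pending data before honoring a CLOSE request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model (not Ignite code) of one WAL-writer loop iteration. */
class ToyWalWriter {
    /** Hypothetical stand-in for a segment write handle. */
    static class Handle {
        final List<String> written = new ArrayList<>();
        final List<String> pending = new ArrayList<>();
        boolean closeRequested;
        boolean closed;

        /** Writing to a closed handle is the failure mode from the ticket. */
        void write(String rec) {
            if (closed)
                throw new IllegalStateException("write after close");

            written.add(rec);
        }
    }

    final AtomicReference<Handle> current = new AtomicReference<>(new Handle());

    /** One iteration: read the handle once, write pending data, and only then close. */
    void iterate() {
        Handle h = current.get(); // single read; every later step uses this same handle

        for (String rec : h.pending)
            h.write(rec);

        h.pending.clear();

        if (h.closeRequested)
            h.closed = true;
    }
}
```

Because the write always precedes the close on the same handle reference, a record buffered just before the CLOSE request still reaches the file.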



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Created] (IGNITE-12110) Bugs & tests fixes

2019-08-27 Thread Dmitriy Govorukhin (Jira)

Dmitriy Govorukhin created IGNITE-12110:
---

 Summary:  Bugs & tests fixes
 Key: IGNITE-12110
 URL: https://issues.apache.org/jira/browse/IGNITE-12110
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Created] (IGNITE-12102) idle_verify should show info about lost partitions

2019-08-26 Thread Dmitriy Govorukhin (Jira)

Dmitriy Govorukhin created IGNITE-12102:
---

 Summary: idle_verify should show info about lost partitions
 Key: IGNITE-12102
 URL: https://issues.apache.org/jira/browse/IGNITE-12102
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


In the current implementation, idle_verify does not show lost partitions, and 
the check reports that everything is fine, which is not true.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Created] (IGNITE-12081) Page replacement can reload invalid page during checkpoint

2019-08-16 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-12081:
---

 Summary: Page replacement can reload invalid page during checkpoint
 Key: IGNITE-12081
 URL: https://issues.apache.org/jira/browse/IGNITE-12081
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


There is a race between {{writeCheckpointPages}} and page replacement process:
 * Checkpointer thread begins a checkpoint
 * Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page 
content *and clear dirty flag*
 * Page replacement tries to find a page for replacement and chooses this page, 
the page is thrown away
 * Before the page is written back to the store, the page is acquired again.

As a result, an older copy of the page is brought back to memory, which causes 
all kinds of corruption exceptions and assertions.
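The race above can be modelled with a toy page memory. This is a hedged sketch, not the actual fix for this ticket: `ToyPageMemory` and the `cpInFlight` set are hypothetical, and the guard shown (refusing to evict a page whose checkpoint copy has not yet reached the store) is just one possible way to preserve the invariant.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (not Ignite code) of the checkpoint vs. page replacement race. */
class ToyPageMemory {
    final Map<Integer, String> memory = new HashMap<>(); // pageId -> in-memory content
    final Map<Integer, String> store = new HashMap<>();  // pageId -> on-disk content
    final Set<Integer> dirty = new HashSet<>();

    /** Pages copied for the checkpoint whose copy is not yet in the store (assumed guard). */
    final Set<Integer> cpInFlight = new HashSet<>();

    /** Copies the page content and clears the dirty flag, as described in the ticket. */
    String getPageForCheckpoint(int id) {
        dirty.remove(id);
        cpInFlight.add(id);

        return memory.get(id);
    }

    /** The checkpointer finally writes the copy to the store. */
    void writeToStore(int id, String copy) {
        store.put(id, copy);
        cpInFlight.remove(id);
    }

    /** Page replacement: refuses dirty pages and pages still in flight for the checkpoint. */
    boolean tryReplace(int id) {
        if (dirty.contains(id) || cpInFlight.contains(id))
            return false;

        memory.remove(id);

        return true;
    }

    /** Acquires a page, loading it from the store if it is not in memory. */
    String acquire(int id) {
        return memory.computeIfAbsent(id, store::get);
    }
}
```

Without the `cpInFlight` check, `tryReplace` would succeed right after `getPageForCheckpoint`, and a following `acquire` would reload the older on-disk copy.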

The attached unit test demonstrates the issue. It is likely that all baselines 
are affected starting from 2.4

As a part of this ticket, we must add more unit-tests for checkpointing 
protocol invariants we rely on.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Re: Apache Ignite 2.7.6 (Time, Scope, and Release manager)

2019-08-12 Thread Dmitriy Govorukhin

Hi all,
Can we include https://issues.apache.org/jira/browse/IGNITE-12060? This is
a https://issues.apache.org/jira/browse/IGNITE-11953

On Fri, Aug 9, 2019 at 10:04 AM Nikolay Izhikov  wrote:

> Hello, Denis.
>
> > Nickolay, could you check if that's a quick upgrade?
>
> Yes, I will take a look.
>
>
> On Thu, 08/08/2019 at 11:08 -0700, Denis Magda wrote:
> > Dmitry,
> >
> > Please add this BTree corruption fix to the scope:
> > https://issues.apache.org/jira/browse/IGNITE-11953
> >
> > Plus, I would upgrade our Spark integration to version 2.4 as long as 2.3
> > goes with limitations reported by our users:
> > https://issues.apache.org/jira/browse/IGNITE-12054
> >
> > Nickolay, could you check if that's a quick upgrade?
> >
> > -
> > Denis
> >
> >
> > On Thu, Aug 8, 2019 at 10:40 AM Dmitriy Pavlov 
> wrote:
> >
> > > Hi Ivan, Ilya, Igniters,
> > >
> > >
> > >
> > > I would like this release to be as minimal as possible.
> > >
> > >
> > >
> > > According to dates proposed we could freeze scope at 12.08, 4 days is more
> > > than enough to stand up and say, ‘Hey, I have an urgent fix’. But it is
> > > also ok for me if we decide to have more relaxed dates.
> > >
> > >
> > >
> > > For now, I suppose the following fixes should be cherry-picked:
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-11767 (Blocker)
> > > GridDhtPartitionsFullMessage retains huge maps on heap in exchange
> history
> > >
> > >
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-10451 (Major Bug) .NET:
> > > Persistence does not work with custom affinity function
> > >
> > >
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-9562 (Critical Bug)
> Destroyed
> > > cache that resurrected on an old offline node breaks PME
> > >
> > >
> > >
> > > But I will continue to research JIRA.
> > >
> > >
> > >
> > > Sincerely,
> > >
> > > Dmitriy Pavlov
> > >
> > >
> > > On Thu, Aug 8, 2019 at 5:30 PM Ivan Pavlukhin wrote:
> > >
> > > > > What's the scope for this release?
> > > >
> > > > Same question.
> > > >
> > > > On the other hand an idea of 2.7.6 release attracts me because having
> > > > a practice of doing frequent minor releases can help us to build
> > > > reliable and predictable release rails.
> > > >
> > > > On Thu, Aug 8, 2019 at 3:09 PM Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
> > > > >
> > > > > Hello!
> > > > >
> > > > > What's the scope for this release?
> > > > >
> > > > > Regards,
> > > > > --
> > > > > Ilya Kasnacheev
> > > > >
> > > > >
> > > > > On Thu, Aug 8, 2019 at 3:07 PM Dmitriy Pavlov wrote:
> > > > >
> > > > > > Hi Apache Ignite Developers,
> > > > > >
> > > > > >
> > > > > >
> > > > > > We seem to be on the same page about 2.8 release, but we’ve started a new
> > > > > > practice - minor releases, the first release was 2.7.5. Unfortunately,
> > > > > > there are a couple of issues still not fixed in that release, so I would
> > > > > > like to propose to prepare one more minor release for Apache Ignite.
> > > > > >
> > > > > >
> > > > > >
> > > > > > I propose my candidacy to be release manager once again.
> > > > > >
> > > > > >
> > > > > >
> > > > > > I could (of course with help from Peter Ivanov) make some additional effort
> > > > > > to complete and improve the process doc
> > > > > > https://cwiki.apache.org/confluence/display/IGNITE/Release+Process
> > > > > >
> > > > > >
> > > > > >
> > > > > > I propose (most optimistic) dates for release stages:
> > > > > >
> > > > > > Scope Freeze: August 12, 2019
> > > > > >
> > > > > > Code Freeze: August 15, 2019
> > > > > >
> > > > > > Voting Date: August 22, 2019
> > > > > >
> > > > > > Release Date: August 27, 2019
> > > > > >
> > > > > >
> > > > > > WDYT?
> > > > > >
> > > > > >
> > > > > > If nobody minds, I will create branch 2.7.6 based on 2.7.5 and set it up in
> > > > > > the TC Bot during the weekend.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sincerely,
> > > > > >
> > > > > > Dmitriy Pavlov
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Ivan Pavlukhin
> > > >
>

[jira] [Created] (IGNITE-12060) Incorrect row size calculation, lead to tree corruption.

2019-08-12 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-12060:
---

 Summary: Incorrect row size calculation, lead to tree corruption.
 Key: IGNITE-12060
 URL: https://issues.apache.org/jira/browse/IGNITE-12060
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.8






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Created] (IGNITE-12057) Persistence files are stored to temp dir

2019-08-10 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-12057:
---

 Summary: Persistence files are stored to temp dir
 Key: IGNITE-12057
 URL: https://issues.apache.org/jira/browse/IGNITE-12057
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


h2. Description
Check this thread:
[https://stackoverflow.com/questions/56951913/ignite-persistent-schema-tables-disappeared-sometimes/56977212#56977212]

This prospect almost dropped us because the company could not figure out why 
persistence files disappear upon restarts. They turned off the WARN logging 
level and could not see our warning saying that the files are written to a 
temporary directory.

I've updated Ignite docs:
[https://apacheignite.readme.io/docs/distributed-persistent-store#section-persistence-path-management]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Created] (IGNITE-12048) Bugs & tests fixes

2019-08-07 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-12048:
---

 Summary: Bugs & tests fixes
 Key: IGNITE-12048
 URL: https://issues.apache.org/jira/browse/IGNITE-12048
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


Page replacement can reload invalid page during checkpoint

There is a race between {{writeCheckpointPages}} and page replacement process:
 * Checkpointer thread begins a checkpoint
 * Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page 
content *and clear dirty flag*
 * Page replacement tries to find a page for replacement and chooses this page, 
the page is thrown away
 * Before the page is written back to the store, the page is acquired again.

As a result, an older copy of the page is brought back to memory, which causes 
all kinds of corruption exceptions and assertions.

-

checkpointReadLock() may hang during node stop

I got this hang during one of PDS (Indexing) runs (thread-dump is attached). 
The following code hangs:
{code:java}
checkpointer.wakeupForCheckpoint(0, "too many dirty pages").cpBeginFut
.getUninterruptibly();
{code}
It looks like {{wakeupForCheckpoint}} can be called after the checkpointer is 
stopped, and {{cpBeginFut}} will never be completed.
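A minimal sketch of one way to avoid such a hang, assuming a toy checkpointer built on `java.util.concurrent.CompletableFuture`. `ToyCheckpointer` is hypothetical, and Ignite's own future classes and the actual fix may differ. The idea is that stopping fails the outstanding future, and later wake-ups get an already failed future instead of one nobody will complete.

```java
import java.util.concurrent.CompletableFuture;

/** Toy model (not Ignite code): no caller can wait forever on a stopped checkpointer. */
class ToyCheckpointer {
    private boolean stopped;

    /** Future completed when the next checkpoint begins. */
    private final CompletableFuture<Void> cpBeginFut = new CompletableFuture<>();

    /** Returns the begin future, or an already failed one if the checkpointer is stopped. */
    synchronized CompletableFuture<Void> wakeupForCheckpoint(String reason) {
        if (stopped) {
            CompletableFuture<Void> failed = new CompletableFuture<>();

            failed.completeExceptionally(
                new IllegalStateException("Checkpointer is stopped [reason=" + reason + ']'));

            return failed;
        }

        return cpBeginFut;
    }

    /** Stops the checkpointer and releases everyone waiting on the outstanding future. */
    synchronized void stop() {
        stopped = true;

        cpBeginFut.completeExceptionally(new IllegalStateException("Checkpointer is stopped"));
    }
}
```

A caller blocked on the future then receives an exception on node stop rather than waiting indefinitely.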

-

Fixed 
ZookeeperDiscoveryCommunicationFailureTest.testCommunicationFailureResolve_CachesInfo1

Fixed  *.testFailAfterStart



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Re: Link to Grid Gain internal ticket in Ignite

2019-07-20 Thread Dmitriy Govorukhin

Nikolay, my fault, already fixed.

https://issues.apache.org/jira/browse/IGNITE-11998

On Thu, Jul 18, 2019 at 4:06 PM Nikolay Izhikov  wrote:

> Hello, Igniters.
>
> Dmitriy, can you please create an Ignite Jira ticket from your internal one [1]?
> Or change the link to an existing Ignite ticket.
>
> There is another link to a GG ticket in the code brought in by your merge.
>
>
> ```
> @Ignore("https://ggsystems.atlassian.net/browse/GG-20800")
> ```
>
> [1]
> https://github.com/apache/ignite/commit/9c323149db1cee0ff6586389def059a85428b116#diff-99a599db7a5b3f3272824d67e33ff162R96
>

[jira] [Created] (IGNITE-11953) BTree corruption caused by byte array values

2019-07-02 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11953:
---

 Summary: BTree corruption caused by byte array values
 Key: IGNITE-11953
 URL: https://issues.apache.org/jira/browse/IGNITE-11953
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases, for caches that belong to a cache group, we can get a BTree 
corruption exception.

{code}
09:53:58,890][SEVERE][sys-stripe-10-#11][] Critical system error detected. Will 
be handled accordingly to configured handler [hnd=CustomFailureHandler 
[ignoreCriticalErrors=false, disabled=false][StopNodeOrHaltFailureHandler 
[tryStop=false, timeout=0]], failureCtx=FailureContext [type=CRITICAL_ERROR, 
err=class o.a.i.i.transactions.IgniteTxHeuristicCheckedException: Committing a 
transaction has produced runtime exception]]class 
org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: 
Committing a transaction has produced runtime exception
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.heuristicException(IgniteTxAdapter.java:800)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:922)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:799)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:608)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:478)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:535)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1055)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:931)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:887)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:117)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:209)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:207)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1129)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:594)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1568)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1196)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1092)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:504)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
Caused by: class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=427, 
val=Grkg1DUF3yQE6tC9Se50mi5w.T, hasValBytes=true], hash=1872857770, 
cacheId=-420893003]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1811)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1620)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1603)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2131)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:442)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue

[jira] [Created] (IGNITE-11934) Bugs & tests fixes

2019-06-18 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11934:
---

 Summary:  Bugs & tests fixes
 Key: IGNITE-11934
 URL: https://issues.apache.org/jira/browse/IGNITE-11934
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


This issue contains fixes for several problems:
 * AssertionError occurs on the client when the coordinator is killed (with ZK 
discovery).
 * IgniteVersionUtils#BUILD_TSTAMP_DATE_FORMATTER is used in a non-thread-safe 
manner.
 * Possible discovery race on node join with Authenticator.
 * PageLocksCommand#parseArguments cannot properly parse the user and 
password arguments if they are at the end of the argument list.
 * Test CheckpointFreeListTest.testRestoreFreeListCorrectlyAfterRandomStop 
fails on TC.
 * IgniteWalFlushBackgroundSelfTest.testFailWhileStart and 
IgniteWalFlushLogOnlySelfTest.testFailWhileStart fail in the disk compression suite.
 * IgniteClientConnectAfterCommunicationFailureTest fails.
 * Add a scale factor for PageLockTrackerTests.
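
The formatter item in the list above is a common pitfall: java.text.SimpleDateFormat keeps mutable state, so sharing one static instance between threads is unsafe. The sketch below shows one possible way to avoid that with the immutable java.time.format.DateTimeFormatter. It is not the actual Ignite fix; the class name BuildStampFormat and the "yyyyMMdd" pattern are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper, not part of Ignite: a date formatter that is safe to share between threads. */
final class BuildStampFormat {
    /** DateTimeFormatter is immutable, so a single static instance can be used concurrently. */
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    private BuildStampFormat() {
        // Static helper, no instances.
    }

    /** Formats a date as a build stamp string (pattern is an assumption). */
    static String format(LocalDate date) {
        return FMT.format(date);
    }

    /** Parses a build stamp string back into a date. */
    static LocalDate parse(String stamp) {
        return LocalDate.parse(stamp, FMT);
    }
}
```

If the code has to stay on java.text.SimpleDateFormat, a ThreadLocal holding one formatter per thread is the usual alternative.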



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-11835) Support JMX/control.sh API for page lock dump

2019-05-06 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11835:
---

 Summary: Support JMX/control.sh API for page lock dump
 Key: IGNITE-11835
 URL: https://issues.apache.org/jira/browse/IGNITE-11835
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-11824) Integrate PageLockTracker to DataStructure (per-thread tracker)

2019-04-30 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11824:
---

 Summary: Integrate PageLockTracker to DataStructure (per-thread 
tracker)
 Key: IGNITE-11824
 URL: https://issues.apache.org/jira/browse/IGNITE-11824
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


After [IGNITE-11750] is completed, we will have a structure for tracking 
page locks per thread. The next step is to integrate it into the diagnostic API 
and implement a component that creates this structure for each thread.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-11786) Implement thread-local stack for tracking page locks

2019-04-19 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11786:
---

 Summary: Implement thread-local stack for tracking page locks
 Key: IGNITE-11786
 URL: https://issues.apache.org/jira/browse/IGNITE-11786
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin


The new structure should work as a stack. 
When a thread obtains a lock, we push the pageId (+meta) on top of the stack; when 
the thread releases the lock, we pop the pageId from the stack. There are cases when a 
thread may unlock a page that is not in the current stack frame but in a previous one 
(for example, when splitting pages in a B-tree). In this case, we should walk down the 
stack, find the page, and update its meta.

{code}
{code}
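
A minimal sketch of the stack described above, under stated assumptions: the class name PageLockStack and its methods are hypothetical and are not Ignite's API, the per-page meta is omitted, and page id 0 is assumed to be never valid so it can mark a slot that was released out of order.

```java
import java.util.Arrays;

/** Hypothetical per-thread page lock stack; names are illustrative, not Ignite's API. */
final class PageLockStack {
    /** One stack per thread, so push and pop need no synchronization. */
    private static final ThreadLocal<PageLockStack> LOCAL = ThreadLocal.withInitial(PageLockStack::new);

    /** Page ids of held locks, oldest first. 0 marks a slot released out of order (assumption). */
    private long[] pageIds = new long[16];

    /** Index of the next free slot. */
    private int size;

    /** Returns the stack that belongs to the calling thread. */
    static PageLockStack current() {
        return LOCAL.get();
    }

    /** Called after a lock is acquired: push the page id on top of the stack. */
    void onLock(long pageId) {
        if (size == pageIds.length)
            pageIds = Arrays.copyOf(pageIds, size * 2);

        pageIds[size++] = pageId;
    }

    /** Called after a lock is released: usually the top entry, but may be a deeper one. */
    void onUnlock(long pageId) {
        // Walk down from the top to find the page.
        for (int i = size - 1; i >= 0; i--) {
            if (pageIds[i] == pageId) {
                pageIds[i] = 0;

                // Drop every released slot that is now on top of the stack.
                while (size > 0 && pageIds[size - 1] == 0)
                    size--;

                return;
            }
        }

        throw new IllegalStateException("Page is not locked by this thread: " + pageId);
    }

    /** Copy of the current stack content, for dumps and diagnostics. */
    long[] snapshot() {
        return Arrays.copyOf(pageIds, size);
    }
}
```

For example, after locking pages 10, 20 and 30 and then unlocking 20, the snapshot keeps a hole in the middle; unlocking 30 afterwards removes both the top entry and the hole.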



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-11738) Incorrect check ObjectInput.available() in CacheMetricsSnapshot

2019-04-12 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11738:
---

 Summary: Incorrect check  ObjectInput.available() in 
CacheMetricsSnapshot
 Key: IGNITE-11738
 URL: https://issues.apache.org/jira/browse/IGNITE-11738
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-11641) Server node copies a lot of WAL files in WAL archive after restart

2019-03-27 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11641:
---

 Summary: Server node copies a lot of WAL files in WAL archive 
after restart
 Key: IGNITE-11641
 URL: https://issues.apache.org/jira/browse/IGNITE-11641
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


Pre-condition: PDS is enabled; wal_path and wal_archive_path are set in the config 
file.

1. Cluster is up and running. Some data is uploaded into caches.
2. Start a load to generate a lot of files in the WAL archive (more than the number 
of files in the WAL directory).
3. Stop a node and delete all files from its WAL archive.
4. Start the node.

In this case the node copies WAL files from the WAL directory into the WAL archive 
directory again and again until the number of files matches what was in the WAL 
archive before the stop.

Here is the relevant part of the server node log:

{code}
[10:10:17,054][INFO][main][GridCacheDatabaseSharedManager] Restoring partition 
state for local groups.
[10:10:18,522][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal]
[10:10:18,523][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=1, segIdx=1, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal]
[10:10:20,631][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal]
[10:10:20,632][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=2, segIdx=2, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal]
[10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal]
[10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=3, segIdx=3, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal]
[10:10:23,995][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal]
[10:10:23,996][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=4, segIdx=4, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal]
[10:10:24,644][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal]
[10:10:24,645][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=5, segIdx=5, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal]
[10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal]
[10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=6, segIdx=6, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal]
[10:10:26,043][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal,
 
dst=/storage/ssd/avolkov/wal_archive

[jira] [Created] (IGNITE-11509) Remove DistributedBaselineConfiguration and replace to methods on IgniteCluster

2019-03-08 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11509:
---

 Summary: Remove DistributedBaselineConfiguration and replace to 
methods on IgniteCluster
 Key: IGNITE-11509
 URL: https://issues.apache.org/jira/browse/IGNITE-11509
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: [DISCUSSION] Usage of system properties in tests

2019-02-15 Thread Dmitriy Govorukhin

++1 from my side. I guess it will be a more reliable way to work with
properties in the tests. From time to time somebody forgets to clear a property
after a test, and subsequent tests may fail or become flaky.

On Thu, Feb 14, 2019 at 6:56 PM Dmitriy Pavlov  wrote:

> I find it an absolutely positive and modern approach and a good contribution.
> Count on my support if you need any assistance with applying this
> patch.
>
> Thu, Feb 14, 2019 at 18:53, Ivan Bessonov :
>
> > Hello Igniters,
> >
> > I'd like to discuss the way we treat system properties in our test
> > codebase.
> > It's a common pattern where we set system property in
> "beforeTestsStarted"
> > and clear it in "afterTestsStopped". Purest example that I've found is
> > class
> > "RedisProtocolGetAllAsArrayTest".
> >
> > There are similar things with "beforeTest"/"afterTest" or huge
> > "try/finally" blocks
> > right in test methods.
> >
> > I think that all this code can be avoided and solution might look like
> > this:
> >
> > @Test
> > @WithSystemProperty(key = IGNITE_PROPERTY_NAME, value = "true")
> > public void testSomething() throws Exception {
> > ...
> > }
> >
> > Same annotation might be used on class, this way new system property will
> > be applied to all test methods in the class.
> >
> > I already created the issue for this change [1] and PR with demo [2]. It
> > contains
> > implementation of annotation processing and a few migrated tests. If you
> > like
> > the idea then I will migrate all the other tests on the same mechanism.
> >
> > What do you think?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-11323
> > [2] https://github.com/apache/ignite/pull/6109
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
> >
>
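
The set-and-restore pattern discussed in this thread can be sketched in plain Java as follows. This is only an illustration of what such an annotation has to do around a test method, not the annotation processing from the PR; the class SystemPropertyScope and the method withProperty are hypothetical names.

```java
import java.util.function.Supplier;

/** Hypothetical helper showing the set/restore pattern; not the implementation from the PR. */
final class SystemPropertyScope {
    private SystemPropertyScope() {
        // Static helper, no instances.
    }

    /** Sets the property, runs the body, then restores the previous value even if the body throws. */
    static <T> T withProperty(String key, String val, Supplier<T> body) {
        String old = System.getProperty(key);

        System.setProperty(key, val);

        try {
            return body.get();
        }
        finally {
            // Restore the state that existed before the call so later tests are not affected.
            if (old == null)
                System.clearProperty(key);
            else
                System.setProperty(key, old);
        }
    }
}
```

Restoring the old value, rather than always clearing the property, keeps the helper safe when the property was already set before the test started.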

[jira] [Created] (IGNITE-11095) Failed WalCompactionTest flaky test

2019-01-25 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-11095:
---

 Summary: Failed WalCompactionTest flaky test
 Key: IGNITE-11095
 URL: https://issues.apache.org/jira/browse/IGNITE-11095
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10974) Grid may hang if an exception is thrown from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10974:
---

 Summary: Grid may hang if an exception is thrown from 
PageMemoryImpl.beforeReleaseWrite()
 Key: IGNITE-10974
 URL: https://issues.apache.org/jira/browse/IGNITE-10974
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


 

 

{code}

[2019-01-17 14:35:15,953][WARN ][main][root] Thread dump at 2019/01/17 14:35:15 
UTC
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#857%failure.IoomFailureHandlerTest0%", id=931, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#856%failure.IoomFailureHandlerTest0%", id=930, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#855%failure.IoomFailureHandlerTest0%", id=929, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#854%failure.IoomFailureHandlerTest0%", id=928, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc

[jira] [Created] (IGNITE-10909) GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky test fail in Cache 1

2019-01-11 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10909:
---

 Summary: GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky 
test fail in Cache 1
 Key: IGNITE-10909
 URL: https://issues.apache.org/jira/browse/IGNITE-10909
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10908) GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail with NPE in Service Grid (legacy mode)

2019-01-11 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10908:
---

 Summary: 
GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail 
with NPE in Service Grid (legacy mode) 
 Key: IGNITE-10908
 URL: https://issues.apache.org/jira/browse/IGNITE-10908
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10907) IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky failed in Basic 1

2019-01-11 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10907:
---

 Summary: 
IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky 
failed in Basic 1
 Key: IGNITE-10907
 URL: https://issues.apache.org/jira/browse/IGNITE-10907
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10891) IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead flaky in PDS indexing

2019-01-11 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10891:
---

 Summary: IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead 
flaky in PDS indexing
 Key: IGNITE-10891
 URL: https://issues.apache.org/jira/browse/IGNITE-10891
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10883) IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky failed in PDS4

2019-01-10 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10883:
---

 Summary: IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky 
failed in PDS4
 Key: IGNITE-10883
 URL: https://issues.apache.org/jira/browse/IGNITE-10883
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


[testStopCachesOnDeactivation|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-3436991258700651390=testDetails]

The first problem is in the test: it does not check that rebalance has completed after 
the test action is performed. The second problem is in an assert: there is no guarantee 
that the cache will not be destroyed before the checkpoint completes.

{code}

Failed to notify listener: 
o.a.i.i.processors.cache.WalStateManager$3...@31e26a1java.lang.AssertionError
at 
org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:510)
at 
org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:505)
at 
org.apache.ignite.internal.util.lang.IgniteInClosureX.apply(IgniteInClosureX.java:38)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4280)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4275)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:456)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3904)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3353)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3119)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10508) Need to support the new checkpoint feature not wait for the previous operation to complete

2018-12-03 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10508:
---

 Summary: Need to support the new checkpoint feature not wait for 
the previous operation to complete
 Key: IGNITE-10508
 URL: https://issues.apache.org/jira/browse/IGNITE-10508
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


There are cases when we need to trigger a checkpoint so that some operations can be 
sure that all changes are completed before the checkpoint. It is necessary to 
support running a checkpoint without waiting for the completion of 
the previous checkpoint.

Solution:

Merge checkpoint pages and append new dirty pages to the current checkpoint.

Restrictions:

- Triggering a new checkpoint should not wait for the previous checkpoint operation 
to complete.

- It should not break crash recovery mechanisms.

- Only one merge is allowed in the first implementation (potential OOM if we 
try to merge many checkpoint operations).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: [VOTE] Creation dedicated list for github notifications

2018-12-02 Thread Dmitriy Govorukhin

+1

Tue, Nov 27, 2018, 19:41 Denis Mekhanikov dmekhani...@gmail.com:

> +1
> I'm for making the dev list readable without filters of any kind.
>
> On Tue, Nov 27, 2018, 15:14 Maxim Muzafarov 
> > +1
> >
> > Let's have a look at how it will be.
> >
> > On Tue, 27 Nov 2018 at 14:48 Seliverstov Igor 
> > wrote:
> >
> > > +1
> > >
> > > Tue, Nov 27, 2018 at 14:45, Юрий :
> > >
> > > > +1
> > > >
> > > > Tue, Nov 27, 2018 at 11:22, Andrey Mashenkov <
> > > andrey.mashen...@gmail.com
> > > > >:
> > > >
> > > > > +1
> > > > >
> > > > > On Tue, Nov 27, 2018 at 10:12 AM Sergey Chugunov <
> > > > > sergey.chugu...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Plus this dedicated list should be properly documented in wiki,
> > > > > mentioning
> > > > > > it in How to Contribute [1] or in Make Teamcity Green Again [2]
> > would
> > > > be
> > > > > a
> > > > > > good idea.
> > > > > >
> > > > > > [1]
> > > > https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/Make+Teamcity+Green+Again
> > > > > >
> > > > > > On Tue, Nov 27, 2018 at 9:51 AM Павлухин Иван <
> vololo...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > > Tue, Nov 27, 2018 at 09:22, Dmitrii Ryabov <
> > > somefire...@gmail.com
> > > > >:
> > > > > > > >
> > > > > > > > 0
> > > > > > > > Tue, Nov 27, 2018 at 02:33, Alexey Kuznetsov <
> > > > > akuznet...@apache.org
> > > > > > >:
> > > > > > > > >
> > > > > > > > > +1
> > > > > > > > > Do not forget notification from GitBox too!
> > > > > > > > >
> > > > > > > > > On Tue, Nov 27, 2018 at 2:20 AM Zhenya
> > > >  > > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1, I already do this with filters.
> > > > > > > > > >
> > > > > > > > > > > This was discussed already [1].
> > > > > > > > > > >
> > > > > > > > > > > So, I want to complete this discussion by moving
> > > > > > > > > > > GitHub notifications from the dev list to a dedicated list.
> > > > > > > > > > >
> > > > > > > > > > > Please start voting.
> > > > > > > > > > >
> > > > > > > > > > > +1 - to accept this change.
> > > > > > > > > > > 0 - you don't care.
> > > > > > > > > > > -1 - to decline this change.
> > > > > > > > > > >
> > > > > > > > > > > This vote will go for 72 hours.
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/Time-to-remove-automated-messages-from-the-devlist-td37484i20.html
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Alexey Kuznetsov
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best regards,
> > > > > > > Ivan Pavlukhin
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > > Andrey V. Mashenkov
> > > > >
> > > >
> > > >
> > > > --
> > > > Live with a smile! :D
> > > >
> > >
> > --
> > --
> > Maxim Muzafarov
> >
>

[jira] [Created] (IGNITE-10341) Missed loss policy tests with persistence

2018-11-20 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10341:
---

 Summary: Missed loss policy tests with persistence
 Key: IGNITE-10341
 URL: https://issues.apache.org/jira/browse/IGNITE-10341
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-10207 was implemented, the test that checks the loss policy when 
persistence is enabled was removed. This was a mistake; the change needs to be reverted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10290) Map.Entry interface for key cache may lead to incorrect calculation hash code

2018-11-15 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10290:
---

 Summary: Map.Entry interface for key cache may lead to incorrect 
calculation hash code
 Key: IGNITE-10290
 URL: https://issues.apache.org/jira/browse/IGNITE-10290
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: Reproducer.java

If the Map.Entry interface is used for a key, we may try to find the (key, value) pair 
in the store with an incorrectly calculated hash code for the binary representation.
The problem is that 
GridPartitionedSingleGetFuture#localGet() and 
GridPartitionedGetFuture#localGet() do not execute prepareForCache before 
reading cacheDataRow from the row store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10285) U.doInParallel may lead to deadlock

2018-11-15 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10285:
---

 Summary: U.doInParallel may lead to deadlock
 Key: IGNITE-10285
 URL: https://issues.apache.org/jira/browse/IGNITE-10285
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: dump.rtf

There is a case in which we can get a deadlock on the thread pool.
If doInParallel is called from sys-pool threads and the number of such 
threads equals the sys-pool size, we get a deadlock: the threads in sys-pool try to run 
doInParallel through the same sys-pool and wait on the future indefinitely, 
because no thread can complete the doInParallel operation, which requires 
free threads from sys-pool.
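
The starvation described above can be reproduced with a plain java.util.concurrent fixed pool. The sketch below is not Ignite code: PoolStarvationDemo is a hypothetical class, and a timeout is used instead of an infinite wait so that the demo terminates. Every pool thread runs an outer task that submits an inner task to the same pool and waits for it, so no thread is free to run the inner tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo of thread pool starvation; not Ignite's doInParallel. */
final class PoolStarvationDemo {
    private PoolStarvationDemo() {
        // Static demo, no instances.
    }

    /** Returns true if at least one outer task timed out while waiting for its inner task. */
    static boolean starves(int poolSize, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        List<Future<Boolean>> outer = new ArrayList<>();

        try {
            for (int i = 0; i < poolSize; i++) {
                outer.add(pool.submit(() -> {
                    // Make sure every pool thread is occupied by an outer task.
                    allStarted.countDown();
                    allStarted.await();

                    // The inner task is queued to the same pool, which has no free thread.
                    Future<?> inner = pool.submit(() -> { });

                    try {
                        inner.get(timeoutMs, TimeUnit.MILLISECONDS);

                        return false;
                    }
                    catch (TimeoutException e) {
                        // Without the timeout this wait would never end.
                        return true;
                    }
                }));
            }

            boolean starved = false;

            for (Future<Boolean> f : outer)
                starved |= f.get();

            return starved;
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            pool.shutdownNow();
        }
    }
}
```

Calling PoolStarvationDemo.starves(2, 200) reports starvation because the first outer task to finish can only finish by timing out. Typical ways to avoid the problem are to run nested work on a separate pool or to let the calling thread execute part of the work itself.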



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10252) Cache.get() may be mapped to the node with partition state is "MOVING"

2018-11-14 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10252:
---

 Summary: Cache.get() may be mapped to the node with partition 
state is "MOVING"
 Key: IGNITE-10252
 URL: https://issues.apache.org/jira/browse/IGNITE-10252
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-5357 was implemented, in some cases a get may be mapped to a node whose 
partition is in the "MOVING" state for a PARTITIONED cache, which may lead to 
assertion errors (we do not allow reads from moving partitions). The original 
issue discussed only replicated caches; it is not clear why this was also implemented 
for partitioned caches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-10207) Missed loss policy checks

2018-11-09 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-10207:
---

 Summary: Missed loss policy checks
 Key: IGNITE-10207
 URL: https://issues.apache.org/jira/browse/IGNITE-10207
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases (client reconnect, new client join, etc.) PartitionLossPolicy may 
validate an operation incorrectly: null is returned under READ_ONLY_SAFE for a lost 
partition.
To reproduce, run CacheResultIsNotNullOnPartitionLossTest (1000 times) 
with random node stops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Apache Ignite 2.7. Last Mile

2018-10-23 Thread Dmitriy Govorukhin

t; > > I have got one more potential 2.7 blocker [1] with
> > straightforward
> > > > >
> > > > > fix. I
> > > > > > > > > believe it will not break any production use case, but it
> > leads to
> > > > >
> > > > > test
> > > > > > > > suite hang, thus affecting other urgent issues.
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-9932
> > > > > > > >
> > > > > > > > > Thu, Oct 18, 2018 at 14:59, Ivan Daschinsky <
> > ivanda...@gmail.com>:
> > > > > > > >
> > > > > > > > > Hi! Is it possible to merge IGNITE-9854? Fix is pretty
> > simple, but
> > > > > > >
> > > > > > > quite
> > > > > > > > > important.
> > > > > > > > >
> > > > > > > > > > Wed, Oct 17, 2018 at 17:49, Andrey Gura  >:
> > > > > > > > >
> > > > > > > > > > JFYI
> > > > > > > > > >
> > > > > > > > > > IGNITE-9737 and IGNITE-9710 are merged to release branch.
> > > > > > > > > > On Wed, Oct 17, 2018 at 5:41 PM Pavel Tupitsyn <
> > > > >
> > > > > ptupit...@apache.org
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Thank you. Fix has been merged to master and
> > cherry-picked to
> > > > > > > > >
> > > > > > > > > ignite-2.7.
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 17, 2018 at 1:26 PM Nikolay Izhikov <
> > > > > > >
> > > > > > > nizhi...@apache.org>
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Pavel.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok, I agree to include this ticket into 2.7
> > > > > > > > > > > > Let's do it.
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, 17/10/2018 at 13:20 +0300, Pavel Tupitsyn wrote:
> > > > > > > > > > > > > Nikolay,
> > > > > > > > > > > > >
> > > > > > > > > > > > > It completely breaks a major feature under certain
> > > > >
> > > > > conditions.
> > > > > > >
> > > > > > > I
> > > > > > > > > >
> > > > > > > > > > would
> > > > > > > > > > > > > consider it a blocker.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Oct 17, 2018 at 1:00 PM Nikolay Izhikov <
> > > > > > > > >
> > > > > > > > > nizhi...@apache.org
> > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hello, Pavel.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Is it a blocker?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, 17/10/2018 at 12:58 +0300, Pavel Tupitsyn
> > wrote:
> > > > > > > > > > > > > > > Hi Igniters,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'd like to include IGNITE-9877 in 2.7, can we
> > do that?
> > > > > > > > > > > > > > > The fix is ready, I'm waiting for TC run.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Pavel
> > > > > > > > > > > > > > >
> > > > > > > > > > > >

[jira] [Created] (IGNITE-9898) Checkpointer thread hangs on await async task complete

2018-10-16 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9898:
--

 Summary: Checkpointer thread hangs on await async task complete
 Key: IGNITE-9898
 URL: https://issues.apache.org/jira/browse/IGNITE-9898
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases, the thread pool counters can be reset while an async task is 
still executing, and the checkpointer thread then hangs on await:

{code}
[19:36:01] : [Step 4/5] [2018-10-15 16:36:01,435][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:03] : [Step 4/5] [2018-10-15 16:36:03,435][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:05] : [Step 4/5] [2018-10-15 16:36:05,436][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:07] : [Step 4/5] [2018-10-15 16:36:07,436][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:09] : [Step 4/5] [2018-10-15 16:36:09,437][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:11] : [Step 4/5] [2018-10-15 16:36:11,437][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:13] : [Step 4/5] [2018-10-15 16:36:13,438][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:15] : [Step 4/5] [2018-10-15 16:36:15,439][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:17] : [Step 4/5] [2018-10-15 16:36:17,440][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:19] : [Step 4/5] [2018-10-15 16:36:19,441][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:21] : [Step 4/5] [2018-10-15 16:36:21,442][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0

{code}
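
One way the counters in the log above could end up as pendingTaskCnt=2 and completedTaskCnt=3 is sketched below. This is only an illustration under assumptions: the class, its methods, and the "await until both counters are equal" condition are hypothetical and are not taken from GridCacheDatabaseSharedManager. The replay is single-threaded so the interleaving is fixed.

```java
/** Hypothetical model of the checkpoint pool counters; not Ignite code. */
class CheckpointCounters {
    int pendingTaskCnt;
    int completedTaskCnt;

    void submit() { pendingTaskCnt++; }

    void complete() { completedTaskCnt++; }

    /** Resets both counters, as would happen when a new checkpoint starts. */
    void reset() {
        pendingTaskCnt = 0;
        completedTaskCnt = 0;
    }

    /** Assumed await condition: every submitted task has reported completion. */
    boolean awaitWouldReturn() { return pendingTaskCnt == completedTaskCnt; }

    /** Replays one interleaving in which a reset races with a running task. */
    static CheckpointCounters replay() {
        CheckpointCounters c = new CheckpointCounters();

        c.submit();
        c.submit();   // first round: two tasks submitted
        c.complete(); // one of them finishes
        c.reset();    // counters are reset while the second task still runs

        c.submit();
        c.submit();   // second round: two new tasks
        c.complete(); // straggler from the first round finishes
        c.complete();
        c.complete(); // both second-round tasks finish

        return c;
    }

    public static void main(String[] args) {
        CheckpointCounters c = replay();

        System.out.println("pendingTaskCnt=" + c.pendingTaskCnt
            + ", completedTaskCnt=" + c.completedTaskCnt
            + ", awaitWouldReturn=" + c.awaitWouldReturn());
    }
}
```

Under the assumed equality condition, the await never returns once the straggler has been counted against the new round.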



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: jira issue filling rules

2018-10-08 Thread Dmitriy Govorukhin

Vyacheslav,
Thanks for the links. I just wanted to remind everyone of these rules; not all
Igniters follow them in their work.

On Mon, Oct 8, 2018 at 12:17 PM Vyacheslav Daradur 
wrote:

> Hi Igniters,
>
> Dmitriy Govorukhin, the naming of GitHub PR [1] and Upsource review
> [2] already described in the wiki.
>
> About the naming of Jira issue, I'd suggest adding rules in related
> part [3] in the wiki.
>
> [1]
> https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute#HowtoContribute-Creation
> [2]
> https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute#HowtoContribute-ReviewWithUpsource
> [3]
> https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute#HowtoContribute-TicketCreation
> On Mon, Oct 8, 2018 at 12:16 PM Maxim Muzafarov 
> wrote:
> >
> > Dmitry,
> >
> > I fully support you!
> >
> > It's never too much remind these simple rules. It's not obvious at first
> > glance but placing comment
> > 'why' and 'how' particular JIRA issue was done will help much for the
> > further code investigations.
> >
> > I'm not sure that pinning `link` to the TC result is a good practice. It
> > becomes obsolete too quickly.
> > Having the TC.Bot visa with ~no blockers found~ and the brief description
> > of implementation details should be enough.
> >
> > Thoughts?
> >
> > On Sun, 7 Oct 2018 at 18:53 Dmitriy Pavlov 
> wrote:
> >
> > > ++1 from my side, as always.
> > >
> > > I want to add GitHub as a possible place of discussion, but any place
> we've
> > > selected to run review should be mentioned in JIRA ticket. E.g., I
> left a
> > > couple of comments in PR; please take a look.
> > >
> > > Sun, Oct 7, 2018 at 18:01, Dmitriy Govorukhin <
> > > dmitriy.govoruk...@gmail.com
> > > >:
> > >
> > > > Hi folks,
> > > >
> > > > Recently time ago, I noticed that many Jira issues created and
> filled in
> > > > different ways,
> > > > so someone does not fill issue description, someone does not attach
> links
> > > > in the links section instead, add links to comments and etc.
> > > > I want to star discussion regards to Jira issue filling rules.
> > > >
> > > > I suggest,
> > > >
> > > > The name should include a short description problem
> > > > The description should contain:
> > > > (if bug) *Steps to reproduce* (java reproducer) or TC link to failed
> > > tests.
> > > > (if feature) *Idea and solution*
> > > > PR should start with IGNITE- (JIRA bot automatically link your PR
> > > with
> > > > JIRA Issue)
> > > > TC link should be in the* links section, *link name example: TC Run
> all |
> > > >  TC Run pds | TC {suit name}
> > > > Upsource link should be in the* links section* (optional, if some
> > > reviewer
> > > > needs),
> > > > link name example: IGNT-CR-*** or CR-***
> > > > All discussions related code should be in JIRA issue comments or
> Upsource
> > > > review
> > > > When you answer someone use @username for Upsource or ~username for
> JIRA
> > > >
> > > > Comments are welcome!
> > > >
> > >
> > --
> > --
> > Maxim Muzafarov
>
>
>
> --
> Best Regards, Vyacheslav D.
>

jira issue filling rules

2018-10-07 Thread Dmitriy Govorukhin

Hi folks,

Some time ago I noticed that many Jira issues are created and filled in
in different ways:
some people do not fill in the issue description, others add links to
comments instead of attaching them in the links section, and so on.
I want to start a discussion about Jira issue filling rules.

I suggest:

The name should include a short description of the problem.
The description should contain:
(if bug) *Steps to reproduce* (Java reproducer) or a TC link to the failed tests.
(if feature) *Idea and solution*
The PR name should start with IGNITE- (the JIRA bot automatically links your PR
with the JIRA issue).
The TC link should be in the *links section*; link name examples: TC Run all |
 TC Run pds | TC {suite name}
The Upsource link should be in the *links section* (optional, if a reviewer
needs it);
link name examples: IGNT-CR-*** or CR-***
All code-related discussions should be in the JIRA issue comments or in the
Upsource review.
When you answer someone, use @username for Upsource or ~username for JIRA.
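
The PR naming rule could be checked mechanically. The sketch below is only an illustration: the class name and the exact pattern (the prefix "IGNITE-" followed by a ticket number) are assumptions, since the message does not spell out the full format.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the exact title format is an assumption. */
class PrTitleCheck {
    // Assumed convention: "IGNITE-", then the ticket number, then a word boundary.
    private static final Pattern TICKET_PREFIX = Pattern.compile("^IGNITE-\\d+\\b");

    /** Returns true if the PR title starts with a ticket key. */
    static boolean startsWithTicket(String title) {
        return title != null && TICKET_PREFIX.matcher(title).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithTicket("IGNITE-9898 Checkpointer thread hangs"));
        System.out.println(startsWithTicket("Checkpointer thread hangs (IGNITE-9898)"));
    }
}
```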

Comments are welcome!

Re: New PMC member: Dmitriy Pavlov

2018-08-30 Thread Dmitriy Govorukhin

Dmitriy,

My congratulations!
Thank you for your work!

On Thu, Aug 30, 2018 at 1:58 PM Maxim Muzafarov  wrote:

> Dmitry,
>
> My congratulations!
> Thank you for guiding and helping the community members and me, in
> particular,
> to follow the Apache Way principles and also for your contribution.
>
> On Wed, 29 Aug 2018 at 22:31 Denis Magda  wrote:
>
> > The Project Management Committee (PMC) for Apache Ignite
> > has invited Dmitriy Pavlov to become a PMC member and we are pleased
> > to announce that he has accepted.
> >
> > Being a PMC member enables assistance with the management
> > and to guide the direction of the project.
> >
> > Congratulations Dmitriy! Keep contributing to Ignite success the way you
> do
> > ;)
> >
> > Denis
> >
> --
> --
> Maxim Muzafarov
>

[jira] [Created] (IGNITE-9426) IgniteAtomicSequence benchmarks

2018-08-29 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9426:
--

 Summary: IgniteAtomicSequence benchmarks
 Key: IGNITE-9426
 URL: https://issues.apache.org/jira/browse/IGNITE-9426
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


Need to create JMH and Yardstick benchmarks for the atomic sequence in order to 
be able to measure future performance improvements




Re: StandaloneWalRecordsIterator: support iteration from custom pointer

2018-08-16 Thread Dmitriy Govorukhin

Ivan,

I implemented this issue, please review my changes.
https://reviews.ignite.apache.org/ignite/review/IGNT-CR-729

On Thu, Aug 16, 2018 at 3:09 PM Ivan Rakov  wrote:

> Thanks for your comments!
> I've created a ticket: https://issues.apache.org/jira/browse/IGNITE-9294
>
> Best Regards,
> Ivan Rakov
>
> On 15.08.2018 21:31, Dmitriy Setrakyan wrote:
> > Agree, this should be a great performance boost.
> >
> > On Wed, Aug 15, 2018 at 10:17 AM, Dmitriy Pavlov 
> > wrote:
> >
> >> Hi Ivan,
> >>
> >> I agree that providing WAL pointer is the better option. Initially,
> >> Standalone WAL iterator was developed for debugging utility, so a set of
> >> files was perfectly OK.
> >>
> >> Sincerely,
> >> Dmitriy Palov
> >>
> >> Wed, Aug 15, 2018 at 20:06, Ivan Rakov :
> >>
> >>> Igniters,
> >>>
> >>> Right now we are developing WAL shipping process for our internal
> >>> purposes and we use StandaloneWalRecordsIterator to read WAL files from
> >>> custom destination. We have bumped into a problem - iterator can be
> >>> constructed from set of files and dirs, but there's no option to pass
> >>> WAL pointer to the iterator factory class to start iteration with. It
> >>> can be worked around (by filtering all records prior to needed
> pointer),
> >>> but I think it would be handy to add such option to
> >>> IgniteWalIteratorFactory API.
> >>>
> >>> What do you think?
> >>>
> >>> --
> >>> Best Regards,
> >>> Ivan Rakov
> >>>
> >>>
>
>

[jira] [Created] (IGNITE-9260) StandaloneWalRecordsIterator broken on WalSegmentTailReachedException not in work dir

2018-08-13 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9260:
--

 Summary: StandaloneWalRecordsIterator broken on 
WalSegmentTailReachedException not in work dir
 Key: IGNITE-9260
 URL: https://issues.apache.org/jira/browse/IGNITE-9260
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-9050 was implemented, StandaloneWalRecordsIterator became broken, 
because in standalone mode the iteration can legitimately stop at any moment once 
the last available segment has been fully read. The validation implemented in 
IGNITE-9050 is therefore not applicable to standalone mode. We need to 
change the behavior and validate that the iteration stops in the last available 
WAL segment.
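
The proposed check can be sketched as follows. This is a minimal illustration, not the actual fix: the class, the method, and the exception type are hypothetical, and segments are represented only by their indexes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of "stop is valid only in the last available segment". */
class WalTailCheck {
    /** Hypothetical exception type; the ticket does not name the one to throw. */
    static class WalCorruptedException extends RuntimeException {
        WalCorruptedException(String msg) { super(msg); }
    }

    /**
     * @param availableSegments Sorted indexes of the segments given to the iterator.
     * @param stoppedAtSegment Index of the segment in which the tail was reached.
     */
    static void validateStop(List<Long> availableSegments, long stoppedAtSegment) {
        if (availableSegments.isEmpty())
            throw new IllegalArgumentException("No segments to iterate");

        long last = availableSegments.get(availableSegments.size() - 1);

        // Reaching the tail earlier means later segments were never read.
        if (stoppedAtSegment != last)
            throw new WalCorruptedException("Tail reached in segment " + stoppedAtSegment
                + ", but segment " + last + " is still available");
    }

    public static void main(String[] args) {
        List<Long> segments = Arrays.asList(10L, 11L, 12L);

        validateStop(segments, 12L); // Valid: stopped in the last segment.

        try {
            validateStop(segments, 11L);
        }
        catch (WalCorruptedException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```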




[jira] [Created] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9244:
--

 Summary: Partition eviction may use all threads in sys pool, it 
leads to hangs send a message via sys pool 
 Key: IGNITE-9244
 URL: https://issues.apache.org/jira/browse/IGNITE-9244
 Project: Ignite
  Issue Type: Bug
 Environment: In the current implementation, GridDhtPartitionsEvictor 
evicts partitions one by one.
A GridDhtPartitionsEvictor is created for each cache group. If we try to evict 
at least as many groups as there are threads in the sys pool, the group evictors 
take all available threads in the sys pool, and sending a message via the sys pool 
hangs. As a fix, I suggest limiting concurrent execution in the sys pool or using 
another pool for this purpose.
Reporter: Dmitriy Govorukhin
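
The first option (limiting concurrent execution on a shared pool) could look roughly like the sketch below. It is an illustration under assumptions: BoundedEvictionDispatcher and its methods are hypothetical names, and a plain fixed thread pool stands in for the sys pool. Tasks above the limit wait in a queue instead of occupying pool threads.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical dispatcher that caps how many tasks run on a shared pool at once. */
class BoundedEvictionDispatcher {
    private final Executor pool;
    private final int maxConcurrent;
    private final Queue<Runnable> waiting = new ArrayDeque<>();
    private int running;

    BoundedEvictionDispatcher(Executor pool, int maxConcurrent) {
        this.pool = pool;
        this.maxConcurrent = maxConcurrent;
    }

    /** Runs the task now if a slot is free, otherwise parks it in the queue. */
    synchronized void submit(Runnable task) {
        if (running < maxConcurrent) {
            running++;
            pool.execute(wrap(task));
        }
        else
            waiting.add(task);
    }

    private Runnable wrap(Runnable task) {
        return () -> {
            try {
                task.run();
            }
            finally {
                onDone();
            }
        };
    }

    /** Hands the freed slot to the next queued task, or releases it. */
    private synchronized void onDone() {
        Runnable next = waiting.poll();

        if (next != null)
            pool.execute(wrap(next));
        else
            running--;
    }

    /** Runs {@code tasks} short tasks and returns the highest concurrency observed. */
    static int demo(int poolSize, int limit, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        BoundedEvictionDispatcher dispatcher = new BoundedEvictionDispatcher(pool, limit);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);

        for (int i = 0; i < tasks; i++) {
            dispatcher.submit(() -> {
                maxActive.accumulateAndGet(active.incrementAndGet(), Math::max);

                try {
                    Thread.sleep(10); // Stand-in for evicting one partition.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }

                active.decrementAndGet();
                done.countDown();
            });
        }

        try {
            if (!done.await(10, TimeUnit.SECONDS))
                throw new IllegalStateException("Tasks did not finish in time");
        }
        catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        finally {
            pool.shutdown();
        }

        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println("Max concurrent tasks: " + demo(4, 2, 8));
    }
}
```

With a pool of 4 threads and a limit of 2, at least two threads always stay free for other work such as message sending.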







Re: async operation is not fair async

2018-08-02 Thread Dmitriy Govorukhin

Folks,
Any comments?
I will start implementing the withFairAsync() decorator soon.

On Wed, Aug 1, 2018 at 12:22 PM Dmitriy Pavlov 
wrote:

> Igniters,
>
> I've re-read this thread in brief. As far as I know this change is not
> coming from some company, so this change will be at least good for healthy
> community building.
>
> And I didn't find any obstacles why not to implement approach with new mode
> .withFairAsync() for cases user is completely aware of consequences.
>
> It could be not public API to avoid anyone will use it. It could be
> used,e.g. in integrations and by qualified users to gain as much
> throutghput as it is possible.
>
> So I would like to be an sponsor here. If anyone or Dmitriy G. will
> contribute this change, I will merge it. I hope PMCs are agree here and
> will not veto this change.
>
> Sincerely,
> Dmitriy Pavlov
>
> Thu, May 24, 2018 at 15:13, Yakov Zhdanov :
>
> > Alexey Goncharuk, I remember we started working on async connection
> > establishment. This should fix latency issue related to network which I
> > believe gives the most contribution to overall latency. Mapping logic and
> > other stuff can be ignored as it can very rarely be an issue at least on
> > stable tolopogies. What is the status with async connections? That would
> > really be a huge improvement!
> >
> > Also please remember that uncontrolled async operations may lead to OOME,
> > therefore at some point when there are too many uncompleted async
> > operations newly invoked async operations should become synchronous, i.e.
> > we should return completed future, ignoring the fact that user expected
> us
> > to be async.
> >
> > I would like to have very strong reasons to start reapproaching this.
> >
> > --Yakov
> >
>
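
The back-pressure idea from Yakov's message (fall back to synchronous execution when too many async operations are in flight) can be sketched like this. It is an illustration only: BoundedAsync is a hypothetical class, not part of the Ignite API and not the planned withFairAsync() implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical wrapper that bounds the number of uncompleted async operations. */
class BoundedAsync {
    private final Executor executor;
    private final Semaphore permits;

    BoundedAsync(Executor executor, int maxInFlight) {
        this.executor = executor;
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Runs the operation asynchronously while permits are available. Otherwise it
     * runs in the caller thread and an already completed future is returned.
     */
    <T> CompletableFuture<T> submit(Supplier<T> op) {
        if (!permits.tryAcquire()) {
            CompletableFuture<T> done = new CompletableFuture<>();

            try {
                done.complete(op.get());
            }
            catch (Throwable t) {
                done.completeExceptionally(t);
            }

            return done;
        }

        // The permit is returned once the async operation finishes, successfully or not.
        return CompletableFuture.supplyAsync(op, executor)
            .whenComplete((res, err) -> permits.release());
    }
}
```

The caller always receives a future, but under load it is implicitly throttled because its own thread does the work.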

Re: Quick questions on B+ Trees and Partitions

2018-07-26 Thread Dmitriy Govorukhin

Hi John,

1. The B+ tree in a partition is the primary key index; we use it to
search for data in the partition.
2. I did not fully understand the question, please explain in more detail.
3. It depends on how many partitions you have for these caches. By default
it is 1024 per cache, so in the default case there will be 2048 B+ trees.
4. If I understood your question correctly, you are asking what differentiates
a B+ tree in a partition from a B+ tree in an SQL index.
There is one SQL B+ tree (or as many as the indexes you created) per cache on a
node (value field index), while there are as many partition B+ trees (primary
key index) as there are partitions.
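
The arithmetic in point 3 can be written out as a tiny sketch. The class and method names are hypothetical, and the count covers only the per-partition primary key trees across the whole cluster, not the SQL index trees.

```java
/** Hypothetical helper that reproduces the count from the answer above. */
class BPlusTreeCount {
    /** One primary key B+ tree per partition of every cache. */
    static int partitionTrees(int caches, int partitionsPerCache) {
        return caches * partitionsPerCache;
    }

    public static void main(String[] args) {
        // Person and Organization caches with the default 1024 partitions each.
        System.out.println(partitionTrees(2, 1024));
    }
}
```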

On Thu, Jul 26, 2018 at 3:07 AM John Wilson  wrote:

> Hi,
>
>
>1. B+ tree initialization, BPlusTree.initTree, seems to be called for
>every partition. Why?
>
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/persistence/tree/BPlusTree.java
>2. The documentation here,
>https://apacheignite.readme.io/docs/memory-architecture, also states
>that for each SQL index, Ignite instantiates and manages a dedicated B+
>Tree. So, is the number of B+ trees determined by partition number of #
> of
>indexes defined?
>3. Assume I have a Person cache and an Organization cache. How many B+
>trees are defined for each cache
>4. What differentiates one B+ tree from another B+ tree? Just the cache
>it represents?
>
>
> Thanks,
> John
>

Need review IGNITE-9050

2018-07-25 Thread Dmitriy Govorukhin

Igniters,

I completed work on
IGNITE-9050  (WALIterator
should throw an exception if iterator stopped in the WAL archive but not in
WAL work).

review link - CR-697

[jira] [Created] (IGNITE-9050) WALIterator should throws exception if iterator stopped in the WALArchive but not in WALWork

2018-07-23 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9050:
--

 Summary: WALIterator should throws exception if iterator stopped 
in the WALArchive but not in WALWork
 Key: IGNITE-9050
 URL: https://issues.apache.org/jira/browse/IGNITE-9050
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


The iterator stops iteration if the next WAL record pointer is not equal to the 
expected one (WalSegmentTailReachedException). If this happens during iteration over 
segments in the WAL archive, it means the WAL is corrupted, and we cannot ignore it: 
the WAL is not fully read.




[jira] [Created] (IGNITE-9049) Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough

2018-07-22 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9049:
--

 Summary: Missed SWITCH_SEGMENT_RECORD at the end of WAL file but 
space enough 
 Key: IGNITE-9049
 URL: https://issues.apache.org/jira/browse/IGNITE-9049
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


There is a situation where several threads call addRecord when the free space is 
running out (a rollOver to the next WAL segment is needed) and no thread writes 
SWITCH_SEGMENT_RECORD. As a result, the end of the file contains garbage. If 
we try to iterate over this segment, the iterator stops when it tries to read the 
next record and stumbles on the garbage at the end of the file, so the log is not 
fully read. Any operation that requires the iterator may be broken (crash 
recovery, delta rebalance, etc.).

Example:
File size 1024 bytes
Current tail position 768 (free space 256)

1. Thread-1 calls addRecord (size 128) -> tail is updated to 896.
2. Thread-2 calls addRecord (size 128) -> tail is updated to 1024 (no free space left).
Neither thread has written any data yet; each has only reserved a position for its 
write (SegmentedRingByteBuffer.offer).

3. Thread-3 calls addRecord (size 128) -> not enough space -> rollOver and CAS 
of the stop flag to TRUE.

4. Thread-1 and Thread-2 try to write their data and cannot do it.

FileWriteHandle.addRecord
{code}
if (buf == null || (stop.get() && rec.type() != SWITCH_SEGMENT_RECORD))
    return null; // Can not write to this segment, need to switch to the next one.
{code}

Thread-3 cannot write SWITCH_SEGMENT_RECORD because there is not enough space.
Thread-1 and Thread-2 cannot write their data because the stop flag is TRUE.

We have garbage from position 768 to 1024.
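
The example above can be replayed deterministically with a small model. This is a sketch under assumptions: WalSegmentRace and its methods are hypothetical and only imitate the "reserve first, write later, refuse writes after stop" behavior described in this ticket; it is not the FileWriteHandle or SegmentedRingByteBuffer code.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical single-segment model used to replay the interleaving above. */
class WalSegmentRace {
    private final int fileSize;
    private final AtomicInteger tail;
    private final AtomicBoolean stop = new AtomicBoolean();
    private final boolean[] written;

    WalSegmentRace(int fileSize, int tailPos) {
        this.fileSize = fileSize;
        this.tail = new AtomicInteger(tailPos);
        this.written = new boolean[fileSize];

        Arrays.fill(written, 0, tailPos, true); // Everything before the tail holds valid records.
    }

    /** Reserves space; returns the start position, or -1 and raises the stop flag. */
    int reserve(int size) {
        while (true) {
            int cur = tail.get();

            if (cur + size > fileSize) {
                stop.compareAndSet(false, true); // Roll over is needed.

                return -1;
            }

            if (tail.compareAndSet(cur, cur + size))
                return cur;
        }
    }

    /** Writes into a reserved range unless the segment has already been stopped. */
    boolean write(int pos, int size) {
        if (pos < 0 || stop.get())
            return false;

        Arrays.fill(written, pos, pos + size, true);

        return true;
    }

    /** Returns the first position without valid data, or -1 if the file is complete. */
    int firstGarbageByte() {
        for (int i = 0; i < fileSize; i++) {
            if (!written[i])
                return i;
        }

        return -1;
    }

    /** Replays steps 1-4: two reservations, a failed third one, then rejected writes. */
    static int replay() {
        WalSegmentRace seg = new WalSegmentRace(1024, 768);

        int p1 = seg.reserve(128); // Thread-1: tail moves to 896.
        int p2 = seg.reserve(128); // Thread-2: tail moves to 1024.
        seg.reserve(128);          // Thread-3: no space, stop flag is set.

        seg.write(p1, 128);        // Rejected: stop is already TRUE.
        seg.write(p2, 128);        // Rejected: stop is already TRUE.

        return seg.firstGarbageByte();
    }

    public static void main(String[] args) {
        System.out.println("Garbage starts at position " + replay());
    }
}
```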





[jira] [Created] (IGNITE-9047) Add idleVerify check for GridCommonAbstractTest

2018-07-21 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9047:
--

 Summary: Add idleVerify check for GridCommonAbstractTest
 Key: IGNITE-9047
 URL: https://issues.apache.org/jira/browse/IGNITE-9047
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


Since we have idleVerify (consistency check between primary and backups) it 
will be useful to add this command into test abstract class for subsequent 
verification of consistency after some test scenarios. 




[jira] [Created] (IGNITE-9042) Transaction with small timeout may lead to inconsistent partition state

2018-07-20 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-9042:
--

 Summary: Transaction with small timeout may lead to inconsistent 
partition state
 Key: IGNITE-9042
 URL: https://issues.apache.org/jira/browse/IGNITE-9042
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: Reproducer.java

A transaction with a small timeout may lead to an inconsistent partition state. 
A reproducer is attached.

The problem is in GridDhtTxPrepareFuture.sendPrepareRequests(): if the timeout is 
reached during iteration over tx.dhtMap().values(), we do not send 
GridDhtTxPrepareRequest to some backups. As a result, those backups know 
nothing about the transaction and do not participate in the commit.
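
The failure mode can be illustrated with a simplified loop. This is a sketch under assumptions: the class, the method signature, and the injected clock are hypothetical and do not reproduce GridDhtTxPrepareFuture; they only show how a timeout check inside the iteration leaves the remaining backups without a prepare request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical model of sending prepare requests with a timeout check per backup. */
class PrepareTimeoutSketch {
    /**
     * @param backups Backup nodes that should receive a prepare request.
     * @param deadline Time at which the transaction times out.
     * @param clock Time source, injected so that the example is deterministic.
     * @return Backups that actually received a (simulated) prepare request.
     */
    static List<String> sendPrepareRequests(List<String> backups, long deadline, LongSupplier clock) {
        List<String> notified = new ArrayList<>();

        for (String backup : backups) {
            // Timeout reached in the middle of the iteration: the rest is skipped.
            if (clock.getAsLong() >= deadline)
                break;

            notified.add(backup);
        }

        return notified;
    }

    public static void main(String[] args) {
        long[] now = {0};

        List<String> sent = sendPrepareRequests(
            Arrays.asList("backup-A", "backup-B", "backup-C"), 25, () -> now[0] += 10);

        System.out.println("Prepare requests sent to: " + sent);
    }
}
```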




Re: data extractor

2018-07-20 Thread Dmitriy Govorukhin

Alexey,

1. The utility will extract raw payload bytes. If you want to build binary
objects or Java class instances, you will need the binary/marshaller metadata.
If the two grids have different metadata, you should move the metadata along
with the dumped data in order to construct binary objects on the other grid.
Do you have any ideas on how we can improve this approach?

2. I do not think I understood your idea; please explain in more
detail how you want to use the utility for checkpoint statistics.

3. In the first implementation, I prefer a simple *file path* approach: you
can pass as a parameter a path to a partition file, to a cache/group
directory, or to the root directory of caches/groups.

4. I have not had time to work out how we will upload data to another grid.
Any ideas are welcome.


On Mon, Jul 2, 2018 at 5:34 PM Alexey Goncharuk 
wrote:

> Dmitriy,
>
> A few questions regarding the user cases for the utility:
> 1) Would I be able to read the extracted data from the dumped file without
> Ignite node binary/marshaller metadata? In other words, will I be able to
> move only the dumped file to another grid or will I need to move the
> metadata as well?
> 2) Are you planning to add a public API version of this utility as a part
> of Ignite? For example, if I am planning to run some statistics on a
> checkpointed data, will I be able to get some sort of an iterator to
> process this data?
> 3) How a user will choose which caches (cache groups) to process? Will the
> user need to provide a cache or cache ID (or either of them)? Will the
> utility be able to extract a single cache data from a cache group?
> 4) I think the upload part of the utility is missing some input parameters
> - for example, what cluster to connect to, what caches to upload to, etc.
>
> Sat, Jun 30, 2018 at 22:38, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com>:
>
> > Igniters,
> >
> > I am working on IGNITE-7644
> > <https://issues.apache.org/jira/browse/IGNITE-7644> (export all
> key-value
> > data from a persisted partition),
> > it will be command line tool for extracting data from Ignite partition
> > file without the need to start node.
> > The main motivation is to have a lifebuoy in case if a file has damage
> for
> > some reason.
> >
> > I suggest simple API and two commands for the first implementation:
> >
> > -c
> > --CRC [srcPath] - check CRC for all(or by type) pages in partition
> >
> > -e
> > --extract [srcPath] [outPath] - dump all survey data from partition to
> > another file with raw key/value pair format
> > (required graceful stop for a node, not necessary after --restore will be
> > implemented)
> >
> > Output file format see in attached, this format does not contain any
> index
> > inside but it is very simple and
> > flexible for future works with raw key/value data.
> >
> > Future features:
> > -u
> > --upload - reload raw key/value pairs to node
> >
> > -s
> > --status - check current node file status, need binary recovery or not
> > (node crash on the middle of a checkpoint)
> >
> > -r
> > --restore - restore binary consistency (finish checkpoint, required WAL
> > file for recovery)
> >
> > Let's start a discussion, any comments are welcome.
> >
> >
>

[jira] [Created] (IGNITE-8973) Need to support dump for idle_verify

2018-07-10 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8973:
--

 Summary: Need to support dump for idle_verify 
 Key: IGNITE-8973
 URL: https://issues.apache.org/jira/browse/IGNITE-8973
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In the current implementation, idle_verify checks consistency between primary 
and backup partitions. It would be useful to be able to dump the current state of all 
partitions to a file. Such a dump can help to investigate problems 
with partition counters or sizes, because it is a cluster-wide snapshot of partition 
hashes for a given partition state (the hash includes all keys in the partition).

idle_verify --dump - calculate partition hash and print into standard output
idle_verify --dump {path} - calculate partition hash and write output to file





[jira] [Created] (IGNITE-8929) WAL should not disable for the group if none a partition is not assigned to a local node.

2018-07-04 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8929:
--

 Summary: WAL should not disable for the group if none a partition 
is not assigned to a local node.
 Key: IGNITE-8929
 URL: https://issues.apache.org/jira/browse/IGNITE-8929
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin







Re: data extractor

2018-07-01 Thread Dmitriy Govorukhin

Nikolay,

I think we won't support extraction from an encrypted store in the first
implementation.
I guess we can support the encrypted store in the future, or do you have a reason
why we should do it first?


On Sun, Jul 1, 2018 at 11:48 AM Nikolay Izhikov  wrote:

> Hello, Dmitriy.
>
> Should we support extraction of encrypted data?
>
> There will be 2 type of keys we should load to successfully extract data:
>
> * master key: keystore + password required.
> * cache keys: masterkey + access to metastore required.
>
> TDE task is almost done, please, take a look.
>
> ticket - https://issues.apache.org/jira/browse/IGNITE-8485
> prototype - https://github.com/apache/ignite/pull/4167
> spi -
> https://github.com/apache/ignite/pull/4167/files#diff-9a792ab0e6971f202d22d530af0ac933
>
> On Sat, 30/06/2018 at 22:37 +0300, Dmitriy Govorukhin wrote:
> > Igniters,
> >
> > I am working on IGNITE-7644 (export all key-value data from a persisted
> partition),
> > it will be command line tool for extracting data from Ignite partition
> file without the need to start node.
> > The main motivation is to have a lifebuoy in case if a file has damage
> for some reason.
> >
> > I suggest simple API and two commands for the first implementation:
> >
> > -c
> > --CRC [srcPath] - check CRC for all(or by type) pages in partition
> >
> > -e
> > --extract [srcPath] [outPath] - dump all survey data from partition to
> another file with raw key/value pair format
> > (required graceful stop for a node, not necessary after --restore will
> be implemented)
> >
> > Output file format see in attached, this format does not contain any
> index inside but it is very simple and
> > flexible for future works with raw key/value data.
> >
> > Future features:
> > -u
> > --upload - reload raw key/value pairs to node
> >
> > -s
> > --status - check current node file status, need binary recovery or not
> (node crash on the middle of a checkpoint)
> >
> > -r
> > --restore - restore binary consistency (finish checkpoint, required WAL
> file for recovery)
> >
> > Let's start a discussion, any comments are welcome.
> >

data extractor

2018-06-30 Thread Dmitriy Govorukhin

Igniters,

I am working on IGNITE-7644 (export all key-value
data from a persisted partition).
It will be a command-line tool for extracting data from an Ignite partition file
without the need to start a node.
The main motivation is to have a lifebuoy in case a file is damaged for
some reason.

I suggest a simple API and two commands for the first implementation:

-c
--CRC [srcPath] - check CRC for all(or by type) pages in partition

-e
--extract [srcPath] [outPath] - dump all surviving data from the partition to
another file in raw key/value pair format
(requires a graceful node stop; this will not be necessary once --restore is
implemented)

See the attached output file format. This format does not contain any index
inside, but it is very simple and
flexible for future work with raw key/value data.

Future features:
-u
--upload - reload raw key/value pairs to node

-s
--status - check the current node file status: whether binary recovery is needed
or not (the node crashed in the middle of a checkpoint)

-r
--restore - restore binary consistency (finish the checkpoint; requires the WAL
file for recovery)

Let's start a discussion, any comments are welcome.

Re: Introduce a sample of activation policy when cluster is activated first time

2018-06-26 Thread Dmitriy Govorukhin

Vladimir,

Auto-activation on the first start?
Please share an issue link if you have one.

On Tue, Jun 26, 2018 at 11:29 AM Vladimir Ozerov 
wrote:

> Pavel,
>
> As far as I know we agreed to implement auto activation in one of the
> nearest releases. Am I missing something?
>
> Tue, Jun 26, 2018 at 0:56, Pavel Kovalenko :
>
> > Igniters,
> >
> > By the results of the recent Ignite meeting at St. Petersburg I've
> noticed
> > that some of our users getting stuck with the problem when a cluster is
> > activated the first time.
> > At the moment we have only manual options to do it (control.sh, Visor,
> > etc.), but it's not enough. Manual activation might be good when users
> have
> > a dedicated cluster in production with a stable environment.
> > But this problem becomes harder when users deploy embedded Ignite (with
> > persistence) inside other services, or frequently deploy to temporary
> stage
> > / test environment.
> > It's uncomfortable to manual invoke control.sh each time after deploy to
> > clean environment and hard to write a custom script to do it
> automatically.
> > This is the clearly usability problem.
> >
> > I think we should introduce an example of how to write such policy using
> > Ignite API, similarly as we did it with Baseline Watcher.
> >
> > I've created a ticket regarding the problem:
> > https://issues.apache.org/jira/browse/IGNITE-8844
> > I think we should provide an example of one of the simplest and most
> > useful policy - when cluster server nodes size reaches some number.
> >
> > Moreover, I think it would be nice to have some sort of automatic cluster
> > management service (external or internal) like Spark Driver or Storm
> > Nimbus which
> > will do such things without user actions.
> >
> > What do you think?
> >
>

Re: unused or only used in tests methonds

2018-06-25 Thread Dmitriy Govorukhin

Igniters,

If no one objects, I will delete these methods in the next refactoring task.

On Fri, Jun 15, 2018 at 6:38 PM Alexey Goncharuk 
wrote:

> I have no objections to remove these methods.
>
> --AG
>
> Thu, Jun 14, 2018 at 17:39, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com>:
>
> > Igniters,
> >
> > I found 2 methods,
> >
> > 1. PageStore.pageOffset(long pageId) -  it unused, seem that old
> artifact.
> > 2. IgnitePageStoreManager.pagesAllocated(int grpId) - used only in test,
> > seem that same problem can be solved with
> > IgnitePageStoreManager.pages(grpId,
> > partId).
> >
> > I suggest, remove these methods for the more clear interface.
> >
> > Any reason, why we can not remove them?
> >
>

Re: Review request for IGNITE-8740: Support reuse of already initialized Ignite in IgniteSpringBean

2018-06-23 Thread Dmitriy Govorukhin

Valentin,

It seems that these changes include classes without a license header. TC link


/data/teamcity/work/c182b70f2dfa6507/modules/spring/src/test/java/org/apache/ignite/transactions/spring/GridSpringTransactionManagerSpringBeanSelfTest.java
/data/teamcity/work/c182b70f2dfa6507/modules/spring/src/test/java/org/apache/ignite/transactions/spring/GridSpringTransactionManagerAbstractTest.java

Please add headers to these classes.

On Sat, Jun 23, 2018 at 2:33 AM Amir Akhmedov 
wrote:

> Great, thanks! As always, happy to contribute!
>
> Thanks,
> Amir
>
>
> On Fri, Jun 22, 2018 at 7:32 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Amir,
> >
> > I merged your change to master and 2.6. Thanks!
> >
> > -Val
> >
> > On Fri, Jun 22, 2018 at 4:21 PM Amir Akhmedov 
> > wrote:
> >
> >> Val,
> >> I replied to it already :)
> >>
> >> Thanks,
> >> Amir
> >>
> >>
> >> On Fri, Jun 22, 2018 at 7:20 PM Valentin Kulichenko <
> >> valentin.kuliche...@gmail.com> wrote:
> >>
> >>> Amir,
> >>>
> >>> Thanks for quick reaction. I added a follow up question in the ticket.
> >>>
> >>> -Val
> >>>
> >>> On Fri, Jun 22, 2018 at 3:48 PM Amir Akhmedov  >
> >>> wrote:
> >>>
>  Hi Val,
>  Thanks for your comments. I replied in the ticket with my vision of
> the
>  issue and how I tried to solve it. Please check it and let me know.
> 
>  Thanks,
>  Amir
> 
> 
>  On Fri, Jun 22, 2018 at 5:42 PM Valentin Kulichenko <
>  valentin.kuliche...@gmail.com> wrote:
> 
> > Hi Amir,
> >
> > I reviewed the changes and I'm not sure I understood how they fix
> the
> > issue. I left a more detailed comment in the ticket, can you please
> clarify?
> >
> > -Val
> >
> > On Fri, Jun 22, 2018 at 9:53 AM Dmitry Pavlov  >
> > wrote:
> >
> >> Hi Amir,
> >>
> >> let me say sincere thank you for continuing to contribute.
> >>
> >> Bumping up this thread.
> >>
> >> Igniters, who has an expertise here?
> >>
> >>
> >> Sun, Jun 17, 2018 at 17:59, Amir Akhmedov  >:
> >>
> >> > Hi All,
> >> > Can you please review my changes for IGNITE-8740.
> >> >
> >> > PR: https://github.com/apache/ignite/pull/4208
> >> > TC:
> >> >
> >> >
> >>
> https://ci.ignite.apache.org/viewLog.html?buildId=1397283=buildResultsDiv=IgniteTests24Java8_RunAll
> >> >
> >> > Thanks,
> >> > Amir
> >> >
> >>
> >
>

[jira] [Created] (IGNITE-8827) Disable WAL during apply updates on recovery

2018-06-19 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8827:
--

 Summary: Disable WAL during apply updates on recovery
 Key: IGNITE-8827
 URL: https://issues.apache.org/jira/browse/IGNITE-8827
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Need review IGNITE-8661

2018-06-16 Thread Dmitriy Govorukhin

Dmitriy,

I think we can try to include it in 2.6.

On Fri, Jun 15, 2018 at 7:10 PM Ivan Rakov  wrote:

> Dmitriy G.,
>
> I will take a look as well, just in case.
>
> Best Regards,
> Ivan Rakov
>
> On 15.06.2018 2:36, Dmitriy Setrakyan wrote:
> > Dmitriy, great job! Very important fix to a very bad problem in the
> product
> > that could cause data consistency issue.
> >
> > Will it be included into 2.6, or will we have to wait for 2.7?
> >
> > D.
> >
> > On Thu, Jun 14, 2018 at 6:53 AM, Dmitriy Govorukhin <
> > dmitriy.govoruk...@gmail.com> wrote:
> >
> >> Folks,
> >>
> >> I fixed a problem with an iterator. [1] The WAL iterator is not stopped if
> it
> >> cannot deserialize a record or fails to read a record with corrupted content.
> Please
> >> review [2][3].
> >>
> >> [1] IGNITE-8661 <https://issues.apache.org/jira/browse/IGNITE-8661
> >
> >> [2] PR-4155 <https://github.com/apache/ignite/pull/4155>
> >> [3] CR-637 <https://reviews.ignite.apache.org/ignite/review/IGNT-CR-637
> >
> >>
> >> dev-list discussion: WAL iterator unexpected behavior
> >>
>
>

unused or only used in tests methods

2018-06-14 Thread Dmitriy Govorukhin

Igniters,

I found 2 methods:

1. PageStore.pageOffset(long pageId) - it is unused and seems to be an old artifact.
2. IgnitePageStoreManager.pagesAllocated(int grpId) - used only in tests;
it seems the same problem can be solved with IgnitePageStoreManager.pages(grpId,
partId).

I suggest removing these methods to make the interface clearer.

Is there any reason why we cannot remove them?

Re: Atomic cache inconsistent state

2018-06-05 Thread Dmitriy Govorukhin

Denis,

It seems that you are right, it is a problem.
I guess in this case the primary node should send a CachePartialUpdateException
to the near node.

On Tue, Jun 5, 2018 at 6:13 PM, Denis Garus  wrote:

> Fix formatting
>
> Hello Igniters!
>
> I have found some confusing behavior of an atomic partitioned cache with
> `PRIMARY_SYNC` write synchronization mode.
> The node with a primary partition sends a message to remote nodes with backup
> partitions via `GridDhtAtomicAbstractUpdateFuture#sendDhtRequests`.
> If an error occurs during sending, it is in fact ignored, see
> [1]:
> ```
> try {
> 
>
> cctx.io().send(req.nodeId(), req, cctx.ioPolicy());
>
> 
> }
> catch (ClusterTopologyCheckedException ignored) {
> 
>
> registerResponse(req.nodeId());
> }
> catch (IgniteCheckedException ignored) {
> 
>
> registerResponse(req.nodeId());
> }
>
> ```
> As a result, the primary partition and the backup partitions end up with
> different values for the given key.
>
> There is the reproducer [2].
>
> Should we consider this behavior as valid?
>
> [1].
> https://github.com/dgarus/ignite/blob/d473b507f04e2ec843c1da1066d890
> 8e882396d7/modules/core/src/main/java/org/apache/ignite/
> internal/processors/cache/distributed/dht/atomic/
> GridDhtAtomicAbstractUpdateFuture.java#L473
> [2].
> https://github.com/apache/ignite/pull/4126/files#diff-
> 5e5bfb73bd917d85f56a05552b1d014aR26
>
> 2018-06-05 17:35 GMT+03:00 Denis Garus :
>
> > Hello Igniters!
> >
> >
> >
> > I have found some confusing behavior of an atomic partitioned cache with
> > `PRIMARY_SYNC` write synchronization mode.
> >
> > The node with a primary partition sends a message to remote nodes with backup
> > partitions via `GridDhtAtomicAbstractUpdateFuture#sendDhtRequests`.
> >
> > If an error occurs during sending, it is in fact ignored,
> > see [1]:
> >
> > ```
> >
> > try {
> >
> > 
> >
> >
> >
> > cctx.io().send(req.nodeId(), req, cctx.ioPolicy());
> >
> >
> >
> > 
> >
> > }
> >
> > catch (ClusterTopologyCheckedException ignored) {
> >
> > 
> >
> >
> >
> > registerResponse(req.nodeId());
> >
> > }
> >
> > catch (IgniteCheckedException ignored) {
> >
> > 
> >
> >
> >
> > registerResponse(req.nodeId());
> >
> > }
> >
> > ```
> >
> > As a result, the primary partition and the backup partitions end up with
> > different values for the given key.
> >
> >
> >
> > There is the reproducer [2].
> >
> >
> >
> > Should we consider this behavior as valid?
> >
> >
> >
> > [1]. https://github.com/dgarus/ignite/blob/
> d473b507f04e2ec843c1da1066d890
> > 8e882396d7/modules/core/src/main/java/org/apache/ignite/
> > internal/processors/cache/distributed/dht/atomic/
> > GridDhtAtomicAbstractUpdateFuture.java#L473
> >
> > [2]. https://github.com/apache/ignite/pull/4126/files#diff-
> > 5e5bfb73bd917d85f56a05552b1d014aR26
> >
>
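
The alternative raised in this thread (reporting a partial update instead of swallowing the send error) can be sketched as follows. This is only an illustration under stated assumptions: Transport and PartialUpdateException are hypothetical stand-ins for cctx.io().send(...) and CachePartialUpdateException, and the sketch does not distinguish a backup that has left the topology from other failures, which the real code may legitimately treat differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class BackupUpdateSender {
    /** Hypothetical transport; stands in for cctx.io().send(...). */
    interface Transport {
        void send(UUID nodeId, String req) throws Exception;
    }

    /** Hypothetical exception listing the backups that did not receive the update. */
    static final class PartialUpdateException extends Exception {
        final List<UUID> failedNodes;

        PartialUpdateException(List<UUID> failedNodes) {
            super("Update was not delivered to backups: " + failedNodes);
            this.failedNodes = failedNodes;
        }
    }

    /** Sends the request to every backup and reports failures instead of ignoring them. */
    static void sendToBackups(Transport io, List<UUID> backups, String req) throws PartialUpdateException {
        List<UUID> failed = new ArrayList<>();

        for (UUID nodeId : backups) {
            try {
                io.send(nodeId, req);
            }
            catch (Exception e) {
                failed.add(nodeId); // remember the failure rather than treating it as a response
            }
        }

        if (!failed.isEmpty())
            throw new PartialUpdateException(failed);
    }
}
```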

[jira] [Created] (IGNITE-8707) DataStorageMetrics.getTotalAllocatedSize metric does not account reserved partition page header.

2018-06-05 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8707:
--

 Summary: DataStorageMetrics.getTotalAllocatedSize metric does not 
account reserved partition page header.
 Key: IGNITE-8707
 URL: https://issues.apache.org/jira/browse/IGNITE-8707
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

IGNITE-8685 review request

2018-06-05 Thread Dmitriy Govorukhin

Igniters,

I prepared a rather important patch [1] related to WAL.
In the current implementation, we calculate an incorrect size for the
SEGMENT_SWITCH record, which stops the iteration at the end of a
segment, so the iterator does not advance to the next segment.

[1] https://issues.apache.org/jira/browse/IGNITE-8685

[jira] [Created] (IGNITE-8685) Incorrect size for switch segment record

2018-06-04 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8685:
--

 Summary: Incorrect size for switch segment record 
 Key: IGNITE-8685
 URL: https://issues.apache.org/jira/browse/IGNITE-8685
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


We have an invariant that the switch segment record should have a size of one byte.
However, in the current implementation, the size is calculated with the overhead for 
storing the CRC and WAL pointer.
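
A toy model of how such a size mismatch could stop iteration at the end of a segment. Everything here is an assumption made for illustration: the record layout, the type codes and the overhead value are invented and do not reflect the real WAL format or reader.

```java
/** Toy reader, not Ignite's: data record = [type][len][payload...], switch record = [type]. */
final class ToyWalReader {
    static final byte DATA = 1;             // hypothetical type code
    static final byte SWITCH_SEGMENT = 2;   // hypothetical type code
    static final int ASSUMED_OVERHEAD = 16; // assumed CRC + pointer bytes, for illustration only

    /** Counts data records across segments; buggySize adds the overhead to the switch record size. */
    static int countDataRecords(byte[][] segments, boolean buggySize) {
        int cnt = 0;

        for (byte[] seg : segments) {
            int pos = 0;

            while (pos < seg.length) {
                byte type = seg[pos];
                int size;

                if (type == SWITCH_SEGMENT)
                    size = buggySize ? 1 + ASSUMED_OVERHEAD : 1;
                else if (pos + 1 < seg.length)
                    size = 2 + seg[pos + 1];
                else
                    return cnt; // truncated header: stop iteration

                if (pos + size > seg.length)
                    return cnt; // record looks truncated: stop iteration

                if (type == SWITCH_SEGMENT)
                    break; // continue with the next segment

                cnt++;
                pos += size;
            }
        }

        return cnt;
    }
}
```

With the one-byte size the reader recognizes the switch record written at the very end of a segment and moves on; with the inflated size the same byte looks like a truncated record and iteration ends early.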



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: WAL iterator unexpected behavior

2018-06-03 Thread Dmitriy Govorukhin

Dmitriy,

Sorry, my mistake, I meant unknown exception of course.

On Sun, Jun 3, 2018 at 11:53 PM, Dmitriy Setrakyan 
wrote:

> I got a bit confused by your initial statement. So, the iterator is stopped
> in case of any exception, known or unknown. In that case, sounds good.
>
> D.
>
> On Sun, Jun 3, 2018, 12:11 Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com>
> wrote:
>
> > Dmitriy,
> >
> > The iterator will be stopped, and the method "it.next()" will throw the
> > exception.
> >
> > On Sat, Jun 2, 2018 at 4:27 PM, Dmitriy Setrakyan  >
> > wrote:
> >
> > > Dmitriy, what happens in case of unknown exceptions?
> > >
> > >
> > > On Thu, May 31, 2018 at 6:35 AM, Dmitriy Govorukhin <
> > > dmitriy.govoruk...@gmail.com> wrote:
> > >
> > > > Folks,
> > > >
> > > > I created an issue to address this problem.
> > > >
> > > > IGNITE-8661 <https://issues.apache.org/jira/browse/IGNITE-8661>
> > > > WALItreater
> > > > is not stopped if can not deserialize record
> > > >
> > > > I suggest making the following changes:
> > > > 1. We should only stop iteration on known exceptions.
> > > > 2. We also need to provide the ability to skip records by type or by some pointer
> > for
> > > > the StandaloneWalRecordsIterator.
> > > >
> > > > Comments are welcome.
> > > >
> > > > On Thu, May 31, 2018 at 12:53 AM, Dmitriy Setrakyan <
> > > dsetrak...@apache.org
> > > > >
> > > > wrote:
> > > >
> > > > > On Wed, May 30, 2018 at 8:04 AM, Dmitriy Govorukhin <
> > > > > dmitriy.govoruk...@gmail.com> wrote:
> > > > >
> > > > > > Dmitriy,
> > > > > >
> > > > > > I agree that in normal mode we should stop and report that error
> > > > > accordingly.
> > > > > > I would prefer to add the ability to skip records for offline mode.
> > > > > >
> > > > >
> > > > > Sounds good.
> > > > >
> > > >
> > >
> >
>

Re: WAL iterator unexpected behavior

2018-06-03 Thread Dmitriy Govorukhin

Dmitriy,

The iterator will be stopped, and the method "it.next()" will throw the
exception.

On Sat, Jun 2, 2018 at 4:27 PM, Dmitriy Setrakyan 
wrote:

> Dmitriy, what happens in case of unknown exceptions?
>
>
> On Thu, May 31, 2018 at 6:35 AM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Folks,
> >
> > I created an issue to address this problem.
> >
> > IGNITE-8661 <https://issues.apache.org/jira/browse/IGNITE-8661>
> > WALItreater
> > is not stopped if can not deserialize record
> >
> > I suggest making the following changes:
> > 1. We should only stop iteration on known exceptions.
> > 2. We also need to provide the ability to skip records by type or by some pointer for
> > the StandaloneWalRecordsIterator.
> >
> > Comments are welcome.
> >
> > On Thu, May 31, 2018 at 12:53 AM, Dmitriy Setrakyan <
> dsetrak...@apache.org
> > >
> > wrote:
> >
> > > On Wed, May 30, 2018 at 8:04 AM, Dmitriy Govorukhin <
> > > dmitriy.govoruk...@gmail.com> wrote:
> > >
> > > > Dmitriy,
> > > >
> > > > I agree that in normal mode we should stop and report that error
> > > accordingly.
> > > > I would prefer to add the ability to skip records for offline mode.
> > > >
> > >
> > > Sounds good.
> > >
> >
>

Re: Issues with forceServerMode on clients and unclear contract of isClient/isClientMode methods

2018-05-31 Thread Dmitriy Govorukhin

Eduard,

Sounds reasonable, agree with you.

On Thu, May 31, 2018 at 5:10 PM, Eduard Shangareev <
eduard.shangar...@gmail.com> wrote:

> Hi, guys!
>
> I have just found that we widely
> misuse the org.apache.ignite.cluster.ClusterNode#isClient method.
>
> Currently it returns how a node is connected to the cluster (as part of the ring or
> not).
>
> So, it's not the same as IgniteConfiguration#isClientMode!
>
> But in practice we treat them as the same thing. In ignite-core we have 57 usages
> of ClusterNode#isClient,
> and nowhere do we care about the connection mode.
>
> Well, there is only one case when these methods would return different
> values: when forceServerMode=true and clientMode=true.
>
> To fix this I propose the following:
> 1. Deprecate usage of forceServerMode.
> 2. Create a ticket to remove it in 3.0.
> 3. Make ClusterNode#isClient return what everyone expects.
> 4. Reconcile other isClient* methods. We should care about how a client
> is connected only in the discovery SPI. Otherwise, it should return the same
> value as IgniteConfiguration#isClientMode.
>
>
> Any objections?
>
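
The distinction described above can be written down as a tiny truth-table model. This is a sketch of the behavior as stated in the thread, not Ignite code: the class and method names are hypothetical, and the forceServerMode semantics are assumed to be "a client-mode node joins the ring as a regular discovery node".

```java
/** Hypothetical model of the two "client" notions discussed in the thread. */
final class ClientFlags {
    private ClientFlags() {
    }

    /** Models IgniteConfiguration#isClientMode: the node is a cache client. */
    static boolean cacheClient(boolean clientMode) {
        return clientMode;
    }

    /** Models ClusterNode#isClient as described above: the node is connected outside the ring. */
    static boolean discoveryClient(boolean clientMode, boolean forceServerMode) {
        return clientMode && !forceServerMode;
    }
}
```

Under this model the two answers differ for exactly one of the four flag combinations (clientMode=true together with forceServerMode=true), which matches the single case named in the message.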

Re: WAL iterator unexpected behavior

2018-05-31 Thread Dmitriy Govorukhin

Folks,

I created an issue to address this problem.

IGNITE-8661 <https://issues.apache.org/jira/browse/IGNITE-8661> WALItreater
is not stopped if can not deserialize record

I suggest making the following changes:
1. We should only stop iteration on known exceptions.
2. We also need to provide the ability to skip records by type or by some pointer for
the StandaloneWalRecordsIterator.

Comments are welcome.

On Thu, May 31, 2018 at 12:53 AM, Dmitriy Setrakyan 
wrote:

> On Wed, May 30, 2018 at 8:04 AM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Dmitriy,
> >
> > I agree that in normal mode we should stop and report that error
> accordingly.
> > I would prefer to add the ability to skip records for offline mode.
> >
>
> Sounds good.
>
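
A sketch of the "stop quietly on known conditions, surface everything else" behavior discussed in this thread. It is a generic Java iterator over a hypothetical Source interface, not the Ignite WAL iterator; treating EOFException as the only "known" condition is an assumption made for the example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

final class StopOnKnownErrorIterator<T> implements Iterator<T> {
    /** Hypothetical record source; returns null at the normal end of the log. */
    interface Source<T> {
        T read() throws IOException;
    }

    private final Source<T> src;
    private T next;
    private boolean done;

    StopOnKnownErrorIterator(Source<T> src) {
        this.src = src;
    }

    @Override public boolean hasNext() {
        if (next == null && !done)
            advance();

        return next != null;
    }

    @Override public T next() {
        if (!hasNext())
            throw new NoSuchElementException();

        T res = next;
        next = null;
        return res;
    }

    private void advance() {
        try {
            next = src.read();

            if (next == null)
                done = true;
        }
        catch (EOFException e) {
            done = true; // known condition: end of the log, finish quietly
        }
        catch (IOException e) {
            done = true; // the iterator is stopped...
            throw new UncheckedIOException(e); // ...and the unknown error reaches the caller
        }
    }
}
```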

[jira] [Created] (IGNITE-8661) WALItreater is not stopped if can not deserialize record

2018-05-31 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8661:
--

 Summary: WALItreater is not stopped if can not deserialize record 
 Key: IGNITE-8661
 URL: https://issues.apache.org/jira/browse/IGNITE-8661
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: WAL iterator unexpected behavior

2018-05-30 Thread Dmitriy Govorukhin

Dmitriy,

I agree that in normal mode we should stop and report that error accordingly.
I would prefer to add the ability to skip records for offline mode.

On Wed, May 30, 2018 at 5:24 PM, Dmitriy Setrakyan 
wrote:

> Dmitriy,
>
> I think the behavior for offline and online iterator is fundamentally
> different. I do not think it is OK to skip records during normal operation.
> In my view, we should report an error and stop. However, when offline, it
> is OK to report an error and continue in my view.
>
> D.
>
> On Wed, May 30, 2018 at 5:39 AM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Yakov,
> >
> > This problem is not limited to offline mode; it applies to all types of
> > iterators.
> > In general, the iterator does not know which class will be needed for
> > deserialization.
> >
> > I guess we can extend the WAL iterator factory and provide a method for
> > creating an iterator with a filter,
> > which will accept a predicate for skipping uninteresting records. It will
> > fix 1 problem.
> >
> >
> > On Wed, May 30, 2018 at 3:18 PM, Yakov Zhdanov 
> > wrote:
> >
> > > This is for offline WAL analysis. So skipping a record with a proper
> message
> > is
> > > also a solution. If possible, the iterator should output a suggestion
> > on
> > > what is missing in the classpath. An option to suppress warnings should also
> > > be present.
> > >
> > > Makes sense?
> > >
> > > And final question - did we look at similar utilities from other
> vendors?
> > >
> > > --Yakov
> > >
> >
>

Re: WAL iterator unexpected behavior

2018-05-30 Thread Dmitriy Govorukhin

Yakov,

This problem is not limited to offline mode; it applies to all types of
iterators.
In general, the iterator does not know which class will be needed for
deserialization.

I guess we can extend the WAL iterator factory and provide a method for
creating an iterator with a filter,
which will accept a predicate for skipping uninteresting records. It will
fix 1 problem.

On Wed, May 30, 2018 at 3:18 PM, Yakov Zhdanov  wrote:

> This is for offline WAL analysis. So skipping a record with a proper message is
> also a solution. If possible, the iterator should output a suggestion on
> what is missing in the classpath. An option to suppress warnings should also
> be present.
>
> Makes sense?
>
> And final question - did we look at similar utilities from other vendors?
>
> --Yakov
>
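
The factory-method-with-predicate idea from this message could look roughly like the sketch below. It is a generic filtering wrapper with a hypothetical name, not the WAL iterator factory API. Note one simplification: it filters already materialized elements, whereas a real WAL reader would need to test the record type before deserializing the record body.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

final class FilteredIterators {
    private FilteredIterators() {
    }

    /** Wraps an iterator and returns only the elements accepted by the predicate. */
    static <T> Iterator<T> filtered(Iterator<T> src, Predicate<? super T> keep) {
        return new Iterator<T>() {
            private T next;
            private boolean ready;

            @Override public boolean hasNext() {
                // Skip rejected elements until an accepted one is found or the source ends.
                while (!ready && src.hasNext()) {
                    T candidate = src.next();

                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }

                return ready;
            }

            @Override public T next() {
                if (!hasNext())
                    throw new NoSuchElementException();

                T res = next;
                next = null;
                ready = false;
                return res;
            }
        };
    }
}
```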

[jira] [Created] (IGNITE-8607) [.NET] Support metrics changes in DataStorageMetricsMXBean

2018-05-24 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8607:
--

 Summary: [.NET] Support metrics changes in DataStorageMetricsMXBean
 Key: IGNITE-8607
 URL: https://issues.apache.org/jira/browse/IGNITE-8607
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Baseline topology documentation clarified: usage scenarios and definition

2018-05-24 Thread Dmitriy Govorukhin

+1 for a webinar, I would like to see it.

On Wed, May 23, 2018 at 5:34 PM, Dmitry Pavlov 
wrote:

> Hi Ivan,
>
> Would you like to give a presentation based on these slides for Igniters?
>
> We can set up a webinar using gotomeeting if you like this idea.
>
> Sincerely,
> Dmitriy Pavlov
>
> Wed, May 23, 2018 at 0:19, Ivan Rakov :
>
> > I used to present some slides introducing baseline topology concept.
> > Please feel free to use or change any pictures.
> >
> > https://docs.google.com/presentation/d/1DVLwlaR0zPIuL5YMkX8MyGLutZw-
> tvK9InDTCvubAEQ/edit?usp=sharing
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 18.05.2018 15:20, Dmitriy Setrakyan wrote:
> > > Great docs! Any chance we could add some pictures to illustrate the
> > concept
> > > better.
> > >
> > > On Fri, May 18, 2018 at 2:15 AM, Denis Magda 
> wrote:
> > >
> > >> Igniters,
> > >>
> > >> With the help of Stanislav Lukyanov and Ivan Rakov, we could make our
> > >> baseline topology documentation much better and more vivid. Check out the
> new
> > >> sections, which do a better job of explaining the topology, cover common
> > usage
> > >> scenarios and explain how to trigger rebalancing programmatically
> > if
> > >> the baseline topology's node count is not expected to be
> > recovered
> > >> soon:
> > >>
> > >> - https://apacheignite.readme.io/v2.4/docs/cluster-
> > >> activation#section-baseline-topology
> > >> <
> > https://apacheignite.readme.io/v2.4/docs/cluster-
> activation#section-baseline-topology
> > >
> > >> - https://apacheignite.readme.io/v2.4/docs/cluster-
> > >> activation#section-usage-scenarios
> > >> <
> > https://apacheignite.readme.io/v2.4/docs/cluster-
> activation#section-usage-scenarios
> > >
> > >> - https://apacheignite.readme.io/v2.4/docs/cluster-
> > >> activation#section-triggering-rebalancing-programmatically
> > >> <
> > https://apacheignite.readme.io/v2.4/docs/cluster-
> activation#section-triggering-rebalancing-programmatically
> > >
> > >>
> > >> --
> > >> Denis
> > >>
> >
> >
>

IGNITE-8583 - review needed

2018-05-24 Thread Dmitriy Govorukhin

Igniters,

I fixed DataStorageMetricsMXBean.getOffHeapSize, please review my changes.

https://issues.apache.org/jira/browse/IGNITE-8583

Re: async operation is not fair async

2018-05-24 Thread Dmitriy Govorukhin

So, in the current implementation, how can I perform a truly async operation
in one thread? Is there any workaround?
What can I do if I have an event loop thread model?

On Wed, May 16, 2018 at 12:14 PM, Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

> Dmitriy,
>
> I will add technical details to the ticket, however, looks like there is
> still no consensus on how this change should be presented to a user. It
> would be ok if we changed this behavior in Ignite 3.0, but for one of the
> next point releases we have to agree how this should be enabled/disabled
> (or whether we should delay this change to 3.0 at all).
>
> 2018-05-15 22:13 GMT+03:00 Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com>
> :
>
> >  Alexey,
> >
> > Any updates?
> >
> > On Mon, May 14, 2018 at 6:19 PM, Dmitriy Govorukhin <
> > dmitriy.govoruk...@gmail.com> wrote:
> >
> > > Alexey,
> > >
> > > Could you please add a more detailed description to this task? [1]
> > > Perhaps the basic steps for implementation.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-8475
> > >
> > > On Mon, May 14, 2018 at 4:58 PM, Alexey Goncharuk <
> > > alexey.goncha...@gmail.com> wrote:
> > >
> > >> Another +1 for the true asynchronous approach. I remember a while ago
> > one
> > >> of the Ignite users raised a similar question regarding the *async
> > method
> > >> being blocked on establishing a TCP connection.
> > >>
> > >> As far as deadlocks go, I have a counter-example. Currently, we check
> > the
> > >> thread-local chain only for a single cache, so if I run the following
> > >> code:
> > >> cache1.getAsync(k1);
> > >> cache2.getAsync(k2);
> > >> then the deadlock is still possible, and I did not see a single user
> > >> complaining about unexpected deadlocks. Rather than implementing this
> > >> cross-cache chain (which would probably add another overhead), I would
> > >> make
> > >> it consistent and allow operations to be run in parallel.
> > >>
> > >> There are many use-cases when having true async operations
> dramatically
> > >> improve performance. Consider, for example, a streaming example when
> > keys
> > >> are being pushed by a client to a cluster. Currently, to run effective
> > >> processing, the user will have to use a data streamer with custom keys
> > >> receiver which may be a huge usability downside. Async operations can
> > >> utilize the cluster resources very efficiently.
> > >>
> > >> Finally, if we want to be on the safe side, we can keep the operation
> > >> chain
> > >> inside a transaction. I see absolutely no point in maintaining this
> > chain
> > >> outside of transactions.
> > >>
> > >> --AG
> > >>
> > >> 2018-05-14 16:01 GMT+03:00 Dmitriy Govorukhin <
> > >> dmitriy.govoruk...@gmail.com>
> > >> :
> > >>
> > >> > Andrey,
> > >> >
> > >> > Do you prefer to change the behavior at runtime?
> > >> > I guess it would be better to have different methods for getting a cache
> > instance
> > >> > with fair and unfair sync.
> > >> >
> > >> > On Mon, May 14, 2018 at 3:39 PM, Andrey Gura <ag...@apache.org>
> > wrote:
> > >> >
> > >> > > +1 for fair async operations.
> > >> > >
> > >> > > But I don't like the idea of using a withFairSync() method. We added
> xxxAsync()
> > >> > > methods recently and withAsync() is deprecated.
> > >> > >
> > >> > > I think we should just make the methods truly async in nature and
> provide
> > >> > > the ability to switch to the old behaviour using a flag or property.
> > >> > >
> > >> > > On Fri, May 11, 2018 at 11:00 PM, Dmitriy Setrakyan
> > >> > > <dsetrak...@apache.org> wrote:
> > >> > > > Vladimir,
> > >> > > >
> > >> > > > In general I agree, but I do get greatly *close-minded* (pun
> > >> intended)
> > >> > > > whenever users' code that worked for the past several years all
> > of a
> > >> > > sudden
> > >> > > > gets deadlocked after an upgrade. Making this feature optional
> is
> > >> even
> > >>

[jira] [Created] (IGNITE-8583) DataStorageMetricsMXBean.getOffHeapSize include checkpoint buffer size, this is not clear

2018-05-23 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8583:
--

 Summary: DataStorageMetricsMXBean.getOffHeapSize include 
checkpoint buffer size, this is not clear
 Key: IGNITE-8583
 URL: https://issues.apache.org/jira/browse/IGNITE-8583
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: work with files and directories

2018-05-16 Thread Dmitriy Govorukhin

One more thing: if we implement this component and aggregate
information about all files, we can
create a failure handler which provides detailed information about the node
persistence structure. It will be helpful for debugging node crashes.

On Wed, May 16, 2018 at 12:46 PM, Dmitry Pavlov <dpavlov@gmail.com>
wrote:

> Andrey Gura expressed in words what I also thought.
>
> I agree if we extract common code to some kind of file utils.
>
> In the same time creating common file operation framework is not possible
> because of different nature of WAL, Page Store and other components using
> files.
>
> Wed, May 16, 2018 at 11:27, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com
> >:
>
> > Andrey,
> >
> > My point is, it would be very cool to have a component that would
> know
> > about the persistence folder and files,
> > and we could move all generic code for working with files to this
> component.
> >
> > On Wed, May 16, 2018 at 12:59 AM, Pavel Kovalenko <jokse...@gmail.com>
> > wrote:
> >
> > > Andrey,
> > >
> > > I think the main idea is not about how to unify writing content to
> > > a file. It's about creating persistence files and folders. For other
> > > persistence components (WAL, Checkpoint, etc.) it should look like a
> > > FileFactory object with methods like "public File
> > > getOrCreatePartitionFile(...), public void commitFile(..), etc". So,
> > all
> > > logic about how files should look will be moved to a centralized point
> and
> > it
> > > will be easier to understand our persistence structure.
> > >
> > > 2018-05-16 0:42 GMT+03:00 Andrey Gura <ag...@apache.org>:
> > >
> > > > Hi,
> > > >
> > > > I understand your idea, but it just increases the dependency of different
> > > > components on a single one, which is in general bad practice.
> > > >
> > > > We have different components where each one can use a different
> approach
> > > for
> > > > file management. For example, page store and WAL have different file
> IO
> > > > implementations for performance reasons and we have to provide
> > > different
> > > > mechanics for working with files.
> > > >
> > > > Of course we can refactor the mentioned components in a more structured
> > manner
> > > > but we should not strongly link them to one implementation.
> > > >
> > > > Tue, May 15, 2018, 20:10 Dmitry Pavlov <dpavlov@gmail.com>:
> > > >
> > > > > Hi Maxim,
> > > > >
> > > > > FileWriteAheadLogManager & FsyncModeFileWriteAheadLogManager
> were
> > > > > intentionally copy-pasted in the hope that we will soon delete FsyncManager.
> > > > >
> > > > > But it still shows that this tool works well. Probably we could
> > integrate
> > > > > this tool into our processes.
> > > > >
> > > > > Sincerely,
> > > > > Dmitriy Pavlov
> > > > >
> > > > >
> > > > > Tue, May 15, 2018 at 20:06, Maxim Muzafarov <maxmu...@gmail.com>:
> > > > >
> > > > > > +1 also for something like "resource manager".
> > > > > >
> > > > > > Recently, I discovered the sonarcloud.io tool for code
> > > analysis.
> > > > > It's
> > > > > > free for open source projects and I've made an initial run for the Ignite project
> > run
> > > > > [1].
> > > > > >
> > > > > > I've prepared the analysis for myself and found a lot of duplicated
> > code
> > > > > > blocks [2]. Of course it's not an ideal tool but it gave us a
> direction
> > > of
> > > > > > thought. E.g. these classes [3]:
> > > > > > 2) FsyncModeFileWriteAheadLogManager.java
> > > > > >
> > > > > >
> > > > > > [1] https://sonarcloud.io/dashboard?id=org.apache.
> > > > ignite%3Aapache-ignite
> > > > > > [2]
> > > > > >
> > > > > >
> > > > > https://sonarcloud.io/component_measures?id=org.
> > > > apache.ignite%3Aapache-ignite=duplicated_blocks
> > > > > > [3]
> > > > > >
> > > > > >
> > > > > https://sonarcloud.io/component_measures?id=org.
> >

Re: Avoid JIRA comments deletion

2018-05-16 Thread Dmitriy Govorukhin

+1 Dmitriy,

I do not see any reason for deleting comments.
Maybe only the edit operation should be allowed.

On Wed, May 16, 2018 at 12:32 PM, Igor Sapego  wrote:

> I totally agree. There is no sense in most cases in deletion
> of commentaries.
>
> There is even less sense, when you can look into ticket
> history and see all the removed comments anyway.
>
> Best Regards,
> Igor
>
> On Wed, May 16, 2018 at 12:27 PM, Dmitry Pavlov 
> wrote:
>
> > Hi Igniters,
> >
> > What do you think about the deletion of comments in JIRA tickets? I constantly
> > see that participants add comments, and then delete them.
> >
> > IMO, we have partially lost the history of problem research, and we can
> > lose interesting ideas about the problem causes, or already tested
> > hypotheses.
> >
> > I understand that the commentary may be outdated, but it can be
> underlined
> > by the second comment, which speaks about it explicitly.
> >
> > Please share your vision.
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
>

Re: work with files and directories

2018-05-16 Thread Dmitriy Govorukhin

Andrey,

My point is, it would be very cool to have a component that would know
about the persistence folder and files,
and we could move all generic code for working with files to this component.

On Wed, May 16, 2018 at 12:59 AM, Pavel Kovalenko <jokse...@gmail.com>
wrote:

> Andrey,
>
> I think the main idea is not about how to unify writing content to
> a file. It's about creating persistence files and folders. For other
> persistence components (WAL, Checkpoint, etc.) it should look like a
> FileFactory object with methods like "public File
> getOrCreatePartitionFile(...), public void commitFile(..), etc". So, all
> logic about how files should look will be moved to a centralized point and it
> will be easier to understand our persistence structure.
>
> 2018-05-16 0:42 GMT+03:00 Andrey Gura <ag...@apache.org>:
>
> > Hi,
> >
> > I understand your idea, but it just increases the dependency of different
> > components on a single one, which is in general bad practice.
> >
> > We have different components where each one can use a different approach
> for
> > file management. For example, page store and WAL have different file IO
> > implementations for performance reasons and we have to provide
> different
> > mechanics for working with files.
> >
> > Of course we can refactor the mentioned components in a more structured manner
> > but we should not strongly link them to one implementation.
> >
> > Tue, May 15, 2018, 20:10 Dmitry Pavlov <dpavlov@gmail.com>:
> >
> > > Hi Maxim,
> > >
> > > FileWriteAheadLogManager & FsyncModeFileWriteAheadLogManager were
> > > intentionally copy-pasted in the hope that we will soon delete FsyncManager.
> > >
> > > But it still shows that this tool works well. Probably we could integrate
> > > this tool into our processes.
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > >
> > > Tue, May 15, 2018 at 20:06, Maxim Muzafarov <maxmu...@gmail.com>:
> > >
> > > > +1 also for something like "resource manager".
> > > >
> > > > Recently, I discovered the sonarcloud.io tool for code
> analysis.
> > > It's
> > > > free for open source projects and I've made an initial run for the Ignite project
> > > [1].
> > > >
> > > > I've prepared the analysis for myself and found a lot of duplicated code
> > > > blocks [2]. Of course it's not an ideal tool but it gave us a direction
> of
> > > > thought. E.g. these classes [3]:
> > > > 1) FileWriteAheadLogManager.java
> > > > 2) FsyncModeFileWriteAheadLogManager.java
> > > >
> > > >
> > > > [1] https://sonarcloud.io/dashboard?id=org.apache.ignite%3Aapache-ignite
> > > > [2]
> > > > https://sonarcloud.io/component_measures?id=org.apache.ignite%3Aapache-ignite=duplicated_blocks
> > > > [3]
> > > > https://sonarcloud.io/component_measures?id=org.apache.ignite%3Aapache-ignite=duplicated_blocks&selected=org.apache.ignite%3Aignite-core%3Asrc%2Fmain%2Fjava%2Forg%2Fapache%2Fignite%2Finternal%2Fprocessors%2Fcache%2Fpersistence%2Fwal%2FFsyncModeFileWriteAheadLogManager.java
> > > >
> > > > Tue, May 15, 2018 at 19:26, Pavel Kovalenko <jokse...@gmail.com>:
> > > >
> > > > > +1 to this approach,
> > > > >
> > > > > It can also be very helpful in failover scenarios when something goes
> > > > > wrong with the disk. In this case we reduce the number of points of
> > > > > failure.
> > > > >
> > > > > 2018-05-15 18:37 GMT+03:00 Dmitriy Govorukhin <
> > > > > dmitriy.govoruk...@gmail.com>
> > > > > :
> > > > >
> > > > > > Hi Ignite,
> > > > > >
> > > > > > Do we have a general approach to working with files and
> > > > > > directories?
> > > > > > I see a lot of duplicated logic for writing through a .tmp file.
> > > > > >
> > > > > > For example,
> > > > > >
> > > > > > GridCacheDatabaseSharedManager.writeCheckpointEntry();
> > > > > > GridCacheDatabaseSharedManager.nodeStart();
> > > > > > FileWriteAheadLogManager.FileArchiver.archiveSegment();
> > > > > >
> > > > > > All of these methods implement the same logic: write to a tmp file
> > > > > > and rename it to the normal name.
> > > > > >
> > > > > > I think it would be better to stop writing duplicated code and start
> > > > > > consolidating it in one place.
> > > > > >
> > > > > > Also, I think the current approach to creating files is not quite
> > > > > > right. Each internal component creates and deletes files on its own,
> > > > > > and nobody knows which files are located where.
> > > > > >
> > > > > > I suggest refactoring the code and creating something (maybe a new
> > > > > > manager) that knows about all files inside the node. All internal
> > > > > > components must create files only through it.
> > > > > >
> > > > > > This would make it easier to write tests for persistence and reduce
> > > > > > duplicated code for working with files.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: supporting different configuration format json,yaml...

2018-05-16 Thread Dmitriy Govorukhin

> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello!
> > > > > > > > >
> > > > > > > > > Maybe we should take .Net configuration as a standard,
> extend
> > > it
> > > > to
> > > > > > > JSON
> > > > > > > > > and YAML?
> > > > > > > > >
> > > > > > > > > 
> > > > > > > > > https://apacheignite-net.readme.io/docs/configuration
> > > > > > > > >
> > > > > > > > > It should be fairly robust, and there's much less
> > boilerplate.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Ilya Kasnacheev
> > > > > > > > >
> > > > > > > > > 2018-05-15 16:09 GMT+03:00 Pavel Kovalenko <
> > jokse...@gmail.com
> > > >:
> > > > > > > > >
> > > > > > > > > > +1 to Dmitriy G. proposal.
> > > > > > > > > >
> > > > > > > > > > Since we're moving Ignite towards outside of Java world,
> we
> > > > > should
> > > > > > > > > > definitely care about config usability for users who are
> > not
> > > > > > familiar
> > > > > > > > > with
> > > > > > > > > > Java/Spring.
> > > > > > > > > > If we take a look at any of our XML-configs, we can see a
> > lot
> > > > of
> > > > > > > > > > boilerplate like "", ""
> -
> > > > terms
> > > > > > > which
> > > > > > > > > say
> > > > > > > > > > nothing to users outside of Java world.
> > > > > > > > > > When I see such configs my eyes are filled with bloody
> > tears.
> > > > > > > > > >
> > > > > > > > > > I think we should really consider YAML as our additional
> > > > approach
> > > > > > to
> > > > > > > > > > configure Ignite with full replacement instead of XML in
> > > > future.
> > > > > > > > > > Comparing to XML, YAML is significantly more
> human-readable
> > > and
> > > > > > > > > lightweight
> > > > > > > > > > format and has stable Java library to parse and translate
> > > > config
> > > > > > > files
> > > > > > > > to
> > > > > > > > > > Java objects without extra-magic.
> > > > > > > > > >
> > > > > > > > > > We can find a lot of famous projects which are using
> YAML:
> > > > Apache
> > > > > > > > Flink,
> > > > > > > > > > Apache Storm/Heron and one of the our main rivals -
> Apache
> > > > > > Cassandra.
> > > > > > > > > >
> > > > > > > > > > Some of the projects use simple = config
> > form
> > > > > > (Kafka,
> > > > > > > > > > Spark), some of the projects use their own YAML-like
> format
> > > > > > > (Aerospike,
> > > > > > > > > > Tarantool), but it's really difficult to find such
> project
> > > > which
> > > > > > has
> > > > > > > so
> > > > > > > > > > heavy config as us (maybe VoltDB).
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2018-05-15 14:02 GMT+03:00 Andrey Gura <ag...@apache.org
> >:
> > > > > > > > > >
> > > > > > > > > > > Actually sometimes users ask about JSON configuration
> > (

Re: async operation is not fair async

2018-05-15 Thread Dmitriy Govorukhin

 Alexey,

Any updates?

On Mon, May 14, 2018 at 6:19 PM, Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com> wrote:

> Alexey,
>
> Could you please add more description information for this task? [1]
> Perhaps, base steps for implementation.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8475
>
> On Mon, May 14, 2018 at 4:58 PM, Alexey Goncharuk <
> alexey.goncha...@gmail.com> wrote:
>
>> Another +1 for the true asynchronous approach. I remember a while ago one
>> of the Ignite users raised a similar question regarding the *async method
>> being blocked on establishing a TCP connection.
>>
>> As far as deadlocks go, I have a counter-example. Currently, we check the
>> thread-local chain only for a single cache, so if I run the following
>> code:
>> cache1.getAsync(k1);
>> cache2.getAsync(k2);
>> then the deadlock is still possible, and I did not see a single user
>> complaining about unexpected deadlocks. Rather than implementing this
>> cross-cache chain (which would probably add another overhead), I would
>> make
>> it consistent and allow operations to be run in parallel.
>>
>> There are many use-cases when having true async operations dramatically
>> improve performance. Consider, for example, a streaming example when keys
>> are being pushed by a client to a cluster. Currently, to run effective
>> processing, the user will have to use a data streamer with custom keys
>> receiver which may be a huge usability downside. Async operations can
>> utilize the cluster resources very efficiently.
>>
>> Finally, if we want to be on the safe side, we can keep the operation
>> chain
>> inside a transaction. I see absolutely no point in maintaining this chain
>> outside of transactions.
>>
>> --AG
>>
>> 2018-05-14 16:01 GMT+03:00 Dmitriy Govorukhin <
>> dmitriy.govoruk...@gmail.com>
>> :
>>
>> > Andrey,
>> >
>> > Do you prefer change behavior at runtime?
>> > I guess will be better have different methods for getting cache instance
>> > with fair and not fair sync.
>> >
>> > On Mon, May 14, 2018 at 3:39 PM, Andrey Gura <ag...@apache.org> wrote:
>> >
>> > > +1 for fair async operations.
>> > >
>> > > But I don't like idea use withFairSync() method. We added xxxAsync()
>> > > methods recently and withAsync() is deprecated.
>> > >
>> > > I think we should just make methods are async in nature and provide
>> > > ability of switching to the old behaviour using flag or property.
>> > >
>> > > On Fri, May 11, 2018 at 11:00 PM, Dmitriy Setrakyan
>> > > <dsetrak...@apache.org> wrote:
>> > > > Vladimir,
>> > > >
>> > > > In general I agree, but I do get greatly *close-minded* (pun
>> intended)
>> > > > whenever users' code that worked for the past several years all of a
>> > > sudden
>> > > > gets deadlocked after an upgrade. Making this feature optional is
>> even
>> > > > worse and more confusing. In this case the best action is no action
>> at
>> > > all.
>> > > >
>> > > > BTW, would be interesting to find out how Oracle async driver
>> behaves
>> > in
>> > > > this case.
>> > > >
>> > > > D.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Fri, May 11, 2018 at 8:29 PM, Vladimir Ozerov <
>> voze...@gridgain.com
>> > >
>> > > > wrote:
>> > > >
>> > > >> Guys,
>> > > >>
>> > > >> To build a great product we should be open minded and look to the
>> > > future,
>> > > >> not to the past.
>> > > >>
>> > > >> Dima raised very valid point - why async is not async? Current
>> > > programming
>> > > >> culture and demanding performance requirements pushes users towards
>> > > >> reactive-style programming. I do not want my thread to ever be
>> > blocked.
>> > > >> Instead, I want to send a number of concurrent commands and
>> optionally
>> > > >> subscribe to final result. So trully async API makes total sense to
>> > me.
>> > > >>
>> > > >> But personally, my primary interest in this area is SQL. Oracle is
>> > > >> preparing new async d

work with files and directories

2018-05-15 Thread Dmitriy Govorukhin

Hi Ignite,

Do we have a general approach to working with files and directories?
I see a lot of duplicated logic for writing through a .tmp file.

For example,

GridCacheDatabaseSharedManager.writeCheckpointEntry();
GridCacheDatabaseSharedManager.nodeStart();
FileWriteAheadLogManager.FileArchiver.archiveSegment();

All of these methods implement the same logic: write to a tmp file and
rename it to the normal name.

I think it would be better to stop writing duplicated code and start
consolidating it in one place.

Also, I think the current approach to creating files is not quite right.
Each internal component creates and deletes files on its own, and nobody
knows which files are located where.

I suggest refactoring the code and creating something (maybe a new manager)
that knows about all files inside the node. All internal components must
create files only through it.

This would make it easier to write tests for persistence and reduce
duplicated code for working with files.
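
To illustrate the "write to a tmp file and rename" logic mentioned above,
here is a minimal sketch in plain Java NIO. It is not taken from the Ignite
code base: the class name TmpFileWriter and the method writeAtomically are
hypothetical, and the real methods listed above may differ in details such
as fsync policy and naming of the temporary file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration only; not part of Ignite. */
public final class TmpFileWriter {
    private TmpFileWriter() {
        // Static helper, no instances.
    }

    /**
     * Writes data to "<target>.tmp" in the same directory, forces it to disk,
     * then renames the temporary file to the target name.
     */
    public static void writeAtomically(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");

        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);

            // A single write() call may be partial, so loop until the buffer is drained.
            while (buf.hasRemaining())
                ch.write(buf);

            // Flush content and metadata before the rename.
            ch.force(true);
        }

        // Rename in one step. Whether an already existing target is replaced
        // with ATOMIC_MOVE is implementation-specific according to the JDK docs.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

If all components went through one helper like this, a test could point it
at a temporary directory and check which files were created, which is the
kind of consolidation proposed in this message.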

Re: IGNITE-6430 (Done) CacheGroupsMetricsRebalanceTest.testRebalanceEstimateFinishTime test fails periodically

2018-05-15 Thread Dmitriy Govorukhin

Dmitriy,

Looks good to me, please merge.

On Tue, May 15, 2018 at 11:20 AM, Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com> wrote:

> Dmitriy,
>
> Ok, I will do it soon.
>
> On Mon, May 14, 2018 at 8:41 PM, Dmitry Pavlov <dpavlov@gmail.com>
> wrote:
>
>> Hi Dmitriy G,
>>
>> could you please pick up this review?
>>
>> Sincerely,
>> Dmitriy Pavlov
>>
>> Thu, May 10, 2018 at 15:50, Александр Меньшиков <sharple...@gmail.com>:
>>
>>> Hi,
>>>
>>> I have fixed problem with the test
>>> CacheGroupsMetricsRebalanceTest.testRebalanceEstimateFinishTime.
>>>
>>> Please review and merge if okay.
>>>
>>> P.S.
>>> Aleksey Plekhanov already approved it (see Jira).
>>>
>>>
>>> JIRA: https://issues.apache.org/jira/browse/IGNITE-6430
>>> PR: https://github.com/apache/ignite/pull/3914
>>> TC (with 120 passes):
>>> https://ci.ignite.apache.org/viewLog.html?buildId=1246058
>>> ildTypeId=IgniteTests24Java8_Cache3=testsInfo
>>> TC:
>>> https://ci.ignite.apache.org/viewType.html?buildTypeId=Ignit
>>> eTests24Java8_BuildApacheIgnite_IgniteTests24Java8=
>>> pull%2F3914%2Fhead=buildTypeStatusDiv
>>>
>>
>

Re: supporting different configuration format json,yaml...

2018-05-15 Thread Dmitriy Govorukhin

Folks,

I guess that when work on the thin clients is completed, we will get more
newcomers who use go/python/php/js.
We can make Ignite friendlier for them by supporting familiar
configuration formats.

On Tue, May 15, 2018 at 12:13 PM, Dmitry Pavlov <dpavlov@gmail.com>
wrote:

> Hi Igniters,
>
> In general I agree with adding a new format, e.g. JSON is more popular than
> XML for new applications.
>
> At the same time, I've never heard of users asking for this on the user
> list. Or did I miss such topics?
>
> Sincerely,
> Dmitriy Pavlov
>
> Tue, May 15, 2018 at 9:31, Pavel Tupitsyn <ptupit...@apache.org>:
>
> > Dmitriy,
> >
> > We don't need to support different config formats on server in order to
> add
> > that to thin clients.
> >
> > Thin client protocol provides a way to create a cache with custom config
> > [1].
> > It is up to thin client library authors to use any config format they
> like
> > and then convert it into protocol-defined format.
> >
> > C# thin client uses custom format, for example, not Spring.
> >
> > [1]
> >
> > https://apacheignite.readme.io/docs/binary-client-
> protocol-cache-configuration-operations#section-op_cache_
> create_with_configuration
> >
> > On Mon, May 14, 2018 at 7:54 PM, Ivan Rakov <ivan.glu...@gmail.com>
> wrote:
> >
> > > Dmitry,
> > >
> > > We rely on Spring Framework when we start Ignite node from XML
> > > configuration. Spring doesn't easily support another formats of
> > > configuration files. I think, the main reason for this is built-in
> > ability
> > > to validate configuration via XML Schema. We can surely hack this
> around
> > (I
> > > bet there are existing libraries for configuring Spring with JSON),
> but I
> > > don't think that anyone suffered from inability to statically configure
> > > Ignite with json/yaml.
> > >
> > > Regarding thin clients: makes sense. I suppose necessary mappings will
> be
> > > implemented as a part of thin client.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > >
> > > On 14.05.2018 18:58, Dmitriy Govorukhin wrote:
> > >
> > >> Hi, Igniters!
> > >>
> > > >> As far as I know, many people are working on thin clients for
> > > >> different languages (go, js, php...).
> > > >> Are there any reasons why Ignite does not support the yaml or json
> > > >> format for configuration, or some other popular format?
> > > >> In the future, it can help to integrate with thin clients. For
> > > >> example, a js client may want to start a cache dynamically: it passes
> > > >> the cache configuration (in its native format, which for js is json)
> > > >> over TCP, and the Ignite node unwraps it, remaps it to the Java
> > > >> representation, and starts the cache dynamically.
> > >>
> > >>
> > >
> >
>

Re: IGNITE-6430 (Done) CacheGroupsMetricsRebalanceTest.testRebalanceEstimateFinishTime test fails periodically

2018-05-15 Thread Dmitriy Govorukhin

Dmitriy,

Ok, I will do it soon.

On Mon, May 14, 2018 at 8:41 PM, Dmitry Pavlov 
wrote:

> Hi Dmitriy G,
>
> could you please pick up this review?
>
> Sincerely,
> Dmitriy Pavlov
>
> Thu, May 10, 2018 at 15:50, Александр Меньшиков :
>
>> Hi,
>>
>> I have fixed problem with the test
>> CacheGroupsMetricsRebalanceTest.testRebalanceEstimateFinishTime.
>>
>> Please review and merge if okay.
>>
>> P.S.
>> Aleksey Plekhanov already approved it (see Jira).
>>
>>
>> JIRA: https://issues.apache.org/jira/browse/IGNITE-6430
>> PR: https://github.com/apache/ignite/pull/3914
>> TC (with 120 passes):
>> https://ci.ignite.apache.org/viewLog.html?buildId=1246058=
>> IgniteTests24Java8_Cache3=testsInfo
>> TC:
>> https://ci.ignite.apache.org/viewType.html?buildTypeId=
>> IgniteTests24Java8_BuildApacheIgnite_IgniteTests24Java8=pull%
>> 2F3914%2Fhead=buildTypeStatusDiv
>>
>

Re: async operation is not fair async

2018-05-14 Thread Dmitriy Govorukhin

Alexey,

Could you please add a more detailed description to this task [1]?
Perhaps the basic steps for implementation.

[1] https://issues.apache.org/jira/browse/IGNITE-8475

On Mon, May 14, 2018 at 4:58 PM, Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

> Another +1 for the true asynchronous approach. I remember a while ago one
> of the Ignite users raised a similar question regarding the *async method
> being blocked on establishing a TCP connection.
>
> As far as deadlocks go, I have a counter-example. Currently, we check the
> thread-local chain only for a single cache, so if I run the following code:
> cache1.getAsync(k1);
> cache2.getAsync(k2);
> then the deadlock is still possible, and I did not see a single user
> complaining about unexpected deadlocks. Rather than implementing this
> cross-cache chain (which would probably add another overhead), I would make
> it consistent and allow operations to be run in parallel.
>
> There are many use-cases when having true async operations dramatically
> improve performance. Consider, for example, a streaming example when keys
> are being pushed by a client to a cluster. Currently, to run effective
> processing, the user will have to use a data streamer with custom keys
> receiver which may be a huge usability downside. Async operations can
> utilize the cluster resources very efficiently.
>
> Finally, if we want to be on the safe side, we can keep the operation chain
> inside a transaction. I see absolutely no point in maintaining this chain
> outside of transactions.
>
> --AG
>
> 2018-05-14 16:01 GMT+03:00 Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com>
> :
>
> > Andrey,
> >
> > Do you prefer change behavior at runtime?
> > I guess will be better have different methods for getting cache instance
> > with fair and not fair sync.
> >
> > On Mon, May 14, 2018 at 3:39 PM, Andrey Gura <ag...@apache.org> wrote:
> >
> > > +1 for fair async operations.
> > >
> > > But I don't like idea use withFairSync() method. We added xxxAsync()
> > > methods recently and withAsync() is deprecated.
> > >
> > > I think we should just make methods are async in nature and provide
> > > ability of switching to the old behaviour using flag or property.
> > >
> > > On Fri, May 11, 2018 at 11:00 PM, Dmitriy Setrakyan
> > > <dsetrak...@apache.org> wrote:
> > > > Vladimir,
> > > >
> > > > In general I agree, but I do get greatly *close-minded* (pun
> intended)
> > > > whenever users' code that worked for the past several years all of a
> > > sudden
> > > > gets deadlocked after an upgrade. Making this feature optional is
> even
> > > > worse and more confusing. In this case the best action is no action
> at
> > > all.
> > > >
> > > > BTW, would be interesting to find out how Oracle async driver behaves
> > in
> > > > this case.
> > > >
> > > > D.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, May 11, 2018 at 8:29 PM, Vladimir Ozerov <
> voze...@gridgain.com
> > >
> > > > wrote:
> > > >
> > > >> Guys,
> > > >>
> > > >> To build a great product we should be open minded and look to the
> > > future,
> > > >> not to the past.
> > > >>
> > > >> Dima raised very valid point - why async is not async? Current
> > > programming
> > > >> culture and demanding performance requirements pushes users towards
> > > >> reactive-style programming. I do not want my thread to ever be
> > blocked.
> > > >> Instead, I want to send a number of concurrent commands and
> optionally
> > > >> subscribe to final result. So trully async API makes total sense to
> > me.
> > > >>
> > > >> But personally, my primary interest in this area is SQL. Oracle is
> > > >> preparing new async driver. ADBA - async database access. It was
> > > presented
> > > >> on recent JavaOne [1]. It is under active development right now -
> juse
> > > >> weave through the mailing list [2]. Some prototypes are already
> there
> > > [3].
> > > >> PostgreSQL community even started adopted it [4]!
> > > >>
> > > >> I am not pushing for immediate actions, but at least we should
> > > understand
> > > >> which way the wind is blowing. As a mid-term goals

Re: async operation is not fair async

2018-05-14 Thread Dmitriy Govorukhin

Andrey,

Do you prefer to change the behavior at runtime?
I think it would be better to have different methods for getting a cache
instance with fair and non-fair async.

On Mon, May 14, 2018 at 3:39 PM, Andrey Gura <ag...@apache.org> wrote:

> +1 for fair async operations.
>
> But I don't like the idea of using a withFairSync() method. We added
> xxxAsync() methods recently, and withAsync() is deprecated.
>
> I think we should just make the methods async in nature and provide the
> ability to switch to the old behaviour using a flag or property.
>
> On Fri, May 11, 2018 at 11:00 PM, Dmitriy Setrakyan
> <dsetrak...@apache.org> wrote:
> > Vladimir,
> >
> > In general I agree, but I do get greatly *close-minded* (pun intended)
> > whenever users' code that worked for the past several years all of a
> sudden
> > gets deadlocked after an upgrade. Making this feature optional is even
> > worse and more confusing. In this case the best action is no action at
> all.
> >
> > BTW, would be interesting to find out how Oracle async driver behaves in
> > this case.
> >
> > D.
> >
> >
> >
> >
> >
> > On Fri, May 11, 2018 at 8:29 PM, Vladimir Ozerov <voze...@gridgain.com>
> > wrote:
> >
> >> Guys,
> >>
> >> To build a great product we should be open-minded and look to the
> >> future, not to the past.
> >>
> >> Dima raised a very valid point - why is async not async? Current
> >> programming culture and demanding performance requirements push users
> >> towards reactive-style programming. I do not want my thread to ever be
> >> blocked. Instead, I want to send a number of concurrent commands and
> >> optionally subscribe to the final result. So a truly async API makes
> >> total sense to me.
> >>
> >> But personally, my primary interest in this area is SQL. Oracle is
> >> preparing a new async driver, ADBA (async database access). It was
> >> presented at the recent JavaOne [1]. It is under active development right
> >> now - just browse through the mailing list [2]. Some prototypes are
> >> already there [3]. The PostgreSQL community has even started adopting
> >> it [4]!
> >>
> >> I am not pushing for immediate action, but at least we should
> >> understand which way the wind is blowing. As a mid-term goal I would
> >> propose to finally remove the thread ID from our PESSIMISTIC transactions
> >> to allow for suspend/resume in different threads. And as a next step I
> >> would think about adopting async cache and SQL APIs.
> >>
> >> Vladimir.
> >>
> >> [1]
> >> http://www.oracle.com/technetwork/database/application-development/jdbc/con1491-3961036.pdf
> >> [2] http://mail.openjdk.java.net/pipermail/jdbc-spec-discuss/
> >> [3] https://github.com/oracle/oracle-db-examples/tree/master/java/AoJ
> >> [4] https://github.com/pgjdbc/pgjdbc/issues/978
> >>
> >> On Fri, May 11, 2018 at 9:48 PM, Dmitriy Setrakyan <
> dsetrak...@apache.org>
> >> wrote:
> >>
> >> > On Fri, May 11, 2018 at 7:46 PM, Dmitriy Govorukhin <
> >> > dmitriy.govoruk...@gmail.com> wrote:
> >> >
> >> > > I will edit IGNITE-8475, and remove all part that belong to the
> public
> >> > api.
> >> > > Is it acceptable for you?
> >> > >
> >> >
> >> > Everything is acceptable, as long as the public API is safe :)
> >> >
> >>
>

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Dmitriy S,

If it is placed in an internal package and used only internally, would you
agree with the changes?

Fri, May 11, 2018, 21:12 Dmitriy Setrakyan <dsetrak...@apache.org>:

> On Fri, May 11, 2018 at 6:49 PM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Dmitriy S,
> >
> > It is not broke existing code, because for use this ability you must use
> > decorator "withFairSycn()".
> >
> > What about the argument of Vladimir?
> >
>
> Here is Vladimir's quote:
>
>
> *This would also be helpful for transactional SQL as it would allow to hide
> network latency. But there is a problem - deadlocks. We need to inform user
> that this mode should be used with great care.*
>
>
> I would rather not change anything instead of increasing the probability of
> deadlocks. This was the main reason for the current behavior to begin with.
>
> In my view, if something is needed for the transactional SQL, please add it
> internally. Let's not corrupt the public API by adding dangerous methods.
>
> D.
>

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Dmitriy S,

It does not break existing code, because to use this ability you must use
the decorator "withFairAsync()".

What about the argument of Vladimir?

On Fri, May 11, 2018 at 8:41 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> On Fri, May 11, 2018 at 6:38 PM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Dmitriy S,
> >
> > Why method named as "async" but does not work as async? This is
> misleading.
> >
> > getAllAsync() is a special case. Not always you can use getAllAsync()
> > instead
> > of multiple getAsync().
> > In this topic, I wanna discuss problem not only for the GET operation but
> > also all async operation behavior in the one thread.
> >
> > In compute grid we can run multiple async compute operation in the one
> > thread, so why we can not do this for cache?
>
>
> Because it will break a lot of existing code and create bugs we cannot even
> predict at this point. I am not sure why has this become a problem. Is it
> preventing us from accomplishing some other task? If not, then I propose to
> drop it.
>
> D.
>

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Dmitriy S,

Why is a method named "async" if it does not work asynchronously? This is
misleading.

getAllAsync() is a special case. You cannot always use getAllAsync() instead
of multiple getAsync() calls.
In this topic, I want to discuss the problem not only for the GET operation
but for the behavior of all async operations within one thread.

In the compute grid we can run multiple async compute operations in one
thread, so why can't we do this for the cache?

On Fri, May 11, 2018 at 8:33 PM, Dmitriy Setrakyan 
wrote:

> On Fri, May 11, 2018 at 6:29 PM, Dmitry Pavlov 
> wrote:
>
> > IMO you can complete async operations one before another if these
> > operations are related to independent data.
> >
> > It is strange why Ignite users are not confused by current API. So I
> > support Dmitriy's G. suggestion.
> >
>
> Again, this is a solution looking for a problem. Nobody complains about
> this, so there really isn't any issue. There are so many other tasks we
> could focus on. Let's not fix what's not broken.
>
> D.
>

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Igniters,

I created the issue. IGNITE-8475
<https://issues.apache.org/jira/browse/IGNITE-8475>

Any comments are welcome.

On Fri, May 11, 2018 at 6:32 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:

> Agree. "fair" is more descriptive.
>
> Best Regards,
> Ivan Rakov
>
>
> On 11.05.2018 18:30, Dmitriy Govorukhin wrote:
>
>> Ivan,
>>
>> My suggestion "withFairAsync()". What do you think?
>>
>> On Fri, May 11, 2018 at 6:23 PM, Ivan Rakov <ivan.glu...@gmail.com>
>> wrote:
>>
>> I think, the best option from API side is to add decorating
>>> withExplicitAsync() method.
>>> We already have withKeepBinary, withExpiryPolicy and so on.
>>>
>>> Best Regards,
>>> Ivan Rakov
>>>
>>>
>>> On 11.05.2018 18:18, Dmitriy Govorukhin wrote:
>>>
>>> Vladimir,
>>>>
>>>> Should we create the new cache adapter or rework GridCacheAdapter?
>>>>
>>>> On Fri, May 11, 2018 at 5:52 PM, Vladimir Ozerov <voze...@gridgain.com>
>>>> wrote:
>>>>
>>>> +1
>>>>
>>>>> This would also be helpful for transactional SQL as it would allow to
>>>>> hide
>>>>> network latency. But there is a problem - deadlocks. We need to inform
>>>>> user
>>>>> that this mode should be used with great care.
>>>>>
>>>>> On Fri, May 11, 2018 at 5:21 PM, Dmitriy Govorukhin <
>>>>> dmitriy.govoruk...@gmail.com> wrote:
>>>>>
>>>>> Hi Igniters,
>>>>>
>>>>>> I have a question. Why our async operation in not really async?
>>>>>>
>>>>>> GridCacheAdapter.syncOp has awaitLastFut(); this call wait last async
>>>>>> operation completed.
>>>>>>
>>>>>> This means all async operation in one thread will be executed one by
>>>>>> one
>>>>>> but not in parallel. Async operation is not async.
>>>>>>
>>>>>> Example for atomic cache
>>>>>>
>>>>>> f1=cache.getAsync(key1);
>>>>>> f2=cache.getAsync(key2);
>>>>>>
>>>>>> f1 always will be complete before f2.
>>>>>>
>>>>>> I want to have the ability run multiple async operations in one
>>>>>> thread.
>>>>>> What do you think?
>>>>>>
>>>>>> Maybe we can add new cache adapter with fair async operations?
>>>>>>
>>>>>>
>>>>>>
>

[jira] [Created] (IGNITE-8475) Create new IgniteCache decorator with fair async methods

2018-05-11 Thread Dmitriy Govorukhin (JIRA)

Dmitriy Govorukhin created IGNITE-8475:
--

 Summary: Create new IgniteCache decorator with fair async methods
 Key: IGNITE-8475
 URL: https://issues.apache.org/jira/browse/IGNITE-8475
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Affects Versions: 2.4
Reporter: Dmitriy Govorukhin
 Fix For: None


GridCacheAdapter.syncOp calls awaitLastFut(), which waits for the last async
operation to complete.

This means all async operations in one thread are executed one by one, not
in parallel. The async operation is not really async.

Example for an atomic cache:

f1=cache.getAsync(key1);
f2=cache.getAsync(key2);

f1 will always complete before f2.

We need to create a new decorator for IgniteCache that returns an IgniteCache
proxy with fair async operations:

 IgniteCache.withFairAsync()
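
The difference between the two behaviours can be modelled with a small
sketch in plain Java. It uses CompletableFuture instead of Ignite's own
future classes, and the names AsyncChainSketch, chainedAsync and fairAsync
are hypothetical. It only models the per-thread chaining described above
under these assumptions; it is not the GridCacheAdapter implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical model of chained vs. fair async; it does not use Ignite classes. */
public final class AsyncChainSketch {
    /** Last future issued by the calling thread (stands in for the per-thread chain). */
    private final ThreadLocal<CompletableFuture<?>> lastFut =
        ThreadLocal.withInitial(() -> CompletableFuture.completedFuture(null));

    /** Chained mode: the operation starts only after the previous one from this thread is done. */
    public <T> CompletableFuture<T> chainedAsync(Supplier<CompletableFuture<T>> op) {
        // Wait for the previous operation regardless of whether it succeeded or failed.
        CompletableFuture<Void> prevDone = lastFut.get().handle((res, err) -> null);

        CompletableFuture<T> fut = prevDone.thenCompose(ignored -> op.get());

        lastFut.set(fut);

        return fut;
    }

    /** Fair mode: the operation starts immediately, independent of earlier ones. */
    public <T> CompletableFuture<T> fairAsync(Supplier<CompletableFuture<T>> op) {
        return op.get();
    }

    public static void main(String[] args) {
        AsyncChainSketch cache = new AsyncChainSketch();

        // Simulates a slow remote get that we complete by hand.
        CompletableFuture<String> slowGet = new CompletableFuture<>();

        CompletableFuture<String> f1 = cache.fairAsync(() -> slowGet);
        CompletableFuture<String> f2 = cache.fairAsync(() -> CompletableFuture.completedFuture("v2"));
        System.out.println("fair:    f2 done before f1? " + (f2.isDone() && !f1.isDone()));

        CompletableFuture<String> c1 = cache.chainedAsync(() -> slowGet);
        CompletableFuture<String> c2 = cache.chainedAsync(() -> CompletableFuture.completedFuture("v2"));
        System.out.println("chained: c2 done before c1? " + (c2.isDone() && !c1.isDone()));

        slowGet.complete("v1");
        System.out.println("after completion, c2 = " + c2.join());
    }
}
```

In this model a decorator such as the proposed withFairAsync() would
correspond to routing calls through fairAsync instead of chainedAsync.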



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Ivan,

My suggestion is "withFairAsync()". What do you think?

On Fri, May 11, 2018 at 6:23 PM, Ivan Rakov <ivan.glu...@gmail.com> wrote:

> I think, the best option from API side is to add decorating
> withExplicitAsync() method.
> We already have withKeepBinary, withExpiryPolicy and so on.
>
> Best Regards,
> Ivan Rakov
>
>
> On 11.05.2018 18:18, Dmitriy Govorukhin wrote:
>
>> Vladimir,
>>
>> Should we create the new cache adapter or rework GridCacheAdapter?
>>
>> On Fri, May 11, 2018 at 5:52 PM, Vladimir Ozerov <voze...@gridgain.com>
>> wrote:
>>
>> +1
>>>
>>> This would also be helpful for transactional SQL as it would allow to
>>> hide
>>> network latency. But there is a problem - deadlocks. We need to inform
>>> user
>>> that this mode should be used with great care.
>>>
>>> On Fri, May 11, 2018 at 5:21 PM, Dmitriy Govorukhin <
>>> dmitriy.govoruk...@gmail.com> wrote:
>>>
>>> Hi Igniters,
>>>>
>>>> I have a question. Why our async operation in not really async?
>>>>
>>>> GridCacheAdapter.syncOp has awaitLastFut(); this call wait last async
>>>> operation completed.
>>>>
>>>> This means all async operation in one thread will be executed one by one
>>>> but not in parallel. Async operation is not async.
>>>>
>>>> Example for atomic cache
>>>>
>>>> f1=cache.getAsync(key1);
>>>> f2=cache.getAsync(key2);
>>>>
>>>> f1 always will be complete before f2.
>>>>
>>>> I want to have the ability run multiple async operations in one thread.
>>>> What do you think?
>>>>
>>>> Maybe we can add new cache adapter with fair async operations?
>>>>
>>>>
>

Re: async operation is not fair async

2018-05-11 Thread Dmitriy Govorukhin

Vladimir,

Should we create a new cache adapter or rework GridCacheAdapter?

On Fri, May 11, 2018 at 5:52 PM, Vladimir Ozerov <voze...@gridgain.com>
wrote:

> +1
>
> This would also be helpful for transactional SQL as it would allow to hide
> network latency. But there is a problem - deadlocks. We need to inform user
> that this mode should be used with great care.
>
> On Fri, May 11, 2018 at 5:21 PM, Dmitriy Govorukhin <
> dmitriy.govoruk...@gmail.com> wrote:
>
> > Hi Igniters,
> >
> > I have a question. Why our async operation in not really async?
> >
> > GridCacheAdapter.syncOp has awaitLastFut(); this call wait last async
> > operation completed.
> >
> > This means all async operation in one thread will be executed one by one
> > but not in parallel. Async operation is not async.
> >
> > Example for atomic cache
> >
> > f1=cache.getAsync(key1);
> > f2=cache.getAsync(key2);
> >
> > f1 always will be complete before f2.
> >
> > I want to have the ability run multiple async operations in one thread.
> > What do you think?
> >
> > Maybe we can add new cache adapter with fair async operations?
> >
>

Re: Disable WAL for several cache groups within one exchange

2018-05-11 Thread Dmitriy Govorukhin

Ivan,

Agreed. If we have a batch method for cache creation, we should be able to
enable/disable WAL in a batch too.

On Fri, May 11, 2018 at 5:17 PM, Ivan Rakov  wrote:

> Igniters,
>
> API method for disabling WAL in IgniteCluster accepts only one cache name.
> Every call triggers exchange and checkpoints cluster-wide - it takes plenty
> of time to disable/enable WAL for multiple caches.
> I think, we should add option to disable/enable WAL for several caches
> with single command.
>
> Thoughts?
>
> --
> Best Regards,
> Ivan Rakov
>
>
