date:20180323

[GitHub] ignite pull request #3694: IGNITE-5819: SQL: add support for TRUNCATE TABLE ...

2018-03-23 Thread shroman

GitHub user shroman opened a pull request:

https://github.com/apache/ignite/pull/3694

IGNITE-5819: SQL: add support for TRUNCATE TABLE command.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shroman/ignite IGNITE-5819

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3694


commit 4c1ab3df27f23a2f0086fd324d2f22778a41b649
Author: shroman 
Date:   2018-03-23T09:27:57Z

IGNITE-5819: SQL: add support for TRUNCATE TABLE command.




---

Re: Apache Ignite nightly release builds

2018-03-23 Thread Denis Magda

Created a JIRA ticket for that:
https://issues.apache.org/jira/browse/IGNITE-8040

--
Denis

On Fri, Mar 23, 2018 at 1:27 AM, Dmitriy Setrakyan 
wrote:

> Awesome! Finally instead of asking our users to build from the master, we
> can provide a link to the nightly build instead.
>
> Denis, can you please add these links to the website?
>
> D.
>
> On Thu, Mar 22, 2018 at 1:27 PM, Petr Ivanov  wrote:
>
>> It works, thanks!
>>
>>
>> Here are the updated links for Artifacts and Changes, respectively, with silent
>> guest login (can be added to bookmarks):
>> * https://ci.ignite.apache.org/viewLog.html?buildId=lastSucces
>> sful=Releases_NightlyRelease_RunApacheIgnite
>> NightlyRelease=artifacts=1
>> * https://ci.ignite.apache.org/viewLog.html?buildId=lastSucces
>> sful=Releases_NightlyRelease_RunApacheIgnite
>> NightlyRelease=buildChangesDiv=1
>>
>>
>>
>> > On 22 Mar 2018, at 13:06, Vitaliy Osipov  wrote:
>> >
>> > 
>>
>>
>

[jira] [Created] (IGNITE-8040) Add Ignite nightly builds references to the site

2018-03-23 Thread Denis Magda (JIRA)

Denis Magda created IGNITE-8040:
---

         Summary: Add Ignite nightly builds references to the site
             Key: IGNITE-8040
             URL: https://issues.apache.org/jira/browse/IGNITE-8040
         Project: Ignite
      Issue Type: New Feature
      Components: site
        Reporter: Denis Magda
        Assignee: Prachi Garg
         Fix For: 2.5


We should add a link to the nightly builds somewhere on the site:
https://ci.ignite.apache.org/viewType.html?buildTypeId=Releases_NightlyRelease_RunApacheIgniteNightlyRelease

Here is a relevant discussion:
http://apache-ignite-developers.2346864.n4.nabble.com/Apache-Ignite-nightly-release-builds-td27966.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: What's about releasing Ignite 2.5 a bit earlier?

2018-03-23 Thread Denis Magda

Petr,

I think your statement regarding non-readiness for production usage is too
strong :) I treat the issues you listed as usability ones. I agree that it's
inconvenient to inject ‘--add-modules’ here and there, but it doesn't lead to
failures at runtime.

If I'm still missing something and there are bugs that might affect the
stability of clusters at runtime, then please share them with us. You're more
aware of them than anyone else.

--
Denis

On Fri, Mar 23, 2018 at 8:50 PM, Petr Ivanov  wrote:

> Agree with no blocker.
>
> Then as developers we just have to send a clear message that "we are working
> towards honest Java 9/10 support, but do not use it in prod environments
> yet" :)
>
>
> > On 23 Mar 2018, at 22:41, Denis Magda  wrote:
> >
> > Petr,
> >
> > I guess we would need much more time to handle all Java 9/10 tasks. Since
> > none of them looks like a blocker to me, I would suggest us not to rush
> and
> > target the tasks for 2.4 release.
> >
> > Overall, I see that we're in agreement regarding the earlier date of the
> > 2.5 release. I'll let the rest of the community speak out until the
> > beginning of next week.
> >
> > --
> > Denis
> >
> >
> >
> > On Thu, Mar 22, 2018 at 8:24 PM, Petr Ivanov 
> wrote:
> >
> >> I’ll do my best to deliver packages in time.
> >>
> >>
> >> Concerning Java 9/10 — AFAIK there are limitations how do we support it:
> >> * compilation is supported with scala_2.10 modules switched off (due to
> >> scala support Java 9 itself starting from 2.12.x+ versions);
> >> * also compilation and run is possible only with lots of ‘--add-modules’
> >> hacks — I think that to say we support Java 9 we have to add modules
> info
> >> to every package;
> >> * tests (project ‘Ignite Tests 2.4+ (Java 9)’) also require attention,
> >> because there are some problems with theirs run.
> >>
> >>
> >>
> >>> On 22 Mar 2018, at 21:09, Denis Magda  wrote:
> >>>
> >>> Petr,
> >>>
> >>> Java Thin client, GA Grid, COPY command are the most significant
> >> additions
> >>> to the release (if to talk about the features). Aside from that, I see
> >> many
> >>> valuable optimizations and fixes that have been sitting in the master
> >> for a
> >>> while (some of them were merged in January). It's preferable to release
> >>> them.
> >>>
> >>> As for Java 10 and DEB, if you feel you can't make them to the release,
> >>> then it can wait till 2.6.
> >>>
> >>> Speaking of Java 9, what's left? I thought it's already fully
> supported.
> >>>
> >>> --
> >>> Denis
> >>>
> >>> On Thu, Mar 22, 2018 at 10:47 AM, Petr Ivanov 
> >> wrote:
> >>>
>  Will there be a major new feature or this release will concentrate on
>  stability and optimisations?
> 
>  Also, I guess, we will have to include into 2.5 release full support
> for
>  Java 9 and Java 10 (that WILL require some developers time).
>  And RPM / DEB packages stage II phase will require lots of testing and
>  infrastructure preparations: place to store package, RPM / DEB build
>  inclusion into release process and so on.
> 
>  Apr 30 sounds good, but I’d have a “place for manoeuvre” in case all
> >> those
>  activities are not finished in time.
> 
> 
> 
> > On 22 Mar 2018, at 20:39, Denis Magda  wrote:
> >
> > Igniters,
> >
> > According to our regular schedule, every new Ignite version usually
> >> goes
> > public once in 3 months. As you remember, the latest 2.4 release,
> which
> > took us 5 months to improve and roll out, was based on the version of
> >> the
> > source code dated by January.
> >
> > Since that time the master branch went far ahead and already
> >> incorporates
> > many valuable fixes and capabilities such as:
> >
> > - Fixes provided as a part of "Green Team City" activity.
> > - Persistence: page replacement algorithm and throttling
> >> optimizations,
> > out of memory in checkpointing buffer corrections, etc. Alex G and
>  Ivan can
> > shed more light here.
> > - Data loading optimizations for SQL: streaming for JDBC thin driver
>  and
> > copy command
> > - Genetic Algorithms Grid Contribution!
> > - Java Thin Client developed by Alexey Kukushkin
> >
> >
> > - the list goes on and on... Please share the contributions you are
> > ready to release.
> >
> > So, why don't we go ahead and release the current master and possibly
>  extra
> > tickets that are in the review state earlier? What's about April 30
> as
>  the
> > next release date?
> >
> > --
> > Denis
> 
> 
> >>
> >>
>
>

Re: What's about releasing Ignite 2.5 a bit earlier?

2018-03-23 Thread Petr Ivanov

Agree with no blocker.

Then as developers we just have to send a clear message that "we are working
towards honest Java 9/10 support, but do not use it in prod environments yet" :)


> On 23 Mar 2018, at 22:41, Denis Magda  wrote:
> 
> Petr,
> 
> I guess we would need much more time to handle all Java 9/10 tasks. Since
> none of them looks like a blocker to me, I would suggest us not to rush and
> target the tasks for 2.4 release.
> 
> Overall, I see that we're in agreement regarding the earlier date of the
> 2.5 release. I'll let the rest of the community speak out until the
> beginning of next week.
> 
> --
> Denis
> 
> 
> 
> On Thu, Mar 22, 2018 at 8:24 PM, Petr Ivanov  wrote:
> 
>> I’ll do my best to deliver packages in time.
>> 
>> 
>> Concerning Java 9/10 — AFAIK there are limitations how do we support it:
>> * compilation is supported with scala_2.10 modules switched off (due to
>> scala support Java 9 itself starting from 2.12.x+ versions);
>> * also compilation and run is possible only with lots of ‘--add-modules’
>> hacks — I think that to say we support Java 9 we have to add modules info
>> to every package;
>> * tests (project ‘Ignite Tests 2.4+ (Java 9)’) also require attention,
>> because there are some problems with theirs run.
>> 
>> 
>> 
>>> On 22 Mar 2018, at 21:09, Denis Magda  wrote:
>>> 
>>> Petr,
>>> 
>>> Java Thin client, GA Grid, COPY command are the most significant
>> additions
>>> to the release (if to talk about the features). Aside from that, I see
>> many
>>> valuable optimizations and fixes that have been sitting in the master
>> for a
>>> while (some of them were merged in January). It's preferable to release
>>> them.
>>> 
>>> As for Java 10 and DEB, if you feel you can't make them to the release,
>>> then it can wait till 2.6.
>>> 
>>> Speaking of Java 9, what's left? I thought it's already fully supported.
>>> 
>>> --
>>> Denis
>>> 
>>> On Thu, Mar 22, 2018 at 10:47 AM, Petr Ivanov 
>> wrote:
>>> 
 Will there be a major new feature or this release will concentrate on
 stability and optimisations?
 
 Also, I guess, we will have to include into 2.5 release full support for
 Java 9 and Java 10 (that WILL require some developers time).
 And RPM / DEB packages stage II phase will require lots of testing and
 infrastructure preparations: place to store package, RPM / DEB build
 inclusion into release process and so on.
 
 Apr 30 sounds good, but I’d have a “place for manoeuvre” in case all
>> those
 activities are not finished in time.
 
 
 
> On 22 Mar 2018, at 20:39, Denis Magda  wrote:
> 
> Igniters,
> 
> According to our regular schedule, every new Ignite version usually
>> goes
> public once in 3 months. As you remember, the latest 2.4 release, which
> took us 5 months to improve and roll out, was based on the version of
>> the
> source code dated by January.
> 
> Since that time the master branch went far ahead and already
>> incorporates
> many valuable fixes and capabilities such as:
> 
> - Fixes provided as a part of "Green Team City" activity.
> - Persistence: page replacement algorithm and throttling
>> optimizations,
> out of memory in checkpointing buffer corrections, etc. Alex G and
 Ivan can
> shed more light here.
> - Data loading optimizations for SQL: streaming for JDBC thin driver
 and
> copy command
> - Genetic Algorithms Grid Contribution!
> - Java Thin Client developed by Alexey Kukushkin
> 
> 
> - the list goes on and on... Please share the contributions you are
> ready to release.
> 
> So, why don't we go ahead and release the current master and possibly
 extra
> tickets that are in the review state earlier? What's about April 30 as
 the
> next release date?
> 
> --
> Denis
 
 
>> 
>>

Re: Ignite Direct I/O plugin description added to wiki

2018-03-23 Thread Denis Magda

Dmitriy, Prachi,

Thanks to both of you for developing and documenting the feature. I've added
a reference from the readme to the Direct I/O section on the wiki so that
people can learn even more about the capability.

--
Denis

On Fri, Mar 23, 2018 at 2:34 PM, Dmitry Pavlov 
wrote:

> Hi Prachi,
>
> Yes, it's a great description.
>
> I'm impressed by how informative and short you managed to make it.
>
> Thank you!
>
> Sincerely,
> Dmitriy Pavlov
>
> Sat, Mar 24, 2018 at 0:29, Prachi Garg :
>
>> Dmitriy,
>>
>> I have documented the Direct I/O plugin. Please review [1] and provide
>> comments/feedback in the ticket [2].
>>
>> [1]
>> https://apacheignite.readme.io/v2.4/docs/durable-memory-tuning#section-enabling-direct-i-o
>>
>> [2] https://issues.apache.org/jira/browse/IGNITE-7466
>>
>>
>> -Prachi
>>
>> On Wed, Mar 21, 2018 at 11:22 PM, Dmitry Pavlov 
>> wrote:
>>
>> > Yes, will do.
>> >
>> > Thu, Mar 22, 2018 at 1:05, Denis Magda :
>> >
>> > > Dmitriy,
>> > >
>> > > Thanks for updating the WAL section. Now it makes total sense to me.
>> > >
>> > > As for the page with JNA projects, let's get added there! Could you
>> > > contact the owners?
>> > >
>> > > --
>> > > Denis
>> > >
>> > >
>> > > On Wed, Mar 21, 2018 at 9:27 AM, Dmitry Pavlov > >
>> > > wrote:
>> > >
>> > >> Denis,
>> > >>
>> > >> one more thing, can/should we mention Ignite Direct IO plugin in
>> list of
>> > >> project using JNA here: https://github.com/java-native-access/jna
>> > >>
>> > >> Sincerely,
>> > >> Dmitriy Pavlov
>> > >>
>> > >> Wed, Mar 21, 2018 at 1:59, Denis Magda :
>> > >>
>> > >>> *Dmitriy*, thanks. Astonishing job! We'll add a section to the
>> durable
>> > >>> memory tuning page and refer to the wiki for more details:
>> > >>> https://issues.apache.org/jira/browse/IGNITE-7466
>> > >>>
>> > >>> Please clarify the following:
>> > >>>
>> > >>> > Direct I/O mode can't be enabled for Write Ahead Log files.
>> However,
>> > >>> when
>> > >>> > working with plugin, WAL manager applies advising Linux systems do
>> > not
>> > >>> > store the data of the file in page cache as they are not required.
>> > >>>
>> > >>>
>> > >>> For me, it means that WAL always goes through the operating system
>> I/O
>> > >>> calls. Nothing changes for the WAL. However, I'm not sure what you
>> > meant
>> > >>> to
>> > >>> explain by saying "when working with the plugin (Direct I/O) WAL
>> > manager
>> > >>> applies...". Could you rephrase it to bring more clarity?
>> > >>>
>> > >>> *Raymond,*
>> > >>>
>> > >>>
>> > >>> If Direct I/O is enabled by default it will bring down the
>> performance
>> > of
>> > >>> read-intensive application because, as Dmitry says, the reads bypass
>> > page
>> > >>> cache. So, I would recommend using it for write-intensive workloads
>> > and,
>> > >>> probably, for mixed-workloads depending on the reads and writes
>> rate.
>> > >>>
>> > >>> --
>> > >>> Denis
>> > >>>
>> > >>>
>> > >>> On Tue, Mar 20, 2018 at 2:29 PM, Raymond Wilson <
>> > >>> raymond_wil...@trimble.com>
>> > >>> wrote:
>> > >>>
>> > >>> > Looks good!
>> > >>> >
>> > >>> > Is there any reason why this should not be a default setting if it
>> > >>> > gracefully downgrades to non-Direct IO if not supported by the OS?
>> > >>> >
>> > >>> > Thanks,
>> > >>> > Raymond.
>> > >>> >
>> > >>> > -Original Message-
>> > >>> > From: Dmitriy Setrakyan [mailto:dsetrak...@apache.org]
>> > >>> > Sent: Wednesday, March 21, 2018 10:23 AM
>> > >>> > To: dev 
>> > >>> > Subject: Re: Ignite Direct I/O plugin description added to wiki
>> > >>> >
>> > >>> > Thanks Dmitry, awesome work!
>> > >>> >
>> > >>> > On Wed, Mar 21, 2018 at 12:21 AM, Dmitry Pavlov <
>> > dpavlov@gmail.com
>> > >>> >
>> > >>> > wrote:
>> > >>> >
>> > >>> > > Hi Igniters,
>> > >>> > >
>> > >>> > > I've added description of new plugin for Direct I/O for native
>> > >>> > > persistence (
>> > >>> > > https://issues.apache.org/jira/browse/IGNITE-6341)  to wiki
>> > >>> > > https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-underthehood-DirectI/O
>> > >>> > >
>> > >>> > >
>> > >>> > > Sincerely,
>> > >>> > > Dmitriy Pavlov
>> > >>> > >
>> > >>> >
>> > >>>
>> > >>
>> > >
>> >
>>
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Valentin Kulichenko

Dmitry,

Thanks for the clarification. So it sounds like if we fix all the other modes
as we discuss here, NONE would be the only one allowing corruption. I also
don't see much sense in this mode, and I think we should clearly state this in
the doc, as well as print out a warning if NONE mode is used. Eventually, if
it's confirmed that there are no reasonable use cases for it, we can deprecate
it.

-Val
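
The warning suggested above could look roughly like the sketch below. It is written in Python for brevity and is not Ignite code: the `WalMode` enum merely mirrors the mode names used in this thread, and `check_wal_mode` and the message text are hypothetical names chosen for illustration.

```python
import warnings
from enum import Enum


class WalMode(Enum):
    # Mode names as they appear in the thread; the enum itself is illustrative.
    FSYNC = "FSYNC"
    LOG_ONLY = "LOG_ONLY"
    BACKGROUND = "BACKGROUND"
    NONE = "NONE"


def check_wal_mode(mode):
    """Return the mode unchanged, emitting a warning when the log is disabled."""
    if mode is WalMode.NONE:
        warnings.warn(
            "WAL mode NONE disables the write-ahead log; storage may not be "
            "recoverable after a crash during a checkpoint.",
            RuntimeWarning,
            stacklevel=2,
        )
    return mode


if __name__ == "__main__":
    check_wal_mode(WalMode.LOG_ONLY)  # silent
    check_wal_mode(WalMode.NONE)      # emits a RuntimeWarning
```

A warning rather than an exception keeps the mode usable while the community decides whether to deprecate it.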

On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov 
wrote:

> Hi Val,
>
> NONE means that the WAL log is disabled and not written at all. Use of the
> mode is at your own risk. It is possible that restore state after the crash
> in the middle of a checkpoint will not succeed. I do not see much sense in
> it, especially in production.
>
> BACKGROUND is full functional WAL mode, but allows some delay before flush
> to disk.
>
> Sincerely,
> Dmitriy Pavlov
>
> Sat, Mar 24, 2018 at 1:07, Valentin Kulichenko <
> valentin.kuliche...@gmail.com>:
>
> > I agree. In my view, any possibility to get a corrupted storage is a bug
> > which needs to be fixed.
> >
> > BTW, can someone explain semantics of NONE mode? What is the difference
> > from BACKGROUND from user's perspective? Is there any particular use case
> > where it can be used?
> >
> > -Val
> >
> > On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov 
> > wrote:
> >
> > > Hi Ivan,
> > >
> > > IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > Fri, Mar 23, 2018 at 12:23, Ivan Rakov :
> > >
> > > > Igniters, there's another important question about this matter.
> > > > Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that
> we
> > > > have to do it: it will cause similar performance drop, but if we
> > > > consider LOG_ONLY broken without these fixes, BACKGROUND is broken as
> > > well.
> > > >
> > > > Best Regards,
> > > > Ivan Rakov
> > > >
> > > > On 23.03.2018 10:27, Ivan Rakov wrote:
> > > > > Fixes are quite simple.
> > > > > I expect them to be merged in master in a week in worst case.
> > > > >
> > > > > Best Regards,
> > > > > Ivan Rakov
> > > > >
> > > > > On 22.03.2018 17:49, Denis Magda wrote:
> > > > >> Ivan,
> > > > >>
> > > > >> How quick are you going to merge the fix into the master? Many
> > > > >> persistence
> > > > >> related optimizations have already stacked up. Probably, we can
> > > release
> > > > >> them sooner if the community agrees.
> > > > >>
> > > > >> --
> > > > >> Denis
> > > > >>
> > > > >> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov <
> ivan.glu...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Thanks all!
> > > > >>> We seem to have reached a consensus on this issue. I'll just add
> > > > >>> necessary
> > > > >>> fsyncs under IGNITE-7754.
> > > > >>>
> > > > >>> Best Regards,
> > > > >>> Ivan Rakov
> > > > >>>
> > > > >>>
> > > > >>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> > > > >>>
> > > >  +1 for fixing LOG_ONLY. If current implementation doesn't
> protect
> > > from
> > > >  data
> > > >  corruption, it doesn't make sense.
> > > > 
> > > >  On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <
> dma...@apache.org>
> > > >  wrote:
> > > > 
> > > >  +1 for the fix of LOG_ONLY
> > > > > On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> > > > > alexey.goncha...@gmail.com> wrote:
> > > > >
> > > > > +1 for fixing LOG_ONLY to enforce corruption safety given the
> > > > > provided
> > > > >> performance results.
> > > > >>
> > > > >> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <
> > voze...@gridgain.com
> > > >:
> > > > >>
> > > > >> +1 for accepting drop in LOG_ONLY. 7% is not that much and
> not a
> > > > >> drop
> > > > >> at
> > > > >> all, provided that we fixing a bug. I.e. should we implement
> it
> > > > >> correctly
> > > > >> in the first place we would never notice any "drop".
> > > > >>> I do not understand why someone would like to use current
> > broken
> > > > >>> mode.
> > > > >>>
> > > > >>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
> > > > >>> 
> > > > >>> wrote:
> > > > >>>
> > > > >>> Hi, I think option 1 is better. As Val said any mode that
> > allows
> > > > >>> corruption
> > > > >>>
> > > >  does not make much sense.
> > > > 
> > > >  What Ivan mentioned here as drop, in relation to old mode
> > > DEFAULT
> > > > 
> > > > >>> (FSYNC
> > > > >>> now), is still a significant performance boost.
> > > >  Sincerely,
> > > >  Dmitriy Pavlov
> > > > 
> > > >  Wed, Mar 21, 2018 at 17:56, Ivan Rakov <
> > ivan.glu...@gmail.com
> > > >:
> > > > 
> > > >  I've attached benchmark results to the JIRA ticket.
> > > > > We observe ~7% drop in "fair" LOG_ONLY_SAFE mode,
> independent
> > > of
> > > > >
> > > >  WAL
> > > >

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Dmitry Pavlov

Hi Val,

NONE means that the WAL is disabled and not written at all. Use this mode at
your own risk: restoring state after a crash in the middle of a checkpoint may
not succeed. I do not see much sense in it, especially in production.

BACKGROUND is a fully functional WAL mode, but it allows some delay before the
flush to disk.

Sincerely,
Dmitriy Pavlov
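
For readers following the thread, the durability difference between the modes can be sketched roughly as follows. This is a toy Python model, not Ignite's Java implementation: the `ToyWal` class, the `bytes_handed_to_os` helper, and the exact point at which each mode writes or syncs are assumptions made for illustration; only the mode names come from the discussion.

```python
import os
import tempfile
from enum import Enum


class WalMode(Enum):
    # Mode names as used in the thread; the behavior below is a simplification.
    FSYNC = "FSYNC"
    LOG_ONLY = "LOG_ONLY"
    BACKGROUND = "BACKGROUND"
    NONE = "NONE"


class ToyWal:
    """Toy append-only log (hypothetical; not Ignite's WAL manager)."""

    def __init__(self, path, mode):
        self.mode = mode
        self.pending = []  # records still held in process memory (BACKGROUND)
        self.fd = None
        if mode is not WalMode.NONE:
            self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def append(self, record):
        if self.mode is WalMode.NONE:
            return  # nothing is logged, so nothing can be replayed later
        if self.mode is WalMode.BACKGROUND:
            self.pending.append(record)  # delayed: lost if the process dies now
            return
        os.write(self.fd, record)  # handed to the OS: outlives a process crash
        if self.mode is WalMode.FSYNC:
            os.fsync(self.fd)  # asks the OS to push the data to the device

    def flush(self):
        """Write any buffered records and sync the file."""
        if self.fd is None:
            return
        for record in self.pending:
            os.write(self.fd, record)
        self.pending.clear()
        os.fsync(self.fd)

    def close(self):
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None


def bytes_handed_to_os(mode, records, flush=False):
    """Append records in the given mode and return the resulting log file size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.wal")
        wal = ToyWal(path, mode)
        try:
            for record in records:
                wal.append(record)
            if flush:
                wal.flush()
            return os.path.getsize(path) if os.path.exists(path) else 0
        finally:
            wal.close()


if __name__ == "__main__":
    records = [b"put k1\n", b"put k2\n"]
    for mode in WalMode:
        print(mode.name, bytes_handed_to_os(mode, records))
```

In this model, data that has only been handed to the OS survives a process crash but not necessarily a power loss, which is the gap the extra fsyncs discussed in the thread are meant to close.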

Sat, Mar 24, 2018 at 1:07, Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> I agree. In my view, any possibility to get a corrupted storage is a bug
> which needs to be fixed.
>
> BTW, can someone explain semantics of NONE mode? What is the difference
> from BACKGROUND from user's perspective? Is there any particular use case
> where it can be used?
>
> -Val
>
> On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov 
> wrote:
>
> > Hi Ivan,
> >
> > IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > Fri, Mar 23, 2018 at 12:23, Ivan Rakov :
> >
> > > Igniters, there's another important question about this matter.
> > > Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that we
> > > have to do it: it will cause similar performance drop, but if we
> > > consider LOG_ONLY broken without these fixes, BACKGROUND is broken as
> > well.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > > On 23.03.2018 10:27, Ivan Rakov wrote:
> > > > Fixes are quite simple.
> > > > I expect them to be merged in master in a week in worst case.
> > > >
> > > > Best Regards,
> > > > Ivan Rakov
> > > >
> > > > On 22.03.2018 17:49, Denis Magda wrote:
> > > >> Ivan,
> > > >>
> > > >> How quick are you going to merge the fix into the master? Many
> > > >> persistence
> > > >> related optimizations have already stacked up. Probably, we can
> > release
> > > >> them sooner if the community agrees.
> > > >>
> > > >> --
> > > >> Denis
> > > >>
> > > >> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov 
> > > >> wrote:
> > > >>
> > > >>> Thanks all!
> > > >>> We seem to have reached a consensus on this issue. I'll just add
> > > >>> necessary
> > > >>> fsyncs under IGNITE-7754.
> > > >>>
> > > >>> Best Regards,
> > > >>> Ivan Rakov
> > > >>>
> > > >>>
> > > >>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> > > >>>
> > >  +1 for fixing LOG_ONLY. If current implementation doesn't protect
> > from
> > >  data
> > >  corruption, it doesn't make sense.
> > > 
> > >  On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda 
> > >  wrote:
> > > 
> > >  +1 for the fix of LOG_ONLY
> > > > On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> > > > alexey.goncha...@gmail.com> wrote:
> > > >
> > > > +1 for fixing LOG_ONLY to enforce corruption safety given the
> > > > provided
> > > >> performance results.
> > > >>
> > > >> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <
> voze...@gridgain.com
> > >:
> > > >>
> > > >> +1 for accepting drop in LOG_ONLY. 7% is not that much and not a
> > > >> drop
> > > >> at
> > > >> all, provided that we fixing a bug. I.e. should we implement it
> > > >> correctly
> > > >> in the first place we would never notice any "drop".
> > > >>> I do not understand why someone would like to use current
> broken
> > > >>> mode.
> > > >>>
> > > >>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
> > > >>> 
> > > >>> wrote:
> > > >>>
> > > >>> Hi, I think option 1 is better. As Val said any mode that
> allows
> > > >>> corruption
> > > >>>
> > >  does not make much sense.
> > > 
> > >  What Ivan mentioned here as drop, in relation to old mode
> > DEFAULT
> > > 
> > > >>> (FSYNC
> > > >>> now), is still a significant performance boost.
> > >  Sincerely,
> > >  Dmitriy Pavlov
> > > 
> > >  Wed, Mar 21, 2018 at 17:56, Ivan Rakov <
> ivan.glu...@gmail.com
> > >:
> > > 
> > >  I've attached benchmark results to the JIRA ticket.
> > > > We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent
> > of
> > > >
> > >  WAL
> > > >> compaction enabled flag. It's pretty significant drop: WAL
> > >  compaction
> > > >> itself gives only ~3% drop.
> > > > I see two options here:
> > > > 1) Change LOG_ONLY behavior. That implies that we'll be ready
> > to
> > > >
> > >  release
> > >  AI 2.5 with 7% drop.
> > > > 2) Introduce LOG_ONLY_SAFE, make it default, add release note
> > > > to AI
> > > >
> > >  2.5
> > > >>> that we added power loss durability in default mode, but user
> may
> > > > fallback to previous LOG_ONLY in order to retain performance.
> > > >
> > > > Thoughts?
> > > >
> > > > Best Regards,
> > > > Ivan Rakov
> > > >
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Valentin Kulichenko

I agree. In my view, any possibility of ending up with corrupted storage is a
bug that needs to be fixed.

BTW, can someone explain the semantics of NONE mode? How does it differ from
BACKGROUND from the user's perspective? Is there any particular use case
where it can be used?

-Val

On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov 
wrote:

> Hi Ivan,
>
> IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?
>
> Sincerely,
> Dmitriy Pavlov
>
> Fri, Mar 23, 2018 at 12:23, Ivan Rakov :
>
> > Igniters, there's another important question about this matter.
> > Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that we
> > have to do it: it will cause similar performance drop, but if we
> > consider LOG_ONLY broken without these fixes, BACKGROUND is broken as
> well.
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 23.03.2018 10:27, Ivan Rakov wrote:
> > > Fixes are quite simple.
> > > I expect them to be merged in master in a week in worst case.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > > On 22.03.2018 17:49, Denis Magda wrote:
> > >> Ivan,
> > >>
> > >> How quick are you going to merge the fix into the master? Many
> > >> persistence
> > >> related optimizations have already stacked up. Probably, we can
> release
> > >> them sooner if the community agrees.
> > >>
> > >> --
> > >> Denis
> > >>
> > >> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov 
> > >> wrote:
> > >>
> > >>> Thanks all!
> > >>> We seem to have reached a consensus on this issue. I'll just add
> > >>> necessary
> > >>> fsyncs under IGNITE-7754.
> > >>>
> > >>> Best Regards,
> > >>> Ivan Rakov
> > >>>
> > >>>
> > >>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> > >>>
> >  +1 for fixing LOG_ONLY. If current implementation doesn't protect
> from
> >  data
> >  corruption, it doesn't make sense.
> > 
> >  On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda 
> >  wrote:
> > 
> >  +1 for the fix of LOG_ONLY
> > > On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> > > alexey.goncha...@gmail.com> wrote:
> > >
> > > +1 for fixing LOG_ONLY to enforce corruption safety given the
> > > provided
> > >> performance results.
> > >>
> > >> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov  >:
> > >>
> > >> +1 for accepting drop in LOG_ONLY. 7% is not that much and not a
> > >> drop
> > >> at
> > >> all, provided that we fixing a bug. I.e. should we implement it
> > >> correctly
> > >> in the first place we would never notice any "drop".
> > >>> I do not understand why someone would like to use current broken
> > >>> mode.
> > >>>
> > >>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
> > >>> 
> > >>> wrote:
> > >>>
> > >>> Hi, I think option 1 is better. As Val said any mode that allows
> > >>> corruption
> > >>>
> >  does not make much sense.
> > 
> >  What Ivan mentioned here as drop, in relation to old mode
> DEFAULT
> > 
> > >>> (FSYNC
> > >>> now), is still a significant performance boost.
> >  Sincerely,
> >  Dmitriy Pavlov
> > 
> >  Wed, Mar 21, 2018 at 17:56, Ivan Rakov  >:
> > 
> >  I've attached benchmark results to the JIRA ticket.
> > > We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent
> of
> > >
> >  WAL
> > >> compaction enabled flag. It's pretty significant drop: WAL
> >  compaction
> > >> itself gives only ~3% drop.
> > > I see two options here:
> > > 1) Change LOG_ONLY behavior. That implies that we'll be ready
> to
> > >
> >  release
> >  AI 2.5 with 7% drop.
> > > 2) Introduce LOG_ONLY_SAFE, make it default, add release note
> > > to AI
> > >
> >  2.5
> > >>> that we added power loss durability in default mode, but user may
> > > fallback to previous LOG_ONLY in order to retain performance.
> > >
> > > Thoughts?
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > > On 20.03.2018 16:00, Ivan Rakov wrote:
> > >
> > >> Val,
> > >>
> > >> If a storage is in
> > >>> corrupted state, does it mean that it needs to be completely
> > >>>
> > >> removed
> > >>> and
> > > cluster needs to be restarted without data?
> > >> Yes, there's a chance that in LOG_ONLY all local data will be
> > >>
> > > lost,
> > >> but only in *power loss**/ OS crash* case.
> > >> kill -9, JVM crash, death of critical system thread and all
> > >> other
> > >> cases that usually take place are variations of *process
> crash*.
> > >>
> > > All
> > >>> WAL modes (except NONE, of

Re: Ignite Direct I/O plugin description added to wiki

2018-03-23 Thread Dmitry Pavlov

Hi Prachi,

Yes, it's a great description.

I'm impressed by how informative and short you managed to make it.

Thank you!

Sincerely,
Dmitriy Pavlov

Sat, Mar 24, 2018 at 0:29, Prachi Garg :

> Dmitriy,
>
> I have documented the Direct I/O plugin. Please review [1] and provide
> comments/feedback in the ticket [2].
>
> [1]
>
> https://apacheignite.readme.io/v2.4/docs/durable-memory-tuning#section-enabling-direct-i-o
>
> [2] https://issues.apache.org/jira/browse/IGNITE-7466
>
>
> -Prachi
>
> On Wed, Mar 21, 2018 at 11:22 PM, Dmitry Pavlov 
> wrote:
>
> > Yes, will do.
> >
> > Thu, Mar 22, 2018 at 1:05, Denis Magda :
> >
> > > Dmitriy,
> > >
> > > Thanks for updating the WAL section. Now it makes total sense to me.
> > >
> > > As for the page with JNA projects, let's get added there! Could you
> > > contact the owners?
> > >
> > > --
> > > Denis
> > >
> > >
> > > On Wed, Mar 21, 2018 at 9:27 AM, Dmitry Pavlov 
> > > wrote:
> > >
> > >> Denis,
> > >>
> > >> one more thing, can/should we mention Ignite Direct IO plugin in list
> of
> > >> project using JNA here: https://github.com/java-native-access/jna
> > >>
> > >> Sincerely,
> > >> Dmitriy Pavlov
> > >>
> > >> Wed, Mar 21, 2018 at 1:59, Denis Magda :
> > >>
> > >>> *Dmitriy*, thanks. Astonishing job! We'll add a section to the
> durable
> > >>> memory tuning page and refer to the wiki for more details:
> > >>> https://issues.apache.org/jira/browse/IGNITE-7466
> > >>>
> > >>> Please clarify the following:
> > >>>
> > >>> > Direct I/O mode can't be enabled for Write Ahead Log files.
> However,
> > >>> when
> > >>> > working with plugin, WAL manager applies advising Linux systems do
> > not
> > >>> > store the data of the file in page cache as they are not required.
> > >>>
> > >>>
> > >>> For me, it means that WAL always goes through the operating system
> I/O
> > >>> calls. Nothing changes for the WAL. However, I'm not sure what you
> > meant
> > >>> to
> > >>> explain by saying "when working with the plugin (Direct I/O) WAL
> > manager
> > >>> applies...". Could you rephrase it to bring more clarity?
> > >>>
> > >>> *Raymond,*
> > >>>
> > >>>
> > >>> If Direct I/O is enabled by default it will bring down the
> performance
> > of
> > >>> read-intensive application because, as Dmitry says, the reads bypass
> > page
> > >>> cache. So, I would recommend using it for write-intensive workloads
> > and,
> > >>> probably, for mixed-workloads depending on the reads and writes rate.
> > >>>
> > >>> --
> > >>> Denis
> > >>>
> > >>>
> > >>> On Tue, Mar 20, 2018 at 2:29 PM, Raymond Wilson <
> > >>> raymond_wil...@trimble.com>
> > >>> wrote:
> > >>>
> > >>> > Looks good!
> > >>> >
> > >>> > Is there any reason why this should not be a default setting if it
> > >>> > gracefully downgrades to non-Direct IO if not supported by the OS?
> > >>> >
> > >>> > Thanks,
> > >>> > Raymond.
> > >>> >
> > >>> > -Original Message-
> > >>> > From: Dmitriy Setrakyan [mailto:dsetrak...@apache.org]
> > >>> > Sent: Wednesday, March 21, 2018 10:23 AM
> > >>> > To: dev 
> > >>> > Subject: Re: Ignite Direct I/O plugin description added to wiki
> > >>> >
> > >>> > Thanks Dmitry, awesome work!
> > >>> >
> > >>> > On Wed, Mar 21, 2018 at 12:21 AM, Dmitry Pavlov <
> > dpavlov@gmail.com
> > >>> >
> > >>> > wrote:
> > >>> >
> > >>> > > Hi Igniters,
> > >>> > >
> > >>> > > I've added description of new plugin for Direct I/O for native
> > >>> > > persistence (
> > >>> > > https://issues.apache.org/jira/browse/IGNITE-6341)  to wiki
> > >>> > > https://cwiki.apache.org/confluence/display/IGNITE/
> > >>> > > Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-
> > >>> > > underthehood-DirectI/O
> > >>> > >
> > >>> > >
> > > >>> > > Sincerely,
> > >>> > > Dmitriy Pavlov
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >
> >
>

Re: Ignite Direct I/O plugin description added to wiki

2018-03-23 Thread Prachi Garg

Dmitriy,

I have documented the Direct I/O plugin. Please review [1] and provide
comments/feedback in the ticket [2].

[1]
https://apacheignite.readme.io/v2.4/docs/durable-memory-tuning#section-enabling-direct-i-o

[2] https://issues.apache.org/jira/browse/IGNITE-7466


-Prachi

On Wed, Mar 21, 2018 at 11:22 PM, Dmitry Pavlov 
wrote:

> Yes, will do.
>
> On Thu, Mar 22, 2018 at 1:05, Denis Magda wrote:
>
> > Dmitriy,
> >
> > Thanks for updating the WAL section. Now it makes total sense to me.
> >
> > As for the page with JNA projects, let's get added there! Could you
> > contact the owners?
> >
> > --
> > Denis
> >
> >
> > On Wed, Mar 21, 2018 at 9:27 AM, Dmitry Pavlov 
> > wrote:
> >
> >> Denis,
> >>
> >> one more thing, can/should we mention Ignite Direct IO plugin in list of
> >> project using JNA here: https://github.com/java-native-access/jna
> >>
> >> Sincerely,
> >> Dmitriy Pavlov
> >>
> >> On Wed, Mar 21, 2018 at 1:59, Denis Magda wrote:
> >>
> >>> *Dmitriy*, thanks. Astonishing job! We'll add a section to the durable
> >>> memory tuning page and refer to the wiki for more details:
> >>> https://issues.apache.org/jira/browse/IGNITE-7466
> >>>
> >>> Please clarify the following:
> >>>
> >>> > Direct I/O mode can't be enabled for Write Ahead Log files. However,
> >>> when
> >>> > working with plugin, WAL manager applies advising Linux systems do
> not
> >>> > store the data of the file in page cache as they are not required.
> >>>
> >>>
> >>> For me, it means that WAL always goes through the operating system I/O
> >>> calls. Nothing changes for the WAL. However, I'm not sure what you
> meant
> >>> to
> >>> explain by saying "when working with the plugin (Direct I/O) WAL
> manager
> >>> applies...". Could you rephrase it to bring more clarity?
> >>>
> >>> *Raymond,*
> >>>
> >>>
> >>> If Direct I/O is enabled by default it will bring down the performance
> of
> >>> read-intensive application because, as Dmitry says, the reads bypass
> page
> >>> cache. So, I would recommend using it for write-intensive workloads
> and,
> >>> probably, for mixed-workloads depending on the reads and writes rate.
> >>>
> >>> --
> >>> Denis
> >>>
> >>>
> >>> On Tue, Mar 20, 2018 at 2:29 PM, Raymond Wilson <
> >>> raymond_wil...@trimble.com>
> >>> wrote:
> >>>
> >>> > Looks good!
> >>> >
> >>> > Is there any reason why this should not be a default setting if it
> >>> > gracefully downgrades to non-Direct IO if not supported by the OS?
> >>> >
> >>> > Thanks,
> >>> > Raymond.
> >>> >
> >>> > -Original Message-
> >>> > From: Dmitriy Setrakyan [mailto:dsetrak...@apache.org]
> >>> > Sent: Wednesday, March 21, 2018 10:23 AM
> >>> > To: dev 
> >>> > Subject: Re: Ignite Direct I/O plugin description added to wiki
> >>> >
> >>> > Thanks Dmitry, awesome work!
> >>> >
> >>> > On Wed, Mar 21, 2018 at 12:21 AM, Dmitry Pavlov <
> dpavlov@gmail.com
> >>> >
> >>> > wrote:
> >>> >
> >>> > > Hi Igniters,
> >>> > >
> >>> > > I've added description of new plugin for Direct I/O for native
> >>> > > persistence (
> >>> > > https://issues.apache.org/jira/browse/IGNITE-6341)  to wiki
> >>> > > https://cwiki.apache.org/confluence/display/IGNITE/
> >>> > > Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStore-
> >>> > > underthehood-DirectI/O
> >>> > >
> >>> > >
> >>> > > Sincerely,
> >>> > > Dmitriy Pavlov
> >>> > >
> >>> >
> >>>
> >>
> >
>

Re: What's about releasing Ignite 2.5 a bit earlier?

2018-03-23 Thread Denis Magda

Petr,

I guess we would need much more time to handle all the Java 9/10 tasks. Since
none of them looks like a blocker to me, I would suggest that we not rush and
target the tasks for the 2.6 release.

Overall, I see that we agree on an earlier date for the 2.5 release. I'll let
the rest of the community speak up until the beginning of next week.

--
Denis



On Thu, Mar 22, 2018 at 8:24 PM, Petr Ivanov  wrote:

> I’ll do my best to deliver packages in time.
>
>
> Concerning Java 9/10: AFAIK there are limitations on how we support it:
> * compilation is supported only with scala_2.10 modules switched off (because
> Scala itself supports Java 9 only starting from 2.12.x versions);
> * also, compilation and running are possible only with lots of ‘--add-modules’
> hacks; I think that to say we support Java 9 we have to add module info
> to every package;
> * tests (project ‘Ignite Tests 2.4+ (Java 9)’) also require attention,
> because there are some problems with running them.
>
>
>
> > On 22 Mar 2018, at 21:09, Denis Magda  wrote:
> >
> > Petr,
> >
> > Java Thin client, GA Grid, COPY command are the most significant
> additions
> > to the release (if to talk about the features). Aside from that, I see
> many
> > valuable optimizations and fixes that have been sitting in the master
> for a
> > while (some of them were merged in January). It's preferable to release
> > them.
> >
> > As for Java 10 and DEB, if you feel you can't make them to the release,
> > then it can wait till 2.6.
> >
> > Speaking of Java 9, what's left? I thought it's already fully supported.
> >
> > --
> > Denis
> >
> > On Thu, Mar 22, 2018 at 10:47 AM, Petr Ivanov 
> wrote:
> >
> >> Will there be a major new feature or this release will concentrate on
> >> stability and optimisations?
> >>
> >> Also, I guess, we will have to include into 2.5 release full support for
> >> Java 9 and Java 10 (that WILL require some developers time).
> >> And RPM / DEB packages stage II phase will require lots of testing and
> >> infrastructure preparations: place to store package, RPM / DEB build
> >> inclusion into release process and so on.
> >>
> >> Apr 30 sounds good, but I’d have a “place for manoeuvre” in case all
> those
> >> activities are not finished in time.
> >>
> >>
> >>
> >>> On 22 Mar 2018, at 20:39, Denis Magda  wrote:
> >>>
> >>> Igniters,
> >>>
> >>> According to our regular schedule, every new Ignite version usually
> goes
> >>> public once in 3 months. As you remember, the latest 2.4 release, which
> >>> took us 5 months to improve and roll out, was based on the version of
> the
> >>> source code dated by January.
> >>>
> >>> Since that time the master branch went far ahead and already
> incorporates
> >>> many valuable fixes and capabilities such as:
> >>>
> >>>  - Fixes provided as a part of "Green Team City" activity.
> >>>  - Persistence: page replacement algorithm and throttling
> optimizations,
> >>>  out of memory in checkpointing buffer corrections, etc. Alex G and
> >> Ivan can
> >>>  shed more light here.
> >>>  - Data loading optimizations for SQL: streaming for JDBC thin driver
> >> and
> >>>  copy command
> >>>  - Genetic Algorithms Grid Contribution!
> >>>  - Java Thin Client developed by Alexey Kukushkin
> >>>
> >>>
> >>>  - the list goes on and on... Please share the contributions you are
> >>>  ready to release.
> >>>
> >>> So, why don't we go ahead and release the current master and possibly
> >> extra
> >>> tickets that are in the review state earlier? What's about April 30 as
> >> the
> >>> next release date?
> >>>
> >>> --
> >>> Denis
> >>
> >>
>
>

Re: IEP-14: Ignite failures handling (Discussion)

2018-03-23 Thread Yakov Zhdanov

Andrey, I understand your point but you are trying to build one more
mechanism and introduce abstractions that are already here. Again, please
take a look at segmentation policy and event types we already have.

Thanks!

Yakov

Re: Memory usage per cache

2018-03-23 Thread Andrey Kuznetsov

Denis,

I'll need to conduct some experiments to estimate the difference. And the
answer will depend on numerous parameters: object sizes, the number of caches
that share the same data region, and so on.

On Fri, Mar 23, 2018 at 21:53, Denis Magda wrote:

> Andrey,
>
> How good will the estimate be if we go for approach 1 and utilize
> pagesFillFactor somehow? In other words, how big can the difference be
> between a 100% precise calculation and the approach you're suggesting?
>
> --
> Denis
>
>

Re: Rebalancing - how to make it faster

2018-03-23 Thread Denis Magda

Ilya,

That's a decent boost (5-20%) even with WAL enabled. I'm not sure that we
should bet on the WAL "off" mode here, because if the whole cluster goes
down, data consistency becomes questionable. As an architect, I
wouldn't disable WAL for the sake of rebalancing; it's too risky.

If you agree, then let's create the IEP. This way it will be easier to
track this endeavor. BTW, are any of these optimizations ready to be released
in 2.5, which is being discussed in a separate thread?

--
Denis



On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh  wrote:

> Denis,
>
> > - Don't you want to aggregate the tickets under an IEP?
> Yes, I think so.
>
> > - Does it mean we're going to update our B+Tree implementation? Any ideas
> how risky it is?
> One of tickets that I created (
> https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree
> modification, but I am not planning to do it in the nearest future. It
> shouldn't affect existing tree operations, only introduce new ones (putAll,
> invokeAll, removeAll).
>
> > - Any chance you had a prototype that shows performance optimizations the
> approach you are suggesting to take?
> I have a prototype for simplest improvements (https://issues.apache.org/
> jira/browse/IGNITE-8019 & https://issues.apache.org/
> jira/browse/IGNITE-8018)
> - together they increase throughput by 5-20%, depending on configuration
> and environment. Also, I've tested different WAL modes - switching from
> LOG_ONLY to NONE gives over 100% boost - this is what I expect from
> https://issues.apache.org/jira/browse/IGNITE-8017.
>
> On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda  wrote:
>
> > Ilya,
> >
> > That's outstanding research and summary. Thanks for spending your time on
> > this.
> >
> > Not sure I have enough expertise to challenge your approach, but it
> sounds
> > 100% reasonable to me. As side notes:
> >
> >- Don't you want to aggregate the tickets under an IEP?
> >- Does it mean we're going to update our B+Tree implementation? Any
> >ideas how risky it is?
> >- Any chance you had a prototype that shows performance optimizations
> of
> >the approach you are suggesting to take?
> >
> > --
> > Denis
> >
> > On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh 
> > wrote:
> >
> > > Igniters,
> > >
> > > I've spent some time analyzing the performance of the rebalancing process.
> > > The initial goal was to understand what limits its throughput, because it
> > > is significantly slower than the network and storage device can
> > > theoretically handle.
> > >
> > > Turns out, our current implementation has a number of issues caused by
> a
> > > single fundamental problem.
> > >
> > > During rebalance, data is sent in batches called
> > > GridDhtPartitionSupplyMessages. The batch size is configurable; the default
> > > value is 512KB, which could mean thousands of key-value pairs. However, we
> > > don't take advantage of this fact and process each entry independently:
> > > - checkpointReadLock is acquired multiple times for every entry,
> leading
> > to
> > > unnecessary contention - this is clearly a bug;
> > > - for each entry we write (and fsync, if configuration assumes it) a
> > > separate WAL record - so, if batch contains N entries, we might end up
> > > doing N fsyncs;
> > > - adding every entry into CacheDataStore also happens completely
> > > independently. It means, we will traverse and modify each index tree N
> > > times, we will allocate space in FreeList N times and we will have to
> > > additionally store in WAL O(N*log(N)) page delta records.
> > >
> > > I've created a few tickets in JIRA with very different levels of scale
> > and
> > > complexity.
> > >
> > > Ways to reduce impact of independent processing:
> > > - https://issues.apache.org/jira/browse/IGNITE-8019 - aforementioned
> > bug,
> > > causing contention on checkpointReadLock;
> > > - https://issues.apache.org/jira/browse/IGNITE-8018 - inefficiency in
> > > GridCacheMapEntry implementation;
> > > - https://issues.apache.org/jira/browse/IGNITE-8017 - automatically
> > > disable
> > > WAL during preloading.
> > >
> > > Ways to solve problem on more global level:
> > > - https://issues.apache.org/jira/browse/IGNITE-7935 - a ticket to
> > > introduce
> > > batch modification;
> > > - https://issues.apache.org/jira/browse/IGNITE-8020 - complete
> redesign
> > of
> > > rebalancing process for persistent caches, based on file transfer.
> > >
> > > Everyone is welcome to criticize above ideas, suggest new ones or
> > > participate in implementation.
> > >
> > > --
> > > Best regards,
> > > Ilya
> > >
> >
>
>
>
> --
> Best regards,
> Ilya
>
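
The per-entry cost described in the quoted analysis (N lock acquisitions and up to N fsyncs for a batch of N entries) can be illustrated with a minimal, single-threaded sketch. This is not Ignite code: BatchApplySketch, its counters and the simulated sync are hypothetical names introduced here; only the idea of taking the checkpoint read lock once per batch instead of once per entry comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a store guarded by a checkpoint read lock; not Ignite code. */
final class BatchApplySketch {
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();
    private final List<String> store = new ArrayList<>();
    private int lockAcquisitions; // plain counters: the sketch is single-threaded
    private int syncs;            // stands in for WAL fsync calls

    /** Per-entry path: one lock acquisition and one simulated sync per entry. */
    void applyOneByOne(List<String> batch) {
        for (String entry : batch) {
            checkpointLock.readLock().lock();
            lockAcquisitions++;
            try {
                store.add(entry);
                syncs++;
            } finally {
                checkpointLock.readLock().unlock();
            }
        }
    }

    /** Batched path: one lock acquisition and one simulated sync for the whole batch. */
    void applyBatch(List<String> batch) {
        checkpointLock.readLock().lock();
        lockAcquisitions++;
        try {
            store.addAll(batch);
            syncs++;
        } finally {
            checkpointLock.readLock().unlock();
        }
    }

    int lockAcquisitions() { return lockAcquisitions; }
    int syncs() { return syncs; }
    int size() { return store.size(); }
}
```

With a batch of N entries, applyOneByOne performs N acquisitions and N syncs, while applyBatch performs one of each; the real gain in Ignite would of course depend on the WAL mode and on contention, which this sketch does not model.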

Re: Mass TC Run-All during this weekend

2018-03-23 Thread Denis Magda

Ignite the Team City :)

--
Denis

On Fri, Mar 23, 2018 at 7:00 AM, Dmitry Pavlov 
wrote:

> Hi Igniters,
>
> To get the full picture of the current state of the tests, I'm going to
> run a massive Run All this weekend.
>
> I'm going to start it at 20:00 MSK (17:00 UTC).
>
> Are there any objections? If you need to run your branches, move them
> to the front of the queue.
>
> Sincerely,
> Dmitriy Pavlov
>

Re: Service grid redesign

2018-03-23 Thread Denis Magda

Denis,

Thanks for the extensive analysis. There is a vast room for optimizations
on the service grid side.

Yakov, Sam, Alex G.,

What do you think of using the discovery protocol to exchange service grid
system messages? Any pitfalls?


--
Denis


On Fri, Mar 23, 2018 at 8:01 AM, Denis Mekhanikov 
wrote:

> Igniters,
>
> I'd like to start a discussion on Ignite service grid redesign.
> We have a number of problems in our current architecture, that have to be
> addressed.
>
> Here are the most severe ones:
>
> One of them is lack of guarantee, that service is successfully deployed and
> ready for work by the time, when *IgniteService.deploy*()* methods return.
> Furthermore, if an exception is thrown from *Service.init() *method, then
> the deploying side is not able to receive it, or even understand, that
> service is in unusable state.
> So, you may end up in such situation, when you deployed a service without
> receiving any errors, then called a service's method, and hung indefinitely
> on this invocation.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
>
> Another problem is locking during service deployment on unstable topology.
> This issue is caused by missing updates in continuous query listeners on
> the internal cache.
> It is hard to reproduce, but it happens sometimes. We shouldn't allow such
> possibility, that deployment methods hang without saying anything.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
>
> I think, we should change the deployment procedure to make it more
> reliable.
> Moving from operating over internal replicated service cache to sending
> custom discovery events seems to be a good idea.
> Service deployment may trigger a discovery event, that will make chosen
> nodes deploy the service, and the same event will notify other nodes about
> the deployed service instances.
> It will eliminate the need for distributed transactions on the internal
> replicated system cache, and make the service deployment protocol more
> transparent.
>
> There are a few points, that should be taken into account though.
>
> First of all, we can't wait for services to be deployed and initialised in
> the discovery thread.
> So, we need to make notification about service deployment result
> asynchronous, presumably over communication protocol.
> I can think of a procedure similar to the current exchange protocol, when
> service deployment is initialised with an initial discovery message,
> followed by asynchronous notifications from the hosting servers over
> communication. And finally, one more discovery message will notify all
> nodes about the service deployment result and location of the deployed
> service instances. Coordinator will be responsible for collecting of the
> deployment results in this scheme.
>
> Another problem is failover in case, when some nodes fail during deployment
> or further work.
> The following cases should be handled:
>
>1. coordinator failure during deployment;
>2. failure of nodes, that were chosen to host the service, during
>deployment;
>3. failure of nodes, that contain deployed services, after the
>deployment.
>
> The first case may be resolved by either continuation of deployment with a
> new coordinator, or by cancelling it.
> The second case will require another node to be chosen and notified. Maybe
> another discovery message will be needed.
> The third case will require redeployment, so coordinator should track
> topology changes and redeploy failed services.
>
> Another good improvement would be service versioning. This matter was
> already discussed in another thread:
> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-
> td20858.html
> Let's resume this discussion and state the final decision here.
> This feature is closely connected to peer class loading, which is not
> working for services currently.
> So, service versioning should be implemented along with peer class loading.
> JIRA ticket for versioning:
> https://issues.apache.org/jira/browse/IGNITE-6069
> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
>
> Please share your thoughts. Constructive criticism is highly appreciated.
>
> Denis
>
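
The first problem in the quoted proposal (deploy methods returning before the service is known to be usable, and init() failures never reaching the caller) can be sketched with plain java.util.concurrent primitives. This is an illustration under stated assumptions, not the Ignite service API: DeployResultSketch, SketchService, deployAsync and deployAndReport are hypothetical names, and the asynchronous init simply stands in for remote deployment.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class DeployResultSketch {
    /** Hypothetical minimal service contract; only init() matters for this sketch. */
    interface SketchService {
        void init() throws Exception;
    }

    /**
     * Runs init() asynchronously. The returned future completes normally only
     * after init() has succeeded, and exceptionally with whatever init() threw.
     */
    static CompletableFuture<Void> deployAsync(SketchService svc) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        CompletableFuture.runAsync(() -> {
            try {
                svc.init();
                result.complete(null);
            } catch (Throwable t) {
                // Propagate the failure instead of leaving the caller waiting forever.
                result.completeExceptionally(t);
            }
        });
        return result;
    }

    /** Blocks until the outcome is known; returns null on success or the failure message. */
    static String deployAndReport(SketchService svc) {
        try {
            deployAsync(svc).join();
            return null;
        } catch (CompletionException e) {
            return e.getCause().getMessage();
        }
    }
}
```

The design point is that the caller observes either success or the original exception, never an indefinite hang; in the scheme proposed in the thread, the same outcome would travel back over communication messages and a final discovery message rather than a local future.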

Re: Memory usage per cache

2018-03-23 Thread Denis Magda

Andrey,

How good will the estimate be if we go for approach 1 and utilize
pagesFillFactor somehow? In other words, how big can the difference be
between a 100% precise calculation and the approach you're suggesting?

--
Denis

On Thu, Mar 22, 2018 at 11:17 AM, Andrey Kuznetsov 
wrote:

> Denis,
>
> Do we really need to measure cache sizes precisely? We can use approach 1
> and also we have average pagesFillFactor metric. These two metrics can give
> a good estimation of page memory size per cache.
>
> 2018-03-22 21:04 GMT+03:00 Denis Magda :
>
> > Igniters,
> >
> > Glad to see that 2.4 made it possible to show memory usage in bytes for
> > data regions. Good progress but it's definitely not enough.
> >
> > I'd like to discuss how we can support "memory usage per cache/table" in
> > bytes. That's actually what the users are interested in. Talking to Alex
> G.
> > we came up with 2 possible approaches:
> >
> > 1. Usage of a per cache counter that accumulates total size of all binary
> > objects passed through our cache update/insert/remove API. Pros: easy to
> > implement. Cons: we don't consider extra space taken by our data pages
> and
> > other memory components.
> >
> > 2. To address the cons of 1., we can count the size of the data pages, but
> > in that scenario a data page must not store entries of various caches.
> > This approach is ideal for reporting memory usage per cache but requires
> > more changes on the architecture side.
> >
> > Alex G., please step in and give more details suggesting your approach.
> >
> > Others, please join the discussion as well.
> >
> > --
> > Denis
> >
>
>
>
> --
> Best regards,
>   Andrey Kuznetsov.
>
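
Approach 1 combined with a fill factor, as suggested in the quoted reply, could look roughly like the sketch below. It is an assumption-laden illustration rather than Ignite code: CacheSizeEstimateSketch and its methods are hypothetical, and the formula "payload bytes divided by fill factor" is one plausible way to combine the two metrics, not something stated in the thread. Since pagesFillFactor is reported per data region, the estimate becomes less precise when several caches share a region.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-cache byte counter (approach 1); not an Ignite class. */
final class CacheSizeEstimateSketch {
    private final Map<String, LongAdder> bytesPerCache = new ConcurrentHashMap<>();

    private LongAdder counter(String cache) {
        return bytesPerCache.computeIfAbsent(cache, k -> new LongAdder());
    }

    void onInsert(String cache, long objectBytes) {
        counter(cache).add(objectBytes);
    }

    void onUpdate(String cache, long oldBytes, long newBytes) {
        counter(cache).add(newBytes - oldBytes);
    }

    void onRemove(String cache, long objectBytes) {
        counter(cache).add(-objectBytes);
    }

    /** Total size of the binary objects currently accounted to the cache. */
    long payloadBytes(String cache) {
        LongAdder adder = bytesPerCache.get(cache);
        return adder == null ? 0L : adder.sum();
    }

    /**
     * Assumed estimate of page memory: payload divided by the average fill factor,
     * where the fill factor is a fraction in (0, 1].
     */
    long estimatedPageBytes(String cache, double pagesFillFactor) {
        if (!(pagesFillFactor > 0.0 && pagesFillFactor <= 1.0))
            throw new IllegalArgumentException("fill factor must be in (0, 1]");
        return Math.round(payloadBytes(cache) / pagesFillFactor);
    }
}
```

The counter ignores page headers, free lists and index pages, which is exactly the con listed for approach 1; the fill factor only compensates for unused space inside data pages.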

[jira] [Created] (IGNITE-8039) Binary Client Protocol spec: data types/format clarifications

2018-03-23 Thread Alexey Kosenchuk (JIRA)

Alexey Kosenchuk created IGNITE-8039:


 Summary: Binary Client Protocol spec: data types/format 
clarifications
 Key: IGNITE-8039
 URL: https://issues.apache.org/jira/browse/IGNITE-8039
 Project: Ignite
  Issue Type: Bug
  Components: documentation, thin client
Affects Versions: 2.4
Reporter: Alexey Kosenchuk


Assuming the Binary Client Protocol spec should be detailed enough to allow 
client development based on the spec only, without looking at other client 
implementations or asking additional questions...

The following should be clarified / corrected in the Binary Client Protocol 
spec (v.2.4) 
(https://apacheignite.readme.io/v2.4/docs/binary-client-protocol#section-data-format):

Type Codes table:
-

- UUID (Guid) size: should be 16 bytes, not 8 (?) 

- what is Object array (type code 23) ? What is the difference between it and 
Objects Wrapped In Array (type code 27) ?

- what is Collection USER_SET ?

- what is Collection USER_COL ?

- what is Collection SINGLETON_LIST ?

- Collection: misprint: should be "... + length ..."

- what is Decimal ?

- what is Timestamp ?

- what is Time ?
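
On the UUID size question above: a UUID is 128 bits, so any lossless encoding needs 16 bytes. The sketch below shows this with java.util.UUID. UuidBytesSketch is a hypothetical helper, and the byte order used here (ByteBuffer's default big-endian, most significant half first) is an assumption for illustration only; the byte order the protocol actually uses is not stated in this ticket.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Hypothetical helper showing that a UUID round-trips through exactly 16 bytes. */
final class UuidBytesSketch {
    /** Writes the two 64-bit halves of the UUID: 2 * 8 = 16 bytes. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(2 * Long.BYTES)
            .putLong(id.getMostSignificantBits())
            .putLong(id.getLeastSignificantBits())
            .array();
    }

    /** Reads the halves back in the same order they were written. */
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long most = buf.getLong();
        long least = buf.getLong();
        return new UUID(most, least);
    }
}
```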

Complex Objects:


- what does flag USER_TYPE mean ?

- Schema "field Id; Java-style hash code of field" -> should be "... of field 
name".

- "Repeat for as many times as the total number of schemas you have" -> should 
be "... total number of fields you have".

- is it mandated that the number of fields in the Schema must be equal to the 
number of fields in the Data Object ?
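
For the "Java-style hash code of field name" wording proposed above, the hash in question is presumably the one defined by java.lang.String.hashCode. The sketch below spells that formula out so that a non-Java client author can reproduce it. FieldIdSketch is a hypothetical name, and whether the protocol normalizes the field name (for example, its case) before hashing is not stated in this ticket, so no normalization is applied here.

```java
/** Hypothetical helper reproducing java.lang.String.hashCode step by step. */
final class FieldIdSketch {
    /** h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using 32-bit int overflow. */
    static int javaStyleHash(String fieldName) {
        int h = 0;
        for (int i = 0; i < fieldName.length(); i++)
            h = 31 * h + fieldName.charAt(i);
        return h;
    }
}
```

A client written in another language would need to emulate 32-bit signed overflow and iterate over UTF-16 code units to get the same values.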

Objects Wrapped In Array


- may binary objects with different type codes be in the same array ?

- may complex objects with different type ids be in the same array ?

- "All cache operations return complex objects inside a wrapper (but not 
primitives)." -> does it mean that in general a complex object (103) must 
always be sent via the Binary Protocol in a wrapper (27)? 

- "Byte array size" -> "Payload size" or "Size of the whole array with header" ?

- Offset. What is "object graph" here ? The Binary Protocol nowhere describes 
any relations ("graph") between data objects in the protocol.

Terminology
---

Not critical but would be really convenient to define and use the same terms 
along the whole spec. For example:

- "binary object" is always the same as "data object" of any type (?). Can be 
"standard/predefined type object" or "complex object".

- "cluster" or "server" ?

- "cluster member" or "server nodes" ?




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-8038) DynamicColumnsConcurrentTransactionalPartitionedSelfTest#testDropColumnCoordinatorChange is flaky

2018-03-23 Thread Sergey Chugunov (JIRA)

Sergey Chugunov created IGNITE-8038:
---

 Summary: 
DynamicColumnsConcurrentTransactionalPartitionedSelfTest#testDropColumnCoordinatorChange
 is flaky
 Key: IGNITE-8038
 URL: https://issues.apache.org/jira/browse/IGNITE-8038
 Project: Ignite
  Issue Type: Sub-task
Reporter: Sergey Chugunov
Assignee: Sergey Chugunov


Test fails on TC as well as locally with the following error:
{noformat}
SchemaOperationException [code=1, msg=Cache doesn't exist: SQL_PUBLIC_PERSON
]
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.processSchemaOperationLocal(GridQueryProcessor.java:1419)
at 
org.apache.ignite.internal.processors.query.schema.SchemaOperationWorker.body(SchemaOperationWorker.java:108)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-8037) DynamicColumnsConcurrentTransactionalPartitionedSelfTest#testClientReconnectWithNonDynamicCacheRestart is flaky

2018-03-23 Thread Sergey Chugunov (JIRA)

Sergey Chugunov created IGNITE-8037:
---

 Summary: 
DynamicColumnsConcurrentTransactionalPartitionedSelfTest#testClientReconnectWithNonDynamicCacheRestart
 is flaky
 Key: IGNITE-8037
 URL: https://issues.apache.org/jira/browse/IGNITE-8037
 Project: Ignite
  Issue Type: Sub-task
Reporter: Sergey Chugunov
Assignee: Sergey Chugunov


Test fails on TC as well as locally (although local runs reproduce failure 
rarely: 1-3 failed tests for 100 runs).

There are suspicious errors in logs:
{noformat}
[2018-03-23 
18:55:19,371][ERROR][exchange-worker-#630%index.DynamicColumnsConcurrentTransactionalPartitionedSelfTest4%][GridQueryProcessor]
 Failed to clear indexing on cache unregister (will ignore): idx
class org.apache.ignite.internal.processors.query.IgniteSQLException: Failed to 
set schema for DB connection for thread [schema=idx]
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForThread(IgniteH2Indexing.java:542)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.dropTable(IgniteH2Indexing.java:705)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.unregisterCache(IgniteH2Indexing.java:2794)
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop0(GridQueryProcessor.java:1684)
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.registerCache0(GridQueryProcessor.java:1630)
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart0(GridQueryProcessor.java:799)
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStart(GridQueryProcessor.java:860)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:1161)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1902)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnLocalJoin(GridCacheProcessor.java:1768)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCachesOnLocalJoin(GridDhtPartitionsExchangeFuture.java:784)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:666)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2344)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.h2.jdbc.JdbcSQLException: Schema "idx" not found; SQL statement:
SET SCHEMA "idx" [90079-195]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.engine.Database.getSchema(Database.java:1755)
at org.h2.command.dml.Set.update(Set.java:408)
at org.h2.command.CommandContainer.update(CommandContainer.java:101)
at org.h2.command.Command.executeUpdate(Command.java:260)
at org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:137)
at org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:122)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForThread(IgniteH2Indexing.java:534)
... 14 more{noformat}
Test fails with the following error:
{noformat}
junit.framework.AssertionFailedError: Failed to wait for disconnect/reconnect 
event.

at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.TestCase.fail(TestCase.java:227)
at 
org.apache.ignite.internal.IgniteClientReconnectAbstractTest.waitReconnectEvent(IgniteClientReconnectAbstractTest.java:120)
at 
org.apache.ignite.internal.IgniteClientReconnectAbstractTest.reconnectClientNodes(IgniteClientReconnectAbstractTest.java:297)
at 
org.apache.ignite.internal.IgniteClientReconnectAbstractTest.reconnectClientNode(IgniteClientReconnectAbstractTest.java:238)
at 
org.apache.ignite.internal.processors.cache.index.DynamicColumnsAbstractConcurrentSelfTest.reconnectClientNode(DynamicColumnsAbstractConcurrentSelfTest.java:804)
at 
org.apache.ignite.internal.processors.cache.index.DynamicColumnsAbstractConcurrentSelfTest.checkClientReconnect(DynamicColumnsAbstractConcurrentSelfTest.java:781)
at 
org.apache.ignite.internal.processors.cache.index.DynamicColumnsAbstractConcurrentSelfTest.testClientReconnectWithNonDynamicCacheRestart(DynamicColumnsAbstractConcurrentSelfTest.java:750)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at junit.framework.TestCase.runTest(TestCase.java:176)
at

[GitHub] ignite pull request #3693: IGNITE-8036 MulticastIpFinder is replaced with Vm...

2018-03-23 Thread sergey-chugunov-1985

GitHub user sergey-chugunov-1985 opened a pull request:

https://github.com/apache/ignite/pull/3693

IGNITE-8036 MulticastIpFinder is replaced with VmIpFinder



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-8036

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3693.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3693


commit 870d7256c64c20b8c09c263f8f83bd5ec179d7a3
Author: Sergey Chugunov 
Date:   2018-03-23T16:22:05Z

IGNITE-8036 MulticastIpFinder is replaced with VmIpFinder




---

[jira] [Created] (IGNITE-8036) DynamicColumnsAbstractConcurrentSelfTest should not use MulticastIpFinder

2018-03-23 Thread Sergey Chugunov (JIRA)

Sergey Chugunov created IGNITE-8036:
---

 Summary: DynamicColumnsAbstractConcurrentSelfTest should not use 
MulticastIpFinder
 Key: IGNITE-8036
 URL: https://issues.apache.org/jira/browse/IGNITE-8036
 Project: Ignite
  Issue Type: Sub-task
Reporter: Sergey Chugunov


It turned out that DynamicColumnsAbstractConcurrentSelfTest prepares 
IgniteConfiguration without redefining default MulticastIpFinder which causes 
flaky fails of tests later.

 

VmIpFinder should be used instead of MulticastIpFinder.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: .NET: Add "authenticationEnabled" flag to IgntieConfiguration

2018-03-23 Thread Pavel Tupitsyn

Thanks, got it, will do.

On Fri, Mar 23, 2018 at 4:36 PM, Dmitry Pavlov 
wrote:

> Hi Pavel,
>
> Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436
>
> Sincerely,
> Dmitriy Pavlov
>
> On Fri, Mar 23, 2018 at 16:24, Pavel Tupitsyn wrote:
>
> > Please provide description in IGNITE-8034 and link Java-side ticket
> there.
> >
> > On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn 
> > wrote:
> >
> > > Hi Vladimir,
> > >
> > > Can you provide more details?
> > > * What does it do?
> > > * Do we need to only propagate the flag to .NET or do anything else?
> > > * Related ticket?
> > >
> > > Thanks,
> > > Pavel
> > >
> > > On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov  >
> > > wrote:
> > >
> > >> Pavel,
> > >>
> > >> We introduced new flag IgniteConfiguration.authenticationEnabled
> > recently.
> > >> Would you mind adding it to IgniteConfiguration.cs [1]?
> > >>
> > >> Vladimir.
> > >>
> > >> [1] https://issues.apache.org/jira/browse/IGNITE-8034
> > >>
> > >
> > >
> >
>

Re: Service grid redesign

2018-03-23 Thread Dmitriy Setrakyan

I think it is about time we take another look at our service functionality.
All the points you have raised sound reasonable to me.

On Fri, Mar 23, 2018 at 6:01 PM, Denis Mekhanikov 
wrote:

> Igniters,
>
> I'd like to start a discussion on Ignite service grid redesign.
> We have a number of problems in our current architecture, that have to be
> addressed.
>
> Here are the most severe ones:
>
> One of them is lack of guarantee, that service is successfully deployed and
> ready for work by the time, when *IgniteService.deploy*()* methods return.
> Furthermore, if an exception is thrown from *Service.init() *method, then
> the deploying side is not able to receive it, or even understand, that
> service is in unusable state.
> So, you may end up in such situation, when you deployed a service without
> receiving any errors, then called a service's method, and hung indefinitely
> on this invocation.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
>
> Another problem is locking during service deployment on unstable topology.
> This issue is caused by missing updates in continuous query listeners on
> the internal cache.
> It is hard to reproduce, but it happens sometimes. We shouldn't allow such
> possibility, that deployment methods hang without saying anything.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
>
> I think, we should change the deployment procedure to make it more
> reliable.
> Moving from operating over internal replicated service cache to sending
> custom discovery events seems to be a good idea.
> Service deployment may trigger a discovery event, that will make chosen
> nodes deploy the service, and the same event will notify other nodes about
> the deployed service instances.
> It will eliminate the need for distributed transactions on the internal
> replicated system cache, and make the service deployment protocol more
> transparent.
>
> There are a few points, that should be taken into account though.
>
> First of all, we can't wait for services to be deployed and initialised in
> the discovery thread.
> So, we need to make notification about service deployment result
> asynchronous, presumably over communication protocol.
> I can think of a procedure similar to the current exchange protocol, when
> service deployment is initialised with an initial discovery message,
> followed by asynchronous notifications from the hosting servers over
> communication. And finally, one more discovery message will notify all
> nodes about the service deployment result and location of the deployed
> service instances. Coordinator will be responsible for collecting of the
> deployment results in this scheme.
>
> Another problem is failover in case, when some nodes fail during deployment
> or further work.
> The following cases should be handled:
>
>1. coordinator failure during deployment;
>2. failure of nodes, that were chosen to host the service, during
>deployment;
>3. failure of nodes, that contain deployed services, after the
>deployment.
>
> The first case may be resolved by either continuation of deployment with a
> new coordinator, or by cancelling it.
> The second case will require another node to be chosen and notified. Maybe
> another discovery message will be needed.
> The third case will require redeployment, so coordinator should track
> topology changes and redeploy failed services.
>
> Another good improvement would be service versioning. This matter was
> already discussed in another thread:
> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> Let's resume this discussion and state the final decision here.
> This feature is closely connected to peer class loading, which is not
> working for services currently.
> So, service versioning should be implemented along with peer class loading.
> JIRA ticket for versioning:
> https://issues.apache.org/jira/browse/IGNITE-6069
> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
>
> Please share your thoughts. Constructive criticism is highly appreciated.
>
> Denis
>

Service grid redesign

2018-03-23 Thread Denis Mekhanikov

Igniters,

I'd like to start a discussion on redesigning the Ignite service grid.
Our current architecture has a number of problems that have to be
addressed.

Here are the most severe ones:

The first is the lack of a guarantee that a service is successfully deployed
and ready for work by the time the IgniteServices.deploy*() methods return.
Furthermore, if an exception is thrown from the Service.init() method, the
deploying side cannot receive it, or even learn that the service is in an
unusable state.
So you may deploy a service without receiving any errors, then call one of
its methods and hang indefinitely on that invocation.
JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392

Another problem is that service deployment can lock up on an unstable topology.
This issue is caused by missed updates in continuous query listeners on
the internal cache.
It is hard to reproduce, but it does happen. Deployment methods must not be
able to hang silently.
JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259

I think we should change the deployment procedure to make it more reliable.
Moving from operating over the internal replicated service cache to sending
custom discovery events seems to be a good idea.
A service deployment may trigger a discovery event that makes the chosen
nodes deploy the service, and the same event will notify the other nodes about
the deployed service instances.
This eliminates the need for distributed transactions on the internal
replicated system cache and makes the service deployment protocol more
transparent.

There are a few points that should be taken into account, though.

First of all, we can't wait in the discovery thread for services to be
deployed and initialised.
So the notification about the deployment result has to be asynchronous,
presumably sent over the communication protocol.
I can think of a procedure similar to the current exchange protocol: service
deployment is initiated with an initial discovery message, followed by
asynchronous notifications from the hosting servers over communication.
Finally, one more discovery message notifies all nodes about the deployment
result and the location of the deployed service instances. In this scheme the
coordinator is responsible for collecting the deployment results.
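
The three-step flow described above can be sketched in plain Java. This is only
an illustration under stated assumptions, not Ignite code: ServiceDeploymentSketch,
Service, DeploymentResult and deploy(...) are hypothetical names, and
CompletableFuture merely stands in for the asynchronous communication messages
sent by the hosting nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of the proposed deployment flow; not Ignite code. */
final class ServiceDeploymentSketch {
    /** Stand-in for a user service whose init() may throw. */
    interface Service {
        void init() throws Exception;
    }

    /** Result that the final discovery message would carry to all nodes. */
    static final class DeploymentResult {
        final List<String> deployedOn = new ArrayList<>();
        final Map<String, String> errors = new LinkedHashMap<>();

        boolean successful() {
            return errors.isEmpty();
        }
    }

    /**
     * Step 1: the initial "discovery message" names the hosting nodes (the map keys).
     * Step 2: every hosting node runs init() asynchronously and reports the outcome.
     * Step 3: the coordinator collects all reports into one result.
     */
    static DeploymentResult deploy(Map<String, Service> hostingNodes) {
        Map<String, CompletableFuture<String>> reports = new LinkedHashMap<>();

        for (Map.Entry<String, Service> e : hostingNodes.entrySet()) {
            Service svc = e.getValue();

            reports.put(e.getKey(), CompletableFuture.supplyAsync(() -> {
                try {
                    svc.init();
                    return null; // null means "initialised without errors"
                }
                catch (Exception ex) {
                    return String.valueOf(ex.getMessage());
                }
            }));
        }

        DeploymentResult res = new DeploymentResult();

        for (Map.Entry<String, CompletableFuture<String>> e : reports.entrySet()) {
            String err = e.getValue().join(); // coordinator waits for every report

            if (err == null)
                res.deployedOn.add(e.getKey());
            else
                res.errors.put(e.getKey(), err);
        }

        return res;
    }

    public static void main(String[] args) {
        Map<String, Service> nodes = new LinkedHashMap<>();
        nodes.put("node-1", () -> { });
        nodes.put("node-2", () -> { throw new IllegalStateException("init failed"); });

        DeploymentResult res = deploy(nodes);

        System.out.println("deployed on: " + res.deployedOn);
        System.out.println("errors: " + res.errors);
    }
}
```

The point of the sketch is that an init() failure ends up in the aggregated
result, so the deploying side could see it instead of hanging on a later call.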

Another problem is failover when some nodes fail during deployment or
afterwards.
The following cases should be handled:

   1. coordinator failure during deployment;
   2. failure of nodes that were chosen to host the service, during
   deployment;
   3. failure of nodes that host deployed services, after the
   deployment.

The first case may be resolved either by continuing the deployment with a
new coordinator or by cancelling it.
The second case requires another node to be chosen and notified; another
discovery message may be needed for that.
The third case requires redeployment, so the coordinator should track
topology changes and redeploy failed services.
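
A rough sketch of how the three failure cases might be handled, again in plain
Java and not based on any existing Ignite class. ServiceFailoverSketch,
nextCoordinator(...) and reassign(...) are hypothetical names, and the rules
"first node in join order becomes coordinator" and "first free alive node
becomes the replacement host" are simplifying assumptions made only for this
example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical failover helper for the three cases listed above; not Ignite code. */
final class ServiceFailoverSketch {
    /**
     * Case 1: the coordinator failed. Assumption: the first remaining node
     * in join order takes over. Returns null if no node is left.
     */
    static String nextCoordinator(List<String> nodesByJoinOrder, String failedNode) {
        for (String n : nodesByJoinOrder) {
            if (!n.equals(failedNode))
                return n;
        }

        return null;
    }

    /**
     * Cases 2 and 3: a hosting node failed during or after deployment.
     * Returns a new host set in which the failed node is replaced by the first
     * alive node that does not host the service yet. The input set is not modified.
     */
    static Set<String> reassign(Set<String> hosts, List<String> aliveNodes, String failedNode) {
        Set<String> res = new LinkedHashSet<>(hosts);

        if (!res.remove(failedNode))
            return res; // the failed node did not host the service

        for (String n : aliveNodes) {
            // add() returns true only when n was not a host before
            if (!n.equals(failedNode) && res.add(n))
                break;
        }

        return res;
    }

    public static void main(String[] args) {
        Set<String> hosts = new LinkedHashSet<>(Arrays.asList("n1", "n2"));
        List<String> alive = Arrays.asList("n1", "n3", "n4");

        System.out.println("hosts after n2 failed: " + reassign(hosts, alive, "n2"));
        System.out.println("coordinator after n1 failed: "
            + nextCoordinator(Arrays.asList("n1", "n2", "n3"), "n1"));
    }
}
```

If no spare node is alive, reassign(...) simply returns a smaller host set; a
real protocol would have to decide whether that counts as a failed deployment.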

Another good improvement would be service versioning. This matter was
already discussed in another thread:
http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
Let's resume that discussion and record the final decision here.
This feature is closely connected to peer class loading, which currently does
not work for services.
So service versioning should be implemented along with peer class loading.
JIRA ticket for versioning:
https://issues.apache.org/jira/browse/IGNITE-6069
Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975

Please share your thoughts. Constructive criticism is highly appreciated.

Denis

Mass TC Run-All during this weekend

2018-03-23 Thread Dmitry Pavlov

Hi Igniters,

To get the full picture of the current state of the tests, I'm going to
run a massive Run All this weekend.

I'm going to start it at 20:00 MSK (17:00 UTC).

Are there any objections? If you need to run your own branches, move them
to the front of the queue.

Sincerely,
Dmitriy Pavlov

[jira] [Created] (IGNITE-8035) Duplicated events in ContinuousQuery's Events Listener

2018-03-23 Thread Ruslan Gilemzyanov (JIRA)

Ruslan Gilemzyanov created IGNITE-8035:
--

 Summary: Duplicated events in ContinuousQuery's Events Listener 
 Key: IGNITE-8035
 URL: https://issues.apache.org/jira/browse/IGNITE-8035
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 2.1
Reporter: Ruslan Gilemzyanov


We encountered a bug in how the ContinuousQuery event listener works in Ignite. 
I wrote a sample project to demonstrate it.

We started 2 server nodes connected to the same cache.

The topology snapshot became [ver=2, servers=2, clients=0, CPUs=4, heap=3.6GB].

I put about 50 elements into the cache. The elements were distributed 
between the two nodes in approximately equal amounts.

After putting each element into the cache, we waited 100 ms (to ensure that the 
listener had done its work) and then deleted the element from the cache.

Then we stopped one node. (The topology snapshot became [ver=3, servers=1, 
clients=0, CPUs=4, heap=1.8GB].)

After that, events for some randomly chosen elements arrived at the remaining 
node with status CREATED, although those elements had already been deleted 
from the cache by that moment. In our case there were 5 such events.

I think this is a direct violation of Continuous Query's "exactly once delivery" 
contract.

Source code here: https://github.com/ruslangm/ignite-sample

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[GitHub] ignite pull request #3654: IGNITE-7581 avoiding ConcurrentModificationExcept...

2018-03-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3654


---

Re: Rebalancing - how to make it faster

2018-03-23 Thread Ilya Lantukh

Denis,

> - Don't you want to aggregate the tickets under an IEP?
Yes, I think so.

> - Does it mean we're going to update our B+Tree implementation? Any ideas
how risky it is?
One of the tickets that I created
(https://issues.apache.org/jira/browse/IGNITE-7935) involves a B+Tree
modification, but I am not planning to do it in the near future. It
shouldn't affect existing tree operations, only introduce new ones (putAll,
invokeAll, removeAll).

> - Any chance you had a prototype that shows performance optimizations of the
approach you are suggesting to take?
I have a prototype for the simplest improvements
(https://issues.apache.org/jira/browse/IGNITE-8019 and
https://issues.apache.org/jira/browse/IGNITE-8018). Together they increase
throughput by 5-20%, depending on configuration and environment. Also, I've
tested different WAL modes: switching from LOG_ONLY to NONE gives a boost of
over 100%, which is what I expect from
https://issues.apache.org/jira/browse/IGNITE-8017.
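
The per-entry versus per-batch cost discussed in this thread can be shown with
a toy model. This is a sketch under stated assumptions, not the rebalancing
code: RebalanceBatchSketch, CountingWal, applyPerEntry(...) and
applyBatched(...) are hypothetical names, and the "WAL" only counts records and
fsync calls instead of touching disk.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of per-entry vs. batched WAL handling; not Ignite code. */
final class RebalanceBatchSketch {
    /** Fake write-ahead log that only counts appended records and fsync calls. */
    static final class CountingWal {
        int records;
        int fsyncs;

        void append(String rec) {
            records++;
        }

        void fsync() {
            fsyncs++;
        }
    }

    /** Per-entry processing: one record and one fsync per entry, so N fsyncs for N entries. */
    static void applyPerEntry(List<String> batch, CountingWal wal) {
        for (String entry : batch) {
            wal.append(entry);
            wal.fsync();
        }
    }

    /** Batched processing: the same records, but a single fsync for the whole batch. */
    static void applyBatched(List<String> batch, CountingWal wal) {
        for (String entry : batch)
            wal.append(entry);

        if (!batch.isEmpty())
            wal.fsync();
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("k1", "k2", "k3", "k4");

        CountingWal perEntry = new CountingWal();
        CountingWal batched = new CountingWal();

        applyPerEntry(batch, perEntry);
        applyBatched(batch, batched);

        System.out.println("per-entry fsyncs: " + perEntry.fsyncs);
        System.out.println("batched fsyncs: " + batched.fsyncs);
    }
}
```

Both variants write the same number of records; only the number of fsync calls
differs, which is the effect the batch-oriented tickets aim to exploit.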

On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda  wrote:

> Ilya,
>
> That's outstanding research and summary. Thanks for spending your time on
> this.
>
> Not sure I have enough expertise to challenge your approach, but it sounds
> 100% reasonable to me. As side notes:
>
>- Don't you want to aggregate the tickets under an IEP?
>- Does it mean we're going to update our B+Tree implementation? Any
>ideas how risky it is?
>- Any chance you had a prototype that shows performance optimizations of
>the approach you are suggesting to take?
>
> --
> Denis
>
> On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh 
> wrote:
>
> > Igniters,
> >
> > I've spent some time analyzing the performance of the rebalancing process.
> > The initial goal was to understand what limits its throughput, because it
> > is significantly slower than the network and storage device can
> > theoretically handle.
> >
> > It turns out that our current implementation has a number of issues caused
> > by a single fundamental problem.
> >
> > During rebalance, data is sent in batches called
> > GridDhtPartitionSupplyMessages. The batch size is configurable; the default
> > value is 512KB, which could mean thousands of key-value pairs. However, we
> > don't take any advantage of this fact and process each entry independently:
> > - checkpointReadLock is acquired multiple times for every entry, leading
> > to unnecessary contention - this is clearly a bug;
> > - for each entry we write (and fsync, if the configuration requires it) a
> > separate WAL record, so if a batch contains N entries, we might end up
> > doing N fsyncs;
> > - adding every entry into CacheDataStore also happens completely
> > independently. This means we traverse and modify each index tree N
> > times, allocate space in FreeList N times, and additionally have to
> > store O(N*log(N)) page delta records in the WAL.
> >
> > I've created a few tickets in JIRA with very different levels of scale
> > and complexity.
> >
> > Ways to reduce impact of independent processing:
> > - https://issues.apache.org/jira/browse/IGNITE-8019 - aforementioned
> > bug, causing contention on checkpointReadLock;
> > - https://issues.apache.org/jira/browse/IGNITE-8018 - inefficiency in
> > GridCacheMapEntry implementation;
> > - https://issues.apache.org/jira/browse/IGNITE-8017 - automatically
> > disable WAL during preloading.
> >
> > Ways to solve problem on more global level:
> > - https://issues.apache.org/jira/browse/IGNITE-7935 - a ticket to
> > introduce batch modification;
> > - https://issues.apache.org/jira/browse/IGNITE-8020 - complete redesign
> > of rebalancing process for persistent caches, based on file transfer.
> >
> > Everyone is welcome to criticize above ideas, suggest new ones or
> > participate in implementation.
> >
> > --
> > Best regards,
> > Ilya
> >
>



-- 
Best regards,
Ilya

Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Dmitry Pavlov

Hi Pavel,

Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436

Sincerely,
Dmitriy Pavlov

On Fri, Mar 23, 2018 at 16:24, Pavel Tupitsyn wrote:

> Please provide description in IGNITE-8034 and link Java-side ticket there.
>
> On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn 
> wrote:
>
> > Hi Vladimir,
> >
> > Can you provide more details?
> > * What does it do?
> > * Do we need to only propagate the flag to .NET or do anything else?
> > * Related ticket?
> >
> > Thanks,
> > Pavel
> >
> > On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov 
> > wrote:
> >
> >> Pavel,
> >>
> >> We introduced new flag IgniteConfiguration.authenticationEnabled
> recently.
> >> Would you mind adding it to IgniteConfiguration.cs [1]?
> >>
> >> Vladimir.
> >>
> >> [1] https://issues.apache.org/jira/browse/IGNITE-8034
> >>
> >
> >
>

Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Vladimir Ozerov

Hi Pavel,

I added the link to original ticket.

On Fri, Mar 23, 2018 at 4:24 PM, Pavel Tupitsyn 
wrote:

> Please provide description in IGNITE-8034 and link Java-side ticket there.
>
> On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn 
> wrote:
>
>> Hi Vladimir,
>>
>> Can you provide more details?
>> * What does it do?
>> * Do we need to only propagate the flag to .NET or do anything else?
>> * Related ticket?
>>
>> Thanks,
>> Pavel
>>
>> On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov 
>> wrote:
>>
>>> Pavel,
>>>
>>> We introduced new flag IgniteConfiguration.authenticationEnabled recently.
>>> Would you mind adding it to IgniteConfiguration.cs [1]?
>>>
>>> Vladimir.
>>>
>>> [1] https://issues.apache.org/jira/browse/IGNITE-8034
>>>
>>
>>
>

Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Pavel Tupitsyn

Please provide description in IGNITE-8034 and link Java-side ticket there.

On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn 
wrote:

> Hi Vladimir,
>
> Can you provide more details?
> * What does it do?
> * Do we need to only propagate the flag to .NET or do anything else?
> * Related ticket?
>
> Thanks,
> Pavel
>
> On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov 
> wrote:
>
>> Pavel,
>>
>> We introduced new flag IgniteConfiguration.authenticationEnabled recently.
>> Would you mind adding it to IgniteConfiguration.cs [1]?
>>
>> Vladimir.
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-8034
>>
>
>

Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Pavel Tupitsyn

Hi Vladimir,

Can you provide more details?
* What does it do?
* Do we need to only propagate the flag to .NET or do anything else?
* Related ticket?

Thanks,
Pavel

On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov 
wrote:

> Pavel,
>
> We introduced new flag IgniteConfiguration.authenticationEnabled recently.
> Would you mind adding it to IgniteConfiguration.cs [1]?
>
> Vladimir.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8034
>

[GitHub] ignite pull request #3692: Suspend resume benchmark

2018-03-23 Thread voipp

GitHub user voipp opened a pull request:

https://github.com/apache/ignite/pull/3692

Suspend resume benchmark



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/voipp/ignite suspend-resume-benchmark

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3692.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3692


commit 4f52ed67923c0461b7b5f645d02875abda2e4f3c
Author: voipp 
Date:   2018-03-20T16:26:41Z

suspend-resume yardstick benchmark

commit 1759dac4019df4dbd557b9468a120f1fb3cce2bf
Author: voipp 
Date:   2018-03-22T16:26:23Z

benchmark for explicit tx get operation added




---

.NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Vladimir Ozerov

Pavel,

We recently introduced a new flag, IgniteConfiguration.authenticationEnabled.
Would you mind adding it to IgniteConfiguration.cs [1]?

Vladimir.

[1] https://issues.apache.org/jira/browse/IGNITE-8034

[jira] [Created] (IGNITE-8034) .NET: Add "authenticationEnabled" flag to IgniteConfiguration

2018-03-23 Thread Vladimir Ozerov (JIRA)

Vladimir Ozerov created IGNITE-8034:
---

 Summary: .NET: Add "authenticationEnabled" flag to 
IgniteConfiguration
 Key: IGNITE-8034
 URL: https://issues.apache.org/jira/browse/IGNITE-8034
 Project: Ignite
  Issue Type: Task
  Components: platforms
Reporter: Vladimir Ozerov
 Fix For: 2.5






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[GitHub] ignite pull request #3572: IGNITE-7814 returned scala211.library.version par...

2018-03-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3572


---

Re: Test 150 clients migrated to `Cache [6]` suite

2018-03-23 Thread Petr Ivanov

No objections then.



> On 23 Mar 2018, at 13:03, Dmitry Pavlov  wrote:
> 
> Hi Dmitriy,
> 
>  Thank you for this improvement.
> 
> Hi Petr,
> 
>  IMO there is no need to keep Obsolete suites for that long, because
> Igniters usually do not use test history (except for the TC Jedi who helps
> with monitoring). Moreover, test history itself is global, so there is still
> no need to keep the suites for long.
> 
> I think it is reasonable to announce such changes in advance, because someone
> may have created a unique parameter combination in a suite, so it is not
> desirable to delete it without notification (as happened with the zIgnores &
> reproducing suite).
> 
> So let's wait some time for objections, and if no one objects, we'll
> remove them.
> 
> Sincerely,
> Dmitriy Pavlov
> 
> On Fri, Mar 23, 2018 at 12:07, Petr Ivanov wrote:
> 
>> Nice work!
>> 
>> 
>> Still, I'd ask that we keep the '~[Obsolete] 150 Clients' build configuration
>> for at least a month after its latest run, until its build history is removed
>> by autoclean; it would be nice to keep that test history.
>> 
>> 
>> 
>>> On 23 Mar 2018, at 11:53, Дмитрий Рябов  wrote:
>>> 
>>> Hello Igniters!
>>> 
>>> 
>>> 
>>> I migrated the test `IgniteCache150ClientsTest` from the `150 Clients` suite
>>> to `Cache [6]`, because `150 Clients` contains only one test class and adds
>>> ~6 min of extra Ignite build time.
>>> 
>>> 
>>> 
>>> So, test suites `150 Clients` and `~[Obsolete] 150 Clients` will be
>>> deleted soon.
>> 
>>

[GitHub] ignite pull request #3691: IGNITE-7691: Provide info about DECIMAL column sc...

2018-03-23 Thread nizhikov

GitHub user nizhikov opened a pull request:

https://github.com/apache/ignite/pull/3691

IGNITE-7691: Provide info about DECIMAL column scale and precision



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nizhikov/ignite IGNITE-7691

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3691.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3691


commit a9c9548301056172454d83cb191a4866e37cb828
Author: Nikolay Izhikov 
Date:   2018-03-07T14:02:00Z

IGNITE-7691: Wrote some tests and empty implementation

commit 1b7604462321e40ea331e28be3f7d6352ffbd65c
Author: Nikolay Izhikov 
Date:   2018-03-14T19:31:38Z

IGNITE-7691: commit in the middle of development

commit 7bf4aee3db6362958c133c23f0efdd8fc8ef11fa
Author: Nikolay Izhikov 
Date:   2018-03-16T10:49:48Z

IGNITE-7691: Couple of edits to follow the way IGNITE-5623 follows.

commit 34b879264851d5b58133cd332d72fdfd2513d270
Author: Nikolay Izhikov 
Date:   2018-03-20T14:20:08Z

IGNITE-7691: One of failed tests fixed.

commit 8ec5d4508b60803dab0026ee1928e30207e4aa1b
Author: Nikolay Izhikov 
Date:   2018-03-22T19:12:45Z

Merge branch 'master' into IGNITE-7691

commit 496566a14ead99d368f8f933469ff7f63df9fa1d
Author: Nikolay Izhikov 
Date:   2018-03-23T09:16:41Z

Merge branch 'master' into IGNITE-7691

commit 4612eb7daad5bf31ccf0b603c2d66992f6185da5
Author: Nikolay Izhikov 
Date:   2018-03-23T10:44:15Z

IGNITE-7691: Tests seems to be OK




---

[jira] [Created] (IGNITE-8033) Flaky failure TxOptimisticDeadlockDetectionCrossCacheTest.testDeadlock on TC

2018-03-23 Thread Aleksey Plekhanov (JIRA)

Aleksey Plekhanov created IGNITE-8033:
-

 Summary: Flaky failure 
TxOptimisticDeadlockDetectionCrossCacheTest.testDeadlock on TC
 Key: IGNITE-8033
 URL: https://issues.apache.org/jira/browse/IGNITE-8033
 Project: Ignite
  Issue Type: Bug
Reporter: Aleksey Plekhanov
Assignee: Aleksey Plekhanov


Test TxOptimisticDeadlockDetectionCrossCacheTest.testDeadlock fails 
intermittently on TC, sometimes with a timeout and sometimes with the following 
error:

{noformat}
[2018-03-21 12:06:23,469][ERROR][main][root] Test failed.
junit.framework.AssertionFailedError
at junit.framework.Assert.fail(Assert.java:55)
at junit.framework.Assert.assertTrue(Assert.java:22)
at junit.framework.Assert.assertTrue(Assert.java:31)
at junit.framework.TestCase.assertTrue(TestCase.java:201)
at org.apache.ignite.internal.processors.cache.transactions.TxOptimisticDeadlockDetectionCrossCacheTest.doTestDeadlock(TxOptimisticDeadlockDetectionCrossCacheTest.java:186)
at org.apache.ignite.internal.processors.cache.transactions.TxOptimisticDeadlockDetectionCrossCacheTest.testDeadlock(TxOptimisticDeadlockDetectionCrossCacheTest.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at junit.framework.TestCase.runTest(TestCase.java:176)
at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2001)
at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:133)
at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1916)
at java.lang.Thread.run(Thread.java:745)
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (IGNITE-8032) Fix issues within TX DML reducer.

2018-03-23 Thread Sergey Kalashnikov (JIRA)

Sergey Kalashnikov created IGNITE-8032:
--

 Summary: Fix issues within TX DML reducer.
 Key: IGNITE-8032
 URL: https://issues.apache.org/jira/browse/IGNITE-8032
 Project: Ignite
  Issue Type: Sub-task
Reporter: Sergey Kalashnikov
Assignee: Sergey Kalashnikov


The following code review issues need to be addressed:

1. GridNearTxQueryResultsEnlistFuture

1.1. Remove GridCacheCompoundIdentityFuture implementation.
remote mini-futures.
1.2. Improve concurrency around sendNextBatches calls.

2. Refactor iterator UpdateIteratorAdapter/TxDmlReducerIterator to avoid 
multi-level nesting.

3. Normalize usage of IgniteBiTuple(k,v)/Object(key) instead of Object[] to 
represent rows.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[GitHub] ignite pull request #3690: IGNITE-7928 .NET: Exception is not propagated to ...

2018-03-23 Thread apopovgg

GitHub user apopovgg opened a pull request:

https://github.com/apache/ignite/pull/3690

IGNITE-7928 .NET: Exception is not propagated to the C# client and the app hangs

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-7928

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3690.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3690


commit 7090407889d0ade418167054e7660c1af6cab48e
Author: apopov 
Date:   2018-03-23T10:24:50Z

IGNITE-7928 .NET: Exception is not propagated to the C# client and the app 
hangs




---

[jira] [Created] (IGNITE-8031) MVCC TX: TxLog does not support partitions rebalance at the moment. We need to implement it.

2018-03-23 Thread Roman Kondakov (JIRA)

Roman Kondakov created IGNITE-8031:
--

 Summary: MVCC TX: TxLog does not support partitions rebalance at 
the moment. We need to implement it.
 Key: IGNITE-8031
 URL: https://issues.apache.org/jira/browse/IGNITE-8031
 Project: Ignite
  Issue Type: Bug
  Components: sql
Reporter: Roman Kondakov


When a new node joins the cluster, its TxLog is still empty after the partition 
rebalance. As a result, this node considers all transactions committed before 
the join to be uncommitted.

We need to replicate the TxLog to new nodes along with the data partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Server stores cache data on-heap if client has near cache - IGNITE-4662

2018-03-23 Thread Dmitry Pavlov

Hi Igniters,

Alexey, please step in. I did not fully understand the issue description and
can't tell whether data is still not moved from on-heap.

I suggest we consider it outdated unless we get additional info.

Sincerely,
Dmitriy Pavlov

On Thu, Mar 22, 2018 at 10:48, Vyacheslav Daradur wrote:

> Hi, Igniters!
>
> Looks like it's outdated and can be closed.
>
> Could someone comment if this ticket [1] is still valid?
>
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4662
>
> On Wed, Jun 28, 2017 at 2:38 AM, Valentin Kulichenko
>  wrote:
> > I'm not sure this ticket is valid for 2.0. Semen, can you comment?
> >
> > -Val
> >
> > On Tue, Jun 27, 2017 at 1:14 AM, Vyacheslav Daradur  >
> > wrote:
> >
> >> Hi Igniters.
> >>
> >> I have some questions according to this task:
> >>
> >> 1. Does the method: GridCacheMapEntry#evictInternal do the
> >> eviction(on-heap
> >> -> off-heap)?
> >> 2. Is CacheOffheapEvictionManager responsible for managing the
> >> eviction(on-heap -> off-heap)? (if not, then who is?)
> >> 3. At what moment the eviction(on-heap -> off-heap) is called?
> >>
> >>
> >> --
> >> Best Regards, Vyacheslav D.
> >>
>
>
>
> --
> Best Regards, Vyacheslav D.
>

[GitHub] ignite pull request #3689: Ignite-1.6.15

2018-03-23 Thread AMashenkov

GitHub user AMashenkov opened a pull request:

https://github.com/apache/ignite/pull/3689

Ignite-1.6.15

For test purposes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-1.6.15

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3689.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3689


commit eea42827738301168d66562a05d3360150e6fb89
Author: sboikov 
Date:   2016-08-23T07:47:15Z

Fix for IgniteDynamicCacheStartNoExchangeTimeoutTest.

commit f9258736c5cfa157e138d879aa0dbacc5a7cb4b2
Author: Alexey Kuznetsov 
Date:   2016-08-23T07:59:45Z

Fixed tests.

commit d6033712425cd0157fa640a7d46ec6579fd7e74b
Author: Pavel Tupitsyn 
Date:   2016-08-23T11:10:55Z

IGNITE-3325 .NET: Improve CompiledQuery in LINQ provider

commit 207e8cb3b939670655da14082ac9e1cf1d822fd0
Author: Pavel Tupitsyn 
Date:   2016-08-23T17:01:48Z

IGNITE-3325 .NET: Improve CompiledQuery in LINQ provider - minor fixes

commit 417b269fe844c2a0b2c18aff8f58a2b38bc27531
Author: Vasiliy Sisko 
Date:   2016-08-24T02:44:04Z

IGNITE-3005 Implemented showing of node ID8 , IP and checked values on 
alert.

commit 8c370c6927848b66782ecc18df499460c0bfdd11
Author: Vasiliy Sisko 
Date:   2016-08-24T03:03:33Z

IGNITE-2726 Added per-nodes cache information about off-heap.

commit 25b59d7c6ea7bd23d5a8366fb5618e11990327c8
Author: Pavel Tupitsyn 
Date:   2016-08-24T08:07:17Z

IGNITE-3325 .NET: Rename CompiledQueryDelegate -> CompiledQueryFunc to 
conform to design guidelines

commit 4e9e7b8ee1c990bacdc2d081b706ca315927fdce
Author: vozerov-gridgain 
Date:   2016-08-24T09:12:00Z

IGNITE-3716: ODBC: Added SQL escape sequence parsing.

commit 118db2fcffe3534aa1e5f4b97b8fbe23891752c4
Author: vozerov-gridgain 
Date:   2016-08-24T09:12:24Z

Merge remote-tracking branch 'upstream/ignite-1.6.6' into ignite-1.6.6

commit c02ad87f863bd730e60fb7052547fa0848e7eb77
Author: isapego 
Date:   2016-08-24T14:21:48Z

IGNITE-3736: ODBC: Added support for string scalar functions. This closes 
#979.

commit 13dfcbe03aca137ee70698f6083df27c10ecdaf9
Author: vozerov-gridgain 
Date:   2016-08-24T14:31:38Z

IGNITE-3736: ODBC: Reverted back removed "supported" futures.

commit d6449ffbc65acda6a2cf4484608188367837dd17
Author: sboikov 
Date:   2016-08-24T15:34:02Z

Fixed issues on node stop:
- in service processor need guard depExe access with busyLock
- do not error log IO errors in ClientImpl on stop

commit 12fd4976f482ebc43831754645e34042c9073b2d
Author: sboikov 
Date:   2016-08-25T09:29:04Z

Fixed GridQueryParsingTest.

commit 5a3b3e2c6ecb5d6c96513b79f21828526b4a98a0
Author: isapego 
Date:   2016-08-25T09:35:07Z

IGNITE-3749: ODBC: Added support for numeric scalar functions. This closes 
#981.

commit 0e3a6e2df8b42f255a5a4688d5827dccaabfd3a4
Author: isapego 
Date:   2016-08-25T11:34:31Z

IGNITE-3757: ODBC: Added aggregate functions support. This closes #983.

commit e2f287039011bc9437c94fb574e61e2ac226
Author: Andrey V. Mashenkov 
Date:   2016-08-25T13:26:02Z

IGNITE-3738: ODBC: Fixed escape sequence whitespaces handling. This closes 
#982.

commit 8aabd6ea65d883d3bbcf37c05c146105dff8a6e2
Author: isapego 
Date:   2016-08-25T13:30:20Z

IGNITE-3751: ODBC: Added system functions support. This closes #985.

commit ae0b5ebf02f3eb70d24dd3b0eb63dde9843c82b0
Author: Andrey V. Mashenkov 
Date:   2016-08-26T08:12:31Z

IGNITE-3739: ODBC: Added GUID escape sequence support. This closes #988.

commit 6fd53ea5b50148e5a1156d83ea28acb8faf84035
Author: Igor Sapego 
Date:   2016-08-26T08:19:39Z

IGNITE-3761: ODBC: Added tests for SQL_SQL92_VALUE_EXPRESSIONS. This closes 
#989.

commit 99e3e8a2d997aa681264460c2845984712ded90e
Author: Igor Sapego 
Date:   2016-08-26T08:23:49Z

IGNITE-3764: ODBC Added tests for SQL operators. This closes #986.

commit 87a1928a4f90b4f8a221041cfff9d22e3dd801cc
Author: vozerov-gridgain 
Date:   2016-08-26T12:22:15Z

IGNITE-3776: Removed code duplication in GridNearAtomicAbstractUpdateFuture.

commit 92f18bf353cc8c3821c6500ce9f1cd397a7cf17c
Author: Andrey V. Mashenkov 
Date:   2016-08-26T12:31:30Z

IGNITE-3745: ODBC: Implemented date/time/timestamp escape sequence parsing. 
This closes #991.

commit b5757642e135908d9baa027a605035dd0d4acfc9
Author: tledkov-gridgain 
Date:   2016-08-26T12:47:02Z

IGNITE-3670 IGFS: Improved symlink handling for delete operation and added 
more tests. This

Re: Test 150 clients migrated to `Cache [6]` suite

2018-03-23 Thread Dmitry Pavlov

Hi Dmitriy,

  Thank you for this improvement.

Hi Petr,

  IMO there is no need to keep Obsolete suites for such a long time because
Igniters usually do not use test history (except for the TC Jedi who help with
monitoring). Moreover, test history itself is global, so there is still no need to
keep suites for too long.

I think it is reasonable to notify before changes because someone may have
created a unique parameter combination in a suite, so it is not desirable to
delete it without notification (as it was for zIgnores & reproducing
suite).

So let's wait some time for objections, and if no one objects, we'll
remove them.

Sincerely,
Dmitriy Pavlov

On Fri, 23 Mar 2018 at 12:07, Petr Ivanov wrote:

> Nice work!
>
>
> Yet, I have to ask to keep the '~[Obsolete] 150 Clients' build configuration
> for at least a month from its latest run, until its build history is removed by
> autoclean; it would be nice to keep that test history.
>
>
>
> > On 23 Mar 2018, at 11:53, Дмитрий Рябов  wrote:
> >
> > Hello Igniters!
> >
> >
> >
> > I migrated the test `IgniteCache150ClientsTest` from the `150 Clients` suite to
> > `Cache [6]`, because `150 Clients` contains only one test class and costs an
> > extra ~6 min Ignite build.
> >
> >
> >
> > So, test suites `150 Clients` and `~[Obsolete] 150 Clients` will be
> deleted
> > soon.
>
>

[jira] [Created] (IGNITE-8030) Cluster hangs on deactivation process in time stopping indexed cache

2018-03-23 Thread Vladislav Pyatkov (JIRA)

Vladislav Pyatkov created IGNITE-8030:
-

 Summary: Cluster hangs on deactivation process in time stopping 
indexed cache
 Key: IGNITE-8030
 URL: https://issues.apache.org/jira/browse/IGNITE-8030
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov
 Attachments: thrdump-server.log

{noformat}

"sys-#10283%DPL_GRID%DplGridNodeName%" #13068 prio=5 os_prio=0 
tid=0x7f07040eb000 nid=0x2e0f waiting on condition [0x7e6deb9b8000]

   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x7f0bd2b0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:897)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1222)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lockInterruptibly(ReentrantReadWriteLock.java:998)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.lock(GridH2Table.java:292)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.lock(GridH2Table.java:253)
    at org.h2.command.ddl.DropTable.prepareDrop(DropTable.java:87)
    at org.h2.command.ddl.DropTable.update(DropTable.java:113)
    at org.h2.command.CommandContainer.update(CommandContainer.java:101)
    at org.h2.command.Command.executeUpdate(Command.java:260)
    - locked <0x7f0c276c85b8> (a org.h2.engine.Session)
    at org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:137)
    - locked <0x7f0c276c85b8> (a org.h2.engine.Session)
    at org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:122)
    at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.dropTable(IgniteH2Indexing.java:654)
    at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.unregisterCache(IgniteH2Indexing.java:2482)
    at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop0(GridQueryProcessor.java:1684)
    - locked <0x7f0b69f822d0> (a java.lang.Object)
    at org.apache.ignite.internal.processors.query.GridQueryProcessor.onCacheStop(GridQueryProcessor.java:879)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCache(GridCacheProcessor.java:1189)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStop(GridCacheProcessor.java:2063)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2219)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1518)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:2538)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:2297)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:2034)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$100(GridDhtPartitionsExchangeFuture.java:122)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:1891)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:1879)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:1879)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:1523)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1000(GridCachePartitionExchangeManager.java:133)

    at

Re: Nodes which started in separate JVM couldn't stop properly (in tests)

2018-03-23 Thread Vyacheslav Daradur

Dmitry, Nikolay, thanks for your help!

Since the changes are merged, we are able to implement this feature in the
Compatibility Testing Framework.

It allows us to work flexibly with a multi-version cluster and
implement new testing scenarios (I know that Ignite doesn't have such a
feature yet).

Does it make sense to create a ticket?
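
Editorial sketch: the self-wait described in the quoted 15 Mar message (a job that stops the node, while the stop waits for that same job) can be reproduced with plain JDK executors. This is only an analogy under stated assumptions, not Ignite code: the class and method names are hypothetical, `pool.shutdown()` plus `awaitTermination` stands in for `G.stop(igniteInstanceName, false)`, and a timeout is used so the demo returns instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Plain-JDK analogy of the self-wait described in the thread (not Ignite code). */
public class SelfStopDemo {
    /**
     * Submits a "stop" job to a pool. The job shuts the pool down and then waits
     * for all running jobs to finish, which includes the job itself.
     *
     * @return true if the pool terminated within the timeout while the job waited.
     */
    public static boolean stopFromInsideJob(long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        Future<Boolean> stopJob = pool.submit(() -> {
            pool.shutdown(); // stand-in for a stop that does not cancel running jobs

            // Waits for every running job; this job is one of them.
            return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        });

        // The caller blocks on the job result, like the local node in the thread.
        return stopJob.get();
    }

    /** Triggers the shutdown from a thread that is not one of the pool's jobs. */
    public static boolean stopFromOutside(long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        pool.submit(() -> { /* some short job */ });
        pool.shutdown();

        return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stop from inside a job terminated: " + stopFromInsideJob(200));
        System.out.println("stop from outside terminated: " + stopFromOutside(2000));
    }
}
```

Without the timeout, the first variant would wait forever, which matches the stuck threads reported in the thread. The second variant shows one general way out (the wait is not performed by a job that is itself being waited for); whether the merged fix works this way is not stated in the messages here.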


On Wed, Mar 21, 2018 at 8:16 PM, Dmitry Pavlov  wrote:
> Hi Nickolay,
>
> it seems we have lazy consensus here.
>
> Failures: 11 tests in 1 suite; all of these tests also fail in master.
>
> Could you merge?
>
> Sincerely,
> Dmitriy Pavlov
>
> On Fri, 16 Mar 2018 at 18:29, Nikolay Izhikov wrote:
>
>> Hello, Guys.
>>
>> I've reviewed the changes and they look good to me.
>> There is a simple reproducer for a bug in the test framework, see below.
>>
>> It fails in master and works in the branch.
>>
>> I'm planning to merge the fix [1] if Run All is OK.
>>
>> Please, write to me if you have any objections.
>>
>> [1] https://github.com/apache/ignite/pull/2382
>>
>> ```
>> public class MultiJvmSelfTest extends GridCommonAbstractTest {
>>     @Override protected boolean isMultiJvm() { return true; }
>>
>>     public void testGrid() throws Exception {
>>         final IgniteInternalFuture fut = GridTestUtils.runAsync(new RunnableX() {
>>             @Override public void runx() throws Exception {
>>                 try {
>>                     startGrid(0);
>>                     startGrid(1);
>>                 }
>>                 finally {
>>                     stopGrid(1);
>>                     stopGrid(0);
>>                 }
>>             }
>>         });
>>
>>         try {
>>             fut.get(20_000L);
>>         }
>>         finally {
>>             stopAllGrids(true);
>>         }
>>     }
>> }
>> ```
>>
>> On Thu, 15/03/2018 at 15:59 +, Dmitry Pavlov wrote:
>> > I see now. Thank you.
>> >
>> > Nikolay, could you please merge this change?
>> >
>> > On Thu, 15 Mar 2018 at 18:48, Vyacheslav Daradur wrote:
>> >
>> > > In brief:
>> > > Nodes in *separate* JVMs are shutting down by the computing task
>> > > *StopGridTask* which has sent from *local* JVM *synchronously* that
>> > > means *local* node must wait for task's finish.
>> > >
>> > > At the same time when a node in *separate* JVM executes the received
>> > > *StopGridTask* which *synchronously* calls *G.stop(igniteInstanceName,
>> > > FALSE)* which is waiting for all computing task's finish, including
>> > > *StopGridTask* which has invoked it.
>> > >
>> > > We have some kind of deadlock:
>> > > *Local* node is waiting for the computing task's finish which is
>> > > waiting for finish of execution *G.stop* which is waiting for all
>> > > computing tasks finish including *StopGridTask*.
>> > >
>> > > We have not noticed that before because we use only stopAllGrids() in
>> > > out tests which stop local JVM without waiting for nodes in other
>> > > JVMs.
>> > >
>> > >
>> > >
>> > > On Thu, Mar 15, 2018 at 6:11 PM, Dmitry Pavlov 
>> > > wrote:
>> > > > Please address comments in PR.
>> > > >
>> > > > I did not fully understand why the sync GridStopMessage message was lost, but
>> > > > async will be successful. Probably we need to discuss it briefly.
>> > > >
>> > > > On Thu, 1 Mar 2018 at 12:11, Vyacheslav Daradur wrote:
>> > > > >
>> > > > > Thank you, Dmitry!
>> > > > >
>> > > > > I'll join this review soon.
>> > > > >
>> > > > > On Thu, Mar 1, 2018 at 12:07 PM, Dmitry Pavlov <
>> dpavlov@gmail.com>
>> > > > > wrote:
>> > > > > > Hi Vyacheslav,
>> > > > > >
>> > > > > > I will take a look, but first of all I am going to review
>> > > > > > https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502 - it is an
>> > > > > > impactful change in the testing framework. I hope you will also join this
>> > > > > > review.
>> > > > > >
>> > > > > > Sincerely,
>> > > > > > Dmitriy Pavlov
>> > > > > >
>> > > > > >
>> > > > > > On Thu, 1 Mar 2018 at 11:13, Vyacheslav Daradur <daradu...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > Hi, Dmitry, could you please review it, because you are one of the
>> > > > > > > most experienced people in the testing framework.
>> > > > > > >
>> > > > > > > Please see the comment in Jira, because it is in pretty-format there.
>> > > > > > >
>> > > > > > > On Thu, Feb 22, 2018 at 11:56 AM, Vyacheslav Daradur
>> > > > > > >  wrote:
>> > > > > > > > Hi Igniters!
>> > > > > > > >
>> > > > > > > > I have investigated the issue [1] and found that stopping
>> node in
>> > > > > > > > separate JVM may stuck thread or leave system process alive
>> after
>> > > > > > > > test
>> > > > > > > > finished.
>> > > > > > > > The main reason is *StopGridTask* that we send from node in
>> local
>> > >
>> > > JVM
>> > > > > > > > to node in separate JVM via remote computing.
>> > > > > > > > We send job synchronously to be sure that node will be
>> stopped, but
>> > > > > > > > job calls synchronously

Re: What's about releasing Ignite 2.5 a bit earlier?

2018-03-23 Thread Dmitry Pavlov

Hi Igniters,

There are two tickets I forgot to mention; both are optimisations of Native
Persistence:

Performing data pages IO outside locks
- https://issues.apache.org/jira/browse/IGNITE-7606  - Write removed dirty
page during replacement without holding segment write lock
- https://issues.apache.org/jira/browse/IGNITE-7698  - Page read during
replacement should be outside of segment write lock

Both are resolved, so there will be no delays from their side.
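
Editorial sketch: the general idea behind "IO outside locks" is to keep only the bookkeeping inside the critical section and to perform the slow write after the lock is released. The example below is a toy model under stated assumptions, not the code from IGNITE-7606 or IGNITE-7698: `ToySegment`, `markDirty`, `replace` and the in-memory `store` list are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical page-memory segment; illustrative only, not Ignite's implementation. */
public class ToySegment {
    private final ReentrantReadWriteLock segLock = new ReentrantReadWriteLock();
    private final Map<Long, byte[]> dirtyPages = new HashMap<>();
    /** Stand-in for the page store on disk. */
    private final List<byte[]> store = new ArrayList<>();
    private boolean wroteUnderLock;

    public void markDirty(long pageId, byte[] data) {
        segLock.writeLock().lock();
        try {
            dirtyPages.put(pageId, data.clone());
        }
        finally {
            segLock.writeLock().unlock();
        }
    }

    /** Replaces a dirty page: bookkeeping under the lock, the slow write outside it. */
    public boolean replace(long pageId) {
        byte[] toWrite;

        segLock.writeLock().lock();
        try {
            toWrite = dirtyPages.remove(pageId); // short critical section
        }
        finally {
            segLock.writeLock().unlock();
        }

        if (toWrite == null)
            return false;

        writeToStore(toWrite); // slow I/O does not block other users of the segment

        return true;
    }

    private synchronized void writeToStore(byte[] page) {
        // Record whether the caller still held the segment write lock.
        if (segLock.isWriteLockedByCurrentThread())
            wroteUnderLock = true;

        store.add(page);
    }

    public synchronized int storedPages() { return store.size(); }

    public synchronized boolean wroteUnderLock() { return wroteUnderLock; }
}
```

A real implementation also has to deal with a reader that asks for the page between its removal from the map and the completion of the write; the toy model ignores that case.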

Sincerely,
Dmitriy Pavlov

On Fri, 23 Mar 2018 at 12:06, Ivan Rakov wrote:

> I agree.
> Fix of LOG_ONLY is critical for Ignite durability and can be done quickly.
> Another reason to release 2.5 earlier is fix of
> https://issues.apache.org/jira/browse/IGNITE-7751. It's another critical
> bug that possibly can lead to checkpoint buffer overflow and corruption of
> internal data structures even with Pages Write Throttling enabled.
>
> Best Regards,
> Ivan Rakov
>
> On 22.03.2018 23:20, Dmitry Pavlov wrote:
>
> Hi Igniters,
>
> Why not? Let's release it a bit earlier.
>
> I think it would be good if Ivan will prepare fix for WAL LOG_ONLY:
> http://apache-ignite-developers.2346864.n4.nabble.com/Reconsider-default-WAL-mode-we-need-something-between-LOG-ONLY-and-FSYNC-td28165.html
>  Ivan, what do you think?
>
> Also it would be great if we can complete as many MTCGA tickets as we can.
> Let's do this release with green TC.
>
> Sincerely,
> Dmitriy Pavlov
>
> On Thu, 22 Mar 2018 at 22:08, Nikolay Izhikov wrote:
>
>> Also, we have a new API - ContinuousQueryWithTransformer merged for the 2.5
>> release - [IGNITE-425]
>>
>> On Thu, 22/03/2018 at 21:37 +0300, Nikolay Izhikov wrote:
>> > Hello, guys
>> >
>> > I agree with earlier release.
>> >
>> > I propose to include my task IGNITE-7077 to 2.5 release.
>> >
>> > Valentin, do you have a time slot to review my implementation?
>> >
>> > On Thu, 22/03/2018 at 10:39 -0700, Denis Magda wrote:
>> > > Igniters,
>> > >
>> > > According to our regular schedule, every new Ignite version usually goes
>> > > public once in 3 months. As you remember, the latest 2.4 release, which
>> > > took us 5 months to improve and roll out, was based on the version of the
>> > > source code dated by January.
>> > >
>> > > Since that time the master branch went far ahead and already incorporates
>> > > many valuable fixes and capabilities such as:
>> > >
>> > >    - Fixes provided as a part of "Green Team City" activity.
>> > >    - Persistence: page replacement algorithm and throttling optimizations,
>> > >    out of memory in checkpointing buffer corrections, etc. Alex G and Ivan can
>> > >    shed more light here.
>> > >    - Data loading optimizations for SQL: streaming for JDBC thin driver and
>> > >    copy command
>> > >    - Genetic Algorithms Grid Contribution!
>> > >    - Java Thin Client developed by Alexey Kukushkin
>> > >    - the list goes on and on... Please share the contributions you are
>> > >    ready to release.
>> > >
>> > > So, why don't we go ahead and release the current master and possibly extra
>> > > tickets that are in the review state earlier? What about April 30 as the
>> > > next release date?
>> > >
>> > > --
>> > > Denis
>
>
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Dmitry Pavlov

Hi Ivan,

IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree?

Sincerely,
Dmitriy Pavlov

On Fri, 23 Mar 2018 at 12:23, Ivan Rakov wrote:

> Igniters, there's another important question about this matter.
> Do we want to add extra fsyncs for the BACKGROUND WAL mode? I think that we
> have to do it: it will cause a similar performance drop, but if we
> consider LOG_ONLY broken without these fixes, BACKGROUND is broken as well.
>
> Best Regards,
> Ivan Rakov
>
> On 23.03.2018 10:27, Ivan Rakov wrote:
> > Fixes are quite simple.
> > I expect them to be merged in master in a week in worst case.
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 22.03.2018 17:49, Denis Magda wrote:
> >> Ivan,
> >>
> >> How quick are you going to merge the fix into the master? Many
> >> persistence
> >> related optimizations have already stacked up. Probably, we can release
> >> them sooner if the community agrees.
> >>
> >> --
> >> Denis
> >>
> >> On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov 
> >> wrote:
> >>
> >>> Thanks all!
> >>> We seem to have reached a consensus on this issue. I'll just add
> >>> necessary
> >>> fsyncs under IGNITE-7754.
> >>>
> >>> Best Regards,
> >>> Ivan Rakov
> >>>
> >>>
> >>> On 22.03.2018 15:13, Ilya Lantukh wrote:
> >>>
>  +1 for fixing LOG_ONLY. If current implementation doesn't protect from
>  data
>  corruption, it doesn't make sense.
> 
>  On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda 
>  wrote:
> 
>  +1 for the fix of LOG_ONLY
> > On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
> > alexey.goncha...@gmail.com> wrote:
> >
> > +1 for fixing LOG_ONLY to enforce corruption safety given the
> > provided
> >> performance results.
> >>
> >> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov :
> >>
> >> +1 for accepting drop in LOG_ONLY. 7% is not that much and not a
> >> drop
> >> at
> >> all, provided that we fixing a bug. I.e. should we implement it
> >> correctly
> >> in the first place we would never notice any "drop".
> >>> I do not understand why someone would like to use current broken
> >>> mode.
> >>>
> >>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov
> >>> 
> >>> wrote:
> >>>
> >>> Hi, I think option 1 is better. As Val said any mode that allows
> >>> corruption
> >>>
>  does not make much sense.
> 
>  What Ivan mentioned here as drop, in relation to old mode DEFAULT
> 
> >>> (FSYNC
> > >>> now), is still significant performance boost.
>  Sincerely,
>  Dmitriy Pavlov
> 
>  On Wed, 21 Mar 2018 at 17:56, Ivan Rakov wrote:
> 
>  I've attached benchmark results to the JIRA ticket.
> > We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of
> >
>  WAL
> >> compaction enabled flag. It's pretty significant drop: WAL
>  compaction
> >> itself gives only ~3% drop.
> > I see two options here:
> > 1) Change LOG_ONLY behavior. That implies that we'll be ready to
> >
>  release
>  AI 2.5 with 7% drop.
> > 2) Introduce LOG_ONLY_SAFE, make it default, add release note
> > to AI
> >
>  2.5
> >>> that we added power loss durability in default mode, but user may
> > fallback to previous LOG_ONLY in order to retain performance.
> >
> > Thoughts?
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 20.03.2018 16:00, Ivan Rakov wrote:
> >
> >> Val,
> >>
> >> If a storage is in
> >>> corrupted state, does it mean that it needs to be completely
> >>>
> >> removed
> >>> and
> > cluster needs to be restarted without data?
> >> Yes, there's a chance that in LOG_ONLY all local data will be
> >>
> > lost,
> >> but only in *power loss**/ OS crash* case.
> >> kill -9, JVM crash, death of critical system thread and all
> >> other
> >> cases that usually take place are variations of *process crash*.
> >>
> > All
> >>> WAL modes (except NONE, of course) ensure corruption-safety in
> > case
> >> of
>  process crash.
> >> If so, I'm not sure any mode
> >>> that allows corruption makes much sense to me.
> >>>
> >> It depends on performance impact of enforcing power-loss
> >>
> > corruption
> >> safety. Price of full protection from power loss is high - FSYNC
> > is
> >> way slower (2-10 times) than other WAL modes. The question is
> > whether
> >>> ensuring weaker guarantees (corruption can't happen, but loss of
> > last
> >>> updates can) will affect performance as badly

[GitHub] ignite pull request #3688: check if changing finder helps to fix the test ha...

2018-03-23 Thread sergey-chugunov-1985

GitHub user sergey-chugunov-1985 opened a pull request:

https://github.com/apache/ignite/pull/3688

check if changing finder helps to fix the test hangs



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite 
master-check_dyno_columns_tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3688.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3688


commit 07f28161e3cafc08bdad6b63f9773af47c64284f
Author: Sergey Chugunov 
Date:   2018-03-23T09:26:36Z

check if changing finder helps to fix the test hangs




---

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Ivan Rakov


Igniters, there's another important question about this matter.
Do we want to add extra fsyncs for the BACKGROUND WAL mode? I think that we
have to do it: it will cause a similar performance drop, but if we
consider LOG_ONLY broken without these fixes, BACKGROUND is broken as well.
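
Editorial sketch: the trade-off discussed in this thread (a write that only reaches the OS buffer survives a process crash but not necessarily a power loss, while an fsync costs performance) can be illustrated with plain `java.nio`. This is a toy append-only log under stated assumptions, not Ignite's WAL: `ToyWal` and `append` are hypothetical names, and the `fsync` flag only marks where an extra `FileChannel.force` call would sit.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: a toy append-only log, not Ignite's WAL implementation. */
public class ToyWal {
    /**
     * Appends a record to the log file.
     *
     * @param fsync if true, ask the OS to flush the data to the storage device
     *              before returning; if false, the data may still sit in the
     *              OS page cache when this method returns.
     */
    public static void append(Path log, String record, boolean fsync) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));

            while (buf.hasRemaining())
                ch.write(buf);

            if (fsync)
                ch.force(true); // flush file content and metadata to the device
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("toy-wal", ".log");

        append(log, "update-1", false); // cheap: may be lost on power loss
        append(log, "update-2", true);  // slower: flushed before returning

        System.out.println(Files.readAllLines(log));

        Files.delete(log);
    }
}
```

Both calls look identical to a reader of the file while the machine is up; the difference only shows after an OS crash or power loss, which is why the cost has to be measured with benchmarks as done in this thread.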


Best Regards,
Ivan Rakov

On 23.03.2018 10:27, Ivan Rakov wrote:

Fixes are quite simple.
I expect them to be merged in master in a week in worst case.

Best Regards,
Ivan Rakov

On 22.03.2018 17:49, Denis Magda wrote:

Ivan,

How quick are you going to merge the fix into the master? Many 
persistence

related optimizations have already stacked up. Probably, we can release
them sooner if the community agrees.

--
Denis

On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov  
wrote:



Thanks all!
We seem to have reached a consensus on this issue. I'll just add 
necessary

fsyncs under IGNITE-7754.

Best Regards,
Ivan Rakov


On 22.03.2018 15:13, Ilya Lantukh wrote:


+1 for fixing LOG_ONLY. If current implementation doesn't protect from
data
corruption, it doesn't make sense.

On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda  
wrote:


+1 for the fix of LOG_ONLY

On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

+1 for fixing LOG_ONLY to enforce corruption safety given the 
provided

performance results.

2018-03-21 18:20 GMT+03:00 Vladimir Ozerov :

+1 for accepting drop in LOG_ONLY. 7% is not that much and not a 
drop

at
all, provided that we fixing a bug. I.e. should we implement it
correctly
in the first place we would never notice any "drop".
I do not understand why someone would like to use current broken 
mode.


On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov 


wrote:

Hi, I think option 1 is better. As Val said any mode that allows
corruption


does not make much sense.

What Ivan mentioned here as drop, in relation to old mode DEFAULT


(FSYNC
now), is still significant performance boost.

Sincerely,
Dmitriy Pavlov

On Wed, 21 Mar 2018 at 17:56, Ivan Rakov wrote:

I've attached benchmark results to the JIRA ticket.

We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of


WAL

compaction enabled flag. It's pretty significant drop: WAL

compaction

itself gives only ~3% drop.

I see two options here:
1) Change LOG_ONLY behavior. That implies that we'll be ready to


release
AI 2.5 with 7% drop.
2) Introduce LOG_ONLY_SAFE, make it default, add release note 
to AI



2.5

that we added power loss durability in default mode, but user may

fallback to previous LOG_ONLY in order to retain performance.

Thoughts?

Best Regards,
Ivan Rakov

On 20.03.2018 16:00, Ivan Rakov wrote:


Val,

If a storage is in

corrupted state, does it mean that it needs to be completely


removed

and

cluster needs to be restarted without data?

Yes, there's a chance that in LOG_ONLY all local data will be


lost,

but only in *power loss**/ OS crash* case.
kill -9, JVM crash, death of critical system thread and all 
other

cases that usually take place are variations of *process crash*.


All

WAL modes (except NONE, of course) ensure corruption-safety in

case

of

process crash.

If so, I'm not sure any mode

that allows corruption makes much sense to me.


It depends on performance impact of enforcing power-loss


corruption

safety. Price of full protection from power loss is high - FSYNC

is

way slower (2-10 times) than other WAL modes. The question is

whether

ensuring weaker guarantees (corruption can't happen, but loss of

last

updates can) will affect performance as badly as strong

guarantees.

I'll share benchmark results soon.

Best Regards,
Ivan Rakov

On 20.03.2018 5:09, Valentin Kulichenko wrote:


Guys,

What do we understand under "data corruption" here? If a 
storage



is

in


corrupted state, does it mean that it needs to be completely

removed

and

cluster needs to be restarted without data? If so, I'm not sure

any

mode

that allows corruption makes much sense to me. How am I supposed

to

use a
database, if virtually any failure can end with complete 
loss of



data?

In any case, this definitely should not be a default behavior.

If

user ever

switches to corruption-unsafe mode, there should be a clear


warning

about

this.

-Val

On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <


ivan.glu...@gmail.com>

wrote:

Ticket to track changes:

https://issues.apache.org/jira/browse/IGNITE-7754

Best Regards,
Ivan Rakov


On 16.03.2018 10:58, Dmitriy Setrakyan wrote:

On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <
ivan.glu...@gmail.com

wrote:

Vladimir,


Unlike BACKGROUND, LOG_ONLY provides strict write guarantees
unless power
loss has happened.
Seems like we need to measure performance difference to


decide

whether do

we need separate WAL mode. If it will be invisible, we'll


just

fix

these
bugs without introducing new mode; if it will be 
perceptible,



we'll

continue the

[jira] [Created] (IGNITE-8029) Web console: wrong behaviour of cluster activation component

2018-03-23 Thread Pavel Konstantinov (JIRA)

Pavel Konstantinov created IGNITE-8029:
--

 Summary: Web console: wrong behaviour of cluster activation 
component
 Key: IGNITE-8029
 URL: https://issues.apache.org/jira/browse/IGNITE-8029
 Project: Ignite
  Issue Type: Bug
Reporter: Pavel Konstantinov
Assignee: Alexey Kuznetsov


I've noticed that during activation the text 'Deactivation' is shown for ~0.5
seconds and, vice versa, during deactivation the text 'Activation' is shown.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: Test 150 clients migrated to `Cache [6]` suite

2018-03-23 Thread Petr Ivanov

Nice work!

Yet, I have to ask to keep the '~[Obsolete] 150 Clients' build configuration for at
least a month from its latest run, until its build history is removed by autoclean;
it would be nice to keep that test history.

> On 23 Mar 2018, at 11:53, Дмитрий Рябов  wrote:
> 
> Hello Igniters!
> 
> 
> 
> I migrated test `IgniteCache150ClientsTest` from `150 Clients` suite to `
> Cache [6]`, because `150 Clients` contains only one test class and makes
> ~6min excess Ignite build.
> 
> 
> 
> So, test suites `150 Clients` and `~[Obsolete] 150 Clients` will be deleted
> soon.

Re: What's about releasing Ignite 2.5 a bit earlier?

2018-03-23 Thread Ivan Rakov


I agree.
Fix of LOG_ONLY is critical for Ignite durability and can be done quickly.
Another reason to release 2.5 earlier is fix of 
https://issues.apache.org/jira/browse/IGNITE-7751. It's another critical 
bug that possibly can lead to checkpoint buffer overflow and corruption 
of internal data structures even with Pages Write Throttling enabled.


Best Regards,
Ivan Rakov

On 22.03.2018 23:20, Dmitry Pavlov wrote:

Hi Igniters,

Why not? Let's release it a bit earlier.

I think it would be good if Ivan will prepare fix for WAL LOG_ONLY: 
http://apache-ignite-developers.2346864.n4.nabble.com/Reconsider-default-WAL-mode-we-need-something-between-LOG-ONLY-and-FSYNC-td28165.html 
 Ivan, what do you think?


Also it would be great if we can complete as many MTCGA tickets as we
can. Let's do this release with green TC.


Sincerely,
Dmitriy Pavlov

On Thu, 22 Mar 2018 at 22:08, Nikolay Izhikov wrote:

Also, we have a new API - ContinuousQueryWithTransformer merged for the
2.5 release - [IGNITE-425]

On Thu, 22/03/2018 at 21:37 +0300, Nikolay Izhikov wrote:
> Hello, guys
>
> I agree with earlier release.
>
> I propose to include my task IGNITE-7077 to 2.5 release.
>
> Valentin, do you have a time slot to review my implementation?
>
> On Thu, 22/03/2018 at 10:39 -0700, Denis Magda wrote:
> > Igniters,
> >
> > According to our regular schedule, every new Ignite version usually goes
> > public once in 3 months. As you remember, the latest 2.4 release, which
> > took us 5 months to improve and roll out, was based on the version of the
> > source code dated by January.
> >
> > Since that time the master branch went far ahead and already incorporates
> > many valuable fixes and capabilities such as:
> >
> >    - Fixes provided as a part of "Green Team City" activity.
> >    - Persistence: page replacement algorithm and throttling optimizations,
> >    out of memory in checkpointing buffer corrections, etc. Alex G and Ivan can
> >    shed more light here.
> >    - Data loading optimizations for SQL: streaming for JDBC thin driver and
> >    copy command
> >    - Genetic Algorithms Grid Contribution!
> >    - Java Thin Client developed by Alexey Kukushkin
> >    - the list goes on and on... Please share the contributions you are
> >    ready to release.
> >
> > So, why don't we go ahead and release the current master and possibly extra
> > tickets that are in the review state earlier? What about April 30 as the
> > next release date?
> >
> > --
> > Denis

Test 150 clients migrated to `Cache [6]` suite

2018-03-23 Thread Дмитрий Рябов

Hello Igniters!



I migrated the test `IgniteCache150ClientsTest` from the `150 Clients` suite to
`Cache [6]`, because `150 Clients` contains only one test class and costs an
extra ~6 min Ignite build.

So, the test suites `150 Clients` and `~[Obsolete] 150 Clients` will be deleted
soon.

Re: Apache Ignite nightly release builds

2018-03-23 Thread Dmitriy Setrakyan

Awesome! Finally instead of asking our users to build from the master, we
can provide a link to the nightly build instead.

Denis, can you please add these links to the website?

D.

On Thu, Mar 22, 2018 at 1:27 PM, Petr Ivanov  wrote:

> It works, thanks!
>
>
> Here are the updated links for Artifacts and Changes respectively, with silent
> guest login (can be added to bookmarks):
> * https://ci.ignite.apache.org/viewLog.html?buildId=
> lastSuccessful=Releases_NightlyRelease_
> RunApacheIgniteNightlyRelease=artifacts=1
> * https://ci.ignite.apache.org/viewLog.html?buildId=
> lastSuccessful=Releases_NightlyRelease_
> RunApacheIgniteNightlyRelease=buildChangesDiv=1
>
>
>
> > On 22 Mar 2018, at 13:06, Vitaliy Osipov  wrote:
> >
> > 
>
>

Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC

2018-03-23 Thread Ivan Rakov


Fixes are quite simple.
I expect them to be merged in master in a week in worst case.

Best Regards,
Ivan Rakov

On 22.03.2018 17:49, Denis Magda wrote:

Ivan,

How quick are you going to merge the fix into the master? Many persistence
related optimizations have already stacked up. Probably, we can release
them sooner if the community agrees.

--
Denis

On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov  wrote:


Thanks all!
We seem to have reached a consensus on this issue. I'll just add necessary
fsyncs under IGNITE-7754.

Best Regards,
Ivan Rakov


On 22.03.2018 15:13, Ilya Lantukh wrote:


+1 for fixing LOG_ONLY. If current implementation doesn't protect from
data
corruption, it doesn't make sense.

On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda  wrote:

+1 for the fix of LOG_ONLY

On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

+1 for fixing LOG_ONLY to enforce corruption safety given the provided

performance results.

2018-03-21 18:20 GMT+03:00 Vladimir Ozerov :

+1 for accepting drop in LOG_ONLY. 7% is not that much and not a drop
at
all, provided that we fixing a bug. I.e. should we implement it
correctly
in the first place we would never notice any "drop".

I do not understand why someone would like to use current broken mode.

On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov 
wrote:

Hi, I think option 1 is better. As Val said any mode that allows
corruption


does not make much sense.

What Ivan mentioned here as drop, in relation to old mode DEFAULT


(FSYNC
now), is still significant performance boost.

Sincerely,
Dmitriy Pavlov

On Wed, 21 Mar 2018 at 17:56, Ivan Rakov wrote:

I've attached benchmark results to the JIRA ticket.

We observe ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of


WAL

compaction enabled flag. It's pretty significant drop: WAL

compaction

itself gives only ~3% drop.

I see two options here:
1) Change LOG_ONLY behavior. That implies that we'll be ready to


release
AI 2.5 with 7% drop.

2) Introduce LOG_ONLY_SAFE, make it default, add release note to AI


2.5

that we added power loss durability in default mode, but user may

fallback to previous LOG_ONLY in order to retain performance.

Thoughts?

Best Regards,
Ivan Rakov

On 20.03.2018 16:00, Ivan Rakov wrote:


Val,

If a storage is in

corrupted state, does it mean that it needs to be completely


removed

and

cluster needs to be restarted without data?

Yes, there's a chance that in LOG_ONLY all local data will be lost, but
only in the *power loss / OS crash* case.

kill -9, a JVM crash, the death of a critical system thread and all other
cases that usually take place are variations of *process crash*. All WAL
modes (except NONE, of course) ensure corruption safety in case of a
process crash.
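
To illustrate the process crash vs. power loss distinction above, here is a minimal sketch using plain java.nio. It is not Ignite's WAL code; the class name WalAppendSketch and the append helper are made up for illustration. Bytes passed to FileChannel.write are handed to the OS, so they generally survive the death of the writing process, while FileChannel.force(true) additionally asks the OS to flush them to the storage device, which is what matters on power loss.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not taken from the Ignite code base.
public class WalAppendSketch {
    /**
     * Appends one record to the channel and returns the number of bytes written.
     * With fsync == false the bytes only reach the OS (enough for a process crash).
     * With fsync == true they are also forced to the device (needed for power loss).
     */
    public static int append(FileChannel ch, String record, boolean fsync) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
        int written = 0;

        // write() may be partial, so loop until the buffer is drained.
        while (buf.hasRemaining())
            written += ch.write(buf);

        if (fsync)
            ch.force(true); // flush file content and metadata to the storage device

        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".log");

        try (FileChannel ch = FileChannel.open(file,
            StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            append(ch, "update-1\n", false); // handed to the OS only
            append(ch, "update-2\n", true);  // handed to the OS and forced to the device
        }

        System.out.println(Files.readAllLines(file));
        Files.delete(file);
    }
}
```

The force call is the expensive part, which is why the number and placement of such calls determines the cost of each WAL mode.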

> If so, I'm not sure any mode that allows corruption makes much sense to me.


It depends on the performance impact of enforcing power-loss corruption
safety. The price of full protection from power loss is high: FSYNC is way
slower (2-10 times) than other WAL modes. The question is whether ensuring
weaker guarantees (corruption can't happen, but loss of the last updates
can) will affect performance as badly as strong guarantees.

I'll share benchmark results soon.
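
The guarantees discussed in this message can be summarized as a small lookup. This is only a sketch of how the thread describes the modes: the class WalGuaranteeSketch, its enums and the LOG_ONLY_CURRENT / LOG_ONLY_FIXED names are hypothetical and are not Ignite API. BACKGROUND and NONE are left out because this message does not spell out their behavior.

```java
// Hypothetical model of the guarantees as described in this thread; not Ignite API.
public class WalGuaranteeSketch {
    public enum Failure { PROCESS_CRASH, POWER_LOSS }

    public enum Outcome { NO_LOSS, RECENT_UPDATES_MAY_BE_LOST, STORAGE_MAY_BE_CORRUPTED }

    public enum Mode {
        FSYNC,            // full protection, reported as 2-10 times slower
        LOG_ONLY_CURRENT, // behavior before the fix tracked in IGNITE-7754
        LOG_ONLY_FIXED    // proposed behavior: weaker guarantee, no corruption
    }

    /** Returns the worst outcome the thread attributes to a mode for a given failure. */
    public static Outcome outcome(Mode mode, Failure failure) {
        // The modes modeled here are described as safe against a process crash.
        if (failure == Failure.PROCESS_CRASH)
            return Outcome.NO_LOSS;

        switch (mode) {
            case FSYNC:
                return Outcome.NO_LOSS;
            case LOG_ONLY_FIXED:
                return Outcome.RECENT_UPDATES_MAY_BE_LOST;
            default:
                return Outcome.STORAGE_MAY_BE_CORRUPTED;
        }
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values())
            for (Failure f : Failure.values())
                System.out.println(m + " / " + f + " -> " + outcome(m, f));
    }
}
```

The open question in the thread is how much it costs to move LOG_ONLY from the last row to the middle one.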

Best Regards,
Ivan Rakov

On 20.03.2018 5:09, Valentin Kulichenko wrote:


Guys,

What do we mean by "data corruption" here? If a storage is in a corrupted
state, does it mean that it needs to be completely removed and the cluster
needs to be restarted without data? If so, I'm not sure any mode that allows
corruption makes much sense to me. How am I supposed to use a database if
virtually any failure can end with complete loss of data?

In any case, this definitely should not be the default behavior. If a user
ever switches to a corruption-unsafe mode, there should be a clear warning
about this.

-Val

On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glu...@gmail.com> wrote:

Ticket to track changes:
https://issues.apache.org/jira/browse/IGNITE-7754

Best Regards,
Ivan Rakov


On 16.03.2018 10:58, Dmitriy Setrakyan wrote:

On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glu...@gmail.com> wrote:

Vladimir,


Unlike BACKGROUND, LOG_ONLY provides strict write guarantees unless power
loss has happened.
It seems we need to measure the performance difference to decide whether we
need a separate WAL mode. If the difference is invisible, we'll just fix
these bugs without introducing a new mode; if it is perceptible, we'll
continue the discussion about introducing LOG_ONLY_SAFE.

Makes sense?

Yes, this sounds like the right approach.

[GitHub] ignite pull request #3687: IGNITE-8025 runMultiThreadedAsync cancellation fi...

2018-03-23 Thread andrey-kuznetsov

GitHub user andrey-kuznetsov opened a pull request:

https://github.com/apache/ignite/pull/3687

IGNITE-8025 runMultiThreadedAsync cancellation fix.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andrey-kuznetsov/ignite ignite-8025

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3687.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3687


commit 2b6dd75e80a1b59ec04c2a5a8d9cf147b736822d
Author: Andrey Kuznetsov 
Date:   2018-03-23T06:46:41Z

IGNITE-8025 runMultiThreadedAsync cancellation fix.




---
