date:20191004

[jira] [Created] (IGNITE-12262) Implement an Apache Camel Data Streamer

2019-10-04 Thread Emmanouil Gkatziouras (Jira)

Emmanouil Gkatziouras created IGNITE-12262:
--

 Summary: Implement an Apache Camel Data Streamer
 Key: IGNITE-12262
 URL: https://issues.apache.org/jira/browse/IGNITE-12262
 Project: Ignite
  Issue Type: New Feature
  Components: streaming
Affects Versions: 2.7.6
Reporter: Emmanouil Gkatziouras
Assignee: Emmanouil Gkatziouras


A Pub/Sub data streamer would help GCP users consume data and feed it
into an Ignite cache.

This data streamer will instantiate a Pub/Sub consumer endpoint.

The user will specify and apply a StreamTransformer to the incoming Exchange,
which shall add the result to the data streamer.

The streamer will register as a subscriber and will listen for incoming
messages.
The same subscriber ID shall be used for all nodes.
Only one subscriber/node will process an incoming message, instead of every
subscriber receiving the same message.
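
The delivery semantics described above (one shared subscriber ID, each message handled by exactly one node) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: it uses no Pub/Sub or Ignite classes, and `SharedSubscriptionSketch`, `deliver`, and the in-memory queue and map are hypothetical stand-ins for the real subscription and cache.

```java
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of several nodes consuming from one shared subscription. */
final class SharedSubscriptionSketch {
    /**
     * Lets {@code nodes} simulated nodes pull from one shared queue and returns,
     * per message, how many times it was processed.
     */
    static Map<String, Integer> deliver(List<String> messages, int nodes) {
        // Stand-in for the shared subscription: every node pulls from the same queue.
        Queue<String> subscription = new ConcurrentLinkedQueue<>(messages);
        // Stand-in for the cache fed by the streamer: message -> processing count.
        Map<String, Integer> processed = new ConcurrentHashMap<>();

        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int i = 0; i < nodes; i++) {
            pool.execute(() -> {
                String msg;
                // poll() hands each queued element to a single caller only.
                while ((msg = subscription.poll()) != null)
                    processed.merge(msg, 1, Integer::sum);
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS))
                throw new IllegalStateException("simulated nodes did not finish in time");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }

    public static void main(String[] args) {
        // Every distinct message should show a count of 1, whichever node handled it.
        System.out.println(deliver(List.of("m1", "m2", "m3", "m4"), 3));
    }
}
```

With distinct messages, every count in the returned map is 1, which is the "only one subscriber/node will process an incoming message" property. Which node handles which message is not deterministic.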



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: Replacing default work dir from tmp to current dir

2019-10-04 Thread Denis Magda

Ilya, thanks for the ticket.

However, I would advise us to enforce "user.home" as the only default
instead of the proposed fallback mechanism. There are already a lot of "if else
if else if else" scenarios in Ignite. Let's focus on simplicity and stick
to one possible option when it works. This case is an example where one
option is doable and preferred.
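
The options discussed in this thread (a single "user.home" default versus a fallback chain) differ only in the list of candidate base directories. A minimal sketch, assuming the work directory is `ignite/work` under the chosen base as mentioned further down the thread; `WorkDirSketch` and its methods are hypothetical names, not Ignite code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper, not part of Ignite. */
final class WorkDirSketch {
    /** Returns the first "base/ignite/work" directory that can be created and written to. */
    static Optional<Path> firstUsable(List<Path> baseDirs) {
        for (Path base : baseDirs) {
            Path work = base.resolve("ignite").resolve("work");
            try {
                Files.createDirectories(work);
                if (Files.isWritable(work))
                    return Optional.of(work);
            } catch (IOException | SecurityException e) {
                // Base is not usable (for example a read-only location): try the next candidate.
            }
        }
        return Optional.empty();
    }

    /** Single-default policy: "user.home" only. */
    static List<Path> homeOnly() {
        return List.of(Paths.get(System.getProperty("user.home")));
    }

    /** Fallback policy: "user.home" first, then "user.dir". */
    static List<Path> homeThenCurrentDir() {
        return List.of(Paths.get(System.getProperty("user.home")),
                       Paths.get(System.getProperty("user.dir")));
    }

    public static void main(String[] args) throws IOException {
        // Uses a scratch directory so that running the sketch does not touch the home dir.
        Path scratch = Files.createTempDirectory("workdir-sketch");
        System.out.println(firstUsable(List.of(scratch)));
    }
}
```

Passing `homeOnly()` to `firstUsable` gives the single-default behaviour with no branching, while `homeThenCurrentDir()` gives the fallback variant. The trade-off debated here is exactly whether the extra candidate is worth the extra code path.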

Btw, sounds like 2.7.7?

-
Denis


On Fri, Oct 4, 2019 at 8:34 AM Ilya Kasnacheev 
wrote:

> Hello!
>
> I have created https://issues.apache.org/jira/browse/IGNITE-12260
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Fri, Oct 4, 2019 at 10:16, Ivan Pavlukhin wrote:
>
> > Interesting things about those LINQPad/JPad scenarios. I was not aware
> > of them. I still have some doubts about applicability. It seems to me that JPad
> > having its work dir in "Program Files" has a lot of problems by itself,
> > e.g. a user is not able to run basic file IO snippets with relative
> > file paths.
> >
> > On Thu, Oct 3, 2019 at 23:24, Pavel Tupitsyn wrote:
> > >
> > > Ilya, fallback is a good idea.
> > > Still I'd prefer to have user.home as a default, and fall back to user.dir
> > > when home does not work for some reason.
> > >
> > > On Thu, Oct 3, 2019 at 11:07 PM Ilya Kasnacheev <
> > ilya.kasnach...@gmail.com>
> > > wrote:
> > >
> > > > Hello!
> > > >
> > > > We can try and fallback to home dir with warning, when file cannot be
> > > > created in current dir.
> > > >
> > > > WDYT?
> > > >
> > > > Regards,
> > > > --
> > > > Ilya Kasnacheev
> > > >
> > > >
> > > > On Thu, Oct 3, 2019 at 20:05, Pavel Tupitsyn wrote:
> > > >
> > > > > > Cannot tell about NuGet. Maven is typically used during development,
> > > > > > usually there is no Maven in production deployments.
> > > > > NuGet and Maven are very similar. Yes, both of them are build-time tools,
> > > > > production is unrelated.
> > > > > For production-ready deployments we can expect users to tweak Ignite to
> > > > > their needs, set custom storage dirs, adjust heap sizes and so on.
> > > > >
> > > > > I'm talking about new users, about "getting started" scenarios -
> > > > > it is super important to make Ignite easy to get started with, provide
> > > > > reasonable defaults for all the configuration properties.
> > > > >
> > > > > For Ignite.NET, LINQPad is one of those "get started in 2 clicks"
> > > > > scenarios. And this scenario got broken as explained above.
> > > > > 2.7.5 and earlier used temp dir, which worked. 2.7.6 fails: "Work directory
> > > > > does not exist and cannot be created: C:\Program Files\LINQPad5\ignite\work"
> > > > >
> > > > > For Java there is JPad, which will fail in the same way - when you run code
> > > > > from there, `user.dir` points to Program Files.
> > > > >
> > > > > I expect that there are more use cases like this, and `user.home` is a
> > > > > reasonable solution.
> > > > >
> > > > > On Thu, Oct 3, 2019 at 5:56 PM Ilya Kasnacheev <
> > > > ilya.kasnach...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello!
> > > > > >
> > > > > > I want to point out that I didn't change this location (current dir). It
> > > > > > was already implemented when I raised this issue, the only change I did was
> > > > > > to swap current dir/work to current dir/ignite/work to avoid confusion
> > > > > > about whose work dir that is.
> > > > > >
> > > > > > I also communicated this to you all in ML when I discovered that current
> > > > > > dir is used.
> > > > > >
> > > > > > I think that current dir is actually *well defined* when starting a
> > > > > > project. A project is expected to be started from the same dir, and all
> > > > > > "Run..." dialogs usually allow specifying that one.
> > > > > >
> > > > > > For embedded scenarios, you definitely do not want work dirs from two different
> > > > > > Ignite-using tools to interfere. For embedded scenarios, you should only
> > > > > > expect that current dir is writable.
> > > > > >
> > > > > > Even after these considerations, it's too late to change that because
> > > > > > people don't expect this dir to move with every release of Ignite, and we
> > > > > > already did it once.
> > > > > >
> > > > > > Regards,
> > > > > > --
> > > > > > Ilya Kasnacheev
> > > > > >
> > > > > >
> > > > > > On Thu, Oct 3, 2019 at 17:34, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> > > > > >
> > > > > > > >
> > > > > > > > Seems, we should have different defaults and even distributions for
> > > > > > > > different usage scenarios.
> > > > > > > >
> > > > > > > I still do not understand why defaults should be different for embedded and
> > > > > > > "traditional RDBMS-like" installations. Having different defaults will
> > > > > > > likely confuse users, not make usability easier. Personally, I would forbid
> > > > > > > to start Ignite if IGNITE_HOME is not set, but this suggestion was
> >

Re: Getting involved in Apache Ignite

2019-10-04 Thread Denis Magda

Emmanouil,

Thanks for reaching out to us! It's great to have you as a contributor to
Ignite. More integrations with the cloud ecosystem are an invaluable
addition to the project. Ping me or anybody else within the community if
you have any challenges with Ignite internals.

Please feel free to start a dev list discussion explaining the
integration in detail. Once everybody is on the same page you can proceed
with a ticket creation and carry on with the development ;) One thing to
note: previously we tended to add all the integrations to the primary Ignite
repository and release bundle. This approach doesn't scale well and is not
sustainable: Ignite is getting enormous, it's difficult to update
and release integrations independently, there are version conflicts, etc. Thus, the
community is planning to proceed with the modularization:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-36%3A+Modularization

Your integrations can trigger the decoupling of AWS and GCE specifics from
the Ignite core into separate modules. Those modules will still be part of
ASF and belong to the Ignite community, but they will be stored separately and can
always be released independently.

What's your thinking?

-
Denis

On Fri, Oct 4, 2019 at 12:59 AM Emmanouil Gkatziouras 
wrote:

> Greetings,
>
> I am amazed by Apache Ignite and its features!
> For my use case, integrating with Google Cloud Pub/Sub and Amazon SQS would
> help me get the most out of it.
>
> Since developing those streamers is something I would do in any case, I
> would like to get involved in your project, give back to the
> project, and make those features available to the community.
>
> I have contributed to projects such as the InfluxDB Java Driver and
> Alpakka.
> Part of my everyday work has to do with implementing solutions in the
> cloud, thus I can contribute to the streaming solutions that have to do
> with cloud providers, particularly GCP Pub/Sub and AWS SQS, as well
> as other cloud-based messaging systems such as Azure Storage Queues.
> I would also like to propose adding a streamer implementation for cache
> invalidation, as I have some use cases in need of it.
>
> You can find me on LinkedIn (link in my signature) and get to know my
> background a little more.
>
> Thanks for your great work so far!
> Regards,
>
> *Emmanouil Gkatziouras*
> https://egkatzioura.com/ |
> https://www.linkedin.com/in/gkatziourasemmanouil/
> https://github.com/gkatzioura
>

[jira] [Created] (IGNITE-12261) Issue with adding nested index dynamically

2019-10-04 Thread Hemambara (Jira)

Hemambara created IGNITE-12261:
--

 Summary:   Issue with adding nested index dynamically 
 Key: IGNITE-12261
 URL: https://issues.apache.org/jira/browse/IGNITE-12261
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.7.6
Reporter: Hemambara


[http://apache-ignite-users.70518.x6.nabble.com/Issue-with-adding-nested-index-dynamically-tt29571.html]

 




Re: Getting involved in Apache Ignite

2019-10-04 Thread Emmanouil Gkatziouras

Thanks!

Much appreciated!

Regards,

*Emmanouil Gkatziouras*
https://egkatzioura.com/ | https://www.linkedin.com/in/gkatziourasemmanouil/
https://github.com/gkatzioura


On Fri, 4 Oct 2019 at 16:25, Ilya Kasnacheev 
wrote:

> Hello!
>
> I have added you to contributors, now you can assign issues to yourself.
> Please familiarise yourself with
> https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
>
> Feel free to create new issues or even IEPs!
> https://cwiki.apache.org/confluence/display/IGNITE/Active+Proposals
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Fri, Oct 4, 2019 at 18:13, Emmanouil Gkatziouras wrote:
>
> > Greetings!
> >
> > Thank you for your response.
> > This is my JIRA login
> > https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gkatzioura
> > Once you add me to the contributors, should I create the tickets I described
> > in the initial mail?
> >
> > Kind regards,
> > Emmanouil
> >
> > *Emmanouil Gkatziouras*
> > https://egkatzioura.com/ |
> > https://www.linkedin.com/in/gkatziourasemmanouil/
> > https://github.com/gkatzioura
> >
> >
> > On Fri, 4 Oct 2019 at 13:33, Ilya Kasnacheev 
> > wrote:
> >
> > > Hello!
> > >
> > > Do you have an Apache JIRA login? If not, please create one and share it
> > > with us so that you can be added to contributors.
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> >
>

Re: ApacheCon Europe 2019 talks which are relevant to Apache Ignite

2019-10-04 Thread Denis Magda

Nice! Alexey, I completely forgot that you are going as a speaker. Kseniya,
could you help to update our events page with all the upcoming Ignite
events? https://ignite.apache.org/events.html

-
Denis


On Fri, Oct 4, 2019 at 10:10 AM Alexey Zinoviev 
wrote:

> Dear @myrle the link on my talk about ML is incorrect
> This is a correct link
>
> https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0
>
>
> On Fri, Oct 4, 2019 at 19:59, Denis Magda wrote:
>
> > Igniters,
> >
> > Is anybody planning to visit the event this year? Unfortunately, I won't be
> > able to make it in Berlin. However, the one we had in Las Vegas this fall
> > was amazing. So, highly recommended.
> >
> > -
> > Denis
> >
> >
> > On Fri, Oct 4, 2019 at 9:45 AM  wrote:
> >
> > > Dear Apache Ignite committers,
> > >
> > > In a little over 2 weeks time, ApacheCon Europe is taking place in
> > > Berlin. Join us from October 22 to 24 for an exciting program and lovely
> > > get-together of the Apache Community.
> > >
> > > We are also planning a hackathon.  If your project is interested in
> > > participating, please enter yourselves here:
> > > https://cwiki.apache.org/confluence/display/COMDEV/Hackathon
> > >
> > > The following talks should be especially relevant for you:
> > >
> > >   * https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0
> > >   * https://aceu19.apachecon.com/session/patterns-and-anti-patterns-running-apache-bigdata-projects-kubernetes
> > >     <https://aceu19.apachecon.com/session/fast-federated-sql-apache-calcite>
> > >   * https://aceu19.apachecon.com/session/fast-federated-sql-apache-calcite
> > >     <https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern>
> > >   * https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern
> > >     <https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source>
> > >   * https://aceu19.apachecon.com/session/data-driven-aiml-solutions-apache-software
> > >     <https://aceu19.apachecon.com/session/apache-beam-running-big-data-pipelines-python-and-go-spark>
> > >   * https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source
> > >
> > > Furthermore there will be a whole conference track on community topics:
> > > Learn how to motivate users to contribute patches, how the board of
> > > directors works, how to navigate the Incubator and much more: ApacheCon
> > > Europe 2019 Community track <
> > > https://aceu19.apachecon.com/sessions?track=42>
> > >
> > > Tickets are available here –
> > > for Apache Committers we offer discounted tickets.  Prices will be going
> > > up on October 7th, so book soon.
> > >
> > > Please also help spread the word and make ApacheCon Europe 2019 a success!
> > >
> > > We’re looking forward to welcoming you at #ACEU19!
> > >
> > > Best,
> > >
> > > Your ApacheCon team
> > >
> > >
> >
>

Re: ApacheCon Europe 2019 talks which are relevant to Apache Ignite

2019-10-04 Thread Myrle Krantz

Thank you Alexey,

I've corrected it in the remaining mailings that I'm still sending out.

I look forward to seeing you in Berlin!

Best,
Myrle

On Fri, Oct 4, 2019 at 7:10 PM Alexey Zinoviev 
wrote:

> Dear @myrle the link on my talk about ML is incorrect
> This is a correct link
>
> https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0
>
>
> On Fri, Oct 4, 2019 at 19:59, Denis Magda wrote:
>
>> Igniters,
>>
>> Is anybody planning to visit the event this year? Unfortunately, I won't be
>> able to make it in Berlin. However, the one we had in Las Vegas this fall
>> was amazing. So, highly recommended.
>>
>> -
>> Denis
>>
>>
>

Re: ApacheCon Europe 2019 talks which are relevant to Apache Ignite

2019-10-04 Thread Alexey Zinoviev

Dear @myrle, the link to my talk about ML is incorrect.
This is the correct link:
https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0


On Fri, Oct 4, 2019 at 19:59, Denis Magda wrote:

> Igniters,
>
> Is anybody planning to visit the event this year? Unfortunately, I won't be
> able to make it in Berlin. However, the one we had in Las Vegas this fall
> was amazing. So, highly recommended.
>
> -
> Denis
>
>
>

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Denis Magda

I'm for the proposal to add new JMX metrics and enhance the existing
tooling. But I would encourage us to integrate this into the new metrics
framework Nikolay has been working on. Otherwise, we will be deprecating
these JMX metrics in a short time frame in favor of the new monitoring APIs.
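
The aggregated value proposed in this thread follows from the per-group metric quoted below: take the minimum of `getMinimumNumberOfPartitionCopies` across all cache groups and subtract one. A minimal sketch of that arithmetic, assuming the per-group values have already been collected into a map; the class name, method name, and the choice to return 0 when there are no groups are illustrative assumptions, not an existing Ignite API:

```java
import java.util.Map;

/** Hypothetical aggregation helper, not part of Ignite. */
final class SafeToLeaveSketch {
    /**
     * @param minCopiesPerGroup cache group name -> minimum number of partition copies
     *        reported for that group (the per-group metric value).
     * @return number of nodes that may leave without data loss in any group.
     */
    static int nodesSafeToLeave(Map<String, Integer> minCopiesPerGroup) {
        // The weakest group limits the whole cluster, so take the minimum over groups.
        int minCopies = minCopiesPerGroup.values().stream()
            .mapToInt(Integer::intValue)
            .min()
            .orElse(0); // Assumption: with no groups known, report nothing as safe.

        // One copy of every partition must survive, hence "minimum copies - 1".
        return Math.max(0, minCopies - 1);
    }

    public static void main(String[] args) {
        // Group "orders" has 3 copies of its weakest partition, "users" only 2.
        System.out.println(nodesSafeToLeave(Map.of("orders", 3, "users", 2)));
    }
}
```

Whether this value lives in a JMX bean, in the new metrics framework, or behind a control.sh command is the open question of the thread; the computation itself is the same in each case.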

-
Denis


On Fri, Oct 4, 2019 at 9:33 AM Alexey Goncharuk 
wrote:

> I agree that we should have the ability to read any metric using simple
> Ignite tooling. I am not sure if visor.sh is a good fit - if I
> remember correctly, it will start a daemon node which will bump the
> topology version with all related consequences. I believe in the long term
> it will be beneficial to migrate all visor.sh functionality to a more
> lightweight protocol, such as the one used in control.sh.
>
> As for the metrics, the metric suggested by Ivan totally makes sense to me
> - it is a simple and, actually, quite critical metric. It will be
> completely unusable to select a minimum of some metric for all cache groups
> manually. A monitoring system, on the other hand, might not be available
> when the metric is needed, or may not support aggregation.
>
> --AG
>
> On Fri, Oct 4, 2019 at 18:58, Ivan Rakov wrote:
>
> > Nikolay,
> >
> > Many users start to use Ignite with a small project without
> > production-level monitoring. When a proof-of-concept appears to be viable,
> > they tend to expand Ignite usage by growing the cluster and adding the needed
> > environment (including monitoring systems).
> > Inability to find out such a basic thing as survival in case of the next node
> > crash may affect the overall product impression. We all want Ignite to be
> > successful and widespread.
> >
> > > Can you clarify, what do you mean, exactly?
> >
> > Right now a user can access the metric mentioned by Alex and choose the minimum across
> > all cache groups. I want to highlight that not every user understands
> > Ignite and its internals well enough to find out that exactly this sequence
> > of actions will bring them to the desired answer.
> >
> > > Can you clarify, what do you mean, exactly?
> > > We have a ticket[1] to support metrics output via visor.sh.
> > >
> > > My understanding: we should have an easy way to output metric values for
> > > each node in cluster.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-12191
> > I propose to add a metric method for the aggregated
> > "getMinimumNumberOfPartitionCopies" and expose it to control.sh.
> > My understanding: its result is critical enough to be accessible via a
> > short path. I've started this topic due to a request from the user list, and
> > I've heard many similar complaints before.
> >
> > Best Regards,
> > Ivan Rakov
> >
> > On 04.10.2019 17:18, Nikolay Izhikov wrote:
> > > Ivan.
> > >
> > >> We shouldn't force users to configure external tools and write extra code for basic things.
> > > Actually, I don't agree with you.
> > > Having an external monitoring system for any production cluster is a *basic* thing.
> > >
> > > Can you, please, define "basic things"?
> > >
> > >> single method for the whole cluster
> > > Can you clarify, what do you mean, exactly?
> > > We have a ticket[1] to support metrics output via visor.sh.
> > >
> > > My understanding: we should have an easy way to output metric values for
> > > each node in cluster.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-12191
> > >
> > >
> > > On Fri, 04/10/2019 at 17:09 +0300, Ivan Rakov wrote:
> > >> Max,
> > >>
> > >> What if a user simply doesn't have a configured monitoring system?
> > >> Knowing whether the cluster will survive a node shutdown is critical for any
> > >> administrator who performs any manipulations with cluster topology.
> > >> Essential information should be easily accessed. We shouldn't force
> > >> users to configure external tools and write extra code for basic things.
> > >>
> > >> Alex,
> > >>
> > >> Thanks, that's exactly the metric we need.
> > >> My point is that we should make it more accessible: via a control.sh
> > >> command and a single method for the whole cluster.
> > >>
> > >> Best Regards,
> > >> Ivan Rakov
> > >>
> > >> On 04.10.2019 16:34, Alex Plehanov wrote:
> > >>> Ivan, there already exists a metric,
> > >>> CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
> > >>> current redundancy level for the cache group.
> > >>> We can lose up to (getMinimumNumberOfPartitionCopies-1) nodes without data
> > >>> loss in this cache group.
> > >>>
> > >>> On Fri, Oct 4, 2019 at 16:17, Ivan Rakov wrote:
> > >>>
> > >>>> Igniters,
> > >>>>
> > >>>> I've seen numerous requests to find out an easy way to check whether
> > >>>> it is safe to turn off a cluster node. As we know, in Ignite protection from
> > >>>> sudden node shutdown is implemented through keeping several backup
> > >>>> copies of each partition. However, this guarantee can be weakened for a
> > >>>> while in case the cluster has recently experienced a node restart and
> > >>>> the rebalancing process is still in progress.
> > >>>> Example scenario is restarting

Re: [DISCUSSION][IEP-35] Replace RunningQueryManager with GridSystemViewManager

2019-10-04 Thread Ivan Pavlukhin

Nikolay,

> As I understand, RunningQueryManager tracks queries only for export.

Not quite. Also it is responsible for explicit query cancellation and
running queries cancellation on node stop. I do not think that a view
should be responsible for it.
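
The separation argued for here (a manager that owns tracking and cancellation, and a view that only reads from it) can be sketched in plain Java. This is an illustration under stated assumptions: `RunningQueriesSketch`, `Row`, and every method below are hypothetical names, not the actual `RunningQueryManager` or SystemView API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical manager: owns query lifecycle, exposes a read-only view. */
final class RunningQueriesSketch {
    /** Immutable row, as a monitoring view might present a running query. */
    static final class Row {
        final long id;
        final String sql;

        Row(long id, String sql) {
            this.id = id;
            this.sql = sql;
        }
    }

    private final Map<Long, String> running = new ConcurrentHashMap<>();
    private final AtomicLong idGen = new AtomicLong();

    /** Starts tracking a query and returns its id. */
    long register(String sql) {
        long id = idGen.incrementAndGet();
        running.put(id, sql);
        return id;
    }

    /** Stops tracking a query that finished normally. */
    void unregister(long id) {
        running.remove(id);
    }

    /** Explicit cancellation: returns true only if the query was still running. */
    boolean cancel(long id) {
        return running.remove(id) != null;
    }

    /** Cancellation of everything still running, e.g. on node stop. */
    int cancelAll() {
        int cancelled = 0;
        for (Long id : running.keySet())
            if (cancel(id))
                cancelled++;
        return cancelled;
    }

    /** Read-only snapshot for monitoring; it cannot change the manager's state. */
    List<Row> view() {
        List<Row> rows = new ArrayList<>();
        running.forEach((id, sql) -> rows.add(new Row(id, sql)));
        return Collections.unmodifiableList(rows);
    }

    public static void main(String[] args) {
        RunningQueriesSketch mgr = new RunningQueriesSketch();
        long id = mgr.register("SELECT 1");
        System.out.println("running: " + mgr.view().size());
        mgr.cancel(id);
        System.out.println("running after cancel: " + mgr.view().size());
    }
}
```

In this shape the view is just one consumer of the manager's state, so cancellation logic stays in the manager regardless of how the data is exported.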

On Fri, Oct 4, 2019 at 18:17, Nikolay Izhikov wrote:
>
> Ivan.
>
> > RunningQueryManager is responsible for tracking running queries (and query 
> > history)
>
> As I understand, RunningQueryManager tracks queries only for export.
> So we don't need an explicit entity for that; we already have System Views.
>
> On Fri, 04/10/2019 at 17:40 +0300, Ivan Pavlukhin wrote:
> > Nikolay,
> >
> > Thank you for sharing knowledge.
> >
> > > I think we should replace `RunningQueryManager` with the special 
> > > SystemView implementation.
> >
> > Not sure that I got the intention and abstraction here. For me, a
> > straightforward approach here is to keep RunningQueryManager as is and
> > use a new API to expose its content to the monitoring system.
> > RunningQueryManager is responsible for tracking running queries (and
> > query history). All in all, other views expose info from other
> > managers and processors (e.g. IgniteTxManager, GridTaskProcessor,
> > SchemaManager). Have I missed something?
> >
> > On Fri, Oct 4, 2019 at 14:12, Nikolay Izhikov wrote:
> > >
> > > Hello, Ivan.
> > >
> > > > 1. How system views are going to be exposed? Is there any difference
> > > > in comparison to other metrics?
> > >
> > > We have a `SystemViewExporterSpi`.
> > > Built-in implementations are `JmxSystemViewExporterSpi` and 
> > > `SqlViewExporterSpi`.
> > >
> > > > 2. What should be done to adapt RunningQueryManager to SystemView API?
> > >
> > > I think we should replace `RunningQueryManager` with the special 
> > > SystemView implementation.
> > >
> > > > what is the difference between metrics and system views?
> > >
> > > Actually, it's a very good question :)
> > >
> > > A system view is a collection of internal Ignite objects exported to a user.
> > > Each system view is a table.
> > >
> > > A metric is a value representing some instantaneous state of an internal
> > > Ignite object.
> > > So it's a "cell" of a table.
> > >
> > > We need metrics to build charts and history of processes.
> > > We need system views to know which objects exist in a node and their params.
> > >
> > > On Fri, 04/10/2019 at 11:51 +0300, Ivan Pavlukhin wrote:
> > > > Nikolay,
> > > >
> > > > I checked the IEP [1]. Now the SystemView API is clearer to me.
> > > > Follow-up questions:
> > > > 1. How system views are going to be exposed? Is there any difference
> > > > in comparison to other metrics?
> > > > 2. What should be done to adapt RunningQueryManager to SystemView API?
> > > >
> > > > Also some bits for my understanding. I do not have a clear intuition
> > > > what is the difference between metrics and system views? For example,
> > > > how a system view is different from a metric holding a collection of
> > > > values? And why they were introduced as a separate class?
> > > >
> > > > [1] 
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > >
> > > > On Thu, Oct 3, 2019 at 16:37, Nikolay Izhikov wrote:
> > > > >
> > > > > Hello, Ivan.
> > > > >
> > > > > Thanks for feedback.
> > > > >
> > > > > Initial IEP [1] naming was changed during code review.
> > > > > I updated the IEP [1] with the current naming.
> > > > >
> > > > > Can you take a look and check whether everything is clear now?
> > > > >
> > > > > [1] 
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > >
> > > > > On Wed, 02/10/2019 at 17:21 +0300, Ivan Pavlukhin wrote:
> > > > > > Hi Nikolay,
> > > > > >
> > > > > > Actually I do not fully understand what the SystemView API is. I have not
> > > > > > found it in IEP [1] (I searched for words "system" and "view").
> > > > > >
> > > > > > RunningQueryManager is a component responsible for tracking running
> > > > > > queries internally. This info is exposed to users as an SQL view via
> > > > > > SqlSystemViewRunningQueries. In the same package you can find plenty
> > > > > > of other SQL views.
> > > > > >
> > > > > > [1] 
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > >
> > > > > > On Tue, Oct 1, 2019 at 06:42, Nikolay Izhikov wrote:
> > > > > > >
> > > > > > > Hello, Igniters.
> > > > > > >
> > > > > > > Since the last release `RunningQueryManager` [1] was added.
> > > > > > > It is used to track running queries.
> > > > > > >
> > > > > > > In IEP-35 [2] SystemView API was added.
> > > > > > > SystemView API is supposed to be used to track all kinds of internal
> > > > > > > Ignite objects.
> > > > > > >
> > > > > > > I think this RunningQueryManager should be replaced [3] with the
> > > > > > > more unified SystemView API.
> > > > > > >
> > > > > > > Any objections?
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-10754
> > > > > > > [2] 
> > > > > > >

Re: ApacheCon Europe 2019 talks which are relevant to Apache Ignite

2019-10-04 Thread Denis Magda

Igniters,

Is anybody planning to visit the event this year? Unfortunately, I won't be
able to make it to Berlin. However, the one we had in Las Vegas this fall
was amazing. So, highly recommended.

-
Denis


On Fri, Oct 4, 2019 at 9:45 AM  wrote:

> Dear Apache Ignite committers,
>
> In a little over 2 weeks time, ApacheCon Europe is taking place in
> Berlin. Join us from October 22 to 24 for an exciting program and lovely
> get-together of the Apache Community.
>
> We are also planning a hackathon.  If your project is interested in
> participating, please enter yourselves here:
> https://cwiki.apache.org/confluence/display/COMDEV/Hackathon
>
> The following talks should be especially relevant for you:
>
>   * https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0
>   * https://aceu19.apachecon.com/session/patterns-and-anti-patterns-running-apache-bigdata-projects-kubernetes
>   * https://aceu19.apachecon.com/session/fast-federated-sql-apache-calcite
>   * https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern
>   * https://aceu19.apachecon.com/session/data-driven-aiml-solutions-apache-software
>   * https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source
>
> Furthermore there will be a whole conference track on community topics:
> Learn how to motivate users to contribute patches, how the board of
> directors works, how to navigate the Incubator and much more: ApacheCon
> Europe 2019 Community track <
> https://aceu19.apachecon.com/sessions?track=42>
>
> Tickets are available here  –
> for Apache Committers we offer discounted tickets.  Prices will be going
> up on October 7th, so book soon.
>
> Please also help spread the word and make ApacheCon Europe 2019 a success!
>
> We’re looking forward to welcoming you at #ACEU19!
>
> Best,
>
> Your ApacheCon team
>
>

ApacheCon Europe 2019 talks which are relevant to Apache Ignite

2019-10-04 Thread myrle


Dear Apache Ignite committers,

In a little over 2 weeks time, ApacheCon Europe is taking place in 
Berlin. Join us from October 22 to 24 for an exciting program and lovely 
get-together of the Apache Community.


We are also planning a hackathon.  If your project is interested in 
participating, please enter yourselves here: 
https://cwiki.apache.org/confluence/display/COMDEV/Hackathon


The following talks should be especially relevant for you:

 * https://aceu19.apachecon.com/session/ensembles-ml-algorithms-and-distributed-online-machine-learning-apache-ignite-0
 * https://aceu19.apachecon.com/session/patterns-and-anti-patterns-running-apache-bigdata-projects-kubernetes
 * https://aceu19.apachecon.com/session/fast-federated-sql-apache-calcite
 * https://aceu19.apachecon.com/session/open-source-big-data-tools-accelerating-physics-research-cern
 * https://aceu19.apachecon.com/session/data-driven-aiml-solutions-apache-software
 * https://aceu19.apachecon.com/session/ui-dev-big-data-world-using-open-source

Furthermore there will be a whole conference track on community topics: 
Learn how to motivate users to contribute patches, how the board of 
directors works, how to navigate the Incubator and much more: ApacheCon 
Europe 2019 Community track 


Tickets are available here  – 
for Apache Committers we offer discounted tickets.  Prices will be going 
up on October 7th, so book soon.


Please also help spread the word and make ApacheCon Europe 2019 a success!

We’re looking forward to welcoming you at #ACEU19!

Best,

Your ApacheCon team

Re: How to free up space on disc after removing entries from IgniteCache with enabled PDS?

2019-10-04 Thread Alexey Goncharuk

Maxim,

Having a cluster-wide lock for a cache does not improve availability of the
solution. A user cannot defragment a cache if the cache is involved in a
mission-critical operation, so having a lock on such a cache is equivalent
to the whole cluster shutdown.

We should decide between either a single offline node or a more complex
fully online solution.

On Fri, 4 Oct 2019 at 11:55, Maxim Muzafarov wrote:

> Igniters,
>
> This thread seems to be endless, but we if some kind of cache group
> distributed write lock (exclusive for some of the internal Ignite
> process) will be introduced? I think it will help to solve a batch of
> problems, like:
>
> 1. defragmentation of all cache group partitions on the local node
> without concurrent updates.
> 2. improve data loading with data streamer isolation mode [1]. It
> seems we should not allow concurrent updates to cache if we on `fast
> data load` step.
> 3. recovery from a snapshot without cache stop\start actions
>
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11793
>
> On Thu, 3 Oct 2019 at 22:50, Sergey Kozlov  wrote:
> >
> > Hi
> >
> > I'm not sure that node offline is a best way to do that.
> > Cons:
> >  - different caches may have different defragmentation but we force to
> stop
> > whole node
> >  - offline node is a maintenance operation will require to add +1 backup
> to
> > reduce the risk of data loss
> >  - baseline auto adjustment?
> >  - impact to index rebuild?
> >  - cache configuration changes (or destroy) during node offline
> >
> > What about other ways without node stop? E.g. make cache group on a node
> > offline? Add *defrag  *command to control.sh to force start
> > rebalance internally in the node with expected impact to performance.
> >
> >
> >
> > On Thu, Oct 3, 2019 at 12:08 PM Anton Vinogradov  wrote:
> >
> > > Alexey,
> > > As for me, it does not matter will it be IEP, umbrella or a single
> issue.
> > > The most important thing is Assignee :)
> > >
> > > On Thu, Oct 3, 2019 at 11:59 AM Alexey Goncharuk <
> > > alexey.goncha...@gmail.com>
> > > wrote:
> > >
> > > > Anton, do you think we should file a single ticket for this or
> should we
> > > go
> > > > with an IEP? As of now, the change does not look big enough for an
> IEP
> > > for
> > > > me.
> > > >
> > > > On Thu, 3 Oct 2019 at 11:18, Anton Vinogradov wrote:
> > > >
> > > > > Alexey,
> > > > >
> > > > > Sounds good to me.
> > > > >
> > > > > On Thu, Oct 3, 2019 at 10:51 AM Alexey Goncharuk <
> > > > > alexey.goncha...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Anton,
> > > > > >
> > > > > > Switching a partition to and from the SHRINKING state will
> require
> > > > > > intricate synchronizations in order to properly determine the
> start
> > > > > > position for historical rebalance without PME.
> > > > > >
> > > > > > I would still go with an offline-node approach, but instead of
> > > cleaning
> > > > > the
> > > > > > persistence, we can do effective defragmentation when the node is
> > > > offline
> > > > > > because we are sure that there is no concurrent load. After the
> > > > > > defragmentation completes, we bring the node back to the cluster
> and
> > > > > > historical rebalance will kick in automatically. It will still
> > > require
> > > > > > manual node restarts, but since the data is not removed, there
> are no
> > > > > > additional risks. Also, this will be an excellent solution for
> those
> > > > who
> > > > > > can afford downtime and execute the defragment command on all
> nodes
> > > in
> > > > > the
> > > > > > cluster simultaneously - this will be the fastest way possible.
> > > > > >
> > > > > > --AG
> > > > > >
> > > > > > On Mon, 30 Sep 2019 at 09:29, Anton Vinogradov wrote:
> > > > > >
> > > > > > > Alexei,
> > > > > > > >> stopping fragmented node and removing partition data, then
> > > > starting
> > > > > it
> > > > > > > again
> > > > > > >
> > > > > > > That's exactly what we're doing to solve the fragmentation
> issue.
> > > > > > > The problem here is that we have to perform N/B
> restart-rebalance
> > > > > > > operations (N - cluster size, B - backups count) and it takes
> a lot
> > > > of
> > > > > > time
> > > > > > > with risks to lose the data.
> > > > > > >
> > > > > > > On Fri, Sep 27, 2019 at 5:49 PM Alexei Scherbakov <
> > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Probably this should be allowed to do using public API,
> actually
> > > > this
> > > > > > is
> > > > > > > > same as manual rebalancing.
> > > > > > > >
> > > > > > > > On Fri, 27 Sep 2019 at 17:40, Alexei Scherbakov <
> > > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > The poor man's solution for the problem would be stopping
> > > > > fragmented
> > > > > > > node
> > > > > > > > > and removing partition data, then starting it again
> allowing
> > > full
> > > > > > state
> > > > > > > > > transfer already without deletes.
> > > > > > > > > Rinse and repeat for all owners.
> > > > > > > > >
> > > > >

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Alexey Goncharuk

I agree that we should have the ability to read any metric using simple
Ignite tooling. I am not sure that visor.sh is a good fit: if I
remember correctly, it will start a daemon node, which will bump the
topology version with all the related consequences. I believe that in the
long term it will be beneficial to migrate all visor.sh functionality to a
more lightweight protocol, such as the one used in control.sh.

As for the metrics, the metric suggested by Ivan totally makes sense to me:
it is a simple and, actually, quite critical metric. Manually selecting the
minimum of some metric across all cache groups is completely impractical. A
monitoring system, on the other hand, might not be available when the metric
is needed, or may not support aggregation.
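
As a rough illustration of the aggregation in question (a sketch only, not an
existing Ignite API): given the per-cache-group values reported by
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, the number of nodes
that may leave is the minimum across all groups minus one. The class and
method names below are hypothetical, and the input map stands in for values
that would have to be read from the MXBeans. Returning 0 for an empty map is
an assumption made to stay on the conservative side.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of Ignite. */
final class SafeToLeave {
    /**
     * @param minCopiesByGroup cache group name -> minimum number of partition copies
     *        (assumed to be collected from getMinimumNumberOfPartitionCopies per group).
     * @return number of nodes that can leave without data loss in any cache group.
     */
    static int safeToLeaveNodes(Map<String, Integer> minCopiesByGroup) {
        if (minCopiesByGroup.isEmpty())
            return 0; // Assumption: without data, report the conservative answer.

        int min = Integer.MAX_VALUE;

        // The weakest cache group determines the answer for the whole cluster.
        for (int copies : minCopiesByGroup.values())
            min = Math.min(min, copies);

        // One copy of every partition must survive, hence "minus one".
        return Math.max(0, min - 1);
    }

    public static void main(String[] args) {
        // Two cache groups: the one with 2 copies is the limiting factor.
        System.out.println(safeToLeaveNodes(Map.of("groupA", 3, "groupB", 2))); // prints 1
    }
}
```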

--AG

On Fri, 4 Oct 2019 at 18:58, Ivan Rakov wrote:

> Nikolay,
>
> Many users start to use Ignite with a small project without
> production-level monitoring. When proof-of-concept appears to be viable,
> they tend to expand Ignite usage by growing cluster and adding needed
> environment (including monitoring systems).
> Inability to find such basic thing as survival in case of next node
> crash may affect overall product impression. We all want Ignite to be
> successful and widespread.
>
> > Can you clarify, what do you mean, exactly?
>
> Right now user can access metric mentioned by Alex and choose minimum of
> all cache groups. I want to highlight that not every user understands
> Ignite and its internals so much to find out that exactly these sequence
> of actions will bring him to desired answer.
>
> > Can you clarify, what do you mean, exactly?
> > We have a ticket[1] to support metrics output via visor.sh.
> >
> > My understanding: we should have an easy way to output metric values for
> each node in cluster.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-12191
> I propose to add metric method for aggregated
> "getMinimumNumberOfPartitionCopies" and expose it to control.sh.
> My understanding: its result is critical enough to be accessible in a
> short path. I've started this topic due to request from user list, and
> I've heard many similar complaints before.
>
> Best Regards,
> Ivan Rakov
>
> On 04.10.2019 17:18, Nikolay Izhikov wrote:
> > Ivan.
> >
> >> We shouldn't force users to configure external tools and write extra
> code for basic things.
> > Actually, I don't agree with you.
> > Having external monitoring system for any production cluster is a
> *basic* thing.
> >
> > Can you, please, define "basic things"?
> >
> >> single method for the whole cluster
> > Can you clarify, what do you mean, exactly?
> > We have a ticket[1] to support metrics output via visor.sh.
> >
> > My understanding: we should have an easy way to output metric values for
> each node in cluster.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-12191
> >
> >
> > On Fri, 04/10/2019 at 17:09 +0300, Ivan Rakov wrote:
> >> Max,
> >>
> >> What if user simply don't have configured monitoring system?
> >> Knowing whether cluster will survive node shutdown is critical for any
> >> administrator that performs any manipulations with cluster topology.
> >> Essential information should be easily accessed. We shouldn't force
> >> users to configure external tools and write extra code for basic things.
> >>
> >> Alex,
> >>
> >> Thanks, that's exact metric we need.
> >> My point is that we should make it more accessible: via control.sh
> >> command and single method for the whole cluster.
> >>
> >> Best Regards,
> >> Ivan Rakov
> >>
> >> On 04.10.2019 16:34, Alex Plehanov wrote:
> >>> Ivan, there already exist metric
> >>> CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows
> the
> >>> current redundancy level for the cache group.
> >>> We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes without
> data
> >>> loss in this cache group.
> >>>
> >>> On Fri, 4 Oct 2019 at 16:17, Ivan Rakov wrote:
> >>>
> >>>> Igniters,
> >>>>
> >>>> I've seen numerous requests to find out an easy way to check whether is
> >>>> it safe to turn off cluster node. As we know, in Ignite protection from
> >>>> sudden node shutdown is implemented through keeping several backup
> >>>> copies of each partition. However, this guarantee can be weakened for a
> >>>> while in case cluster has recently experienced node restart and
> >>>> rebalancing process is still in progress.
> >>>> Example scenario is restarting nodes one by one in order to update a
> >>>> local configuration parameter. User restarts one node and rebalancing
> >>>> starts: when it will be completed, it will be safe to proceed (backup
> >>>> count=1). However, there's no transparent way to determine whether
> >>>> rebalancing is over.
> >>>> From my perspective, it would be very helpful to:
> >>>> 1) Add information about rebalancing and number of free-to-go nodes to
> >>>> ./control.sh --state command.
> >>>> Examples of output:
> >>>>
> >>>>> Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> >>>>> Cluster tag: new_tag
>

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov


Nikolay,

Many users start to use Ignite in a small project without
production-level monitoring. When the proof of concept appears to be viable,
they tend to expand Ignite usage by growing the cluster and adding the needed
environment (including monitoring systems).
The inability to find out something as basic as whether the cluster will
survive the next node crash may hurt the overall impression of the product.
We all want Ignite to be successful and widespread.



> Can you clarify, what do you mean, exactly?


Right now a user can access the metric mentioned by Alex and choose the
minimum over all cache groups. I want to highlight that not every user
understands Ignite and its internals well enough to figure out that exactly
this sequence of actions will lead them to the desired answer.



> Can you clarify, what do you mean, exactly?
> We have a ticket[1] to support metrics output via visor.sh.
>
> My understanding: we should have an easy way to output metric values for each
> node in cluster.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12191
I propose to add a metric method for an aggregated
"getMinimumNumberOfPartitionCopies" and expose it to control.sh.
My understanding: its result is critical enough to be accessible via a
short path. I started this topic due to a request from the user list, and
I've heard many similar complaints before.


Best Regards,
Ivan Rakov

On 04.10.2019 17:18, Nikolay Izhikov wrote:

Ivan.


We shouldn't force users to configure external tools and write extra code for 
basic things.

Actually, I don't agree with you.
Having external monitoring system for any production cluster is a *basic* thing.

Can you, please, define "basic things"?


single method for the whole cluster

Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values for each 
node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191


On Fri, 04/10/2019 at 17:09 +0300, Ivan Rakov wrote:

Max,

What if user simply don't have configured monitoring system?
Knowing whether cluster will survive node shutdown is critical for any
administrator that performs any manipulations with cluster topology.
Essential information should be easily accessed. We shouldn't force
users to configure external tools and write extra code for basic things.

Alex,

Thanks, that's exact metric we need.
My point is that we should make it more accessible: via control.sh
command and single method for the whole cluster.

Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

Ivan, there already exist metric
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
current redundancy level for the cache group.
We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes without data
loss in this cache group.

On Fri, 4 Oct 2019 at 16:17, Ivan Rakov wrote:


Igniters,

I've seen numerous requests to find out an easy way to check whether is
it safe to turn off cluster node. As we know, in Ignite protection from
sudden node shutdown is implemented through keeping several backup
copies of each partition. However, this guarantee can be weakened for a
while in case cluster has recently experienced node restart and
rebalancing process is still in progress.
Example scenario is restarting nodes one by one in order to update a
local configuration parameter. User restarts one node and rebalancing
starts: when it will be completed, it will be safe to proceed (backup
count=1). However, there's no transparent way to determine whether
rebalancing is over.
   From my perspective, it would be very helpful to:
1) Add information about rebalancing and number of free-to-go nodes to
./control.sh --state command.
Examples of output:


Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
All partitions are up-to-date.
3 node(s) can safely leave the cluster without partition loss.
Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
Rebalancing is in progress.
1 node(s) can safely leave the cluster without partition loss.

2) Provide the same information via ClusterMetrics. For example:
ClusterMetrics#isRebalanceInProgress // boolean
ClusterMetrics#getSafeToLeaveNodesCount // int

Here I need to mention that this information can be calculated from
existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
still think that we need more simple and understandable flag whether
cluster is in danger of data loss. Another point is that current metrics
are bound to specific cache, which makes this information even harder to
analyze.
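
To make the proposal more concrete, here is a minimal sketch of how the two
values could be turned into the status lines from the example output. The
ClusterSafetyInfo type is hypothetical; only the accessor names
isRebalanceInProgress and getSafeToLeaveNodesCount are taken from the proposal,
and neither is assumed to exist in ClusterMetrics today.

```java
/** Hypothetical holder for the two proposed values; not an Ignite class. */
final class ClusterSafetyInfo {
    private final boolean rebalanceInProgress;
    private final int safeToLeaveNodesCount;

    ClusterSafetyInfo(boolean rebalanceInProgress, int safeToLeaveNodesCount) {
        this.rebalanceInProgress = rebalanceInProgress;
        this.safeToLeaveNodesCount = safeToLeaveNodesCount;
    }

    /** Mirrors the proposed ClusterMetrics#isRebalanceInProgress. */
    boolean isRebalanceInProgress() {
        return rebalanceInProgress;
    }

    /** Mirrors the proposed ClusterMetrics#getSafeToLeaveNodesCount. */
    int getSafeToLeaveNodesCount() {
        return safeToLeaveNodesCount;
    }

    /** Renders the two status lines used in the example output. */
    String describe() {
        String rebalance = rebalanceInProgress
            ? "Rebalancing is in progress."
            : "All partitions are up-to-date.";

        return rebalance + "\n" + safeToLeaveNodesCount
            + " node(s) can safely leave the cluster without partition loss.";
    }

    public static void main(String[] args) {
        System.out.println(new ClusterSafetyInfo(true, 1).describe());
    }
}
```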

Thoughts?

--
Best Regards,
Ivan Rakov

Re: Replacing default work dir from tmp to current dir

2019-10-04 Thread Ilya Kasnacheev

Hello!

I have created https://issues.apache.org/jira/browse/IGNITE-12260

Regards,
-- 
Ilya Kasnacheev


On Fri, 4 Oct 2019 at 10:16, Ivan Pavlukhin wrote:

> Interesting things about those LINQPad/JPad scenarios. Was not aware
> of it. Still some doubts about applicability. It seems to me that JPad
> having work dir in "Program Files" have a lot of problems by itself,
> e.g. a user is not able to run basic file IO snippets with relative
> file paths.
>
> On Thu, 3 Oct 2019 at 23:24, Pavel Tupitsyn wrote:
> >
> > Ilya, fallback is a good idea.
> > Still I'd prefer to have user.home as a default, and fallback to user.dir
> > when home does not work for some reason.
> >
> > On Thu, Oct 3, 2019 at 11:07 PM Ilya Kasnacheev <
> ilya.kasnach...@gmail.com>
> > wrote:
> >
> > > Hello!
> > >
> > > We can try and fallback to home dir with warning, when file cannot be
> > > created in current dir.
> > >
> > > WDYT?
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > On Thu, 3 Oct 2019 at 20:05, Pavel Tupitsyn wrote:
> > >
> > > > >  Cannot tell about NuGet. Maven is typically used during
> development,
> > > > usually there is no Maven in production deployments.
> > > > NuGet and Maven are very similar. Yes, both of them are build-time
> tools,
> > > > production is unrelated.
> > > > For production-ready deployments we can expect users to tweak Ignite
> to
> > > > their needs, set custom storage dirs, adjust heap sizes and so on.
> > > >
> > > > I'm talking about new users, about "getting started" scenarios -
> > > > it is super important to make Ignite easy to get started with,
> provide
> > > > reasonable defaults for all the configuration properties.
> > > >
> > > > For Ignite.NET, LINQPad is one of those "get started in 2 clicks"
> > > > scenarios. And this scenario got broken as explained above.
> > > > 2.7.5 and earlier used temp dir, which worked. 2.7.6 fails: "Work
> > > directory
> > > > does not exist and cannot be created: C:\Program
> > > > Files\LINQPad5\ignite\work"
> > > >
> > > > For Java there is JPad, which will fail in the same way - when you
> run
> > > code
> > > > from there, `user.dir` points to Program Files.
> > > >
> > > > I expect that there are more use cases like this, and `user.home` is
> a
> > > > reasonable solution.
> > > >
> > > > On Thu, Oct 3, 2019 at 5:56 PM Ilya Kasnacheev <
> > > ilya.kasnach...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello!
> > > > >
> > > > > I want to point out that I didn't change this location (current
> dir).
> > > It
> > > > > was already implemented when I raised this issue, the only change
> I did
> > > > was
> > > > > to swap current dir/work to current dir/ignite/work to avoid
> confusion
> > > > > whose work dir that is.
> > > > >
> > > > > I also communicated this to you all in ML when I discovered that
> > > current
> > > > > dir is used.
> > > > >
> > > > > I think that current dir is actually *well defined* when starting a
> > > > > project. A project is expected to be started from the same dir,
> and all
> > > > > "Run..." dialogs usually allow specifying that one.
> > > > >
> > > > > For embedded scenarios, you definitely not want work dir from two
> > > > different
> > > > > Ignite-using tools to interfere. For embedded scenarios, you should
> > > only
> > > > > expect that current dir is writable.
> > > > >
> > > > > Even after these considerations, it's too late to change that
> because
> > > > > people don't expect this dir to move with every release of Ignite,
> and
> > > we
> > > > > already did it once.
> > > > >
> > > > > Regards,
> > > > > --
> > > > > Ilya Kasnacheev
> > > > >
> > > > >
> > > > > > On Thu, 3 Oct 2019 at 17:34, Alexey Goncharuk <
> > > alexey.goncha...@gmail.com
> > > > >:
> > > > >
> > > > > > >
> > > > > > > Seems, we should have different defaults and even
> distributions for
> > > > > > > different usage scenarios.
> > > > > > >
> > > > > > I still do not understand why defaults should be different for
> > > embedded
> > > > > and
> > > > > > "traditional RDBMS-like" installations. Having different defaults
> > > will
> > > > > > likely confuse users, not make usability easier. Personally, I
> would
> > > > > forbid
> > > > > > to start Ignite if IGNITE_HOME is not set, but this suggestion
> was
> > > not
> > > > > > accepted by the community.
> > > > > >
> > > > > > As far as I know, both rocksdb and SQLite is local only
> libraries and
> > > > > don't
> > > > > > > > have any distributed features.
> > > > > >
> > > > > > See no difference here. Imagine a user starts only one Ignite
> node
> > > for
> > > > > > > development or just to play (which, I believe, happens quite a
> lot) -
> > > > same
> > > > > > as with local databases. BTW, it is impossible to start SQLite
> > > without
> > > > > > database path, so a user either provides a full path, or a
> relative
> > > > path
> > > > > > from the current directory - which is an explicit action from a
> user.
> > > > > >
> > > > > >
> > > > > > > I agree with you.
>

[jira] [Created] (IGNITE-12260) Fallback to {user.home}/ignite/work if {user.dir} is not writable

2019-10-04 Thread Ilya Kasnacheev (Jira)

Ilya Kasnacheev created IGNITE-12260:


 Summary: Fallback to {user.home}/ignite/work if {user.dir} is not 
writable
 Key: IGNITE-12260
 URL: https://issues.apache.org/jira/browse/IGNITE-12260
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.7.6
Reporter: Ilya Kasnacheev


After IGNITE-12103 we have a new problem: some software under Windows, 
e.g. software installed in Program Files, tries to create the ignite\work\ dir 
under the current dir, which is not writable.

It was suggested to fall back to the {user.home}\ignite\work dir in such cases. 
On each start we will try to create the work dir in the current dir, fail, 
print a warning and fall back to the home dir.
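
The sequence described above (try the current dir, fail, warn, fall back to
the home dir) could look roughly like the sketch below. This is only an
illustration under assumptions, not the actual Ignite implementation: the
WorkDirResolver class and its resolve method are hypothetical names, and the
warning is simply printed to stderr instead of going through the Ignite logger.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of the proposed fallback; not Ignite code. */
final class WorkDirResolver {
    /**
     * Tries {@code primaryBase/ignite/work} first and falls back to
     * {@code fallbackBase/ignite/work} if the first one cannot be created or written.
     */
    static Path resolve(Path primaryBase, Path fallbackBase) throws IOException {
        Path primary = primaryBase.resolve("ignite").resolve("work");

        try {
            Files.createDirectories(primary);

            if (Files.isWritable(primary))
                return primary;
        }
        catch (IOException | SecurityException e) {
            // Ignored on purpose: the fallback location is tried below.
        }

        System.err.println("Warning: work directory " + primary
            + " is not usable, falling back to " + fallbackBase);

        // If the fallback cannot be created either, the IOException is propagated.
        return Files.createDirectories(fallbackBase.resolve("ignite").resolve("work"));
    }

    public static void main(String[] args) throws IOException {
        Path workDir = resolve(
            Paths.get(System.getProperty("user.dir")),
            Paths.get(System.getProperty("user.home")));

        System.out.println("Work directory: " + workDir);
    }
}
```

Note that with this approach the failed attempt and the warning repeat on
every start, exactly as the description says.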



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: Getting involved in Apache Ignite

2019-10-04 Thread Ilya Kasnacheev

Hello!

I have added you to contributors, now you can assign issues to yourself.
Please familiarise yourself with
https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute

Feel free to create new issues or even IEPs!
https://cwiki.apache.org/confluence/display/IGNITE/Active+Proposals

Regards,
-- 
Ilya Kasnacheev


On Fri, 4 Oct 2019 at 18:13, Emmanouil Gkatziouras wrote:

> Greetings!
>
> Thank you for your response.
> This is my JIRA login
> https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gkatzioura
> Once you add me to the contributors should I create the tickets I described
> on the initial mail?
>
> Kind regards,
> Emmanouil
>
> *Emmanouil Gkatziouras*
> https://egkatzioura.com/ |
> https://www.linkedin.com/in/gkatziourasemmanouil/
> https://github.com/gkatzioura
>
>
> On Fri, 4 Oct 2019 at 13:33, Ilya Kasnacheev 
> wrote:
>
> > Hello!
> >
> > Do you have an Apache JIRA login? Please create it if not exists, share
> > with us so that you can be added to contributors.
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > On Fri, 4 Oct 2019 at 10:59, Emmanouil Gkatziouras wrote:
> >
> > > Greetings,
> > >
> > > I am amazed by Apache Ignite and its features!
> > > For my use case integrating with Google Cloud Pub/Sub and Amazon SQS
> > would
> > > help getting the most out of it.
> > >
> > > Since developing those streamers is something I would do in any case, I
> > > would like to get involved in your project and therefore give back to
> the
> > > project and make those features available to the community.
> > >
> > > I have contributed to projects such as the InfluxDB Java Driver and
> > > Alpakka.
> > > Part of my every day work has to do with implementing solutions in the
> > > cloud, thus I can contribute to the streaming solutions that have to do
> > > with Cloud Providers. Particularly with the GCP Pub/Sub and AWS SQS as
> > well
> > > as other Cloud Based Messaging systems such as Azure Storage Queues.
> > > Also I would like to propose on adding a streamer implementation for
> > cache
> > > invalidation as I have some use cases in need of it.
> > >
> > > You can find me on LinkedIn (link in my signature) and get to know my
> > > background a little more.
> > >
> > > Thanks for your great work so far!
> > > Regards,
> > >
> > > *Emmanouil Gkatziouras*
> > > https://egkatzioura.com/ |
> > > https://www.linkedin.com/in/gkatziourasemmanouil/
> > > https://github.com/gkatzioura
> > >
> >
>

Re: [DISCUSSION][IEP-35] Replace RunningQueryManager with GridSystemViewManager

2019-10-04 Thread Nikolay Izhikov

Ivan.

> RunningQueryManager is responsible for tracking running queries (and query 
> history)

As I understand it, RunningQueryManager tracks queries only for export.
So we don't need an explicit entity for that; we already have System Views.
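
To illustrate the distinction from earlier in this thread (a system view is a
table of internal objects, a metric is a single "cell"), here is a small
self-contained sketch. None of these types are Ignite's: RunningQueriesView
and its Row class are hypothetical stand-ins and do not reflect the actual
SystemView API from IEP-35.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracker that exposes running queries as rows; not Ignite code. */
final class RunningQueriesView {
    /** One row of the view: a single running query. */
    static final class Row {
        final long id;
        final String sql;
        final long startTimeMillis;

        Row(long id, String sql, long startTimeMillis) {
            this.id = id;
            this.sql = sql;
            this.startTimeMillis = startTimeMillis;
        }
    }

    private final Map<Long, Row> running = new ConcurrentHashMap<>();
    private final AtomicLong idGen = new AtomicLong();

    /** Called when a query starts; returns its id. */
    long register(String sql) {
        long id = idGen.incrementAndGet();
        running.put(id, new Row(id, sql, System.currentTimeMillis()));
        return id;
    }

    /** Called when a query finishes. */
    void unregister(long id) {
        running.remove(id);
    }

    /** "System view": the whole table, one row per running query. */
    List<Row> rows() {
        return new ArrayList<>(running.values());
    }

    /** "Metric": a single value derived from the same state. */
    int runningCount() {
        return running.size();
    }

    public static void main(String[] args) {
        RunningQueriesView view = new RunningQueriesView();
        long first = view.register("SELECT 1");
        view.register("SELECT 2");
        view.unregister(first);

        for (Row row : view.rows())
            System.out.println(row.id + ": " + row.sql);

        System.out.println("running=" + view.runningCount());
    }
}
```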

On Fri, 04/10/2019 at 17:40 +0300, Ivan Pavlukhin wrote:
> Nikolay,
> 
> Thank you for sharing knowledge.
> 
> > I think we should replace `RunningQueryManager` with the special SystemView 
> > implementation.
> 
> Not sure that I got the intention and abstraction here. For me a
> straightforward approach here is to keep RunningQueryManager as is and
> use a new API to expose it's content to monitoring system.
> RunningQueryManager is responsible for tracking running queries (and
> query history). All in all, other views expose info from other
> managers and processors (e.g. IgniteTxManager, GridTaskProcessor,
> SchemaManager). Have I missed something?
> 
> On Fri, 4 Oct 2019 at 14:12, Nikolay Izhikov wrote:
> > 
> > Hello, Ivan.
> > 
> > > 1. How system views are going to be exposed? Is there any difference
> > > in comparison to other metrics?
> > 
> > We have a `SystemViewExporterSpi`.
> > Built-in implementations are `JmxSystemViewExporterSpi` and 
> > `SqlViewExporterSpi`.
> > 
> > > 2. What should be done to adopt RunningQueryManager to SystemView API?
> > 
> > I think we should replace `RunningQueryManager` with the special SystemView 
> > implementation.
> > 
> > > what is the difference between metrics and system views?
> > 
> > Actually, it's a very good question :)
> > 
> > System view is a collection of internal Ignite objects exported to a user.
> > Each system view is a table.
> > 
> > Metric is a value representing some instantaneous state of the internal 
> > Ignite object.
> > So its a "cell" of table.
> > 
> > We need metrics to build charts and history of processes.
> > We need system views to known what objects exist in node and its params.
> > 
> > On Fri, 04/10/2019 at 11:51 +0300, Ivan Pavlukhin wrote:
> > > Nikolay,
> > > 
> > > I checked the IEP [1]. Now it is more clear for me about SystemView
> > > API. Follow-up questions:
> > > 1. How system views are going to be exposed? Is there any difference
> > > in comparison to other metrics?
> > > 2. What should be done to adopt RunningQueryManager to SystemView API?
> > > 
> > > Also some bits for my understanding. I do not have a clear intuition
> > > what is the difference between metrics and system views? For example,
> > > how a system view is different from a metric holding a collection of
> > > values? And why they were introduced as a separate class?
> > > 
> > > [1] 
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > 
> > > On Thu, 3 Oct 2019 at 16:37, Nikolay Izhikov wrote:
> > > > 
> > > > Hello, Ivan.
> > > > 
> > > > Thanks for feedback.
> > > > 
> > > > Initial IEP [1] naming was changed during code review.
> > > > I updated the IEP [1] with the current naming.
> > > > 
> > > > Can you take a look and check is all clear now?
> > > > 
> > > > [1] 
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > 
> > > > On Wed, 02/10/2019 at 17:21 +0300, Ivan Pavlukhin wrote:
> > > > > Hi Nikolay,
> > > > > 
> > > > > Actually I do not fully understand what is SystemView API. I have not
> > > > > found it in IEP [1] (I searched for words "system" and "view").
> > > > > 
> > > > > RunningQueryManager is a component responsible for tracking running
> > > > > queries internally. This info is exposed to users as SQL view via
> > > > > SqlSystemViewRunningQueries. In the same package you can find a plenty
> > > > > of other SQL views.
> > > > > 
> > > > > [1] 
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > 
> > > > > On Tue, 1 Oct 2019 at 06:42, Nikolay Izhikov wrote:
> > > > > > 
> > > > > > Hello, Igniters.
> > > > > > 
> > > > > > Since the last release `RunningQueryManager` [1] was added.
> > > > > > It used to track a running query.
> > > > > > 
> > > > > > In IEP-35 [2] SystemView API was added.
> > > > > > SystemView API supposed to be used to track all kinds of internal 
> > > > > > Ignite objects.
> > > > > > 
> > > > > > I think this RunningQueryManager should be replaced [3] with the 
> > > > > > more unified SystemView API.
> > > > > > 
> > > > > > Any objections?
> > > > > > 
> > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-10754
> > > > > > [2] 
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > [3] https://issues.apache.org/jira/browse/IGNITE-12223
> > > > > > [4] https://issues.apache.org/jira/browse/IGNITE-12224
> > > > > 
> > > > > 
> > > > > 
> > > 
> > > 
> > > 
> 
> 
> 



Re: Getting involved in Apache Ignite

2019-10-04 Thread Emmanouil Gkatziouras

Greetings!

Thank you for your response.
This is my JIRA login:
https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gkatzioura
Once you add me to the contributors, should I create the tickets I described
in the initial mail?

Kind regards,
Emmanouil

*Emmanouil Gkatziouras*
https://egkatzioura.com/ | https://www.linkedin.com/in/gkatziourasemmanouil/
https://github.com/gkatzioura


On Fri, 4 Oct 2019 at 13:33, Ilya Kasnacheev 
wrote:

> Hello!
>
> Do you have an Apache JIRA login? Please create one if you do not have it, and
> share it with us so that you can be added to the contributors.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Fri, Oct 4, 2019 at 10:59, Emmanouil Gkatziouras wrote:
>
> > Greetings,
> >
> > I am amazed by Apache Ignite and its features!
> > For my use case, integrating with Google Cloud Pub/Sub and Amazon SQS would
> > help me get the most out of it.
> >
> > Since developing those streamers is something I would do in any case, I
> > would like to get involved in your project and therefore give back to the
> > project and make those features available to the community.
> >
> > I have contributed to projects such as the InfluxDB Java Driver and
> > Alpakka.
> > Part of my everyday work has to do with implementing solutions in the
> > cloud, thus I can contribute to the streaming solutions that have to do
> > with cloud providers, particularly GCP Pub/Sub and AWS SQS, as well
> > as other cloud-based messaging systems such as Azure Storage Queues.
> > I would also like to propose adding a streamer implementation for cache
> > invalidation, as I have some use cases in need of it.
> >
> > You can find me on LinkedIn (link in my signature) and get to know my
> > background a little more.
> >
> > Thanks for your great work so far!
> > Regards,
> >
> > *Emmanouil Gkatziouras*
> > https://egkatzioura.com/ |
> > https://www.linkedin.com/in/gkatziourasemmanouil/
> > https://github.com/gkatzioura
> >
>

Re: Text queries/indexes (GridLuceneIndex, @QueryTextField)

2019-10-04 Thread Andrey Mashenkov

Yuriy,

Just FYI, we have a review checklist [1] and coding guidelines [2].
To test a PR, you can use TeamCity [3] or the TeamCityBot project [4].

The latter (using TCBot) makes test validation much easier, and you do not
have to bother with flaky tests.
Long story short, you can trigger tests for the PR from the Bot page and then
make the Bot attach the results to a Jira ticket if you find them
acceptable.

So, the next step is to run the tests and check that all is OK.

[1] https://cwiki.apache.org/confluence/display/IGNITE/Review+Checklist
[2] https://cwiki.apache.org/confluence/display/IGNITE/Coding+Guidelines
[3] https://ci.ignite.apache.org/
[4] https://mtcga.gridgain.com/



On Fri, Oct 4, 2019 at 3:10 PM Yuriy Shuliga  wrote:

> Andrew,
>
> I have corrected the PR according to your notes. Please review.
> What are the next steps to get it merged?
>
> Y.
>
> On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov wrote:
>
> > Yuri,
> >
> > I'm done with the review.
> > No crime found, just a trivial compatibility bug.
> >
> > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga  wrote:
> >
> > > Denis,
> > >
> > > Thank you for your attention to this.
> > > As of now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket
> > > is still pending review.
> > > Do we have a chance to move it forward somehow?
> > >
> > > BR,
> > > Yuriy Shuliha
> > >
> > > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> > >
> > > > Yuriy,
> > > >
> > > > I've seen that you opened a pull request with the first changes:
> > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > > >
> > > > Alex Scherbakov and Ivan, are you the right guys to do the review?
> > > >
> > > > -
> > > > Denis
> > > >
> > > >
> > > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван wrote:
> > > >
> > > > > Yuriy,
> > > > >
> > > > > Thank you for providing details! Quite interesting.
> > > > >
> > > > > Yes, we already support distributed limit and merging of sorted
> > > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > > MergeStreamIterator are used for merging sorted streams.
> > > > >
> > > > > Could you please also clarify about score/relevance? Is it provided by
> > > > > the Lucene engine for each query result? I am thinking about how to do a
> > > > > sorted merge properly in this case.
> > > > >
> > > > > On Wed, Sep 25, 2019 at 18:56, Yuriy Shuliga wrote:
> > > > > >
> > > > > > Ivan,
> > > > > >
> > > > > > Thank you for the interesting question!
> > > > > >
> > > > > > Text searches (or full-text searches) are mostly human-oriented, and the
> > > > > > point of the user's interest is the topmost part of the response.
> > > > > > The user can then read it, evaluate it and use the given records for
> > > > > > further purposes.
> > > > > >
> > > > > > In our case in particular, we use Ignite for operations with financial
> > > > > > data, and there is a lot of text there, such as asset names, financial
> > > > > > instruments, companies etc.
> > > > > > To operate on this quickly and reliably, users are used to working with
> > > > > > text search, type-ahead completions and suggestions.
> > > > > >
> > > > > > For these purposes we index particular string data in separate caches.
> > > > > >
> > > > > > Sorting capabilities and response size limits are very important
> > > > > > there, as our API has to provide the most relevant information within a
> > > > > > limited size.
> > > > > >
> > > > > > Now let me comment on the Ignite/Lucene perspective.
> > > > > > Actually, Ignite queries Lucene, and Lucene returns *TopDocs.scoreDocs*
> > > > > > already sorted by *score* (relevance), so the most relevant documents are
> > > > > > at the top.
> > > > > > Currently, distributed query responses from different nodes are merged
> > > > > > into the final query cursor queue in an arbitrary order,
> > > > > > so in fact the score order is already ruined here. Also, Ignite
> > > > > > requests all possible documents from Lucene, which is redundant and not
> > > > > > good for performance.
> > > > > >
> > > > > > I'm implementing the *limit* parameter as part of *TextQuery* and have to
> > > > > > note that we still have to add sorting to text query processing in
> > > > > > order to get usable results.
> > > > > >
> > > > > > The *limit* parameter itself should address part of the issues above, but
> > > > > > sorting by document score, at least, should definitely be implemented
> > > > > > along with the limit.
> > > > > >
> > > > > > This is a pretty short commentary; if you still have any questions,
> > > > > > please ask, do not hesitate)
> > > > > >
> > > > > > BR,
> > > > > > Yuriy Shuliha
> > > > > >
> > > > > > On Thu, Sep 19, 2019 at 11:38, Павлухин Иван wrote:
> > > > > >
> > > > > > > Yuriy,
> > > > > > >
> > > > > > > Greatly

Re: [DISCUSSION][IEP-35] Replace RunningQueryManager with GridSystemViewManager

2019-10-04 Thread Ivan Pavlukhin

Nikolay,

Thank you for sharing knowledge.

> I think we should replace `RunningQueryManager` with the special SystemView 
> implementation.

I am not sure that I got the intention and the abstraction here. To me, the
straightforward approach is to keep RunningQueryManager as is and
use the new API to expose its content to the monitoring system.
RunningQueryManager is responsible for tracking running queries (and
query history). All in all, other views expose info from other
managers and processors (e.g. IgniteTxManager, GridTaskProcessor,
SchemaManager). Have I missed something?
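
To make the "keep the manager, expose its content" idea concrete, here is a minimal sketch in plain Java. It does not use any Ignite classes: QueryTrackerSketch, QueryRow and view() are hypothetical names invented for illustration, and the real RunningQueryManager and SystemView API may look quite different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a manager that tracks running queries. */
final class QueryTrackerSketch {
    /** One row of the view: an immutable description of a running query. */
    static final class QueryRow {
        final long id;
        final String sql;

        QueryRow(long id, String sql) {
            this.id = id;
            this.sql = sql;
        }
    }

    private final AtomicLong idGen = new AtomicLong();
    private final Map<Long, QueryRow> running = new ConcurrentHashMap<>();

    /** Registers a started query and returns its id. */
    long register(String sql) {
        long id = idGen.incrementAndGet();
        running.put(id, new QueryRow(id, sql));
        return id;
    }

    /** Removes a finished query. */
    void unregister(long id) {
        running.remove(id);
    }

    /** Read-only snapshot that a monitoring exporter could iterate as a table. */
    List<QueryRow> view() {
        return Collections.unmodifiableList(new ArrayList<>(running.values()));
    }
}
```

In this sketch the tracking logic stays inside the manager, and the view is only a read-only window onto its state.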

On Fri, Oct 4, 2019 at 14:12, Nikolay Izhikov wrote:
>
> Hello, Ivan.
>
> > 1. How are system views going to be exposed? Is there any difference
> > in comparison to other metrics?
>
> We have a `SystemViewExporterSpi`.
> Built-in implementations are `JmxSystemViewExporterSpi` and 
> `SqlViewExporterSpi`.
>
> > 2. What should be done to adapt RunningQueryManager to the SystemView API?
>
> I think we should replace `RunningQueryManager` with the special SystemView 
> implementation.
>
> > what is the difference between metrics and system views?
>
> Actually, it's a very good question :)
>
> A system view is a collection of internal Ignite objects exported to a user.
> Each system view is a table.
>
> A metric is a value representing some instantaneous state of an internal 
> Ignite object.
> So it's a "cell" of a table.
>
> We need metrics to build charts and the history of processes.
> We need system views to know which objects exist on a node and what their
> parameters are.
>
> On Fri, 04/10/2019 at 11:51 +0300, Ivan Pavlukhin wrote:
> > Nikolay,
> >
> > I checked the IEP [1]. The SystemView API is clearer to me now.
> > Follow-up questions:
> > 1. How are system views going to be exposed? Is there any difference
> > in comparison to other metrics?
> > 2. What should be done to adapt RunningQueryManager to the SystemView API?
> >
> > Also, some bits for my understanding. I do not have a clear intuition
> > about the difference between metrics and system views. For example,
> > how is a system view different from a metric holding a collection of
> > values? And why were they introduced as a separate class?
> >
> > [1] 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> >
> > On Thu, Oct 3, 2019 at 16:37, Nikolay Izhikov wrote:
> > >
> > > Hello, Ivan.
> > >
> > > Thanks for the feedback.
> > >
> > > The initial IEP [1] naming was changed during code review.
> > > I updated the IEP [1] with the current naming.
> > >
> > > Can you take a look and check whether everything is clear now?
> > >
> > > [1] 
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > >
> > > On Wed, 02/10/2019 at 17:21 +0300, Ivan Pavlukhin wrote:
> > > > Hi Nikolay,
> > > >
> > > > Actually, I do not fully understand what the SystemView API is. I have not
> > > > found it in IEP [1] (I searched for the words "system" and "view").
> > > >
> > > > RunningQueryManager is a component responsible for tracking running
> > > > queries internally. This info is exposed to users as an SQL view via
> > > > SqlSystemViewRunningQueries. In the same package you can find plenty
> > > > of other SQL views.
> > > >
> > > > [1] 
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > >
> > > > On Tue, Oct 1, 2019 at 06:42, Nikolay Izhikov wrote:
> > > > >
> > > > > Hello, Igniters.
> > > > >
> > > > > `RunningQueryManager` [1] has been added since the last release.
> > > > > It is used to track running queries.
> > > > >
> > > > > In IEP-35 [2] the SystemView API was added.
> > > > > The SystemView API is supposed to be used to track all kinds of internal 
> > > > > Ignite objects.
> > > > >
> > > > > I think this RunningQueryManager should be replaced [3] with the more 
> > > > > unified SystemView API.
> > > > >
> > > > > Any objections?
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-10754
> > > > > [2] 
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > [3] https://issues.apache.org/jira/browse/IGNITE-12223
> > > > > [4] https://issues.apache.org/jira/browse/IGNITE-12224
> > > >
> > > >
> > > >
> >
> >
> >



-- 
Best regards,
Ivan Pavlukhin

[jira] [Created] (IGNITE-12259) Create new module for support spring-5.2.X and spring-data-2.2.X

2019-10-04 Thread Surkov Aleksandr (Jira)

Surkov Aleksandr created IGNITE-12259:
-

 Summary: Create new module for support spring-5.2.X and 
spring-data-2.2.X
 Key: IGNITE-12259
 URL: https://issues.apache.org/jira/browse/IGNITE-12259
 Project: Ignite
  Issue Type: Wish
Reporter: Surkov Aleksandr


The current Spring version is 
[5.2.0.RELEASE|https://mvnrepository.com/artifact/org.springframework/spring-context/5.2.0.RELEASE],
and the current Spring Data version is 
[2.2.0.RELEASE|https://mvnrepository.com/artifact/org.springframework.data/spring-data-commons/2.2.0.RELEASE].

It would be nice to add a module to support these versions.




Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Nikolay Izhikov

Ivan.

> We shouldn't force users to configure external tools and write extra code for 
> basic things.

Actually, I don't agree with you.
Having an external monitoring system for any production cluster is a *basic* thing.

Can you please define "basic things"?

> single method for the whole cluster

Can you clarify what exactly you mean?
We have a ticket [1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values for each 
node in the cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191


On Fri, 04/10/2019 at 17:09 +0300, Ivan Rakov wrote:
> Max,
> 
> What if the user simply doesn't have a monitoring system configured?
> Knowing whether the cluster will survive a node shutdown is critical for any 
> administrator who performs any manipulations with the cluster topology.
> Essential information should be easily accessible. We shouldn't force 
> users to configure external tools and write extra code for basic things.
> 
> Alex,
> 
> Thanks, that's exactly the metric we need.
> My point is that we should make it more accessible: via a control.sh 
> command and a single method for the whole cluster.
> 
> Best Regards,
> Ivan Rakov
> 
> On 04.10.2019 16:34, Alex Plehanov wrote:
> > Ivan, there is already a metric,
> > CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
> > current redundancy level for the cache group.
> > We can lose up to (getMinimumNumberOfPartitionCopies - 1) nodes without data
> > loss in this cache group.
> > 
> > On Fri, Oct 4, 2019 at 16:17, Ivan Rakov wrote:
> > 
> > > Igniters,
> > > 
> > > I've seen numerous requests for an easy way to check whether it is
> > > safe to turn off a cluster node. As we know, Ignite protects against
> > > sudden node shutdown by keeping several backup
> > > copies of each partition. However, this guarantee can be weakened for a
> > > while if the cluster has recently experienced a node restart and the
> > > rebalancing process is still in progress.
> > > An example scenario is restarting nodes one by one in order to update a
> > > local configuration parameter. The user restarts one node and rebalancing
> > > starts: once it completes, it is safe to proceed (backup
> > > count=1). However, there's no transparent way to determine whether
> > > rebalancing is over.
> > >
> > > From my perspective, it would be very helpful to:
> > > 1) Add information about rebalancing and the number of free-to-go nodes to
> > > the ./control.sh --state command.
> > > Examples of output:
> > > 
> > > > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > > > Cluster tag: new_tag
> > > > 
> > > > Cluster is active
> > > > All partitions are up-to-date.
> > > > 3 node(s) can safely leave the cluster without partition loss.
> > > 
> > > > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > > > Cluster tag: new_tag
> > > > 
> > > > Cluster is active
> > > > Rebalancing is in progress.
> > > > 1 node(s) can safely leave the cluster without partition loss.
> > > 
> > > 2) Provide the same information via ClusterMetrics. For example:
> > > ClusterMetrics#isRebalanceInProgress // boolean
> > > ClusterMetrics#getSafeToLeaveNodesCount // int
> > > 
> > > Here I need to mention that this information can be calculated from
> > > existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
> > > still think that we need a simpler and more understandable flag showing
> > > whether the cluster is in danger of data loss. Another point is that the
> > > current metrics are bound to a specific cache, which makes this
> > > information even harder to analyze.
> > > 
> > > Thoughts?
> > > 
> > > --
> > > Best Regards,
> > > Ivan Rakov
> > > 
> > > 



Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov


Max,

What if the user simply doesn't have a monitoring system configured?
Knowing whether the cluster will survive a node shutdown is critical for any 
administrator who performs any manipulations with the cluster topology.
Essential information should be easily accessible. We shouldn't force 
users to configure external tools and write extra code for basic things.


Alex,

Thanks, that's exactly the metric we need.
My point is that we should make it more accessible: via a control.sh 
command and a single method for the whole cluster.


Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

> Ivan, there is already a metric,
> CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
> current redundancy level for the cache group.
> We can lose up to (getMinimumNumberOfPartitionCopies - 1) nodes without data
> loss in this cache group.
>
> On Fri, Oct 4, 2019 at 16:17, Ivan Rakov wrote:
>
> > Igniters,
> >
> > I've seen numerous requests for an easy way to check whether it is
> > safe to turn off a cluster node. As we know, Ignite protects against
> > sudden node shutdown by keeping several backup
> > copies of each partition. However, this guarantee can be weakened for a
> > while if the cluster has recently experienced a node restart and the
> > rebalancing process is still in progress.
> > An example scenario is restarting nodes one by one in order to update a
> > local configuration parameter. The user restarts one node and rebalancing
> > starts: once it completes, it is safe to proceed (backup
> > count=1). However, there's no transparent way to determine whether
> > rebalancing is over.
> >
> > From my perspective, it would be very helpful to:
> > 1) Add information about rebalancing and the number of free-to-go nodes to
> > the ./control.sh --state command.
> > Examples of output:
> >
> > > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > > Cluster tag: new_tag
> > >
> > > Cluster is active
> > > All partitions are up-to-date.
> > > 3 node(s) can safely leave the cluster without partition loss.
> >
> > > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > > Cluster tag: new_tag
> > >
> > > Cluster is active
> > > Rebalancing is in progress.
> > > 1 node(s) can safely leave the cluster without partition loss.
> >
> > 2) Provide the same information via ClusterMetrics. For example:
> > ClusterMetrics#isRebalanceInProgress // boolean
> > ClusterMetrics#getSafeToLeaveNodesCount // int
> >
> > Here I need to mention that this information can be calculated from
> > existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
> > still think that we need a simpler and more understandable flag showing
> > whether the cluster is in danger of data loss. Another point is that the
> > current metrics are bound to a specific cache, which makes this
> > information even harder to analyze.
> >
> > Thoughts?
> >
> > --
> > Best Regards,
> > Ivan Rakov

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ilya Kasnacheev

Hello!

That's a very useful metric which we have already discussed in the past. It
may be called "cluster backup factor" and "effective cache backup factor".
You can look up other mentions by searching the mailing list archives.

Regards,
-- 
Ilya Kasnacheev


On Fri, Oct 4, 2019 at 16:17, Ivan Rakov wrote:

> Igniters,
>
> I've seen numerous requests for an easy way to check whether it is
> safe to turn off a cluster node. As we know, Ignite protects against
> sudden node shutdown by keeping several backup
> copies of each partition. However, this guarantee can be weakened for a
> while if the cluster has recently experienced a node restart and the
> rebalancing process is still in progress.
> An example scenario is restarting nodes one by one in order to update a
> local configuration parameter. The user restarts one node and rebalancing
> starts: once it completes, it is safe to proceed (backup
> count=1). However, there's no transparent way to determine whether
> rebalancing is over.
>
> From my perspective, it would be very helpful to:
> 1) Add information about rebalancing and the number of free-to-go nodes to
> the ./control.sh --state command.
> Examples of output:
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> >
> > Cluster is active
> > All partitions are up-to-date.
> > 3 node(s) can safely leave the cluster without partition loss.
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> >
> > Cluster is active
> > Rebalancing is in progress.
> > 1 node(s) can safely leave the cluster without partition loss.
>
> 2) Provide the same information via ClusterMetrics. For example:
> ClusterMetrics#isRebalanceInProgress // boolean
> ClusterMetrics#getSafeToLeaveNodesCount // int
>
> Here I need to mention that this information can be calculated from
> existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
> still think that we need a simpler and more understandable flag showing
> whether the cluster is in danger of data loss. Another point is that the
> current metrics are bound to a specific cache, which makes this
> information even harder to analyze.
>
> Thoughts?
>
> --
> Best Regards,
> Ivan Rakov
>
>

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Alex Plehanov

Ivan, there is already a metric,
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
current redundancy level for the cache group.
We can lose up to (getMinimumNumberOfPartitionCopies - 1) nodes without data
loss in this cache group.
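
The arithmetic above can be sketched in plain Java. The per-group rule (copies - 1) is the one stated in this message; taking the minimum across all cache groups to get a cluster-wide number is an assumption made for illustration, and SafeToLeaveSketch is a hypothetical helper, not an Ignite class.

```java
import java.util.List;

/** Hypothetical helper: derives a cluster-wide "safe to leave" count. */
final class SafeToLeaveSketch {
    /**
     * @param minCopiesPerGroup value of getMinimumNumberOfPartitionCopies
     *                          collected for every cache group.
     * @return how many nodes may leave without data loss in any group
     *         (assumption: the weakest group limits the whole cluster).
     */
    static int safeToLeave(List<Integer> minCopiesPerGroup) {
        int min = Integer.MAX_VALUE;

        for (int copies : minCopiesPerGroup)
            min = Math.min(min, copies);

        // No cache groups reported: answer conservatively with 0.
        if (min == Integer.MAX_VALUE)
            return 0;

        // A group with N copies of every partition tolerates N - 1 lost nodes.
        return Math.max(0, min - 1);
    }
}
```

For example, with three cache groups reporting 3, 2 and 4 copies, the sketch yields 1, because the group with 2 copies tolerates only one lost node.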

On Fri, Oct 4, 2019 at 16:17, Ivan Rakov wrote:

> Igniters,
>
> I've seen numerous requests for an easy way to check whether it is
> safe to turn off a cluster node. As we know, Ignite protects against
> sudden node shutdown by keeping several backup
> copies of each partition. However, this guarantee can be weakened for a
> while if the cluster has recently experienced a node restart and the
> rebalancing process is still in progress.
> An example scenario is restarting nodes one by one in order to update a
> local configuration parameter. The user restarts one node and rebalancing
> starts: once it completes, it is safe to proceed (backup
> count=1). However, there's no transparent way to determine whether
> rebalancing is over.
>
> From my perspective, it would be very helpful to:
> 1) Add information about rebalancing and the number of free-to-go nodes to
> the ./control.sh --state command.
> Examples of output:
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> >
> > Cluster is active
> > All partitions are up-to-date.
> > 3 node(s) can safely leave the cluster without partition loss.
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> >
> > Cluster is active
> > Rebalancing is in progress.
> > 1 node(s) can safely leave the cluster without partition loss.
>
> 2) Provide the same information via ClusterMetrics. For example:
> ClusterMetrics#isRebalanceInProgress // boolean
> ClusterMetrics#getSafeToLeaveNodesCount // int
>
> Here I need to mention that this information can be calculated from
> existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
> still think that we need a simpler and more understandable flag showing
> whether the cluster is in danger of data loss. Another point is that the
> current metrics are bound to a specific cache, which makes this
> information even harder to analyze.
>
> Thoughts?
>
> --
> Best Regards,
> Ivan Rakov
>
>

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Maxim Muzafarov

Ivan,

1. I think the rebalance cache metrics should be deprecated and
removed (someday). Issue [1] covers this.

2. I think #isRebalanceInProgress can and should be calculated by an
external monitoring system from the #localMovingPartitionsCount > 0
values (or the more precise rebalancingPartitionsLeft value from
issue [1]) gathered from each online node. Also, we should provide
such templates for each monitoring system (Zabbix, Prometheus etc.).

[1] https://issues.apache.org/jira/browse/IGNITE-12183
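
As an illustration of point 2, the aggregation an external monitoring system would perform can be sketched in plain Java. RebalanceAggregatorSketch and the node-id-to-count map are hypothetical; how the per-node localMovingPartitionsCount values are actually collected (JMX, an exporter, a Zabbix or Prometheus template) is outside this sketch.

```java
import java.util.Map;

/** Hypothetical aggregation done on the monitoring side, not inside Ignite. */
final class RebalanceAggregatorSketch {
    /**
     * @param movingPartitionsByNode node id mapped to the moving-partitions
     *                               count gathered from that online node.
     * @return true if at least one node still has partitions in the moving state.
     */
    static boolean isRebalanceInProgress(Map<String, Integer> movingPartitionsByNode) {
        for (int moving : movingPartitionsByNode.values()) {
            if (moving > 0)
                return true;
        }

        return false;
    }
}
```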

On Fri, 4 Oct 2019 at 16:17, Ivan Rakov  wrote:
>
> Igniters,
>
> I've seen numerous requests for an easy way to check whether it is
> safe to turn off a cluster node. As we know, Ignite protects against
> sudden node shutdown by keeping several backup
> copies of each partition. However, this guarantee can be weakened for a
> while if the cluster has recently experienced a node restart and the
> rebalancing process is still in progress.
> An example scenario is restarting nodes one by one in order to update a
> local configuration parameter. The user restarts one node and rebalancing
> starts: once it completes, it is safe to proceed (backup
> count=1). However, there's no transparent way to determine whether
> rebalancing is over.
>
> From my perspective, it would be very helpful to:
> 1) Add information about rebalancing and the number of free-to-go nodes to
> the ./control.sh --state command.
> Examples of output:
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> > 
> > Cluster is active
> > All partitions are up-to-date.
> > 3 node(s) can safely leave the cluster without partition loss.
>
> > Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
> > Cluster tag: new_tag
> > 
> > Cluster is active
> > Rebalancing is in progress.
> > 1 node(s) can safely leave the cluster without partition loss.
>
> 2) Provide the same information via ClusterMetrics. For example:
> ClusterMetrics#isRebalanceInProgress // boolean
> ClusterMetrics#getSafeToLeaveNodesCount // int
>
> Here I need to mention that this information can be calculated from
> existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
> still think that we need a simpler and more understandable flag showing
> whether the cluster is in danger of data loss. Another point is that the
> current metrics are bound to a specific cache, which makes this
> information even harder to analyze.
>
> Thoughts?
>
> --
> Best Regards,
> Ivan Rakov
>

Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov


Igniters,

I've seen numerous requests for an easy way to check whether it is 
safe to turn off a cluster node. As we know, Ignite protects against 
sudden node shutdown by keeping several backup 
copies of each partition. However, this guarantee can be weakened for a 
while if the cluster has recently experienced a node restart and the 
rebalancing process is still in progress.
An example scenario is restarting nodes one by one in order to update a 
local configuration parameter. The user restarts one node and rebalancing 
starts: once it completes, it is safe to proceed (backup 
count=1). However, there's no transparent way to determine whether 
rebalancing is over.

From my perspective, it would be very helpful to:
1) Add information about rebalancing and the number of free-to-go nodes to 
the ./control.sh --state command.

Examples of output:

Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag

Cluster is active
All partitions are up-to-date.
3 node(s) can safely leave the cluster without partition loss.

Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag

Cluster is active
Rebalancing is in progress.
1 node(s) can safely leave the cluster without partition loss.

2) Provide the same information via ClusterMetrics. For example:
ClusterMetrics#isRebalanceInProgress // boolean
ClusterMetrics#getSafeToLeaveNodesCount // int

Here I need to mention that this information can be calculated from 
existing rebalance metrics (see CacheMetrics#*rebalance*). However, I 
still think that we need a simpler and more understandable flag showing 
whether the cluster is in danger of data loss. Another point is that the 
current metrics are bound to a specific cache, which makes this 
information even harder to analyze.
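
To illustrate how the two proposed values could be consumed, here is a minimal sketch in plain Java. The two inputs mirror the proposed (not yet existing) ClusterMetrics#isRebalanceInProgress and ClusterMetrics#getSafeToLeaveNodesCount; ClusterStateReportSketch and both of its methods are hypothetical names, and the rule "a rolling restart may proceed when at least one node can leave" is an assumption made for this example.

```java
/** Hypothetical consumer of the two proposed cluster-wide values. */
final class ClusterStateReportSketch {
    /** Renders the two status lines in the style of the example output above. */
    static String report(boolean rebalanceInProgress, int safeToLeaveNodes) {
        String first = rebalanceInProgress
            ? "Rebalancing is in progress."
            : "All partitions are up-to-date.";

        return first + "\n" + safeToLeaveNodes
            + " node(s) can safely leave the cluster without partition loss.";
    }

    /** Assumed gate for a rolling restart: stop the next node only if one can leave. */
    static boolean mayStopNextNode(int safeToLeaveNodes) {
        return safeToLeaveNodes >= 1;
    }
}
```

With backup count=1 and rebalancing still running, the count would be 0, so a rolling-restart script built on this gate would wait before stopping the next node.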


Thoughts?

--
Best Regards,
Ivan Rakov

Re: Getting involved in Apache Ignite

2019-10-04 Thread Ilya Kasnacheev

Hello!

Do you have an Apache JIRA login? Please create one if you do not have it, and
share it with us so that you can be added to the contributors.

Regards,
-- 
Ilya Kasnacheev


On Fri, Oct 4, 2019 at 10:59, Emmanouil Gkatziouras wrote:

> Greetings,
>
> I am amazed by Apache Ignite and its features!
> For my use case, integrating with Google Cloud Pub/Sub and Amazon SQS would
> help me get the most out of it.
>
> Since developing those streamers is something I would do in any case, I
> would like to get involved in your project and therefore give back to the
> project and make those features available to the community.
>
> I have contributed to projects such as the InfluxDB Java Driver and
> Alpakka.
> Part of my everyday work has to do with implementing solutions in the
> cloud, thus I can contribute to the streaming solutions that have to do
> with cloud providers, particularly GCP Pub/Sub and AWS SQS, as well
> as other cloud-based messaging systems such as Azure Storage Queues.
> I would also like to propose adding a streamer implementation for cache
> invalidation, as I have some use cases in need of it.
>
> You can find me on LinkedIn (link in my signature) and get to know my
> background a little more.
>
> Thanks for your great work so far!
> Regards,
>
> *Emmanouil Gkatziouras*
> https://egkatzioura.com/ |
> https://www.linkedin.com/in/gkatziourasemmanouil/
> https://github.com/gkatzioura
>

Re: Text queries/indexes (GridLuceneIndex, @QueryTextField)

2019-10-04 Thread Yuriy Shuliga

Andrew,

I have corrected the PR according to your notes. Please review.
What are the next steps to get it merged?

Y.

On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov wrote:

> Yuri,
>
> I'm done with the review.
> No crime found, just a trivial compatibility bug.
>
> On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga  wrote:
>
> > Denis,
> >
> > Thank you for your attention to this.
> > As of now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket
> > is still pending review.
> > Do we have a chance to move it forward somehow?
> >
> > BR,
> > Yuriy Shuliha
> >
> > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> >
> > > Yuriy,
> > >
> > > I've seen that you opened a pull request with the first changes:
> > > https://issues.apache.org/jira/browse/IGNITE-12189
> > >
> > > Alex Scherbakov and Ivan, are you the right guys to do the review?
> > >
> > > -
> > > Denis
> > >
> > >
> > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван wrote:
> > >
> > > > Yuriy,
> > > >
> > > > Thank you for providing details! Quite interesting.
> > > >
> > > > Yes, we already support distributed limit and merging of sorted
> > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > MergeStreamIterator are used for merging sorted streams.
> > > >
> > > > Could you please also clarify about score/relevance? Is it provided by
> > > > the Lucene engine for each query result? I am thinking about how to do a
> > > > sorted merge properly in this case.
> > > >
> > > > On Wed, Sep 25, 2019 at 18:56, Yuriy Shuliga wrote:
> > > > >
> > > > > Ivan,
> > > > >
> > > > > Thank you for the interesting question!
> > > > >
> > > > > Text searches (or full-text searches) are mostly human-oriented, and the
> > > > > point of the user's interest is the topmost part of the response.
> > > > > The user can then read it, evaluate it and use the given records for
> > > > > further purposes.
> > > > >
> > > > > In our case in particular, we use Ignite for operations with financial
> > > > > data, and there is a lot of text there, such as asset names, financial
> > > > > instruments, companies etc.
> > > > > To operate on this quickly and reliably, users are used to working with
> > > > > text search, type-ahead completions and suggestions.
> > > > >
> > > > > For these purposes we index particular string data in separate caches.
> > > > >
> > > > > Sorting capabilities and response size limits are very important
> > > > > there, as our API has to provide the most relevant information within a
> > > > > limited size.
> > > > >
> > > > > Now let me comment on the Ignite/Lucene perspective.
> > > > > Actually, Ignite queries Lucene, and Lucene returns *TopDocs.scoreDocs*
> > > > > already sorted by *score* (relevance), so the most relevant documents are
> > > > > at the top.
> > > > > Currently, distributed query responses from different nodes are merged
> > > > > into the final query cursor queue in an arbitrary order,
> > > > > so in fact the score order is already ruined here. Also, Ignite
> > > > > requests all possible documents from Lucene, which is redundant and not
> > > > > good for performance.
> > > > >
> > > > > I'm implementing the *limit* parameter as part of *TextQuery* and have to
> > > > > note that we still have to add sorting to text query processing in
> > > > > order to get usable results.
> > > > >
> > > > > The *limit* parameter itself should address part of the issues above, but
> > > > > sorting by document score, at least, should definitely be implemented
> > > > > along with the limit.
> > > > >
> > > > > This is a pretty short commentary if you still have any questions,
> > > please
> > > > > ask, do not hesitate)
> > > > >
> > > > > BR,
> > > > > Yuriy Shuliha
> > > > >
> > > > > On Thu, Sep 19, 2019 at 11:38, Павлухин Иван wrote:
> > > > >
> > > > > > Yuriy,
> > > > > >
> > > > > > Greatly appreciate your interest.
> > > > > >
> > > > > > Could you please elaborate a little bit about sorting? What tasks
> > > does
> > > > > > it help to solve and how? It would be great to provide an
> example.
> > > > > >
> > > > > > > On Wed, Sep 18, 2019 at 09:39, Alexei Scherbakov <
> > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > Denis,
> > > > > > >
> > > > > > > I like the idea of throwing an exception for enabled text
> queries
> > > on
> > > > > > > persistent caches.
> > > > > > >
> > > > > > > Also I'm fine with proposed limit for unsorted searches.
> > > > > > >
> > > > > > > Yury, please proceed with ticket creation.
> > > > > > >
> > > > > > > On Tue, Sep 17, 2019 at 22:06, Denis Magda wrote:
> > > > > > >
> > > > > > > > Igniters,
> > > > > > > >
> > > > > > > > I see nothing wrong with Yury's proposal in regards full-text
> > > > search
> > > > > > API
> > > > > > > > evolution as long as Yury is ready to push it forward.
> > > > > > > >
> > > > > > > > As for the in-memory mode only, it makes total sense for
> > > in-memory
> > > > data
> > > > > >

Re: [DISCUSSION][IEP-35] Replace RunningQueryManager with GridSystemViewManager

2019-10-04 Thread Nikolay Izhikov

Hello, Ivan.

> 1. How system views are going to be exposed? Is there any difference
> in comparison to other metrics?

We have a `SystemViewExporterSpi`.
Built-in implementations are `JmxSystemViewExporterSpi` and 
`SqlViewExporterSpi`.

> 2. What should be done to adopt RunningQueryManager to SystemView API?

I think we should replace `RunningQueryManager` with the special SystemView 
implementation.

> what is the difference between metrics and system views?

Actually, it's a very good question :)

A system view is a collection of internal Ignite objects exported to a user.
Each system view is a table.

A metric is a value representing some instantaneous state of an internal Ignite 
object.
So it's a "cell" of a table.

We need metrics to build charts and the history of processes.
We need system views to know which objects exist on a node and what their 
parameters are.
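
To make the distinction concrete, here is a minimal sketch in plain Java. It is only an illustration: `LongMetric`, `QueryRow` and `RunningQueriesView` are hypothetical names invented for this example and are not the Ignite SystemView or metrics API. The metric is a single sampled number, while the view is the table of objects that number is derived from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical illustration only; none of these types are Ignite classes. */
final class MetricVsViewSketch {
    /** A metric: one named value sampled at an instant (a "cell"). */
    static final class LongMetric {
        final String name;
        private final LongSupplier source;

        LongMetric(String name, LongSupplier source) {
            this.name = name;
            this.source = source;
        }

        long value() {
            return source.getAsLong();
        }
    }

    /** One row of a view: the parameters of a single internal object. */
    static final class QueryRow {
        final long id;
        final String sql;
        final long startTimeMs;

        QueryRow(long id, String sql, long startTimeMs) {
            this.id = id;
            this.sql = sql;
            this.startTimeMs = startTimeMs;
        }
    }

    /** A system view: a table whose rows are the objects that currently exist. */
    static final class RunningQueriesView {
        private final List<QueryRow> rows = new ArrayList<>();

        void register(QueryRow row) {
            rows.add(row);
        }

        void unregister(long id) {
            rows.removeIf(r -> r.id == id);
        }

        /** Returns a copy so callers cannot modify the view. */
        List<QueryRow> rows() {
            return new ArrayList<>(rows);
        }

        int size() {
            return rows.size();
        }
    }

    public static void main(String[] args) {
        RunningQueriesView view = new RunningQueriesView();
        view.register(new QueryRow(1, "SELECT * FROM Person", 1000L));
        view.register(new QueryRow(2, "SELECT name FROM City", 1500L));

        // The metric answers "how many?" and is suitable for a chart.
        LongMetric running = new LongMetric("queries.running", view::size);
        System.out.println(running.name + " = " + running.value());

        // The view answers "which ones, and with what parameters?".
        for (QueryRow r : view.rows())
            System.out.println(r.id + " | " + r.sql + " | " + r.startTimeMs);
    }
}
```

In this sketch the metric could be recomputed from the view, but not the other way around, which is one way to read the "cell" versus "table" comparison.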

On Fri, 04/10/2019 at 11:51 +0300, Ivan Pavlukhin wrote:
> Nikolay,
> 
> I checked the IEP [1]. Now it is more clear for me about SystemView
> API. Follow-up questions:
> 1. How system views are going to be exposed? Is there any difference
> in comparison to other metrics?
> 2. What should be done to adopt RunningQueryManager to SystemView API?
> 
> Also some bits for my understanding. I do not have a clear intuition
> what is the difference between metrics and system views? For example,
> how a system view is different from a metric holding a collection of
> values? And why they were introduced as a separate class?
> 
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> 
> On Thu, Oct 3, 2019 at 16:37, Nikolay Izhikov wrote:
> > 
> > Hello, Ivan.
> > 
> > Thanks for feedback.
> > 
> > Initial IEP [1] naming was changed during code review.
> > I updated the IEP [1] with the current naming.
> > 
> > Can you take a look and check is all clear now?
> > 
> > [1] 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > 
> > On Wed, 02/10/2019 at 17:21 +0300, Ivan Pavlukhin wrote:
> > > Hi Nikolay,
> > > 
> > > Actually I do not fully understand what is SystemView API. I have not
> > > found it in IEP [1] (I searched for words "system" and "view").
> > > 
> > > RunningQueryManager is a component responsible for tracking running
> > > queries internally. This info is exposed to users as SQL view via
> > > SqlSystemViewRunningQueries. In the same package you can find a plenty
> > > of other SQL views.
> > > 
> > > [1] 
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > 
> > > On Tue, Oct 1, 2019 at 06:42, Nikolay Izhikov wrote:
> > > > 
> > > > Hello, Igniters.
> > > > 
> > > > Since the last release `RunningQueryManager` [1] was added.
> > > > It used to track a running query.
> > > > 
> > > > In IEP-35 [2] SystemView API was added.
> > > > SystemView API supposed to be used to track all kinds of internal 
> > > > Ignite objects.
> > > > 
> > > > I think this RunningQueryManager should be replaced [3] with the more 
> > > > unified SystemView API.
> > > > 
> > > > Any objections?
> > > > 
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-10754
> > > > [2] 
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > [3] https://issues.apache.org/jira/browse/IGNITE-12223
> > > > [4] https://issues.apache.org/jira/browse/IGNITE-12224
> > > 
> > > 
> > > 
> 
> 
> 



[jira] [Created] (IGNITE-12258) .NET: Add ContinuousQueryWithTransformer

2019-10-04 Thread Pavel Tupitsyn (Jira)

Pavel Tupitsyn created IGNITE-12258:
---

 Summary: .NET: Add ContinuousQueryWithTransformer
 Key: IGNITE-12258
 URL: https://issues.apache.org/jira/browse/IGNITE-12258
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn


ContinuousQueryWithTransformer is a powerful mechanism to improve continuous 
query performance by sending only relevant data back to listener nodes.
https://apacheignite.readme.io/docs/continuous-queries#section-remote-transformer



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: How to free up space on disc after removing entries from IgniteCache with enabled PDS?

2019-10-04 Thread Maxim Muzafarov

Igniters,

This thread seems to be endless, but what if some kind of cache group
distributed write lock (exclusive to some internal Ignite processes)
were introduced? I think it would help to solve a batch of
problems, such as:

1. defragmentation of all cache group partitions on the local node
without concurrent updates.
2. improved data loading with the data streamer isolation mode [1]. It
seems we should not allow concurrent updates to a cache while we are at the
`fast data load` step.
3. recovery from a snapshot without cache stop/start actions.
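
As a rough illustration of the locking idea, here is a single-JVM sketch built on `java.util.concurrent.locks.ReentrantReadWriteLock`. It is an assumption-laden toy: `CacheGroupLockSketch`, `tryPut` and `runExclusive` are hypothetical names, and a real distributed lock would additionally need cluster-wide coordination that is not shown here. Regular updates take the lock in shared mode, while a maintenance step (defragmentation, fast data load, snapshot restore) takes it exclusively.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical local stand-in for a cache group write lock; not Ignite code. */
final class CacheGroupLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> data = new HashMap<>();

    /**
     * Regular update: takes the group lock in shared mode.
     * Returns false instead of blocking if another thread holds it exclusively.
     */
    boolean tryPut(String key, String val) {
        if (!lock.readLock().tryLock())
            return false; // Maintenance is in progress on another thread.

        try {
            synchronized (data) {
                data.put(key, val);
            }

            return true;
        }
        finally {
            lock.readLock().unlock();
        }
    }

    /** Maintenance step: waits for in-flight updates, then runs alone. */
    void runExclusive(Runnable maintenance) {
        lock.writeLock().lock();

        try {
            maintenance.run();
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        synchronized (data) {
            return data.size();
        }
    }

    public static void main(String[] args) {
        CacheGroupLockSketch grp = new CacheGroupLockSketch();

        System.out.println("update accepted: " + grp.tryPut("k1", "v1"));

        grp.runExclusive(() -> System.out.println("defragmenting " + grp.size() + " entries"));
    }
}
```

Whether rejected updates should fail fast, as here, or wait for the maintenance step to finish is a design choice the sketch does not try to settle.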


[1] https://issues.apache.org/jira/browse/IGNITE-11793

On Thu, 3 Oct 2019 at 22:50, Sergey Kozlov  wrote:
>
> Hi
>
> I'm not sure that taking a node offline is the best way to do that.
> Cons:
>  - different caches may have different degrees of fragmentation, but we are
> forced to stop the whole node
>  - taking a node offline is a maintenance operation that will require adding
> +1 backup to reduce the risk of data loss
>  - baseline auto adjustment?
>  - impact to index rebuild?
>  - cache configuration changes (or destroy) during node offline
>
> What about other ways without node stop? E.g. make cache group on a node
> offline? Add *defrag  *command to control.sh to force start
> rebalance internally in the node with expected impact to performance.
>
>
>
> On Thu, Oct 3, 2019 at 12:08 PM Anton Vinogradov  wrote:
>
> > Alexey,
> > As for me, it does not matter will it be IEP, umbrella or a single issue.
> > The most important thing is Assignee :)
> >
> > On Thu, Oct 3, 2019 at 11:59 AM Alexey Goncharuk <
> > alexey.goncha...@gmail.com>
> > wrote:
> >
> > > Anton, do you think we should file a single ticket for this or should we
> > go
> > > with an IEP? As of now, the change does not look big enough for an IEP
> > for
> > > me.
> > >
> > > On Thu, Oct 3, 2019 at 11:18, Anton Vinogradov wrote:
> > >
> > > > Alexey,
> > > >
> > > > Sounds good to me.
> > > >
> > > > On Thu, Oct 3, 2019 at 10:51 AM Alexey Goncharuk <
> > > > alexey.goncha...@gmail.com>
> > > > wrote:
> > > >
> > > > > Anton,
> > > > >
> > > > > Switching a partition to and from the SHRINKING state will require
> > > > > intricate synchronizations in order to properly determine the start
> > > > > position for historical rebalance without PME.
> > > > >
> > > > > I would still go with an offline-node approach, but instead of
> > cleaning
> > > > the
> > > > > persistence, we can do effective defragmentation when the node is
> > > offline
> > > > > because we are sure that there is no concurrent load. After the
> > > > > defragmentation completes, we bring the node back to the cluster and
> > > > > historical rebalance will kick in automatically. It will still
> > require
> > > > > manual node restarts, but since the data is not removed, there are no
> > > > > additional risks. Also, this will be an excellent solution for those
> > > who
> > > > > can afford downtime and execute the defragment command on all nodes
> > in
> > > > the
> > > > > cluster simultaneously - this will be the fastest way possible.
> > > > >
> > > > > --AG
> > > > >
> > > > > On Mon, Sep 30, 2019 at 09:29, Anton Vinogradov wrote:
> > > > >
> > > > > > Alexei,
> > > > > > >> stopping fragmented node and removing partition data, then
> > > starting
> > > > it
> > > > > > again
> > > > > >
> > > > > > That's exactly what we're doing to solve the fragmentation issue.
> > > > > > The problem here is that we have to perform N/B restart-rebalance
> > > > > > operations (N - cluster size, B - backups count) and it takes a lot
> > > of
> > > > > time
> > > > > > with risks to lose the data.
> > > > > >
> > > > > > On Fri, Sep 27, 2019 at 5:49 PM Alexei Scherbakov <
> > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > >
> > > > > > > Probably this should be allowed to do using public API, actually
> > > this
> > > > > is
> > > > > > > same as manual rebalancing.
> > > > > > >
> > > > > > > On Fri, Sep 27, 2019 at 17:40, Alexei Scherbakov <
> > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > > The poor man's solution for the problem would be stopping
> > > > fragmented
> > > > > > node
> > > > > > > > and removing partition data, then starting it again allowing
> > full
> > > > > state
> > > > > > > > transfer already without deletes.
> > > > > > > > Rinse and repeat for all owners.
> > > > > > > >
> > > > > > > > Anton Vinogradov, would this work for you as workaround ?
> > > > > > > >
> > > > > > > > > On Thu, Sep 19, 2019 at 13:03, Anton Vinogradov wrote:
> > > > > > > >
> > > > > > > >> Alexey,
> > > > > > > >>
> > > > > > > >> Let's combine your and Ivan's proposals.
> > > > > > > >>
> > > > > > > >> >> vacuum command, which acquires exclusive table lock, so no
> > > > > > concurrent
> > > > > > > >> activities on the table are possible.
> > > > > > > >> and
> > > > > > > >> >> Could the problem be solved by stopping a node which needs
> > to
> > > > be
> > > > > > > >> defragmented, clearing persistence files and restarting the
> > > node?
> > > > > > > >> >> After

Re: [DISCUSSION][IEP-35] Replace RunningQueryManager with GridSystemViewManager

2019-10-04 Thread Ivan Pavlukhin

Nikolay,

I checked the IEP [1]. The SystemView API is clearer to me now.
Follow-up questions:
1. How are system views going to be exposed? Is there any difference
in comparison to other metrics?
2. What should be done to adapt RunningQueryManager to the SystemView API?

Also some bits for my understanding. I do not have a clear intuition
about the difference between metrics and system views. For example,
how is a system view different from a metric holding a collection of
values? And why were they introduced as a separate class?

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392

On Thu, Oct 3, 2019 at 16:37, Nikolay Izhikov wrote:
>
> Hello, Ivan.
>
> Thanks for feedback.
>
> Initial IEP [1] naming was changed during code review.
> I updated the IEP [1] with the current naming.
>
> Can you take a look and check is all clear now?
>
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
>
> On Wed, 02/10/2019 at 17:21 +0300, Ivan Pavlukhin wrote:
> > Hi Nikolay,
> >
> > Actually I do not fully understand what is SystemView API. I have not
> > found it in IEP [1] (I searched for words "system" and "view").
> >
> > RunningQueryManager is a component responsible for tracking running
> > queries internally. This info is exposed to users as SQL view via
> > SqlSystemViewRunningQueries. In the same package you can find a plenty
> > of other SQL views.
> >
> > [1] 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> >
> > On Tue, Oct 1, 2019 at 06:42, Nikolay Izhikov wrote:
> > >
> > > Hello, Igniters.
> > >
> > > Since the last release `RunningQueryManager` [1] was added.
> > > It used to track a running query.
> > >
> > > In IEP-35 [2] SystemView API was added.
> > > SystemView API supposed to be used to track all kinds of internal Ignite 
> > > objects.
> > >
> > > I think this RunningQueryManager should be replaced [3] with the more 
> > > unified SystemView API.
> > >
> > > Any objections?
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-10754
> > > [2] 
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > [3] https://issues.apache.org/jira/browse/IGNITE-12223
> > > [4] https://issues.apache.org/jira/browse/IGNITE-12224
> >
> >
> >



-- 
Best regards,
Ivan Pavlukhin

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Ivan Pavlukhin

Yuriy,

Thank you, fine with it.

On Fri, Oct 4, 2019 at 11:01, Yuriy Shuliga wrote:
>
> Ivan,
>
> Yes, your observation is correct.
>
> This behavior lasts from the very beginning when Lucene indexing was
> implemented for distributed queries.
> Implementation of the *limit* solves the problem of redundant response
> size. Without it *ALL* off the records are fetched each time; that is not
> good, especially for loose patterns.
> In order to solve relevance issue correct sorting should be implemented.
>
> Y.
>
> On Fri, Oct 4, 2019 at 10:45, Ivan Pavlukhin wrote:
>
> > Yuriy,
> >
> > Am I getting it right that in your PR if we have a limit N than
> > returned items (at most N) will not be strictly the most relevant
> > ones? E.g. if one node returned N items faster than others but with
> > not so good relevance?
> >
> > On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov wrote:
> > >
> > > Yuri,
> > >
> > > I've done with review.
> > > No crime found, but trivial compatibility bug.
> > >
> > > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga  wrote:
> > >
> > > > Denis,
> > > >
> > > > Thank you for your attention to this.
> > > > as for now, the https://issues.apache.org/jira/browse/IGNITE-12189
> > ticket
> > > > is still pending review.
> > > > Do we have a chance to move it forward somehow?
> > > >
> > > > BR,
> > > > Yuriy Shuliha
> > > >
> > > > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> > > >
> > > > > Yuriy,
> > > > >
> > > > > I've seen you opening a pull-request with the first changes:
> > > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > > > >
> > > > > Alex Scherbakov and Ivan are you the right guys to do the review?
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > >
> > > > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван 
> > > > wrote:
> > > > >
> > > > > > Yuriy,
> > > > > >
> > > > > > Thank you for providing details! Quite interesting.
> > > > > >
> > > > > > Yes, we already have support of distributed limit and merging
> > sorted
> > > > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > > > MergeStreamIterator are used for merging sorted streams.
> > > > > >
> > > > > > Could you please also clarify about score/relevance? Is it
> > provided by
> > > > > > Lucene engine for each query result? I am thinking how to do sorted
> > > > > > merge properly in this case.
> > > > > >
> > > > > > On Wed, Sep 25, 2019 at 18:56, Yuriy Shuliga wrote:
> > > > > > >
> > > > > > > Ivan,
> > > > > > >
> > > > > > > Thank you for interesting question!
> > > > > > >
> > > > > > > Text searches (or full text searches) are mostly human-oriented.
> > And
> > > > > the
> > > > > > > point of user's interest is topmost part of response.
> > > > > > > Then user can read it, evaluate and use the given records for
> > further
> > > > > > > purposes.
> > > > > > >
> > > > > > > Particularly in our case, we use Ignite for operations with
> > financial
> > > > > > data,
> > > > > > > and there lots of text stuff like assets names, fin. instruments,
> > > > > > companies
> > > > > > > etc.
> > > > > > > In order to operate with this quickly and reliably, users used to
> > > > work
> > > > > > with
> > > > > > > text search, type-ahead completions, suggestions.
> > > > > > >
> > > > > > > For this purposes we are indexing particular string data in
> > separate
> > > > > > caches.
> > > > > > >
> > > > > > > Sorting capabilities and response size limitations are very
> > important
> > > > > > > there. As our API have to provide most relevant information in
> > view
> > > > of
> > > > > > > limited size.
> > > > > > >
> > > > > > > Now let me comment some Ignite/Lucene perspective.
> > > > > > > Actually Ignite queries and Lucene returns *TopDocs.scoresDocs
> > > > *already
> > > > > > > sorted by *score *(relevance). So most relevant documents are on
> > the
> > > > > top.
> > > > > > > And currently distributed queries responses from different nodes
> > are
> > > > > > merged
> > > > > > > into final query cursor queue in arbitrary way.
> > > > > > > So in fact we already have the score order ruined here. Also
> > Ignite
> > > > > > > requests all possible documents from Lucene that is redundant
> > and not
> > > > > > good
> > > > > > > for performance.
> > > > > > >
> > > > > > > I'm implementing *limit* parameter to be part of *TextQuery *and
> > have
> > > > > to
> > > > > > > notice that we still have to add sorting for text queries
> > processing
> > > > in
> > > > > > > order to have applicable results.
> > > > > > >
> > > > > > > *Limit* parameter itself should improve the part of issues from
> > > > above,
> > > > > > but
> > > > > > > definitely, sorting by document score at least  should be
> > implemented
> > > > > > along
> > > > > > > with limit.
> > > > > > >
> > > > > > > This is a pretty short commentary if you still have any
> > questions,
> > > > > please
> > > > > > > ask, do not hesitate)
> > > > > > >
> > > > > > > BR,
> > > > > > > Yuriy Shuliha
> > > > > > >
> > > > > > > On Thu, Sep 19, 2019 at

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Yuriy Shuliga

Ivan,

Yes, your observation is correct.

This behavior has been there from the very beginning, when Lucene indexing was
implemented for distributed queries.
Implementing the *limit* solves the problem of redundant response
size. Without it, *ALL* of the records are fetched each time; that is not
good, especially for loose patterns.
To solve the relevance issue, correct sorting should be implemented.
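
For illustration, the kind of sorting meant here could be a k-way merge on the reducer side. The sketch below is plain Java under stated assumptions, not the code from the PR: `Hit` and `mergeTop` are hypothetical names, each node is assumed to return its hits already sorted by descending score, and scores from different nodes are assumed to be directly comparable (which may not hold exactly, since each node scores against its own index).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical reducer-side merge of per-node text query results; not Ignite code. */
final class ScoreMergeSketch {
    /** One result entry: a cache key plus its relevance score. */
    static final class Hit {
        final String key;
        final float score;

        Hit(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Read position inside one node's response. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit head() {
            return hits.get(pos);
        }
    }

    /**
     * Merges responses that are each sorted by descending score and returns
     * at most {@code limit} hits, also sorted by descending score.
     */
    static List<Hit> mergeTop(List<List<Hit>> perNode, int limit) {
        // Max-heap ordered by the score of each cursor's current head.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Float.compare(b.head().score, a.head().score));

        for (List<Hit> hits : perNode) {
            if (!hits.isEmpty())
                heap.add(new Cursor(hits));
        }

        List<Hit> res = new ArrayList<>();

        while (res.size() < limit && !heap.isEmpty()) {
            Cursor c = heap.poll();

            res.add(c.head());

            // Re-insert the cursor only if the node has more hits to offer.
            if (++c.pos < c.hits.size())
                heap.add(c);
        }

        return res;
    }

    public static void main(String[] args) {
        List<List<Hit>> perNode = Arrays.asList(
            Arrays.asList(new Hit("a", 0.9f), new Hit("b", 0.2f)),
            Arrays.asList(new Hit("c", 0.8f), new Hit("d", 0.7f)));

        for (Hit h : mergeTop(perNode, 3))
            System.out.println(h.key + " " + h.score);
    }
}
```

With such a merge each node only needs to send its own top *limit* hits, so the limit and the sorting complement each other.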

Y.

On Fri, Oct 4, 2019 at 10:45, Ivan Pavlukhin wrote:

> Yuriy,
>
> Am I getting it right that in your PR if we have a limit N than
> returned items (at most N) will not be strictly the most relevant
> ones? E.g. if one node returned N items faster than others but with
> not so good relevance?
>
> On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov wrote:
> >
> > Yuri,
> >
> > I've done with review.
> > No crime found, but trivial compatibility bug.
> >
> > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga  wrote:
> >
> > > Denis,
> > >
> > > Thank you for your attention to this.
> > > as for now, the https://issues.apache.org/jira/browse/IGNITE-12189
> ticket
> > > is still pending review.
> > > Do we have a chance to move it forward somehow?
> > >
> > > BR,
> > > Yuriy Shuliha
> > >
> > > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> > >
> > > > Yuriy,
> > > >
> > > > I've seen you opening a pull-request with the first changes:
> > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > > >
> > > > Alex Scherbakov and Ivan are you the right guys to do the review?
> > > >
> > > > -
> > > > Denis
> > > >
> > > >
> > > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван 
> > > wrote:
> > > >
> > > > > Yuriy,
> > > > >
> > > > > Thank you for providing details! Quite interesting.
> > > > >
> > > > > Yes, we already have support of distributed limit and merging
> sorted
> > > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > > MergeStreamIterator are used for merging sorted streams.
> > > > >
> > > > > Could you please also clarify about score/relevance? Is it
> provided by
> > > > > Lucene engine for each query result? I am thinking how to do sorted
> > > > > merge properly in this case.
> > > > >
> > > > > On Wed, Sep 25, 2019 at 18:56, Yuriy Shuliga wrote:
> > > > > >
> > > > > > Ivan,
> > > > > >
> > > > > > Thank you for interesting question!
> > > > > >
> > > > > > Text searches (or full text searches) are mostly human-oriented.
> And
> > > > the
> > > > > > point of user's interest is topmost part of response.
> > > > > > Then user can read it, evaluate and use the given records for
> further
> > > > > > purposes.
> > > > > >
> > > > > > Particularly in our case, we use Ignite for operations with
> financial
> > > > > data,
> > > > > > and there lots of text stuff like assets names, fin. instruments,
> > > > > companies
> > > > > > etc.
> > > > > > In order to operate with this quickly and reliably, users used to
> > > work
> > > > > with
> > > > > > text search, type-ahead completions, suggestions.
> > > > > >
> > > > > > For this purposes we are indexing particular string data in
> separate
> > > > > caches.
> > > > > >
> > > > > > Sorting capabilities and response size limitations are very
> important
> > > > > > there. As our API have to provide most relevant information in
> view
> > > of
> > > > > > limited size.
> > > > > >
> > > > > > Now let me comment some Ignite/Lucene perspective.
> > > > > > Actually Ignite queries and Lucene returns *TopDocs.scoresDocs
> > > *already
> > > > > > sorted by *score *(relevance). So most relevant documents are on
> the
> > > > top.
> > > > > > And currently distributed queries responses from different nodes
> are
> > > > > merged
> > > > > > into final query cursor queue in arbitrary way.
> > > > > > So in fact we already have the score order ruined here. Also
> Ignite
> > > > > > requests all possible documents from Lucene that is redundant
> and not
> > > > > good
> > > > > > for performance.
> > > > > >
> > > > > > I'm implementing *limit* parameter to be part of *TextQuery *and
> have
> > > > to
> > > > > > notice that we still have to add sorting for text queries
> processing
> > > in
> > > > > > order to have applicable results.
> > > > > >
> > > > > > *Limit* parameter itself should improve the part of issues from
> > > above,
> > > > > but
> > > > > > definitely, sorting by document score at least  should be
> implemented
> > > > > along
> > > > > > with limit.
> > > > > >
> > > > > > This is a pretty short commentary if you still have any
> questions,
> > > > please
> > > > > > ask, do not hesitate)
> > > > > >
> > > > > > BR,
> > > > > > Yuriy Shuliha
> > > > > >
> > > > > > > On Thu, Sep 19, 2019 at 11:38, Павлухин Иван wrote:
> > > > > >
> > > > > > > Yuriy,
> > > > > > >
> > > > > > > Greatly appreciate your interest.
> > > > > > >
> > > > > > > Could you please elaborate a little bit about sorting? What
> tasks
> > > > does
> > > > > > > it help to solve and how? It would be great to provide an
> example.
> > > > > > >
> > > > > > > > On Wed, Sep 18, 2019 at 09:39,

Re: Getting involved in Apache Ignite

2019-10-04 Thread Emmanouil Gkatziouras

Greetings,

I am amazed by Apache Ignite and its features!
For my use case, integrating with Google Cloud Pub/Sub and Amazon SQS would
help me get the most out of it.

Since developing those streamers is something I would do in any case, I
would like to get involved in your project and therefore give back to the
project and make those features available to the community.

I have contributed to projects such as the InfluxDB Java Driver and Alpakka.
Part of my everyday work has to do with implementing solutions in the
cloud, so I can contribute to the streaming solutions that have to do
with cloud providers, particularly GCP Pub/Sub and AWS SQS, as well
as other cloud-based messaging systems such as Azure Storage Queues.
I would also like to propose adding a streamer implementation for cache
invalidation, as I have some use cases in need of it.

You can find me on LinkedIn (link in my signature) and get to know my
background a little more.

Thanks for your great work so far!
Regards,

*Emmanouil Gkatziouras*
https://egkatzioura.com/ | https://www.linkedin.com/in/gkatziourasemmanouil/
https://github.com/gkatzioura

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Ivan Pavlukhin

Yuriy,

Am I getting it right that in your PR, if we have a limit N, then the
returned items (at most N) will not be strictly the most relevant
ones? E.g. if one node returned N items faster than others but with
lower relevance?
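
To illustrate the concern with a toy example (plain Java, hypothetical names, scores only, and not the code from the PR): if the limit is applied to responses in arrival order, a fast node with weak matches can fill all N slots before better matches from slower nodes are seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical comparison of two ways to apply a limit; not Ignite code. */
final class ArrivalOrderLimitSketch {
    /** Takes the first {@code limit} scores in the order responses arrived. */
    static List<Float> firstArrived(List<List<Float>> responses, int limit) {
        List<Float> res = new ArrayList<>();

        for (List<Float> resp : responses) {
            for (Float score : resp) {
                if (res.size() == limit)
                    return res;

                res.add(score);
            }
        }

        return res;
    }

    /** Takes the {@code limit} highest scores across all responses. */
    static List<Float> mostRelevant(List<List<Float>> responses, int limit) {
        List<Float> all = new ArrayList<>();

        for (List<Float> resp : responses)
            all.addAll(resp);

        all.sort(Comparator.reverseOrder());

        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }

    public static void main(String[] args) {
        // The fast node answers first with weak matches, the slow node has the best ones.
        List<List<Float>> responses = Arrays.asList(
            Arrays.asList(0.3f, 0.2f, 0.1f),
            Arrays.asList(0.9f, 0.8f));

        System.out.println("arrival order: " + firstArrived(responses, 3));
        System.out.println("by relevance:  " + mostRelevant(responses, 3));
    }
}
```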

On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov wrote:
>
> Yuri,
>
> I've done with review.
> No crime found, but trivial compatibility bug.
>
> On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga  wrote:
>
> > Denis,
> >
> > Thank you for your attention to this.
> > as for now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket
> > is still pending review.
> > Do we have a chance to move it forward somehow?
> >
> > BR,
> > Yuriy Shuliha
> >
> > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> >
> > > Yuriy,
> > >
> > > I've seen you opening a pull-request with the first changes:
> > > https://issues.apache.org/jira/browse/IGNITE-12189
> > >
> > > Alex Scherbakov and Ivan are you the right guys to do the review?
> > >
> > > -
> > > Denis
> > >
> > >
> > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван 
> > wrote:
> > >
> > > > Yuriy,
> > > >
> > > > Thank you for providing details! Quite interesting.
> > > >
> > > > Yes, we already have support of distributed limit and merging sorted
> > > > subresults for SQL queries. E.g. ReduceIndexSorted and
> > > > MergeStreamIterator are used for merging sorted streams.
> > > >
> > > > Could you please also clarify about score/relevance? Is it provided by
> > > > Lucene engine for each query result? I am thinking how to do sorted
> > > > merge properly in this case.
> > > >
> > > > On Wed, Sep 25, 2019 at 18:56, Yuriy Shuliga wrote:
> > > > >
> > > > > Ivan,
> > > > >
> > > > > Thank you for interesting question!
> > > > >
> > > > > Text searches (or full text searches) are mostly human-oriented. And
> > > the
> > > > > point of user's interest is topmost part of response.
> > > > > Then user can read it, evaluate and use the given records for further
> > > > > purposes.
> > > > >
> > > > > Particularly in our case, we use Ignite for operations with financial
> > > > data,
> > > > > and there lots of text stuff like assets names, fin. instruments,
> > > > companies
> > > > > etc.
> > > > > In order to operate with this quickly and reliably, users used to
> > work
> > > > with
> > > > > text search, type-ahead completions, suggestions.
> > > > >
> > > > > For this purposes we are indexing particular string data in separate
> > > > caches.
> > > > >
> > > > > Sorting capabilities and response size limitations are very important
> > > > > there. As our API have to provide most relevant information in view
> > of
> > > > > limited size.
> > > > >
> > > > > Now let me comment some Ignite/Lucene perspective.
> > > > > Actually Ignite queries and Lucene returns *TopDocs.scoresDocs
> > *already
> > > > > sorted by *score *(relevance). So most relevant documents are on the
> > > top.
> > > > > And currently distributed queries responses from different nodes are
> > > > merged
> > > > > into final query cursor queue in arbitrary way.
> > > > > So in fact we already have the score order ruined here. Also Ignite
> > > > > requests all possible documents from Lucene that is redundant and not
> > > > good
> > > > > for performance.
> > > > >
> > > > > I'm implementing *limit* parameter to be part of *TextQuery *and have
> > > to
> > > > > notice that we still have to add sorting for text queries processing
> > in
> > > > > order to have applicable results.
> > > > >
> > > > > *Limit* parameter itself should improve the part of issues from
> > above,
> > > > but
> > > > > definitely, sorting by document score at least  should be implemented
> > > > along
> > > > > with limit.
> > > > >
> > > > > This is a pretty short commentary if you still have any questions,
> > > please
> > > > > ask, do not hesitate)
> > > > >
> > > > > BR,
> > > > > Yuriy Shuliha
> > > > >
> > > > > On Thu, Sep 19, 2019 at 11:38, Павлухин Иван wrote:
> > > > >
> > > > > > Yuriy,
> > > > > >
> > > > > > Greatly appreciate your interest.
> > > > > >
> > > > > > Could you please elaborate a little bit about sorting? What tasks
> > > does
> > > > > > it help to solve and how? It would be great to provide an example.
> > > > > >
> > > > > > > On Wed, Sep 18, 2019 at 09:39, Alexei Scherbakov <
> > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > Denis,
> > > > > > >
> > > > > > > I like the idea of throwing an exception for enabled text queries
> > > on
> > > > > > > persistent caches.
> > > > > > >
> > > > > > > Also I'm fine with proposed limit for unsorted searches.
> > > > > > >
> > > > > > > Yury, please proceed with ticket creation.
> > > > > > >
> > > > > > > On Tue, Sep 17, 2019 at 22:06, Denis Magda wrote:
> > > > > > >
> > > > > > > > Igniters,
> > > > > > > >
> > > > > > > > I see nothing wrong with Yury's proposal in regards full-text
> > > > search
> > > > > > API
> > > > > > > > evolution as long as Yury is ready to push it forward.
> > > > > > > >
> > > > > > > > As for the in-memory

Re: New SQL execution engine

2019-10-04 Thread Ivan Pavlukhin

Nikolay,

The guys have updated the IEP [1]. Could you please check it? Are any parts
missing that are needed at this stage?

[1] 
https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine

On Tue, Oct 1, 2019 at 12:19, Ivan Pavlukhin wrote:
>
> Folks,
>
> I marked IEP-33 as obsolete. Also now the IEP-37 we currently are
> working with has a pretty URL
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine
>
> On Tue, Oct 1, 2019 at 11:17, Seliverstov Igor wrote:
> >
> > Nikolay,
> >
> > The document you edited is wrong (and outdated).
> >
> > Since the author meant another idea, I decided not to change IEP-35 and
> > create a new one - IEP-37 (https://cwiki.apache.org/confluence/x/NBLABw).
> > It already has a number of key requirements.
> >
> > Regards,
> > Igor
> >
> > On Tue, Oct 1, 2019 at 6:14, Nikolay Izhikov wrote:
> >
> > > Hello, Igniters.
> > >
> > > I extended the IEP [1] with the tickets caused by H2 limitations.
> > >
> > > Please, let's write down requirements for engine in the IEP.
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-33%3A+New+SQL+executor+engine+infrastructure
> > >
> > > > On Mon, 30/09/2019 at 17:20 -0700, Denis Magda wrote:
> > > > Ivan, we need more of these discussions, totally agree with you ;)
> > > >
> > > > I've updated the Motivation paragraph outlining some high-level users we
> > > > see by working with our users. Hope it helps. Let's carry on and let me
> > > > send a note to Apache Calcite community.
> > > >
> > > > -
> > > > Denis
> > > >
> > > >
> > > > On Mon, Sep 30, 2019 at 1:56 AM Ivan Pavlukhin 
> > > wrote:
> > > >
> > > > > Folks,
> > > > >
> > > > > Thanks everyone for a hot discussion! Not every open source community
> > > > > has such open and boiling discussions. It means that people here
> > > > > really do care. And I am proud of it!
> > > > >
> > > > > As I understood, nobody is strictly against the proposed initiative.
> > > > > And I am glad that we can move forward (with some steps back along the
> > > > > way).
> > > > >
> > > > > Fri, Sep 27, 2019 at 19:29, Nikolay Izhikov :
> > > > > >
> > > > > > Hello, Denis.
> > > > > >
> > > > > > Thanks for the clarifications.
> > > > > >
> > > > > > Sounds good to me.
> > > > > > All I am trying to say in this thread:
> > > > > > Guys, please, let's take a step back and write down the requirements
> > > > > > (what we want to get with the SQL engine).
> > > > > > Which features and use cases are primary for us.
> > > > > >
> > > > > > I'm sure you have done it already during your research.
> > > > > >
> > > > > > Please, share it with the community.
> > > > > >
> > > > > > I'm pretty sure we will come back to this document again and again
> > > > > > during migration.
> > > > > > So a well-written design is worth it.
> > > > > >
> > > > > > On Fri, 27/09/2019 at 09:10 -0700, Denis Magda wrote:
> > > > > > > Ignite mates, let me try to move the discussion in a constructive
> > > > > > > way. It looks like we set a wrong context from the very beginning.
> > > > > > >
> > > > > > > Before proposing this idea to the community, some of us were
> > > > > > > discussing/researching the topic in different groups (one needs to
> > > > > > > think it through first before even suggesting to consider changes of
> > > > > > > this magnitude). The day has come to share this idea with the whole
> > > > > > > community and outline the next actions. But (!) nobody is 100% sure
> > > > > > > that that's the right decision. Thus, this will be an *experiment*:
> > > > > > > some of our community members will be developing a *prototype*, and
> > > > > > > only based on the prototype outcomes shall we make a final decision.
> > > > > > > Igor, Roman, Ivan, Andrey, hope that nothing has changed and we're on
> > > > > > > the same page here.
> > > > > > >
> > > > > > > Many technical and architectural reasons that justify this project
> > > > > > > have been shared, but let me throw in my perspective. There is nothing
> > > > > > > wrong with H2; that was the right choice for that time. Thanks to H2
> > > > > > > and Ignite SQL APIs, our project is used across hundreds of
> > > > > > > deployments that are accelerating relational databases or use Ignite
> > > > > > > as a system of record. However, these days many more companies are
> > > > > > > migrating to *distributed* databases that speak SQL. For instance, if
> > > > > > > a couple of years ago 1 out of 10 use cases needed support for
> > > > > > > multi-join queries or queries with subselects or efficient memory
> > > > > > > usage, then today there are 5 out of 10 use cases of this kind; in the

Re: Replacing default work dir from tmp to current dir

2019-10-04 Thread Ivan Pavlukhin

Interesting things about those LINQPad/JPad scenarios. Was not aware
of it. Still some doubts about applicability. It seems to me that JPad
having work dir in "Program Files" has a lot of problems by itself,
e.g. a user is not able to run basic file IO snippets with relative
file paths.
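
To make the fallback idea from the quoted messages below concrete, here is a
minimal sketch, not Ignite's actual code. The class name WorkDirResolver and
the method resolve are hypothetical; the only things taken from the thread are
the `user.home` and `user.dir` system properties and the `ignite/work`
subdirectory. The order of the candidates is left to the caller, since the
thread has not settled which one should come first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Ignite): picks the first usable work directory. */
final class WorkDirResolver {
    /**
     * Tries each base directory in order and returns the first one under which
     * "ignite/work" can be created and is writable.
     */
    static Path resolve(List<Path> bases) {
        for (Path base : bases) {
            Path workDir = base.resolve("ignite").resolve("work");

            try {
                // Creates missing parents; does nothing if the directory already exists.
                Files.createDirectories(workDir);

                if (Files.isWritable(workDir))
                    return workDir;

                System.err.println("Work directory is not writable, trying next: " + workDir);
            }
            catch (IOException | SecurityException e) {
                // Warn and fall back to the next candidate instead of failing immediately.
                System.err.println("Cannot create work directory " + workDir + ": " + e);
            }
        }

        throw new IllegalStateException("No usable work directory among: " + bases);
    }

    public static void main(String[] args) {
        // Assumed order for illustration: user.home first, then user.dir.
        List<Path> bases = Arrays.asList(
            Paths.get(System.getProperty("user.home")),
            Paths.get(System.getProperty("user.dir")));

        System.out.println("Work directory: " + resolve(bases));
    }
}
```

With a single-element list this degenerates to the "one default only" variant;
swapping the two entries gives the "current dir first, home dir as fallback"
variant.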

Thu, Oct 3, 2019 at 23:24, Pavel Tupitsyn :
>
> Ilya, fallback is a good idea.
> Still I'd prefer to have user.home as a default, and fall back to user.dir
> when home does not work for some reason.
>
> On Thu, Oct 3, 2019 at 11:07 PM Ilya Kasnacheev 
> wrote:
>
> > Hello!
> >
> > We can try to fall back to the home dir with a warning when a file cannot be
> > created in the current dir.
> >
> > WDYT?
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > Thu, Oct 3, 2019 at 20:05, Pavel Tupitsyn :
> >
> > > > Cannot tell about NuGet. Maven is typically used during development,
> > > > usually there is no Maven in production deployments.
> > > NuGet and Maven are very similar. Yes, both of them are build-time tools,
> > > production is unrelated.
> > > For production-ready deployments we can expect users to tweak Ignite to
> > > their needs, set custom storage dirs, adjust heap sizes and so on.
> > >
> > > I'm talking about new users, about "getting started" scenarios -
> > > it is super important to make Ignite easy to get started with, provide
> > > reasonable defaults for all the configuration properties.
> > >
> > > For Ignite.NET, LINQPad is one of those "get started in 2 clicks"
> > > scenarios. And this scenario got broken as explained above.
> > > 2.7.5 and earlier used temp dir, which worked. 2.7.6 fails:
> > > "Work directory does not exist and cannot be created:
> > > C:\Program Files\LINQPad5\ignite\work"
> > >
> > > For Java there is JPad, which will fail in the same way - when you run
> > > code from there, `user.dir` points to Program Files.
> > >
> > > I expect that there are more use cases like this, and `user.home` is a
> > > reasonable solution.
> > >
> > > On Thu, Oct 3, 2019 at 5:56 PM Ilya Kasnacheev <
> > ilya.kasnach...@gmail.com>
> > > wrote:
> > >
> > > > Hello!
> > > >
> > > > I want to point out that I didn't change this location (current dir).
> > > > It was already implemented when I raised this issue; the only change I
> > > > made was to swap current dir/work to current dir/ignite/work to avoid
> > > > confusion about whose work dir that is.
> > > >
> > > > I also communicated this to you all in ML when I discovered that
> > > > current dir is used.
> > > >
> > > > I think that current dir is actually *well defined* when starting a
> > > > project. A project is expected to be started from the same dir, and all
> > > > "Run..." dialogs usually allow specifying that one.
> > > >
> > > > For embedded scenarios, you definitely do not want work dirs from two
> > > > different Ignite-using tools to interfere. For embedded scenarios, you
> > > > should only expect that current dir is writable.
> > > >
> > > > Even after these considerations, it's too late to change that because
> > > > people don't expect this dir to move with every release of Ignite, and
> > > > we already did it once.
> > > >
> > > > Regards,
> > > > --
> > > > Ilya Kasnacheev
> > > >
> > > >
> > > > Thu, Oct 3, 2019 at 17:34, Alexey Goncharuk <alexey.goncha...@gmail.com>:
> > > >
> > > > > >
> > > > > > Seems, we should have different defaults and even distributions for
> > > > > > different usage scenarios.
> > > > > >
> > > > > I still do not understand why defaults should be different for
> > > > > embedded and "traditional RDBMS-like" installations. Having different
> > > > > defaults will likely confuse users, not make usability easier.
> > > > > Personally, I would forbid to start Ignite if IGNITE_HOME is not set,
> > > > > but this suggestion was not accepted by the community.
> > > > >
> > > > > > As far as I know, both rocksdb and SQLite are local-only libraries and
> > > > > > don't have any distributed features.
> > > > >
> > > > > See no difference here. Imagine a user starts only one Ignite node
> > > > > for development or just to play (which, I believe, happens quite a
> > > > > lot) - same as with local databases. BTW, it is impossible to start
> > > > > SQLite without database path, so a user either provides a full path,
> > > > > or a relative path from the current directory - which is an explicit
> > > > > action from a user.
> > > > >
> > > > >
> > > > > > I agree with you.
> > > > > > How did it happen that after a wide discussion we implemented,
> > > > > > reviewed, and merged the wrong defaults?
> > > > > >
> > > > > > As far as I know, we had an explicit release only to change this default.
> > > > > >
> > > > > > This release is broken, isn't it?
> > > > > >
> > > > > I think this is just a miscommunication. Ilya made a fix which was
> > > > > exactly what he meant it to be. As for the release - it may have worse
> > > > > usability,
> > > >
