Cool! I'll take a look today. Sergi
2016-11-30 18:23 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com>:

> Serj, you can see a PR attached to the JIRA issue [1], which can be opened
> with Upsource [2].
>
> Thanks, I remember about distributed queries and will rework them right
> after we come to an agreement that the solution for simple queries is OK.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4106
> [2] http://reviews.ignite.apache.org/ignite/review/IGNT-CR-15
>
> On Wed, Nov 30, 2016 at 5:34 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > A per-cache SQL parallelism level looks reasonable to me here.
> >
> > I'm not sure what you mean by "prepared statement cache is useless with
> > split indices"; most probably you are parallelizing queries in some wrong
> > way if this is true.
> >
> > Also, do not forget about distributed joins: with parallel queries on the
> > same node we will need to make index range requests not only to remote
> > nodes, but also to query contexts in parallel threads on the same local
> > node.
> >
> > Sergi
> >
> > 2016-11-30 17:23 GMT+03:00 Andrey Mashenkov <andrey.mashen...@gmail.com>:
> >
> > > It looks like we can't just split an SQL query across several threads
> > > due to H2 limitations.
> > > We can bind a query thread to a certain set of partitions, but H2 will
> > > actually read the whole index and only then filter entries by
> > > partition. So we can't get a significant speed-up that way.
> > >
> > > Unfortunately, H2 does not support sharding, so we need a workaround.
> > > We can try to split indices so that each query thread is bound to its
> > > own index part.
> > > I've implemented such a prototype and got a significant speed-up on a
> > > single-node grid, as if it were a grid of several nodes.
> > > Since H2 knows nothing about split indices, we must make sure that
> > > every query is run as a TwoStepQuery and utilizes all table index
> > > parts.
> > >
> > > As on-demand index creation is a very heavy operation, an index should
> > > be split when it is created. So we can set the parallelism level on a
> > > per-cache basis, but not per query.
> > >
> > > Another issue I've faced is that our implementation of the prepared
> > > statement cache is useless with split indices. A prepared statement is
> > > cached in a thread-local variable, and it seems that the statement is
> > > bound to a certain index part. So if we reuse the same statement for
> > > different index parts, we will get unexpected results.
> > >
> > > On Sun, Oct 30, 2016 at 8:46 PM, Dmitriy Setrakyan
> > > <dsetrak...@apache.org> wrote:
> > >
> > > > Completely agree, great point!
> > > >
> > > > On Sun, Oct 30, 2016 at 9:17 AM, Sergi Vladykin
> > > > <sergi.vlady...@gmail.com> wrote:
> > > >
> > > > > I think it must be a maximum local parallelism level rather than
> > > > > just an `on`/`off` setting (the default is obviously 1). This,
> > > > > along with a separately configurable query thread pool, will give
> > > > > finer-grained control over resources.
> > > > >
> > > > > Sergi
> > > > >
> > > > > 2016-10-30 18:22 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > > > >
> > > > > > I already mentioned this in another email, but we should be able
> > > > > > to turn this property on and off at the per-query and per-cache
> > > > > > levels.
> > > > > >
> > > > > > On Sat, Oct 29, 2016 at 11:45 AM, Sergi Vladykin
> > > > > > <sergi.vlady...@gmail.com> wrote:
> > > > > >
> > > > > > > Agree, let's implement such a parallelization.
> > > > > > >
> > > > > > > I think we will need an explicit setting for SqlQuery and
> > > > > > > SqlFieldsQuery; the default behavior should not change.
> > > > > > >
> > > > > > > Sergi
> > > > > > >
> > > > > > > 2016-10-28 22:39 GMT+03:00 Andrey Mashenkov <amashen...@gridgain.com>:
> > > > > > >
> > > > > > > > Currently, every SQL query runs in a single thread on each
> > > > > > > > node. This can be an issue for heavy queries or queries
> > > > > > > > running on big data sets, e.g. analytical queries.
> > > > > > > >
> > > > > > > > For now, the only way to speed up such queries is to add more
> > > > > > > > nodes to the grid running on the same server. In this case,
> > > > > > > > the data will be partitioned over all these nodes, and the
> > > > > > > > query will be split and run on all of them.
> > > > > > > >
> > > > > > > > It seems we can benefit if we split SQL queries locally, as
> > > > > > > > we do across nodes with TwoStepQuery.
> > > > > > > >
> > > > > > > > Thoughts?
> > >
> > > --
> > > Best regards,
> > > Andrey V. Mashenkov
> > > Cerr: +7-921-932-61-82
>
> --
> Best regards,
> Andrey V. Mashenkov
> Cerr: +7-921-932-61-82
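
For readers following the thread, the split-index idea discussed in the
quoted messages might be sketched roughly as below. This is only a minimal
illustration using the plain JDK, not Ignite or H2 code. The class name
SegmentedIndexSketch, its methods, and the key-modulo segment mapping are all
hypothetical. It assumes the index is split into a fixed number of segments
at creation time (the per-cache parallelism level), that each segment is
scanned by its own thread, and that the partial results are then merged,
loosely mirroring the map and reduce steps of a two-step query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

/** Hypothetical sketch of a locally split index; not Ignite or H2 code. */
class SegmentedIndexSketch {
    /** One sorted map per segment; the split is fixed at creation time. */
    private final List<ConcurrentSkipListMap<Integer, String>> segments = new ArrayList<>();

    /** Assumed to play the role of a per-cache parallelism level. */
    private final int parallelism;

    SegmentedIndexSketch(int parallelism) {
        this.parallelism = parallelism;
        for (int i = 0; i < parallelism; i++)
            segments.add(new ConcurrentSkipListMap<>());
    }

    /** Routes a key to exactly one segment (assumed mapping: key modulo segment count). */
    void put(int key, String value) {
        segments.get(Math.floorMod(key, parallelism)).put(key, value);
    }

    /** "Map" step: scan every segment in its own task. "Reduce" step: merge the partial results. */
    List<String> query(Predicate<String> filter) {
        // A pool per query keeps the sketch short; a real system would share a query pool.
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<CompletableFuture<List<String>>> parts = new ArrayList<>();
            for (ConcurrentSkipListMap<Integer, String> seg : segments)
                parts.add(CompletableFuture.supplyAsync(() -> scan(seg, filter), pool));

            List<String> result = new ArrayList<>();
            for (CompletableFuture<List<String>> part : parts)
                result.addAll(part.join());
            return result;
        } finally {
            pool.shutdown();
        }
    }

    /** Scans a single segment only, so no thread reads the whole index. */
    private static List<String> scan(ConcurrentSkipListMap<Integer, String> seg, Predicate<String> filter) {
        List<String> out = new ArrayList<>();
        for (String v : seg.values())
            if (filter.test(v))
                out.add(v);
        return out;
    }
}
```

Because every query has to visit all segments, anything cached per thread
(such as a prepared statement) would presumably also need to be keyed by
segment in a design like this, which is in line with the prepared statement
cache concern raised in the thread.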