RE: Query 3x slower with index

2018-10-12 Thread Stanislav Lukyanov
Yes, sure. From: Dave Harvey Sent: 11 октября 2018 г. 23:59 To: user@ignite.apache.org Subject: Re: Query 3x slower with index "Ignite will only use one index per table" I assume you mean "Ignite will only use one index per table per query"? On Thu, Oct 11, 2018 at 1:55 P

Re: Query 3x slower with index

2018-10-12 Thread wt
having worked with databases for 20 years i can see your indexes are not fully scoped. i see this as your issue having SUM(product_views_app) > 2 OR SUM(product_clicks_app) > 1 add those columns to the composite index so that it doesn't need to access the underlying table and can just use the

Re: Query 3x slower with index

2018-10-11 Thread Dave Harvey
y_id), in that order, with no columns in between. > > Note that index (customer_id, dt, category_id) can’t be used instead of it. > > On the other hand, (customer_id, category_id, dt) can - the last part of > the index will be left unused. > > > > Thanks, > > Stan > > &

Re: Query 3x slower with index

2018-10-11 Thread eugene miretsky
can’t be used instead of it. > > On the other hand, (customer_id, category_id, dt) can - the last part of > the index will be left unused. > > > > Thanks, > > Stan > > > > *From: *eugene miretsky > *Sent: *9 октября 2018 г. 19:40 > *To: *user@ignite.apach

RE: Query 3x slower with index

2018-10-11 Thread Stanislav Lukyanov
@ignite.apache.org Subject: Re: Query 3x slower with index Hi Ilya,  I have tried it, and got the same performance as forcing using category index in my initial benchmark - query is 3x slowers and uses only one thread.  From my experiments so far it seems like Ignite can either (a) use affinity key and run

Re: Query 3x slower with index

2018-10-09 Thread eugene miretsky
Hi Ilya, I have tried it, and got the same performance as forcing using category index in my initial benchmark - query is 3x slowers and uses only one thread. >From my experiments so far it seems like Ignite can either (a) use affinity key and run queries in parallel, (b) use index but run the

Re: Query 3x slower with index

2018-09-24 Thread Ilya Kasnacheev
Hello! I guess that using AFFINITY_KEY as index have something to do with the fact that GROUP BY really wants to work per-partition. I have the following query for you: 1: jdbc:ignite:thin://localhost> explain Select count(*) FROM( Select customer_id from (Select customer_id, product_views_app,

Re: Query 3x slower with index

2018-09-24 Thread eugene miretsky
An easy way to reproduce would be to 1. Create table CREATE TABLE GA_DATA ( customer_id bigint, dt timestamp, category_id int, product_views_app int, product_clict_app int, product_clict_web int, product_clict_web int, PRIMARY KEY (customer_id, dt, category_id) )

Re: Query 3x slower with index

2018-09-21 Thread Ilya Kasnacheev
Hello! Can you share a reproducer project which loads (or generates) data for caches and then queries them? I could try and debug it if I had the reproducer. Regards. -- Ilya Kasnacheev чт, 20 сент. 2018 г. в 21:05, eugene miretsky : > Thanks Ilya, > > Tried it, no luck. It performs the same

Re: Query 3x slower with index

2018-09-20 Thread eugene miretsky
Thanks Ilya, Tried it, no luck. It performs the same as when using category_id index alone (slow). Any combindation I try either uses AFFINITY_KEY or category index. When it uses category index it runs slowers. Also, when AFFINITY_KEY key is used, the jobs runs on 32 threads (my query

Re: Query 3x slower with index

2018-09-20 Thread Ilya Kasnacheev
Hello! > 2) ga_customer_and_category_id: on customer_id and category_id Have you tried to do an index on category_id first, customer_id second? Note that Ignite will use only one index when joining two tables and that in your case it should start with category_id. You can also try adding

Re: Query 3x slower with index

2018-09-19 Thread eugene miretsky
Hi Ilya, I created 4 indexs on the table: 1) ga_pKey: on customer_id, dt, category_id (that's our primary key columns) 2) ga_customer_and_category_id: on customer_id and category_id 2) ga_customer_id: on customer_id 4) ga_category_id: on category_id For the first query (category in ()), the

Re: Query 3x slower with index

2018-09-18 Thread Ilya Kasnacheev
Hello! I can see you try to use _key_PK as index. If your primary key is composite, it won't work properly for you. I recommend creating an explicit (category_id, customer_id) index. Regards, -- Ilya Kasnacheev вт, 18 сент. 2018 г. в 17:47, eugene miretsky : > Hi Ilya, > > The different

Re: Query 3x slower with index

2018-09-18 Thread eugene miretsky
Hi Ilya, The different query result was my mistake - one of the categoy_ids was duplicate, so in the query that used join, it counted rows for that category twice. My apologies. However, we are still having an issue with query time, and the index not being applied to category_id. Would

Re: Query 3x slower with index

2018-09-17 Thread Ilya Kasnacheev
Hello! Why don't you diff the results of those two queries, tell us what the difference is? Regards, -- Ilya Kasnacheev пн, 17 сент. 2018 г. в 16:08, eugene miretsky : > Hello, > > Just wanted to see if anybody had time to look into this. > > Cheers, > Eugene > > On Wed, Sep 12, 2018 at 6:29

Re: Query 3x slower with index

2018-09-17 Thread eugene miretsky
Hello, Just wanted to see if anybody had time to look into this. Cheers, Eugene On Wed, Sep 12, 2018 at 6:29 PM eugene miretsky wrote: > Thanks! > > Tried joining with an inlined table instead of IN as per the second > suggestion, and it didn't quite work. > > Query1: > >- Select COUNT(*)

Re: Query 3x slower with index

2018-09-12 Thread eugene miretsky
Thanks! Tried joining with an inlined table instead of IN as per the second suggestion, and it didn't quite work. Query1: - Select COUNT(*) FROM( Select customer_id from GATABLE3 use Index( ) where category_id in (9005, 175930, 175930, 175940,175945,101450, 6453) group by customer_id

Re: Query 3x slower with index

2018-09-05 Thread Ilya Kasnacheev
Hello! I don't think that we're able to use index with IN () clauses. Please convert it into OR clauses. Please see https://apacheignite-sql.readme.io/docs/performance-and-debugging#section-sql-performance-and-usability-considerations Regards, -- Ilya Kasnacheev пн, 3 сент. 2018 г. в 12:46,

Re: Query 3x slower with index

2018-09-03 Thread Andrey Mashenkov
Hi Actually, first query uses index on affinity key which looks more efficient than index on category_id column. The first query can process groups one by one and stream partial results from map phase to reduce phase as it use sorted index lookup, while second query should process full dataset on

Query 3x slower with index

2018-09-01 Thread eugene miretsky
Hello, Schema: - PUBLIC.GATABLE2.CUSTOMER_ID PUBLIC.GATABLE2.DT PUBLIC.GATABLE2.CATEGORY_ID PUBLIC.GATABLE2.VERTICAL_ID PUBLIC.GATABLE2.SERVICE PUBLIC.GATABLE2.PRODUCT_VIEWS_APP PUBLIC.GATABLE2.PRODUCT_CLICKS_APP PUBLIC.GATABLE2.PRODUCT_VIEWS_WEB