Hi Maryann,
My views: *create view "mytestview1" * *(* *"rowid" varchar primary key, * *"metadata"."bId" varchar, * *"metadata"."dId" varchar* *) * *create view "mytestview2" * *(* *"rowid" varchar primary key, * *"data"."bId" varchar,* *"data"."details" varchar,* *"data"."comment" varchar* *) * The amount of data: *SELECT count(*) FROM "mytestview1";* * -- 78,549* *SELECT count(*) FROM "mytestview2";* * -- 2,130,905* The query: Without hint: *EXPLAIN select count(*) * *from "mytestview1" m1* *inner join "mytestview2" m2* *on m1."bId" = m2."bId"* *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1* * SERVER AGGREGATE INTO SINGLE ROW* * PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)* * CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2* With hint: *EXPLAIN select /*+ USE_SORT_MERGE_JOIN */ count(*) * *from "mytestview1" m1* *inner join "mytestview2" m2* *on m1."bId" = m2."bId"* *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1* * SERVER AGGREGATE INTO SINGLE ROW* * PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)* * CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2* I know that when joining that I should ideally do <large table> join <small table>, but this is on my test environment and in production both tables are roughly the same size, so I’m trying to force it to use the sort merge before running the query in prod. My region servers in test don’t have that much heap space (about 2 gigs) if that makes a difference. Do I need to force a major compaction, generate statistics, anything like that? Thanks! Matt *From:* Maryann Xue [mailto:maryann....@gmail.com] *Sent:* 11 March 2015 20:16 *To:* user@phoenix.apache.org *Subject:* Re: Using Hints in Phoenix Hi Matt, Views or tables should not matter in this. Would you mind sharing your query and DDLs? Thanks, Maryann On Wed, Mar 11, 2015 at 6:06 AM, Matthew Johnson <matt.john...@algomi.com> wrote: Hi Maryann, I am using 4.3.0 (I upgraded as soon as it was released, largely so I could pick up this feature). I am actually joining views rather than tables – would this make a difference? Cheers, Matt *From:* Maryann Xue [mailto:maryann....@gmail.com] *Sent:* 10 March 2015 20:54 *To:* user@phoenix.apache.org *Subject:* Re: Using Hints in Phoenix Hi Matt, Which version of Phoenix are you using? Sort-merge join is only available in Phoenix 4.3. Thanks, Maryann On Tue, Mar 10, 2015 at 6:11 AM, Matthew Johnson <matt.john...@algomi.com> wrote: Hi Maryann, Thanks for clarifying that for me. I’ve been playing with the hint *USE_SORT_MERGE_JOIN* to try and solve an issue when joining two very large tables: *Error: Encountered exception in sub plan [0] execution.* *SQLState: null* *ErrorCode: 0* Which I believe is related to a lack of memory for building the hash table? I thought that using *SELECT /*+ USE_SORT_MERGE_JOIN*/* would make the join much slower but would work on an unlimited data set – did I misunderstand? The explain plan does not change with or without this hint: CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable1 SERVER AGGREGATE INTO SINGLE ROW PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE) CLIENT 15-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable2 Cheers, Matt *From:* Maryann Xue [mailto:maryann....@gmail.com] *Sent:* 09 March 2015 15:00 *To:* user@phoenix.apache.org *Subject:* Re: Using Hints in Phoenix Hi Matt, So far in Phoenix, hints are only supported as specified right after keywords SELECT, UPSERT and DELETE. Same for join queries. It is currently impossible to hint a certain join algorithm for a specific join node in a multiple join query. However, for subqueries, the inner query can have its own hints, independent of the outer query, like "SELECT /*+ INDEX(t idx1)*/ col1, col2 FROM t WHERE col3 IN (SELECT /*+ NO_INDEX*/ id FROM r WHERE name = 'x')". Thanks, Maryann On Mon, Mar 9, 2015 at 7:26 AM, Matthew Johnson <matt.john...@algomi.com> wrote: Hi guys, This is more of a general question than a problem – but I’m just wondering if someone can clarify for me what the syntax rules are for hints in Phoenix. Does it matter where in the query they go? Do they always go something like *SELECT <insert hint> x from y*? Or, if the hint is for a join (eg Sort Merge) does it go in the join part (*SELECT x from y inner join <insert hint> z on j = k*)? Couldn’t seem to find anything specific on this in the docs, and haven’t worked much with database hints in general so maybe there is a convention that I am not aware of – apologies if it’s a stupid question! Cheers, Matt