Re: Using Hints in Phoenix

Maryann Xue Thu, 12 Mar 2015 15:02:31 -0700

Hi Matt,

Thanks for sharing the query. Using that hint should supposedly force
sort-merge join no matter what. I will go ahead and verify that.



Thanks,
Maryann

On Thu, Mar 12, 2015 at 2:25 PM, Matthew Johnson <matt.john...@algomi.com>
wrote:

> Hi Maryann,
>
>
>
> My views:
>
>
>
> *create view "mytestview1" *
>
> *(*
>
> *"rowid" varchar primary key, *
>
> *"metadata"."bId" varchar, *
>
> *"metadata"."dId" varchar*
>
> *) *
>
>
>
> *create view "mytestview2" *
>
> *(*
>
> *"rowid" varchar primary key, *
>
> *"data"."bId" varchar,*
>
> *"data"."details" varchar,*
>
> *"data"."comment" varchar*
>
> *) *
>
>
>
> The amount of data:
>
>
>
> *SELECT count(*) FROM "mytestview1";*
>
> *  -- 78,549*
>
>
>
> *SELECT count(*) FROM "mytestview2";*
>
> *  -- 2,130,905*
>
>
>
> The query:
>
>
>
> Without hint:
>
>
>
> *EXPLAIN select count(*) *
>
> *from "mytestview1" m1*
>
> *inner join "mytestview2" m2*
>
> *on m1."bId" = m2."bId"*
>
>
>
> *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1*
>
> *    SERVER AGGREGATE INTO SINGLE ROW*
>
> *    PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)*
>
> *        CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2*
>
>
>
> With hint:
>
>
>
> *EXPLAIN select /*+ USE_SORT_MERGE_JOIN */ count(*) *
>
> *from "mytestview1" m1*
>
> *inner join "mytestview2" m2*
>
> *on m1."bId" = m2."bId"*
>
>
>
> *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1*
>
> *    SERVER AGGREGATE INTO SINGLE ROW*
>
> *    PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)*
>
> *        CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2*
>
>
>
>
>
> I know that when joining that I should ideally do <large table> join
> <small table>, but this is on my test environment and in production both
> tables are roughly the same size, so I’m trying to force it to use the sort
> merge before running the query in prod. My region servers in test don’t
> have that much heap space (about 2 gigs) if that makes a difference. Do I
> need to force a major compaction, generate statistics, anything like that?
>
>
>
> Thanks!
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com]
> *Sent:* 11 March 2015 20:16
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Views or tables should not matter in this. Would you mind sharing your
> query and DDLs?
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
>
>
> On Wed, Mar 11, 2015 at 6:06 AM, Matthew Johnson <matt.john...@algomi.com>
> wrote:
>
> Hi Maryann,
>
>
>
> I am using 4.3.0 (I upgraded as soon as it was released, largely so I
> could pick up this feature). I am actually joining views rather than tables
> – would this make a difference?
>
>
>
> Cheers,
>
> Matt
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com]
> *Sent:* 10 March 2015 20:54
>
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Which version of Phoenix are you using? Sort-merge join is only available
> in Phoenix 4.3.
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Tue, Mar 10, 2015 at 6:11 AM, Matthew Johnson <matt.john...@algomi.com>
> wrote:
>
> Hi Maryann,
>
>
>
> Thanks for clarifying that for me. I’ve been playing with the hint
> *USE_SORT_MERGE_JOIN* to try and solve an issue when joining two very
> large tables:
>
>
>
> *Error: Encountered exception in sub plan [0] execution.*
>
> *SQLState:  null*
>
> *ErrorCode: 0*
>
>
>
> Which I believe is related to a lack of memory for building the hash
> table? I thought that using *SELECT /*+ USE_SORT_MERGE_JOIN*/* would make
> the join much slower but would work on an unlimited data set – did I
> misunderstand? The explain plan does not change with or without this hint:
>
>
>
> CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable1
>
>     SERVER AGGREGATE INTO SINGLE ROW
>
>     PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)
>
>         CLIENT 15-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable2
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com]
> *Sent:* 09 March 2015 15:00
> *To:* user@phoenix.apache.org
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> So far in Phoenix, hints are only supported as specified right after
> keywords SELECT, UPSERT and DELETE. Same for join queries. It is currently
> impossible to hint a certain join algorithm for a specific join node in a
> multiple join query. However, for subqueries, the inner query can have its
> own hints, independent of the outer query, like "SELECT /*+ INDEX(t idx1)*/
> col1, col2 FROM t WHERE col3 IN (SELECT /*+ NO_INDEX*/ id FROM r WHERE name
> = 'x')".
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Mon, Mar 9, 2015 at 7:26 AM, Matthew Johnson <matt.john...@algomi.com>
> wrote:
>
> Hi guys,
>
>
>
> This is more of a general question than a problem – but I’m just wondering
> if someone can clarify for me what the syntax rules are for hints in
> Phoenix. Does it matter where in the query they go? Do they always go
> something like *SELECT <insert hint> x from y*? Or, if the hint is for a
> join (eg Sort Merge) does it go in the join part (*SELECT x from y inner
> join <insert hint> z on j = k*)?
>
>
>
> Couldn’t seem to find anything specific on this in the docs, and haven’t
> worked much with database hints in general so maybe there is a convention
> that I am not aware of – apologies if it’s a stupid question!
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
>
>
>
>

Re: Using Hints in Phoenix

Reply via email to