Re: Using Hints in Phoenix

Samarth Jain Mon, 23 Mar 2015 09:20:14 -0700

Hi Matt,

Any chance your hint is going to the next line when using squirrel? Just to
be sure, make sure you do this:


Navigate to File --> New Session Properties --> Tab SQL and uncheck the
"Remove multi line comment (/.../) from SQL before sending it to database"
so that hints you include in queries are not stripped.

- Samarth

On Monday, March 23, 2015, Matthew Johnson <matt.john...@algomi.com> wrote:

> Hi Maryann,
>
>
>
> My apologies, my Java test app was indeed still referencing 4.2.2 (stupid
> copy and paste error) which I have now fixed. Squirrel is very definitely
> on 4.3.0 though. Interestingly, here are my results:
>
>
>
> JAVA TEST APP – INNER JOIN
>
>
>
> SORT-MERGE-JOIN (INNER) TABLES
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1
>
> AND
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2
>
>         SERVER SORTED BY [T2.ROWID]
>
>     CLIENT MERGE SORT
>
>
>
> JAVA TEST APP – FULL OUTER JOIN
>
>
>
> SORT-MERGE-JOIN (FULL) TABLES
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1
>
>         SERVER SORTED BY [T1.ROWID]
>
>     CLIENT MERGE SORT
>
> AND
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2
>
>         SERVER SORTED BY [T2.ROWID]
>
>     CLIENT MERGE SORT
>
>
>
> SQUIRREL SQL – INNER JOIN
>
>
>
> CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1
>
>     PARALLEL INNER-JOIN TABLE 0
>
>         CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2
>
>     DYNAMIC SERVER FILTER BY ROWID IN (T2.ROWID)
>
>
>
> SQUIRREL SQL – FULL OUTER JOIN
>
>
>
> SORT-MERGE-JOIN (FULL) TABLES
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1
>
>         SERVER SORTED BY [T1.ROWID]
>
>     CLIENT MERGE SORT
>
> AND
>
>     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2
>
>         SERVER SORTED BY [T2.ROWID]
>
>     CLIENT MERGE SORT
>
>
>
>
>
> So it seems that Squirrel has an issue processing the hint when doing an
> inner join, even though it works fine from a basic Java prepared statement.
> I’m guessing the workaround for this is to use full outer join whenever I
> want Sort Merge in Squirrel, and put a ‘where t2.rowid is not null’ in the
> where clause to simulate an inner join?
>
>
>
> Thank you so much for all your help thus far – really appreciate it!
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 19 March 2015 16:38
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Ah, there we go. I think the your Squirrel and Java program somehow
> referred to the old client JAR and that's why.
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Thu, Mar 19, 2015 at 12:28 PM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Thanks Maryann for looking into it. I tried to run the following:
>
>
>
> *// Execute the SQL statement and get the results in a Resultset*
>
> *       ResultSet rsLimited = stmtLimited.executeQuery(**"explain SELECT
> /*+ USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 full outer join testtable2
> t2 on t2.rowid = t1.rowid"**);*
>
>        *while**(rsLimited.next()) {*
>
> *              String plan = rsLimited.getString(**"PLAN"**);*
>
> *              System.**out**.println(plan);*
>
> *       }*
>
>
>
> But got this error:
>
>
>
> Executing statement
>
> *java.sql.SQLFeatureNotSupportedException*: Full joins not supported.
>
>       at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(
> *QueryCompiler.java:248*)
>
>       at org.apache.phoenix.compile.QueryCompiler.compile(
> *QueryCompiler.java:138*)
>
>       at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(
> *PhoenixStatement.java:322*)
>
>       at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(
> *PhoenixStatement.java:305*)
>
>       at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableExplainStatement.compilePlan(
> *PhoenixStatement.java:380*)
>
>       at
> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableExplainStatement.compilePlan(
> *PhoenixStatement.java:360*)
>
>       at org.apache.phoenix.jdbc.PhoenixStatement$1.call(
> *PhoenixStatement.java:221*)
>
>       at org.apache.phoenix.jdbc.PhoenixStatement$1.call(
> *PhoenixStatement.java:217*)
>
>       at org.apache.phoenix.call.CallRunner.run(*CallRunner.java:53*)
>
>       at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(
> *PhoenixStatement.java:216*)
>
>       at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(
> *PhoenixStatement.java:1030*)
>
>       at com.algomi.phoenix.PhoenixExplainPlan.main(
> *PhoenixExplainPlan.java:24*)
>
>
>
>
>
> Although this, too, seems to work from sqlline:
>
>
>
> *0: jdbc:phoenix:sales1,sales2,sales3> explain SELECT * FROM testtable1 t1
> full outer join testtable2 t2 on t2.rowid = t1.rowid;*
>
> *+------------------------------------------+*
>
> *|                   PLAN                   |*
>
> *+------------------------------------------+*
>
> *| SORT-MERGE-JOIN (FULL) TABLES            |*
>
> *|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 |*
>
> *|         SERVER SORTED BY [T1.ROWID]      |*
>
> *|     CLIENT MERGE SORT                    |*
>
> *| AND                                      |*
>
> *|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 |*
>
> *|         SERVER SORTED BY [T2.ROWID]      |*
>
> *|     CLIENT MERGE SORT                    |*
>
> *+------------------------------------------+*
>
> *8 rows selected (0.032 seconds)*
>
>
>
> Does Sqlline use the same phoenix-client jar to make the connection? Or
> does it work differently?
>
>
>
> Thanks!
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 19 March 2015 16:08
>
>
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> By the way, to answer your previous questions, Phoenix joins have not
> started to use stats so far, but the hints are parsed and handled in a
> universal way regardless of what type of query it is.
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Thu, Mar 19, 2015 at 12:03 PM, Maryann Xue <maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>> wrote:
>
> Hi Matt,
>
>
>
> Thank you very much for the investigation! Actually I tried with Java code
> in a unit test (for I don't have an standalone HBase environment right
> now), but could not reproduce it anyway. Then I thought it could be using
> "createStatement()" vs "prepareStatement()" after seeing your
> investigation, but again that was not it.
>
>
>
> But anyway, it just occurred to me that there might be an easy way to
> force running sort-merge-join (just for the purpose of testing), by running
> a query with FULL OUTER JOIN. Could you please try that out and see what
> happens?
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
>
>
> On Thu, Mar 19, 2015 at 7:54 AM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi Maryann,
>
>
>
> Sending this to the whole user group now I have a bit more investigation.
> Managed to do some digging and also figured out how to use sqlline.py, and
> got the following results.
>
>
>
> With sqlline.py:
>
>
>
> *0: jdbc:phoenix:sales1,sales2,sales3> explain SELECT /*+
> USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 join testtable2 t2 on t2.rowid
> = t1.rowid;*
>
>
>
> *+------------------------------------------+*
>
> *|                   PLAN                   |*
>
> *+------------------------------------------+*
>
> *| SORT-MERGE-JOIN (INNER) TABLES           |*
>
> *|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 |*
>
> *| AND                                      |*
>
> *|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 |*
>
> *|         SERVER SORTED BY [T2.ROWID]      |*
>
> *|     CLIENT MERGE SORT                    |*
>
> *+------------------------------------------+*
>
> *6 rows selected (0.035 seconds)*
>
>
>
>
>
> So it does seem to work from sqlline!! Then, I decided to try and use it
> from inside a Java class:
>
>
>
> Class.*forName*("org.apache.phoenix.jdbc.PhoenixDriver");
>
>
>
>                      Connection conn = DriverManager.*getConnection*(
> "jdbc:phoenix:sales1,sales2,sales3", "", "");
>
>
>
>                      // Create a Statement class to execute the SQL
> statement
>
>                      Statement stmtLimited = conn.createStatement();
>
>
>
>                      System.*out*.println("Executing statement");
>
>
>
>                      // Execute the SQL statement and get the results in
> a *Resultset*
>
>                      ResultSet rsLimited = stmtLimited.executeQuery("explain
> SELECT /*+ USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 join testtable2 t2
> on t2.rowid = t1.rowid");
>
>                      *while*(rsLimited.next()) {
>
>                            String plan = rsLimited.getString("PLAN");
>
>                            System.*out*.println(plan);
>
>                      }
>
>
>
>                      stmtLimited.close();
>
>                      conn.close();
>
>
>
> And I get:
>
>
>
> Executing statement
>
> CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1
>
>     PARALLEL INNER-JOIN TABLE 0
>
>         CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2
>
>     DYNAMIC SERVER FILTER BY ROWID BETWEEN MIN/MAX OF (T2.ROWID)
>
>
>
>
>
> So it seems that when executed from Java code (Squirrel is also a
> Java-based client) it ignores the hint! What do you think? Is there a way
> around this?
>
>
>
> Thanks!
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 17 March 2015 16:23
>
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Sorry that I still could not reproduce the issue. Could you try
> reproducing it in a unit test?
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Fri, Mar 13, 2015 at 6:20 AM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi Maryann,
>
>
>
> All hbases in my cluster have:
>
>
>
> *phoenix-4.3.0-server.jar*
>
>
>
> in the lib folder, and my client is using:
>
>
>
>                 *phoenix-4.3.0-client.jar*
>
>
>
> I generated the MD5 in case something has gone wrong with the versioning:
>
>
>
> *e0ade979e7b444fb8f8f4b7b5578edab*
>
>
>
> And I have opened up the jar, and can see the new class
> *SortMergeJoinPlan.class*, so presumably I have the right version – is
> there anything else I can check?
>
>
>
> Cheers,
>
> Matt
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 12 March 2015 23:02
>
>
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> I checked on my side. Sort-merge-join would work for both tables and
> views. And I also verified that 4.3 branch does have the corresponding
> check-in (
> https://git1-us-west.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=ebc7ee42cdb2b05a293f54dc687ca975db9acbc3)
> although the check-in message had a little mistake there.
>
>
>
> Could you please verify your Phoenix library version again, Matt?
> Especially the client.
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Thu, Mar 12, 2015 at 6:00 PM, Maryann Xue <maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>> wrote:
>
> Hi Matt,
>
>
>
> Thanks for sharing the query. Using that hint should supposedly force
> sort-merge join no matter what. I will go ahead and verify that.
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Thu, Mar 12, 2015 at 2:25 PM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi Maryann,
>
>
>
> My views:
>
>
>
> *create view "mytestview1" *
>
> *(*
>
> *"rowid" varchar primary key, *
>
> *"metadata"."bId" varchar, *
>
> *"metadata"."dId" varchar*
>
> *) *
>
>
>
> *create view "mytestview2" *
>
> *(*
>
> *"rowid" varchar primary key, *
>
> *"data"."bId" varchar,*
>
> *"data"."details" varchar,*
>
> *"data"."comment" varchar*
>
> *) *
>
>
>
> The amount of data:
>
>
>
> *SELECT count(*) FROM "mytestview1";*
>
> *  -- 78,549*
>
>
>
> *SELECT count(*) FROM "mytestview2";*
>
> *  -- 2,130,905*
>
>
>
> The query:
>
>
>
> Without hint:
>
>
>
> *EXPLAIN select count(*) *
>
> *from "mytestview1" m1*
>
> *inner join "mytestview2" m2*
>
> *on m1."bId" = m2."bId"*
>
>
>
> *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1*
>
> *    SERVER AGGREGATE INTO SINGLE ROW*
>
> *    PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)*
>
> *        CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2*
>
>
>
> With hint:
>
>
>
> *EXPLAIN select /*+ USE_SORT_MERGE_JOIN */ count(*) *
>
> *from "mytestview1" m1*
>
> *inner join "mytestview2" m2*
>
> *on m1."bId" = m2."bId"*
>
>
>
> *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1*
>
> *    SERVER AGGREGATE INTO SINGLE ROW*
>
> *    PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)*
>
> *        CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2*
>
>
>
>
>
> I know that when joining that I should ideally do <large table> join
> <small table>, but this is on my test environment and in production both
> tables are roughly the same size, so I’m trying to force it to use the sort
> merge before running the query in prod. My region servers in test don’t
> have that much heap space (about 2 gigs) if that makes a difference. Do I
> need to force a major compaction, generate statistics, anything like that?
>
>
>
> Thanks!
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 11 March 2015 20:16
>
>
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Views or tables should not matter in this. Would you mind sharing your
> query and DDLs?
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
>
>
> On Wed, Mar 11, 2015 at 6:06 AM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi Maryann,
>
>
>
> I am using 4.3.0 (I upgraded as soon as it was released, largely so I
> could pick up this feature). I am actually joining views rather than tables
> – would this make a difference?
>
>
>
> Cheers,
>
> Matt
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 10 March 2015 20:54
>
>
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> Which version of Phoenix are you using? Sort-merge join is only available
> in Phoenix 4.3.
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Tue, Mar 10, 2015 at 6:11 AM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi Maryann,
>
>
>
> Thanks for clarifying that for me. I’ve been playing with the hint
> *USE_SORT_MERGE_JOIN* to try and solve an issue when joining two very
> large tables:
>
>
>
> *Error: Encountered exception in sub plan [0] execution.*
>
> *SQLState:  null*
>
> *ErrorCode: 0*
>
>
>
> Which I believe is related to a lack of memory for building the hash
> table? I thought that using *SELECT /*+ USE_SORT_MERGE_JOIN*/* would make
> the join much slower but would work on an unlimited data set – did I
> misunderstand? The explain plan does not change with or without this hint:
>
>
>
> CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable1
>
>     SERVER AGGREGATE INTO SINGLE ROW
>
>     PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)
>
>         CLIENT 15-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable2
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Maryann Xue [mailto:maryann....@gmail.com
> <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>]
> *Sent:* 09 March 2015 15:00
> *To:* user@phoenix.apache.org
> <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');>
> *Subject:* Re: Using Hints in Phoenix
>
>
>
> Hi Matt,
>
>
>
> So far in Phoenix, hints are only supported as specified right after
> keywords SELECT, UPSERT and DELETE. Same for join queries. It is currently
> impossible to hint a certain join algorithm for a specific join node in a
> multiple join query. However, for subqueries, the inner query can have its
> own hints, independent of the outer query, like "SELECT /*+ INDEX(t idx1)*/
> col1, col2 FROM t WHERE col3 IN (SELECT /*+ NO_INDEX*/ id FROM r WHERE name
> = 'x')".
>
>
>
>
>
> Thanks,
>
> Maryann
>
>
>
> On Mon, Mar 9, 2015 at 7:26 AM, Matthew Johnson <matt.john...@algomi.com
> <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote:
>
> Hi guys,
>
>
>
> This is more of a general question than a problem – but I’m just wondering
> if someone can clarify for me what the syntax rules are for hints in
> Phoenix. Does it matter where in the query they go? Do they always go
> something like *SELECT <insert hint> x from y*? Or, if the hint is for a
> join (eg Sort Merge) does it go in the join part (*SELECT x from y inner
> join <insert hint> z on j = k*)?
>
>
>
> Couldn’t seem to find anything specific on this in the docs, and haven’t
> worked much with database hints in general so maybe there is a convention
> that I am not aware of – apologies if it’s a stupid question!
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

Re: Using Hints in Phoenix

Reply via email to