Hi Matt, Any chance your hint is going to the next line when using squirrel? Just to be sure, make sure you do this:
Navigate to File --> New Session Properties --> Tab SQL and uncheck the "Remove multi line comment (/.../) from SQL before sending it to database" so that hints you include in queries are not stripped. - Samarth On Monday, March 23, 2015, Matthew Johnson <matt.john...@algomi.com> wrote: > Hi Maryann, > > > > My apologies, my Java test app was indeed still referencing 4.2.2 (stupid > copy and paste error) which I have now fixed. Squirrel is very definitely > on 4.3.0 though. Interestingly, here are my results: > > > > JAVA TEST APP – INNER JOIN > > > > SORT-MERGE-JOIN (INNER) TABLES > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 > > AND > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 > > SERVER SORTED BY [T2.ROWID] > > CLIENT MERGE SORT > > > > JAVA TEST APP – FULL OUTER JOIN > > > > SORT-MERGE-JOIN (FULL) TABLES > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 > > SERVER SORTED BY [T1.ROWID] > > CLIENT MERGE SORT > > AND > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 > > SERVER SORTED BY [T2.ROWID] > > CLIENT MERGE SORT > > > > SQUIRREL SQL – INNER JOIN > > > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 > > PARALLEL INNER-JOIN TABLE 0 > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 > > DYNAMIC SERVER FILTER BY ROWID IN (T2.ROWID) > > > > SQUIRREL SQL – FULL OUTER JOIN > > > > SORT-MERGE-JOIN (FULL) TABLES > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 > > SERVER SORTED BY [T1.ROWID] > > CLIENT MERGE SORT > > AND > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 > > SERVER SORTED BY [T2.ROWID] > > CLIENT MERGE SORT > > > > > > So it seems that Squirrel has an issue processing the hint when doing an > inner join, even though it works fine from a basic Java prepared statement. > I’m guessing the workaround for this is to use full outer join whenever I > want Sort Merge in Squirrel, and put a ‘where t2.rowid is not null’ in the > where clause to simulate an inner join? > > > > Thank you so much for all your help thus far – really appreciate it! > > > > Cheers, > > Matt > > > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 19 March 2015 16:38 > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > Ah, there we go. I think the your Squirrel and Java program somehow > referred to the old client JAR and that's why. > > > > > > Thanks, > > Maryann > > > > On Thu, Mar 19, 2015 at 12:28 PM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Thanks Maryann for looking into it. I tried to run the following: > > > > *// Execute the SQL statement and get the results in a Resultset* > > * ResultSet rsLimited = stmtLimited.executeQuery(**"explain SELECT > /*+ USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 full outer join testtable2 > t2 on t2.rowid = t1.rowid"**);* > > *while**(rsLimited.next()) {* > > * String plan = rsLimited.getString(**"PLAN"**);* > > * System.**out**.println(plan);* > > * }* > > > > But got this error: > > > > Executing statement > > *java.sql.SQLFeatureNotSupportedException*: Full joins not supported. > > at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery( > *QueryCompiler.java:248*) > > at org.apache.phoenix.compile.QueryCompiler.compile( > *QueryCompiler.java:138*) > > at > org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan( > *PhoenixStatement.java:322*) > > at > org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan( > *PhoenixStatement.java:305*) > > at > org.apache.phoenix.jdbc.PhoenixStatement$ExecutableExplainStatement.compilePlan( > *PhoenixStatement.java:380*) > > at > org.apache.phoenix.jdbc.PhoenixStatement$ExecutableExplainStatement.compilePlan( > *PhoenixStatement.java:360*) > > at org.apache.phoenix.jdbc.PhoenixStatement$1.call( > *PhoenixStatement.java:221*) > > at org.apache.phoenix.jdbc.PhoenixStatement$1.call( > *PhoenixStatement.java:217*) > > at org.apache.phoenix.call.CallRunner.run(*CallRunner.java:53*) > > at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery( > *PhoenixStatement.java:216*) > > at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery( > *PhoenixStatement.java:1030*) > > at com.algomi.phoenix.PhoenixExplainPlan.main( > *PhoenixExplainPlan.java:24*) > > > > > > Although this, too, seems to work from sqlline: > > > > *0: jdbc:phoenix:sales1,sales2,sales3> explain SELECT * FROM testtable1 t1 > full outer join testtable2 t2 on t2.rowid = t1.rowid;* > > *+------------------------------------------+* > > *| PLAN |* > > *+------------------------------------------+* > > *| SORT-MERGE-JOIN (FULL) TABLES |* > > *| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 |* > > *| SERVER SORTED BY [T1.ROWID] |* > > *| CLIENT MERGE SORT |* > > *| AND |* > > *| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 |* > > *| SERVER SORTED BY [T2.ROWID] |* > > *| CLIENT MERGE SORT |* > > *+------------------------------------------+* > > *8 rows selected (0.032 seconds)* > > > > Does Sqlline use the same phoenix-client jar to make the connection? Or > does it work differently? > > > > Thanks! > > Matt > > > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 19 March 2015 16:08 > > > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > By the way, to answer your previous questions, Phoenix joins have not > started to use stats so far, but the hints are parsed and handled in a > universal way regardless of what type of query it is. > > > > Thanks, > > Maryann > > > > On Thu, Mar 19, 2015 at 12:03 PM, Maryann Xue <maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>> wrote: > > Hi Matt, > > > > Thank you very much for the investigation! Actually I tried with Java code > in a unit test (for I don't have an standalone HBase environment right > now), but could not reproduce it anyway. Then I thought it could be using > "createStatement()" vs "prepareStatement()" after seeing your > investigation, but again that was not it. > > > > But anyway, it just occurred to me that there might be an easy way to > force running sort-merge-join (just for the purpose of testing), by running > a query with FULL OUTER JOIN. Could you please try that out and see what > happens? > > > > > > Thanks, > > Maryann > > > > > > On Thu, Mar 19, 2015 at 7:54 AM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi Maryann, > > > > Sending this to the whole user group now I have a bit more investigation. > Managed to do some digging and also figured out how to use sqlline.py, and > got the following results. > > > > With sqlline.py: > > > > *0: jdbc:phoenix:sales1,sales2,sales3> explain SELECT /*+ > USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 join testtable2 t2 on t2.rowid > = t1.rowid;* > > > > *+------------------------------------------+* > > *| PLAN |* > > *+------------------------------------------+* > > *| SORT-MERGE-JOIN (INNER) TABLES |* > > *| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 |* > > *| AND |* > > *| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 |* > > *| SERVER SORTED BY [T2.ROWID] |* > > *| CLIENT MERGE SORT |* > > *+------------------------------------------+* > > *6 rows selected (0.035 seconds)* > > > > > > So it does seem to work from sqlline!! Then, I decided to try and use it > from inside a Java class: > > > > Class.*forName*("org.apache.phoenix.jdbc.PhoenixDriver"); > > > > Connection conn = DriverManager.*getConnection*( > "jdbc:phoenix:sales1,sales2,sales3", "", ""); > > > > // Create a Statement class to execute the SQL > statement > > Statement stmtLimited = conn.createStatement(); > > > > System.*out*.println("Executing statement"); > > > > // Execute the SQL statement and get the results in > a *Resultset* > > ResultSet rsLimited = stmtLimited.executeQuery("explain > SELECT /*+ USE_SORT_MERGE_JOIN */ * FROM testtable1 t1 join testtable2 t2 > on t2.rowid = t1.rowid"); > > *while*(rsLimited.next()) { > > String plan = rsLimited.getString("PLAN"); > > System.*out*.println(plan); > > } > > > > stmtLimited.close(); > > conn.close(); > > > > And I get: > > > > Executing statement > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE1 > > PARALLEL INNER-JOIN TABLE 0 > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER TESTTABLE2 > > DYNAMIC SERVER FILTER BY ROWID BETWEEN MIN/MAX OF (T2.ROWID) > > > > > > So it seems that when executed from Java code (Squirrel is also a > Java-based client) it ignores the hint! What do you think? Is there a way > around this? > > > > Thanks! > > Matt > > > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 17 March 2015 16:23 > > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > Sorry that I still could not reproduce the issue. Could you try > reproducing it in a unit test? > > > > > > Thanks, > > Maryann > > > > On Fri, Mar 13, 2015 at 6:20 AM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi Maryann, > > > > All hbases in my cluster have: > > > > *phoenix-4.3.0-server.jar* > > > > in the lib folder, and my client is using: > > > > *phoenix-4.3.0-client.jar* > > > > I generated the MD5 in case something has gone wrong with the versioning: > > > > *e0ade979e7b444fb8f8f4b7b5578edab* > > > > And I have opened up the jar, and can see the new class > *SortMergeJoinPlan.class*, so presumably I have the right version – is > there anything else I can check? > > > > Cheers, > > Matt > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 12 March 2015 23:02 > > > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > I checked on my side. Sort-merge-join would work for both tables and > views. And I also verified that 4.3 branch does have the corresponding > check-in ( > https://git1-us-west.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=ebc7ee42cdb2b05a293f54dc687ca975db9acbc3) > although the check-in message had a little mistake there. > > > > Could you please verify your Phoenix library version again, Matt? > Especially the client. > > > > > > Thanks, > > Maryann > > > > On Thu, Mar 12, 2015 at 6:00 PM, Maryann Xue <maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>> wrote: > > Hi Matt, > > > > Thanks for sharing the query. Using that hint should supposedly force > sort-merge join no matter what. I will go ahead and verify that. > > > > > > Thanks, > > Maryann > > > > On Thu, Mar 12, 2015 at 2:25 PM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi Maryann, > > > > My views: > > > > *create view "mytestview1" * > > *(* > > *"rowid" varchar primary key, * > > *"metadata"."bId" varchar, * > > *"metadata"."dId" varchar* > > *) * > > > > *create view "mytestview2" * > > *(* > > *"rowid" varchar primary key, * > > *"data"."bId" varchar,* > > *"data"."details" varchar,* > > *"data"."comment" varchar* > > *) * > > > > The amount of data: > > > > *SELECT count(*) FROM "mytestview1";* > > * -- 78,549* > > > > *SELECT count(*) FROM "mytestview2";* > > * -- 2,130,905* > > > > The query: > > > > Without hint: > > > > *EXPLAIN select count(*) * > > *from "mytestview1" m1* > > *inner join "mytestview2" m2* > > *on m1."bId" = m2."bId"* > > > > *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1* > > * SERVER AGGREGATE INTO SINGLE ROW* > > * PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)* > > * CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2* > > > > With hint: > > > > *EXPLAIN select /*+ USE_SORT_MERGE_JOIN */ count(*) * > > *from "mytestview1" m1* > > *inner join "mytestview2" m2* > > *on m1."bId" = m2."bId"* > > > > *CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest1* > > * SERVER AGGREGATE INTO SINGLE ROW* > > * PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)* > > * CLIENT 5-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytest2* > > > > > > I know that when joining that I should ideally do <large table> join > <small table>, but this is on my test environment and in production both > tables are roughly the same size, so I’m trying to force it to use the sort > merge before running the query in prod. My region servers in test don’t > have that much heap space (about 2 gigs) if that makes a difference. Do I > need to force a major compaction, generate statistics, anything like that? > > > > Thanks! > > Matt > > > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 11 March 2015 20:16 > > > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > Views or tables should not matter in this. Would you mind sharing your > query and DDLs? > > > > > > Thanks, > > Maryann > > > > > > On Wed, Mar 11, 2015 at 6:06 AM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi Maryann, > > > > I am using 4.3.0 (I upgraded as soon as it was released, largely so I > could pick up this feature). I am actually joining views rather than tables > – would this make a difference? > > > > Cheers, > > Matt > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 10 March 2015 20:54 > > > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > Which version of Phoenix are you using? Sort-merge join is only available > in Phoenix 4.3. > > > > > > Thanks, > > Maryann > > > > On Tue, Mar 10, 2015 at 6:11 AM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi Maryann, > > > > Thanks for clarifying that for me. I’ve been playing with the hint > *USE_SORT_MERGE_JOIN* to try and solve an issue when joining two very > large tables: > > > > *Error: Encountered exception in sub plan [0] execution.* > > *SQLState: null* > > *ErrorCode: 0* > > > > Which I believe is related to a lack of memory for building the hash > table? I thought that using *SELECT /*+ USE_SORT_MERGE_JOIN*/* would make > the join much slower but would work on an unlimited data set – did I > misunderstand? The explain plan does not change with or without this hint: > > > > CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable1 > > SERVER AGGREGATE INTO SINGLE ROW > > PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE) > > CLIENT 15-CHUNK PARALLEL 1-WAY FULL SCAN OVER mytable2 > > > > Cheers, > > Matt > > > > > > *From:* Maryann Xue [mailto:maryann....@gmail.com > <javascript:_e(%7B%7D,'cvml','maryann....@gmail.com');>] > *Sent:* 09 March 2015 15:00 > *To:* user@phoenix.apache.org > <javascript:_e(%7B%7D,'cvml','user@phoenix.apache.org');> > *Subject:* Re: Using Hints in Phoenix > > > > Hi Matt, > > > > So far in Phoenix, hints are only supported as specified right after > keywords SELECT, UPSERT and DELETE. Same for join queries. It is currently > impossible to hint a certain join algorithm for a specific join node in a > multiple join query. However, for subqueries, the inner query can have its > own hints, independent of the outer query, like "SELECT /*+ INDEX(t idx1)*/ > col1, col2 FROM t WHERE col3 IN (SELECT /*+ NO_INDEX*/ id FROM r WHERE name > = 'x')". > > > > > > Thanks, > > Maryann > > > > On Mon, Mar 9, 2015 at 7:26 AM, Matthew Johnson <matt.john...@algomi.com > <javascript:_e(%7B%7D,'cvml','matt.john...@algomi.com');>> wrote: > > Hi guys, > > > > This is more of a general question than a problem – but I’m just wondering > if someone can clarify for me what the syntax rules are for hints in > Phoenix. Does it matter where in the query they go? Do they always go > something like *SELECT <insert hint> x from y*? Or, if the hint is for a > join (eg Sort Merge) does it go in the join part (*SELECT x from y inner > join <insert hint> z on j = k*)? > > > > Couldn’t seem to find anything specific on this in the docs, and haven’t > worked much with database hints in general so maybe there is a convention > that I am not aware of – apologies if it’s a stupid question! > > > > Cheers, > > Matt > > > > > > > > > > > > > > > > > > > > >