Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-17 Thread Gerald Sangudi
Thanks Maryann and James.

I'll verify the RC as soon as it's available.

Gerald


On Thu, May 17, 2018 at 5:01 PM, James Taylor 
wrote:

> Hi Gerald,
> The fix for PHOENIX-4508 will appear in the 4.14.0 release which we're
> working on now. We should have a second RC up shortly that you can use to
> verify. The fix isn't in 4.13 since it was checked in after the release.
>
> Thanks,
> James
>
> On Thu, May 17, 2018 at 4:44 PM, Maryann Xue 
> wrote:
>
>> Hey, this is a simple experiment I did based on an existing test case.
>> Look for "query3". The test verifies that there is no "SERVER SORT" in the
>> entire query plan (both salted and unsalted tables have been covered):
>>
>>
>> @Test
>> public void testBug4508() throws Exception {
>> Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>> Connection conn = DriverManager.getConnection(getUrl(), props);
>> props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>> props.setProperty("TenantId", "010");
>> Connection conn010 = DriverManager.getConnection(getUrl(), props);
>> try {
>> // Salted tables
>> String peopleTable = generateUniqueName();
>> String myTable = generateUniqueName();
>> conn.createStatement().execute("CREATE TABLE " + peopleTable + " 
>> (\n" +
>> "PERSON_ID VARCHAR NOT NULL,\n" +
>> "NAME VARCHAR\n" +
>> "CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (PERSON_ID)) 
>> SALT_BUCKETS = 3");
>> conn.createStatement().execute("CREATE TABLE " + myTable + " (\n" +
>> "LOCALID VARCHAR NOT NULL,\n" +
>> "DSID VARCHAR(255) NOT NULL, \n" +
>> "EID CHAR(40),\n" +
>> "HAS_CANDIDATES BOOLEAN\n" +
>> "CONSTRAINT PK_MYTABLE PRIMARY KEY (LOCALID, DSID)) 
>> SALT_BUCKETS = 3");
>> verifyQueryPlanAndResultForBug4508(conn, peopleTable, myTable);
>>
>> // Salted multi-tenant tables
>> String peopleTable2 = generateUniqueName();
>> String myTable2 = generateUniqueName();
>> conn.createStatement().execute("CREATE TABLE " + peopleTable2 + " 
>> (\n" +
>> "TENANT_ID VARCHAR NOT NULL,\n" +
>> "PERSON_ID VARCHAR NOT NULL,\n" +
>> "NAME VARCHAR\n" +
>> "CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (TENANT_ID, 
>> PERSON_ID))\n" +
>> "SALT_BUCKETS = 3, MULTI_TENANT=true");
>> conn.createStatement().execute("CREATE TABLE " + myTable2 + " (\n" +
>> "TENANT_ID VARCHAR NOT NULL,\n" +
>> "LOCALID VARCHAR NOT NULL,\n" +
>> "DSID VARCHAR(255) NOT NULL, \n" +
>> "EID CHAR(40),\n" +
>> "HAS_CANDIDATES BOOLEAN\n" +
>> "CONSTRAINT PK_MYTABLE PRIMARY KEY (TENANT_ID, LOCALID, 
>> DSID))\n" +
>> "SALT_BUCKETS = 3, MULTI_TENANT=true");
>> verifyQueryPlanAndResultForBug4508(conn010, peopleTable2, myTable2);
>> } finally {
>> conn.close();
>> conn010.close();
>> }
>> }
>>
>>
>> private static void verifyQueryPlanAndResultForBug4508(
>> Connection conn, String peopleTable, String myTable) throws Exception {
>> PreparedStatement peopleTableUpsertStmt = conn.prepareStatement(
>> "UPSERT INTO " + peopleTable + " VALUES(?, ?)");
>> peopleTableUpsertStmt.setString(1, "X001");
>> peopleTableUpsertStmt.setString(2, "Marcus");
>> peopleTableUpsertStmt.execute();
>> peopleTableUpsertStmt.setString(1, "X002");
>> peopleTableUpsertStmt.setString(2, "Jenny");
>> peopleTableUpsertStmt.execute();
>> peopleTableUpsertStmt.setString(1, "X003");
>> peopleTableUpsertStmt.setString(2, "Seymour");
>> peopleTableUpsertStmt.execute();
>> conn.commit();
>>
>> PreparedStatement myTableUpsertStmt = conn.prepareStatement(
>> "UPSERT INTO " + myTable + " VALUES(?, ?, ?, ?)");
>> myTableUpsertStmt.setString(1, "X001");
>> myTableUpsertStmt.setString(2, "GROUP");
>> myTableUpsertStmt.setString(3, null);
>> myTableUpsertStmt.setBoolean(4, false);
>> myTableUpsertStmt.execute();
>> myTableUpsertStmt.setString(1, "X001");
>> myTableUpsertStmt.setString(2, "PEOPLE");
>> myTableUpsertStmt.setString(3, null);
>> myTableUpsertStmt.setBoolean(4, false);
>> myTableUpsertStmt.execute();
>> myTableUpsertStmt.setString(1, "X003");
>> myTableUpsertStmt.setString(2, "PEOPLE");
>> myTableUpsertStmt.setString(3, null);
>> myTableUpsertStmt.setBoolean(4, false);
>> myTableUpsertStmt.execute();
>> myTableUpsertStmt.setString(1, "X002");
>> myTableUpsertStmt.setString(2, "PEOPLE");
>> myTableUpsertStmt.setString(3, "Z990");
>> myTableUpsertStmt.setBoolean(4, false);
>> myTableUpsertStmt.execute();
>> conn.commit();
>>
>> String query1 = "SELECT /*+ USE_SORT_MERGE_JOIN*/ COUNT(*)\n" +
>> "FROM " + peopleTable + " ds JOIN " + myTable + " l\n" +
>> "ON ds.PERSON_ID = l.LOCALID\n" +
>> "WHERE l.EID IS NULL AND l.DSID = 'PEOPLE' AND 

Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-17 Thread James Taylor
Hi Gerald,
The fix for PHOENIX-4508 will appear in the 4.14.0 release which we're
working on now. We should have a second RC up shortly that you can use to
verify. The fix isn't in 4.13 since it was checked in after the release.

Thanks,
James

On Thu, May 17, 2018 at 4:44 PM, Maryann Xue  wrote:

> Hey, this is a simple experiment I did based on an existing test case.
> Look for "query3". The test verifies that there is no "SERVER SORT" in the
> entire query plan (both salted and unsalted tables have been covered):
>
>
> @Test
> public void testBug4508() throws Exception {
> Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
> Connection conn = DriverManager.getConnection(getUrl(), props);
> props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
> props.setProperty("TenantId", "010");
> Connection conn010 = DriverManager.getConnection(getUrl(), props);
> try {
> // Salted tables
> String peopleTable = generateUniqueName();
> String myTable = generateUniqueName();
> conn.createStatement().execute("CREATE TABLE " + peopleTable + " (\n" 
> +
> "PERSON_ID VARCHAR NOT NULL,\n" +
> "NAME VARCHAR\n" +
> "CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (PERSON_ID)) 
> SALT_BUCKETS = 3");
> conn.createStatement().execute("CREATE TABLE " + myTable + " (\n" +
> "LOCALID VARCHAR NOT NULL,\n" +
> "DSID VARCHAR(255) NOT NULL, \n" +
> "EID CHAR(40),\n" +
> "HAS_CANDIDATES BOOLEAN\n" +
> "CONSTRAINT PK_MYTABLE PRIMARY KEY (LOCALID, DSID)) 
> SALT_BUCKETS = 3");
> verifyQueryPlanAndResultForBug4508(conn, peopleTable, myTable);
>
> // Salted multi-tenant tables
> String peopleTable2 = generateUniqueName();
> String myTable2 = generateUniqueName();
> conn.createStatement().execute("CREATE TABLE " + peopleTable2 + " 
> (\n" +
> "TENANT_ID VARCHAR NOT NULL,\n" +
> "PERSON_ID VARCHAR NOT NULL,\n" +
> "NAME VARCHAR\n" +
> "CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (TENANT_ID, 
> PERSON_ID))\n" +
> "SALT_BUCKETS = 3, MULTI_TENANT=true");
> conn.createStatement().execute("CREATE TABLE " + myTable2 + " (\n" +
> "TENANT_ID VARCHAR NOT NULL,\n" +
> "LOCALID VARCHAR NOT NULL,\n" +
> "DSID VARCHAR(255) NOT NULL, \n" +
> "EID CHAR(40),\n" +
> "HAS_CANDIDATES BOOLEAN\n" +
> "CONSTRAINT PK_MYTABLE PRIMARY KEY (TENANT_ID, LOCALID, 
> DSID))\n" +
> "SALT_BUCKETS = 3, MULTI_TENANT=true");
> verifyQueryPlanAndResultForBug4508(conn010, peopleTable2, myTable2);
> } finally {
> conn.close();
> conn010.close();
> }
> }
>
>
> private static void verifyQueryPlanAndResultForBug4508(
> Connection conn, String peopleTable, String myTable) throws Exception {
> PreparedStatement peopleTableUpsertStmt = conn.prepareStatement(
> "UPSERT INTO " + peopleTable + " VALUES(?, ?)");
> peopleTableUpsertStmt.setString(1, "X001");
> peopleTableUpsertStmt.setString(2, "Marcus");
> peopleTableUpsertStmt.execute();
> peopleTableUpsertStmt.setString(1, "X002");
> peopleTableUpsertStmt.setString(2, "Jenny");
> peopleTableUpsertStmt.execute();
> peopleTableUpsertStmt.setString(1, "X003");
> peopleTableUpsertStmt.setString(2, "Seymour");
> peopleTableUpsertStmt.execute();
> conn.commit();
>
> PreparedStatement myTableUpsertStmt = conn.prepareStatement(
> "UPSERT INTO " + myTable + " VALUES(?, ?, ?, ?)");
> myTableUpsertStmt.setString(1, "X001");
> myTableUpsertStmt.setString(2, "GROUP");
> myTableUpsertStmt.setString(3, null);
> myTableUpsertStmt.setBoolean(4, false);
> myTableUpsertStmt.execute();
> myTableUpsertStmt.setString(1, "X001");
> myTableUpsertStmt.setString(2, "PEOPLE");
> myTableUpsertStmt.setString(3, null);
> myTableUpsertStmt.setBoolean(4, false);
> myTableUpsertStmt.execute();
> myTableUpsertStmt.setString(1, "X003");
> myTableUpsertStmt.setString(2, "PEOPLE");
> myTableUpsertStmt.setString(3, null);
> myTableUpsertStmt.setBoolean(4, false);
> myTableUpsertStmt.execute();
> myTableUpsertStmt.setString(1, "X002");
> myTableUpsertStmt.setString(2, "PEOPLE");
> myTableUpsertStmt.setString(3, "Z990");
> myTableUpsertStmt.setBoolean(4, false);
> myTableUpsertStmt.execute();
> conn.commit();
>
> String query1 = "SELECT /*+ USE_SORT_MERGE_JOIN*/ COUNT(*)\n" +
> "FROM " + peopleTable + " ds JOIN " + myTable + " l\n" +
> "ON ds.PERSON_ID = l.LOCALID\n" +
> "WHERE l.EID IS NULL AND l.DSID = 'PEOPLE' AND l.HAS_CANDIDATES = FALSE";
> String query2 = "SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*)\n" +
> "FROM (SELECT LOCALID FROM " + myTable + "\n" +
> "WHERE EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = FALSE) l\n" +
> "JOIN " + peopleTable + " ds ON ds.PERSON_ID = l.LOCALID";
> String 

Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-17 Thread Maryann Xue
Hey, this is a simple experiment I did based on an existing test case. Look
for "query3". The test verifies that there is no "SERVER SORT" in the
entire query plan (both salted and unsalted tables have been covered):


@Test
public void testBug4508() throws Exception {
Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
Connection conn = DriverManager.getConnection(getUrl(), props);
props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
props.setProperty("TenantId", "010");
Connection conn010 = DriverManager.getConnection(getUrl(), props);
try {
// Salted tables
String peopleTable = generateUniqueName();
String myTable = generateUniqueName();
conn.createStatement().execute("CREATE TABLE " + peopleTable + " (\n" +
"PERSON_ID VARCHAR NOT NULL,\n" +
"NAME VARCHAR\n" +
"CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (PERSON_ID))
SALT_BUCKETS = 3");
conn.createStatement().execute("CREATE TABLE " + myTable + " (\n" +
"LOCALID VARCHAR NOT NULL,\n" +
"DSID VARCHAR(255) NOT NULL, \n" +
"EID CHAR(40),\n" +
"HAS_CANDIDATES BOOLEAN\n" +
"CONSTRAINT PK_MYTABLE PRIMARY KEY (LOCALID, DSID))
SALT_BUCKETS = 3");
verifyQueryPlanAndResultForBug4508(conn, peopleTable, myTable);

// Salted multi-tenant tables
String peopleTable2 = generateUniqueName();
String myTable2 = generateUniqueName();
conn.createStatement().execute("CREATE TABLE " + peopleTable2 + " (\n" +
"TENANT_ID VARCHAR NOT NULL,\n" +
"PERSON_ID VARCHAR NOT NULL,\n" +
"NAME VARCHAR\n" +
"CONSTRAINT PK_TEST_PEOPLE PRIMARY KEY (TENANT_ID,
PERSON_ID))\n" +
"SALT_BUCKETS = 3, MULTI_TENANT=true");
conn.createStatement().execute("CREATE TABLE " + myTable2 + " (\n" +
"TENANT_ID VARCHAR NOT NULL,\n" +
"LOCALID VARCHAR NOT NULL,\n" +
"DSID VARCHAR(255) NOT NULL, \n" +
"EID CHAR(40),\n" +
"HAS_CANDIDATES BOOLEAN\n" +
"CONSTRAINT PK_MYTABLE PRIMARY KEY (TENANT_ID,
LOCALID, DSID))\n" +
"SALT_BUCKETS = 3, MULTI_TENANT=true");
verifyQueryPlanAndResultForBug4508(conn010, peopleTable2, myTable2);
} finally {
conn.close();
conn010.close();
}
}


private static void verifyQueryPlanAndResultForBug4508(
Connection conn, String peopleTable, String myTable) throws Exception {
PreparedStatement peopleTableUpsertStmt = conn.prepareStatement(
"UPSERT INTO " + peopleTable + " VALUES(?, ?)");
peopleTableUpsertStmt.setString(1, "X001");
peopleTableUpsertStmt.setString(2, "Marcus");
peopleTableUpsertStmt.execute();
peopleTableUpsertStmt.setString(1, "X002");
peopleTableUpsertStmt.setString(2, "Jenny");
peopleTableUpsertStmt.execute();
peopleTableUpsertStmt.setString(1, "X003");
peopleTableUpsertStmt.setString(2, "Seymour");
peopleTableUpsertStmt.execute();
conn.commit();

PreparedStatement myTableUpsertStmt = conn.prepareStatement(
"UPSERT INTO " + myTable + " VALUES(?, ?, ?, ?)");
myTableUpsertStmt.setString(1, "X001");
myTableUpsertStmt.setString(2, "GROUP");
myTableUpsertStmt.setString(3, null);
myTableUpsertStmt.setBoolean(4, false);
myTableUpsertStmt.execute();
myTableUpsertStmt.setString(1, "X001");
myTableUpsertStmt.setString(2, "PEOPLE");
myTableUpsertStmt.setString(3, null);
myTableUpsertStmt.setBoolean(4, false);
myTableUpsertStmt.execute();
myTableUpsertStmt.setString(1, "X003");
myTableUpsertStmt.setString(2, "PEOPLE");
myTableUpsertStmt.setString(3, null);
myTableUpsertStmt.setBoolean(4, false);
myTableUpsertStmt.execute();
myTableUpsertStmt.setString(1, "X002");
myTableUpsertStmt.setString(2, "PEOPLE");
myTableUpsertStmt.setString(3, "Z990");
myTableUpsertStmt.setBoolean(4, false);
myTableUpsertStmt.execute();
conn.commit();

String query1 = "SELECT /*+ USE_SORT_MERGE_JOIN*/ COUNT(*)\n" +
"FROM " + peopleTable + " ds JOIN " + myTable + " l\n" +
"ON ds.PERSON_ID = l.LOCALID\n" +
"WHERE l.EID IS NULL AND l.DSID = 'PEOPLE' AND l.HAS_CANDIDATES = FALSE";
String query2 = "SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*)\n" +
"FROM (SELECT LOCALID FROM " + myTable + "\n" +
"WHERE EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = FALSE) l\n" +
"JOIN " + peopleTable + " ds ON ds.PERSON_ID = l.LOCALID";
String query3 = "SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*)\n" +
"FROM " + myTable + " t1 JOIN " + myTable + " t2\n" +
"ON t1.DSID = t2.DSID\n" +
"WHERE t1.LOCALID = 'X001' AND t2.LOCALID = 'X002'";

for (String q : new String[]{query1, query2, query3}) {
ResultSet rs = conn.createStatement().executeQuery("explain " + q);
String plan = QueryUtil.getExplainPlan(rs);
assertFalse("Tables should not be sorted over their PKs:\n" + plan,
plan.contains("SERVER SORTED BY"));

rs = conn.createStatement().executeQuery(q);
assertTrue(rs.next());

Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-17 Thread Gerald Sangudi
Hi Maryann,

Would you mind sharing the EXPLAIN output you are getting?

I'm not able to find a download for phoenix-4.13.2-HBase or 4.14.0-HBase.
The *-cdh downloads do not work with Amazon EMR. I tried building against
the 4.14.0-rc0 tag. This produced 4.14.0 for phoenix-core.jar, but all the
other jars produced are 4.13.1, including the client jar. When I deploy
these jars on EMR and run queries via sqlline, I get a NoClassDefFoundError.

Are the plans to provide 4.13.2-HBase as a public download?

Thanks,
Gerald

On Wed, May 16, 2018 at 10:40 AM, Maryann Xue  wrote:

> Hi Gerald,
>
> I checked again. Unfortunately this fix is included with 4.13.2 but not
> 4.13.1. Would you mind upgrading your library to 4.13.2?
>
> Thanks,
> Maryann
>
> On Wed, May 16, 2018 at 9:41 AM, Maryann Xue 
> wrote:
>
>> Sorry for the late response. Yes, sure, I will try it right away.
>>
>> On Wed, May 16, 2018 at 9:40 AM Gerald Sangudi 
>> wrote:
>>
>>> Hi Maryann,
>>>
>>> Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior,
>>> a sort on the RHS. Is it possible for you to try it on 4.13.1?
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi 
>>> wrote:
>>>
 Hi Maryann,

 Thanks for verifying against latest. However, I did not detect the fix
 in Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
 upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
 EXPLAIN plan still shows the same issue.

 Thanks,
 Gerald

 On Wed, May 9, 2018 at 11:51 AM, Maryann Xue 
 wrote:

> Hi Gerald,
>
> I have verified against latest Phoenix code that this problem has been
> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
> versions of 4.13 packages now include that fix. Would you mind getting the
> latest Phoenix-4.13 package and testing it again? Thank you!
>
>
> Thanks,
> Maryann
>
> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
> wrote:
>
>> Hello,
>>
>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>> plan:
>>
>> Table:
>>
>> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT
>> NOT NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
>> keyB))SALT_BUCKETS = 64;*
>> EXPLAIN:
>>
>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1
>> JOIN salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>>
>> *+--+-+-+|
>> PLAN   | EST_BYTES_READ |
>> |+--+-+-+|
>> SORT-MERGE-JOIN (INNER) TABLES
>>   | null | || CLIENT
>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null 
>> |
>> || SERVER FILTER BY FIRST KEY ONLY
>>  | null | || CLIENT MERGE SORT
>>| null | || AND
>> (SKIP MERGE) 
>> |
>> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED 
>> [0,20]
>> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>>  | null | || SERVER SORTED BY
>> [T2.KEYB]   | null | || 
>> CLIENT
>> MERGE SORT| null 
>> |
>> || CLIENT AGGREGATE INTO SINGLE ROW
>> | null |
>> |+--+-+-+*
>>
>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this
>> sort necessary? For both JOIN terms T1 and T2, the value of keyA, the
>> leading part of the primary key, is fixed. Furthermore, there is no
>> corresponding sort of T1.KEYB.
>>
>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB
>> nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is
>> there an open ticket? I would be happy to file a ticket and to contribute
>> to a fix. I would appreciate any guidance.
>>
>> Thanks,
>> Gerald
>>
>>
>

>>>
>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-16 Thread Maryann Xue
Hi Gerald,

I checked again. Unfortunately this fix is included with 4.13.2 but not
4.13.1. Would you mind upgrading your library to 4.13.2?

Thanks,
Maryann

On Wed, May 16, 2018 at 9:41 AM, Maryann Xue  wrote:

> Sorry for the late response. Yes, sure, I will try it right away.
>
> On Wed, May 16, 2018 at 9:40 AM Gerald Sangudi 
> wrote:
>
>> Hi Maryann,
>>
>> Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior,
>> a sort on the RHS. Is it possible for you to try it on 4.13.1?
>>
>> Thanks,
>> Gerald
>>
>> On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi 
>> wrote:
>>
>>> Hi Maryann,
>>>
>>> Thanks for verifying against latest. However, I did not detect the fix
>>> in Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
>>> upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
>>> EXPLAIN plan still shows the same issue.
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Wed, May 9, 2018 at 11:51 AM, Maryann Xue 
>>> wrote:
>>>
 Hi Gerald,

 I have verified against latest Phoenix code that this problem has been
 fixed. I have also checked Phoenix 4.13 release tags. Looks like all
 versions of 4.13 packages now include that fix. Would you mind getting the
 latest Phoenix-4.13 package and testing it again? Thank you!


 Thanks,
 Maryann

 On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
 wrote:

> Hello,
>
> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
> plan:
>
> Table:
>
> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT
> NOT NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
> keyB))SALT_BUCKETS = 64;*
> EXPLAIN:
>
> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>
> *+--+-+-+|
> PLAN   | EST_BYTES_READ |
> |+--+-+-+|
> SORT-MERGE-JOIN (INNER) TABLES
>   | null | || CLIENT
> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
> || SERVER FILTER BY FIRST KEY ONLY
>  | null | || CLIENT MERGE SORT
>| null | || AND
> (SKIP MERGE) |
> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED 
> [0,20]
> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>  | null | || SERVER SORTED BY
> [T2.KEYB]   | null | || CLIENT
> MERGE SORT| null |
> || CLIENT AGGREGATE INTO SINGLE ROW
> | null |
> |+--+-+-+*
>
> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this
> sort necessary? For both JOIN terms T1 and T2, the value of keyA, the
> leading part of the primary key, is fixed. Furthermore, there is no
> corresponding sort of T1.KEYB.
>
> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB
> nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is
> there an open ticket? I would be happy to file a ticket and to contribute
> to a fix. I would appreciate any guidance.
>
> Thanks,
> Gerald
>
>

>>>
>>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-16 Thread Maryann Xue
Sorry for the late response. Yes, sure, I will try it right away.
On Wed, May 16, 2018 at 9:40 AM Gerald Sangudi  wrote:

> Hi Maryann,
>
> Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior, a
> sort on the RHS. Is it possible for you to try it on 4.13.1?
>
> Thanks,
> Gerald
>
> On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi 
> wrote:
>
>> Hi Maryann,
>>
>> Thanks for verifying against latest. However, I did not detect the fix in
>> Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
>> upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
>> EXPLAIN plan still shows the same issue.
>>
>> Thanks,
>> Gerald
>>
>> On Wed, May 9, 2018 at 11:51 AM, Maryann Xue 
>> wrote:
>>
>>> Hi Gerald,
>>>
>>> I have verified against latest Phoenix code that this problem has been
>>> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
>>> versions of 4.13 packages now include that fix. Would you mind getting the
>>> latest Phoenix-4.13 package and testing it again? Thank you!
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
>>> wrote:
>>>
 Hello,

 I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
 plan:

 Table:

 *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT
 NOT NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
 keyB))SALT_BUCKETS = 64;*
 EXPLAIN:

 *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
 salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*

 *+--+-+-+|
 PLAN   | EST_BYTES_READ |
 |+--+-+-+|
 SORT-MERGE-JOIN (INNER) TABLES
   | null | || CLIENT
 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
 || SERVER FILTER BY FIRST KEY ONLY
  | null | || CLIENT MERGE SORT
| null | || AND
 (SKIP MERGE) |
 null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
 - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
  | null | || SERVER SORTED BY
 [T2.KEYB]   | null | || CLIENT
 MERGE SORT| null |
 || CLIENT AGGREGATE INTO SINGLE ROW
 | null |
 |+--+-+-+*

 In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
 necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
 part of the primary key, is fixed. Furthermore, there is no corresponding
 sort of T1.KEYB.

 When I EXPLAIN the same query on a non-salted table, neither T1.KEYB
 nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is
 there an open ticket? I would be happy to file a ticket and to contribute
 to a fix. I would appreciate any guidance.

 Thanks,
 Gerald


>>>
>>
>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-16 Thread Gerald Sangudi
Hi Maryann,

Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior, a
sort on the RHS. Is it possible for you to try it on 4.13.1?

Thanks,
Gerald

On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi 
wrote:

> Hi Maryann,
>
> Thanks for verifying against latest. However, I did not detect the fix in
> Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
> upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
> EXPLAIN plan still shows the same issue.
>
> Thanks,
> Gerald
>
> On Wed, May 9, 2018 at 11:51 AM, Maryann Xue 
> wrote:
>
>> Hi Gerald,
>>
>> I have verified against latest Phoenix code that this problem has been
>> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
>> versions of 4.13 packages now include that fix. Would you mind getting the
>> latest Phoenix-4.13 package and testing it again? Thank you!
>>
>>
>> Thanks,
>> Maryann
>>
>> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
>> wrote:
>>
>>> Hello,
>>>
>>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>>> plan:
>>>
>>> Table:
>>>
>>> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT NOT
>>> NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
>>> keyB))SALT_BUCKETS = 64;*
>>> EXPLAIN:
>>>
>>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
>>> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>>>
>>> *+--+-+-+|
>>> PLAN   | EST_BYTES_READ |
>>> |+--+-+-+|
>>> SORT-MERGE-JOIN (INNER) TABLES
>>>   | null | || CLIENT
>>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
>>> || SERVER FILTER BY FIRST KEY ONLY
>>>  | null | || CLIENT MERGE SORT
>>>| null | || AND
>>> (SKIP MERGE) |
>>> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
>>> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>>>  | null | || SERVER SORTED BY
>>> [T2.KEYB]   | null | || CLIENT
>>> MERGE SORT| null |
>>> || CLIENT AGGREGATE INTO SINGLE ROW
>>> | null |
>>> |+--+-+-+*
>>>
>>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
>>> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
>>> part of the primary key, is fixed. Furthermore, there is no corresponding
>>> sort of T1.KEYB.
>>>
>>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
>>> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
>>> an open ticket? I would be happy to file a ticket and to contribute to a
>>> fix. I would appreciate any guidance.
>>>
>>> Thanks,
>>> Gerald
>>>
>>>
>>
>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-11 Thread Gerald Sangudi
Hi Maryann,

Thanks for verifying against latest. However, I did not detect the fix in
Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
EXPLAIN plan still shows the same issue.

Thanks,
Gerald

On Wed, May 9, 2018 at 11:51 AM, Maryann Xue  wrote:

> Hi Gerald,
>
> I have verified against latest Phoenix code that this problem has been
> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
> versions of 4.13 packages now include that fix. Would you mind getting the
> latest Phoenix-4.13 package and testing it again? Thank you!
>
>
> Thanks,
> Maryann
>
> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
> wrote:
>
>> Hello,
>>
>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>> plan:
>>
>> Table:
>>
>> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT NOT
>> NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
>> keyB))SALT_BUCKETS = 64;*
>> EXPLAIN:
>>
>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
>> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>>
>> *+--+-+-+|
>> PLAN   | EST_BYTES_READ |
>> |+--+-+-+|
>> SORT-MERGE-JOIN (INNER) TABLES
>>   | null | || CLIENT
>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
>> || SERVER FILTER BY FIRST KEY ONLY
>>  | null | || CLIENT MERGE SORT
>>| null | || AND
>> (SKIP MERGE) |
>> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
>> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>>  | null | || SERVER SORTED BY
>> [T2.KEYB]   | null | || CLIENT
>> MERGE SORT| null |
>> || CLIENT AGGREGATE INTO SINGLE ROW
>> | null |
>> |+--+-+-+*
>>
>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
>> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
>> part of the primary key, is fixed. Furthermore, there is no corresponding
>> sort of T1.KEYB.
>>
>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
>> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
>> an open ticket? I would be happy to file a ticket and to contribute to a
>> fix. I would appreciate any guidance.
>>
>> Thanks,
>> Gerald
>>
>>
>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-09 Thread Maryann Xue
Hi Gerald,

I have verified against latest Phoenix code that this problem has been
fixed. I have also checked Phoenix 4.13 release tags. Looks like all
versions of 4.13 packages now include that fix. Would you mind getting the
latest Phoenix-4.13 package and testing it again? Thank you!


Thanks,
Maryann

On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
wrote:

> Hello,
>
> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN plan:
>
> Table:
>
> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT NOT
> NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
> keyB))SALT_BUCKETS = 64;*
> EXPLAIN:
>
> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>
> *+--+-+-+|
> PLAN   | EST_BYTES_READ |
> |+--+-+-+|
> SORT-MERGE-JOIN (INNER) TABLES
>   | null | || CLIENT
> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
> || SERVER FILTER BY FIRST KEY ONLY
>  | null | || CLIENT MERGE SORT
>| null | || AND
> (SKIP MERGE) |
> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>  | null | || SERVER SORTED BY
> [T2.KEYB]   | null | || CLIENT
> MERGE SORT| null |
> || CLIENT AGGREGATE INTO SINGLE ROW
> | null |
> |+--+-+-+*
>
> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
> part of the primary key, is fixed. Furthermore, there is no corresponding
> sort of T1.KEYB.
>
> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
> an open ticket? I would be happy to file a ticket and to contribute to a
> fix. I would appreciate any guidance.
>
> Thanks,
> Gerald
>
>


Re: SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-08 Thread Maryann Xue
Hi Gerald,

Thank you for finding this issue! I think it is similar to PHOENIX-4508.
I'll verify your case on the latest Phoenix branch and see if it has been
fixed.


Thanks,
Maryann

On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi 
wrote:

> Hello,
>
> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN plan:
>
> Table:
>
> *CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT NOT
> NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
> keyB))SALT_BUCKETS = 64;*
> EXPLAIN:
>
> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>
> *+--+-+-+|
> PLAN   | EST_BYTES_READ |
> |+--+-+-+|
> SORT-MERGE-JOIN (INNER) TABLES
>   | null | || CLIENT
> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
> || SERVER FILTER BY FIRST KEY ONLY
>  | null | || CLIENT MERGE SORT
>| null | || AND
> (SKIP MERGE) |
> null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
> - [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
>  | null | || SERVER SORTED BY
> [T2.KEYB]   | null | || CLIENT
> MERGE SORT| null |
> || CLIENT AGGREGATE INTO SINGLE ROW
> | null |
> |+--+-+-+*
>
> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
> part of the primary key, is fixed. Furthermore, there is no corresponding
> sort of T1.KEYB.
>
> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
> an open ticket? I would be happy to file a ticket and to contribute to a
> fix. I would appreciate any guidance.
>
> Thanks,
> Gerald
>
>


SORT_MERGE_JOIN on non-leading key: server-side sorting

2018-05-08 Thread Gerald Sangudi
Hello,

I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN plan:

Table:

*CREATE TABLE salted (   keyA BIGINT NOT NULL,   keyB BIGINT NOT
NULL,   val SMALLINT,   CONSTRAINT pk PRIMARY KEY (keyA,
keyB))SALT_BUCKETS = 64;*
EXPLAIN:

*EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*

*+--+-+-+|
PLAN   | EST_BYTES_READ |
|+--+-+-+|
SORT-MERGE-JOIN (INNER) TABLES
  | null | || CLIENT
64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
|| SERVER FILTER BY FIRST KEY ONLY
 | null | || CLIENT MERGE SORT
   | null | || AND
(SKIP MERGE) |
null | || CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
- [63,20]  | null | || SERVER FILTER BY FIRST KEY ONLY
 | null | || SERVER SORTED BY
[T2.KEYB]   | null | || CLIENT
MERGE SORT| null |
|| CLIENT AGGREGATE INTO SINGLE ROW
| null |
|+--+-+-+*

In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
part of the primary key, is fixed. Furthermore, there is no corresponding
sort of T1.KEYB.

When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
an open ticket? I would be happy to file a ticket and to contribute to a
fix. I would appreciate any guidance.

Thanks,
Gerald