Hi Pierre, Thank you very much for reporting this issue! Can you create a JIRA with all the information you've attached above along with the table DDL info? I'll take a look at it.
Thanks, Maryann On Thu, May 12, 2016 at 6:18 AM, pierre lacave <[email protected]> wrote: > Hi > > I am seeing weird result with joins where the output seems to be incomplete > I tried to summarise the problem with the queries bellow. > > in query 1, I do a join over a period for which I would have expected to > return a dozen of rows, but only one is returned for a time T1, > in query 2, I do the same join but filtering for one of the missing row at > time T2 which is now returned ?! > > I re-ran query 1 to make sure it was not a timing issue, but had the same > wrong partial result. > > > Surely that is not the expected behaviour, what would be the next step to > get to the bottom of this? > > > Thanks > > > The two tables are using a salt of 2. > Using Phoenix 4.7, Hbase 1.1 > > > > > > > Query #1 > 0: jdbc:phoenix:localhost:2181> SELECT C.BUCKET, C.TIMESTAMP FROM ( > . . . . . . . . . . . . . . . > SELECT E.BUCKET as BUCKET, L.BUCKET as > LBUCKET, E.TIMESTAMP as TIMESTAMP, L.TIMESTAMP as LTIMESTAMP FROM > . . . . . . . . . . . . . . . > (SELECT BUCKET, TIMESTAMP FROM > EVENT_COUNT > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND > LOCATION = 'Tr/Bal' > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, > LOCATION > . . . . . . . . . . . . . . . > ) E > . . . . . . . . . . . . . . . > JOIN > . . . . . . . . . . . . . . . > (SELECT BUCKET, TIMESTAMP FROM > EVENT_LATENCY > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND > SRC_LOCATION = 'Tr/Bal' AND SRC_LOCATION = DST_LOCATION > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, > SRC_LOCATION, DST_LOCATION > . . . . . . . . . . . . . . . > ) L > . . . . . . . . . . . . . . . > ON L.BUCKET = E.BUCKET AND L.TIMESTAMP > = E.TIMESTAMP > . . . . . . . . . . . . . . . > ) C > . . . . . . . . . . . . . . . > GROUP BY C.BUCKET, C.TIMESTAMP; > +-----------+----------------------+ > | E.BUCKET | E.TIMESTAMP | > +-----------+----------------------+ > | 5SEC | 1462993430000000000 | > +-----------+----------------------+ > 1 row selected (0.169 seconds) > > > > Query #2 > 0: jdbc:phoenix:localhost:2181> SELECT C.BUCKET, C.TIMESTAMP FROM ( > . . . . . . . . . . . . . . . > SELECT E.BUCKET as BUCKET, L.BUCKET as > LBUCKET, E.TIMESTAMP as TIMESTAMP, L.TIMESTAMP as LTIMESTAMP FROM > . . . . . . . . . . . . . . . > (SELECT BUCKET, TIMESTAMP FROM > EVENT_COUNT > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND > LOCATION = 'Tr/Bal' > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 *AND TIMESTAMP = > 1462993520000000000* > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, > LOCATION > . . . . . . . . . . . . . . . > ) E > . . . . . . . . . . . . . . . > JOIN > . . . . . . . . . . . . . . . > (SELECT BUCKET, TIMESTAMP FROM > EVENT_LATENCY > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND > SRC_LOCATION = 'Tr/Bal' AND SRC_LOCATION = DST_LOCATION > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 *AND TIMESTAMP = > 1462993520000000000* > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, > SRC_LOCATION, DST_LOCATION > . . . . . . . . . . . . . . . > ) L > . . . . . . . . . . . . . . . > ON L.BUCKET = E.BUCKET AND L.TIMESTAMP > = E.TIMESTAMP > . . . . . . . . . . . . . . . > ) C > . . . . . . . . . . . . . . . > GROUP BY C.BUCKET, C.TIMESTAMP; > +-----------+----------------------+ > | E.BUCKET | E.TIMESTAMP | > +-----------+----------------------+ > | 5SEC | *1462993520000000000* | > +-----------+----------------------+ > 1 row selected (0.081 seconds) > > > > > > > > For reference the content of each table and the plan of each query bellow > > 0: jdbc:phoenix:localhost:2181> SELECT BUCKET, TIMESTAMP as T FROM > EVENT_LATENCY > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND SRC_LOCATION > = 'Tr/Bal' AND SRC_LOCATION = DST_LOCATION > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, > SRC_LOCATION, DST_LOCATION ORDER BY T DESC; > +---------+----------------------+ > | BUCKET | T | > +---------+----------------------+ > | 5SEC | *1462993520000000000* | > | 5SEC | 1462993515000000000 | > | 5SEC | 1462993510000000000 | > | 5SEC | 1462993505000000000 | > | 5SEC | 1462993490000000000 | > | 5SEC | 1462993485000000000 | > | 5SEC | 1462993480000000000 | > | 5SEC | 1462993475000000000 | > | 5SEC | 1462993470000000000 | > | 5SEC | 1462993430000000000 | > +---------+----------------------+ > > > 0: jdbc:phoenix:localhost:2181> SELECT BUCKET, TIMESTAMP as T FROM > EVENT_COUNT > . . . . . . . . . . . . . . . > WHERE BUCKET = '5SEC' AND LOCATION = > 'Tr/Bal' > . . . . . . . . . . . . . . . > AND TIMESTAMP <= > 1462993520000000000 AND TIMESTAMP > 1462993420000000000 > . . . . . . . . . . . . . . . > GROUP BY BUCKET, TIMESTAMP, LOCATION > order BY T DESC; > +---------+----------------------+ > | BUCKET | T | > +---------+----------------------+ > | 5SEC | *1462993520000000000* | > | 5SEC | 1462993515000000000 | > | 5SEC | 1462993510000000000 | > | 5SEC | 1462993505000000000 | > | 5SEC | 1462993500000000000 | > | 5SEC | 1462993495000000000 | > | 5SEC | 1462993490000000000 | > | 5SEC | 1462993485000000000 | > | 5SEC | 1462993480000000000 | > | 5SEC | 1462993475000000000 | > | 5SEC | 1462993470000000000 | > | 5SEC | 1462993465000000000 | > | 5SEC | 1462993460000000000 | > | 5SEC | 1462993455000000000 | > | 5SEC | 1462993450000000000 | > | 5SEC | 1462993445000000000 | > | 5SEC | 1462993440000000000 | > | 5SEC | 1462993430000000000 | > +---------+----------------------+ > > > > Query #1 plan > > +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | > PLAN > | > > +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | SORT-MERGE-JOIN (INNER) TABLES > > | > | CLIENT 2-CHUNK 963058947 ROWS 100663313890 BYTES PARALLEL 2-WAY SKIP > SCAN ON 2 RANGES OVER EVENT_COUNT [0,'5SEC',~1462993520000000000,'Tr/Bal'] > - [1,'5SEC',~1462993420000000000,'Tr/Bal'] | > | SERVER FILTER BY FIRST KEY ONLY > > | > | SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [BUCKET, > TIMESTAMP, LOCATION] > | > | CLIENT MERGE SORT > > | > | CLIENT SORTED BY [BUCKET, TIMESTAMP] > > | > | AND (SKIP MERGE) > > | > | CLIENT 2-CHUNK 1387870519 ROWS 167352757221 BYTES PARALLEL 2-WAY > SKIP SCAN ON 2 RANGES OVER EVENT_LATENCY > [0,'5SEC',~1462993520000000000,'Tr/Bal'] - > [1,'5SEC',~1462993420000000000,'Tr/Bal'] | > | SERVER FILTER BY FIRST KEY ONLY AND SRC_LOCATION = DST_LOCATION > > | > | SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [BUCKET, > TIMESTAMP, SRC_LOCATION, DST_LOCATION] > | > | CLIENT MERGE SORT > > | > | CLIENT SORTED BY [BUCKET, TIMESTAMP] > > | > | CLIENT SORTED BY [E.BUCKET, E.TIMESTAMP] > > | > | CLIENT AGGREGATE INTO DISTINCT ROWS BY [E.BUCKET, E.TIMESTAMP] > > | > > +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > 14 rows selected (0.089 seconds) > > > Query #2 plan > > +----------------------------------------------------------------------------------------------------------------------------------------------+ > | PLAN > | > > +----------------------------------------------------------------------------------------------------------------------------------------------+ > | SORT-MERGE-JOIN (INNER) TABLES > | > | CLIENT 2-CHUNK 963058947 ROWS 100663313890 BYTES PARALLEL 2-WAY > RANGE SCAN OVER EVENT_COUNT [0,'5SEC',~1462993520000000000,'Tr/Bal'] | > | SERVER FILTER BY FIRST KEY ONLY > | > | SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [BUCKET, > TIMESTAMP, LOCATION] > | > | CLIENT MERGE SORT > | > | CLIENT SORTED BY [BUCKET, TIMESTAMP] > | > | AND (SKIP MERGE) > | > | CLIENT 2-CHUNK 1387870519 ROWS 167352757221 BYTES PARALLEL 2-WAY > RANGE SCAN OVER EVENT_LATENCY [0,'5SEC',~1462993520000000000,'Tr/Bal'] | > | SERVER FILTER BY FIRST KEY ONLY AND SRC_LOCATION = DST_LOCATION > | > | SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [BUCKET, > TIMESTAMP, SRC_LOCATION, DST_LOCATION] > | > | CLIENT MERGE SORT > | > | CLIENT SORTED BY [BUCKET, TIMESTAMP] > | > | CLIENT SORTED BY [E.BUCKET, E.TIMESTAMP] > | > | CLIENT AGGREGATE INTO DISTINCT ROWS BY [E.BUCKET, E.TIMESTAMP] > | > > +----------------------------------------------------------------------------------------------------------------------------------------------+ > 14 rows selected (0.079 seconds) > > > possibly relevant hbase settings > > <property> > <name>phoenix.query.rowKeyOrderSaltedTable</name> > <value>true</value> > <description>Whether or not a non aggregate query returns rows in row > key or > der for salted tables. If this option is turned on, split points may not > be spec > ified at table create time, but instead the default splits on each salt > bucket m > ust be used. Default is true.</description> > </property> > > >
