[jira] [Commented] (PHOENIX-7253) Perf improvement for non-full scan queries on large table
[ https://issues.apache.org/jira/browse/PHOENIX-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823491#comment-17823491 ] Daniel Wong commented on PHOENIX-7253: -- I found some of my notes: the Phoenix code doing this is a hack given HBase limitations... in 1.x there was no easy access to the HBase client's metadata cache, and this method was used to get at the cached values indirectly, as opposed to directly calling {{HConnection.getAllRegionLocations}}. This code path is called from {{getAllTableRegions}}, which is used when the Phoenix client wants/needs to update the region info for an HBase table. This can happen during scan generation or when a split/merge is detected. The method calls {{HConnection.getRegionLocation}} iteratively, one region at a time, to find the next region. This is done so that it uses the cache in the HConnection itself rather than HTable.getAllRegionLocations, which triggers a meta scan and a re-cache on the HBase client of all the regions in the 1.x codebase. Where this can become an issue: if one region splits, many scans may simultaneously call this method, overloading the client threads as well as possibly meta. Some protection so that a client can only call this from a single thread for a single table would improve this flow greatly. Whether this should be done inside HBase itself or in Phoenix is debatable, but Phoenix intends to use the HBase client cache rather than store its own cache of region boundaries. I haven't had time to explore the 2.x codebase, but much of this has changed and been replaced by the async region locator flow here [https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncTableRegionLocatorImpl.java#L59], and this appears to again call and then cache rather than use the cache where available... some trickiness here. 
What the code originally wanted/intended was to get all the regions from the HConnection cache and only renew the cache on a detected split/merge, IIRC. > Perf improvement for non-full scan queries on large table > - > > Key: PHOENIX-7253 > URL: https://issues.apache.org/jira/browse/PHOENIX-7253 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.2.0, 5.1.3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > > Any considerably large table with more than 100k regions can give problematic > performance if we access all region locations from meta for the given table > before generating parallel or sequential scans for the given query. The perf > impact can really hurt range scan queries. > Consider a table with hundreds of thousands of tenant views. Unless the query > is a strict point lookup, any query on any tenant view would end up retrieving > region locations of all regions of the base table. If an IOException is > thrown by the HBase client during any region location lookup in meta, we only > perform a single retry. > Proposal: > # All non point lookup queries should only retrieve region locations that > cover the scan boundary. Avoid fetching all region locations of the base > table. > # Make retries configurable with a higher default value. > > Sample stacktrace from the multiple failures observed: > {code:java} > java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table regions.Stack > trace: java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table > regions. 
> at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:620) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:229) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:781) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.iterate.DefaultParallelScanGrouper.getRegionBoundaries(DefaultParallelScanGrouper.java:74) > at > org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:587) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:936) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:669) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:555) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:69) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) > at
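The protection Daniel suggests above — a client calling the region-refresh path from only a single thread per table — is essentially a "single-flight" guard: one thread does the lookup while concurrent callers for the same table wait and share its result. A minimal sketch in Python (illustrative only; `RegionLocationRefresher` and `fetch_fn` are hypothetical names, not Phoenix/HBase APIs):

```python
import threading

class RegionLocationRefresher:
    """Single-flight guard: at most one thread refreshes a table's region
    list at a time; concurrent callers wait for and share that result."""

    def __init__(self, fetch_fn):
        self._fetch_fn = fetch_fn  # stands in for one bulk meta lookup
        self._lock = threading.Lock()
        self._inflight = {}        # table name -> Event for in-progress fetch
        self._cache = {}           # table name -> cached region list

    def get_regions(self, table):
        with self._lock:
            if table in self._cache:
                return self._cache[table]
            ev = self._inflight.get(table)
            if ev is None:                    # we become the refreshing thread
                ev = threading.Event()
                self._inflight[table] = ev
                leader = True
            else:
                leader = False
        if leader:
            try:
                regions = self._fetch_fn(table)   # one meta scan, not N
                with self._lock:
                    self._cache[table] = regions
            finally:
                with self._lock:
                    del self._inflight[table]
                ev.set()
            return regions
        ev.wait()                  # follower: reuse the leader's result
        with self._lock:
            # May be None if the leader's fetch failed; a real client
            # would retry here rather than return None.
            return self._cache.get(table)

    def invalidate(self, table):
        # Called when a split/merge is detected, forcing one fresh fetch.
        with self._lock:
            self._cache.pop(table, None)
```

With this shape, a region split that makes many scans request fresh locations at once results in a single meta lookup instead of a thundering herd.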
[jira] [Commented] (PHOENIX-7253) Perf improvement for non-full scan queries on large table
[ https://issues.apache.org/jira/browse/PHOENIX-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823483#comment-17823483 ] Daniel Wong commented on PHOENIX-7253: -- Hi [~vjasani], I wanted to mention that you need to be a bit careful about this change: the entire "getAllTableRegions" function is a bit of a hack in the first place, due to the restriction on how HBase caches table regions, and it actually serves 2 purposes in Phoenix code... the right fix probably needs an HBase API update as well. I'm trying to find the context in all of my docs/notes/conversations, but we should probably sync on this case prior to implementation changes. > Perf improvement for non-full scan queries on large table > - > > Key: PHOENIX-7253 > URL: https://issues.apache.org/jira/browse/PHOENIX-7253 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.2.0, 5.1.3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > > Any considerably large table with more than 100k regions can give problematic > performance if we access all region locations from meta for the given table > before generating parallel or sequential scans for the given query. The perf > impact can really hurt range scan queries. > Consider a table with hundreds of thousands of tenant views. Unless the query > is a strict point lookup, any query on any tenant view would end up retrieving > region locations of all regions of the base table. If an IOException is > thrown by the HBase client during any region location lookup in meta, we only > perform a single retry. > Proposal: > # All non point lookup queries should only retrieve region locations that > cover the scan boundary. Avoid fetching all region locations of the base > table. > # Make retries configurable with a higher default value. 
> > Sample stacktrace from the multiple failures observed: > {code:java} > java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table regions.Stack > trace: java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table > regions. > at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:620) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:229) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:781) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.iterate.DefaultParallelScanGrouper.getRegionBoundaries(DefaultParallelScanGrouper.java:74) > at > org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:587) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:936) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:669) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:555) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:69) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:374) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:222) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:217) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:212) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:370) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:328) > at 
org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:328) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:320) > at > org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeQuery(PhoenixPreparedStatement.java:188) > ... > ... > Caused by: java.io.InterruptedIOException: Origin: InterruptedException > at > org.apache.hadoop.hbase.util.ExceptionUtil.asInterrupt(ExceptionUtil.java:72) > at > org.apache.hadoop.hbase.client.ConnectionImplementation.takeUserRegionLock(ConnectionImplementation.java:1129) > at > org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:994) > at >
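Proposal #1 in the issue above — "only retrieve region locations that cover the scan boundary" — reduces to a range query over the sorted region start keys. A hedged sketch (Python, hypothetical helper name; real region keys are byte arrays and an empty stop key means "unbounded", which this sketch does not handle):

```python
from bisect import bisect_left, bisect_right

def regions_for_scan(region_starts, scan_start, scan_stop):
    """Return the start keys of only the regions overlapping the scan
    range [scan_start, scan_stop), instead of all regions of the table.

    region_starts: sorted list of region start keys; region i spans
    [region_starts[i], region_starts[i+1]) and the last region is unbounded.
    """
    # The region containing scan_start is the last one starting <= scan_start.
    first = max(bisect_right(region_starts, scan_start) - 1, 0)
    # Regions whose start key is >= scan_stop cannot overlap the scan.
    last = bisect_left(region_starts, scan_stop)
    return region_starts[first:last]
```

For a table with 100k+ regions this turns the pre-scan lookup from O(all regions) into a lookup proportional to the regions the query actually touches, which is the whole point of the proposal for range scans on tenant views.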
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750043#comment-17750043 ] Daniel Wong commented on PHOENIX-5833: -- Thanks [~stoty] for doing this I'm not as active in Phoenix but saw you did this and a bunch of other cleanup today! > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Istvan Toth >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
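The expected semantics of the RVC in Queries 2 and 3 above are plain lexicographic tuple comparison, which Python tuples model directly. This sketch (illustrative, not Phoenix code) shows that the row 'RRS_ZYTDT' satisfies both the RVC and the LIKE filter, so Query 3's empty result is indeed wrong:

```python
rows = [("abc", "def", "RRSQ_IMKKL"), ("abc", "def", "RRS_ZYTDT")]

# Query 2: (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') -- lexicographic
# comparison; '_' (0x5F) sorts after 'Q' (0x51), so RRS_ZYTDT qualifies.
q2 = [r for r in rows if r > ("abc", "def", "RRSQ_IMKKL")]

# Query 3 adds C LIKE 'RRS\_%', i.e. a literal underscore after "RRS";
# applying it to Query 2's result should keep the same single row.
q3 = [r for r in q2 if r[2].startswith("RRS_")]
```

Since both filters accept ('abc', 'def', 'RRS_ZYTDT'), combining them with AND should return that row, matching Query 1 and Query 2 individually.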
[jira] [Commented] (PHOENIX-6796) Pluggable query logger in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610786#comment-17610786 ] Daniel Wong commented on PHOENIX-6796: -- For implementation, my thought was three components: the query logger can be one class composed of a sampler controller (how often and what type of queries), a sanitizer, and a publisher. The default might be: always sample, publish to system.log, and don't sanitize. > Pluggable query logger in Phoenix > - > > Key: PHOENIX-6796 > URL: https://issues.apache.org/jira/browse/PHOENIX-6796 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Minor > > > Requirements > # Pluggable logger or log publisher > ## We don't want to use system.log for multiple reasons, from auditing to > load uncertainty; in addition we need 2 publishing implementations to handle > future/past deployment patterns. (Think of this as taking the same info we > would push to system.log but, for example, using log4j to publish.) > ## We need to sanitize personal information prior to logging for GDPR etc. > (I have some ideas on how to do this easily using the syntax tree, > sanitizing everything by replacing constants with ? or similar.) > # We need some client side performance metrics embedded in the logged > queries. Most of this is there today I believe, but we need to audit. > # We need errors/exceptions with the query logging. > # Ability to log every ddl/dml/dql. > # Logging capability across phoenix jdbc/phoenix spark/phoenix MR/phoenix pig > Some initial discussion on Apache slack > [thread|https://the-asf.slack.com/archives/CPGHNKLSK/p1620026444027500] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
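The sampler/sanitizer/publisher decomposition suggested in the comment can be sketched as three small pluggable interfaces behind one logger class. This is a hedged Python sketch, not the Phoenix implementation: all class names are hypothetical, the regex-based sanitizer is a stand-in for the syntax-tree approach the issue proposes, and printing to stdout stands in for publishing to system.log:

```python
import re

class Sampler:
    """Controls how often and which query types get logged."""
    def should_log(self, sql):
        return True                     # default: always sample

class Sanitizer:
    """Replaces literals with ? so personal data never reaches the log.
    A real implementation would walk the parse tree, not use regexes."""
    _LITERAL = re.compile(r"'[^']*'|\b\d+\b")
    def sanitize(self, sql):
        return self._LITERAL.sub("?", sql)

class Publisher:
    """Destination for sanitized queries (system.log, log4j, ...)."""
    def publish(self, sql):
        print(sql)                      # stdout stands in for system.log

class QueryLogger:
    def __init__(self, sampler=None, sanitizer=None, publisher=None):
        self.sampler = sampler or Sampler()
        self.sanitizer = sanitizer or Sanitizer()
        self.publisher = publisher or Publisher()

    def log(self, sql):
        if self.sampler.should_log(sql):
            self.publisher.publish(self.sanitizer.sanitize(sql))
```

Swapping any component (e.g. a log4j-backed Publisher, or a Sampler that only logs DDL) leaves the other two untouched, which is the pluggability the requirements ask for.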
[jira] [Commented] (PHOENIX-6491) Phoenix High Availability
[ https://issues.apache.org/jira/browse/PHOENIX-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529110#comment-17529110 ] Daniel Wong commented on PHOENIX-6491: -- Initial WIP feature commit from our internal fork. [https://github.com/apache/phoenix/pull/1430] > Phoenix High Availability > - > > Key: PHOENIX-6491 > URL: https://issues.apache.org/jira/browse/PHOENIX-6491 > Project: Phoenix > Issue Type: New Feature > Components: core >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > > This JIRA proposes Phoenix High Availability (HA) feature which allows > Phoenix users to interact with multiple Phoenix/HBase clusters in order to > achieve additional availability compared to a single cluster. In particular > we target the common deployment configuration of 2 HBase clusters with > master/master asynchronous replication enabled between the queried tables, > but with consideration to future extensions in use cases, replication, and > number of clusters. > > Currently targeted use cases: > # Active-Standby HA for disaster recovery, enables end users to switch HBase > clusters (triggered by administrators) collectively across multiple clients > without restarting. > # Active-Active HA for immutable use cases with point get queries without > deletes, enables a client to connect to both clusters simultaneously for > these use cases which inherently have relaxed consistency requirements. > Concepts: > * HA Group - An HA group is an association between a pair of HBase clusters, > a group of Phoenix clients, metadata state, and an HA policy (see below). HA > groups are pre-defined and a client provides the group name when creating a > phoenix connection to the clusters in that HA group. Note that the same pair > of HBase clusters can be in multiple HA groups. This allows clients to be > grouped based on different use cases, availability requirements, consistency > requirements, and load balancing. 
> * HA Policy - Every HA group has an associated HA policy which specifies how > to use the HBase clusters pair. This is implemented by an interface that > replaces the JDBC Connection as well as changes in the public APIs in > QueryServices. Currently, there are 2 policies one for each targeted use case > defined above. It is possible to support more HA policies in future for > incoming use cases. > * Metadata Store - Stores the state / manages the state transitions of an HA > group. For example in the Active-Standby setup the store manages which > cluster is currently Active to which all clients in that HA group should > connect. For a particular HA group an entry is referred to as a Cluster Role > Record. > * HA Client - JDBC implementation as well as a handler for metadata store > state changes. End users will use via PhoenixDriver with JDBC string with > special format {{jdbc:phoenix:[zk1,zk2,zk3|zk1',zk2',zk3']}} and HA group > name in the connection properties. Using such a JDBC connection for creating > {{Statement}} or querying a {{ResultSet}} does not require any application > code change. Internally, the implementation will serve incoming client > operation requests according to the HA policy of that group. > More details to come with a detailed design document. -- This message was sent by Atlassian Jira (v8.20.7#820007)
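The JDBC string format described above, {{jdbc:phoenix:[zk1,zk2,zk3|zk1',zk2',zk3']}}, pairs two ZooKeeper quorums in one URL. A minimal parsing sketch (Python, hypothetical helper; the real parsing lives inside the HA client's PhoenixDriver support, and this ignores port/znode suffixes):

```python
def parse_ha_url(url):
    """Split an HA JDBC URL of the form
    jdbc:phoenix:[zkA1,zkA2,zkA3|zkB1,zkB2,zkB3]
    into the two clusters' ZooKeeper quorums."""
    prefix = "jdbc:phoenix:"
    if not url.startswith(prefix):
        raise ValueError("not a phoenix JDBC url")
    body = url[len(prefix):]
    if not (body.startswith("[") and body.endswith("]") and "|" in body):
        raise ValueError("not an HA url: expected [quorum1|quorum2]")
    left, right = body[1:-1].split("|", 1)
    return left.split(","), right.split(",")
```

Which of the two quorums a given Statement actually talks to is then decided by the HA policy of the group named in the connection properties, not by position in the URL.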
[jira] [Comment Edited] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507901#comment-17507901 ] Daniel Wong edited comment on PHOENIX-6669 at 3/16/22, 11:28 PM: - The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. I do agree the wrong result is the critical issue, of course, but I want us to implement the full fix. was (Author: dbwong): The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where PK1 ='xx' > . . . . . . . . . . . . .> and (PK1 > 'xx' AND PK1 <= 'xx') > . . . . . . . . . . . . .> and (PK2 > 5 AND PK2 <=5) > . . . . . . . . . . . . .> and (PK3 > 2 AND PK3 <=2); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . 
.> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . .> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> explain select PK2 from DUMMY where PK1 ='xx' and > (PK1 > 'xx' AND PK1 <= 'xx') and (PK2 > 5 AND PK2 <=5) and (PK3 > 2 AND PK3 > <=2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.015 seconds){code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507901#comment-17507901 ] Daniel Wong commented on PHOENIX-6669: -- The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where PK1 ='xx' > . . . . . . . . . . . . .> and (PK1 > 'xx' AND PK1 <= 'xx') > . . . . . . . . . . . . .> and (PK2 > 5 AND PK2 <=5) > . . . . . . . . . . . . .> and (PK3 > 2 AND PK3 <=2); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . 
.> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> explain select PK2 from DUMMY where PK1 ='xx' and > (PK1 > 'xx' AND PK1 <= 'xx') and (PK2 > 5 AND PK2 <=5) and (PK3 > 2 AND PK3 > <=2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.015 seconds){code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507784#comment-17507784 ] Daniel Wong commented on PHOENIX-6669: -- [~yanxinyi] careful with your transformations; I'm worried you may not be testing cleanly. Why not use the actual query rewrite rather than these non-equivalent rewrites? > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and ((PK2 > 5 AND PK2 <=5)) > . . . . . . . . . . . . .> and ((PK3 > 2 AND PK3 <=2)); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . 
.> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and ((PK2 > 5 AND PK2 <=5)) and ((PK3 > 2 AND PK3 <=2)); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.005 seconds) {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
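Both queries quoted above are actually unsatisfiable: the per-column form asks for values in empty intervals like 5 < PK2 <= 5, and the RVC form asks for a tuple strictly greater than and also less-than-or-equal-to the same bound. A quick Python check (illustrative; tuples model RVC lexicographic comparison) confirms neither predicate can accept the upserted row, so the RVC query returning PK2 = 7 is the bug:

```python
row = ("xx", 7, 7)          # the row upserted in the issue description
lo = ("xx", 5, 2)

# Per-column conjunction from the first query: 5 < PK2 <= 5 and
# 2 < PK3 <= 2 are both empty intervals, so no row can qualify.
per_column = (5 < row[1] <= 5) and (2 < row[2] <= 2)

# RVC pair from the second query: lo < (PK1,PK2,PK3) <= lo is an
# empty tuple interval for any total order, so it is also unsatisfiable.
rvc = lo < row <= lo
```

Phoenix correctly plans the first form as a DEGENERATE SCAN but fails to detect the same contradiction in the RVC form, which is why the two plans and results diverge.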
[jira] [Comment Edited] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong edited comment on PHOENIX-6659 at 3/3/22, 1:37 AM: --- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('b',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. was (Author: dbwong): [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. 
> RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); >
[jira] [Comment Edited] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong edited comment on PHOENIX-6659 at 3/3/22, 1:34 AM: --- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. was (Author: dbwong): [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1); and SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1); works but consider SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); these are not equivalent. 
> RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > |
[jira] [Commented] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong commented on PHOENIX-6659: -- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1); and SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1); works but consider SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); these are not equivalent. > RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > 
+--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',3] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | >
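The non-equivalence called out in the comment above can be checked mechanically. Below is a minimal plain-Java sketch (hypothetical illustration, not Phoenix code) of lexicographic row-value-constructor comparison: for a row ('b', 2, 0), which satisfies PK1 = 'a' OR PK1 = 'b', the original predicate (PK1,PK2,PK3) <= ('a',3,1) is false, while the reduced predicate (PK2,PK3) <= (3,1) is true.

```java
import java.util.Arrays;

public class RvcReduction {

    // Lexicographic comparison of two equal-length tuples, as an RVC does.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareTuples(Comparable[] a, Comparable[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Row ('b', 2, 0): satisfies PK1 = 'a' OR PK1 = 'b'.
        Comparable[] row = {"b", 2, 0};

        // Original predicate: (PK1, PK2, PK3) <= ('a', 3, 1) -> false ('b' > 'a').
        boolean original = compareTuples(row, new Comparable[]{"a", 3, 1}) <= 0;

        // Reduced predicate: (PK2, PK3) <= (3, 1) -> true ((2,0) < (3,1)).
        boolean reduced = compareTuples(
                Arrays.copyOfRange(row, 1, 3), new Comparable[]{3, 1}) <= 0;

        // The reduction changed the answer for this row, so dropping PK1 from
        // the RVC is only valid when PK1 is pinned to a single value.
        System.out.println("original=" + original + ", reduced=" + reduced);
    }
}
```

So the reduction is only safe when the leading RVC columns are constrained to exactly one value, as in the (PK1 = 'a') case.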
[jira] [Commented] (PHOENIX-6604) Allow using indexes for wildcard topN queries on salted tables
[ https://issues.apache.org/jira/browse/PHOENIX-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453471#comment-17453471 ] Daniel Wong commented on PHOENIX-6604: -- Some of us are still watching Lars. > Allow using indexes for wildcard topN queries on salted tables > -- > > Key: PHOENIX-6604 > URL: https://issues.apache.org/jira/browse/PHOENIX-6604 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.1.2 >Reporter: Lars Hofhansl >Priority: Major > Fix For: 5.1.3 > > Attachments: 6604-1.5.1.3, 6604.5.1.3 > > > Just randomly came across this, playing with TPCH data. > {code:java} > CREATE TABLE lineitem ( > orderkey bigint not null, > partkey bigint, > suppkey bigint, > linenumber integer not null, > quantity double, > extendedprice double, > discount double, > tax double, > returnflag varchar(1), > linestatus varchar(1), > shipdate date, > commitdate date, > receiptdate date, > shipinstruct varchar(25), > shipmode varchar(10), > comment varchar(44) > constraint pk primary key(orderkey, linenumber)) > IMMUTABLE_ROWS=true,SALT_BUCKETS=4; > CREATE LOCAL INDEX l_shipdate ON lineitem(shipdate);{code} > Now: > {code:java} > > explain select * from lineitem order by shipdate limit 1; > +---+ > | PLAN > | > +---+ > | CLIENT 199-CHUNK 8859938 ROWS 2044738843 BYTES PARALLEL 199-WAY FULL SCAN > OVER LI | > | SERVER TOP 1 ROW SORTED BY [SHIPDATE] > | > | CLIENT MERGE SORT > | > | CLIENT LIMIT 1 > | > +---+ > 4 rows selected (6.525 seconds) > -- SAME COLUMNS! 
> > explain select ORDERKEY, PARTKEY, SUPPKEY, LINENUMBER, QUANTITY, > > EXTENDEDPRICE, DISCOUNT, TAX, RETURNFLAG, LINESTATUS, SHIPDATE, COMMITDATE, > > RECEIPTDATE, SHIPINSTRUCT, SHIPMODE, COMMENT from lineitem order by > > shipdate limit 1; > +---+ > | > | > +---+ > | CLIENT 4-CHUNK 4 ROWS 204 BYTES PARALLEL 4-WAY RANGE SCAN OVER LINEITEM [1] > | > | SERVER MERGE [0.PARTKEY, 0.SUPPKEY, 0.QUANTITY, 0.EXTENDEDPRICE, > 0.DISCOUNT, | > | SERVER FILTER BY FIRST KEY ONLY > | > | SERVER 1 ROW LIMIT > | > | CLIENT MERGE SORT > | > | CLIENT 1 ROW LIMIT > | > +---+ > 6 rows selected (2.736 seconds){code} > > The same happens with a covered global index. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6440) failed to write to multiple index table issue
[ https://issues.apache.org/jira/browse/PHOENIX-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441383#comment-17441383 ] Daniel Wong commented on PHOENIX-6440: -- [~giskender] I assume this was fixed in the later 4.14 versions? > failed to write to multiple index table issue > - > > Key: PHOENIX-6440 > URL: https://issues.apache.org/jira/browse/PHOENIX-6440 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.14.0 > Environment: Our environment is: >RHEL: 7.3 64bits >JDK :180_152 64bits > zookeeper: 3.4.10 > HADOOP: 2.7.1 64bits >HBASE 1.2.6 64bits >PHOENIX:4.14.0 >Reporter: Gary >Priority: Major > > When we load data into phoenix(psql.py load csv file) table A, and at the > same time we drop and create table B,in hbase log, we can > find one ERROR,that says index issue: > ERROR 1121(XCL21): Write to the index failed. > disableIndexOnFailure=true,Failed to write multiple index tables > Any ideas for this issue? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6494) Delete data and insert immediately cause the data inconsistently
[ https://issues.apache.org/jira/browse/PHOENIX-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441382#comment-17441382 ] Daniel Wong commented on PHOENIX-6494: -- Not sure what was happening here. Going to mark this as fixed in the above version for now. > Delete data and insert immediately cause the data inconsistently > > > Key: PHOENIX-6494 > URL: https://issues.apache.org/jira/browse/PHOENIX-6494 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 5.1.1 > Environment: Phoenix 5.1.1 > Hbase 2.3.5 > Hadoop 2.10.1 >Reporter: Future.Zhang >Priority: Major > > I have a table named “TEST1”, it contain 30 record. > I use SpringJDBC to connect phoenix. > > My Java code like below > {code:java} > //代码占位符 > //First, Delete all data > phoenixJdbcTemplate.execute("DELETE FROM TEST1"); > //Second, use alibaba datax to sync data (30 record) into "TEST1" table. > JavaRemoteShell.execute("python /opt/data/bin/datax sync_data.json"); > {code} > after, i query 'TEST1' table, it only 19 record into the table. > > Is it a phoenix bug? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6549) Connection leak checks sometimes fail in tests
[ https://issues.apache.org/jira/browse/PHOENIX-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441379#comment-17441379 ] Daniel Wong commented on PHOENIX-6549: -- Certain paths use both; specifically, for upsert-select and delete we should test both internal and non-internal connections. We should likely handle both. > Connection leak checks sometimes fail in tests > -- > > Key: PHOENIX-6549 > URL: https://issues.apache.org/jira/browse/PHOENIX-6549 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 5.2.0 >Reporter: Istvan Toth >Priority: Major > Attachments: > org.apache.phoenix.end2end.UpsertSelectIT-output.txt.bz2, > org.apache.phoenix.end2end.UpsertSelectIT.txt.gz > > > In rare cases we see > {code:java} > GLOBAL_OPEN_PHOENIX_CONNECTIONS.getMetric().getValue() > {code} > returning 1 instead of 0, even though the test class uses try-with-resources > blocks to handle the Phoenix Connections. > I'm attaching the output from a recent test run on ASF Jenkins -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6564) Add Phoenix Connection Failed Global Metric
[ https://issues.apache.org/jira/browse/PHOENIX-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441378#comment-17441378 ] Daniel Wong commented on PHOENIX-6564: -- Per discussion I've assigned to you [~vmeka] > Add Phoenix Connection Failed Global Metric > --- > > Key: PHOENIX-6564 > URL: https://issues.apache.org/jira/browse/PHOENIX-6564 > Project: Phoenix > Issue Type: New Feature > Components: core >Affects Versions: 4.16.1, 5.1.2 >Reporter: Daniel Wong >Assignee: vikas meka >Priority: Minor > > Add Phoenix Connection Failed Global Metric. We have > PHOENIX_CONNECTIONS_ATTEMPTED_COUNTER as well as a > PHOENIX_CONNECTIONS_THROTTLED_COUNTER but this doesn't cover all possible > failures under attempted during DriverManager.getConnection. This Jira is to > provide a counter that covers all failures and to ensure that attempted also > covers as wide of a net as possible. This would be a counter metric that > increments when DriverManager.getConnection fails to return a jdbc connection > to the user. -- This message was sent by Atlassian Jira (v8.20.1#820001)
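As a rough illustration of the proposed metric's semantics (counter names here are hypothetical, not the actual Phoenix metric names), a wrapper around DriverManager.getConnection could count every attempt and increment a failure counter whenever no JDBC connection is handed back to the caller, whatever the cause:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicLong;

public class ConnectionMetricsSketch {

    // Hypothetical counters mirroring the proposed global client metrics;
    // these are not the real Phoenix metric names.
    static final AtomicLong ATTEMPTED = new AtomicLong();
    static final AtomicLong FAILED = new AtomicLong();

    // Count every attempt, and count a failure whenever getConnection does
    // not return a JDBC connection to the caller, whatever the cause.
    static Connection getConnectionCounted(String url) throws SQLException {
        ATTEMPTED.incrementAndGet();
        try {
            return DriverManager.getConnection(url);
        } catch (SQLException | RuntimeException e) {
            FAILED.incrementAndGet();
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            getConnectionCounted("jdbc:bogus:nowhere"); // no such driver registered
        } catch (SQLException expected) {
            // "No suitable driver" -> counted as a failed attempt.
        }
        System.out.println("attempted=" + ATTEMPTED.get() + ", failed=" + FAILED.get());
    }
}
```

This is the widest possible net: throttling, driver errors, and any runtime failure during connection setup all land in the same failed counter.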
[jira] [Commented] (PHOENIX-4922) CONVERT_TZ with date 0001-01-01
[ https://issues.apache.org/jira/browse/PHOENIX-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441376#comment-17441376 ] Daniel Wong commented on PHOENIX-4922: -- I was looking at some issues labeled beginner/noob and saw this one. The big date/time project that was being done would probably resolve this. [~stoty] I can't remember who was leading those changes; can we consolidate/remove the duplicate issues? There are PHOENIX-5066, this one, and maybe some more. I've removed the beginner labels from this as there was some discussion on how to resolve it. > CONVERT_TZ with date 0001-01-01 > --- > > Key: PHOENIX-4922 > URL: https://issues.apache.org/jira/browse/PHOENIX-4922 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0 > Environment: SLF4J: Found binding in > [jar:file:/opt/cloudera/parcels/APACHE_PHOENIX-4.14.0-cdh5.14.2.p0.3/lib/phoenix/phoenix-4.14.0-cdh5.14.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] > >Reporter: Eduardo >Priority: Major > Labels: beginner > > Hi all, > We are storing dates from DB2 (field is a DATE) to Phoenix (field is a > DATE) and the conversion function CONVERT_TZ handles these > three different records this way: > 1) Only "select FECSALDO from table;" without conversion > 2011-05-16 23:00:00.000 (this is a summer time date) > 2016-03-02 00:00:00.000 (this is a winter time date) > 0001-01-01 00:00:00.000 > 2) Using > "select CONVERT_TZ(FECSALDO, 'UTC', 'Europe/London') AS FECSALDO from table"; > 2011-05-17 00:00:00.000 -> OK > 2016-03-02 00:00:00.000 -> OK > {color:#FF}0001-12-31 23:58:45.000 . -> WTF{color} > -- This message was sent by Atlassian Jira (v8.20.1#820001)
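One plausible explanation for the 23:58:45 result, assuming the tz database rules are applied to a pre-1847 instant: for dates before a zone's first recorded transition, Europe/London falls back to local mean time (LMT, -00:01:15), so converting 0001-01-01 00:00:00 UTC shifts the value back 75 seconds. A small java.time check:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class LmtOffsetCheck {
    public static void main(String[] args) {
        // For instants before a zone's first recorded transition, the tz
        // database falls back to local mean time; for Europe/London that is
        // -00:01:15, i.e. 75 seconds behind UTC.
        Instant ancient = Instant.parse("0001-01-01T00:00:00Z");
        ZoneOffset offset = ZoneId.of("Europe/London").getRules().getOffset(ancient);
        // 0001-01-01 00:00:00 UTC therefore renders as ...23:58:45 local time,
        // which matches the surprising value in this report.
        System.out.println(offset); // -00:01:15
    }
}
```

Whether Phoenix should apply LMT-era offsets at all (rather than the zone's modern standard offset) is exactly the kind of question the larger date/time project would need to settle.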
[jira] [Commented] (PHOENIX-6571) Hang of connections after OutOfMemoryError
[ https://issues.apache.org/jira/browse/PHOENIX-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426343#comment-17426343 ] Daniel Wong commented on PHOENIX-6571: -- Please attach the logs to the issue directly. Can you take a heapdump or setup for a hprof and understand what objects are leaking? Examples we have seen in the past include ZK thread leak, Phoenix regionInfos etc. > Hang of connections after OutOfMemoryError > --- > > Key: PHOENIX-6571 > URL: https://issues.apache.org/jira/browse/PHOENIX-6571 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.16.1 >Reporter: Aleksey Stavrov >Priority: Major > > We often have connection hangs after OOM. Hangs often occur on locks. > I made a test case locally that reproduces this problem, but I don't know > Java, so I have a complex bundle (perl -> dbd_jdbc -> hbase/phoenix) to use > phoenix, so because of this i am not showing a minimal example to reproduce > this behavior. > Here is the error log shown by dbd_jdbc: [https://pastebin.com/sHcH2iVq] > After the OOM happened, all connections to dbd_jdbc hang and very often have > such a stack trace: > {noformat} > "Thread-18" #197 prio=5 os_prio=0 cpu=21.70ms elapsed=1822.79s > tid=0x7ffaa4124800 nid=0x1c84 waiting on condition [0x7ff9c0ecc000] > > >java.lang.Thread.State: WAITING (parking) > > > > at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method) > > > > - parking to wait for <0xf82e9998> (a > java.util.concurrent.locks.ReentrantLock$NonfairSync) > > > at > java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194) > > > > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.12/AbstractQueuedSynchronizer.java:885) > > > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.12/AbstractQueuedSynchronizer.java:917) > > > at > 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.12/AbstractQueuedSynchronizer.java:1240) > > > at > java.util.concurrent.locks.ReentrantLock.lock(java.base@11.0.12/ReentrantLock.java:267) > > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1326) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1230) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1214) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1171) > >
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352203#comment-17352203 ] Daniel Wong commented on PHOENIX-5812: -- Coming back to this a bit, wondering if we want one Jira or two for Phoenix connection-level reaping vs. HConnection-level reaping? [~elserj] [~jisaac] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > Labels: quality-improvement > > Phoenix may keep a maximum default number of connections; badly > performing client calls or internal errors (see PHOENIX-5802) can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection-reaper-like task to reap idle connections. > Definition of "Idle": > A simple definition may be time-based: if a connection has been open for a > configurable number of minutes, simply close it. > A more complicated solution may keep track of the last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
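The "last interaction time" variant of the reaper can be sketched in a few lines (names are hypothetical; a real implementation would run the pass on a ScheduledExecutorService and actually close the underlying PhoenixConnection rather than just dropping it from a map):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdleConnectionReaper {

    // Hypothetical registry: connection id -> last interaction time (epoch ms).
    final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    final long maxIdleMs;

    IdleConnectionReaper(long maxIdleMs) {
        this.maxIdleMs = maxIdleMs;
    }

    // Called on every statement execution to record activity.
    void touch(String connectionId) {
        lastUsed.put(connectionId, System.currentTimeMillis());
    }

    // One reaper pass; a real implementation would schedule this periodically
    // and close each reaped PhoenixConnection.
    int reapOnce(long now) {
        int reaped = 0;
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            if (now - e.getValue() > maxIdleMs) {
                lastUsed.remove(e.getKey());
                reaped++;
            }
        }
        return reaped;
    }

    public static void main(String[] args) {
        IdleConnectionReaper reaper = new IdleConnectionReaper(60_000L);
        reaper.lastUsed.put("conn-1", System.currentTimeMillis() - 120_000L); // idle 2 min
        reaper.touch("conn-2"); // just used
        System.out.println("reaped=" + reaper.reapOnce(System.currentTimeMillis()));
    }
}
```

The simpler open-time-based definition would only need the creation timestamp instead of touch() calls, which is one argument for splitting the two designs into separate Jiras.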
[jira] [Commented] (PHOENIX-6456) Support query logging for DDL and DML
[ https://issues.apache.org/jira/browse/PHOENIX-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334923#comment-17334923 ] Daniel Wong commented on PHOENIX-6456: -- Can we consider grouping the logging parameters in QueryServices? It is not clear what the audit log is; can we group these into the QueryLogger as opposed to grouping them into the Phoenix connection and statements? > Support query logging for DDL and DML > - > > Key: PHOENIX-6456 > URL: https://issues.apache.org/jira/browse/PHOENIX-6456 > Project: Phoenix > Issue Type: Task > Components: core >Reporter: Richárd Antal >Assignee: Richárd Antal >Priority: Major > > Add support for DDL and DML logging. > Previously only SELECT query logging was available. > In some situations it can be useful to have support for logging other kinds > of SQL queries as well. > It should be disabled by default because of the write overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5872) Close Internal Phoenix Connections that were running during cancel
[ https://issues.apache.org/jira/browse/PHOENIX-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276511#comment-17276511 ] Daniel Wong commented on PHOENIX-5872: -- Thanks [~stoty] for the help. Hopefully the patches weren't too troublesome. Sorry I didn't get to it earlier. > Close Internal Phoenix Connections that were running during cancel > -- > > Key: PHOENIX-5872 > URL: https://issues.apache.org/jira/browse/PHOENIX-5872 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.3, 4.x >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5872.4.x.patch, PHOENIX-5872v2.4.x.patch, > PHOENIX-5872v3.4.x.patch, PHOENIX-5872v3.master.patch, > PHOENIX-5872v4.4.x.patch, PHOENIX-5872v5.4.x.patch > > Time Spent: 6h 40m > Remaining Estimate: 0h > > 3 part approach: > 1 don't count internal phoenix connections toward the client limit. > 2 count internal phoenix connections toward a newly defined limit > 3 track parent and child relationships between connections to close those > connections -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241988#comment-17241988 ] Daniel Wong commented on PHOENIX-5833: -- Trying the old 4.x patch first. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Fix For: 4.16.0 > > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. 
> {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5436) Histogram view of Global Client Metrics
[ https://issues.apache.org/jira/browse/PHOENIX-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206557#comment-17206557 ] Daniel Wong commented on PHOENIX-5436: -- Hey, we should combine this with the work [~abhishek.sen], [~rushabh.shah] and [~vmeka] are working on. Can we close https://issues.apache.org/jira/browse/PHOENIX-5838 perhaps as a duplicate, or the other way around? Same with the pair of https://issues.apache.org/jira/browse/PHOENIX-5837 and https://issues.apache.org/jira/browse/PHOENIX-5469 . Can we come to an agreement and close the duplicates? FYI [~ckulkarni] > Histogram view of Global Client Metrics > --- > > Key: PHOENIX-5436 > URL: https://issues.apache.org/jira/browse/PHOENIX-5436 > Project: Phoenix > Issue Type: Improvement >Reporter: abhishek sen >Priority: Minor > Labels: histogram, metric-collector, metrics > Attachments: > Histogram_View_of_Global_Client_metrics_log_typo_corrected.patch, > Histogram_View_of_Global_Client_metrics_removing_star_imports.patch, > Histogram_View_of_some_Global_Client_metrics.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Currently the > [GlobalClientMetrics|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/GlobalClientMetrics.java] > enum-class defines Phoenix Global Client Metrics as an enum and each > metric-enum tracks a [Phoenix > Metric|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/Metric.java] > object implemented by the > [AtomicMetric|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/AtomicMetric.java] > class. This tracks a single counter, but in some use cases we want the > distribution of a particular metric value over some period of time.
One > example could be the metric > [TASK_EXECUTION_TIME|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/MetricType.java#L47]. > The Global Client metric tracks the aggregate value for the execution time > of the task. More useful information to monitor would be the distribution of > task execution time instead. > Now, in order to incorporate histogram view of the metric, we can use the > [Histogram > Implementation|https://github.com/apache/hbase/blob/master/hbase-metrics/src/main/java/org/apache/hadoop/hbase/metrics/impl/HistogramImpl.java] > from hbase-metrics-api package. The current GlobalClientMetric also > ultimately adapts to hbase-metrics-api interface but it > [implements|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/GlobalClientMetrics.java#L155] > the > [Gauge|https://github.com/apache/hbase/blob/master/hbase-metrics-api/src/main/java/org/apache/hadoop/hbase/metrics/Gauge.java] > class which can only keep track of a single value not a histogram. So one > way could be to create a new GlobalClientHistogramMetrics enum class which > keeps track of a Histogram (based on > [MutableTimeHistogram|https://github.com/apache/hbase/blob/master/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/lib/MutableTimeHistogram.java]) > for each enum-metric instead of a single counter as it is now. The updating > of the these new metric can go > [here|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/job/JobManager.java#L289] > within JobManager class where the currently available Global as well as > request-level metrics are being updated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
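The bucketing idea behind a time histogram can be sketched in plain Java (a drastic simplification of hbase-metrics' MutableTimeHistogram: exponential millisecond buckets only, no snapshots or quantiles):

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class LatencyHistogram {

    // Exponential millisecond buckets: bucket b holds values whose highest
    // set bit is bit (b-1), i.e. [4,8) lands in bucket 3, [64,128) in
    // bucket 7, [128,256) in bucket 8. Overflow is clamped to the last bucket.
    final AtomicLongArray buckets = new AtomicLongArray(16);

    void record(long millis) {
        // Bucket index = position of the highest set bit, clamped to range.
        int b = 64 - Long.numberOfLeadingZeros(Math.max(millis, 1));
        buckets.incrementAndGet(Math.min(b, buckets.length() - 1));
    }

    public static void main(String[] args) {
        LatencyHistogram h = new LatencyHistogram();
        for (long ms : new long[]{5, 7, 120, 130, 131}) {
            h.record(ms); // e.g. TASK_EXECUTION_TIME samples
        }
        // 5 and 7 land in bucket 3; 120 in bucket 7; 130 and 131 in bucket 8.
        System.out.println(h.buckets.get(3) + " " + h.buckets.get(7) + " " + h.buckets.get(8));
    }
}
```

Unlike the single Gauge-backed counter, a few such bucket counters are enough to see whether a metric's aggregate is driven by many fast samples or a few slow outliers.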
[jira] [Commented] (PHOENIX-5872) Close Internal Phoenix Connections that were running during cancel
[ https://issues.apache.org/jira/browse/PHOENIX-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196561#comment-17196561 ] Daniel Wong commented on PHOENIX-5872: -- [~gjacoby] there is a master patch attached but hadoop QA was being extremely unhelpful in running consistently at the time and we did not believe the failures to be patch related. Looks like master patch may not even apply now? > Close Internal Phoenix Connections that were running during cancel > -- > > Key: PHOENIX-5872 > URL: https://issues.apache.org/jira/browse/PHOENIX-5872 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.3, 4.x >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Attachments: PHOENIX-5872.4.x.patch, PHOENIX-5872.master.patch, > PHOENIX-5872v2.4.x.patch, PHOENIX-5872v3.4.x.patch, PHOENIX-5872v4.4.x.patch, > PHOENIX-5872v5.4.x.patch > > Time Spent: 6h 40m > Remaining Estimate: 0h > > 3 part approach: > 1 don't count internal phoenix connections toward the client limit. > 2 count internal phoenix connections toward a newly defined limit > 3 track parent and child relationships between connections to close those > connections -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5066) The TimeZone is incorrectly used during writing or reading data
[ https://issues.apache.org/jira/browse/PHOENIX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196547#comment-17196547 ] Daniel Wong commented on PHOENIX-5066: -- Left some initial comments. > The TimeZone is incorrectly used during writing or reading data > --- > > Key: PHOENIX-5066 > URL: https://issues.apache.org/jira/browse/PHOENIX-5066 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Richard Antal >Priority: Critical > Fix For: 5.1.1, 4.16.0 > > Attachments: DateTest.java, PHOENIX-5066.4x.v1.patch, > PHOENIX-5066.4x.v2.patch, PHOENIX-5066.4x.v3.patch, > PHOENIX-5066.master.v1.patch, PHOENIX-5066.master.v2.patch, > PHOENIX-5066.master.v3.patch, PHOENIX-5066.master.v4.patch, > PHOENIX-5066.master.v5.patch, PHOENIX-5066.master.v6.patch > > Time Spent: 20m > Remaining Estimate: 0h > > We have two methods to write data when uses JDBC API. > #1. Uses _the exceuteUpdate_ method to execute a string that is an upsert SQL. > #2. Uses the _prepareStatement_ method to set some objects and execute. > The _string_ data needs to convert to a new object by the schema information > of tables. we'll use some date formatters to convert string data to object > for Date/Time/Timestamp types when writes data and the formatters are used > when reads data as well. > > *Uses default timezone test* > Writing 3 records by the different ways. > {code:java} > UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 > 15:40:47','2018-12-10 15:40:47') > UPSERT INTO date_test VALUES (2,to_date('2018-12-10 > 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47')) > stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, > time);stmt.setTimestamp(4, ts); > {code} > Reading the table by the getObject(getDate/getTime/getTimestamp) methods. 
> {code:java} > 1 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0 > 2 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0 > 3 | 2018-12-10 | 15:45:07 | 2018-12-10 15:45:07.66 > {code} > Reading the table by the getString methods > {code:java} > 1 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 > 15:45:07.000 > 2 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 > 15:45:07.000 > 3 | 2018-12-10 07:45:07.660 | 2018-12-10 07:45:07.660 | 2018-12-10 > 07:45:07.660 > {code} > *Uses GMT+8 test* > Writing 3 records by the different ways. > {code:java} > UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 > 15:40:47','2018-12-10 15:40:47') > UPSERT INTO date_test VALUES (2,to_date('2018-12-10 > 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47')) > stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, > time);stmt.setTimestamp(4, ts); > {code} > Reading the table by the getObject(getDate/getTime/getTimestamp) methods. > {code:java} > 1 | 2018-12-10 | 23:40:47 | 2018-12-10 23:40:47.0 > 2 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.0 > 3 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.106 {code} > Reading the table by the getString methods > {code:java} > 1 | 2018-12-10 23:40:47.000 | 2018-12-10 23:40:47.000 | 2018-12-10 > 23:40:47.000 > 2 | 2018-12-10 15:40:47.000 | 2018-12-10 15:40:47.000 | 2018-12-10 > 15:40:47.000 > 3 | 2018-12-10 15:40:47.106 | 2018-12-10 15:40:47.106 | 2018-12-10 > 15:40:47.106 > {code} > > _We_ have a historical problem, we'll parse the string to > Date/Time/Timestamp objects with timezone in #1, which means the actual data > is going to be changed when stored in HBase table。 -- This message was sent by Atlassian Jira (v8.3.4#803005)
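The core of the inconsistency described above can be reproduced without Phoenix: the same timestamp string maps to different epoch values depending on the time zone the parsing formatter uses, which is why the string-upsert path (#1) and the prepareStatement path (#2) disagree on read-back. A small sketch (the formatter pattern here is illustrative, not Phoenix's actual parser):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TzParseDemo {

    // Parse the same literal with two different formatter time zones and
    // return how far apart the resulting epoch values are, in hours.
    static long diffHours() {
        try {
            SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            utc.setTimeZone(TimeZone.getTimeZone("GMT"));
            SimpleDateFormat gmt8 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            gmt8.setTimeZone(TimeZone.getTimeZone("GMT+8"));

            String literal = "2018-12-10 15:40:47";
            long asUtc = utc.parse(literal).getTime();   // string-upsert-style parse
            long asGmt8 = gmt8.parse(literal).getTime(); // "local zone" parse
            return (asUtc - asGmt8) / 3_600_000L;
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same string is 8 hours apart as an epoch value, matching the
        // skew between the string-upsert rows and the setTimestamp row above.
        System.out.println("diffHours=" + diffHours()); // diffHours=8
    }
}
```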
[jira] [Commented] (PHOENIX-5384) org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9
[ https://issues.apache.org/jira/browse/PHOENIX-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196460#comment-17196460 ] Daniel Wong commented on PHOENIX-5384: -- This looks to be an incorrect-setup type of issue on a Cloudera distribution. Jira is not for support issues; I recommend reaching out to Cloudera support. Closing. > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address > listed in hbase:meta for region > SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9f4cd5. containing row > --- > > Key: PHOENIX-5384 > URL: https://issues.apache.org/jira/browse/PHOENIX-5384 > Project: Phoenix > Issue Type: Task >Affects Versions: 5.0.0 > Environment: CDH =6.0 > Hbase:-2.0.0 > Apache phoenix:-5.0.0 > org.apache.hadoop.hbas >Reporter: Pradyumna K Dalai >Priority: Critical > Attachments: Capture.PNG > > > Hi Team, > We are getting this error while testing apache phoenix in our server. > CDH =6.0 > Hbase:-2.0.0 > Apache phoenix:-5.0.0 > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address > listed in hbase:meta for region > SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9f4cd5. containing row > > Please assist. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5462) Region Server crash with: Added a key not lexically larger than previous
[ https://issues.apache.org/jira/browse/PHOENIX-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196456#comment-17196456 ] Daniel Wong commented on PHOENIX-5462: -- Looks like a mutable index issue we saw and should be solved by new indexing FYI [~gjacoby] and [~kozdemir]. > Region Server crash with: Added a key not lexically larger than previous > > > Key: PHOENIX-5462 > URL: https://issues.apache.org/jira/browse/PHOENIX-5462 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.2 >Reporter: Alex Batyrshin >Priority: Critical > > Region-Server crash with "Added a key not lexically larger than previous" > after table update with Phoenix client. > This bug also was created for HBase - > https://issues.apache.org/jira/browse/HBASE-22862 > Looks like mutations was added in reverse order or something like this. > Our table and pseudo code for table updates: > {code:java} > CREATE TABLE IF NOT EXISTS TBL_TABLE_CODE ( > "c" VARCHAR NOT NULL PRIMARY KEY, > "d"."apd" TIMESTAMP, > "d"."emd" TIMESTAMP, > "d"."prid" VARCHAR, > "d"."o" VARCHAR, > "d"."elr" UNSIGNED_TINYINT, > "d"."st" UNSIGNED_TINYINT > ... > ); > CREATE INDEX "IDX_TBL_TABLE_CODE_O" ON "TBL_TABLE_CODE" ("d"."o", "d"."emd") > INCLUDE( "d"."elr", "apd", "d"."st", ... ); > CREATE INDEX "IDX_TBL_TABLE_CODE_PRID" ON "TBL_TABLE_CODE" ("d"."prid", > "d"."emd") INCLUDE( "d"."elr", "d"."apd", "d"."st", ...); > for (int i=0; i < batches.size; i += 1) { > for (int j = 0; j < batches[i].size; j += 1 ) { > UPSERT INTO TBL_TABLE_CODE VALUES (?, ?, ? ...); > } > commit; > } > {code} > Env: > HBase-1.4.10 > Phoenix-4.14.2-HBase-1.4 > Exception: > {code:java} > 2019-08-15 18:02:21,777 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: > ABORTING region server prod006,60020,1565873610692: Replay of WAL required. > Forcing server shutdown > org.apache.hadoop.hbase.DroppedSnapshotException: region: > TBL_TABLE_CODE,\x0904606203097821slG=sPD,1563070299676.5110b3395ca64a51cea99c6572a4c3d9. 
>at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2675) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2352) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2314) >at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2200) >at > org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2125) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:512) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:482) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:264) >at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.IOException: Added a key not lexically larger than > previous. Current cell = > \x0901820448218>wGavb'/d:elr/1565881054828/DeleteColumn/vlen=0/seqid=44456567, > lastCell = > \x0901820448218>wGavb'/d:elr/1565881054828/Put/vlen=1/seqid=44457770 >at > org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.checkKey(AbstractHFileWriter.java:204) >at > org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:279) >at > org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) >at > org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1127) >at > org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:139) >at > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75) >at > org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1003) >at > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2523) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2622) >... 
9 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
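For context, the aborting flush above trips HBase's cell-ordering invariant: cells in a store file must be appended in increasing key order, where keys compare by row and column ascending, then timestamp descending, then type code descending, so a DeleteColumn sorts before a Put at the same timestamp. A simplified Python sketch of that comparison (the type codes mirror HBase's KeyValue.Type; everything else is illustrative, not HBase's actual comparator):

```python
# KeyValue type codes from HBase (Put=4, DeleteColumn=12); a higher code
# sorts first, so a DeleteColumn precedes a Put at the same timestamp.
PUT, DELETE_COLUMN = 4, 12

def cell_sort_key(cell):
    row, qualifier, ts, type_code = cell
    # rows/qualifiers ascending, timestamps descending, type codes descending
    return (row, qualifier, -ts, -type_code)

last_cell    = ("row1", "d:elr", 1565881054828, PUT)            # already written
current_cell = ("row1", "d:elr", 1565881054828, DELETE_COLUMN)  # appended next

# The flush appended current_cell after last_cell, but the correct order is
# the reverse -- exactly the "Added a key not lexically larger" condition.
out_of_order = cell_sort_key(current_cell) < cell_sort_key(last_cell)
print(out_of_order)  # True
```

This matches the log above, where the current cell is a DeleteColumn and the lastCell is a Put at the same timestamp, i.e. the memstore handed cells to the writer out of order.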
[jira] [Commented] (PHOENIX-4906) Abnormal query result due to Phoenix plan error
[ https://issues.apache.org/jira/browse/PHOENIX-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196453#comment-17196453 ] Daniel Wong commented on PHOENIX-4906: -- Just going over the high-priority issues. I think this is a "known" issue for salted tables: region splits must contain at most 1 salt byte and should occur on salt byte boundaries; we have seen similar issues in past testing environments. Not sure if we can selectively disable merging in HBase. Maybe [~bharathv] or [~abhishek.chouhan] know more? > Abnormal query result due to Phoenix plan error > --- > > Key: PHOENIX-4906 > URL: https://issues.apache.org/jira/browse/PHOENIX-4906 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.11.0, 4.14.0 >Reporter: JeongMin Ju >Priority: Critical > Attachments: ScanRanges_intersectScan.png, > TestSaltingWithRegionMerge.java, initial_salting_region.png, merged-region.png > > > For a salted table, when a query targets the entire data set, a > different plan is created depending on the form of the query, and as a result > erroneous data is retrieved. > {code:java} > // Actually, the schema of the table I used is different, but please ignore > it. > create table if not exists test.test_tale ( > rk1 varchar not null, > rk2 varchar not null, > column1 varchar > constraint pk primary key (rk1, rk2) > ) > ... > SALT_BUCKETS=16... > ; > {code} > > I created a table with 16 salting regions and then wrote a lot of data. > HBase automatically split the regions, and I merged regions for data > balancing between the region servers. > Then, when you run the query, you can see that a different plan is created according > to the Where clause.
> * query1 > select count\(*) from test.test_table; > {code:java} > +---+-++ > |PLAN > | EST_BYTES_READ | EST_ROWS_READ | > +---+-++ > | CLIENT 1851-CHUNK 5005959292 ROWS 1944546675532 BYTES PARALLEL 11-WAY FULL > SCAN OVER TEST:TEST_TABLE | 1944546675532 | 5005959292 | > | SERVER FILTER BY FIRST KEY ONLY > | 1944546675532 | 5005959292 | > | SERVER AGGREGATE INTO SINGLE ROW > | 1944546675532 | 5005959292 | > +---+-++ > {code} > * query2 > select count\(*) from test.test_table where rk2 = 'aa'; > {code} > +---+-++ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | > +---+-++ > | CLIENT 1846-CHUNK 4992196444 ROWS 1939177965768 BYTES PARALLEL 11-WAY RANGE > SCAN OVER TEST:TEST_TABLE [0] - [15] | 1939177965768 | 4992196444 | > | SERVER FILTER BY FIRST KEY ONLY AND RK2 = 'aa' > | 1939177965768 | 4992196444 | > | SERVER AGGREGATE INTO SINGLE ROW > | 1939177965768 | 4992196444 | > +---+-++ > {code} > Since rk2 used in the where clause of query2 is the second column of the PK, > it must be a full scan query like query1. > However, as you can see, query2 is created by range scan and the generated > chunk is also less than five compared to query1. > I added the log and printed out the startkey and endkey of the scan object > generated by the plan. > And I found 5 chunks missing by query2. > All five missing chunks were found in regions where the originally generated > region boundary value was not maintained through the merge operation. > !initial_salting_region.png! > After merging regions > !merged-region.png! > The code that caused the problem is this part. > When a select query is executed, the >
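For context on why merges break salted-table plans: Phoenix prepends a salt byte, computed as a hash of the row key modulo SALT_BUCKETS, to every key, and boundary-driven chunk generation assumes each region aligns with one salt-byte prefix. A simplified Python sketch (the hash here is a stand-in, not Phoenix's actual algorithm):

```python
SALT_BUCKETS = 16

def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    # Stand-in hash; Phoenix hashes the full key bytes and takes the modulus.
    return sum(row_key) % buckets

# With salt-aligned regions, each region start key is exactly one salt byte:
aligned_boundaries = [bytes([b]) for b in range(SALT_BUCKETS)]

# After merging two adjacent regions, the boundary between salt prefixes 4
# and 5 disappears, so logic keyed off region start keys no longer sees
# every salt value:
merged_boundaries = aligned_boundaries[:5] + aligned_boundaries[6:]

covered_salts = {b[0] for b in merged_boundaries}
missing = set(range(SALT_BUCKETS)) - covered_salts
print(missing)  # {5}
```

This mirrors the report above: after merges, chunks for some salt ranges go missing because the original salt-aligned region boundaries no longer hold.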
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196452#comment-17196452 ] Daniel Wong commented on PHOENIX-5833: -- Patch applied cleanly to my 4.x trying to reupload again. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. 
> {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
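The row-level semantics Query 3 should preserve can be checked directly: Python tuple comparison mirrors RVC comparison, and the escaped pattern LIKE 'RRS\_%' is just a literal "RRS_" prefix test. A sketch of the expected result:

```python
rows = [("abc", "def", "RRSQ_IMKKL"),
        ("abc", "def", "RRS_ZYTDT")]

bound = ("abc", "def", "RRSQ_IMKKL")

def matches(row):
    a, b, c = row
    # (A, B, C) > (...) is lexicographic, like Python tuple comparison;
    # LIKE 'RRS\_%' (escaped underscore) means "starts with RRS_"
    return row > bound and c.startswith("RRS_")

result = [r for r in rows if matches(r)]
print(result)  # [('abc', 'def', 'RRS_ZYTDT')]
```

Both predicates admit the second row ('_' sorts after 'Q' byte-wise), so the empty result from Query 3 is a scan-range intersection bug, not a semantics question.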
[jira] [Commented] (PHOENIX-5287) Incorrect results for COUNT(1) on a table with GLOBAL INDEX
[ https://issues.apache.org/jira/browse/PHOENIX-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196445#comment-17196445 ] Daniel Wong commented on PHOENIX-5287: -- Repinging [~kozdemir] and [~gjacoby] as there was already a Jira to port index changes to 5.0. > Incorrect results for COUNT(1) on a table with GLOBAL INDEX > --- > > Key: PHOENIX-5287 > URL: https://issues.apache.org/jira/browse/PHOENIX-5287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0 > Environment: Environment and data information: > * Column structure of TABLE_A is fully identical to TABLE_B > * TABLE_A has a GLOBAL INDEX > * TABLE_B has a LOCAL INDEX >Reporter: Vlad Krava >Priority: Blocker > > COUNT(\*\) and COUNT(1) commands display incorrect(outdated) statistics for > table with GLOBAL index. > *Example:* > * Export TABLE_A to SCV file (SELECT * FROM *POMG.TABLE_A*) > * Import CSV file to TABLE_B > * COUNT operation on 'TABLE_A' was constantly returning with an amount of > 218623 (for 2 days without any data modifications!!!) 
: > ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_A*; > *** RESULT: 218623 > * The newly imported table from the CSV file (TABLE_B) showed a different (higher) > number of records > ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_B*; > *** RESULT: 218683 > * COUNT in HBase is returning a bigger value than COUNT on the > Phoenix table (218683 vs 218623) > * Phoenix statistics for this table were updated a few times over the past few > testing days > * I took a few attempts to identify data misalignments by executing a diff on the > primary keys: > ** select key_1 from *POMG.TABLE_A* where key_1 not in (select key_1 from > *POMG.TABLE_B*) - 0 records selected (_Doesn't make sense considering the fact > that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_) > ** select key_1 from *POMG.TABLE_B* where key_1 not in (select key_1 from > *POMG.TABLE_A*) - 23 records selected (_Doesn't make sense considering the > fact that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_) > *Workaround:* > * After executing ALTER INDEX with the REBUILD flag, COUNT statistics for TABLE_A > became identical to TABLE_B > * Diff selects didn't show any differences between *POMG.TABLE_A* and > *POMG.TABLE_B* > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
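The two NOT IN diffs in the report pin down which side actually holds the extra rows. The same logic in set terms (with hypothetical keys, since the real data is not shown):

```python
# Hypothetical key sets standing in for POMG.TABLE_A and POMG.TABLE_B.
table_a = {f"key{i:03d}" for i in range(60)}
table_b = table_a | {"extra1", "extra2"}   # the freshly imported copy

a_not_in_b = table_a - table_b   # mirrors: key_1 from A not in B
b_not_in_a = table_b - table_a   # mirrors: key_1 from B not in A

# An empty A\B together with a non-empty B\A means B, not A, contains the
# extra rows -- so the smaller COUNT on A (served via the stale GLOBAL
# index) was the outdated number, consistent with the REBUILD workaround.
print(len(a_not_in_b), len(b_not_in_a))  # 0 2
```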
[jira] [Commented] (PHOENIX-5171) SkipScan incorrectly filters composite primary key which the key range contains all values
[ https://issues.apache.org/jira/browse/PHOENIX-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193084#comment-17193084 ] Daniel Wong commented on PHOENIX-5171: -- Main nit I had other than the comment you removed was we should use the table name unique generation. Other than that patch looks good to me. > SkipScan incorrectly filters composite primary key which the key range > contains all values > -- > > Key: PHOENIX-5171 > URL: https://issues.apache.org/jira/browse/PHOENIX-5171 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Jaanai Zhang >Priority: Blocker > Fix For: 5.1.0, 4.16.0 > > Attachments: 5171-master-v3.patch, PHOENIX-5171-master-v2.patch, > PHOENIX-5171-master.patch > > > Running the below SQL: > {code:sql} > create table if not exists aiolos( > vdate varchar, > tab varchar, > dev tinyint not null, > app varchar, > target varchar, > channel varchar, > one varchar, > two varchar, > count1 integer, > count2 integer, > CONSTRAINT PK PRIMARY KEY (vdate,tab,dev,app,target,channel,one,two)); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,'A004',null,null,2,2); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,null,null,null,2,2); > SELECT * FROM aiolos WHERE dev = 2 AND vdate BETWEEN '2018-02-10' AND > '2019-02-19' AND tab = 'channel_agg' and channel='A004'; > {code} > Throws exception: > {code:java} > Caused by: java.lang.IllegalStateException: The next hint must come after > previous hint > (prev=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > next=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00/1550642992223/Put/vlen=4/seqid=5445463) > at > org.apache.phoenix.filter.SkipScanFilter.setNextCellHint(SkipScanFilter.java:171) > at > org.apache.phoenix.filter.SkipScanFilter.filterKeyValue(SkipScanFilter.java:145) > at 
> org.apache.hadoop.hbase.filter.FilterList.filterKeyValue(FilterList.java:264) > at > org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:557) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6308) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6459) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6246) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6232) > at > org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:175) > ... 8 more > {code} > This is caused by an incorrect next cell hint: we have skipped the rest of the > slots whose key ranges contain all values (EVERYTHING_RANGE) in the > ScanUtil.setKey method. The next cell hint in the current case is > _kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00_, but it should be > _kv=2018-02-14\x00channel_agg\x00\x82\x00\x00A004_. -- This message was sent by Atlassian Jira (v8.3.4#803005)
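The hint construction can be sketched as follows: when the trailing slots are EVERYTHING_RANGE, stopping the key early reproduces the current cell position and violates the "next hint must come after previous hint" invariant, while carrying the fixed channel slot ('A004') past the skipped slots yields a strictly larger key. This is a simplified model; real Phoenix key composition handles separators and fixed-width types differently:

```python
SEP = b"\x00"

def compose_key(slots):
    # Naive variable-length key composition with \x00 separators.
    return SEP.join(slots)

current = compose_key([b"2018-02-14", b"channel_agg", b"\x82"])

# Stopping at the EVERYTHING_RANGE slots reproduces the current position:
bad_hint = compose_key([b"2018-02-14", b"channel_agg", b"\x82"])
# Carrying the fixed 'channel' slot past the skipped slots advances the scan:
good_hint = compose_key([b"2018-02-14", b"channel_agg", b"\x82", b"", b"A004"])

print(bad_hint > current, good_hint > current)  # False True
```

The good hint ends in `\x82\x00\x00A004`, matching the corrected hint quoted in the issue description.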
[jira] [Commented] (PHOENIX-5171) SkipScan incorrectly filters composite primary key which the key range contains all values
[ https://issues.apache.org/jira/browse/PHOENIX-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192752#comment-17192752 ] Daniel Wong commented on PHOENIX-5171: -- [~larsh] took a look there are some minor style nits as well as some comments I would change. Other than that i think the patch looks good, it essentially tightens the lower bound of the scan range generator. I ran the original test cases that added the EVERYTHING range qualifier to the disjunction as well. Probably want a new patch upload to verify the tests. [~jaanai] any interest/time to do this? Otherwise I may just do this myself. > SkipScan incorrectly filters composite primary key which the key range > contains all values > -- > > Key: PHOENIX-5171 > URL: https://issues.apache.org/jira/browse/PHOENIX-5171 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Jaanai Zhang >Priority: Blocker > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5171-master-v2.patch, PHOENIX-5171-master.patch > > > Running the below SQL: > {code:sql} > create table if not exists aiolos( > vdate varchar, > tab varchar, > dev tinyint not null, > app varchar, > target varchar, > channel varchar, > one varchar, > two varchar, > count1 integer, > count2 integer, > CONSTRAINT PK PRIMARY KEY (vdate,tab,dev,app,target,channel,one,two)); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,'A004',null,null,2,2); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,null,null,null,2,2); > SELECT * FROM aiolos WHERE dev = 2 AND vdate BETWEEN '2018-02-10' AND > '2019-02-19' AND tab = 'channel_agg' and channel='A004'; > {code} > Throws exception: > {code:java} > Caused by: java.lang.IllegalStateException: The next hint must come after > previous hint > (prev=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > next=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > 
kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00/1550642992223/Put/vlen=4/seqid=5445463) > at > org.apache.phoenix.filter.SkipScanFilter.setNextCellHint(SkipScanFilter.java:171) > at > org.apache.phoenix.filter.SkipScanFilter.filterKeyValue(SkipScanFilter.java:145) > at > org.apache.hadoop.hbase.filter.FilterList.filterKeyValue(FilterList.java:264) > at > org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:557) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6308) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6459) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6246) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6232) > at > org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:175) > ... 8 more > {code} > This is caused by an incorrect next cell hint: we have skipped the rest of the > slots whose key ranges contain all values (EVERYTHING_RANGE) in the > ScanUtil.setKey method. The next cell hint in the current case is > _kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00_, but it should be > _kv=2018-02-14\x00channel_agg\x00\x82\x00\x00A004_. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6010) Create phoenix-thirdparty, and consume guava through it
[ https://issues.apache.org/jira/browse/PHOENIX-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17170369#comment-17170369 ] Daniel Wong commented on PHOENIX-6010: -- Thanks for pinging me on one of the downstream projects. Just pinging some other people FYI. I think this is fine for phoenix master. I think we need to add another subtask for phoenix-connectors as well, unless you are considering that as part of the phoenix changes. I would prefer another guava version, but I haven't checked all the considerations between Java 7 and newer versions. I know that currently, at least for 4.x, there are some reasons to stay on Java 7; I'm debating whether we have those same issues on master. Last discussion of Java compatibility: https://issues.apache.org/jira/browse/PHOENIX-5855 FYI [~gjacoby] [~ckulkarni] > Create phoenix-thirdparty, and consume guava through it > --- > > Key: PHOENIX-6010 > URL: https://issues.apache.org/jira/browse/PHOENIX-6010 > Project: Phoenix > Issue Type: Improvement > Components: core, omid, tephra >Affects Versions: 5.1.0, 4.16.0 >Reporter: Istvan Toth >Assignee: Istvan Toth >Priority: Major > > We have long-standing and well-documented problems with Guava, just like the > rest of the Hadoop components. > Adopt the solution used by HBase: > * create phoenix-thirdparty repo > * create a pre-shaded phoenix-shaded-guava artifact in it > * Use the pre-shaded Guava in every phoenix component > The advantages are well-known, but to name a few: > * Phoenix will work with Hadoop 3.1.3+ > * One less CVE in our direct dependencies > * No more conflict with our consumer's Guava versions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6022) RVC Offset does not handle trailing nulls properly
[ https://issues.apache.org/jira/browse/PHOENIX-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167474#comment-17167474 ] Daniel Wong commented on PHOENIX-6022: -- Latest patch addresses comment-only nits. Please merge for me [~yanxinyi]. > RVC Offset does not handle trailing nulls properly > -- > > Key: PHOENIX-6022 > URL: https://issues.apache.org/jira/browse/PHOENIX-6022 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.16.0 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Attachments: PHOENIX-6022-4.x.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > While working on PHOENIX-5924, I found that if there was a trailing null, the > Phoenix optimizer does not generate a point get during resolution of the > mini-where, due to how we write/handle trailing nulls in the rowkey. The > sanity check that it is a point get therefore fails. This causes an > exception to be returned to the user. -- This message was sent by Atlassian Jira (v8.3.4#803005)
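The trailing-null quirk described above comes from how Phoenix serializes row keys: trailing null PK columns produce trailing separator bytes that are trimmed, so a key ending in NULL is byte-identical to the shorter key. A minimal sketch, simplified to variable-length VARCHAR columns:

```python
SEP = b"\x00"

def row_key(pk_values):
    # None (SQL NULL) serializes to zero bytes; columns are \x00-separated,
    # and trailing separators are trimmed from the physical key.
    parts = [v if v is not None else b"" for v in pk_values]
    return SEP.join(parts).rstrip(SEP)

with_trailing_null = row_key([b"a", b"b", None])
without = row_key([b"a", b"b"])
# The two logical keys collapse to one physical key, so an RVC offset
# ending in NULL cannot be resolved to a distinct point get.
print(with_trailing_null == without)  # True
```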
[jira] [Comment Edited] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165986#comment-17165986 ] Daniel Wong edited comment on PHOENIX-6045 at 7/27/20, 10:24 PM: - [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. Either through readback as you are mentioning, or could the server propagate the mutations? If the index select is much more efficient, this overall process could be more efficient. was (Author: dbwong): [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. If the index select is much more efficient, this overall process could be more efficient. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165986#comment-17165986 ] Daniel Wong commented on PHOENIX-6045: -- [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. If the index select is much more efficient, this overall process could be more efficient. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
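The proposed flow, selecting PKs from the cheap index, mutating the base table, and letting the deletes propagate to every index, can be modeled with a toy sketch (data and names are hypothetical; it assumes only that every index row carries the full base-table PK, which Phoenix index rows do by construction):

```python
# pk -> indexed value, plus two indexes keyed as (indexed expr, base pk)
base   = {"p1": "va", "p2": "vb", "p3": "va"}
index1 = {(v, pk) for pk, v in base.items()}
index2 = {(v.upper(), pk) for pk, v in base.items()}   # some other indexed expr

def delete_where_value(value):
    # 1. Drive the SELECT from index1 to find the matching base PKs.
    pks = {pk for v, pk in index1 if v == value}
    # 2. Delete the base rows; 3. propagate the deletes to both indexes,
    #    as the server side would for mutable tables.
    for pk in pks:
        v = base.pop(pk)
        index1.discard((v, pk))
        index2.discard((v.upper(), pk))

delete_where_value("va")
print(sorted(base), len(index1), len(index2))  # ['p2'] 1 1
```

The point of the design is that index2 never needs to be read: the base-table mutations carry enough information to maintain every index.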
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164774#comment-17164774 ] Daniel Wong commented on PHOENIX-6045: -- [~larsh] I think [~kozdemir] has a bit more context on this, as he was looking into it in more detail, but this is my understanding. For mutable indexes we don't need the other keys, since deletes of the index rows are driven by the server side (rather than the client side). For immutable cases you are correct: if we were driving it from the client, we would need the other index keys to delete the index rows. I think this Jira is still valid. What do you think? > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6023) Wrong result when issuing query for an immutable table with multiple column families
[ https://issues.apache.org/jira/browse/PHOENIX-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164719#comment-17164719 ] Daniel Wong commented on PHOENIX-6023: -- Thanks [~stoty] for running verify manually. > Wrong result when issuing query for an immutable table with multiple column > families > > > Key: PHOENIX-6023 > URL: https://issues.apache.org/jira/browse/PHOENIX-6023 > Project: Phoenix > Issue Type: Bug >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-6023-addendum.master.v1.patch, > PHOENIX-6023.addendum.master.v2.patch, PHOENIX-6023.master.v3.patch > > Time Spent: 40m > Remaining Estimate: 0h > > Steps to reproduce are as follows: > 1. Create an immutable table with multiple column families: > {code} > 0: jdbc:phoenix:> CREATE TABLE TEST ( > . . . . . . . . > ID VARCHAR PRIMARY KEY, > . . . . . . . . > A.COL1 VARCHAR, > . . . . . . . . > B.COL2 VARCHAR > . . . . . . . . > ) IMMUTABLE_ROWS = TRUE; > No rows affected (1.182 seconds) > {code} > 2. 
Upsert some rows: > {code} > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id0', '0', 'a'); > 1 row affected (0.138 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id1', '1', NULL); > 1 row affected (0.009 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id2', '2', 'b'); > 1 row affected (0.011 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id3', '3', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id4', '4', 'c'); > 1 row affected (0.006 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id5', '5', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id6', '6', 'd'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id7', '7', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id8', '8', 'e'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id9', '9', NULL); > 1 row affected (0.009 seconds) > {code} > 3. Count query is okay: > {code} > 0: jdbc:phoenix:> SELECT COUNT(COL1) FROM TEST WHERE COL2 IS NOT NULL; > ++ > | COUNT(A.COL1) | > ++ > | 5 | > ++ > 1 row selected (0.1 seconds) > {code} > 4. However, the following select query returns wrong result (it should return > 5 records): > {code} > 0: jdbc:phoenix:> SELECT COL1 FROM TEST WHERE COL2 IS NOT NULL; > +---+ > | COL1 | > +---+ > | 0 | > | 1 | > | 2 | > | 3 | > | 4 | > | 5 | > | 6 | > | 7 | > | 8 | > | 9 | > +---+ > 10 rows selected (0.058 seconds) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
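The expected semantics are easy to state: the COL2 IS NOT NULL filter (column family B) must also restrict the rows projected from family A, so only 5 of the 10 rows qualify. A sketch of the data above and the correct result:

```python
# Rows as upserted: even ids get a COL2 value, odd ids get NULL.
rows = {f"id{i}": {"COL1": str(i),
                   "COL2": "abcde"[i // 2] if i % 2 == 0 else None}
        for i in range(10)}

result = sorted(r["COL1"] for r in rows.values() if r["COL2"] is not None)
print(result)  # ['0', '2', '4', '6', '8'] -- 5 rows, not the 10 the bug returned
```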
[jira] [Commented] (PHOENIX-6023) Wrong result when issuing query for an immutable table with multiple column families
[ https://issues.apache.org/jira/browse/PHOENIX-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164049#comment-17164049 ] Daniel Wong commented on PHOENIX-6023: -- Just tracked this failure down myself. Please get a hadoopQA run for the addendum FYI [~yanxinyi] > Wrong result when issuing query for an immutable table with multiple column > families > > > Key: PHOENIX-6023 > URL: https://issues.apache.org/jira/browse/PHOENIX-6023 > Project: Phoenix > Issue Type: Bug >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-6023-addendum.master.v1.patch, > PHOENIX-6023.addendum.master.v2.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > Steps to reproduce are as follows: > 1. Create an immutable table with multiple column families: > {code} > 0: jdbc:phoenix:> CREATE TABLE TEST ( > . . . . . . . . > ID VARCHAR PRIMARY KEY, > . . . . . . . . > A.COL1 VARCHAR, > . . . . . . . . > B.COL2 VARCHAR > . . . . . . . . > ) IMMUTABLE_ROWS = TRUE; > No rows affected (1.182 seconds) > {code} > 2. 
Upsert some rows: > {code} > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id0', '0', 'a'); > 1 row affected (0.138 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id1', '1', NULL); > 1 row affected (0.009 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id2', '2', 'b'); > 1 row affected (0.011 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id3', '3', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id4', '4', 'c'); > 1 row affected (0.006 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id5', '5', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id6', '6', 'd'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id7', '7', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id8', '8', 'e'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id9', '9', NULL); > 1 row affected (0.009 seconds) > {code} > 3. Count query is okay: > {code} > 0: jdbc:phoenix:> SELECT COUNT(COL1) FROM TEST WHERE COL2 IS NOT NULL; > ++ > | COUNT(A.COL1) | > ++ > | 5 | > ++ > 1 row selected (0.1 seconds) > {code} > 4. However, the following select query returns wrong result (it should return > 5 records): > {code} > 0: jdbc:phoenix:> SELECT COL1 FROM TEST WHERE COL2 IS NOT NULL; > +---+ > | COL1 | > +---+ > | 0 | > | 1 | > | 2 | > | 3 | > | 4 | > | 5 | > | 6 | > | 7 | > | 8 | > | 9 | > +---+ > 10 rows selected (0.058 seconds) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163985#comment-17163985 ] Daniel Wong commented on PHOENIX-6045: -- FYI [~ChinmayKulkarni] [~kozdemir] > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Priority: Major > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163984#comment-17163984 ] Daniel Wong commented on PHOENIX-6045: -- With indexes with disjoint sets of primary keys I found this. It appears that while we are considering index plans, both indexes' columns are included in the projected column list, which causes the optimizer to reject each index because the combined columns are not covered by either index alone. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Priority: Major > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
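The coverage test the optimizer applies is a simple subset check; the bug described above is in what gets projected, not in the check itself. A sketch with hypothetical column names:

```python
def index_covers(projected, index_columns):
    # An index plan is usable only if it covers every projected column.
    return projected <= index_columns

index1_cols = {"PK", "A"}
index2_cols = {"PK", "B"}
needed      = {"PK"}                           # a delete only needs the base PK

buggy_projection = index1_cols | index2_cols   # both indexes' columns projected

print(index_covers(needed, index1_cols))            # True  -- either index suffices
print(index_covers(buggy_projection, index1_cols))  # False -- union covered by neither
print(index_covers(buggy_projection, index2_cols))  # False
```

Projecting the union of both indexes' columns guarantees neither index covers the query, so the optimizer falls back to the base table every time.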
[jira] [Commented] (PHOENIX-6013) RVC Offset does not handle coerced literal nulls properly.
[ https://issues.apache.org/jira/browse/PHOENIX-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159404#comment-17159404 ] Daniel Wong commented on PHOENIX-6013: -- [~gjacoby] will do; waiting to get the other patch in first, as this one builds on top of it. > RVC Offset does not handle coerced literal nulls properly. > -- > > Key: PHOENIX-6013 > URL: https://issues.apache.org/jira/browse/PHOENIX-6013 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > As part of the query rewrite paths, an A=null in the offset may get rewritten > to A IS NULL; the code sanity-checks against an equality comparison op, which > no longer matches after this rewrite. -- This message was sent by Atlassian Jira (v8.3.4#803005)
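The rewrite in question can be modeled as a tiny AST transform: equality against a null literal becomes IS NULL, after which a sanity check that pattern-matches on an equality node no longer fires. The node shapes here are illustrative, not Phoenix's actual parse nodes:

```python
def rewrite_null_equality(node):
    # ('=', column, literal) with a null literal becomes ('IS NULL', column)
    if node[0] == "=" and node[2] is None:
        return ("IS NULL", node[1])
    return node

def is_equality(node):
    # the offset code's sanity check: expects an equality comparison op
    return node[0] == "="

rewritten = rewrite_null_equality(("=", "A", None))
print(rewritten, is_equality(rewritten))  # ('IS NULL', 'A') False -- check fails
```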
[jira] [Comment Edited] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154999#comment-17154999 ] Daniel Wong edited comment on PHOENIX-5990 at 7/10/20, 4:24 AM: I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed? was (Author: dbwong): I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed?{{}}{{}}{{}} > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155091#comment-17155091 ] Daniel Wong commented on PHOENIX-5990: -- I'm not sure; I'm guessing some of the reason is historical, as indexes were deliberately being prevented from use, per the following code comment. But some of it is probably just an oversight. {noformat} // If we have immutable indexes, we'd increase the number of bytes scanned by executing // separate queries against each index, so better to drive from a single table in that case. {noformat} > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154999#comment-17154999 ] Daniel Wong commented on PHOENIX-5990: -- I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed? > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5924) RVC Offset does not handle variable length fields exclusive scan boundary correctly
[ https://issues.apache.org/jira/browse/PHOENIX-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136082#comment-17136082 ] Daniel Wong commented on PHOENIX-5924: -- Hoping OSS infra works this time; reuploading the patch again. > RVC Offset does not handle variable length fields exclusive scan boundary > correctly > --- > > Key: PHOENIX-5924 > URL: https://issues.apache.org/jira/browse/PHOENIX-5924 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5924-4.x.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The way the exclusive boundary was handled, by incrementing the key for variable > length fields, is incorrect in the scan boundary. > > In the following case we increment incorrectly from 0x490049 -> 0x490050 > ('1','1' -> '1','2') > We should increment from 0x490049 -> 0x49004900 ('1','1' -> '1','1'\x00) > @Test > public void testScenario() throws Exception { > String TEST_DDL = "CREATE TABLE IF NOT EXISTS TEST_SCHEMA (\n" > + "ORGANIZATION_ID VARCHAR(15), \n" + "TEST_ID > VARCHAR(15), \n" > + "CREATED_DATE DATE, \n" + "LAST_UPDATE DATE, \n" > + "CONSTRAINT TEST_SCHEMA_PK PRIMARY KEY (ORGANIZATION_ID, > TEST_ID) \n" + ")"; > try (Statement statement = conn.createStatement()) { > statement.execute(TEST_DDL); > } > //setup > List<String> upserts = new ArrayList<>(); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('1','1')"); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('1','10')"); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('2','2')"); > for (String sql : upserts) { > try (Statement statement = conn.createStatement()) { > statement.execute(sql); > } > } > conn.commit(); > String query1 = "SELECT * FROM TEST_SCHEMA"; > String query2 = "SELECT * FROM TEST_SCHEMA OFFSET > (ORGANIZATION_ID,TEST_ID) = ('1','1')"; > try (Statement statement = conn.createStatement(); ResultSet rs1 = > statement.executeQuery(query1)) { > TestUtil.printResultSet(rs1); > } > try (Statement statement = conn.createStatement(); ResultSet rs2 = > statement.executeQuery(query2)) { > TestUtil.printResultSet(rs2); > } > } > -- This message was sent by Atlassian Jira (v8.3.4#803005)
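The boundary rule the issue describes can be sketched directly on byte arrays. This is a hedged illustration with made-up helper names, not the actual Phoenix code: for a variable-length field, the exclusive lower bound after key k must be k plus a trailing 0x00 byte, because incrementing the last byte skips longer keys that share the prefix, such as ('1','10').

```java
import java.util.Arrays;

public class ExclusiveBound {
    // Wrong: increment the last byte, turning ('1','1') into ('1','2').
    static byte[] incrementLastByte(byte[] key) {
        byte[] next = Arrays.copyOf(key, key.length);
        next[next.length - 1]++;
        return next;
    }

    // Right: append a zero byte, turning ('1','1') into ('1','1'\x00),
    // the smallest key strictly greater than the original.
    static byte[] appendZeroByte(byte[] key) {
        byte[] next = Arrays.copyOf(key, key.length + 1);
        next[key.length] = 0x00;
        return next;
    }

    // Unsigned lexicographic comparison, as HBase orders row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] key = {'1', 0x00, '1'};          // ('1','1') with the asc separator
        byte[] skipped = {'1', 0x00, '1', '0'}; // ('1','10') must stay in range
        // The buggy bound sorts after ('1','10'), so that row is wrongly excluded...
        System.out.println(compareUnsigned(incrementLastByte(key), skipped) > 0);  // true
        // ...while the fixed bound sorts before it, keeping the row.
        System.out.println(compareUnsigned(appendZeroByte(key), skipped) <= 0);    // true
    }
}
```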
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101174#comment-17101174 ] Daniel Wong commented on PHOENIX-5833: -- As discussed offline with [~yanxinyi] and [~ckulkarni], I will fix the wrong result and file a follow-up Jira for the cases where, luckily, the wrong result was improving performance. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089124#comment-17089124 ] Daniel Wong commented on PHOENIX-5833: -- h3. Error Explanation So the error was in the separator, but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing 'abc\x00def'. Note it is not the final key, so we append the asc separator: 'abc\x00def\x00'. Then finally we note that it is exclusive, so we call next byte: 'abc\x00def\x01'. Here we already have a key range that will no longer select the row we intended, whose PK was 'abc\x00def\x00RRS_ZYTDT'. h3. Fix Short term: I think when we truncate a range in the AndExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. 
> {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089124#comment-17089124 ] Daniel Wong edited comment on PHOENIX-5833 at 4/21/20, 11:01 PM: - h3. Error Explanation So the error was in the separator but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing, 'abc\x00def' Note it is not the final key so we append the asc separator. 'abc\x00def\x00' Then finally we note that it is exclusive so we call next byte 'abc\x00def\x01' Here we already now have a key range that will no longer select the row we intended whose PK was 'abc\x00def\x00RRS_ZYTDT' h3. Fix Short term. I think when we truncate a range in the AndExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. was (Author: dbwong): h3. Error Explanation So the error was in the separator but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing, 'abc\x00def' Note it is not the final key so we append the asc separator. 
'abc\x00def\x00' Then finally we note that it is exclusive so we call next byte 'abc\x00def\x01' Here we already now have a key range that will no longer select the row we intended whose PK was 'abc\x00def\x00RRS_ZYTDT' h3. Fix Short term. I think we we truncate a range in the andExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
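The minRange construction described in the comment above can be reproduced in a few lines. This is a simplified, hypothetical sketch rather than the real ScanRanges code: bumping the byte after the truncated (A,B) prefix produces a start key that sorts past the row the query should return, while keeping the truncated bound inclusive does not.

```java
public class TruncatedBound {
    // Exclusive handling: append the asc separator, then bump to the next key
    // (nextKey of '...\u0000' is '...\u0001').
    static String exclusiveStart(String truncatedPrefix) {
        return truncatedPrefix + "\u0001";
    }

    // Inclusive handling: append only the ascending separator.
    static String inclusiveStart(String truncatedPrefix) {
        return truncatedPrefix + "\u0000";
    }

    public static void main(String[] args) {
        String truncated = "abc\u0000def";                 // (A,B) prefix after truncation
        String wantedRow = "abc\u0000def\u0000RRS_ZYTDT";  // PK the query should return
        // The exclusive bound sorts past the wanted row, so the scan skips it...
        System.out.println(wantedRow.compareTo(exclusiveStart(truncated)) < 0);  // true
        // ...while the inclusive bound keeps it in range.
        System.out.println(wantedRow.compareTo(inclusiveStart(truncated)) >= 0); // true
    }
}
```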
[jira] [Comment Edited] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088003#comment-17088003 ] Daniel Wong edited comment on PHOENIX-5833 at 4/20/20, 6:44 PM: Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abc\x00def\x00RRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. If I had to guess we are missing the null separator. was (Author: dbwong): Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abcdefRRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. 
> {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088003#comment-17088003 ] Daniel Wong commented on PHOENIX-5833: -- Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abcdefRRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5769) Phoenix precommit Flapping HadoopQA Tests in master
[ https://issues.apache.org/jira/browse/PHOENIX-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084610#comment-17084610 ] Daniel Wong commented on PHOENIX-5769: -- Thanks for all your work [~stoty]. I'll see if I can spend any time next week, and I know [~neha.gupta] may be looking at these as well. > Phoenix precommit Flapping HadoopQA Tests in master > > > Key: PHOENIX-5769 > URL: https://issues.apache.org/jira/browse/PHOENIX-5769 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > Attachments: PHOENIX-5769.master.v1.patch, > PHOENIX-5769.master.v3.patch, PHOENIX-5769.master.v4.patch, consoleFull > (1).html, consoleFull (2).html, consoleFull (3).html, consoleFull (4).html, > consoleFull (5).html, consoleFull (6).html, consoleFull (7).html, consoleFull > (8).html, consoleFull.html, testhang.dump > > > I was recently trying to commit changes to Phoenix for multiple issues and > was asked to get clean HadoopQA runs. However, this took a huge effort as I > had to resubmit the same patch multiple times in order to get one "clean". > Looking at the errors the most common ones were 3 "Multiple regions on > " and 3 for apache infra issues (host shutdown), 1 for > org.apache.hadoop.hbase.NotServingRegionException, 1 for > SnapshotDoesNotExistException. See builds > [https://builds.apache.org/job/PreCommit-PHOENIX-Build/] here from 3540's to > 3560's. In addition I see multiple builds running simultaneously, limiting > tests to running on 1 host should be configurable right? > In addition I was recommended by [~yanxinyi] that master was less likely to > have issues getting a clean run than 4.x. FYI [~ckulkarni] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074834#comment-17074834 ] Daniel Wong edited comment on PHOENIX-5812 at 4/3/20, 7:14 PM: --- If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency or retries that I would want to avoid. Any thoughts from outside of Salesforce? [~elserj] [~stoty] [~RichardAntal] [~vtpavan] was (Author: dbwong): If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency retries that I would want to avoid. Any thoughts from outside of Salesforce? 
[~elserj] [~stoty] [~RichardAntal] [~vtpavan] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074834#comment-17074834 ] Daniel Wong commented on PHOENIX-5812: -- If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency or retries that I would want to avoid. Any thoughts from outside of Salesforce? [~elserj] [~stoty] [~RichardAntal] [~vtpavan] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073165#comment-17073165 ] Daniel Wong commented on PHOENIX-5812: -- Thoughts [~liuml07]? [~vmeka] I think the requirements would look like this: * New Connection Level Property phoenix.connection.lifetime or similar * defaults to 0, for infinity ie do not close connections. * When non-0, connections that are open longer than phoenix.connection.lifetime are closed. (Could be CQS or elsewhere). * New global metric for number of reaped connections counter. Individual users can metric/alarm on this. > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
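The requirements listed above can be sketched as a small reaper. This is a hypothetical illustration, not the actual Phoenix API: a phoenix.connection.lifetime-style setting (0 = infinite, i.e. never close), a task that closes any tracked connection open longer than that lifetime, and a counter the caller can export as the proposed global metric.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConnectionReaper {
    interface Conn { void close(); }

    private final Map<Conn, Long> openedAt = new ConcurrentHashMap<>();
    private final long lifetimeMillis; // 0 means infinite, i.e. never reap
    private long reapedCount = 0;

    ConnectionReaper(long lifetimeMillis) { this.lifetimeMillis = lifetimeMillis; }

    // Record when a connection was opened (the timestamp is injected for testability).
    void track(Conn c, long openedAtMillis) { openedAt.put(c, openedAtMillis); }

    // Close and forget every connection open longer than the configured lifetime.
    synchronized void reapExpired(long nowMillis) {
        if (lifetimeMillis == 0) return;
        for (Map.Entry<Conn, Long> e : openedAt.entrySet()) {
            if (nowMillis - e.getValue() > lifetimeMillis) {
                e.getKey().close();
                openedAt.remove(e.getKey());
                reapedCount++; // exported as the "reaped connections" counter
            }
        }
    }

    long getReapedCount() { return reapedCount; }
}
```

In Phoenix this would presumably hang off the connection-tracking layer, with a scheduled executor invoking reapExpired(System.currentTimeMillis()) periodically; ConcurrentHashMap's weakly consistent iterator makes the remove-during-iteration safe.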
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073019#comment-17073019 ] Daniel Wong commented on PHOENIX-5812: -- FYI [~jisaac] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5769) Phoenix precommit Flapping HadoopQA Tests in master
[ https://issues.apache.org/jira/browse/PHOENIX-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072371#comment-17072371 ] Daniel Wong commented on PHOENIX-5769: -- So I took a look at the patch, but I didn't see the corresponding timeout. Is that in the Jenkins configuration, which I assume I don't have access to? > Phoenix precommit Flapping HadoopQA Tests in master > > > Key: PHOENIX-5769 > URL: https://issues.apache.org/jira/browse/PHOENIX-5769 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Assignee: Istvan Toth >Priority: Major > Attachments: PHOENIX-5769.master.v1.patch, > PHOENIX-5769.master.v3.patch, consoleFull (1).html, consoleFull (2).html, > consoleFull (3).html, consoleFull (4).html, consoleFull (5).html, consoleFull > (6).html, consoleFull (7).html, consoleFull (8).html, consoleFull.html > > > I was recently trying to commit changes to Phoenix for multiple issues and > was asked to get clean HadoopQA runs. However, this took a huge effort as I > had to resubmit the same patch multiple times in order to get one "clean". > Looking at the errors, the most common ones were 3 "Multiple regions on > " and 3 for apache infra issues (host shutdown), 1 for > org.apache.hadoop.hbase.NotServingRegionException, 1 for > SnapshotDoesNotExistException. See builds > [https://builds.apache.org/job/PreCommit-PHOENIX-Build/] here from 3540's to > 3560's. In addition I see multiple builds running simultaneously; limiting > tests to running on 1 host should be configurable, right? > In addition I was recommended by [~yanxinyi] that master was less likely to > have issues getting a clean run than 4.x. FYI [~ckulkarni] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072317#comment-17072317 ] Daniel Wong commented on PHOENIX-5812: -- A couple more thoughts after doing some additional research. Oracle provides a connection lifetime parameter at the connection level. We could easily do the same and provide a connection lifetime parameter that defaults to infinite. I think that gives the most control to the end user and solves most lifetime problems, by configuring the default connection lifetime to a reasonable value. We can look into the concept of idle connections in a follow-up Jira if we decide we want more complicated logic. > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072202#comment-17072202 ] Daniel Wong commented on PHOENIX-5812: -- FYI [~liuml07] [~ckulkarni] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066944#comment-17066944 ] Daniel Wong commented on PHOENIX-5796: -- It's not necessarily optimal. There are a bunch of possible Jiras around optimization that we could consider. If the selectivity of the index query is high, that is, we select a large portion of the rows, it will be faster to full table scan the base table. With stats or with region boundaries we can better estimate the selectivity of predicates and thus decide if we want to join or FTS. One easy one I want to open is small scan optimizations, where, for example, if the user provides a limit we do stuff like this: this exact case plus a limit with an index join on uncovered columns; I forget if I opened a Jira already. Using region boundaries would possibly be enough in this case as the index is on B only, so it's meaningful enough; if the index had been on C,B it would not be sufficient to estimate selectivity and we would need stats. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png, Screen Shot 2020-03-24 at 11.51.12 AM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. 
> Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! > In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
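The selectivity-driven plan choice described in the comment above can be sketched as follows. `PlanChooser` and the 0.3 cutoff are hypothetical illustrations of the idea, not Phoenix's actual optimizer logic, which is more involved.

```java
// Hedged sketch: estimate the fraction of rows an index predicate selects
// (from stats or region boundaries) and pick full table scan vs. index join.
class PlanChooser {
    // Hypothetical cutoff: if the predicate selects more than this fraction of
    // rows, a full table scan of the base table is assumed cheaper than an
    // index join back to the base table.
    static final double FTS_THRESHOLD = 0.3;

    static String choosePlan(long estimatedMatchingRows, long totalRows) {
        double fractionSelected =
                totalRows == 0 ? 0.0 : (double) estimatedMatchingRows / totalRows;
        return fractionSelected > FTS_THRESHOLD ? "FULL_TABLE_SCAN" : "INDEX_JOIN";
    }
}
```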
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066109#comment-17066109 ] Daniel Wong commented on PHOENIX-5796: -- Its a skip scan join so in theory should jump straight to the joined rows. [~ckulkarni] can double check though. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png, Screen Shot 2020-03-24 at 11.51.12 AM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. > Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066088#comment-17066088 ] Daniel Wong commented on PHOENIX-5796: -- Today at least we can force a range scan through index hint which will do the uncovered join. Slightly different but should be similar or better performance. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. > Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
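The hint workaround from the comment above would look roughly like this against the table and index in the issue description (a sketch; verify the hint syntax and resulting plan against the Phoenix documentation for your version):

```sql
-- Hint the planner to use the uncovered index; Phoenix then joins back to the
-- base table 't' to fetch the uncovered column 'c'.
EXPLAIN SELECT /*+ INDEX(t uncov_index_t) */ c FROM t WHERE b = 'abc';
```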
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053848#comment-17053848 ] Daniel Wong edited comment on PHOENIX-4845 at 3/7/20, 2:09 AM: --- I ran these tests locally and they passed on master/4.x-HBase-1.3. Thoughts [~yanxinyi] [~ckulkarni] was (Author: dbwong): I ran these tests locally on master/4.x-HBase-1.3. Thoughts [~yanxinyi] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, > PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, > PHOENIX-4845.patch, PHOENIX-offset.txt > > Time Spent: 16h 10m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. 
This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
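The operator-flipping burden that the RVC OFFSET proposal removes can be illustrated with a small helper. `RvcPaging.pageComparator` is hypothetical, not Phoenix API; it just encodes the rule stated in the description: ascending PKs page with `>`, all-descending PKs page with `<`, and mixed sort order cannot be paged with a single RVC comparison at all.

```java
// Hypothetical illustration of the client-side logic RVC OFFSET would make
// unnecessary: choosing the comparison operator for RVC-based paging.
class RvcPaging {
    static String pageComparator(boolean[] descPerPkColumn) {
        boolean first = descPerPkColumn[0];
        for (boolean desc : descPerPkColumn) {
            // A single RVC comparison only works when all PK columns sort the
            // same way; mixed order is exactly the case OFFSET would handle.
            if (desc != first) {
                throw new IllegalArgumentException(
                        "mixed sort order: RVC paging not possible without OFFSET");
            }
        }
        // All DESC: the next page lies at smaller key values, so flip to '<'.
        return first ? "<" : ">";
    }
}
```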
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053848#comment-17053848 ] Daniel Wong commented on PHOENIX-4845: -- I ran these tests locally on master/4.x-HBase-1.3. Thoughts [~yanxinyi] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, > PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, > PHOENIX-4845.patch, PHOENIX-offset.txt > > Time Spent: 16h 10m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) 
> {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5711) Fix duplicate NOTICE error on phoenix-client shading
[ https://issues.apache.org/jira/browse/PHOENIX-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034981#comment-17034981 ] Daniel Wong commented on PHOENIX-5711: -- I think there may have been differences in how it handles regex between versions; I can't recall for sure. I think the better way going forward is to use the tool made for combining required notices, handling the overlap in a "designed" fashion for possible legal requirements: [https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheNoticeResourceTransformer |https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheNoticeResourceTransformer] > Fix duplicate NOTICE error on phoenix-client shading > > > Key: PHOENIX-5711 > URL: https://issues.apache.org/jira/browse/PHOENIX-5711 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Masatake Iwasaki >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I got the following error on building 4.15.0-HBase-1.5 against hadoop-2.10.0 > (for BIGTOP-3298). The cause seems to be the pattern for exclusion in pom.xml > does not match the file name NOTICE (without suffix). > {noformat} > $ mvn clean install -DskipTests -Dhbase.version=1.5.0 -Dhadoop.version=2.10.0 > -Dhadoop-two.version=2.10.0 > ...snip > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-shade-plugin:3.1.1:shade (default-shaded) on > project phoenix-client: Error creating shaded jar: duplicate entry: NOTICE -> > [Help 1] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
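The resource-transformer approach linked above would be wired into the shade plugin section of the pom roughly like this (a sketch; the surrounding plugin configuration and whether `addHeader` should be disabled depend on Phoenix's actual NOTICE layout):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <!-- Merges the META-INF/NOTICE files of all shaded dependencies into one
           entry instead of failing on duplicates -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer">
        <addHeader>false</addHeader>
      </transformer>
    </transformers>
  </configuration>
</plugin>
```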
[jira] [Commented] (PHOENIX-5711) Fix duplicate NOTICE error on phoenix-client shading
[ https://issues.apache.org/jira/browse/PHOENIX-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034941#comment-17034941 ] Daniel Wong commented on PHOENIX-5711: -- Thanks for the patch. I think I saw this before internally and our issue was the version of the maven plugin. What did you test this against? > Fix duplicate NOTICE error on phoenix-client shading > > > Key: PHOENIX-5711 > URL: https://issues.apache.org/jira/browse/PHOENIX-5711 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Masatake Iwasaki >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I got the following error on building 4.15.0-HBase-1.5 against hadoop-2.10.0 > (for BIGTOP-3298). The cause seems to be the pattern for exclusion in pom.xml > does not match the file name NOTICE (without suffix). > {noformat} > $ mvn clean install -DskipTests -Dhbase.version=1.5.0 -Dhadoop.version=2.10.0 > -Dhadoop-two.version=2.10.0 > ...snip > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-shade-plugin:3.1.1:shade (default-shaded) on > project phoenix-client: Error creating shaded jar: duplicate entry: NOTICE -> > [Help 1] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5537) Phoenix-4701 made hard coupling between phoenix.log.level and getting request metrics.
[ https://issues.apache.org/jira/browse/PHOENIX-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034849#comment-17034849 ] Daniel Wong commented on PHOENIX-5537: -- [~yanxinyi] [~ckulkarni] can you help me get this to a Hadoop QA run? I think Richard is not marked as a contributor yet. > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. > -- > > Key: PHOENIX-5537 > URL: https://issues.apache.org/jira/browse/PHOENIX-5537 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Priority: Minor > Fix For: 4.15.1 > > Attachments: PHOENIX-5537.master.v1.patch > > > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. For users who do not want to enable system logging, this causes a > regression from earlier behavior where metrics were populated. FYI [~ankit] > {code:java} > public OverAllQueryMetrics(boolean isRequestMetricsEnabled, LogLevel connectionLogLevel) { > queryWatch = new MetricsStopWatch(WALL_CLOCK_TIME_MS.isLoggingEnabled(connectionLogLevel)); > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5537) Phoenix-4701 made hard coupling between phoenix.log.level and getting request metrics.
[ https://issues.apache.org/jira/browse/PHOENIX-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034848#comment-17034848 ] Daniel Wong commented on PHOENIX-5537: -- [~RichardAntal] In general the changes look okay to me, but I would like to see a unit test or integration test to go with the patch. I'll see if I can get this patch submitted and have our HadoopQA run our tests against it. > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. > -- > > Key: PHOENIX-5537 > URL: https://issues.apache.org/jira/browse/PHOENIX-5537 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Priority: Minor > Fix For: 4.15.1 > > Attachments: PHOENIX-5537.master.v1.patch > > > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. For users who do not want to enable system logging, this causes a > regression from earlier behavior where metrics were populated. FYI [~ankit] > {code:java} > public OverAllQueryMetrics(boolean isRequestMetricsEnabled, LogLevel connectionLogLevel) { > queryWatch = new MetricsStopWatch(WALL_CLOCK_TIME_MS.isLoggingEnabled(connectionLogLevel)); > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
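The decoupling this issue asks for can be illustrated minimally. The class names here mirror `OverAllQueryMetrics`/`MetricsStopWatch` but are sketches, not the real Phoenix classes; the point is that metric collection should be driven by `isRequestMetricsEnabled` rather than solely by the connection log level, as it became after PHOENIX-4701.

```java
// Sketch of a stopwatch whose enablement mirrors MetricsStopWatch.
class MetricsStopWatchSketch {
    private final boolean enabled;
    MetricsStopWatchSketch(boolean enabled) { this.enabled = enabled; }
    boolean isEnabled() { return enabled; }
}

// Hypothetical decoupled constructor: request metrics alone are sufficient to
// enable the watch, regardless of the connection log level.
class OverAllQueryMetricsSketch {
    final MetricsStopWatchSketch queryWatch;

    OverAllQueryMetricsSketch(boolean isRequestMetricsEnabled, boolean logLevelAllowsMetric) {
        queryWatch = new MetricsStopWatchSketch(isRequestMetricsEnabled || logLevelAllowsMetric);
    }
}
```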
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015504#comment-17015504 ] Daniel Wong commented on PHOENIX-4845: -- Checked the tests; that appears to be an environmental hiccup, and it ran clean locally. If you have time, can [~tdsilva] or [~jamestaylor] take a peek? Not sure who else might have knowledge of the parser/optimizer. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, PHOENIX-offset.txt > > Time Spent: 14h 50m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5628) Phoenix Function to Return HBase Row Key of Column Cell
[ https://issues.apache.org/jira/browse/PHOENIX-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013261#comment-17013261 ] Daniel Wong commented on PHOENIX-5628: -- I agree with the above; please add a composite key with variable types, and sorting. The HBase check is valid but perhaps overkill; we can simply define the rowkey and compare against it. > Phoenix Function to Return HBase Row Key of Column Cell > --- > > Key: PHOENIX-5628 > URL: https://issues.apache.org/jira/browse/PHOENIX-5628 > Project: Phoenix > Issue Type: New Feature >Reporter: Geoffrey Jacoby >Assignee: Abhishek Singh Chouhan >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5628-4.x-HBase-1.5.patch, > PHOENIX-5628-master.patch > > > It's occasionally useful when diagnosing an issue with Phoenix to be able to > easily look up the byte array of a row key that corresponds to a cell with a > Phoenix query. This can then be looked up in the HBase shell. > For example: > SELECT ROWKEY_BYTES(Column1) FROM Table1 WHERE Column1 = 'SomeValue' -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5614) Remove unnecessary instances of ClassNotFoundException thrown stemming from various QueryUtil APIs
[ https://issues.apache.org/jira/browse/PHOENIX-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998738#comment-16998738 ] Daniel Wong commented on PHOENIX-5614: -- Took a quick look: we were swallowing exceptions in, say, MetaDataEndpointImpl. The new code has no swallow. Are we confident we didn't expose a new code path? I haven't looked in detail yet; forgive my high-level question. > Remove unnecessary instances of ClassNotFoundException thrown stemming from > various QueryUtil APIs > -- > > Key: PHOENIX-5614 > URL: https://issues.apache.org/jira/browse/PHOENIX-5614 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Chinmay Kulkarni >Assignee: Chinmay Kulkarni >Priority: Minor > Labels: beginner, newbie > Fix For: 4.15.1 > > Attachments: PHOENIX-5614-4.x-HBase-1.3-v1.patch > > > Many of the QueryUtil APIs have CNFE in their method signatures, but they > don't really throw them. We mostly handle this in its callers (like inside > MetaDataEndpointImpl) by swallowing this exception. We should remove these > instances. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5510) LoggingPhoenixConnection on close does not log with unclosed statements/result sets
[ https://issues.apache.org/jira/browse/PHOENIX-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946074#comment-16946074 ] Daniel Wong commented on PHOENIX-5510: -- A new method for logging errors related to logging is essentially what I was proposing. In my case I wanted to log an error message noting that we have unclosed result sets and statements. During close of a LoggingPhoenixConnection, the class does not have access to the SQL stored in the statements/result sets it contains. In addition, we might want to indicate that the metrics for READS from an unclosed result set were not recorded, i.e., the read-path interface was not called. There is no method to do so today other than hacky ways like calling the read interface with make-believe SQL or make-believe additional metrics to indicate these were not recorded. This means I cannot put my code, which uses my internal logging solution, into open source, as I manually changed LoggingPhoenixConnection and directly called into my logging libraries. If there were a metrics-error method, I could have indicated during close of the connection that the metrics were missing. > LoggingPhoenixConnection on close does not log with unclosed > statements/result sets > --- > > Key: PHOENIX-5510 > URL: https://issues.apache.org/jira/browse/PHOENIX-5510 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.1, 4.14.3 >Reporter: Daniel Wong >Priority: Minor > > Testing the LoggingPhoenixConnection it does not handle request level metrics > for unclosed statements/result sets. This is due to the delegate nature > where the resulting PhoenixConnection ends up closing these without ever > going through a delegate. Without this log or an way to inform the user that > these occurred operations are essentially blind to these requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5510) LoggingPhoenixConnection on close does not log with unclosed statements/result sets
[ https://issues.apache.org/jira/browse/PHOENIX-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945254#comment-16945254 ] Daniel Wong commented on PHOENIX-5510: -- So, thinking about this a bit more, this may not be an actual bug in logging the result metrics, as users should close result sets and statements prior to connections. However, it would be nice to log something in case this does occur so administrators etc. can have info on when end users do not properly handle this type of closing. The PhoenixMetricsLog interface could have an additional method, either in the next release or, if the move to Java 8 occurs, as a default method. In addition this code could be gutted out and moved to a separate client utils library. Perhaps even better, a cleaner Java service implementation with better coupling with the internal Phoenix code. Note for future readers: for my own use case I will inherit and override the logging classes to handle logging and metrics for this case. FYI [~tdsilva] [~ckulkarni] > LoggingPhoenixConnection on close does not log with unclosed > statements/result sets > --- > > Key: PHOENIX-5510 > URL: https://issues.apache.org/jira/browse/PHOENIX-5510 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.1, 4.14.3 >Reporter: Daniel Wong >Priority: Minor > > Testing the LoggingPhoenixConnection it does not handle request level metrics > for unclosed statements/result sets. This is due to the delegate nature > where the resulting PhoenixConnection ends up closing these without ever > going through a delegate. Without this log or an way to inform the user that > these occurred operations are essentially blind to these requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
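The default-method idea mentioned in the comment above can be sketched like this. `MetricsLogSketch` is a hypothetical illustration, not the real PhoenixMetricsLog interface: it shows how an error/notification hook could be added without breaking existing implementers.

```java
// Hypothetical metrics-logging interface with an added default-method hook.
interface MetricsLogSketch {
    void logReadMetrics(String sql);

    // Default so existing implementations continue to compile; a connection
    // could call this on close when unclosed statements/result sets mean the
    // read metrics were never recorded.
    default void logMetricsError(String reason) {
        // no-op by default; implementations may log or alarm on it
    }
}
```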
[jira] [Commented] (PHOENIX-5493) Remove unnecessary iteration in BaseResultIterator
[ https://issues.apache.org/jira/browse/PHOENIX-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937934#comment-16937934 ] Daniel Wong commented on PHOENIX-5493: -- +1 for this patch. Note I would like to eventually get the larger changes in this module from the stats branch in. > Remove unnecessary iteration in BaseResultIterator > -- > > Key: PHOENIX-5493 > URL: https://issues.apache.org/jira/browse/PHOENIX-5493 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.14.3 >Reporter: Aman Poonia >Assignee: Aman Poonia >Priority: Minor > Attachments: PHOENIX-5493.patch > > > In BaseResultIterator
> {code:java}
> while (offset < nColumnsInCommon && offset < rangesListSize) {
>     List ranges = rangesList.get(offset);
>     // We use a skip scan if we have multiple ranges or if
>     // we have a non single key range before the last range.
>     useSkipScan |= ranges.size() > 1 || hasRange;
>     cnf.add(ranges);
>     int rangeSpan = 1 + dataScanRanges.getSlotSpans()[offset];
>     if (offset + rangeSpan > nColumnsInCommon) {
>         rangeSpan = nColumnsInCommon - offset;
>         // trim range to only be rangeSpan in length
>         ranges = Lists.newArrayListWithExpectedSize(cnf.get(cnf.size()-1).size());
>         for (KeyRange range : cnf.get(cnf.size()-1)) {
>             range = clipRange(dataScanRanges.getSchema(), offset, rangeSpan, range);
>             // trim range to be only rangeSpan in length
>             ranges.add(range);
>         }
>         cnf.set(cnf.size()-1, ranges);
>     }
>     for (KeyRange range : ranges) {
>         if (!range.isSingleKey()) {
>             hasRange = true;
>         }
>     }
>     slotSpan[offset] = rangeSpan - 1;
>     offset = offset + rangeSpan;
> }
> {code}
> we can break in the inner loop and save some CPU cycles -- This message was sent by Atlassian Jira (v8.3.4#803005)
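The optimization the patch proposes can be shown in isolation: the inner loop above only needs to learn whether any range in the slot is a non-single key, so it can stop at the first hit. This is an illustrative reduction, not the actual BaseResultIterators code.

```java
import java.util.Arrays;
import java.util.List;

class BreakEarlySketch {
    // Returns true if any entry is a non-single-key range; breaks on first hit,
    // since the remaining entries cannot change the answer.
    static boolean hasNonSingleKey(List<Boolean> isSingleKey) {
        boolean hasRange = false;
        for (boolean single : isSingleKey) {
            if (!single) {
                hasRange = true;
                break; // the saved iterations are the whole point of the patch
            }
        }
        return hasRange;
    }

    public static void main(String[] args) {
        System.out.println(hasNonSingleKey(Arrays.asList(true, false, true))); // true
        System.out.println(hasNonSingleKey(Arrays.asList(true, true)));        // false
    }
}
```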
[jira] [Commented] (PHOENIX-5085) Disentangle BaseResultIterators from the backing Guidepost Data structure
[ https://issues.apache.org/jira/browse/PHOENIX-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930788#comment-16930788 ] Daniel Wong commented on PHOENIX-5085: -- Marking as in progress as part of the Stats branch where these changes exist. > Disentangle BaseResultIterators from the backing Guidepost Data structure > - > > Key: PHOENIX-5085 > URL: https://issues.apache.org/jira/browse/PHOENIX-5085 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Labels: Statistics, StatsImprovement > > Disentangle BaseResultIterators.getParallelScans from the backing Guidepost > Data structure. This will provide the abstraction for possible new stats > data structures in https://issues.apache.org/jira/browse/PHOENIX-4925 > Will heavily affect changes in > https://issues.apache.org/jira/browse/PHOENIX-4926 and > https://issues.apache.org/jira/browse/PHOENIX-4594. [~Bin Shi] > [~karanmehta93] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (PHOENIX-5444) Incorrect Phoenix Client Path in phoenix_utils.py
[ https://issues.apache.org/jira/browse/PHOENIX-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906801#comment-16906801 ] Daniel Wong commented on PHOENIX-5444: -- Fix here: [https://github.com/apache/phoenix/pull/566] > Incorrect Phoenix Client Path in phoenix_utils.py > - > > Key: PHOENIX-5444 > URL: https://issues.apache.org/jira/browse/PHOENIX-5444 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > ./bin/phoenix_utils.py has a variable for the phoenix client jar pattern but does > not use it in all places. Using it consistently will correct some issues for future > client names (and in my case my forked versions). -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5142) Protect against bad scan construction causing stack overflow
[ https://issues.apache.org/jira/browse/PHOENIX-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897558#comment-16897558 ] Daniel Wong commented on PHOENIX-5142: -- I'm not sure how to reproduce this in an IT. The bad scans were due to an implementation error I had during testing, so I'm not sure it is worth reproducing or filling in affected versions. This is more about how the code is structured than a bug, but I'll fill in the latest version as that code path has not been changed. > Protect against bad scan construction causing stack overflow > > > Key: PHOENIX-5142 > URL: https://issues.apache.org/jira/browse/PHOENIX-5142 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Minor > > During implementation changes I created an issue where a bad set of scans > was generated. These scans crossed region boundaries incorrectly. This led > Phoenix to believe the cache was out of date and triggered a retry attempt in > a recursive manner until the stack overflowed. We may want to protect > against this type of failure by limiting the depth of the recursion and > returning an exception to the user. 
> {code:java} > java.lang.StackOverflowError at > java.security.AccessController.doPrivileged(Native Method) at > java.io.PrintWriter.(PrintWriter.java:116) at > java.io.PrintWriter.(PrintWriter.java:100) at > org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58) > at > org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87) > at > org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413) > at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313) at > org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at > org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at > org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) > at org.apache.log4j.Category.callAppenders(Category.java:206) at > org.apache.log4j.Category.forcedLog(Category.java:391) at > org.apache.log4j.Category.log(Category.java:856) at > org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:382) at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:1568) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1475) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > 
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
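The protection the description suggests might look like the following sketch. The names and the depth cap of 5 are assumptions, and the mutual recursion stands in for the getIterators/recreateIterators cycle visible in the stack trace; it is not the actual Phoenix code.

```java
class DepthLimitSketch {
    static final int MAX_RETRY_DEPTH = 5; // assumed cap, not a Phoenix constant

    static String getIterators(int depth, boolean cacheAlwaysStale) {
        if (depth > MAX_RETRY_DEPTH) {
            // Surface a bounded error instead of letting the recursion
            // run until java.lang.StackOverflowError.
            throw new IllegalStateException(
                "region cache retry depth " + depth + " exceeded");
        }
        if (cacheAlwaysStale) {
            // In the bug above, a bad scan made this branch repeat forever.
            return recreateIterators(depth, cacheAlwaysStale);
        }
        return "iterators";
    }

    static String recreateIterators(int depth, boolean cacheAlwaysStale) {
        return getIterators(depth + 1, cacheAlwaysStale);
    }

    public static void main(String[] args) {
        System.out.println(getIterators(0, false)); // iterators
        try {
            getIterators(0, true); // simulated bad scan: cache never converges
        } catch (IllegalStateException e) {
            System.out.println("bounded: " + e.getMessage());
        }
    }
}
```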
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896605#comment-16896605 ] Daniel Wong commented on PHOENIX-4845: -- I was implementing salt handling by not handling the "ORDER BY" type concerns and just paging through the salted table normally. That is, first salt byte 0, then salt byte 1, etc. The ORDER BY would have to be unconstrained, as we would not be able to return rows in PK order otherwise. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending; we can always use the > operator to query for > the next batch of rows. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A descending and then by B descending we need to change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order, e.g. (A DESC, > B), then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of an Index or Primary Table. > Note that some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To handle this, the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
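The client-side burden that an RVC OFFSET would remove can be sketched as follows. This illustrates the description's point, that the RVC comparison operator depends on the PK sort order, and is not Phoenix code; it also only covers uniform sort orders, since with mixed order no single operator works at all.

```java
class RvcPagingSketch {
    enum SortOrder { ASC, DESC }

    // Builds the RVC paging query a client must construct today: '>' for
    // ascending PKs, '<' for descending PKs. With OFFSET (?,?) this choice
    // would move into the engine.
    static String pagingQuery(String table, SortOrder order) {
        String op = (order == SortOrder.ASC) ? ">" : "<";
        String dir = (order == SortOrder.ASC) ? "" : " DESC";
        return "SELECT * FROM " + table + " WHERE (A, B) " + op
            + " (?, ?) ORDER BY A" + dir + ", B" + dir + " LIMIT 20";
    }

    public static void main(String[] args) {
        System.out.println(pagingQuery("T", SortOrder.ASC));
        System.out.println(pagingQuery("T", SortOrder.DESC));
    }
}
```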
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896577#comment-16896577 ] Daniel Wong commented on PHOENIX-4845: -- How would we expect them to handle things like the SALT byte there? Will they even be able to retrieve that data? This would also require them to understand how Phoenix lays out the secondary indexes, as given only their CREATE INDEX statement they have no knowledge of those keys even existing and would have to inspect the system catalog to know about them. If we are okay with those restrictions this is feasible, but I dislike not being able to run on an index whose base table is salted, for example. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896562#comment-16896562 ] Daniel Wong commented on PHOENIX-4845: -- I'll try. I could require an INDEX or NO INDEX hint relatively easily, which maybe I should. However, I cannot determine all the user-defined index PKs given the current SYSCAT as far as I can tell. Consider the following code fragment and the table and indexes it creates:
{code:java}
String ddlTemplate = "CREATE TABLE %s (k1 TINYINT NOT NULL,\n"
    + "k2 TINYINT NOT NULL,\n"
    + "k3 TINYINT NOT NULL,\n"
    + "k4 TINYINT NOT NULL,\n"
    + "k5 TINYINT NOT NULL,\n"
    + "k6 TINYINT NOT NULL,\n"
    + "v1 INTEGER,\n"
    + "v2 INTEGER,\n"
    + "v3 INTEGER,\n"
    + "v4 INTEGER,\n"
    + "CONSTRAINT pk PRIMARY KEY (k1, k2, k3, k4, k5, k6)) ";
String longKeyTableName = "T_" + generateUniqueName();
String longKeyIndex1Name = "INDEX_1_" + longKeyTableName;
String longKeyIndex2Name = "INDEX_2_" + longKeyTableName;
String ddl = String.format(ddlTemplate, longKeyTableName);
conn.createStatement().execute(ddl);
String createIndex1 = "CREATE INDEX IF NOT EXISTS " + longKeyIndex1Name + " ON "
    + longKeyTableName + " (k2, v1, k4)";
String createIndex2 = "CREATE INDEX IF NOT EXISTS " + longKeyIndex2Name + " ON "
    + longKeyTableName + " (v1, v3)";
{code}
Here the PK columns for the table are PKs = \{K1, K2, K3, K4, K5, K6}. For index 1 the PK is \{K2, 0:V1, K4, K1, K3, K5, K6}, as we append the base PKs to the indexed row key. However, I cannot differentiate that PK from a CREATE INDEX that had, for example, CREATE INDEX (k2, v1, k4, k1). So how will I be able to force the user to provide either k1 or not provide k1? I didn't see anything in SYSCAT or the PTable interface that allows me to differentiate these scenarios. I could be missing something, however. 
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
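The index row-key composition described in the comment can be modeled minimally, ignoring details such as the 0: column-family prefix and sort order. The sketch below is illustrative, not Phoenix code; it shows why the two CREATE INDEX declarations become indistinguishable from the final PK list alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class IndexPkSketch {
    // Index PK = user-declared indexed columns, then any base-table PK columns
    // not already present, in base PK order (simplified model).
    static List<String> indexPk(List<String> indexed, List<String> basePk) {
        LinkedHashSet<String> pk = new LinkedHashSet<>(indexed);
        pk.addAll(basePk);
        return new ArrayList<>(pk);
    }

    public static void main(String[] args) {
        List<String> basePk = Arrays.asList("K1", "K2", "K3", "K4", "K5", "K6");
        List<String> a = indexPk(Arrays.asList("K2", "V1", "K4"), basePk);
        List<String> b = indexPk(Arrays.asList("K2", "V1", "K4", "K1"), basePk);
        System.out.println(a);           // [K2, V1, K4, K1, K3, K5, K6]
        System.out.println(a.equals(b)); // true: the declarations collapse to one PK
    }
}
```

Because both declarations collapse to the same list, nothing in the final PK records where the user-specified prefix ended, which is exactly the SYSCAT/PTable gap the comment raises.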
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896412#comment-16896412 ] Daniel Wong commented on PHOENIX-4845: -- One additional concern: based on my analysis, there does not seem to be an easy way to determine which PKs of an index are user defined and which are appended from the base table to facilitate uniqueness and indexed lookup. While we have lists of both PKs, I'm unable to determine which were user specified and which were appended. This means that I cannot simply require the offset to be fully specified for an indexed lookup. While I can implement with the leading edge of the key, this introduces more ambiguity. Assuming I'm correct that there is no way to reconstruct this currently, should we add this information to the system catalog? [~tdsilva] [~swaroopa] [~gjacoby] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895666#comment-16895666 ] Daniel Wong edited comment on PHOENIX-4845 at 7/30/19 12:07 AM: Updating the high level description with the following note. Note that there is some care needed in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specify an index offset for safety was (Author: dbwong): Updating the high level description with the following note. Note that there is some care neded in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specifiy an index offset for safety -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895666#comment-16895666 ] Daniel Wong commented on PHOENIX-4845: -- Updating the high level description with the following note. Note that there is some care neded in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specifiy an index offset for safety > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4846) WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort order of pk columns being filtered on changes
[ https://issues.apache.org/jira/browse/PHOENIX-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890386#comment-16890386 ] Daniel Wong commented on PHOENIX-4846: -- I'll see if I can look at this later this week. > WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort > order of pk columns being filtered on changes > --- > > Key: PHOENIX-4846 > URL: https://issues.apache.org/jira/browse/PHOENIX-4846 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0, 5.0.0 >Reporter: Thomas D'Silva >Priority: Critical > Fix For: 4.15.0, 5.1.0 > > Attachments: PHOENIX-4846-wip.patch > > > {{ExpressionComparabilityWrapper}} should set the sort order based on > {{childPart.getColumn()}} or else the attached test throws an > IllegalArgumentException > {code} > java.lang.IllegalArgumentException: 4 > 3 > at java.util.Arrays.copyOfRange(Arrays.java:3519) > at > org.apache.hadoop.hbase.io.ImmutableBytesWritable.copyBytes(ImmutableBytesWritable.java:272) > at > org.apache.phoenix.compile.WhereOptimizer.getTrailingRange(WhereOptimizer.java:329) > at > org.apache.phoenix.compile.WhereOptimizer.clipRight(WhereOptimizer.java:350) > at > org.apache.phoenix.compile.WhereOptimizer.pushKeyExpressionsToScan(WhereOptimizer.java:237) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:157) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:108) > at > org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:556) > {code} > Also in {{pushKeyExpressionsToScan()}} we cannot extract pk column nodes from > the where clause if the sort order of the columns changes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5290) HashJoinMoreIT is flapping
[ https://issues.apache.org/jira/browse/PHOENIX-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887480#comment-16887480 ] Daniel Wong commented on PHOENIX-5290: -- Thanks for commenting on this. I think we are in agreement, but I was likely talking about some detailed-level info. For example, a VARBINARY row key column is allowed only as the last column exactly because of these separator issues, as a 0 byte is valid and a non-separator in a VARBINARY. We get away with this from an interpretation perspective since we know the length of the key: everything from the last field to the end is the VARBINARY. This length is essentially “stored” in the HBase key. In this case we do not allow new columns to be added to the row key. And I agree on not changing the row key; I did not mean that as an intention. Sorry if there was a misunderstanding! > HashJoinMoreIT is flapping > -- > > Key: PHOENIX-5290 > URL: https://issues.apache.org/jira/browse/PHOENIX-5290 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.14.1, 5.1.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5290-combined.txt, 5290-failure.txt, 5290-v2.txt, > 5290-v3.txt, 5290.txt > > > {code} > [INFO] Running org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 91.509 s <<< FAILURE! - in org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] testBug2961(org.apache.phoenix.end2end.join.HashJoinMoreIT) Time > elapsed: 2.42 s <<< ERROR! > java.lang.IllegalArgumentException: 6 > 5 > at > org.apache.phoenix.end2end.join.HashJoinMoreIT.testBug2961(HashJoinMoreIT.java:898) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5290) HashJoinMoreIT is flapping
[ https://issues.apache.org/jira/browse/PHOENIX-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885544#comment-16885544 ] Daniel Wong commented on PHOENIX-5290: -- Quickly trying to catch up here. There is a bunch of weirdness around separators in our key/single-value row generation that likely should be abstracted out. We append separators somewhat blindly after writing each field when generating the row key. Since row keys have a length, it is unnecessary to store the trailing separator, so we trim it. If we were to have multiple variable-length fields (or nullable secondary indexes) we could in theory trim those as well, for example several empty-String VARCHARs in a row. Without looking in the code I cannot recall if there is a case where we trim all of them for storage, but for querying it appears we will do so for something like the following code snippet I ran. Here in my code the start key has no trailing separators.
{code:java}
String dataTableName = "T_" + generateUniqueName();
String dataTableDDL = "CREATE TABLE " + dataTableName
    + " (k1 INTEGER NOT NULL, s1 VARCHAR(64),\n"
    + "s2 VARCHAR(64),\n"
    + "s3 VARCHAR(64),\n"
    + "v1 INTEGER,\n"
    + "CONSTRAINT pk PRIMARY KEY (k1))";
conn.createStatement().execute(dataTableDDL);
conn.commit();
String indexDDL = "CREATE INDEX IF NOT EXISTS " + dataTableName + "_IDX"
    + " ON " + dataTableName + " (s1, s2, s3)";
conn.createStatement().execute(indexDDL);
conn.commit();
String upsertDML = String.format("UPSERT INTO %s VALUES(?,?,?,?,?)", dataTableName);
int nRows = 0;
PreparedStatement ps = conn.prepareStatement(upsertDML);
ps.setInt(1, 5);
ps.setString(2, "a");
ps.setString(3, "");
ps.setString(4, "");
ps.setInt(5, nRows);
int result = ps.executeUpdate();
assertEquals(1, result);
nRows++;
conn.commit();
String sql = "SELECT s1,s2,s3 FROM " + dataTableName
    + " WHERE (s1,s2,s3) >= ('a','','') ORDER BY s1,s2,s3";
ResultSet rs = conn.createStatement().executeQuery(sql);
assertTrue(rs.next());
{code}
> HashJoinMoreIT 
is flapping > -- > > Key: PHOENIX-5290 > URL: https://issues.apache.org/jira/browse/PHOENIX-5290 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.14.1, 5.1.0 >Reporter: Lars Hofhansl >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5290-failure.txt, 5290-v2.txt, 5290-v3.txt, 5290.txt > > > {code} > [INFO] Running org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 91.509 s <<< FAILURE! - in org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] testBug2961(org.apache.phoenix.end2end.join.HashJoinMoreIT) Time > elapsed: 2.42 s <<< ERROR! > java.lang.IllegalArgumentException: 6 > 5 > at > org.apache.phoenix.end2end.join.HashJoinMoreIT.testBug2961(HashJoinMoreIT.java:898) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
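The separator behavior discussed above can be illustrated with a simplified encoder. This is not Phoenix's actual row-key encoding (it ignores types, column families, and DESC order); the encode helper is hypothetical and only demonstrates the trailing-separator trimming idea.

```java
import java.util.Arrays;
import java.nio.charset.StandardCharsets;

class SeparatorTrimSketch {
    static final byte SEP = 0x00; // variable-length field separator

    // Append a separator after every field, then trim trailing separators:
    // the key's own length disambiguates, so they carry no information.
    static byte[] encode(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (String f : fields) {
            sb.append(f).append((char) SEP);
        }
        byte[] key = sb.toString().getBytes(StandardCharsets.UTF_8);
        int end = key.length;
        while (end > 0 && key[end - 1] == SEP) {
            end--; // empty trailing fields vanish along with their separators
        }
        return Arrays.copyOf(key, end);
    }

    public static void main(String[] args) {
        // ("a","","") encodes identically to ("a") once trailing separators go,
        // which is why the query above can still match the upserted row.
        System.out.println(Arrays.equals(encode("a", "", ""), encode("a"))); // true
    }
}
```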
[jira] [Commented] (PHOENIX-4846) WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort order of pk columns being filtered on changes
[ https://issues.apache.org/jira/browse/PHOENIX-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884083#comment-16884083 ] Daniel Wong commented on PHOENIX-4846: -- I'll try to look at this next week. > WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort > order of pk columns being filtered on changes > --- > > Key: PHOENIX-4846 > URL: https://issues.apache.org/jira/browse/PHOENIX-4846 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0, 5.0.0 >Reporter: Thomas D'Silva >Priority: Critical > Fix For: 4.15.0, 5.1.0 > > Attachments: PHOENIX-4846-wip.patch > > > {{ExpressionComparabilityWrapper}} should set the sort order based on > {{childPart.getColumn()}} or else the attached test throws an > IllegalArgumentException > {code} > java.lang.IllegalArgumentException: 4 > 3 > at java.util.Arrays.copyOfRange(Arrays.java:3519) > at > org.apache.hadoop.hbase.io.ImmutableBytesWritable.copyBytes(ImmutableBytesWritable.java:272) > at > org.apache.phoenix.compile.WhereOptimizer.getTrailingRange(WhereOptimizer.java:329) > at > org.apache.phoenix.compile.WhereOptimizer.clipRight(WhereOptimizer.java:350) > at > org.apache.phoenix.compile.WhereOptimizer.pushKeyExpressionsToScan(WhereOptimizer.java:237) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:157) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:108) > at > org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:556) > {code} > Also in {{pushKeyExpressionsToScan()}} we cannot extract pk column nodes from > the where clause if the sort order of the columns changes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871653#comment-16871653 ] Daniel Wong commented on PHOENIX-4845: -- Lars, we discussed this a bit earlier in our threads, but essentially the reason OFFSET is good to have is that it removes the burden of knowing the DDL and access path when constructing the WHERE clause: you have to inspect which PK columns are ASC or DESC in order to construct the right comparator. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
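The burden the comment describes, knowing the PK sort order just to pick a comparator, can be sketched as a tiny helper. The class and method names here are invented for illustration; this is not a Phoenix API, and it only handles the uniform-sort-order case (mixed ASC/DESC is exactly what a single RVC comparison cannot express):

```java
import java.util.Collections;

// Builds the RVC paging predicate a client must construct today: the caller
// has to know whether the PK columns are all ASC or all DESC to choose the
// comparison operator. The proposed OFFSET (?,?) clause removes this burden.
public class PagingQueryBuilder {
    static String pagingPredicate(String[] pkCols, boolean descending) {
        String cols = String.join(", ", pkCols);
        String marks = String.join(", ", Collections.nCopies(pkCols.length, "?"));
        // (A, B) > (?, ?) for ascending keys, (A, B) < (?, ?) for descending.
        String op = descending ? "<" : ">";
        return "(" + cols + ") " + op + " (" + marks + ")";
    }

    public static void main(String[] args) {
        String[] pk = {"A", "B"};
        System.out.println(pagingPredicate(pk, false)); // (A, B) > (?, ?)
        System.out.println(pagingPredicate(pk, true));  // (A, B) < (?, ?)
    }
}
```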
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869972#comment-16869972 ] Daniel Wong commented on PHOENIX-4845: -- Initial mostly working outline of approach 1: [https://github.com/dbwong/phoenix/pull/2/files] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. 
> > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869950#comment-16869950 ] Daniel Wong commented on PHOENIX-4845: -- [~lhofhansl] I don't quite follow the performance concern; I assume it is due to optimization time, as the where optimizer generates the proper scan bounds based on the new additional predicates, which is linear in the number of PKs? This predicate should be entirely extractable into the PK, so there should not be any issue server side. After attempting approach 2, for which I have a general outline (PR to follow shortly), I can say that approach 1 is definitely easier in the short term and less likely to miss an issue. For the serial iterator I was simply planning to enforce that the select is row-key ordered; this solves any worry about needing to serialize the parallel pieces in any new way. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) 
ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5360) Cleanup anonymous inner classes in WhereOptimizer
[ https://issues.apache.org/jira/browse/PHOENIX-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868102#comment-16868102 ] Daniel Wong commented on PHOENIX-5360: -- One advantage of static inner classes (say, package-private) is that they are unit testable; are we planning on testing these? > Cleanup anonymous inner classes in WhereOptimizer > - > > Key: PHOENIX-5360 > URL: https://issues.apache.org/jira/browse/PHOENIX-5360 > Project: Phoenix > Issue Type: Sub-task >Reporter: Xinyi Yan >Assignee: Xinyi Yan >Priority: Minor > Attachments: PHOENIX-5360.patch > > > Followed [~gjacoby] in > [PHOENIX-5024|https://issues.apache.org/jira/browse/PHOENIX-5024]. The Convert > Anonymous to Inner refactor in IntelliJ is a nice feature that converts an > anonymous class to an inner class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
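The refactor under discussion, in miniature. The class and comparator names below are invented for illustration (they are not from WhereOptimizer); the point is only that the named nested class can be instantiated and exercised directly from a unit test, while the anonymous one cannot be referenced by name:

```java
import java.util.Comparator;

public class RefactorExample {
    // Before: an anonymous inner class. A test cannot name this type, so its
    // logic can only be exercised indirectly through whatever holds it.
    static final Comparator<String> BY_LENGTH_ANON = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // After: a package-private static nested class. A test in the same
    // package can instantiate ByLength and call compare() directly.
    static class ByLength implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    }

    public static void main(String[] args) {
        System.out.println(new ByLength().compare("ab", "abc") < 0); // true
    }
}
```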
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868098#comment-16868098 ] Daniel Wong commented on PHOENIX-4845: -- [~Bin Shi] might have some thoughts on this too. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868090#comment-16868090 ] Daniel Wong edited comment on PHOENIX-4845 at 6/19/19 11:15 PM: So I have begun implementing this Jira and have done some initial scoping and approach thoughts. The following disallowed cases will not be supported:
{code:java}
String simpleDdl = "CREATE TABLE " + tableName
    + " (t_id VARCHAR NOT NULL,\n"
    + "k1 INTEGER NOT NULL,\n"
    + "k2 INTEGER NOT NULL,\n"
    + "C3.k3 INTEGER,\n"
    + "C2.v1 VARCHAR,\n"
    + "CONSTRAINT pk PRIMARY KEY (t_id, k1, k2)) ";

//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non-PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow different PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
    String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)", tableName, tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test Not allowed on subquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
    String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
    conn.createStatement().execute(sql);
    String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column reference does not work
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have allowed column in RVC Offset");
}
{code}
Approach possibilities: I have been considering two possibilities for approaching this in Phoenix and am open to other suggestions as well.

Given DDL/Query: CREATE TABLE TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTRAINT pk PRIMARY KEY (a, b DESC, c)); SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))

Approach 2: Mini-Resolution. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT 1
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868090#comment-16868090 ] Daniel Wong commented on PHOENIX-4845: -- So I have begun implementing this Jira and have done some initial scoping and approach thoughts. The following disallowed cases will not be supported:
{code:java}
//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non-PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow different PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
    String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)", tableName, tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test Not allowed on subquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
    String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
    conn.createStatement().execute(sql);
    String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column reference does not work
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have allowed column in RVC Offset");
}
{code}
Approach possibilities: I have been considering two possibilities for approaching this in Phoenix and am open to other suggestions as well.

Given DDL/Query: CREATE TABLE TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTRAINT pk PRIMARY KEY (a, b DESC, c)); SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))

Approach 2: Mini-Resolution. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT 1 FROM TABLE WHERE (a,b,c)=(1,2,3) (pass the constructed tree to the WhereOptimizer, essentially) -> extract the byte array from the constant [0x01FD03] -> filter the scan start/end keys by the constant.

Right now I'm favoring approach 2 but this will require some additional hooks into the
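Approach 1's rewrite can be sketched mechanically. The generator below is illustrative only, with invented names; it is not the proposed Phoenix implementation. It expands an RVC offset into the nested OR/AND predicate shown in the comment, flipping comparators for DESC columns:

```java
// Sketch of the Approach 1 rewrite: OFFSET (v1, v2, v3) becomes nested
// comparisons, where DESC columns use the flipped operator and the last
// column's comparison is inclusive.
public class RvcOffsetRewrite {
    static String rewrite(String[] cols, boolean[] desc, String[] vals) {
        // Innermost term: the last column is inclusive (>= for ASC, <= for DESC).
        int last = cols.length - 1;
        String expr = cols[last] + (desc[last] ? " <= " : " >= ") + vals[last];
        // Wrap outward: col > v OR (col = v AND <inner>), flipping > to < for DESC.
        for (int i = last - 1; i >= 0; i--) {
            String cmp = cols[i] + (desc[i] ? " < " : " > ") + vals[i];
            expr = "(" + cmp + " OR (" + cols[i] + " = " + vals[i] + " AND " + expr + "))";
        }
        return expr;
    }

    public static void main(String[] args) {
        // PK (a, b DESC, c) with OFFSET (1, 2, 3), as in the comment above.
        String where = rewrite(new String[]{"A", "B", "C"},
                               new boolean[]{false, true, false},
                               new String[]{"1", "2", "3"});
        System.out.println(where);
        // (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))
    }
}
```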
[jira] [Commented] (PHOENIX-5318) Slots passed to SkipScan filter is incorrect for desc primary keys that are prefixes of each other
[ https://issues.apache.org/jira/browse/PHOENIX-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865818#comment-16865818 ] Daniel Wong commented on PHOENIX-5318: -- It is probably good to do since we are still working on testing for correctness. > Slots passed to SkipScan filter is incorrect for desc primary keys that are > prefixes of each other > -- > > Key: PHOENIX-5318 > URL: https://issues.apache.org/jira/browse/PHOENIX-5318 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0, 5.1.0, 4.14.3 >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva >Priority: Major > Labels: DESC > Fix For: 4.15.0, 5.1.0, 4.14.3 > > Attachments: PHOENIX-5318-4.x-HBase-1.3.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > {code} > CREATE VIEW IF NOT EXISTS CUSTOM_ENTITY."z01" (COL1 VARCHAR, COL2 VARCHAR, > COL3 VARCHAR, COL4 VARCHAR CONSTRAINT PK PRIMARY KEY (COL1 DESC, COL2 DESC, > COL3 DESC, COL4 DESC)) AS SELECT * FROM > CUSTOM_ENTITY.CUSTOM_ENTITY_DATA_NO_ID WHERE KEY_PREFIX = 'z01'; > > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('8', 'blah', > 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('6', 'blah', > 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('23', > 'blah', 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('17', > 'blah', 'blah', 'blah'); > > SELECT COL1, COL2, COL3, COL4 FROM CUSTOM_ENTITY."z01" WHERE COL4='blah' AND > (COL1='1' OR COL1='2' OR COL1='3' OR COL1='4' OR COL1='5' OR COL1='6' OR > COL1='8' OR COL1='17' OR COL1='12' OR COL1='23') AND COL3='blah' > > +---+---+---+---+ > | COL1 | COL2 | COL3 | COL4 | > +---+---+---+---+ > | 8 | blah | blah | blah | > | 6 | blah | blah | blah | > +---+---+---+---+ > > SELECT COL1, COL2, COL3, COL4 FROM CUSTOM_ENTITY."z01" WHERE COL4='blah' AND > (COL1='6'OR COL1='8' OR COL1='17' OR COL1='12' OR COL1='23') AND COL3='blah' > > +---+---+---+---+ > | 
COL1 | COL2 | COL3 | COL4 | > +---+---+---+---+ > | 8 | blah | blah | blah | > | 6 | blah | blah | blah | > | 23 | blah | blah | blah | > | 17 | blah | blah | blah | > +---+---+---+---+ > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
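A simplified model of why DESC-encoded values that are prefixes of each other ('2' vs '23' above) are tricky: Phoenix stores DESC columns with inverted bytes, so the variable-length separator is inverted too, and the usual "prefix sorts first" rule flips. The encoding below is a sketch for illustration, not Phoenix's exact scheme:

```java
import java.nio.charset.StandardCharsets;

// Toy model of ASC vs DESC variable-length key encoding: invert each byte for
// DESC and append a separator (0x00 for ASC, its inversion 0xFF for DESC).
// Under DESC encoding the prefix "2" sorts AFTER "23", which is what skip-scan
// slot generation must account for.
public class DescPrefixSketch {
    static byte[] encode(String s, boolean desc) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[b.length + 1];
        for (int i = 0; i < b.length; i++) {
            out[i] = desc ? (byte) ~b[i] : b[i];
        }
        out[b.length] = desc ? (byte) 0xFF : 0x00; // trailing separator
        return out;
    }

    // Unsigned lexicographic comparison, as HBase compares row keys.
    static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Ascending: "2" sorts before "23" (prefix first).
        System.out.println(compare(encode("2", false), encode("23", false)) < 0); // true
        // Descending: the order of the prefix pair flips.
        System.out.println(compare(encode("2", true), encode("23", true)) > 0);   // true
    }
}
```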
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852244#comment-16852244 ] Daniel Wong commented on PHOENIX-5231: -- I'm okay with reverting as I personally don't have time to look until next week. What is the long term way forward though? Services are the standard way to provide user pluggable functionality. Originally this was intended for possible memcached implementation which we believed was too specific to add directly to open source. [~lhofhansl] > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > 5231-services-fix.patch, PHOENIX-5231.4.x-HBase-1.3.patch, > PHOENIX-5231.4.x-HBase-1.3.v2.patch, PHOENIX-5231.4.x-HBase-1.3.v3.patch, > PHOENIX-5231.master.v3.patch, PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850207#comment-16850207 ] Daniel Wong commented on PHOENIX-5231: -- +1 to Vincent's fix. I'm guessing my version of mvn is really old on my personal laptop as I don't develop there much. I'll try to confirm tonight. > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > 5231-services-fix.patch, PHOENIX-5231.4.x-HBase-1.3.patch, > PHOENIX-5231.4.x-HBase-1.3.v2.patch, PHOENIX-5231.4.x-HBase-1.3.v3.patch, > PHOENIX-5231.master.v3.patch, PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850012#comment-16850012 ] Daniel Wong commented on PHOENIX-5231: -- I was trying to come up with a fix I think would be clearer, but I'm still traveling and don't have good internet access outside my phone. I think we can construct a better fix, as the one you proposed here removes the standard META-INF directory approach where one would normally expect the services to live. Let me see if I can spend a little time and construct something. > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
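For context on the META-INF/services mechanism being debugged in this thread, here is a minimal `java.util.ServiceLoader` sketch. The interface and implementation names are invented stand-ins for `GuidePostsCacheFactory` and its default; this is not Phoenix code:

```java
import java.util.ServiceLoader;

// Implementations are discovered via a provider-configuration file on the
// classpath, named after the service interface under META-INF/services/ and
// containing the implementation's fully qualified class name.
public class ServiceLoaderSketch {
    public interface CacheFactory {
        String name();
    }

    public static class DefaultCacheFactory implements CacheFactory {
        public String name() { return "default"; }
    }

    public static void main(String[] args) {
        // With the provider-configuration file present, this loop would find
        // each registered implementation. Shading or merging jars (as the
        // phoenix-client assembly does) must preserve that META-INF/services
        // entry, or discovery fails silently; that is the symptom being
        // debugged in the comments above.
        for (CacheFactory f : ServiceLoader.load(CacheFactory.class)) {
            System.out.println("found " + f.name());
        }
    }
}
```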
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848296#comment-16848296 ] Daniel Wong commented on PHOENIX-5231: -- That's strange; I wonder what changed. Can you share your mvn version so I can attempt to investigate later? > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848163#comment-16848163 ] Daniel Wong commented on PHOENIX-5231: -- [~lhofhansl] I am not able to recreate your issue with SQLLine. Can you give me some more information? What is the output of jar tf on your client jar? This was my output following mvn package.
{code:java}
dev@OnePageGamesDevBox:~/dev/phoenix$ jar tf ./phoenix-client/target/phoenix-client-5.1.0-HBase-2.0-SNAPSHOT.jar | grep "GuidePostsCacheFactory"
org/apache/phoenix/query/ITGuidePostsCacheFactory.class
org/apache/phoenix/query/GuidePostsCacheFactory.class
org/apache/phoenix/query/DefaultGuidePostsCacheFactory.class
META-INF/services/org.apache.phoenix.query.GuidePostsCacheFactory
META-INF/services/META-INF/services/org.apache.phoenix.query.GuidePostsCacheFactory
{code}
After starting up the sandbox minicluster I was able to connect with SQLLine.
{code:java}
dev@OnePageGamesDevBox:~/dev/phoenix$ bin/sqlline.py localhost:56453
Setting property: [incremental, false]
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost:56453 none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost:56453
19/05/25 04:41:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 5.1)
Driver: PhoenixEmbeddedDriver (version 5.1)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
154/154 (100%) Done
Done
sqlline version 1.7.0
0: jdbc:phoenix:localhost:56453> select * from system.catalog limit 1;
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
| TENANT_ID | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | COLUMN_FAMILY | TABLE_SEQ_NUM | TABLE_TYPE | PK_NAME | COLUMN_COUNT |   |
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
|           | SYSTEM      | CATALOG    |             |               | 0             | s          | PK      | 64           |   |
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
{code}
Which branch or Phoenix build were you having the issue with? > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)