[jira] [Commented] (PHOENIX-7253) Perf improvement for non-full scan queries on large table
[ https://issues.apache.org/jira/browse/PHOENIX-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823491#comment-17823491 ] Daniel Wong commented on PHOENIX-7253: -- I found some of my notes: the Phoenix code doing this is a hack given HBase limitations... in 1.x there was no easy access to the HBase client's metadata cache, and this method was used to get at the cached values indirectly, as opposed to directly calling {{HConnection.getAllRegionLocations}}. This code path is called from {{getAllTableRegions}}, which is used when the Phoenix client wants/needs to update the region info for an HBase table. This can happen during scan generation or when a split/merge is detected. The method calls {{HConnection.getRegionLocation}} iteratively, one region at a time, to find the next region. This is done so that it uses the cache in the HConnection itself rather than HTable.getAllRegionLocations, which triggers a meta scan and a re-cache on the HBase client of all the regions in the 1.x codebase. Where this can become an issue: if one region splits, many scans may simultaneously call this method, overloading the client threads as well as possibly meta. Some protection so that a client can only call this from a single thread for a single table would improve this flow greatly. Whether this should be done inside HBase itself or in Phoenix is debatable, but Phoenix intends to use the HBase client cache rather than store its own cache of region boundaries. I haven't had time to explore the 2.x codebase, but much of this has changed and been replaced by the async region locator flow here [https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncTableRegionLocatorImpl.java#L59], and this appears to again call and then cache rather than use the cache where available... some trickiness here. 
What the code originally wanted/intended was to get all the regions from the HConnection cache and only renew the cache on a detected split/merge, IIRC. > Perf improvement for non-full scan queries on large table > - > > Key: PHOENIX-7253 > URL: https://issues.apache.org/jira/browse/PHOENIX-7253 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.2.0, 5.1.3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > > Any considerably large table with more than 100k regions can give problematic > performance if we access all region locations from meta for the given table > before generating parallel or sequential scans for the given query. The perf > impact can really hurt range scan queries. > Consider a table with hundreds of thousands of tenant views. Unless the query > is a strict point lookup, any query on any tenant view would end up retrieving > region locations of all regions of the base table. If an IOException is > thrown by the HBase client during any region location lookup in meta, we only > perform a single retry. > Proposal: > # All non point lookup queries should only retrieve region locations that > cover the scan boundary. Avoid fetching all region locations of the base > table. > # Make retries configurable with a higher default value. > > Sample stacktrace from the multiple failures observed: > {code:java} > java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table regions.Stack > trace: java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table > regions. 
> at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:620) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:229) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:781) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.iterate.DefaultParallelScanGrouper.getRegionBoundaries(DefaultParallelScanGrouper.java:74) > at > org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:587) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:936) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:669) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:555) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:69) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) > at
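The protection Daniel suggests above — a client calling the region-refresh path from only a single thread per table — is essentially a "single-flight" guard: one thread does the lookup while concurrent callers for the same table wait and share its result. A minimal sketch in Python (illustrative only; `RegionLocationRefresher` and `fetch_fn` are hypothetical names, not Phoenix/HBase APIs):

```python
import threading

class RegionLocationRefresher:
    """Single-flight guard: at most one thread refreshes a table's region
    list at a time; concurrent callers wait for and share that result."""

    def __init__(self, fetch_fn):
        self._fetch_fn = fetch_fn  # stands in for one bulk meta lookup
        self._lock = threading.Lock()
        self._inflight = {}        # table name -> Event for in-progress fetch
        self._cache = {}           # table name -> cached region list

    def get_regions(self, table):
        with self._lock:
            if table in self._cache:
                return self._cache[table]
            ev = self._inflight.get(table)
            if ev is None:                    # we become the refreshing thread
                ev = threading.Event()
                self._inflight[table] = ev
                leader = True
            else:
                leader = False
        if leader:
            try:
                regions = self._fetch_fn(table)   # one meta scan, not N
                with self._lock:
                    self._cache[table] = regions
            finally:
                with self._lock:
                    del self._inflight[table]
                ev.set()
            return regions
        ev.wait()                  # follower: reuse the leader's result
        with self._lock:
            # May be None if the leader's fetch failed; a real client
            # would retry here rather than return None.
            return self._cache.get(table)

    def invalidate(self, table):
        # Called when a split/merge is detected, forcing one fresh fetch.
        with self._lock:
            self._cache.pop(table, None)
```

With this shape, a region split that makes many scans request fresh locations at once results in a single meta lookup instead of a thundering herd.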
[jira] [Commented] (PHOENIX-7253) Perf improvement for non-full scan queries on large table
[ https://issues.apache.org/jira/browse/PHOENIX-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823483#comment-17823483 ] Daniel Wong commented on PHOENIX-7253: -- Hi [~vjasani], I wanted to mention that you need to be a bit careful about this change: the entire "getAllTableRegions" function is a bit of a hack in the first place, due to the restriction on how HBase caches table regions, and it actually serves 2 purposes in Phoenix code... the right fix probably needs an HBase API update as well. I'm trying to find the context in all of my docs/notes/conversations, but we should probably sync on this case prior to implementation changes. > Perf improvement for non-full scan queries on large table > - > > Key: PHOENIX-7253 > URL: https://issues.apache.org/jira/browse/PHOENIX-7253 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.2.0, 5.1.3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > > Any considerably large table with more than 100k regions can give problematic > performance if we access all region locations from meta for the given table > before generating parallel or sequential scans for the given query. The perf > impact can really hurt range scan queries. > Consider a table with hundreds of thousands of tenant views. Unless the query > is a strict point lookup, any query on any tenant view would end up retrieving > region locations of all regions of the base table. If an IOException is > thrown by the HBase client during any region location lookup in meta, we only > perform a single retry. > Proposal: > # All non point lookup queries should only retrieve region locations that > cover the scan boundary. Avoid fetching all region locations of the base > table. > # Make retries configurable with a higher default value. 
> > Sample stacktrace from the multiple failures observed: > {code:java} > java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table regions.Stack > trace: java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table > regions. > at > org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:620) > at > org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:229) > at > org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:781) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.query.DelegateConnectionQueryServices.getAllTableRegions(DelegateConnectionQueryServices.java:87) > at > org.apache.phoenix.iterate.DefaultParallelScanGrouper.getRegionBoundaries(DefaultParallelScanGrouper.java:74) > at > org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:587) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:936) > at > org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:669) > at > org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:555) > at > org.apache.phoenix.iterate.SerialIterators.(SerialIterators.java:69) > at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:374) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:222) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:217) > at > org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:212) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:370) > at > org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:328) > at 
org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:328) > at > org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:320) > at > org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeQuery(PhoenixPreparedStatement.java:188) > ... > ... > Caused by: java.io.InterruptedIOException: Origin: InterruptedException > at > org.apache.hadoop.hbase.util.ExceptionUtil.asInterrupt(ExceptionUtil.java:72) > at > org.apache.hadoop.hbase.client.ConnectionImplementation.takeUserRegionLock(ConnectionImplementation.java:1129) > at > org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:994) > at >
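Proposal #1 in the issue above — "only retrieve region locations that cover the scan boundary" — reduces to a range query over the sorted region start keys. A hedged sketch (Python, hypothetical helper name; real region keys are byte arrays and an empty stop key means "unbounded", which this sketch does not handle):

```python
from bisect import bisect_left, bisect_right

def regions_for_scan(region_starts, scan_start, scan_stop):
    """Return the start keys of only the regions overlapping the scan
    range [scan_start, scan_stop), instead of all regions of the table.

    region_starts: sorted list of region start keys; region i spans
    [region_starts[i], region_starts[i+1]) and the last region is unbounded.
    """
    # The region containing scan_start is the last one starting <= scan_start.
    first = max(bisect_right(region_starts, scan_start) - 1, 0)
    # Regions whose start key is >= scan_stop cannot overlap the scan.
    last = bisect_left(region_starts, scan_stop)
    return region_starts[first:last]
```

For a table with 100k+ regions this turns the pre-scan lookup from O(all regions) into a lookup proportional to the regions the query actually touches, which is the whole point of the proposal for range scans on tenant views.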
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750043#comment-17750043 ] Daniel Wong commented on PHOENIX-5833: -- Thanks [~stoty] for doing this I'm not as active in Phoenix but saw you did this and a bunch of other cleanup today! > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Istvan Toth >Priority: Critical > Fix For: 5.2.0, 5.1.4 > > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
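The expected semantics of the RVC in Queries 2 and 3 above are plain lexicographic tuple comparison, which Python tuples model directly. This sketch (illustrative, not Phoenix code) shows that the row 'RRS_ZYTDT' satisfies both the RVC and the LIKE filter, so Query 3's empty result is indeed wrong:

```python
rows = [("abc", "def", "RRSQ_IMKKL"), ("abc", "def", "RRS_ZYTDT")]

# Query 2: (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') -- lexicographic
# comparison; '_' (0x5F) sorts after 'Q' (0x51), so RRS_ZYTDT qualifies.
q2 = [r for r in rows if r > ("abc", "def", "RRSQ_IMKKL")]

# Query 3 adds C LIKE 'RRS\_%', i.e. a literal underscore after "RRS";
# applying it to Query 2's result should keep the same single row.
q3 = [r for r in q2 if r[2].startswith("RRS_")]
```

Since both filters accept ('abc', 'def', 'RRS_ZYTDT'), combining them with AND should return that row, matching Query 1 and Query 2 individually.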
[jira] [Commented] (PHOENIX-6796) Pluggable query logger in Phoenix
[ https://issues.apache.org/jira/browse/PHOENIX-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610786#comment-17610786 ] Daniel Wong commented on PHOENIX-6796: -- For implementation, my thought was three components: the query logger can be one class composed of a sampler controller (how often and what type of queries), a sanitizer, and a publisher. The default might be: always sample, publish to system.log, and don't sanitize. > Pluggable query logger in Phoenix > - > > Key: PHOENIX-6796 > URL: https://issues.apache.org/jira/browse/PHOENIX-6796 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Minor > > > Requirements > # Pluggable logger or log publisher > ## We don't want to use system.log for multiple reasons, from auditing to > load uncertainty; in addition we need 2 publishing implementations to handle > future/past deployment patterns. (Think of this as taking the same info we > would push to system.log but, for example, using log4j to publish.) > ## We need to sanitize personal information prior to logging for GDPR etc. > (I have some ideas on how to do this easily using the syntax tree, > sanitizing everything by replacing constants with ? or similar.) > # We need some client side performance metrics embedded in the logged > queries. Most of this is there today I believe, but we need to audit. > # We need errors/exceptions with the query logging. > # Ability to log every ddl/dml/dql. > # Logging capability across phoenix jdbc/phoenix spark/phoenix MR/phoenix pig > Some initial discussion on Apache slack > [thread|https://the-asf.slack.com/archives/CPGHNKLSK/p1620026444027500] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
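The sampler/sanitizer/publisher decomposition suggested in the comment can be sketched as three small pluggable interfaces behind one logger class. This is a hedged Python sketch, not the Phoenix implementation: all class names are hypothetical, the regex-based sanitizer is a stand-in for the syntax-tree approach the issue proposes, and printing to stdout stands in for publishing to system.log:

```python
import re

class Sampler:
    """Controls how often and which query types get logged."""
    def should_log(self, sql):
        return True                     # default: always sample

class Sanitizer:
    """Replaces literals with ? so personal data never reaches the log.
    A real implementation would walk the parse tree, not use regexes."""
    _LITERAL = re.compile(r"'[^']*'|\b\d+\b")
    def sanitize(self, sql):
        return self._LITERAL.sub("?", sql)

class Publisher:
    """Destination for sanitized queries (system.log, log4j, ...)."""
    def publish(self, sql):
        print(sql)                      # stdout stands in for system.log

class QueryLogger:
    def __init__(self, sampler=None, sanitizer=None, publisher=None):
        self.sampler = sampler or Sampler()
        self.sanitizer = sanitizer or Sanitizer()
        self.publisher = publisher or Publisher()

    def log(self, sql):
        if self.sampler.should_log(sql):
            self.publisher.publish(self.sanitizer.sanitize(sql))
```

Swapping any component (e.g. a log4j-backed Publisher, or a Sampler that only logs DDL) leaves the other two untouched, which is the pluggability the requirements ask for.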
[jira] [Commented] (PHOENIX-6491) Phoenix High Availability
[ https://issues.apache.org/jira/browse/PHOENIX-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529110#comment-17529110 ] Daniel Wong commented on PHOENIX-6491: -- Initial WIP feature commit from our internal fork. [https://github.com/apache/phoenix/pull/1430] > Phoenix High Availability > - > > Key: PHOENIX-6491 > URL: https://issues.apache.org/jira/browse/PHOENIX-6491 > Project: Phoenix > Issue Type: New Feature > Components: core >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > > This JIRA proposes Phoenix High Availability (HA) feature which allows > Phoenix users to interact with multiple Phoenix/HBase clusters in order to > achieve additional availability compared to a single cluster. In particular > we target the common deployment configuration of 2 HBase clusters with > master/master asynchronous replication enabled between the queried tables, > but with consideration to future extensions in use cases, replication, and > number of clusters. > > Currently targeted use cases: > # Active-Standby HA for disaster recovery, enables end users to switch HBase > clusters (triggered by administrators) collectively across multiple clients > without restarting. > # Active-Active HA for immutable use cases with point get queries without > deletes, enables a client to connect to both clusters simultaneously for > these use cases which inherently have relaxed consistency requirements. > Concepts: > * HA Group - An HA group is an association between a pair of HBase clusters, > a group of Phoenix clients, metadata state, and an HA policy (see below). HA > groups are pre-defined and a client provides the group name when creating a > phoenix connection to the clusters in that HA group. Note that the same pair > of HBase clusters can be in multiple HA groups. This allows clients to be > grouped based on different use cases, availability requirements, consistency > requirements, and load balancing. 
> * HA Policy - Every HA group has an associated HA policy which specifies how > to use the HBase clusters pair. This is implemented by an interface that > replaces the JDBC Connection as well as changes in the public APIs in > QueryServices. Currently, there are 2 policies one for each targeted use case > defined above. It is possible to support more HA policies in future for > incoming use cases. > * Metadata Store - Stores the state / manages the state transitions of an HA > group. For example in the Active-Standby setup the store manages which > cluster is currently Active to which all clients in that HA group should > connect. For a particular HA group an entry is referred to as a Cluster Role > Record. > * HA Client - JDBC implementation as well as a handler for metadata store > state changes. End users will use via PhoenixDriver with JDBC string with > special format {{jdbc:phoenix:[zk1,zk2,zk3|zk1',zk2',zk3']}} and HA group > name in the connection properties. Using such a JDBC connection for creating > {{Statement}} or querying a {{ResultSet}} does not require any application > code change. Internally, the implementation will serve incoming client > operation requests according to the HA policy of that group. > More details to come with a detailed design document. -- This message was sent by Atlassian Jira (v8.20.7#820007)
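The JDBC string format described above, {{jdbc:phoenix:[zk1,zk2,zk3|zk1',zk2',zk3']}}, pairs two ZooKeeper quorums in one URL. A minimal parsing sketch (Python, hypothetical helper; the real parsing lives inside the HA client's PhoenixDriver support, and this ignores port/znode suffixes):

```python
def parse_ha_url(url):
    """Split an HA JDBC URL of the form
    jdbc:phoenix:[zkA1,zkA2,zkA3|zkB1,zkB2,zkB3]
    into the two clusters' ZooKeeper quorums."""
    prefix = "jdbc:phoenix:"
    if not url.startswith(prefix):
        raise ValueError("not a phoenix JDBC url")
    body = url[len(prefix):]
    if not (body.startswith("[") and body.endswith("]") and "|" in body):
        raise ValueError("not an HA url: expected [quorum1|quorum2]")
    left, right = body[1:-1].split("|", 1)
    return left.split(","), right.split(",")
```

Which of the two quorums a given Statement actually talks to is then decided by the HA policy of the group named in the connection properties, not by position in the URL.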
[jira] [Comment Edited] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507901#comment-17507901 ] Daniel Wong edited comment on PHOENIX-6669 at 3/16/22, 11:28 PM: - The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. I do agree the wrong result is the critical issue, of course, but I want us to implement the full fix. was (Author: dbwong): The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where PK1 ='xx' > . . . . . . . . . . . . .> and (PK1 > 'xx' AND PK1 <= 'xx') > . . . . . . . . . . . . .> and (PK2 > 5 AND PK2 <=5) > . . . . . . . . . . . . .> and (PK3 > 2 AND PK3 <=2); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . 
.> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . .> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> explain select PK2 from DUMMY where PK1 ='xx' and > (PK1 > 'xx' AND PK1 <= 'xx') and (PK2 > 5 AND PK2 <=5) and (PK3 > 2 AND PK3 > <=2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.015 seconds){code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507901#comment-17507901 ] Daniel Wong commented on PHOENIX-6669: -- The rewrite is still not correct. However, my main concern, and why I mention this, is that without understanding exactly what the RVC is supposed to accomplish, the test cases may not cover the set of issues, or the code changes may be fragile. > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where PK1 ='xx' > . . . . . . . . . . . . .> and (PK1 > 'xx' AND PK1 <= 'xx') > . . . . . . . . . . . . .> and (PK2 > 5 AND PK2 <=5) > . . . . . . . . . . . . .> and (PK3 > 2 AND PK3 <=2); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . 
.> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> explain select PK2 from DUMMY where PK1 ='xx' and > (PK1 > 'xx' AND PK1 <= 'xx') and (PK2 > 5 AND PK2 <=5) and (PK3 > 2 AND PK3 > <=2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.015 seconds){code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6669) RVC returns a wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507784#comment-17507784 ] Daniel Wong commented on PHOENIX-6669: -- [~yanxinyi] careful with your transformations; I'm worried you may not be testing cleanly. Why not use the actual query rewrite rather than these non-equivalent rewrites? > RVC returns a wrong result > -- > > Key: PHOENIX-6669 > URL: https://issues.apache.org/jira/browse/PHOENIX-6669 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Major > > {code:java} > CREATE TABLE IF NOT EXISTS DUMMY ( > PK1 VARCHAR NOT NULL, > PK2 BIGINT NOT NULL, > PK3 BIGINT NOT NULL, > PK4 VARCHAR NOT NULL, > COL1 BIGINT, > COL2 INTEGER, > COL3 VARCHAR, > COL4 VARCHAR, CONSTRAINT PK PRIMARY KEY > ( > PK1, > PK2, > PK3, > PK4 > ) > );UPSERT INTO DUMMY (PK1, PK4, COL1, PK2, COL2, PK3, COL3, COL4) > VALUES ('xx', 'xid1', 0, 7, 7, 7, 'INSERT', null); > {code} > The non-RVC query returns no row, but the RVC query returns a wrong result. > {code:java} > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and ((PK2 > 5 AND PK2 <=5)) > . . . . . . . . . . . . .> and ((PK3 > 2 AND PK3 <=2)); > +--+ > | PK2 | > +--+ > +--+ > No rows selected (0.022 seconds) > 0: jdbc:phoenix:localhost> select PK2 > . . . . . . . . . . . . .> from DUMMY > . . . . . . . . . . . . .> where (PK1 = 'xx') > . . . . . . . . . . . . .> and (PK1, PK2, PK3) > ('xx', 5, 2) > . . . . . . . . . . . . 
.> and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+ > | PK2 | > +--+ > | 7 | > +--+ > 1 row selected (0.033 seconds) {code} > {code:java} > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and (PK1, PK2, PK3) > ('xx', 5, 2) and (PK1, PK2, PK3) <= ('xx', 5, 2); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['xx'] | > null | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | | > +--+--+--+--+ > 2 rows selected (0.024 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN select PK2 from DUMMY where (PK1 = 'xx') > and ((PK2 > 5 AND PK2 <=5)) and ((PK3 > 2 AND PK3 <=2)); > +--+--+--+--+ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | | > +--+--+--+--+ > | DEGENERATE SCAN OVER DUMMY | null > | null | | > +--+--+--+--+ > 1 row selected (0.005 seconds) {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
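Both queries quoted above are actually unsatisfiable: the per-column form asks for values in empty intervals like 5 < PK2 <= 5, and the RVC form asks for a tuple strictly greater than and also less-than-or-equal-to the same bound. A quick Python check (illustrative; tuples model RVC lexicographic comparison) confirms neither predicate can accept the upserted row, so the RVC query returning PK2 = 7 is the bug:

```python
row = ("xx", 7, 7)          # the row upserted in the issue description
lo = ("xx", 5, 2)

# Per-column conjunction from the first query: 5 < PK2 <= 5 and
# 2 < PK3 <= 2 are both empty intervals, so no row can qualify.
per_column = (5 < row[1] <= 5) and (2 < row[2] <= 2)

# RVC pair from the second query: lo < (PK1,PK2,PK3) <= lo is an
# empty tuple interval for any total order, so it is also unsatisfiable.
rvc = lo < row <= lo
```

Phoenix correctly plans the first form as a DEGENERATE SCAN but fails to detect the same contradiction in the RVC form, which is why the two plans and results diverge.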
[jira] [Comment Edited] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong edited comment on PHOENIX-6659 at 3/3/22, 1:37 AM: --- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('b',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. was (Author: dbwong): [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. 
> RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); >
[jira] [Comment Edited] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong edited comment on PHOENIX-6659 at 3/3/22, 1:34 AM: --- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1);{code} and {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1);{code} works but consider {code:java} SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); {code} these are not equivalent. was (Author: dbwong): [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1); and SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1); works but consider SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); these are not equivalent. 
> RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > |
[jira] [Commented] (PHOENIX-6659) RVC with AND clauses return incorrect result
[ https://issues.apache.org/jira/browse/PHOENIX-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500447#comment-17500447 ] Daniel Wong commented on PHOENIX-6659: -- [~yanxinyi] good catch but one thing you need to be careful on is your reduction isn't always going to be valid. SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK1,PK2,PK3) <= ('a',3,1); and SELECT * FROM DUMMY WHERE (PK1 = 'a') AND (PK2,PK3) <= (3,1); works but consider SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK1,PK2,PK3) <= ('a',3,1); SELECT * FROM DUMMY WHERE (PK1 = 'a' OR PK1 = 'b' ) AND (PK2,PK3) <= (3,1); these are not equivalent. > RVC with AND clauses return incorrect result > > > Key: PHOENIX-6659 > URL: https://issues.apache.org/jira/browse/PHOENIX-6659 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.16.1 >Reporter: Xinyi Yan >Priority: Critical > Attachments: Screen Shot 2022-03-01 at 1.26.44 PM.png > > > CREATE TABLE DUMMY (PK1 VARCHAR NOT NULL, PK2 BIGINT NOT NULL, PK3 BIGINT NOT > NULL CONSTRAINT PK PRIMARY KEY (PK1,PK2,PK3)); > UPSERT INTO DUMMY VALUES ('a',0,1); > UPSERT INTO DUMMY VALUES ('a',1,1); > UPSERT INTO DUMMY VALUES ('a',2,1); > UPSERT INTO DUMMY VALUES ('a',3,1); > UPSERT INTO DUMMY VALUES ('a',3,2); > UPSERT INTO DUMMY VALUES ('a',4,1); > > {code:java} > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > +--+--++ > No rows selected (0.045 seconds) > 0: jdbc:phoenix:localhost> explain SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK1,PK2,PK3) <= ('a',3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',-9187343239835811840] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | > +--+--++ > 2 rows selected (0.012 seconds) > 0: jdbc:phoenix:localhost> SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > 
+--+--++ > | PK1 | PK2 > | PK3 | > +--+--++ > | a | 0 > | 1 | > | a | 1 > | 1 | > | a | 2 > | 1 | > | a | 3 > | 1 | > +--+--++ > 4 rows selected (0.014 seconds) > 0: jdbc:phoenix:localhost> EXPLAIN SELECT * FROM DUMMY WHERE (PK1 = 'a') AND > (PK2,PK3) <= (3,1); > +--+--++ > | PLAN | EST_BYTES_READ > | EST_ROWS_READ | > +--+--++ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DUMMY ['a',*] - > ['a',3] | null | > | SERVER FILTER BY FIRST KEY ONLY | null > | null | >
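The non-equivalence called out in the comment above can be checked mechanically. Below is a minimal plain-Java sketch (hypothetical illustration, not Phoenix code) of lexicographic row-value-constructor comparison: for a row ('b', 2, 0), which satisfies PK1 = 'a' OR PK1 = 'b', the original predicate (PK1,PK2,PK3) <= ('a',3,1) is false, while the reduced predicate (PK2,PK3) <= (3,1) is true.

```java
import java.util.Arrays;

public class RvcReduction {

    // Lexicographic comparison of two equal-length tuples, as an RVC does.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareTuples(Comparable[] a, Comparable[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Row ('b', 2, 0): satisfies PK1 = 'a' OR PK1 = 'b'.
        Comparable[] row = {"b", 2, 0};

        // Original predicate: (PK1, PK2, PK3) <= ('a', 3, 1) -> false ('b' > 'a').
        boolean original = compareTuples(row, new Comparable[]{"a", 3, 1}) <= 0;

        // Reduced predicate: (PK2, PK3) <= (3, 1) -> true ((2,0) < (3,1)).
        boolean reduced = compareTuples(
                Arrays.copyOfRange(row, 1, 3), new Comparable[]{3, 1}) <= 0;

        // The reduction changed the answer for this row, so dropping PK1 from
        // the RVC is only valid when PK1 is pinned to a single value.
        System.out.println("original=" + original + ", reduced=" + reduced);
    }
}
```

So the reduction is only safe when the leading RVC columns are constrained to exactly one value, as in the (PK1 = 'a') case.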
[jira] [Commented] (PHOENIX-6604) Allow using indexes for wildcard topN queries on salted tables
[ https://issues.apache.org/jira/browse/PHOENIX-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453471#comment-17453471 ] Daniel Wong commented on PHOENIX-6604: -- Some of us are still watching Lars. > Allow using indexes for wildcard topN queries on salted tables > -- > > Key: PHOENIX-6604 > URL: https://issues.apache.org/jira/browse/PHOENIX-6604 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.1.2 >Reporter: Lars Hofhansl >Priority: Major > Fix For: 5.1.3 > > Attachments: 6604-1.5.1.3, 6604.5.1.3 > > > Just randomly came across this, playing with TPCH data. > {code:java} > CREATE TABLE lineitem ( > orderkey bigint not null, > partkey bigint, > suppkey bigint, > linenumber integer not null, > quantity double, > extendedprice double, > discount double, > tax double, > returnflag varchar(1), > linestatus varchar(1), > shipdate date, > commitdate date, > receiptdate date, > shipinstruct varchar(25), > shipmode varchar(10), > comment varchar(44) > constraint pk primary key(orderkey, linenumber)) > IMMUTABLE_ROWS=true,SALT_BUCKETS=4; > CREATE LOCAL INDEX l_shipdate ON lineitem(shipdate);{code} > Now: > {code:java} > > explain select * from lineitem order by shipdate limit 1; > +---+ > | PLAN > | > +---+ > | CLIENT 199-CHUNK 8859938 ROWS 2044738843 BYTES PARALLEL 199-WAY FULL SCAN > OVER LI | > | SERVER TOP 1 ROW SORTED BY [SHIPDATE] > | > | CLIENT MERGE SORT > | > | CLIENT LIMIT 1 > | > +---+ > 4 rows selected (6.525 seconds) > -- SAME COLUMNS! 
> > explain select ORDERKEY, PARTKEY, SUPPKEY, LINENUMBER, QUANTITY, > > EXTENDEDPRICE, DISCOUNT, TAX, RETURNFLAG, LINESTATUS, SHIPDATE, COMMITDATE, > > RECEIPTDATE, SHIPINSTRUCT, SHIPMODE, COMMENT from lineitem order by > > shipdate limit 1; > +---+ > | > | > +---+ > | CLIENT 4-CHUNK 4 ROWS 204 BYTES PARALLEL 4-WAY RANGE SCAN OVER LINEITEM [1] > | > | SERVER MERGE [0.PARTKEY, 0.SUPPKEY, 0.QUANTITY, 0.EXTENDEDPRICE, > 0.DISCOUNT, | > | SERVER FILTER BY FIRST KEY ONLY > | > | SERVER 1 ROW LIMIT > | > | CLIENT MERGE SORT > | > | CLIENT 1 ROW LIMIT > | > +---+ > 6 rows selected (2.736 seconds){code} > > The same happens with a covered global index. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6440) failed to write to multiple index table issue
[ https://issues.apache.org/jira/browse/PHOENIX-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441383#comment-17441383 ] Daniel Wong commented on PHOENIX-6440: -- [~giskender] I assume this was fixed in the later 4.14 versions? > failed to write to multiple index table issue > - > > Key: PHOENIX-6440 > URL: https://issues.apache.org/jira/browse/PHOENIX-6440 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.14.0 > Environment: Our environment is: >RHEL: 7.3 64bits >JDK :180_152 64bits > zookeeper: 3.4.10 > HADOOP: 2.7.1 64bits >HBASE 1.2.6 64bits >PHOENIX:4.14.0 >Reporter: Gary >Priority: Major > > When we load data into phoenix(psql.py load csv file) table A, and at the > same time we drop and create table B,in hbase log, we can > find one ERROR,that says index issue: > ERROR 1121(XCL21): Write to the index failed. > disableIndexOnFailure=true,Failed to write multiple index tables > Any ideas for this issue? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6494) Delete data and insert immediately cause the data inconsistently
[ https://issues.apache.org/jira/browse/PHOENIX-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441382#comment-17441382 ] Daniel Wong commented on PHOENIX-6494: -- Not sure what was happening here. Going to mark this as fixed in the above version for now. > Delete data and insert immediately cause the data inconsistently > > > Key: PHOENIX-6494 > URL: https://issues.apache.org/jira/browse/PHOENIX-6494 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 5.1.1 > Environment: Phoenix 5.1.1 > Hbase 2.3.5 > Hadoop 2.10.1 >Reporter: Future.Zhang >Priority: Major > > I have a table named “TEST1”, it contain 30 record. > I use SpringJDBC to connect phoenix. > > My Java code like below > {code:java} > //代码占位符 > //First, Delete all data > phoenixJdbcTemplate.execute("DELETE FROM TEST1"); > //Second, use alibaba datax to sync data (30 record) into "TEST1" table. > JavaRemoteShell.execute("python /opt/data/bin/datax sync_data.json"); > {code} > after, i query 'TEST1' table, it only 19 record into the table. > > Is it a phoenix bug? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6549) Connection leak checks sometimes fail in tests
[ https://issues.apache.org/jira/browse/PHOENIX-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441379#comment-17441379 ] Daniel Wong commented on PHOENIX-6549: -- Certain paths use both; specifically, for upsert-select and delete we should test both internal and non-internal connections. We should likely handle both. > Connection leak checks sometimes fail in tests > -- > > Key: PHOENIX-6549 > URL: https://issues.apache.org/jira/browse/PHOENIX-6549 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 5.2.0 >Reporter: Istvan Toth >Priority: Major > Attachments: > org.apache.phoenix.end2end.UpsertSelectIT-output.txt.bz2, > org.apache.phoenix.end2end.UpsertSelectIT.txt.gz > > > In rare cases we see > {code:java} > GLOBAL_OPEN_PHOENIX_CONNECTIONS.getMetric().getValue() > {code} > returning 1 instead of 0, even though the test class uses try-with-resources > blocks to handle the Phoenix Connections. > I'm attaching the output from a recent test run on ASF Jenkins -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (PHOENIX-6564) Add Phoenix Connection Failed Global Metric
[ https://issues.apache.org/jira/browse/PHOENIX-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441378#comment-17441378 ] Daniel Wong commented on PHOENIX-6564: -- Per discussion I've assigned to you [~vmeka] > Add Phoenix Connection Failed Global Metric > --- > > Key: PHOENIX-6564 > URL: https://issues.apache.org/jira/browse/PHOENIX-6564 > Project: Phoenix > Issue Type: New Feature > Components: core >Affects Versions: 4.16.1, 5.1.2 >Reporter: Daniel Wong >Assignee: vikas meka >Priority: Minor > > Add Phoenix Connection Failed Global Metric. We have > PHOENIX_CONNECTIONS_ATTEMPTED_COUNTER as well as a > PHOENIX_CONNECTIONS_THROTTLED_COUNTER but this doesn't cover all possible > failures under attempted during DriverManager.getConnection. This Jira is to > provide a counter that covers all failures and to ensure that attempted also > covers as wide of a net as possible. This would be a counter metric that > increments when DriverManager.getConnection fails to return a jdbc connection > to the user. -- This message was sent by Atlassian Jira (v8.20.1#820001)
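As a rough illustration of the proposed metric's semantics (counter names here are hypothetical, not the actual Phoenix metric names), a wrapper around DriverManager.getConnection could count every attempt and increment a failure counter whenever no JDBC connection is handed back to the caller, whatever the cause:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicLong;

public class ConnectionMetricsSketch {

    // Hypothetical counters mirroring the proposed global client metrics;
    // these are not the real Phoenix metric names.
    static final AtomicLong ATTEMPTED = new AtomicLong();
    static final AtomicLong FAILED = new AtomicLong();

    // Count every attempt, and count a failure whenever getConnection does
    // not return a JDBC connection to the caller, whatever the cause.
    static Connection getConnectionCounted(String url) throws SQLException {
        ATTEMPTED.incrementAndGet();
        try {
            return DriverManager.getConnection(url);
        } catch (SQLException | RuntimeException e) {
            FAILED.incrementAndGet();
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            getConnectionCounted("jdbc:bogus:nowhere"); // no such driver registered
        } catch (SQLException expected) {
            // "No suitable driver" -> counted as a failed attempt.
        }
        System.out.println("attempted=" + ATTEMPTED.get() + ", failed=" + FAILED.get());
    }
}
```

This is the widest possible net: throttling, driver errors, and any runtime failure during connection setup all land in the same failed counter.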
[jira] [Commented] (PHOENIX-4922) CONVERT_TZ with date 0001-01-01
[ https://issues.apache.org/jira/browse/PHOENIX-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441376#comment-17441376 ] Daniel Wong commented on PHOENIX-4922: -- I was looking at some issues labeled beginner/noob and saw this one. The big date/time project that was being done would probably resolve this. [~stoty] I can't remember who was leading those changes; can we consolidate/remove the duplicate issues? There are PHOENIX-5066, this one, and maybe some more. I've removed the beginner labels from this as there was some discussion on how to resolve it. > CONVERT_TZ with date 0001-01-01 > --- > > Key: PHOENIX-4922 > URL: https://issues.apache.org/jira/browse/PHOENIX-4922 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0 > Environment: SLF4J: Found binding in > [jar:file:/opt/cloudera/parcels/APACHE_PHOENIX-4.14.0-cdh5.14.2.p0.3/lib/phoenix/phoenix-4.14.0-cdh5.14.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] > >Reporter: Eduardo >Priority: Major > Labels: beginner > > Hi all, > We are storing dates from DB2 (field is a DATE) to Phoenix (field is a > DATE) and the conversion function CONVERT_TZ handles these > three different records this way: > 1) Only "select FECSALDO from table;" without conversion > 2011-05-16 23:00:00.000 (this is a summer time date) > 2016-03-02 00:00:00.000 (this is a winter time date) > 0001-01-01 00:00:00.000 > 2) Using > "select CONVERT_TZ(FECSALDO, 'UTC', 'Europe/London') AS FECSALDO from table"; > 2011-05-17 00:00:00.000 -> OK > 2016-03-02 00:00:00.000 -> OK > {color:#FF}0001-12-31 23:58:45.000 . -> WTF{color} > -- This message was sent by Atlassian Jira (v8.20.1#820001)
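One plausible explanation for the 23:58:45 result, assuming the tz database rules are applied to a pre-1847 instant: for dates before a zone's first recorded transition, Europe/London falls back to local mean time (LMT, -00:01:15), so converting 0001-01-01 00:00:00 UTC shifts the value back 75 seconds. A small java.time check:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class LmtOffsetCheck {
    public static void main(String[] args) {
        // For instants before a zone's first recorded transition, the tz
        // database falls back to local mean time; for Europe/London that is
        // -00:01:15, i.e. 75 seconds behind UTC.
        Instant ancient = Instant.parse("0001-01-01T00:00:00Z");
        ZoneOffset offset = ZoneId.of("Europe/London").getRules().getOffset(ancient);
        // 0001-01-01 00:00:00 UTC therefore renders as ...23:58:45 local time,
        // which matches the surprising value in this report.
        System.out.println(offset); // -00:01:15
    }
}
```

Whether Phoenix should apply LMT-era offsets at all (rather than the zone's modern standard offset) is exactly the kind of question the larger date/time project would need to settle.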
[jira] [Commented] (PHOENIX-6571) Hang of connections after OutOfMemoryError
[ https://issues.apache.org/jira/browse/PHOENIX-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426343#comment-17426343 ] Daniel Wong commented on PHOENIX-6571: -- Please attach the logs to the issue directly. Can you take a heapdump or setup for a hprof and understand what objects are leaking? Examples we have seen in the past include ZK thread leak, Phoenix regionInfos etc. > Hang of connections after OutOfMemoryError > --- > > Key: PHOENIX-6571 > URL: https://issues.apache.org/jira/browse/PHOENIX-6571 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.16.1 >Reporter: Aleksey Stavrov >Priority: Major > > We often have connection hangs after OOM. Hangs often occur on locks. > I made a test case locally that reproduces this problem, but I don't know > Java, so I have a complex bundle (perl -> dbd_jdbc -> hbase/phoenix) to use > phoenix, so because of this i am not showing a minimal example to reproduce > this behavior. > Here is the error log shown by dbd_jdbc: [https://pastebin.com/sHcH2iVq] > After the OOM happened, all connections to dbd_jdbc hang and very often have > such a stack trace: > {noformat} > "Thread-18" #197 prio=5 os_prio=0 cpu=21.70ms elapsed=1822.79s > tid=0x7ffaa4124800 nid=0x1c84 waiting on condition [0x7ff9c0ecc000] > > >java.lang.Thread.State: WAITING (parking) > > > > at jdk.internal.misc.Unsafe.park(java.base@11.0.12/Native Method) > > > > - parking to wait for <0xf82e9998> (a > java.util.concurrent.locks.ReentrantLock$NonfairSync) > > > at > java.util.concurrent.locks.LockSupport.park(java.base@11.0.12/LockSupport.java:194) > > > > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.12/AbstractQueuedSynchronizer.java:885) > > > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.12/AbstractQueuedSynchronizer.java:917) > > > at > 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.12/AbstractQueuedSynchronizer.java:1240) > > > at > java.util.concurrent.locks.ReentrantLock.lock(java.base@11.0.12/ReentrantLock.java:267) > > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1326) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1230) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1214) > > > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1171) > >
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352203#comment-17352203 ] Daniel Wong commented on PHOENIX-5812: -- Coming back to this a bit, wondering if we want one Jira or two for Phoenix connection-level reaping vs. HConnection-level reaping? [~elserj] [~jisaac] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > Labels: quality-improvement > > Phoenix may keep a maximum default number of connections; badly > performing client calls or internal errors (see PHOENIX-5802) can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection-reaper-like task to reap idle connections. > Definition of "Idle": > A simple definition may be time-based: if a connection has been open for a > configurable number of minutes, simply close it. > A more complicated solution may keep track of the last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
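The "last interaction time" variant of the reaper can be sketched in a few lines (names are hypothetical; a real implementation would run the pass on a ScheduledExecutorService and actually close the underlying PhoenixConnection rather than just dropping it from a map):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdleConnectionReaper {

    // Hypothetical registry: connection id -> last interaction time (epoch ms).
    final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    final long maxIdleMs;

    IdleConnectionReaper(long maxIdleMs) {
        this.maxIdleMs = maxIdleMs;
    }

    // Called on every statement execution to record activity.
    void touch(String connectionId) {
        lastUsed.put(connectionId, System.currentTimeMillis());
    }

    // One reaper pass; a real implementation would schedule this periodically
    // and close each reaped PhoenixConnection.
    int reapOnce(long now) {
        int reaped = 0;
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            if (now - e.getValue() > maxIdleMs) {
                lastUsed.remove(e.getKey());
                reaped++;
            }
        }
        return reaped;
    }

    public static void main(String[] args) {
        IdleConnectionReaper reaper = new IdleConnectionReaper(60_000L);
        reaper.lastUsed.put("conn-1", System.currentTimeMillis() - 120_000L); // idle 2 min
        reaper.touch("conn-2"); // just used
        System.out.println("reaped=" + reaper.reapOnce(System.currentTimeMillis()));
    }
}
```

The simpler open-time-based definition would only need the creation timestamp instead of touch() calls, which is one argument for splitting the two designs into separate Jiras.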
[jira] [Commented] (PHOENIX-6456) Support query logging for DDL and DML
[ https://issues.apache.org/jira/browse/PHOENIX-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334923#comment-17334923 ] Daniel Wong commented on PHOENIX-6456: -- Can we consider grouping the logging parameters in QueryServices? It is not clear what the audit log is; can we group these into the QueryLogger as opposed to grouping them into the Phoenix connection and statements? > Support query logging for DDL and DML > - > > Key: PHOENIX-6456 > URL: https://issues.apache.org/jira/browse/PHOENIX-6456 > Project: Phoenix > Issue Type: Task > Components: core >Reporter: Richárd Antal >Assignee: Richárd Antal >Priority: Major > > Add support for DDL and DML logging. > Previously only SELECT query logging was available. > In some situations it can be useful to have support for logging other kinds > of SQL queries as well. > It should be disabled by default because of the write overhead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5872) Close Internal Phoenix Connections that were running during cancel
[ https://issues.apache.org/jira/browse/PHOENIX-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276511#comment-17276511 ] Daniel Wong commented on PHOENIX-5872: -- Thanks [~stoty] for the help. Hopefully the patches weren't too troublesome. Sorry I didn't get to it earlier. > Close Internal Phoenix Connections that were running during cancel > -- > > Key: PHOENIX-5872 > URL: https://issues.apache.org/jira/browse/PHOENIX-5872 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.3, 4.x >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5872.4.x.patch, PHOENIX-5872v2.4.x.patch, > PHOENIX-5872v3.4.x.patch, PHOENIX-5872v3.master.patch, > PHOENIX-5872v4.4.x.patch, PHOENIX-5872v5.4.x.patch > > Time Spent: 6h 40m > Remaining Estimate: 0h > > 3 part approach: > 1 don't count internal phoenix connections toward the client limit. > 2 count internal phoenix connections toward a newly defined limit > 3 track parent and child relationships between connections to close those > connections -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241988#comment-17241988 ] Daniel Wong commented on PHOENIX-5833: -- Trying the old 4.x patch first. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Fix For: 4.16.0 > > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. 
> {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5436) Histogram view of Global Client Metrics
[ https://issues.apache.org/jira/browse/PHOENIX-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206557#comment-17206557 ] Daniel Wong commented on PHOENIX-5436: -- Hey, we should combine this with the work [~abhishek.sen], [~rushabh.shah] and [~vmeka] are working on. Can we close https://issues.apache.org/jira/browse/PHOENIX-5838 perhaps as a duplicate, or the other way around? Same with the pair of https://issues.apache.org/jira/browse/PHOENIX-5837 and https://issues.apache.org/jira/browse/PHOENIX-5469 . Can we come to an agreement and close the duplicates? FYI [~ckulkarni] > Histogram view of Global Client Metrics > --- > > Key: PHOENIX-5436 > URL: https://issues.apache.org/jira/browse/PHOENIX-5436 > Project: Phoenix > Issue Type: Improvement >Reporter: abhishek sen >Priority: Minor > Labels: histogram, metric-collector, metrics > Attachments: > Histogram_View_of_Global_Client_metrics_log_typo_corrected.patch, > Histogram_View_of_Global_Client_metrics_removing_star_imports.patch, > Histogram_View_of_some_Global_Client_metrics.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Currently the > [GlobalClientMetrics|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/GlobalClientMetrics.java] > enum-class defines Phoenix Global Client Metrics as an enum and each > metric-enum tracks a [Phoenix > Metric|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/Metric.java] > object implemented by the > [AtomicMetric|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/AtomicMetric.java] > class. This tracks a single counter, but in some use cases we want the > distribution of a particular metric value over some period of time.
One > example could be the metric > [TASK_EXECUTION_TIME|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/MetricType.java#L47]. > The Global Client metric tracks the aggregate value for the execution time > of the task. More useful information to monitor would be the distribution of > task execution time instead. > Now, in order to incorporate histogram view of the metric, we can use the > [Histogram > Implementation|https://github.com/apache/hbase/blob/master/hbase-metrics/src/main/java/org/apache/hadoop/hbase/metrics/impl/HistogramImpl.java] > from hbase-metrics-api package. The current GlobalClientMetric also > ultimately adapts to hbase-metrics-api interface but it > [implements|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/monitoring/GlobalClientMetrics.java#L155] > the > [Gauge|https://github.com/apache/hbase/blob/master/hbase-metrics-api/src/main/java/org/apache/hadoop/hbase/metrics/Gauge.java] > class which can only keep track of a single value not a histogram. So one > way could be to create a new GlobalClientHistogramMetrics enum class which > keeps track of a Histogram (based on > [MutableTimeHistogram|https://github.com/apache/hbase/blob/master/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/lib/MutableTimeHistogram.java]) > for each enum-metric instead of a single counter as it is now. The updating > of the these new metric can go > [here|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/job/JobManager.java#L289] > within JobManager class where the currently available Global as well as > request-level metrics are being updated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
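The bucketing idea behind a time histogram can be sketched in plain Java (a drastic simplification of hbase-metrics' MutableTimeHistogram: exponential millisecond buckets only, no snapshots or quantiles):

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class LatencyHistogram {

    // Exponential millisecond buckets: bucket b holds values whose highest
    // set bit is bit (b-1), i.e. [4,8) lands in bucket 3, [64,128) in
    // bucket 7, [128,256) in bucket 8. Overflow is clamped to the last bucket.
    final AtomicLongArray buckets = new AtomicLongArray(16);

    void record(long millis) {
        // Bucket index = position of the highest set bit, clamped to range.
        int b = 64 - Long.numberOfLeadingZeros(Math.max(millis, 1));
        buckets.incrementAndGet(Math.min(b, buckets.length() - 1));
    }

    public static void main(String[] args) {
        LatencyHistogram h = new LatencyHistogram();
        for (long ms : new long[]{5, 7, 120, 130, 131}) {
            h.record(ms); // e.g. TASK_EXECUTION_TIME samples
        }
        // 5 and 7 land in bucket 3; 120 in bucket 7; 130 and 131 in bucket 8.
        System.out.println(h.buckets.get(3) + " " + h.buckets.get(7) + " " + h.buckets.get(8));
    }
}
```

Unlike the single Gauge-backed counter, a few such bucket counters are enough to see whether a metric's aggregate is driven by many fast samples or a few slow outliers.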
[jira] [Commented] (PHOENIX-5872) Close Internal Phoenix Connections that were running during cancel
[ https://issues.apache.org/jira/browse/PHOENIX-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196561#comment-17196561 ] Daniel Wong commented on PHOENIX-5872: -- [~gjacoby] there is a master patch attached but hadoop QA was being extremely unhelpful in running consistently at the time and we did not believe the failures to be patch related. Looks like master patch may not even apply now? > Close Internal Phoenix Connections that were running during cancel > -- > > Key: PHOENIX-5872 > URL: https://issues.apache.org/jira/browse/PHOENIX-5872 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.3, 4.x >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Attachments: PHOENIX-5872.4.x.patch, PHOENIX-5872.master.patch, > PHOENIX-5872v2.4.x.patch, PHOENIX-5872v3.4.x.patch, PHOENIX-5872v4.4.x.patch, > PHOENIX-5872v5.4.x.patch > > Time Spent: 6h 40m > Remaining Estimate: 0h > > 3 part approach: > 1 don't count internal phoenix connections toward the client limit. > 2 count internal phoenix connections toward a newly defined limit > 3 track parent and child relationships between connections to close those > connections -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5066) The TimeZone is incorrectly used during writing or reading data
[ https://issues.apache.org/jira/browse/PHOENIX-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196547#comment-17196547 ] Daniel Wong commented on PHOENIX-5066: -- Left some initial comments. > The TimeZone is incorrectly used during writing or reading data > --- > > Key: PHOENIX-5066 > URL: https://issues.apache.org/jira/browse/PHOENIX-5066 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Richard Antal >Priority: Critical > Fix For: 5.1.1, 4.16.0 > > Attachments: DateTest.java, PHOENIX-5066.4x.v1.patch, > PHOENIX-5066.4x.v2.patch, PHOENIX-5066.4x.v3.patch, > PHOENIX-5066.master.v1.patch, PHOENIX-5066.master.v2.patch, > PHOENIX-5066.master.v3.patch, PHOENIX-5066.master.v4.patch, > PHOENIX-5066.master.v5.patch, PHOENIX-5066.master.v6.patch > > Time Spent: 20m > Remaining Estimate: 0h > > We have two methods to write data when uses JDBC API. > #1. Uses _the exceuteUpdate_ method to execute a string that is an upsert SQL. > #2. Uses the _prepareStatement_ method to set some objects and execute. > The _string_ data needs to convert to a new object by the schema information > of tables. we'll use some date formatters to convert string data to object > for Date/Time/Timestamp types when writes data and the formatters are used > when reads data as well. > > *Uses default timezone test* > Writing 3 records by the different ways. > {code:java} > UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 > 15:40:47','2018-12-10 15:40:47') > UPSERT INTO date_test VALUES (2,to_date('2018-12-10 > 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47')) > stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, > time);stmt.setTimestamp(4, ts); > {code} > Reading the table by the getObject(getDate/getTime/getTimestamp) methods. 
> {code:java} > 1 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0 > 2 | 2018-12-10 | 23:45:07 | 2018-12-10 23:45:07.0 > 3 | 2018-12-10 | 15:45:07 | 2018-12-10 15:45:07.66 > {code} > Reading the table by the getString methods > {code:java} > 1 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 > 15:45:07.000 > 2 | 2018-12-10 15:45:07.000 | 2018-12-10 15:45:07.000 | 2018-12-10 > 15:45:07.000 > 3 | 2018-12-10 07:45:07.660 | 2018-12-10 07:45:07.660 | 2018-12-10 > 07:45:07.660 > {code} > *Uses GMT+8 test* > Writing 3 records by the different ways. > {code:java} > UPSERT INTO date_test VALUES (1,'2018-12-10 15:40:47','2018-12-10 > 15:40:47','2018-12-10 15:40:47') > UPSERT INTO date_test VALUES (2,to_date('2018-12-10 > 15:40:47'),to_time('2018-12-10 15:40:47'),to_timestamp('2018-12-10 15:40:47')) > stmt.setInt(1, 3);stmt.setDate(2, date);stmt.setTime(3, > time);stmt.setTimestamp(4, ts); > {code} > Reading the table by the getObject(getDate/getTime/getTimestamp) methods. > {code:java} > 1 | 2018-12-10 | 23:40:47 | 2018-12-10 23:40:47.0 > 2 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.0 > 3 | 2018-12-10 | 15:40:47 | 2018-12-10 15:40:47.106 {code} > Reading the table by the getString methods > {code:java} > 1 | 2018-12-10 23:40:47.000 | 2018-12-10 23:40:47.000 | 2018-12-10 > 23:40:47.000 > 2 | 2018-12-10 15:40:47.000 | 2018-12-10 15:40:47.000 | 2018-12-10 > 15:40:47.000 > 3 | 2018-12-10 15:40:47.106 | 2018-12-10 15:40:47.106 | 2018-12-10 > 15:40:47.106 > {code} > > _We_ have a historical problem, we'll parse the string to > Date/Time/Timestamp objects with timezone in #1, which means the actual data > is going to be changed when stored in HBase table。 -- This message was sent by Atlassian Jira (v8.3.4#803005)
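The core of the inconsistency described above can be reproduced without Phoenix: the same timestamp string maps to different epoch values depending on the time zone the parsing formatter uses, which is why the string-upsert path (#1) and the prepareStatement path (#2) disagree on read-back. A small sketch (the formatter pattern here is illustrative, not Phoenix's actual parser):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TzParseDemo {

    // Parse the same literal with two different formatter time zones and
    // return how far apart the resulting epoch values are, in hours.
    static long diffHours() {
        try {
            SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            utc.setTimeZone(TimeZone.getTimeZone("GMT"));
            SimpleDateFormat gmt8 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            gmt8.setTimeZone(TimeZone.getTimeZone("GMT+8"));

            String literal = "2018-12-10 15:40:47";
            long asUtc = utc.parse(literal).getTime();   // string-upsert-style parse
            long asGmt8 = gmt8.parse(literal).getTime(); // "local zone" parse
            return (asUtc - asGmt8) / 3_600_000L;
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same string is 8 hours apart as an epoch value, matching the
        // skew between the string-upsert rows and the setTimestamp row above.
        System.out.println("diffHours=" + diffHours()); // diffHours=8
    }
}
```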
[jira] [Commented] (PHOENIX-5384) org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9
[ https://issues.apache.org/jira/browse/PHOENIX-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196460#comment-17196460 ] Daniel Wong commented on PHOENIX-5384: -- This looks to be an incorrect-setup type of issue on a Cloudera distribution. Jira is not for support issues; I recommend reaching out to Cloudera support. Closing. > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address > listed in hbase:meta for region > SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9f4cd5. containing row > --- > > Key: PHOENIX-5384 > URL: https://issues.apache.org/jira/browse/PHOENIX-5384 > Project: Phoenix > Issue Type: Task >Affects Versions: 5.0.0 > Environment: CDH =6.0 > Hbase:-2.0.0 > Apache phoenix:-5.0.0 > org.apache.hadoop.hbas >Reporter: Pradyumna K Dalai >Priority: Critical > Attachments: Capture.PNG > > > Hi Team, > We are getting this error while testing apache phoenix in our server. > CDH =6.0 > Hbase:-2.0.0 > Apache phoenix:-5.0.0 > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address > listed in hbase:meta for region > SYSTEM.CATALOG,,1561983176360.29b36daaceebbfc1ff863e445e9f4cd5. containing row > > Please assist. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5462) Region Server crash with: Added a key not lexically larger than previous
[ https://issues.apache.org/jira/browse/PHOENIX-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196456#comment-17196456 ] Daniel Wong commented on PHOENIX-5462: -- Looks like a mutable index issue we saw and should be solved by new indexing FYI [~gjacoby] and [~kozdemir]. > Region Server crash with: Added a key not lexically larger than previous > > > Key: PHOENIX-5462 > URL: https://issues.apache.org/jira/browse/PHOENIX-5462 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.2 >Reporter: Alex Batyrshin >Priority: Critical > > Region-Server crash with "Added a key not lexically larger than previous" > after table update with Phoenix client. > This bug also was created for HBase - > https://issues.apache.org/jira/browse/HBASE-22862 > Looks like mutations was added in reverse order or something like this. > Our table and pseudo code for table updates: > {code:java} > CREATE TABLE IF NOT EXISTS TBL_TABLE_CODE ( > "c" VARCHAR NOT NULL PRIMARY KEY, > "d"."apd" TIMESTAMP, > "d"."emd" TIMESTAMP, > "d"."prid" VARCHAR, > "d"."o" VARCHAR, > "d"."elr" UNSIGNED_TINYINT, > "d"."st" UNSIGNED_TINYINT > ... > ); > CREATE INDEX "IDX_TBL_TABLE_CODE_O" ON "TBL_TABLE_CODE" ("d"."o", "d"."emd") > INCLUDE( "d"."elr", "apd", "d"."st", ... ); > CREATE INDEX "IDX_TBL_TABLE_CODE_PRID" ON "TBL_TABLE_CODE" ("d"."prid", > "d"."emd") INCLUDE( "d"."elr", "d"."apd", "d"."st", ...); > for (int i=0; i < batches.size; i += 1) { > for (int j = 0; j < batches[i].size; j += 1 ) { > UPSERT INTO TBL_TABLE_CODE VALUES (?, ?, ? ...); > } > commit; > } > {code} > Env: > HBase-1.4.10 > Phoenix-4.14.2-HBase-1.4 > Exception: > {code:java} > 2019-08-15 18:02:21,777 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: > ABORTING region server prod006,60020,1565873610692: Replay of WAL required. > Forcing server shutdown > org.apache.hadoop.hbase.DroppedSnapshotException: region: > TBL_TABLE_CODE,\x0904606203097821slG=sPD,1563070299676.5110b3395ca64a51cea99c6572a4c3d9. 
>at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2675) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2352) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2314) >at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2200) >at > org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2125) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:512) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:482) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:76) >at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:264) >at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.IOException: Added a key not lexically larger than > previous. Current cell = > \x0901820448218>wGavb'/d:elr/1565881054828/DeleteColumn/vlen=0/seqid=44456567, > lastCell = > \x0901820448218>wGavb'/d:elr/1565881054828/Put/vlen=1/seqid=44457770 >at > org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.checkKey(AbstractHFileWriter.java:204) >at > org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:279) >at > org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87) >at > org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1127) >at > org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:139) >at > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75) >at > org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1003) >at > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2523) >at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2622) >... 
9 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
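For context, the aborting flush above trips HBase's cell-ordering invariant: cells in a store file must be appended in increasing key order, where keys compare by row and column ascending, then timestamp descending, then type code descending, so a DeleteColumn sorts before a Put at the same timestamp. A simplified Python sketch of that comparison (the type codes mirror HBase's KeyValue.Type; everything else is illustrative, not HBase's actual comparator):

```python
# KeyValue type codes from HBase (Put=4, DeleteColumn=12); a higher code
# sorts first, so a DeleteColumn precedes a Put at the same timestamp.
PUT, DELETE_COLUMN = 4, 12

def cell_sort_key(cell):
    row, qualifier, ts, type_code = cell
    # rows/qualifiers ascending, timestamps descending, type codes descending
    return (row, qualifier, -ts, -type_code)

last_cell    = ("row1", "d:elr", 1565881054828, PUT)            # already written
current_cell = ("row1", "d:elr", 1565881054828, DELETE_COLUMN)  # appended next

# The flush appended current_cell after last_cell, but the correct order is
# the reverse -- exactly the "Added a key not lexically larger" condition.
out_of_order = cell_sort_key(current_cell) < cell_sort_key(last_cell)
print(out_of_order)  # True
```

This matches the log above, where the current cell is a DeleteColumn and the lastCell is a Put at the same timestamp, i.e. the memstore handed cells to the writer out of order.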
[jira] [Commented] (PHOENIX-4906) Abnormal query result due to Phoenix plan error
[ https://issues.apache.org/jira/browse/PHOENIX-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196453#comment-17196453 ] Daniel Wong commented on PHOENIX-4906: -- Just going over the high-priority issues. I think this is a "known" issue for salted tables: region splits must contain at most 1 salt byte and should occur on salt byte boundaries; we have seen similar issues in past testing environments. Not sure if we can selectively disable merging in HBase. Maybe [~bharathv] or [~abhishek.chouhan] know more? > Abnormal query result due to Phoenix plan error > --- > > Key: PHOENIX-4906 > URL: https://issues.apache.org/jira/browse/PHOENIX-4906 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.11.0, 4.14.0 >Reporter: JeongMin Ju >Priority: Critical > Attachments: ScanRanges_intersectScan.png, > TestSaltingWithRegionMerge.java, initial_salting_region.png, merged-region.png > > > For a salted table, when a query targets the entire data set, a > different plan is created depending on the form of the query, and as a result > erroneous data is retrieved. > {code:java} > // Actually, the schema of the table I used is different, but please ignore > it. > create table if not exists test.test_tale ( > rk1 varchar not null, > rk2 varchar not null, > column1 varchar > constraint pk primary key (rk1, rk2) > ) > ... > SALT_BUCKETS=16... > ; > {code} > > I created a table with 16 salting regions and then wrote a lot of data. > HBase automatically split the regions, and I merged regions for data > balancing between the region servers. > Then, when you run the query, you can see that a different plan is created according > to the Where clause.
> * query1 > select count\(*) from test.test_table; > {code:java} > +---+-++ > |PLAN > | EST_BYTES_READ | EST_ROWS_READ | > +---+-++ > | CLIENT 1851-CHUNK 5005959292 ROWS 1944546675532 BYTES PARALLEL 11-WAY FULL > SCAN OVER TEST:TEST_TABLE | 1944546675532 | 5005959292 | > | SERVER FILTER BY FIRST KEY ONLY > | 1944546675532 | 5005959292 | > | SERVER AGGREGATE INTO SINGLE ROW > | 1944546675532 | 5005959292 | > +---+-++ > {code} > * query2 > select count\(*) from test.test_table where rk2 = 'aa'; > {code} > +---+-++ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | > +---+-++ > | CLIENT 1846-CHUNK 4992196444 ROWS 1939177965768 BYTES PARALLEL 11-WAY RANGE > SCAN OVER TEST:TEST_TABLE [0] - [15] | 1939177965768 | 4992196444 | > | SERVER FILTER BY FIRST KEY ONLY AND RK2 = 'aa' > | 1939177965768 | 4992196444 | > | SERVER AGGREGATE INTO SINGLE ROW > | 1939177965768 | 4992196444 | > +---+-++ > {code} > Since rk2 used in the where clause of query2 is the second column of the PK, > it must be a full scan query like query1. > However, as you can see, query2 is created by range scan and the generated > chunk is also less than five compared to query1. > I added the log and printed out the startkey and endkey of the scan object > generated by the plan. > And I found 5 chunks missing by query2. > All five missing chunks were found in regions where the originally generated > region boundary value was not maintained through the merge operation. > !initial_salting_region.png! > After merging regions > !merged-region.png! > The code that caused the problem is this part. > When a select query is executed, the >
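For context on why merges break salted-table plans: Phoenix prepends a salt byte, computed as a hash of the row key modulo SALT_BUCKETS, to every key, and boundary-driven chunk generation assumes each region aligns with one salt-byte prefix. A simplified Python sketch (the hash here is a stand-in, not Phoenix's actual algorithm):

```python
SALT_BUCKETS = 16

def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    # Stand-in hash; Phoenix hashes the full key bytes and takes the modulus.
    return sum(row_key) % buckets

# With salt-aligned regions, each region start key is exactly one salt byte:
aligned_boundaries = [bytes([b]) for b in range(SALT_BUCKETS)]

# After merging two adjacent regions, the boundary between salt prefixes 4
# and 5 disappears, so logic keyed off region start keys no longer sees
# every salt value:
merged_boundaries = aligned_boundaries[:5] + aligned_boundaries[6:]

covered_salts = {b[0] for b in merged_boundaries}
missing = set(range(SALT_BUCKETS)) - covered_salts
print(missing)  # {5}
```

This mirrors the report above: after merges, chunks for some salt ranges go missing because the original salt-aligned region boundaries no longer hold.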
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196452#comment-17196452 ] Daniel Wong commented on PHOENIX-5833: -- Patch applied cleanly to my 4.x trying to reupload again. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. 
> {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
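The row-level semantics Query 3 should preserve can be checked directly: Python tuple comparison mirrors RVC comparison, and the escaped pattern LIKE 'RRS\_%' is just a literal "RRS_" prefix test. A sketch of the expected result:

```python
rows = [("abc", "def", "RRSQ_IMKKL"),
        ("abc", "def", "RRS_ZYTDT")]

bound = ("abc", "def", "RRSQ_IMKKL")

def matches(row):
    a, b, c = row
    # (A, B, C) > (...) is lexicographic, like Python tuple comparison;
    # LIKE 'RRS\_%' (escaped underscore) means "starts with RRS_"
    return row > bound and c.startswith("RRS_")

result = [r for r in rows if matches(r)]
print(result)  # [('abc', 'def', 'RRS_ZYTDT')]
```

Both predicates admit the second row ('_' sorts after 'Q' byte-wise), so the empty result from Query 3 is a scan-range intersection bug, not a semantics question.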
[jira] [Commented] (PHOENIX-5287) Incorrect results for COUNT(1) on a table with GLOBAL INDEX
[ https://issues.apache.org/jira/browse/PHOENIX-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196445#comment-17196445 ] Daniel Wong commented on PHOENIX-5287: -- Repinging [~kozdemir] and [~gjacoby] as there was already a Jira to port index changes to 5.0. > Incorrect results for COUNT(1) on a table with GLOBAL INDEX > --- > > Key: PHOENIX-5287 > URL: https://issues.apache.org/jira/browse/PHOENIX-5287 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0 > Environment: Environment and data information: > * Column structure of TABLE_A is fully identical to TABLE_B > * TABLE_A has a GLOBAL INDEX > * TABLE_B has a LOCAL INDEX >Reporter: Vlad Krava >Priority: Blocker > > COUNT(\*\) and COUNT(1) commands display incorrect(outdated) statistics for > table with GLOBAL index. > *Example:* > * Export TABLE_A to SCV file (SELECT * FROM *POMG.TABLE_A*) > * Import CSV file to TABLE_B > * COUNT operation on 'TABLE_A' was constantly returning with an amount of > 218623 (for 2 days without any data modifications!!!) 
: > ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_A*; > *** RESULT: 218623 > * The newly imported table from the CSV file (TABLE_B) showed a different (higher) > number of records > ** 0: *jdbc:phoenix:> select count(1) from POMG.TABLE_B*; > *** RESULT: 218683 > * COUNT in HBase is returning a bigger value than COUNT on the > Phoenix table (218683 vs 218623) > * Phoenix statistics for this table were updated a few times over the past few > testing days > * I took a few attempts to identify data misalignments by executing a diff on the > primary keys: > ** select key_1 from *POMG.TABLE_A* where key_1 not in (select key_1 from > *POMG.TABLE_B*) - 0 records selected (_Doesn't make sense considering the fact > that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_) > ** select key_1 from *POMG.TABLE_B* where key_1 not in (select key_1 from > *POMG.TABLE_A*) - 23 records selected (_Doesn't make sense considering the > fact that TABLE_A is larger than TABLE_B and key_1 is a unique PRIMARY KEY_) > *Workaround:* > * After executing ALTER INDEX with the REBUILD flag, COUNT statistics for TABLE_A > became identical to TABLE_B > * Diff selects didn't show any differences between *POMG.TABLE_A* and > *POMG.TABLE_B* > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
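The two NOT IN diffs in the report pin down which side actually holds the extra rows. The same logic in set terms (with hypothetical keys, since the real data is not shown):

```python
# Hypothetical key sets standing in for POMG.TABLE_A and POMG.TABLE_B.
table_a = {f"key{i:03d}" for i in range(60)}
table_b = table_a | {"extra1", "extra2"}   # the freshly imported copy

a_not_in_b = table_a - table_b   # mirrors: key_1 from A not in B
b_not_in_a = table_b - table_a   # mirrors: key_1 from B not in A

# An empty A\B together with a non-empty B\A means B, not A, contains the
# extra rows -- so the smaller COUNT on A (served via the stale GLOBAL
# index) was the outdated number, consistent with the REBUILD workaround.
print(len(a_not_in_b), len(b_not_in_a))  # 0 2
```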
[jira] [Commented] (PHOENIX-5171) SkipScan incorrectly filters composite primary key which the key range contains all values
[ https://issues.apache.org/jira/browse/PHOENIX-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193084#comment-17193084 ] Daniel Wong commented on PHOENIX-5171: -- Main nit I had other than the comment you removed was we should use the table name unique generation. Other than that patch looks good to me. > SkipScan incorrectly filters composite primary key which the key range > contains all values > -- > > Key: PHOENIX-5171 > URL: https://issues.apache.org/jira/browse/PHOENIX-5171 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Jaanai Zhang >Priority: Blocker > Fix For: 5.1.0, 4.16.0 > > Attachments: 5171-master-v3.patch, PHOENIX-5171-master-v2.patch, > PHOENIX-5171-master.patch > > > Running the below SQL: > {code:sql} > create table if not exists aiolos( > vdate varchar, > tab varchar, > dev tinyint not null, > app varchar, > target varchar, > channel varchar, > one varchar, > two varchar, > count1 integer, > count2 integer, > CONSTRAINT PK PRIMARY KEY (vdate,tab,dev,app,target,channel,one,two)); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,'A004',null,null,2,2); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,null,null,null,2,2); > SELECT * FROM aiolos WHERE dev = 2 AND vdate BETWEEN '2018-02-10' AND > '2019-02-19' AND tab = 'channel_agg' and channel='A004'; > {code} > Throws exception: > {code:java} > Caused by: java.lang.IllegalStateException: The next hint must come after > previous hint > (prev=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > next=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00/1550642992223/Put/vlen=4/seqid=5445463) > at > org.apache.phoenix.filter.SkipScanFilter.setNextCellHint(SkipScanFilter.java:171) > at > org.apache.phoenix.filter.SkipScanFilter.filterKeyValue(SkipScanFilter.java:145) > at 
> org.apache.hadoop.hbase.filter.FilterList.filterKeyValue(FilterList.java:264) > at > org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:557) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6308) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6459) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6246) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6232) > at > org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:175) > ... 8 more > {code} > This is caused by an incorrect next cell hint: we have skipped the rest of the > slots whose key ranges contain all values (EVERYTHING_RANGE) in the > ScanUtil.setKey method. The next cell hint in the current case is > _kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00_, but it should be > _kv=2018-02-14\x00channel_agg\x00\x82\x00\x00A004_. -- This message was sent by Atlassian Jira (v8.3.4#803005)
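The hint construction can be sketched as follows: when the trailing slots are EVERYTHING_RANGE, stopping the key early reproduces the current cell position and violates the "next hint must come after previous hint" invariant, while carrying the fixed channel slot ('A004') past the skipped slots yields a strictly larger key. This is a simplified model; real Phoenix key composition handles separators and fixed-width types differently:

```python
SEP = b"\x00"

def compose_key(slots):
    # Naive variable-length key composition with \x00 separators.
    return SEP.join(slots)

current = compose_key([b"2018-02-14", b"channel_agg", b"\x82"])

# Stopping at the EVERYTHING_RANGE slots reproduces the current position:
bad_hint = compose_key([b"2018-02-14", b"channel_agg", b"\x82"])
# Carrying the fixed 'channel' slot past the skipped slots advances the scan:
good_hint = compose_key([b"2018-02-14", b"channel_agg", b"\x82", b"", b"A004"])

print(bad_hint > current, good_hint > current)  # False True
```

The good hint ends in `\x82\x00\x00A004`, matching the corrected hint quoted in the issue description.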
[jira] [Commented] (PHOENIX-5171) SkipScan incorrectly filters composite primary key which the key range contains all values
[ https://issues.apache.org/jira/browse/PHOENIX-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192752#comment-17192752 ] Daniel Wong commented on PHOENIX-5171: -- [~larsh] took a look there are some minor style nits as well as some comments I would change. Other than that i think the patch looks good, it essentially tightens the lower bound of the scan range generator. I ran the original test cases that added the EVERYTHING range qualifier to the disjunction as well. Probably want a new patch upload to verify the tests. [~jaanai] any interest/time to do this? Otherwise I may just do this myself. > SkipScan incorrectly filters composite primary key which the key range > contains all values > -- > > Key: PHOENIX-5171 > URL: https://issues.apache.org/jira/browse/PHOENIX-5171 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.14.1 >Reporter: Jaanai Zhang >Assignee: Jaanai Zhang >Priority: Blocker > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5171-master-v2.patch, PHOENIX-5171-master.patch > > > Running the below SQL: > {code:sql} > create table if not exists aiolos( > vdate varchar, > tab varchar, > dev tinyint not null, > app varchar, > target varchar, > channel varchar, > one varchar, > two varchar, > count1 integer, > count2 integer, > CONSTRAINT PK PRIMARY KEY (vdate,tab,dev,app,target,channel,one,two)); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,'A004',null,null,2,2); > upsert into aiolos > values('2018-02-14','channel_agg',2,null,null,null,null,null,2,2); > SELECT * FROM aiolos WHERE dev = 2 AND vdate BETWEEN '2018-02-10' AND > '2019-02-19' AND tab = 'channel_agg' and channel='A004'; > {code} > Throws exception: > {code:java} > Caused by: java.lang.IllegalStateException: The next hint must come after > previous hint > (prev=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > next=2018-02-14\x00channel_agg\x00\x82//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0, > > 
kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00/1550642992223/Put/vlen=4/seqid=5445463) > at > org.apache.phoenix.filter.SkipScanFilter.setNextCellHint(SkipScanFilter.java:171) > at > org.apache.phoenix.filter.SkipScanFilter.filterKeyValue(SkipScanFilter.java:145) > at > org.apache.hadoop.hbase.filter.FilterList.filterKeyValue(FilterList.java:264) > at > org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:418) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:557) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6308) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6459) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6246) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6232) > at > org.apache.phoenix.iterate.RegionScannerFactory$1.nextRaw(RegionScannerFactory.java:175) > ... 8 more > {code} > This is caused by an incorrect next cell hint: we have skipped the rest of the > slots whose key ranges contain all values (EVERYTHING_RANGE) in the > ScanUtil.setKey method. The next cell hint in the current case is > _kv=2018-02-14\x00channel_agg\x00\x82/0:\x00\x00\x00\x00_, but it should be > _kv=2018-02-14\x00channel_agg\x00\x82\x00\x00A004_. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6010) Create phoenix-thirdparty, and consume guava through it
[ https://issues.apache.org/jira/browse/PHOENIX-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17170369#comment-17170369 ] Daniel Wong commented on PHOENIX-6010: -- Thanks for pinging me on one of the downstream projects. Just pinging some other people FYI. I think this is fine for phoenix master. I think we need to add another subtask for phoenix-connectors as well, unless you are considering that as part of the phoenix changes. I would prefer another guava version, but I haven't checked all the considerations between Java 7 and newer versions. I know that currently, at least for 4.x, there are some reasons to stay on Java 7; I'm debating whether we have those same issues on master. Last discussion of Java compatibility: https://issues.apache.org/jira/browse/PHOENIX-5855 FYI [~gjacoby] [~ckulkarni] > Create phoenix-thirdparty, and consume guava through it > --- > > Key: PHOENIX-6010 > URL: https://issues.apache.org/jira/browse/PHOENIX-6010 > Project: Phoenix > Issue Type: Improvement > Components: core, omid, tephra >Affects Versions: 5.1.0, 4.16.0 >Reporter: Istvan Toth >Assignee: Istvan Toth >Priority: Major > > We have long-standing and well-documented problems with Guava, just like the > rest of the Hadoop components. > Adopt the solution used by HBase: > * create phoenix-thirdparty repo > * create a pre-shaded phoenix-shaded-guava artifact in it > * Use the pre-shaded Guava in every phoenix component > The advantages are well-known, but to name a few: > * Phoenix will work with Hadoop 3.1.3+ > * One less CVE in our direct dependencies > * No more conflict with our consumer's Guava versions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6022) RVC Offset does not handle trailing nulls properly
[ https://issues.apache.org/jira/browse/PHOENIX-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167474#comment-17167474 ] Daniel Wong commented on PHOENIX-6022: -- Latest patch addresses comment-only nits. Please merge for me [~yanxinyi]. > RVC Offset does not handle trailing nulls properly > -- > > Key: PHOENIX-6022 > URL: https://issues.apache.org/jira/browse/PHOENIX-6022 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.16.0 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Attachments: PHOENIX-6022-4.x.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > While working on PHOENIX-5924, I found that if there was a trailing null, the > Phoenix optimizer does not generate a point get during resolution of the > mini-where, due to how we write/handle trailing nulls in the rowkey. The > sanity check that it is a point get therefore fails. This causes an > exception to be returned to the user. -- This message was sent by Atlassian Jira (v8.3.4#803005)
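The trailing-null quirk described above comes from how Phoenix serializes row keys: trailing null PK columns produce trailing separator bytes that are trimmed, so a key ending in NULL is byte-identical to the shorter key. A minimal sketch, simplified to variable-length VARCHAR columns:

```python
SEP = b"\x00"

def row_key(pk_values):
    # None (SQL NULL) serializes to zero bytes; columns are \x00-separated,
    # and trailing separators are trimmed from the physical key.
    parts = [v if v is not None else b"" for v in pk_values]
    return SEP.join(parts).rstrip(SEP)

with_trailing_null = row_key([b"a", b"b", None])
without = row_key([b"a", b"b"])
# The two logical keys collapse to one physical key, so an RVC offset
# ending in NULL cannot be resolved to a distinct point get.
print(with_trailing_null == without)  # True
```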
[jira] [Comment Edited] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165986#comment-17165986 ] Daniel Wong edited comment on PHOENIX-6045 at 7/27/20, 10:24 PM: - [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. Either through readback as you are mentioning, or could the server propagate the mutations? If the index select is much more efficient, this overall process could be more efficient. was (Author: dbwong): [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. If the index select is much more efficient, this overall process could be more efficient. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165986#comment-17165986 ] Daniel Wong commented on PHOENIX-6045: -- [~larsh] that's essentially what I was mentioning, but I did not communicate it properly. I was suggesting that for mutable tables we could select the PKs from index1 and use them to generate mutation state for the base table, whose PK any index by definition contains. We could then send mutations to the base table. As part of that mutation, once we delete from the base table, the deletions could propagate to both index1 and index2. If the index select is much more efficient, this overall process could be more efficient. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
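The proposed flow, selecting PKs from the cheap index, mutating the base table, and letting the deletes propagate to every index, can be modeled with a toy sketch (data and names are hypothetical; it assumes only that every index row carries the full base-table PK, which Phoenix index rows do by construction):

```python
# pk -> indexed value, plus two indexes keyed as (indexed expr, base pk)
base   = {"p1": "va", "p2": "vb", "p3": "va"}
index1 = {(v, pk) for pk, v in base.items()}
index2 = {(v.upper(), pk) for pk, v in base.items()}   # some other indexed expr

def delete_where_value(value):
    # 1. Drive the SELECT from index1 to find the matching base PKs.
    pks = {pk for v, pk in index1 if v == value}
    # 2. Delete the base rows; 3. propagate the deletes to both indexes,
    #    as the server side would for mutable tables.
    for pk in pks:
        v = base.pop(pk)
        index1.discard((v, pk))
        index2.discard((v.upper(), pk))

delete_where_value("va")
print(sorted(base), len(index1), len(index2))  # ['p2'] 1 1
```

The point of the design is that index2 never needs to be read: the base-table mutations carry enough information to maintain every index.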
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164774#comment-17164774 ] Daniel Wong commented on PHOENIX-6045: -- [~larsh] I think [~kozdemir] has a bit more context on this, as he was looking into it in more detail, but this is my understanding. For mutable indexes we don't need the other keys, since deletes of the index rows are driven by the server side (rather than the client side). For immutable cases you are correct: if we were driving it from the client, we would need the other index keys to delete the index rows. I think this Jira is still valid. What do you think? > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6023) Wrong result when issuing query for an immutable table with multiple column families
[ https://issues.apache.org/jira/browse/PHOENIX-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164719#comment-17164719 ] Daniel Wong commented on PHOENIX-6023: -- Thanks [~stoty] for running verify manually. > Wrong result when issuing query for an immutable table with multiple column > families > > > Key: PHOENIX-6023 > URL: https://issues.apache.org/jira/browse/PHOENIX-6023 > Project: Phoenix > Issue Type: Bug >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-6023-addendum.master.v1.patch, > PHOENIX-6023.addendum.master.v2.patch, PHOENIX-6023.master.v3.patch > > Time Spent: 40m > Remaining Estimate: 0h > > Steps to reproduce are as follows: > 1. Create an immutable table with multiple column families: > {code} > 0: jdbc:phoenix:> CREATE TABLE TEST ( > . . . . . . . . > ID VARCHAR PRIMARY KEY, > . . . . . . . . > A.COL1 VARCHAR, > . . . . . . . . > B.COL2 VARCHAR > . . . . . . . . > ) IMMUTABLE_ROWS = TRUE; > No rows affected (1.182 seconds) > {code} > 2. 
Upsert some rows: > {code} > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id0', '0', 'a'); > 1 row affected (0.138 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id1', '1', NULL); > 1 row affected (0.009 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id2', '2', 'b'); > 1 row affected (0.011 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id3', '3', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id4', '4', 'c'); > 1 row affected (0.006 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id5', '5', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id6', '6', 'd'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id7', '7', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id8', '8', 'e'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id9', '9', NULL); > 1 row affected (0.009 seconds) > {code} > 3. Count query is okay: > {code} > 0: jdbc:phoenix:> SELECT COUNT(COL1) FROM TEST WHERE COL2 IS NOT NULL; > ++ > | COUNT(A.COL1) | > ++ > | 5 | > ++ > 1 row selected (0.1 seconds) > {code} > 4. However, the following select query returns wrong result (it should return > 5 records): > {code} > 0: jdbc:phoenix:> SELECT COL1 FROM TEST WHERE COL2 IS NOT NULL; > +---+ > | COL1 | > +---+ > | 0 | > | 1 | > | 2 | > | 3 | > | 4 | > | 5 | > | 6 | > | 7 | > | 8 | > | 9 | > +---+ > 10 rows selected (0.058 seconds) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
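The expected semantics are easy to state: the COL2 IS NOT NULL filter (column family B) must also restrict the rows projected from family A, so only 5 of the 10 rows qualify. A sketch of the data above and the correct result:

```python
# Rows as upserted: even ids get a COL2 value, odd ids get NULL.
rows = {f"id{i}": {"COL1": str(i),
                   "COL2": "abcde"[i // 2] if i % 2 == 0 else None}
        for i in range(10)}

result = sorted(r["COL1"] for r in rows.values() if r["COL2"] is not None)
print(result)  # ['0', '2', '4', '6', '8'] -- 5 rows, not the 10 the bug returned
```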
[jira] [Commented] (PHOENIX-6023) Wrong result when issuing query for an immutable table with multiple column families
[ https://issues.apache.org/jira/browse/PHOENIX-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164049#comment-17164049 ] Daniel Wong commented on PHOENIX-6023: -- Just tracked this failure down myself. Please get a hadoopQA run for the addendum FYI [~yanxinyi] > Wrong result when issuing query for an immutable table with multiple column > families > > > Key: PHOENIX-6023 > URL: https://issues.apache.org/jira/browse/PHOENIX-6023 > Project: Phoenix > Issue Type: Bug >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-6023-addendum.master.v1.patch, > PHOENIX-6023.addendum.master.v2.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > Steps to reproduce are as follows: > 1. Create an immutable table with multiple column families: > {code} > 0: jdbc:phoenix:> CREATE TABLE TEST ( > . . . . . . . . > ID VARCHAR PRIMARY KEY, > . . . . . . . . > A.COL1 VARCHAR, > . . . . . . . . > B.COL2 VARCHAR > . . . . . . . . > ) IMMUTABLE_ROWS = TRUE; > No rows affected (1.182 seconds) > {code} > 2. 
Upsert some rows: > {code} > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id0', '0', 'a'); > 1 row affected (0.138 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id1', '1', NULL); > 1 row affected (0.009 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id2', '2', 'b'); > 1 row affected (0.011 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id3', '3', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id4', '4', 'c'); > 1 row affected (0.006 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id5', '5', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id6', '6', 'd'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id7', '7', NULL); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id8', '8', 'e'); > 1 row affected (0.007 seconds) > 0: jdbc:phoenix:> UPSERT INTO TEST VALUES ('id9', '9', NULL); > 1 row affected (0.009 seconds) > {code} > 3. Count query is okay: > {code} > 0: jdbc:phoenix:> SELECT COUNT(COL1) FROM TEST WHERE COL2 IS NOT NULL; > ++ > | COUNT(A.COL1) | > ++ > | 5 | > ++ > 1 row selected (0.1 seconds) > {code} > 4. However, the following select query returns wrong result (it should return > 5 records): > {code} > 0: jdbc:phoenix:> SELECT COL1 FROM TEST WHERE COL2 IS NOT NULL; > +---+ > | COL1 | > +---+ > | 0 | > | 1 | > | 2 | > | 3 | > | 4 | > | 5 | > | 6 | > | 7 | > | 8 | > | 9 | > +---+ > 10 rows selected (0.058 seconds) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163985#comment-17163985 ] Daniel Wong commented on PHOENIX-6045: -- FYI [~ChinmayKulkarni] [~kozdemir] > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Priority: Major > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.
[ https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163984#comment-17163984 ] Daniel Wong commented on PHOENIX-6045: -- With indexes with disjoint sets of primary keys I found this. It appears that while we are considering index plans, both indexes' columns are included in the projected column list, which causes the optimizer to reject each index because the combined columns are not covered by either index alone. > Delete that should qualify for index path does not use index when multiple > indexes are available. > - > > Key: PHOENIX-6045 > URL: https://issues.apache.org/jira/browse/PHOENIX-6045 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Priority: Major > Attachments: MultipleDeleteReproIT.java > > > Delete that should qualify for index path does not use index when multiple > indexes are available. Test case to reproduce will be below. -- This message was sent by Atlassian Jira (v8.3.4#803005)
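The coverage test the optimizer applies is a simple subset check; the bug described above is in what gets projected, not in the check itself. A sketch with hypothetical column names:

```python
def index_covers(projected, index_columns):
    # An index plan is usable only if it covers every projected column.
    return projected <= index_columns

index1_cols = {"PK", "A"}
index2_cols = {"PK", "B"}
needed      = {"PK"}                           # a delete only needs the base PK

buggy_projection = index1_cols | index2_cols   # both indexes' columns projected

print(index_covers(needed, index1_cols))            # True  -- either index suffices
print(index_covers(buggy_projection, index1_cols))  # False -- union covered by neither
print(index_covers(buggy_projection, index2_cols))  # False
```

Projecting the union of both indexes' columns guarantees neither index covers the query, so the optimizer falls back to the base table every time.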
[jira] [Commented] (PHOENIX-6013) RVC Offset does not handle coerced literal nulls properly.
[ https://issues.apache.org/jira/browse/PHOENIX-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159404#comment-17159404 ] Daniel Wong commented on PHOENIX-6013: -- [~gjacoby] will do; waiting to get the other patch in first, as this one builds on top of it. > RVC Offset does not handle coerced literal nulls properly. > -- > > Key: PHOENIX-6013 > URL: https://issues.apache.org/jira/browse/PHOENIX-6013 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.16.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > As part of the query rewrite paths, an A=null in the offset may get rewritten > to A IS NULL; the code sanity-checks against an equality comparison op, which > no longer matches after this rewrite. -- This message was sent by Atlassian Jira (v8.3.4#803005)
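The rewrite in question can be modeled as a tiny AST transform: equality against a null literal becomes IS NULL, after which a sanity check that pattern-matches on an equality node no longer fires. The node shapes here are illustrative, not Phoenix's actual parse nodes:

```python
def rewrite_null_equality(node):
    # ('=', column, literal) with a null literal becomes ('IS NULL', column)
    if node[0] == "=" and node[2] is None:
        return ("IS NULL", node[1])
    return node

def is_equality(node):
    # the offset code's sanity check: expects an equality comparison op
    return node[0] == "="

rewritten = rewrite_null_equality(("=", "A", None))
print(rewritten, is_equality(rewritten))  # ('IS NULL', 'A') False -- check fails
```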
[jira] [Comment Edited] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154999#comment-17154999 ] Daniel Wong edited comment on PHOENIX-5990 at 7/10/20, 4:24 AM: I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed? was (Author: dbwong): I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed?{{}}{{}}{{}} > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155091#comment-17155091 ] Daniel Wong commented on PHOENIX-5990: -- I'm not sure; I'm guessing some of the reason is historical, as indexes were deliberately being prevented from use, per the following code comment. But some of it is probably just an oversight. {noformat} // If we have immutable indexes, we'd increase the number of bytes scanned by executing // separate queries against each index, so better to drive from a single table in that case. {noformat} > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5990) Delete statements don't honor index hints
[ https://issues.apache.org/jira/browse/PHOENIX-5990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154999#comment-17154999 ] Daniel Wong commented on PHOENIX-5990: -- I spent some time looking at this. DELETE statements ignore hints for the select portion of their execution with the exception of /*+ USE_INDEX_OVER_DATA_TABLE */. That hint does cause the optimizer to choose index path for deletes. [~gjacoby] do you want to change the scope of this to include having the optimizer choose index path or do you want to close as no change needed? > Delete statements don't honor index hints > - > > Key: PHOENIX-5990 > URL: https://issues.apache.org/jira/browse/PHOENIX-5990 > Project: Phoenix > Issue Type: Bug >Reporter: Geoffrey Jacoby >Priority: Major > > While investigating a perf issue with a delete statement, we noticed that the > query optimizer wasn't using an index with a better (though not complete) > match for the WHERE clause than the base table PK was. Even using a query > hint to specify the index did not seem to change the EXPLAIN plan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5924) RVC Offset does not handle variable length fields exclusive scan boundary correctly
[ https://issues.apache.org/jira/browse/PHOENIX-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136082#comment-17136082 ] Daniel Wong commented on PHOENIX-5924: -- Hoping OSS infra works this time; reuploading the patch again. > RVC Offset does not handle variable length fields exclusive scan boundary > correctly > --- > > Key: PHOENIX-5924 > URL: https://issues.apache.org/jira/browse/PHOENIX-5924 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.0.0, 4.15.0, 4.14.3 >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5924-4.x.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The way the exclusive boundary was handled, by incrementing the key for variable > length fields, is incorrect in the scan boundary. > > In the following case we increment incorrectly from 0x490049 -> 0x490050 > ('1','1' -> '1','2') > We should increment from 0x490049 -> 0x49004900 ('1','1' -> '1','1'\x00) > @Test > public void testScenario() throws Exception { > String TEST_DDL = "CREATE TABLE IF NOT EXISTS TEST_SCHEMA (\n" > + "ORGANIZATION_ID VARCHAR(15), \n" + "TEST_ID > VARCHAR(15), \n" > + "CREATED_DATE DATE, \n" + "LAST_UPDATE DATE, \n" > + "CONSTRAINT TEST_SCHEMA_PK PRIMARY KEY (ORGANIZATION_ID, > TEST_ID) \n" + ")"; > try (Statement statement = conn.createStatement()) { > statement.execute(TEST_DDL); > } > //setup > List<String> upserts = new ArrayList<>(); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('1','1')"); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('1','10')"); > upserts.add("UPSERT INTO TEST_SCHEMA(ORGANIZATION_ID,TEST_ID) VALUES > ('2','2')"); > for (String sql : upserts) { > try (Statement statement = conn.createStatement()) { > statement.execute(sql); > } > } > conn.commit(); > String query1 = "SELECT * FROM TEST_SCHEMA"; > String query2 = "SELECT * FROM TEST_SCHEMA OFFSET > (ORGANIZATION_ID,TEST_ID) = ('1','1')"; > try (Statement statement = conn.createStatement(); ResultSet rs1 = > statement.executeQuery(query1)) { > TestUtil.printResultSet(rs1); > } > try (Statement statement = conn.createStatement(); ResultSet rs2 = > statement.executeQuery(query2)) { > TestUtil.printResultSet(rs2); > } > } > -- This message was sent by Atlassian Jira (v8.3.4#803005)
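The boundary rule the issue describes can be sketched directly on byte arrays. This is a hedged illustration with made-up helper names, not the actual Phoenix code: for a variable-length field, the exclusive lower bound after key k must be k plus a trailing 0x00 byte, because incrementing the last byte skips longer keys that share the prefix, such as ('1','10').

```java
import java.util.Arrays;

public class ExclusiveBound {
    // Wrong: increment the last byte, turning ('1','1') into ('1','2').
    static byte[] incrementLastByte(byte[] key) {
        byte[] next = Arrays.copyOf(key, key.length);
        next[next.length - 1]++;
        return next;
    }

    // Right: append a zero byte, turning ('1','1') into ('1','1'\x00),
    // the smallest key strictly greater than the original.
    static byte[] appendZeroByte(byte[] key) {
        byte[] next = Arrays.copyOf(key, key.length + 1);
        next[key.length] = 0x00;
        return next;
    }

    // Unsigned lexicographic comparison, as HBase orders row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] key = {'1', 0x00, '1'};          // ('1','1') with the asc separator
        byte[] skipped = {'1', 0x00, '1', '0'}; // ('1','10') must stay in range
        // The buggy bound sorts after ('1','10'), so that row is wrongly excluded...
        System.out.println(compareUnsigned(incrementLastByte(key), skipped) > 0);  // true
        // ...while the fixed bound sorts before it, keeping the row.
        System.out.println(compareUnsigned(appendZeroByte(key), skipped) <= 0);    // true
    }
}
```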
[jira] [Commented] (PHOENIX-5833) Incorrect results with RVCs and AND operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101174#comment-17101174 ] Daniel Wong commented on PHOENIX-5833: -- As discussed offline with [~yanxinyi] and [~ckulkarni], I will fix the wrong result and file a follow-up Jira for the cases where, luckily, the wrong result was improving performance. > Incorrect results with RVCs and AND operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Assignee: Daniel Wong >Priority: Critical > Attachments: PHOENIX-5833.4.x.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089124#comment-17089124 ] Daniel Wong commented on PHOENIX-5833: -- h3. Error Explanation So the error was in the separator, but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing 'abc\x00def'. Note it is not the final key, so we append the asc separator: 'abc\x00def\x00'. Then finally we note that it is exclusive, so we call next byte: 'abc\x00def\x01'. Here we already have a key range that will no longer select the row we intended, whose PK was 'abc\x00def\x00RRS_ZYTDT'. h3. Fix Short term: I think when we truncate a range in the AndExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. 
> {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089124#comment-17089124 ] Daniel Wong edited comment on PHOENIX-5833 at 4/21/20, 11:01 PM: - h3. Error Explanation So the error was in the separator but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing, 'abc\x00def' Note it is not the final key so we append the asc separator. 'abc\x00def\x00' Then finally we note that it is exclusive so we call next byte 'abc\x00def\x01' Here we already now have a key range that will no longer select the row we intended whose PK was 'abc\x00def\x00RRS_ZYTDT' h3. Fix Short term. I think when we truncate a range in the AndExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. was (Author: dbwong): h3. Error Explanation So the error was in the separator but not because it was missing. Prior to the consideration of the AndExpression we have the following 2 key ranges. Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *) Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_' Essentially as part of the AndExpression we generate the 2 key ranges. Span A,B -> ('abc\x00def' - *) Span C -> ['RRS_' - 'RRS'') When writing the final combined minRange in ScanRanges we start by writing, 'abc\x00def' Note it is not the final key so we append the asc separator. 
'abc\x00def\x00' Then finally we note that it is exclusive so we call next byte 'abc\x00def\x01' Here we already now have a key range that will no longer select the row we intended whose PK was 'abc\x00def\x00RRS_ZYTDT' h3. Fix Short term. I think we we truncate a range in the andExpression key extraction we may need to switch from exclusive to inclusive, as the bound that was exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of those 3 keys. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
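The minRange construction described in the comment above can be reproduced in a few lines. This is a simplified, hypothetical sketch rather than the real ScanRanges code: bumping the byte after the truncated (A,B) prefix produces a start key that sorts past the row the query should return, while keeping the truncated bound inclusive does not.

```java
public class TruncatedBound {
    // Exclusive handling: append the asc separator, then bump to the next key
    // (nextKey of '...\u0000' is '...\u0001').
    static String exclusiveStart(String truncatedPrefix) {
        return truncatedPrefix + "\u0001";
    }

    // Inclusive handling: append only the ascending separator.
    static String inclusiveStart(String truncatedPrefix) {
        return truncatedPrefix + "\u0000";
    }

    public static void main(String[] args) {
        String truncated = "abc\u0000def";                 // (A,B) prefix after truncation
        String wantedRow = "abc\u0000def\u0000RRS_ZYTDT";  // PK the query should return
        // The exclusive bound sorts past the wanted row, so the scan skips it...
        System.out.println(wantedRow.compareTo(exclusiveStart(truncated)) < 0);  // true
        // ...while the inclusive bound keeps it in range.
        System.out.println(wantedRow.compareTo(inclusiveStart(truncated)) >= 0); // true
    }
}
```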
[jira] [Comment Edited] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088003#comment-17088003 ] Daniel Wong edited comment on PHOENIX-5833 at 4/20/20, 6:44 PM: Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abc\x00def\x00RRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. If I had to guess we are missing the null separator. was (Author: dbwong): Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abcdefRRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. 
> {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5833) Incorrect results with LIKE operator
[ https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088003#comment-17088003 ] Daniel Wong commented on PHOENIX-5833: -- Does look like one of the likely many RVC related bugs that lurk. The 4.15 scan plan looks slightly improved but should actually be the more constrained startkey of 'abcdefRRSQ_IMKKL' minimally. I'm not sure if we have done any key optimization for the LIKE operator but that can be checked pretty easily. I'll try to look in more detail. > Incorrect results with LIKE operator > > > Key: PHOENIX-5833 > URL: https://issues.apache.org/jira/browse/PHOENIX-5833 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 4.15.0 >Reporter: Bharath Vissapragada >Priority: Critical > > Phoenix version: 4.15-HBase-1.5 > -- Create a test table and populate a couple of rows. > {noformat} > create table repro_bug(a varchar(10) not null, b varchar(10) not null, c > varchar(10) not null constraint pk primary key(a, b, c)); > upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL'); > upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT'); > select * from repro_bug; > +--+--+-+ > | A | B | C | > +--+--+-+ > | abc | def | RRSQ_IMKKL | > | abc | def | RRS_ZYTDT | > +--+--+-+ > {noformat} > -- Query 1 - Look for rows where C has a certain prefix - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where C like 'RRS\\_%'; > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct > result > {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > +--+--++ > | A | B | C | > +--+--++ > | abc | def | RRS_ZYTDT | > +--+--++ > {noformat} > -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect > result.. Ideally it should return the same row as above. 
> {noformat} > select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL') > AND C like 'RRS\\_%'; > ++++ > | A | B | C | > ++++ > ++++ > {noformat} > -- Explain for the above incase someone is interested. > {noformat} > explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', > 'RRSQ_IMKKL') AND C like 'RRS\\_%'; > ++-++--+ > | PLAN >| EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | > ++-++--+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG > ['abcdef'] - [*] | null| null | null | > | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%' >| null| null | null | > ++-++--+ > 2 rows selected (0.003 seconds) > {noformat} > I'm trying to poke around in the code to figure out the issue but my > understanding of Phoenix is limited at this point. So creating a bug report > incase someone can figure this out quickly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5769) Phoenix precommit Flapping HadoopQA Tests in master
[ https://issues.apache.org/jira/browse/PHOENIX-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084610#comment-17084610 ] Daniel Wong commented on PHOENIX-5769: -- Thanks for all your work [~stoty]. I'll see if I can spend any time next week, and I know [~neha.gupta] may be looking at these as well. > Phoenix precommit Flapping HadoopQA Tests in master > > > Key: PHOENIX-5769 > URL: https://issues.apache.org/jira/browse/PHOENIX-5769 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > Attachments: PHOENIX-5769.master.v1.patch, > PHOENIX-5769.master.v3.patch, PHOENIX-5769.master.v4.patch, consoleFull > (1).html, consoleFull (2).html, consoleFull (3).html, consoleFull (4).html, > consoleFull (5).html, consoleFull (6).html, consoleFull (7).html, consoleFull > (8).html, consoleFull.html, testhang.dump > > > I was recently trying to commit changes to Phoenix for multiple issues and > was asked to get clean HadoopQA runs. However, this took a huge effort as I > had to resubmit the same patch multiple times in order to get one "clean". > Looking at the errors the most common ones were 3 "Multiple regions on > " and 3 for apache infra issues (host shutdown), 1 for > org.apache.hadoop.hbase.NotServingRegionException, 1 for > SnapshotDoesNotExistException. See builds > [https://builds.apache.org/job/PreCommit-PHOENIX-Build/] here from 3540's to > 3560's. In addition I see multiple builds running simultaneously, limiting > tests to running on 1 host should be configurable right? > In addition I was recommended by [~yanxinyi] that master was less likely to > have issues getting a clean run than 4.x. FYI [~ckulkarni] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074834#comment-17074834 ] Daniel Wong edited comment on PHOENIX-5812 at 4/3/20, 7:14 PM: --- If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency or retries that I would want to avoid. Any thoughts from outside of Salesforce? [~elserj] [~stoty] [~RichardAntal] [~vtpavan] was (Author: dbwong): If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency retries that I would want to avoid. Any thoughts from outside of Salesforce? 
[~elserj] [~stoty] [~RichardAntal] [~vtpavan] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074834#comment-17074834 ] Daniel Wong commented on PHOENIX-5812: -- If we want to do idleness correctly the first time we certainly can; however, there is value in timing from connection open, since for the use cases our customers have we expect no connection to be reused for multiple queries, due to the lightweight nature of our connections and our communicated recommended call pattern. On the monitor vs. doing this at throttle time: I slightly prefer doing it as a monitor. One, this would hopefully make the OPEN_CONNECTIONS metric clearer. And two, the end user experience: if they hit a throttled event we could take 2 paths, 1) retry the query internally after calling `closeAllIdleConnections()`, or 2) have the end user retry. Both add latency or retries that I would want to avoid. Any thoughts from outside of Salesforce? [~elserj] [~stoty] [~RichardAntal] [~vtpavan] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073165#comment-17073165 ] Daniel Wong commented on PHOENIX-5812: -- Thoughts [~liuml07]? [~vmeka] I think the requirements would look like this: * New Connection Level Property phoenix.connection.lifetime or similar * defaults to 0, for infinity ie do not close connections. * When non-0, connections that are open longer than phoenix.connection.lifetime are closed. (Could be CQS or elsewhere). * New global metric for number of reaped connections counter. Individual users can metric/alarm on this. > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
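The requirements listed above can be sketched as a small reaper. This is a hypothetical illustration, not the actual Phoenix API: a phoenix.connection.lifetime-style setting (0 = infinite, i.e. never close), a task that closes any tracked connection open longer than that lifetime, and a counter the caller can export as the proposed global metric.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConnectionReaper {
    interface Conn { void close(); }

    private final Map<Conn, Long> openedAt = new ConcurrentHashMap<>();
    private final long lifetimeMillis; // 0 means infinite, i.e. never reap
    private long reapedCount = 0;

    ConnectionReaper(long lifetimeMillis) { this.lifetimeMillis = lifetimeMillis; }

    // Record when a connection was opened (the timestamp is injected for testability).
    void track(Conn c, long openedAtMillis) { openedAt.put(c, openedAtMillis); }

    // Close and forget every connection open longer than the configured lifetime.
    synchronized void reapExpired(long nowMillis) {
        if (lifetimeMillis == 0) return;
        for (Map.Entry<Conn, Long> e : openedAt.entrySet()) {
            if (nowMillis - e.getValue() > lifetimeMillis) {
                e.getKey().close();
                openedAt.remove(e.getKey());
                reapedCount++; // exported as the "reaped connections" counter
            }
        }
    }

    long getReapedCount() { return reapedCount; }
}
```

In Phoenix this would presumably hang off the connection-tracking layer, with a scheduled executor invoking reapExpired(System.currentTimeMillis()) periodically; ConcurrentHashMap's weakly consistent iterator makes the remove-during-iteration safe.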
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073019#comment-17073019 ] Daniel Wong commented on PHOENIX-5812: -- FYI [~jisaac] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5769) Phoenix precommit Flapping HadoopQA Tests in master
[ https://issues.apache.org/jira/browse/PHOENIX-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072371#comment-17072371 ] Daniel Wong commented on PHOENIX-5769: -- So I took a look at the patch, but I didn't see the corresponding timeout. Is that in the Jenkins configuration, which I assume I don't have access to? > Phoenix precommit Flapping HadoopQA Tests in master > > > Key: PHOENIX-5769 > URL: https://issues.apache.org/jira/browse/PHOENIX-5769 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Assignee: Istvan Toth >Priority: Major > Attachments: PHOENIX-5769.master.v1.patch, > PHOENIX-5769.master.v3.patch, consoleFull (1).html, consoleFull (2).html, > consoleFull (3).html, consoleFull (4).html, consoleFull (5).html, consoleFull > (6).html, consoleFull (7).html, consoleFull (8).html, consoleFull.html > > > I was recently trying to commit changes to Phoenix for multiple issues and > was asked to get clean HadoopQA runs. However, this took a huge effort as I > had to resubmit the same patch multiple times in order to get one "clean". > Looking at the errors, the most common ones were 3 "Multiple regions on > " and 3 for apache infra issues (host shutdown), 1 for > org.apache.hadoop.hbase.NotServingRegionException, 1 for > SnapshotDoesNotExistException. See builds > [https://builds.apache.org/job/PreCommit-PHOENIX-Build/] here from 3540's to > 3560's. In addition I see multiple builds running simultaneously; limiting > tests to running on 1 host should be configurable, right? > In addition I was recommended by [~yanxinyi] that master was less likely to > have issues getting a clean run than 4.x. FYI [~ckulkarni] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072317#comment-17072317 ] Daniel Wong commented on PHOENIX-5812: -- A couple more thoughts after doing some additional research. Oracle provides a connection lifetime parameter at the connection level. We could easily do the same and provide a connection lifetime parameter that defaults to infinite. I think that gives the most control to the end user and solves most lifetime problems, by configuring the default connection lifetime to a reasonable value. We can look into the concept of idle connections in a follow-up Jira if we decide we want more complicated logic. > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5812) Automatically Close "Idle" Long Open Connections
[ https://issues.apache.org/jira/browse/PHOENIX-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072202#comment-17072202 ] Daniel Wong commented on PHOENIX-5812: -- FYI [~liuml07] [~ckulkarni] > Automatically Close "Idle" Long Open Connections > > > Key: PHOENIX-5812 > URL: https://issues.apache.org/jira/browse/PHOENIX-5812 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Major > > As Phoenix may keep a maximum default number of connections. Badly > performing client calls or internal errors (See PHOENIX-5802). Can cause > total available connections to go to 0. Proposing a client connection > monitor with a connection reaper like task to reap idle connections. > Definition of "Idle" > Simple may be simple time based say if a connection has been open for > configurable amount of minutes simply close. > More complicated solution may be keeping track of last interaction time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066944#comment-17066944 ] Daniel Wong commented on PHOENIX-5796: -- It's not necessarily optimal. There are a bunch of possible Jiras around optimization that we could consider. If the selectivity of the index query is high, that is, we select a large portion of the rows, it will be faster to full table scan the base table. With stats or with region boundaries we can better estimate the selectivity of predicates and thus decide if we want to join or FTS. One easy one I want to open is small scan optimizations, where, for example, if the user provides a limit we do stuff like this: this exact case plus a limit with an index join on uncovered columns; I forget if I opened a Jira already. Using region boundaries would possibly be enough in this case as the index is on B only, so it's meaningful enough; if the index had been on C,B it would not be sufficient to estimate selectivity and we would need stats. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png, Screen Shot 2020-03-24 at 11.51.12 AM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. 
> Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! > In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
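The selectivity-driven plan choice described in the comment above can be sketched as follows. `PlanChooser` and the 0.3 cutoff are hypothetical illustrations of the idea, not Phoenix's actual optimizer logic, which is more involved.

```java
// Hedged sketch: estimate the fraction of rows an index predicate selects
// (from stats or region boundaries) and pick full table scan vs. index join.
class PlanChooser {
    // Hypothetical cutoff: if the predicate selects more than this fraction of
    // rows, a full table scan of the base table is assumed cheaper than an
    // index join back to the base table.
    static final double FTS_THRESHOLD = 0.3;

    static String choosePlan(long estimatedMatchingRows, long totalRows) {
        double fractionSelected =
                totalRows == 0 ? 0.0 : (double) estimatedMatchingRows / totalRows;
        return fractionSelected > FTS_THRESHOLD ? "FULL_TABLE_SCAN" : "INDEX_JOIN";
    }
}
```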
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066109#comment-17066109 ] Daniel Wong commented on PHOENIX-5796: -- Its a skip scan join so in theory should jump straight to the joined rows. [~ckulkarni] can double check though. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png, Screen Shot 2020-03-24 at 11.51.12 AM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. > Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns
[ https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066088#comment-17066088 ] Daniel Wong commented on PHOENIX-5796: -- Today at least we can force a range scan through index hint which will do the uncovered join. Slightly different but should be similar or better performance. > Possible query optimization when projecting uncovered columns and querying on > indexed columns > - > > Key: PHOENIX-5796 > URL: https://issues.apache.org/jira/browse/PHOENIX-5796 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Priority: Major > Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot > 2020-03-23 at 3.32.24 PM.png > > > Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to > it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client. > Create a base table like: > {code:sql} > create table t (a integer primary key, b varchar(10), c integer); > {code} > Create an uncovered index on top of it like: > {code:sql} > create index uncov_index_t on t(b); > {code} > Now if you issue the query: > {code:sql} > explain select c from t where b='abc'; > {code} > You'd see the following explain plan: > !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700! > *Which is a full table scan on the base table 't'* since we cannot use the > global index as 'c' is not a covered column in the global index. > *However, projecting columns contained fully within the index pk is correctly > a range scan:* > {code:sql} > explain select a,b from t where b='abc'; > {code} > produces the following explain plan: > !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, > get the start and stop keys of the base table and then issue a range > scan/(bunch of point lookups) on the base table* instead of doing a full > table scan on the base table like we currently do? -- This message was sent by Atlassian Jira (v8.3.4#803005)
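The hint workaround from the comment above would look roughly like this against the table and index in the issue description (a sketch; verify the hint syntax and resulting plan against the Phoenix documentation for your version):

```sql
-- Hint the planner to use the uncovered index; Phoenix then joins back to the
-- base table 't' to fetch the uncovered column 'c'.
EXPLAIN SELECT /*+ INDEX(t uncov_index_t) */ c FROM t WHERE b = 'abc';
```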
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053848#comment-17053848 ] Daniel Wong edited comment on PHOENIX-4845 at 3/7/20, 2:09 AM: --- I ran these tests locally and they passed on master/4.x-HBase-1.3. Thoughts [~yanxinyi] [~ckulkarni] was (Author: dbwong): I ran these tests locally on master/4.x-HBase-1.3. Thoughts [~yanxinyi] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, > PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, > PHOENIX-4845.patch, PHOENIX-offset.txt > > Time Spent: 16h 10m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. 
This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
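The operator-flipping burden that the RVC OFFSET proposal removes can be illustrated with a small helper. `RvcPaging.pageComparator` is hypothetical, not Phoenix API; it just encodes the rule stated in the description: ascending PKs page with `>`, all-descending PKs page with `<`, and mixed sort order cannot be paged with a single RVC comparison at all.

```java
// Hypothetical illustration of the client-side logic RVC OFFSET would make
// unnecessary: choosing the comparison operator for RVC-based paging.
class RvcPaging {
    static String pageComparator(boolean[] descPerPkColumn) {
        boolean first = descPerPkColumn[0];
        for (boolean desc : descPerPkColumn) {
            // A single RVC comparison only works when all PK columns sort the
            // same way; mixed order is exactly the case OFFSET would handle.
            if (desc != first) {
                throw new IllegalArgumentException(
                        "mixed sort order: RVC paging not possible without OFFSET");
            }
        }
        // All DESC: the next page lies at smaller key values, so flip to '<'.
        return first ? "<" : ">";
    }
}
```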
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053848#comment-17053848 ] Daniel Wong commented on PHOENIX-4845: -- I ran these tests locally on master/4.x-HBase-1.3. Thoughts [~yanxinyi] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, > PHOENIX-4845-4.x-HBase-1.3.v2.patch, PHOENIX-4845-4.x-HBase-1.3.v3.patch, > PHOENIX-4845.patch, PHOENIX-offset.txt > > Time Spent: 16h 10m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) 
> {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5711) Fix duplicate NOTICE error on phoenix-client shading
[ https://issues.apache.org/jira/browse/PHOENIX-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034981#comment-17034981 ] Daniel Wong commented on PHOENIX-5711: -- I think there may have been differences in how it handles regex between versions; I can't recall for sure. I think the better way going forward is to use the tool made for combining required notices, handling the overlap in a "designed" fashion for possible legal requirements: [https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheNoticeResourceTransformer |https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheNoticeResourceTransformer] > Fix duplicate NOTICE error on phoenix-client shading > > > Key: PHOENIX-5711 > URL: https://issues.apache.org/jira/browse/PHOENIX-5711 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Masatake Iwasaki >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I got the following error on building 4.15.0-HBase-1.5 against hadoop-2.10.0 > (for BIGTOP-3298). The cause seems to be the pattern for exclusion in pom.xml > does not match the file name NOTICE (without suffix). > {noformat} > $ mvn clean install -DskipTests -Dhbase.version=1.5.0 -Dhadoop.version=2.10.0 > -Dhadoop-two.version=2.10.0 > ...snip > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-shade-plugin:3.1.1:shade (default-shaded) on > project phoenix-client: Error creating shaded jar: duplicate entry: NOTICE -> > [Help 1] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
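The resource-transformer approach linked above would be wired into the shade plugin section of the pom roughly like this (a sketch; the surrounding plugin configuration and whether `addHeader` should be disabled depend on Phoenix's actual NOTICE layout):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <!-- Merges the META-INF/NOTICE files of all shaded dependencies into one
           entry instead of failing on duplicates -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer">
        <addHeader>false</addHeader>
      </transformer>
    </transformers>
  </configuration>
</plugin>
```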
[jira] [Commented] (PHOENIX-5711) Fix duplicate NOTICE error on phoenix-client shading
[ https://issues.apache.org/jira/browse/PHOENIX-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034941#comment-17034941 ] Daniel Wong commented on PHOENIX-5711: -- Thanks for the patch. I think I saw this before internally and our issue was the version of the maven plugin. What did you test this against? > Fix duplicate NOTICE error on phoenix-client shading > > > Key: PHOENIX-5711 > URL: https://issues.apache.org/jira/browse/PHOENIX-5711 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Masatake Iwasaki >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I got the following error on building 4.15.0-HBase-1.5 against hadoop-2.10.0 > (for BIGTOP-3298). The cause seems to be the pattern for exclusion in pom.xml > does not match the file name NOTICE (without suffix). > {noformat} > $ mvn clean install -DskipTests -Dhbase.version=1.5.0 -Dhadoop.version=2.10.0 > -Dhadoop-two.version=2.10.0 > ...snip > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-shade-plugin:3.1.1:shade (default-shaded) on > project phoenix-client: Error creating shaded jar: duplicate entry: NOTICE -> > [Help 1] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5537) Phoenix-4701 made hard coupling between phoenix.log.level and getting request metrics.
[ https://issues.apache.org/jira/browse/PHOENIX-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034849#comment-17034849 ] Daniel Wong commented on PHOENIX-5537: -- [~yanxinyi] [~ckulkarni] can you help me get this to a Hadoop QA run? I think Richard is not marked as a contributor yet. > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. > -- > > Key: PHOENIX-5537 > URL: https://issues.apache.org/jira/browse/PHOENIX-5537 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Priority: Minor > Fix For: 4.15.1 > > Attachments: PHOENIX-5537.master.v1.patch > > > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. For users who do not want to enable system logging, this causes a > regression from earlier behavior where metrics were populated. FYI [~ankit] > {code:java} > public OverAllQueryMetrics(boolean isRequestMetricsEnabled, LogLevel connectionLogLevel) { > queryWatch = new MetricsStopWatch(WALL_CLOCK_TIME_MS.isLoggingEnabled(connectionLogLevel)); > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5537) Phoenix-4701 made hard coupling between phoenix.log.level and getting request metrics.
[ https://issues.apache.org/jira/browse/PHOENIX-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034848#comment-17034848 ] Daniel Wong commented on PHOENIX-5537: -- [~RichardAntal] In general the changes look okay to me, but I would like to see a unit test or integration test to go with the patch. I'll see if I can get this patch submitted and have our HadoopQA run our tests against it. > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. > -- > > Key: PHOENIX-5537 > URL: https://issues.apache.org/jira/browse/PHOENIX-5537 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0 >Reporter: Daniel Wong >Priority: Minor > Fix For: 4.15.1 > > Attachments: PHOENIX-5537.master.v1.patch > > > Phoenix-4701 made hard coupling between phoenix.log.level and getting request > metrics. For users who do not want to enable system logging, this causes a > regression from earlier behavior where metrics were populated. FYI [~ankit] > {code:java} > public OverAllQueryMetrics(boolean isRequestMetricsEnabled, LogLevel connectionLogLevel) { > queryWatch = new MetricsStopWatch(WALL_CLOCK_TIME_MS.isLoggingEnabled(connectionLogLevel)); > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
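The decoupling this issue asks for can be illustrated minimally. The class names here mirror `OverAllQueryMetrics`/`MetricsStopWatch` but are sketches, not the real Phoenix classes; the point is that metric collection should be driven by `isRequestMetricsEnabled` rather than solely by the connection log level, as it became after PHOENIX-4701.

```java
// Sketch of a stopwatch whose enablement mirrors MetricsStopWatch.
class MetricsStopWatchSketch {
    private final boolean enabled;
    MetricsStopWatchSketch(boolean enabled) { this.enabled = enabled; }
    boolean isEnabled() { return enabled; }
}

// Hypothetical decoupled constructor: request metrics alone are sufficient to
// enable the watch, regardless of the connection log level.
class OverAllQueryMetricsSketch {
    final MetricsStopWatchSketch queryWatch;

    OverAllQueryMetricsSketch(boolean isRequestMetricsEnabled, boolean logLevelAllowsMetric) {
        queryWatch = new MetricsStopWatchSketch(isRequestMetricsEnabled || logLevelAllowsMetric);
    }
}
```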
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015504#comment-17015504 ] Daniel Wong commented on PHOENIX-4845: -- Checked the tests; that appears to be an environmental hiccup, and it ran clean locally. If you have time, can [~tdsilva] or [~jamestaylor] take a peek? Not sure who else might have knowledge of the parser/optimizer. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-4845-4.x-HBase-1.3.patch, PHOENIX-offset.txt > > Time Spent: 14h 50m > Remaining Estimate: 0h > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of and Index or Primary Table. > Note that there is some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To Handle This the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5628) Phoenix Function to Return HBase Row Key of Column Cell
[ https://issues.apache.org/jira/browse/PHOENIX-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013261#comment-17013261 ] Daniel Wong commented on PHOENIX-5628: -- I agree with the above; please add a composite key with variable types, and sorting. The HBase check is valid but perhaps overkill; we can simply define the rowkey and compare against it. > Phoenix Function to Return HBase Row Key of Column Cell > --- > > Key: PHOENIX-5628 > URL: https://issues.apache.org/jira/browse/PHOENIX-5628 > Project: Phoenix > Issue Type: New Feature >Reporter: Geoffrey Jacoby >Assignee: Abhishek Singh Chouhan >Priority: Major > Fix For: 5.1.0, 4.16.0 > > Attachments: PHOENIX-5628-4.x-HBase-1.5.patch, > PHOENIX-5628-master.patch > > > It's occasionally useful when diagnosing an issue with Phoenix to be able to > easily look up the byte array of a row key that corresponds to a cell with a > Phoenix query. This can then be looked up in the HBase shell. > For example: > SELECT ROWKEY_BYTES(Column1) FROM Table1 WHERE Column1 = 'SomeValue' -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5614) Remove unnecessary instances of ClassNotFoundException thrown stemming from various QueryUtil APIs
[ https://issues.apache.org/jira/browse/PHOENIX-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998738#comment-16998738 ] Daniel Wong commented on PHOENIX-5614: -- Took a quick look: we were swallowing exceptions in, say, MetaDataEndpointImpl. The new code has no swallow. Are we confident we didn't expose a new code path? I haven't looked in detail yet; forgive my high-level question. > Remove unnecessary instances of ClassNotFoundException thrown stemming from > various QueryUtil APIs > -- > > Key: PHOENIX-5614 > URL: https://issues.apache.org/jira/browse/PHOENIX-5614 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0 >Reporter: Chinmay Kulkarni >Assignee: Chinmay Kulkarni >Priority: Minor > Labels: beginner, newbie > Fix For: 4.15.1 > > Attachments: PHOENIX-5614-4.x-HBase-1.3-v1.patch > > > Many of the QueryUtil APIs have CNFE in their method signatures, but they > don't really throw them. We mostly handle this in its callers (like inside > MetaDataEndpointImpl) by swallowing this exception. We should remove these > instances. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5510) LoggingPhoenixConnection on close does not log with unclosed statements/result sets
[ https://issues.apache.org/jira/browse/PHOENIX-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946074#comment-16946074 ] Daniel Wong commented on PHOENIX-5510: -- A new method for logging errors related to logging is essentially what I was proposing. In my case I wanted to log an error message noting that we have unclosed result sets and statements. During close of a LoggingPhoenixConnection, the class does not have access to the SQL stored in the statements/result sets it contains. In addition, we might want to indicate that the metrics for READS from an unclosed result set were not recorded, i.e., the read-path interface was not called. There is no method to do so today other than hacky ways like calling the read interface with make-believe SQL or make-believe additional metrics to indicate these were not recorded. This means I cannot put my code, which uses my internal logging solution, into open source, as I manually changed LoggingPhoenixConnection and directly called into my logging libraries. If there were a metrics-error method, I could have indicated during close of the connection that the metrics were missing. > LoggingPhoenixConnection on close does not log with unclosed > statements/result sets > --- > > Key: PHOENIX-5510 > URL: https://issues.apache.org/jira/browse/PHOENIX-5510 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.1, 4.14.3 >Reporter: Daniel Wong >Priority: Minor > > Testing the LoggingPhoenixConnection it does not handle request level metrics > for unclosed statements/result sets. This is due to the delegate nature > where the resulting PhoenixConnection ends up closing these without ever > going through a delegate. Without this log or an way to inform the user that > these occurred operations are essentially blind to these requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (PHOENIX-5510) LoggingPhoenixConnection on close does not log with unclosed statements/result sets
[ https://issues.apache.org/jira/browse/PHOENIX-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945254#comment-16945254 ] Daniel Wong commented on PHOENIX-5510: -- So, thinking about this a bit more, this may not be an actual bug in logging the result metrics, as users should close result sets and statements prior to connections. However, it would be nice to log something in case this does occur so administrators etc. can have info on when end users do not properly handle this type of closing. The PhoenixMetricsLog interface could have an additional method, either in the next release or, if the move to Java 8 occurs, as a default method. In addition this code could be gutted out and moved to a separate client utils library. Perhaps even better, a cleaner Java service implementation with better coupling with the internal Phoenix code. Note for future readers: for my own use case I will inherit and override the logging classes to handle logging and metrics for this case. FYI [~tdsilva] [~ckulkarni] > LoggingPhoenixConnection on close does not log with unclosed > statements/result sets > --- > > Key: PHOENIX-5510 > URL: https://issues.apache.org/jira/browse/PHOENIX-5510 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.1, 4.14.3 >Reporter: Daniel Wong >Priority: Minor > > Testing the LoggingPhoenixConnection it does not handle request level metrics > for unclosed statements/result sets. This is due to the delegate nature > where the resulting PhoenixConnection ends up closing these without ever > going through a delegate. Without this log or an way to inform the user that > these occurred operations are essentially blind to these requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
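The default-method idea mentioned in the comment above can be sketched like this. `MetricsLogSketch` is a hypothetical illustration, not the real PhoenixMetricsLog interface: it shows how an error/notification hook could be added without breaking existing implementers.

```java
// Hypothetical metrics-logging interface with an added default-method hook.
interface MetricsLogSketch {
    void logReadMetrics(String sql);

    // Default so existing implementations continue to compile; a connection
    // could call this on close when unclosed statements/result sets mean the
    // read metrics were never recorded.
    default void logMetricsError(String reason) {
        // no-op by default; implementations may log or alarm on it
    }
}
```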
[jira] [Commented] (PHOENIX-5493) Remove unnecessary iteration in BaseResultIterator
[ https://issues.apache.org/jira/browse/PHOENIX-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937934#comment-16937934 ] Daniel Wong commented on PHOENIX-5493: -- +1 for this patch. Note I would like to eventually get the larger changes in this module from the stats branch in. > Remove unnecessary iteration in BaseResultIterator > -- > > Key: PHOENIX-5493 > URL: https://issues.apache.org/jira/browse/PHOENIX-5493 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.14.3 >Reporter: Aman Poonia >Assignee: Aman Poonia >Priority: Minor > Attachments: PHOENIX-5493.patch > > > In BaseResultIterator
> {code:java}
> while (offset < nColumnsInCommon && offset < rangesListSize) {
>     List ranges = rangesList.get(offset);
>     // We use a skip scan if we have multiple ranges or if
>     // we have a non single key range before the last range.
>     useSkipScan |= ranges.size() > 1 || hasRange;
>     cnf.add(ranges);
>     int rangeSpan = 1 + dataScanRanges.getSlotSpans()[offset];
>     if (offset + rangeSpan > nColumnsInCommon) {
>         rangeSpan = nColumnsInCommon - offset;
>         // trim range to only be rangeSpan in length
>         ranges = Lists.newArrayListWithExpectedSize(cnf.get(cnf.size()-1).size());
>         for (KeyRange range : cnf.get(cnf.size()-1)) {
>             range = clipRange(dataScanRanges.getSchema(), offset, rangeSpan, range);
>             // trim range to be only rangeSpan in length
>             ranges.add(range);
>         }
>         cnf.set(cnf.size()-1, ranges);
>     }
>     for (KeyRange range : ranges) {
>         if (!range.isSingleKey()) {
>             hasRange = true;
>         }
>     }
>     slotSpan[offset] = rangeSpan - 1;
>     offset = offset + rangeSpan;
> }
> {code}
> we can break in the inner loop and save some CPU cycles -- This message was sent by Atlassian Jira (v8.3.4#803005)
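The optimization the patch proposes can be shown in isolation: the inner loop above only needs to learn whether any range in the slot is a non-single key, so it can stop at the first hit. This is an illustrative reduction, not the actual BaseResultIterators code.

```java
import java.util.Arrays;
import java.util.List;

class BreakEarlySketch {
    // Returns true if any entry is a non-single-key range; breaks on first hit,
    // since the remaining entries cannot change the answer.
    static boolean hasNonSingleKey(List<Boolean> isSingleKey) {
        boolean hasRange = false;
        for (boolean single : isSingleKey) {
            if (!single) {
                hasRange = true;
                break; // the saved iterations are the whole point of the patch
            }
        }
        return hasRange;
    }

    public static void main(String[] args) {
        System.out.println(hasNonSingleKey(Arrays.asList(true, false, true))); // true
        System.out.println(hasNonSingleKey(Arrays.asList(true, true)));        // false
    }
}
```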
[jira] [Commented] (PHOENIX-5085) Disentangle BaseResultIterators from the backing Guidepost Data structure
[ https://issues.apache.org/jira/browse/PHOENIX-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930788#comment-16930788 ] Daniel Wong commented on PHOENIX-5085: -- Marking as in progress as part of the Stats branch where these changes exist. > Disentangle BaseResultIterators from the backing Guidepost Data structure > - > > Key: PHOENIX-5085 > URL: https://issues.apache.org/jira/browse/PHOENIX-5085 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Labels: Statistics, StatsImprovement > > Disentangle BaseResultIterators.getParallelScans from the backing Guidepost > Data structure. This will provide the abstraction for possible new stats > data structures in https://issues.apache.org/jira/browse/PHOENIX-4925 > Will heavily affect changes in > https://issues.apache.org/jira/browse/PHOENIX-4926 and > https://issues.apache.org/jira/browse/PHOENIX-4594. [~Bin Shi] > [~karanmehta93] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (PHOENIX-5444) Incorrect Phoenix Client Path in phoenix_utils.py
[ https://issues.apache.org/jira/browse/PHOENIX-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906801#comment-16906801 ] Daniel Wong commented on PHOENIX-5444: -- Fix here: [https://github.com/apache/phoenix/pull/566] > Incorrect Phoenix Client Path in phoenix_utils.py > - > > Key: PHOENIX-5444 > URL: https://issues.apache.org/jira/browse/PHOENIX-5444 > Project: Phoenix > Issue Type: Bug >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > ./bin/phoenix_utils.py has a variable for the phoenix client jar pattern but does > not use it in all places. Using it consistently will correct some issues for future > client names (and in my case my forked versions). -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5142) Protect against bad scan construction causing stack overflow
[ https://issues.apache.org/jira/browse/PHOENIX-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897558#comment-16897558 ] Daniel Wong commented on PHOENIX-5142: -- I'm not sure how to reproduce this in an IT. The bad scans were due to an implementation error I had during testing, so I'm not sure it is worth reproducing or filling in affected versions. This is more about how the code is structured than a bug, but I'll fill in the latest version as that code path has not been changed. > Protect against bad scan construction causing stack overflow > > > Key: PHOENIX-5142 > URL: https://issues.apache.org/jira/browse/PHOENIX-5142 > Project: Phoenix > Issue Type: Improvement >Reporter: Daniel Wong >Priority: Minor > > During implementation changes I created an issue where a bad set of scans > was generated. These scans crossed region boundaries incorrectly. This led > Phoenix to believe the cache was out of date and triggered a retry attempt in > a recursive manner until the stack overflowed. We may want to protect > against this type of failure by limiting the depth of the recursion and > returning an exception to the user. 
> {code:java} > java.lang.StackOverflowError at > java.security.AccessController.doPrivileged(Native Method) at > java.io.PrintWriter.(PrintWriter.java:116) at > java.io.PrintWriter.(PrintWriter.java:100) at > org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58) > at > org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87) > at > org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413) > at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313) at > org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at > org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at > org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) > at org.apache.log4j.Category.callAppenders(Category.java:206) at > org.apache.log4j.Category.forcedLog(Category.java:391) at > org.apache.log4j.Category.log(Category.java:856) at > org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:382) at > org.apache.phoenix.iterate.BaseResultIterators.close(BaseResultIterators.java:1568) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1475) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > 
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > at > org.apache.phoenix.iterate.BaseResultIterators.recreateIterators(BaseResultIterators.java:1523) > at > org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1441) > > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
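The protection the description suggests might look like the following sketch. The names and the depth cap of 5 are assumptions, and the mutual recursion stands in for the getIterators/recreateIterators cycle visible in the stack trace; it is not the actual Phoenix code.

```java
class DepthLimitSketch {
    static final int MAX_RETRY_DEPTH = 5; // assumed cap, not a Phoenix constant

    static String getIterators(int depth, boolean cacheAlwaysStale) {
        if (depth > MAX_RETRY_DEPTH) {
            // Surface a bounded error instead of letting the recursion
            // run until java.lang.StackOverflowError.
            throw new IllegalStateException(
                "region cache retry depth " + depth + " exceeded");
        }
        if (cacheAlwaysStale) {
            // In the bug above, a bad scan made this branch repeat forever.
            return recreateIterators(depth, cacheAlwaysStale);
        }
        return "iterators";
    }

    static String recreateIterators(int depth, boolean cacheAlwaysStale) {
        return getIterators(depth + 1, cacheAlwaysStale);
    }

    public static void main(String[] args) {
        System.out.println(getIterators(0, false)); // iterators
        try {
            getIterators(0, true); // simulated bad scan: cache never converges
        } catch (IllegalStateException e) {
            System.out.println("bounded: " + e.getMessage());
        }
    }
}
```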
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896605#comment-16896605 ] Daniel Wong commented on PHOENIX-4845: -- I was implementing salt handling by not handling the "ORDER BY" type concerns and just paging through the salted table normally. That is, first salt byte 0, then salt byte 1, etc. The ORDER BY would have to be unconstrained, as we would not be able to return rows in PK order otherwise. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending; we can always use the > operator to query for > the next batch of rows. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A descending and then by B descending we need to change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order, e.g. (A DESC, > B), then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns of an Index or Primary Table. > Note that some care is needed in the use of OFFSET with indexes. If > the OFFSET is coercible to multiple indexes/base table it could mean very > different positions based on key. To handle this, the INDEX hint needs to be > used to specify an index offset for safety. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
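The client-side burden that an RVC OFFSET would remove can be sketched as follows. This illustrates the description's point, that the RVC comparison operator depends on the PK sort order, and is not Phoenix code; it also only covers uniform sort orders, since with mixed order no single operator works at all.

```java
class RvcPagingSketch {
    enum SortOrder { ASC, DESC }

    // Builds the RVC paging query a client must construct today: '>' for
    // ascending PKs, '<' for descending PKs. With OFFSET (?,?) this choice
    // would move into the engine.
    static String pagingQuery(String table, SortOrder order) {
        String op = (order == SortOrder.ASC) ? ">" : "<";
        String dir = (order == SortOrder.ASC) ? "" : " DESC";
        return "SELECT * FROM " + table + " WHERE (A, B) " + op
            + " (?, ?) ORDER BY A" + dir + ", B" + dir + " LIMIT 20";
    }

    public static void main(String[] args) {
        System.out.println(pagingQuery("T", SortOrder.ASC));
        System.out.println(pagingQuery("T", SortOrder.DESC));
    }
}
```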
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896577#comment-16896577 ] Daniel Wong commented on PHOENIX-4845: -- How would we expect them to handle things like the SALT byte there? Will they even be able to retrieve that data? This would also require them to understand how Phoenix lays out the secondary indexes, as given only their CREATE INDEX statement they have no knowledge of those keys even existing and would have to inspect the system catalog to know about them. If we are okay with those restrictions this is feasible, but I dislike not being able to run on an index whose base table is salted, for example. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896562#comment-16896562 ] Daniel Wong commented on PHOENIX-4845: -- I'll try. I could require an INDEX or NO INDEX hint relatively easily, which maybe I should. However, I cannot determine all the user-defined index PKs given the current SYSCAT as far as I can tell. Consider the following code fragment and the table and indexes it creates:
{code:java}
String ddlTemplate = "CREATE TABLE %s (k1 TINYINT NOT NULL,\n"
    + "k2 TINYINT NOT NULL,\n"
    + "k3 TINYINT NOT NULL,\n"
    + "k4 TINYINT NOT NULL,\n"
    + "k5 TINYINT NOT NULL,\n"
    + "k6 TINYINT NOT NULL,\n"
    + "v1 INTEGER,\n"
    + "v2 INTEGER,\n"
    + "v3 INTEGER,\n"
    + "v4 INTEGER,\n"
    + "CONSTRAINT pk PRIMARY KEY (k1, k2, k3, k4, k5, k6)) ";
String longKeyTableName = "T_" + generateUniqueName();
String longKeyIndex1Name = "INDEX_1_" + longKeyTableName;
String longKeyIndex2Name = "INDEX_2_" + longKeyTableName;
String ddl = String.format(ddlTemplate, longKeyTableName);
conn.createStatement().execute(ddl);
String createIndex1 = "CREATE INDEX IF NOT EXISTS " + longKeyIndex1Name + " ON "
    + longKeyTableName + " (k2, v1, k4)";
String createIndex2 = "CREATE INDEX IF NOT EXISTS " + longKeyIndex2Name + " ON "
    + longKeyTableName + " (v1, v3)";
{code}
Here the PK columns for the table are PKs = \{K1, K2, K3, K4, K5, K6}. For index 1 the PK is \{K2, 0:V1, K4, K1, K3, K5, K6}, as we append the base PKs to the indexed row key. However, I cannot differentiate that PK from a CREATE INDEX that had, for example, CREATE INDEX (k2, v1, k4, k1). So how will I be able to force the user to provide either k1 or not provide k1? I didn't see anything in SYSCAT or the PTable interface that allows me to differentiate these scenarios. I could be missing something, however. 
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
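The index row-key composition described in the comment can be modeled minimally, ignoring details such as the 0: column-family prefix and sort order. The sketch below is illustrative, not Phoenix code; it shows why the two CREATE INDEX declarations become indistinguishable from the final PK list alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class IndexPkSketch {
    // Index PK = user-declared indexed columns, then any base-table PK columns
    // not already present, in base PK order (simplified model).
    static List<String> indexPk(List<String> indexed, List<String> basePk) {
        LinkedHashSet<String> pk = new LinkedHashSet<>(indexed);
        pk.addAll(basePk);
        return new ArrayList<>(pk);
    }

    public static void main(String[] args) {
        List<String> basePk = Arrays.asList("K1", "K2", "K3", "K4", "K5", "K6");
        List<String> a = indexPk(Arrays.asList("K2", "V1", "K4"), basePk);
        List<String> b = indexPk(Arrays.asList("K2", "V1", "K4", "K1"), basePk);
        System.out.println(a);           // [K2, V1, K4, K1, K3, K5, K6]
        System.out.println(a.equals(b)); // true: the declarations collapse to one PK
    }
}
```

Because both declarations collapse to the same list, nothing in the final PK records where the user-specified prefix ended, which is exactly the SYSCAT/PTable gap the comment raises.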
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896412#comment-16896412 ] Daniel Wong commented on PHOENIX-4845: -- One additional concern: based on my analysis, there does not seem to be an easy way to determine which PKs of an index are user defined and which are appended from the base table to facilitate uniqueness and indexed lookup. While we have lists of both PKs, I'm unable to determine which were user specified and which were appended. This means that I cannot simply require the offset to be fully specified for an indexed lookup. While I can implement with the leading edge of the key, this introduces more ambiguity. Assuming I'm correct that there is no way to reconstruct this currently, should we add this information to the system catalog? [~tdsilva] [~swaroopa] [~gjacoby] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895666#comment-16895666 ] Daniel Wong edited comment on PHOENIX-4845 at 7/30/19 12:07 AM: Updating the high level description with the following note. Note that there is some care needed in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specify an index offset for safety was (Author: dbwong): Updating the high level description with the following note. Note that there is some care neded in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specifiy an index offset for safety -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895666#comment-16895666 ] Daniel Wong commented on PHOENIX-4845: -- Updating the high level description with the following note. Note that there is some care neded in the use of OFFSET with indexes. If the OFFSET is coercible to multiple indexes/base table it could mean very different positions based on key. To Handle This the INDEX hint needs to be used to specifiy an index offset for safety > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4846) WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort order of pk columns being filtered on changes
[ https://issues.apache.org/jira/browse/PHOENIX-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890386#comment-16890386 ] Daniel Wong commented on PHOENIX-4846: -- I'll see if I can look at this later this week. > WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort > order of pk columns being filtered on changes > --- > > Key: PHOENIX-4846 > URL: https://issues.apache.org/jira/browse/PHOENIX-4846 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0, 5.0.0 >Reporter: Thomas D'Silva >Priority: Critical > Fix For: 4.15.0, 5.1.0 > > Attachments: PHOENIX-4846-wip.patch > > > {{ExpressionComparabilityWrapper}} should set the sort order based on > {{childPart.getColumn()}} or else the attached test throws an > IllegalArgumentException > {code} > java.lang.IllegalArgumentException: 4 > 3 > at java.util.Arrays.copyOfRange(Arrays.java:3519) > at > org.apache.hadoop.hbase.io.ImmutableBytesWritable.copyBytes(ImmutableBytesWritable.java:272) > at > org.apache.phoenix.compile.WhereOptimizer.getTrailingRange(WhereOptimizer.java:329) > at > org.apache.phoenix.compile.WhereOptimizer.clipRight(WhereOptimizer.java:350) > at > org.apache.phoenix.compile.WhereOptimizer.pushKeyExpressionsToScan(WhereOptimizer.java:237) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:157) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:108) > at > org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:556) > {code} > Also in {{pushKeyExpressionsToScan()}} we cannot extract pk column nodes from > the where clause if the sort order of the columns changes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5290) HashJoinMoreIT is flapping
[ https://issues.apache.org/jira/browse/PHOENIX-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887480#comment-16887480 ] Daniel Wong commented on PHOENIX-5290: -- Thanks for commenting on this. I think we are in agreement, but I was likely talking about some detailed-level info. For example, a VARBINARY row key column is allowed only as the last column exactly because of these separator issues, as a 0 byte is valid and a non-separator in a VARBINARY. We get away with this from an interpretation perspective since we know the length of the key: everything from the last field to the end is the VARBINARY. This length is essentially “stored” in the HBase key. In this case we do not allow new columns to be added to the row key. And I agree on not changing the row key; I did not mean that as an intention. Sorry if there was a misunderstanding! > HashJoinMoreIT is flapping > -- > > Key: PHOENIX-5290 > URL: https://issues.apache.org/jira/browse/PHOENIX-5290 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.14.1, 5.1.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5290-combined.txt, 5290-failure.txt, 5290-v2.txt, > 5290-v3.txt, 5290.txt > > > {code} > [INFO] Running org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 91.509 s <<< FAILURE! - in org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] testBug2961(org.apache.phoenix.end2end.join.HashJoinMoreIT) Time > elapsed: 2.42 s <<< ERROR! > java.lang.IllegalArgumentException: 6 > 5 > at > org.apache.phoenix.end2end.join.HashJoinMoreIT.testBug2961(HashJoinMoreIT.java:898) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-5290) HashJoinMoreIT is flapping
[ https://issues.apache.org/jira/browse/PHOENIX-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885544#comment-16885544 ] Daniel Wong commented on PHOENIX-5290: -- Quickly trying to catch up here. There is a bunch of weirdness around separators in our key/single-value row generation that likely should be abstracted out. We append separators somewhat blindly after writing each field when generating the row key. Since row keys have a length, it is unnecessary to store the trailing separator, so we trim it. If we were to have multiple variable-length fields (or nullable secondary indexes) we could in theory trim those as well, for example several empty-String VARCHARs in a row. Without looking in the code I cannot recall if there is a case where we trim all of them for storage, but for querying it appears we will do so for something like the following code snippet I ran. Here in my code the start key has no trailing separators.
{code:java}
String dataTableName = "T_" + generateUniqueName();
String dataTableDDL = "CREATE TABLE " + dataTableName
    + " (k1 INTEGER NOT NULL, s1 VARCHAR(64),\n"
    + "s2 VARCHAR(64),\n"
    + "s3 VARCHAR(64),\n"
    + "v1 INTEGER,\n"
    + "CONSTRAINT pk PRIMARY KEY (k1))";
conn.createStatement().execute(dataTableDDL);
conn.commit();
String indexDDL = "CREATE INDEX IF NOT EXISTS " + dataTableName + "_IDX"
    + " ON " + dataTableName + " (s1, s2, s3)";
conn.createStatement().execute(indexDDL);
conn.commit();
String upsertDML = String.format("UPSERT INTO %s VALUES(?,?,?,?,?)", dataTableName);
int nRows = 0;
PreparedStatement ps = conn.prepareStatement(upsertDML);
ps.setInt(1, 5);
ps.setString(2, "a");
ps.setString(3, "");
ps.setString(4, "");
ps.setInt(5, nRows);
int result = ps.executeUpdate();
assertEquals(1, result);
nRows++;
conn.commit();
String sql = "SELECT s1,s2,s3 FROM " + dataTableName
    + " WHERE (s1,s2,s3) >= ('a','','') ORDER BY s1,s2,s3";
ResultSet rs = conn.createStatement().executeQuery(sql);
assertTrue(rs.next());
{code}
> HashJoinMoreIT 
is flapping > -- > > Key: PHOENIX-5290 > URL: https://issues.apache.org/jira/browse/PHOENIX-5290 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.15.0, 4.14.1, 5.1.0 >Reporter: Lars Hofhansl >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5290-failure.txt, 5290-v2.txt, 5290-v3.txt, 5290.txt > > > {code} > [INFO] Running org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 91.509 s <<< FAILURE! - in org.apache.phoenix.end2end.join.HashJoinMoreIT > [ERROR] testBug2961(org.apache.phoenix.end2end.join.HashJoinMoreIT) Time > elapsed: 2.42 s <<< ERROR! > java.lang.IllegalArgumentException: 6 > 5 > at > org.apache.phoenix.end2end.join.HashJoinMoreIT.testBug2961(HashJoinMoreIT.java:898) > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
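The separator behavior discussed above can be illustrated with a simplified encoder. This is not Phoenix's actual row-key encoding (it ignores types, column families, and DESC order); the encode helper is hypothetical and only demonstrates the trailing-separator trimming idea.

```java
import java.util.Arrays;
import java.nio.charset.StandardCharsets;

class SeparatorTrimSketch {
    static final byte SEP = 0x00; // variable-length field separator

    // Append a separator after every field, then trim trailing separators:
    // the key's own length disambiguates, so they carry no information.
    static byte[] encode(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (String f : fields) {
            sb.append(f).append((char) SEP);
        }
        byte[] key = sb.toString().getBytes(StandardCharsets.UTF_8);
        int end = key.length;
        while (end > 0 && key[end - 1] == SEP) {
            end--; // empty trailing fields vanish along with their separators
        }
        return Arrays.copyOf(key, end);
    }

    public static void main(String[] args) {
        // ("a","","") encodes identically to ("a") once trailing separators go,
        // which is why the query above can still match the upserted row.
        System.out.println(Arrays.equals(encode("a", "", ""), encode("a"))); // true
    }
}
```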
[jira] [Commented] (PHOENIX-4846) WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort order of pk columns being filtered on changes
[ https://issues.apache.org/jira/browse/PHOENIX-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884083#comment-16884083 ] Daniel Wong commented on PHOENIX-4846: -- I'll try to look at this next week. > WhereOptimizer.pushKeyExpressionsToScan() does not work correctly if the sort > order of pk columns being filtered on changes > --- > > Key: PHOENIX-4846 > URL: https://issues.apache.org/jira/browse/PHOENIX-4846 > Project: Phoenix > Issue Type: Bug >Affects Versions: 4.14.0, 5.0.0 >Reporter: Thomas D'Silva >Priority: Critical > Fix For: 4.15.0, 5.1.0 > > Attachments: PHOENIX-4846-wip.patch > > > {{ExpressionComparabilityWrapper}} should set the sort order based on > {{childPart.getColumn()}} or else the attached test throws an > IllegalArgumentException > {code} > java.lang.IllegalArgumentException: 4 > 3 > at java.util.Arrays.copyOfRange(Arrays.java:3519) > at > org.apache.hadoop.hbase.io.ImmutableBytesWritable.copyBytes(ImmutableBytesWritable.java:272) > at > org.apache.phoenix.compile.WhereOptimizer.getTrailingRange(WhereOptimizer.java:329) > at > org.apache.phoenix.compile.WhereOptimizer.clipRight(WhereOptimizer.java:350) > at > org.apache.phoenix.compile.WhereOptimizer.pushKeyExpressionsToScan(WhereOptimizer.java:237) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:157) > at org.apache.phoenix.compile.WhereCompiler.compile(WhereCompiler.java:108) > at > org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:556) > {code} > Also in {{pushKeyExpressionsToScan()}} we cannot extract pk column nodes from > the where clause if the sort order of the columns changes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871653#comment-16871653 ] Daniel Wong commented on PHOENIX-4845: -- Lars, we discussed this a bit earlier in our threads, but essentially the reason OFFSET is good to have is that it removes the burden of knowing the DDL and access path when constructing the WHERE clause: you have to inspect which PK columns are ASC or DESC in order to construct the right comparator. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. 
> {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
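The burden the comment describes, knowing the PK sort order just to pick a comparator, can be sketched as a tiny helper. The class and method names here are invented for illustration; this is not a Phoenix API, and it only handles the uniform-sort-order case (mixed ASC/DESC is exactly what a single RVC comparison cannot express):

```java
import java.util.Collections;

// Builds the RVC paging predicate a client must construct today: the caller
// has to know whether the PK columns are all ASC or all DESC to choose the
// comparison operator. The proposed OFFSET (?,?) clause removes this burden.
public class PagingQueryBuilder {
    static String pagingPredicate(String[] pkCols, boolean descending) {
        String cols = String.join(", ", pkCols);
        String marks = String.join(", ", Collections.nCopies(pkCols.length, "?"));
        // (A, B) > (?, ?) for ascending keys, (A, B) < (?, ?) for descending.
        String op = descending ? "<" : ">";
        return "(" + cols + ") " + op + " (" + marks + ")";
    }

    public static void main(String[] args) {
        String[] pk = {"A", "B"};
        System.out.println(pagingPredicate(pk, false)); // (A, B) > (?, ?)
        System.out.println(pagingPredicate(pk, true));  // (A, B) < (?, ?)
    }
}
```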
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869972#comment-16869972 ] Daniel Wong commented on PHOENIX-4845: -- Initial mostly working outline of approach 1: [https://github.com/dbwong/phoenix/pull/2/files] > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. 
> > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869950#comment-16869950 ] Daniel Wong commented on PHOENIX-4845: -- [~lhofhansl] I don't quite follow the performance concern; I assume it is due to optimization time, as the where optimizer generates the proper scan bounds based on the new additional predicates, which is linear in the number of PKs? This predicate should be entirely extractable into the PK, so there should not be any issue server side. After attempting approach 2, for which I have a general outline (PR to follow shortly), I can say that approach 1 is definitely easier in the short term and less likely to miss an issue. For the serial iterator I was simply planning to enforce that the select is row-key ordered; this solves any worry about needing to serialize the parallel pieces in any new way. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) 
ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5360) Cleanup anonymous inner classes in WhereOptimizer
[ https://issues.apache.org/jira/browse/PHOENIX-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868102#comment-16868102 ] Daniel Wong commented on PHOENIX-5360: -- One advantage of static inner classes (say, package-private) is that they are unit testable; are we planning on testing these? > Cleanup anonymous inner classes in WhereOptimizer > - > > Key: PHOENIX-5360 > URL: https://issues.apache.org/jira/browse/PHOENIX-5360 > Project: Phoenix > Issue Type: Sub-task >Reporter: Xinyi Yan >Assignee: Xinyi Yan >Priority: Minor > Attachments: PHOENIX-5360.patch > > > Followed [~gjacoby] in > [PHOENIX-5024|https://issues.apache.org/jira/browse/PHOENIX-5024]. The Convert > Anonymous to Inner refactor in IntelliJ is a nice feature that converts an > anonymous class to an inner class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
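The refactor under discussion, in miniature. The class and comparator names below are invented for illustration (they are not from WhereOptimizer); the point is only that the named nested class can be instantiated and exercised directly from a unit test, while the anonymous one cannot be referenced by name:

```java
import java.util.Comparator;

public class RefactorExample {
    // Before: an anonymous inner class. A test cannot name this type, so its
    // logic can only be exercised indirectly through whatever holds it.
    static final Comparator<String> BY_LENGTH_ANON = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // After: a package-private static nested class. A test in the same
    // package can instantiate ByLength and call compare() directly.
    static class ByLength implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    }

    public static void main(String[] args) {
        System.out.println(new ByLength().compare("ab", "abc") < 0); // true
    }
}
```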
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868098#comment-16868098 ] Daniel Wong commented on PHOENIX-4845: -- [~Bin Shi] might have some thoughts on this too. > Support using Row Value Constructors in OFFSET clause for paging in tables > where the sort order of PK columns varies > > > Key: PHOENIX-4845 > URL: https://issues.apache.org/jira/browse/PHOENIX-4845 > Project: Phoenix > Issue Type: New Feature >Reporter: Thomas D'Silva >Assignee: Daniel Wong >Priority: Major > Labels: DESC, SFDC > Attachments: PHOENIX-offset.txt > > > RVCs along with the LIMIT clause are useful for efficiently paging through > rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk > columns are sorted ascending, we can always use the > operator to query for > the next batch of row. > However if the PK of a table is (A DESC, B DESC) we cannot use the following > query to page through the data > {code:java} > SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > Since the rows are sorted by A desc and then by B descending we need change > the comparison order > {code:java} > SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20 > {code} > If the PK of a table contains columns with mixed sort order for eg (A DESC, > B) then we cannot use RVC to page through data. > If we supported using RVCs in the offset clause we could use the offset to > set the start row of the scan. Clients would not have to have logic to > determine the comparison operator. This would also support paging through > data for tables where the PK columns are sorted in mixed order. > {code:java} > SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?) > {code} > We would only allow using the offset if the rows are ordered by the sort > order of the PK columns. > > FYI [~jfernando_sfdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868090#comment-16868090 ] Daniel Wong edited comment on PHOENIX-4845 at 6/19/19 11:15 PM: So I have begun implementing this Jira and have done some initial scoping and approach thoughts. The following disallowed cases will not be supported:
{code:java}
String simpleDdl = "CREATE TABLE " + tableName
    + " (t_id VARCHAR NOT NULL,\n"
    + "k1 INTEGER NOT NULL,\n"
    + "k2 INTEGER NOT NULL,\n"
    + "C3.k3 INTEGER,\n"
    + "C2.v1 VARCHAR,\n"
    + "CONSTRAINT pk PRIMARY KEY (t_id, k1, k2)) ";

//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non-PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow different PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
    String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)", tableName, tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test Not allowed on subquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
    String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
    conn.createStatement().execute(sql);
    String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column reference does not work
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have allowed column in RVC Offset");
}
{code}
Approach possibilities: I have been considering two possibilities for approaching this in Phoenix and am open to other suggestions as well.

Given DDL/Query: CREATE TABLE TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTRAINT pk PRIMARY KEY (a, b DESC, c)); SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))

Approach 2: Mini-Resolution. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT 1
[jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
[ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868090#comment-16868090 ] Daniel Wong commented on PHOENIX-4845: -- So I have begun implementing this Jira and have done some initial scoping and approach thoughts. The following disallowed cases will not be supported:
{code:java}
//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow non-PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
    String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not allow different PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
    String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)", tableName, tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test Not allowed on subquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
    String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)", tableName);
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
    String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
    conn.createStatement().execute(sql);
    String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column reference does not work
    try {
        conn.createStatement().execute(failureSql);
    } catch (Exception e) {
        System.out.println(e.getMessage());
        return;
    }
    fail("Should not have allowed column in RVC Offset");
}
{code}
Approach possibilities: I have been considering two possibilities for approaching this in Phoenix and am open to other suggestions as well.

Given DDL/Query: CREATE TABLE TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTRAINT pk PRIMARY KEY (a, b DESC, c)); SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))

Approach 2: Mini-Resolution. SELECT * FROM TABLE OFFSET (1,2,3) -> SELECT 1 FROM TABLE WHERE (a,b,c)=(1,2,3) (pass the constructed tree to the WhereOptimizer, essentially) -> extract the byte array from the constant [0x01FD03] -> filter the scan start/end keys by the constant.

Right now I'm favoring approach 2 but this will require some additional hooks into the
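Approach 1's rewrite can be sketched mechanically. The generator below is illustrative only, with invented names; it is not the proposed Phoenix implementation. It expands an RVC offset into the nested OR/AND predicate shown in the comment, flipping comparators for DESC columns:

```java
// Sketch of the Approach 1 rewrite: OFFSET (v1, v2, v3) becomes nested
// comparisons, where DESC columns use the flipped operator and the last
// column's comparison is inclusive.
public class RvcOffsetRewrite {
    static String rewrite(String[] cols, boolean[] desc, String[] vals) {
        // Innermost term: the last column is inclusive (>= for ASC, <= for DESC).
        int last = cols.length - 1;
        String expr = cols[last] + (desc[last] ? " <= " : " >= ") + vals[last];
        // Wrap outward: col > v OR (col = v AND <inner>), flipping > to < for DESC.
        for (int i = last - 1; i >= 0; i--) {
            String cmp = cols[i] + (desc[i] ? " < " : " > ") + vals[i];
            expr = "(" + cmp + " OR (" + cols[i] + " = " + vals[i] + " AND " + expr + "))";
        }
        return expr;
    }

    public static void main(String[] args) {
        // PK (a, b DESC, c) with OFFSET (1, 2, 3), as in the comment above.
        String where = rewrite(new String[]{"A", "B", "C"},
                               new boolean[]{false, true, false},
                               new String[]{"1", "2", "3"});
        System.out.println(where);
        // (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND C >= 3))))
    }
}
```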
[jira] [Commented] (PHOENIX-5318) Slots passed to SkipScan filter is incorrect for desc primary keys that are prefixes of each other
[ https://issues.apache.org/jira/browse/PHOENIX-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865818#comment-16865818 ] Daniel Wong commented on PHOENIX-5318: -- It is probably good to do since we are still working on testing for correctness. > Slots passed to SkipScan filter is incorrect for desc primary keys that are > prefixes of each other > -- > > Key: PHOENIX-5318 > URL: https://issues.apache.org/jira/browse/PHOENIX-5318 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.15.0, 5.1.0, 4.14.3 >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva >Priority: Major > Labels: DESC > Fix For: 4.15.0, 5.1.0, 4.14.3 > > Attachments: PHOENIX-5318-4.x-HBase-1.3.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > {code} > CREATE VIEW IF NOT EXISTS CUSTOM_ENTITY."z01" (COL1 VARCHAR, COL2 VARCHAR, > COL3 VARCHAR, COL4 VARCHAR CONSTRAINT PK PRIMARY KEY (COL1 DESC, COL2 DESC, > COL3 DESC, COL4 DESC)) AS SELECT * FROM > CUSTOM_ENTITY.CUSTOM_ENTITY_DATA_NO_ID WHERE KEY_PREFIX = 'z01'; > > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('8', 'blah', > 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('6', 'blah', > 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('23', > 'blah', 'blah', 'blah'); > UPSERT INTO CUSTOM_ENTITY."z01" (COL1, COL2, COL3, COL4) VALUES ('17', > 'blah', 'blah', 'blah'); > > SELECT COL1, COL2, COL3, COL4 FROM CUSTOM_ENTITY."z01" WHERE COL4='blah' AND > (COL1='1' OR COL1='2' OR COL1='3' OR COL1='4' OR COL1='5' OR COL1='6' OR > COL1='8' OR COL1='17' OR COL1='12' OR COL1='23') AND COL3='blah' > > +---+---+---+---+ > | COL1 | COL2 | COL3 | COL4 | > +---+---+---+---+ > | 8 | blah | blah | blah | > | 6 | blah | blah | blah | > +---+---+---+---+ > > SELECT COL1, COL2, COL3, COL4 FROM CUSTOM_ENTITY."z01" WHERE COL4='blah' AND > (COL1='6'OR COL1='8' OR COL1='17' OR COL1='12' OR COL1='23') AND COL3='blah' > > +---+---+---+---+ > | 
COL1 | COL2 | COL3 | COL4 | > +---+---+---+---+ > | 8 | blah | blah | blah | > | 6 | blah | blah | blah | > | 23 | blah | blah | blah | > | 17 | blah | blah | blah | > +---+---+---+---+ > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
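A simplified model of why DESC-encoded values that are prefixes of each other ('2' vs '23' above) are tricky: Phoenix stores DESC columns with inverted bytes, so the variable-length separator is inverted too, and the usual "prefix sorts first" rule flips. The encoding below is a sketch for illustration, not Phoenix's exact scheme:

```java
import java.nio.charset.StandardCharsets;

// Toy model of ASC vs DESC variable-length key encoding: invert each byte for
// DESC and append a separator (0x00 for ASC, its inversion 0xFF for DESC).
// Under DESC encoding the prefix "2" sorts AFTER "23", which is what skip-scan
// slot generation must account for.
public class DescPrefixSketch {
    static byte[] encode(String s, boolean desc) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[b.length + 1];
        for (int i = 0; i < b.length; i++) {
            out[i] = desc ? (byte) ~b[i] : b[i];
        }
        out[b.length] = desc ? (byte) 0xFF : 0x00; // trailing separator
        return out;
    }

    // Unsigned lexicographic comparison, as HBase compares row keys.
    static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Ascending: "2" sorts before "23" (prefix first).
        System.out.println(compare(encode("2", false), encode("23", false)) < 0); // true
        // Descending: the order of the prefix pair flips.
        System.out.println(compare(encode("2", true), encode("23", true)) > 0);   // true
    }
}
```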
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852244#comment-16852244 ] Daniel Wong commented on PHOENIX-5231: -- I'm okay with reverting as I personally don't have time to look until next week. What is the long term way forward though? Services are the standard way to provide user pluggable functionality. Originally this was intended for possible memcached implementation which we believed was too specific to add directly to open source. [~lhofhansl] > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > 5231-services-fix.patch, PHOENIX-5231.4.x-HBase-1.3.patch, > PHOENIX-5231.4.x-HBase-1.3.v2.patch, PHOENIX-5231.4.x-HBase-1.3.v3.patch, > PHOENIX-5231.master.v3.patch, PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850207#comment-16850207 ] Daniel Wong commented on PHOENIX-5231: -- +1 to Vincent's fix. I'm guessing my version of mvn is really old on my personal laptop as I don't develop there much. I'll try to confirm tonight. > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > 5231-services-fix.patch, PHOENIX-5231.4.x-HBase-1.3.patch, > PHOENIX-5231.4.x-HBase-1.3.v2.patch, PHOENIX-5231.4.x-HBase-1.3.v3.patch, > PHOENIX-5231.master.v3.patch, PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850012#comment-16850012 ] Daniel Wong commented on PHOENIX-5231: -- I was trying to come up with a fix I think would be clearer, but I'm still traveling and don't have good internet access outside my phone. I think we can construct a better fix, as the one you proposed here removes the standard META-INF directory approach where one would normally expect the services to live. Let me see if I can spend a little time and construct something. > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
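For context on the META-INF/services mechanism being debugged in this thread, here is a minimal `java.util.ServiceLoader` sketch. The interface and implementation names are invented stand-ins for `GuidePostsCacheFactory` and its default; this is not Phoenix code:

```java
import java.util.ServiceLoader;

// Implementations are discovered via a provider-configuration file on the
// classpath, named after the service interface under META-INF/services/ and
// containing the implementation's fully qualified class name.
public class ServiceLoaderSketch {
    public interface CacheFactory {
        String name();
    }

    public static class DefaultCacheFactory implements CacheFactory {
        public String name() { return "default"; }
    }

    public static void main(String[] args) {
        // With the provider-configuration file present, this loop would find
        // each registered implementation. Shading or merging jars (as the
        // phoenix-client assembly does) must preserve that META-INF/services
        // entry, or discovery fails silently; that is the symptom being
        // debugged in the comments above.
        for (CacheFactory f : ServiceLoader.load(CacheFactory.class)) {
            System.out.println("found " + f.name());
        }
    }
}
```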
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848296#comment-16848296 ] Daniel Wong commented on PHOENIX-5231: -- That's strange; I wonder what changed. Can you share your mvn version so I can attempt to investigate later? > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-5231) Configurable Stats Cache
[ https://issues.apache.org/jira/browse/PHOENIX-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848163#comment-16848163 ] Daniel Wong commented on PHOENIX-5231: -- [~lhofhansl] I am not able to recreate your issue with SQLLine. Can you give me some more information? What is the output of jar tf on your client jar? This was my output following mvn package.
{code:java}
dev@OnePageGamesDevBox:~/dev/phoenix$ jar tf ./phoenix-client/target/phoenix-client-5.1.0-HBase-2.0-SNAPSHOT.jar | grep "GuidePostsCacheFactory"
org/apache/phoenix/query/ITGuidePostsCacheFactory.class
org/apache/phoenix/query/GuidePostsCacheFactory.class
org/apache/phoenix/query/DefaultGuidePostsCacheFactory.class
META-INF/services/org.apache.phoenix.query.GuidePostsCacheFactory
META-INF/services/META-INF/services/org.apache.phoenix.query.GuidePostsCacheFactory
{code}
After starting up the sandbox minicluster I was able to connect with SQLLine.
{code:java}
dev@OnePageGamesDevBox:~/dev/phoenix$ bin/sqlline.py localhost:56453
Setting property: [incremental, false]
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost:56453 none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost:56453
19/05/25 04:41:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 5.1)
Driver: PhoenixEmbeddedDriver (version 5.1)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
154/154 (100%) Done
Done
sqlline version 1.7.0
0: jdbc:phoenix:localhost:56453> select * from system.catalog limit 1;
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
| TENANT_ID | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | COLUMN_FAMILY | TABLE_SEQ_NUM | TABLE_TYPE | PK_NAME | COLUMN_COUNT |   |
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
|           | SYSTEM      | CATALOG    |             |               | 0             | s          | PK      | 64           |   |
+-----------+-------------+------------+-------------+---------------+---------------+------------+---------+--------------+---+
{code}
Which branch or Phoenix build were you having the issue with? > Configurable Stats Cache > > > Key: PHOENIX-5231 > URL: https://issues.apache.org/jira/browse/PHOENIX-5231 > Project: Phoenix > Issue Type: Test >Reporter: Daniel Wong >Assignee: Daniel Wong >Priority: Major > Fix For: 4.15.0, 5.1.0 > > Attachments: 5231-quickfix-v2.txt, 5231-quickfix.txt, > PHOENIX-5231.4.x-HBase-1.3.patch, PHOENIX-5231.4.x-HBase-1.3.v2.patch, > PHOENIX-5231.4.x-HBase-1.3.v3.patch, PHOENIX-5231.master.v3.patch, > PHOENIX-5231.master.v4.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Currently, the phoenix stats cache is per > ConnectionQuerySerivce/ConnectionProfile, which leads to duplicated cached > entry (the guideposts) and waste resources if these separate connections are > querying the same underlying table. It would be good to be able to provide a > configurable stats cache as control the cache level so it could be per JVM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)