[jira] [Commented] (PHOENIX-6897) Filters on unverified index rows return wrong result
[ https://issues.apache.org/jira/browse/PHOENIX-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737374#comment-17737374 ]

ASF GitHub Bot commented on PHOENIX-6897:
-----------------------------------------

tkhurana opened a new pull request, #1632:
URL: https://github.com/apache/phoenix/pull/1632

   …597)

   * PHOENIX-6897 Filters on unverified index rows return wrong result
   * Fixed checkstyle and missing license warnings
   * Addressed review comments

> Filters on unverified index rows return wrong result
> ----------------------------------------------------
>
>                 Key: PHOENIX-6897
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6897
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.1.2
>            Reporter: Yunbo Fan
>            Assignee: Tanuj Khurana
>            Priority: Major
>             Fix For: 5.2.0
>
> h4. Summary:
> An upsert includes three phases; if it fails after phase 1, unverified index
> rows are left behind in the index table. This causes wrong results for
> aggregate queries.
> h4. Steps to reproduce
> 1. Create the table and index
> {code}
> create table students(id integer primary key, name varchar, status integer);
> create index students_name_index on students(name, id) include (status);
> {code}
> 2. Upsert data using Phoenix
> {code}
> upsert into students values(1, 'tom', 1);
> upsert into students values(2, 'jerry', 2);
> {code}
> 3. Simulate phase 1 via the HBase shell: change the STATUS column value to 2
> and the verified column value to "\x02"
> {code}
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:0:STATUS', "\x80\x00\x00\x02"
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:_0', "\x02"
> {code}
> Note: the HBase shell can't parse a colon inside a column name such as
> '0:0:STATUS'; you may need to comment out a line in
> hbase/lib/ruby/hbase/table.rb, see
> https://issues.apache.org/jira/browse/HBASE-13788
> {code}
> # Returns family and (when has it) qualifier for a column name
> def parse_column_name(column)
>   split = org.apache.hadoop.hbase.KeyValue.parseColumn(column.to_java_bytes)
>   # comment this line out: set_converter(split) if split.length > 1
>   return split[0], (split.length > 1) ? split[1] : nil
> end
> {code}
> 4. Query without aggregation; the result is correct
> {code}
> 0: jdbc:phoenix:> select status from students where name = 'tom';
> +---------+
> | STATUS  |
> +---------+
> | 1       |
> +---------+
> {code}
> 5. Query with aggregation; the result is wrong
> {code}
> 0: jdbc:phoenix:> select count(*) from students where name = 'tom' and status = 1;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {code}
> 6. With the NO_INDEX hint the result is correct
> {code}
> 0: jdbc:phoenix:> select /*+ NO_INDEX */ count(*) from students where name = 'tom' and status = 1;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 1         |
> +-----------+
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
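The failure mode reproduced above (a filter evaluated against an unverified index cell instead of the authoritative data-table value) can be modeled with a small toy example. Everything below is illustrative only; the class, record, and method names are invented and are not Phoenix internals.

```java
import java.util.List;

public class UnverifiedRowSketch {
    // Illustrative model of an index row: the value stored in the index,
    // the verified flag from the empty column, and the authoritative
    // value in the data table.
    record IndexRow(int indexStatus, boolean verified, int dataStatus) {}

    // Buggy behavior: the filter status = 1 is evaluated on the raw index
    // cell even when the row is unverified.
    static long countFilterOnIndex(List<IndexRow> rows) {
        return rows.stream().filter(r -> r.indexStatus() == 1).count();
    }

    // Correct behavior: an unverified row is first resolved against the
    // data table, and only then is the filter applied.
    static long countWithVerification(List<IndexRow> rows) {
        return rows.stream()
                   .map(r -> r.verified() ? r.indexStatus() : r.dataStatus())
                   .filter(status -> status == 1)
                   .count();
    }

    public static void main(String[] args) {
        // 'tom': the data table says status = 1, but the failed upsert left
        // an unverified index cell with status = 2 (step 3 above).
        List<IndexRow> tomRows = List.of(new IndexRow(2, false, 1));
        System.out.println(countFilterOnIndex(tomRows));     // 0 (wrong)
        System.out.println(countWithVerification(tomRows));  // 1 (right)
    }
}
```

This mirrors why `count(*)` returned 0 through the index but 1 with the NO_INDEX hint.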
[GitHub] [phoenix] tkhurana opened a new pull request, #1632: PHOENIX-6897 Filters on unverified index rows return wrong result (#1…
tkhurana opened a new pull request, #1632:
URL: https://github.com/apache/phoenix/pull/1632

   …597)

   * PHOENIX-6897 Filters on unverified index rows return wrong result
   * Fixed checkstyle and missing license warnings
   * Addressed review comments

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Commented] (PHOENIX-6985) Setting server-side masking flag default to false
[ https://issues.apache.org/jira/browse/PHOENIX-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737346#comment-17737346 ]

ASF GitHub Bot commented on PHOENIX-6985:
-----------------------------------------

lokiore commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608244815

   > @lokiore I understand they are flappers after talking to @virajjasani @tkhurana. Is there a way to mark them as such?

   I am not sure we can do anything here other than @Ignore, which I assume we don't want. Maybe some plugin like Result Test Analyser can be added to Jenkins, or do we already have that, @virajjasani?

> Setting server-side masking flag default to false
> -------------------------------------------------
>
>                 Key: PHOENIX-6985
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6985
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lokesh Khurana
>            Assignee: Lokesh Khurana
>            Priority: Major
>
> Currently the PhoenixTTL feature is enabled by default, which adds the
> PhoenixTTLRegionObserver coproc to every table; set
> phoenix.ttl.server_side.masking.enabled to false by default.
[GitHub] [phoenix] lokiore commented on pull request #1629: PHOENIX-6985 :- Setting server-side masking flag default to false
lokiore commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608244815

   > @lokiore I understand they are flappers after talking to @virajjasani @tkhurana. Is there a way to mark them as such?

   I am not sure we can do anything here other than @Ignore, which I assume we don't want. Maybe some plugin like Result Test Analyser can be added to Jenkins, or do we already have that, @virajjasani?
[jira] [Comment Edited] (PHOENIX-6975) Introduce StaleMetadataCache Exception
[ https://issues.apache.org/jira/browse/PHOENIX-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737331#comment-17737331 ]

Rushabh Shah edited comment on PHOENIX-6975 at 6/26/23 8:46 PM:
----------------------------------------------------------------

Thinking about how to achieve this. For the executeQuery() method, how will the client retry on StaleMetadataCacheException? An application using Phoenix will use a query structure similar to the following:

{code:java}
String query = "SELECT * FROM " + tableNameStr;
// Execute query
try (ResultSet rs = conn.createStatement().executeQuery(query)) {
    while (rs.next()) {  // --> This will throw StaleMetadataCacheException
        // Read from ResultSet
    }
}
{code}

In the example above, the Phoenix client will receive StaleMetadataCacheException during rs.next() calls. How will the client handle StaleMetadataCacheException and retry? Should the Phoenix client throw StaleMetadataCacheException all the way back to the application, or re-create the PhoenixStatement and PhoenixResultSet and retry transparently? What happens if the Phoenix client encounters StaleMetadataCacheException on the 4th or 5th rs.next() call? When the client retries, should it skip the previous results and read from the 5th result, or retry again from the beginning?

+Option 1: Throw StaleMetadataCacheException back to the application.+
# Intercept StaleMetadataCacheException in PhoenixResultSet#next and invalidate the client-side cache.
# Throw StaleMetadataCacheException (a subclass of SQLException) back to the application. Currently [PhoenixResultSet#next|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixResultSet.java#L873] throws SQLException.
# Let the application decide whether to retry or not.

Pros:
# Simple to implement.
# The application stays in control of whether to retry. If it does retry, it can update its business logic to take appropriate actions on retry, like resetting some counters to avoid double counting.

Cons:
# The application will encounter SQLException more frequently during the schema upgrade process. If the application is NOT currently retrying on any SQLException, its failure rate will increase, and some changes will be required to handle StaleMetadataCacheException.

+Option 2: Handle the retry logic within the Phoenix client.+
# Intercept StaleMetadataCacheException in PhoenixResultSet#next and invalidate the client-side cache.
# We will need to reset state in PhoenixResultSet and PhoenixStatement, particularly the QueryPlan object.
# Run [PhoenixStatement#executeQuery|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java#L2151] to generate the QueryPlan again.

Pros:
# Fully transparent to the end application; it doesn't need to worry about StaleMetadataCacheException.

Cons:
# Trust issue: what happens if PhoenixResultSet#next fails while iterating the results? Let's say the application read 4 values and the 5th next() call failed with StaleMetadataCacheException. If the Phoenix client retries transparently, it will start iterating the PhoenixResultSet from the beginning again, re-processing the first 4 rows.
# The code will become too complex and very difficult to maintain. If we introduce new logic in the future, we will have to make sure it gets reset on StaleMetadataCacheException.

Thoughts? [~stoty] [~gjacoby] [~kadir] [~jisaac] [~tkhurana]
[jira] [Commented] (PHOENIX-6975) Introduce StaleMetadataCache Exception
[ https://issues.apache.org/jira/browse/PHOENIX-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737331#comment-17737331 ]

Rushabh Shah commented on PHOENIX-6975:
---------------------------------------

Thinking about how to achieve this. For the executeQuery() method, how will the client retry on StaleMetadataCacheException? An application using Phoenix will use a query structure similar to the following:

{code:java}
String query = "SELECT * FROM " + tableNameStr;
// Execute query
try (ResultSet rs = conn.createStatement().executeQuery(query)) {
    while (rs.next()) {  // --> This will throw StaleMetadataCacheException
        // Read from ResultSet
    }
}
{code}

In the example above, the Phoenix client will receive StaleMetadataCacheException during rs.next() calls. How will the client handle StaleMetadataCacheException and retry? Should the Phoenix client throw StaleMetadataCacheException all the way back to the application, or re-create the PhoenixStatement and PhoenixResultSet and retry transparently? What happens if the Phoenix client encounters StaleMetadataCacheException on the 4th or 5th rs.next() call? When the client retries, should it skip the previous results and read from the 5th result, or retry again from the beginning?

Option 1: Throw StaleMetadataCacheException back to the application.
# Intercept StaleMetadataCacheException in PhoenixResultSet#next and invalidate the client-side cache.
# Throw StaleMetadataCacheException (a subclass of SQLException) back to the application. Currently [PhoenixResultSet#next|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixResultSet.java#L873] throws SQLException.
# Let the application decide whether to retry or not.

Pros:
# Simple to implement.
# The application stays in control of whether to retry. If it does retry, it can update its business logic to take appropriate actions on retry, like resetting some counters to avoid double counting.

Cons:
# The application will encounter SQLException more frequently during the schema upgrade process. If the application is NOT currently retrying on any SQLException, its failure rate will increase, and some changes will be required to handle StaleMetadataCacheException.

Option 2: Handle the retry logic within the Phoenix client.
# Intercept StaleMetadataCacheException in PhoenixResultSet#next and invalidate the client-side cache.
# We will need to reset state in PhoenixResultSet and PhoenixStatement, particularly the QueryPlan object.
# Run [PhoenixStatement#executeQuery|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java#L2151] to generate the QueryPlan again.

Pros:
# Fully transparent to the end application; it doesn't need to worry about StaleMetadataCacheException.

Cons:
# Trust issue: what happens if PhoenixResultSet#next fails while iterating the results? Let's say the application read 4 values and the 5th next() call failed with StaleMetadataCacheException. If the Phoenix client retries transparently, it will start iterating the PhoenixResultSet from the beginning again, re-processing the first 4 rows.
# The code will become too complex and very difficult to maintain. If we introduce new logic in the future, we will have to make sure it gets reset on StaleMetadataCacheException.

Thoughts? [~stoty] [~gjacoby] [~kadir] [~jisaac] [~tkhurana]

> Introduce StaleMetadataCache Exception
> --------------------------------------
>
>                 Key: PHOENIX-6975
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6975
>             Project: Phoenix
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>
> Introduce a StaleMetadataCacheException if the client-provided last DDL
> timestamp is less than the server-side last DDL timestamp, and allow the
> client to retry the statement.
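The application-level retry described in Option 1 could be sketched as follows. This is illustrative only: StaleMetadataCacheException is stubbed here (the real one is only proposed in this JIRA), and runWithRetry is a hypothetical helper, not a Phoenix API. The simulated query fails mid-scan once and the retry re-reads from the beginning, which is the "trust issue" noted under Option 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class RetrySketch {
    // Stub for the proposed exception; in Phoenix it would subclass SQLException.
    static class StaleMetadataCacheException extends Exception {}

    // Hypothetical helper: re-run the whole query on each attempt after
    // invalidating the client-side metadata cache.
    static <T> T runWithRetry(int maxAttempts, Callable<T> query) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return query.call();
            } catch (StaleMetadataCacheException e) {
                if (attempt >= maxAttempts) throw e;
                // invalidate the client-side metadata cache here, then retry
            }
        }
    }

    // Simulated query: the first attempt fails on the 4th "row" with a
    // stale-cache error; the retry reads all five rows from the beginning.
    static List<Integer> demo() throws Exception {
        int[] failuresLeft = {1};
        return runWithRetry(3, () -> {
            List<Integer> rows = new ArrayList<>();
            for (int i = 1; i <= 5; i++) {
                if (failuresLeft[0] > 0 && i == 4) {
                    failuresLeft[0]--;
                    throw new StaleMetadataCacheException();
                }
                rows.add(i);
            }
            return rows;
        });
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());  // [1, 2, 3, 4, 5]
    }
}
```

Note that the caller sees the first three rows twice across the two attempts; under Option 1 the application owns this trade-off, under Option 2 the client would hide it.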
[jira] [Commented] (PHOENIX-6985) Setting server-side masking flag default to false
[ https://issues.apache.org/jira/browse/PHOENIX-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737330#comment-17737330 ]

ASF GitHub Bot commented on PHOENIX-6985:
-----------------------------------------

jpisaac commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608174524

   @lokiore I understand they are flappers after talking to @virajjasani @tkhurana. Is there a way to mark them as such?

> Setting server-side masking flag default to false
> -------------------------------------------------
>
>                 Key: PHOENIX-6985
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6985
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lokesh Khurana
>            Assignee: Lokesh Khurana
>            Priority: Major
>
> Currently the PhoenixTTL feature is enabled by default, which adds the
> PhoenixTTLRegionObserver coproc to every table; set
> phoenix.ttl.server_side.masking.enabled to false by default.
[GitHub] [phoenix] jpisaac commented on pull request #1629: PHOENIX-6985 :- Setting server-side masking flag default to false
jpisaac commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608174524

   @lokiore I understand they are flappers after talking to @virajjasani @tkhurana. Is there a way to mark them as such?
[jira] [Comment Edited] (PHOENIX-6983) Add hint to disable server merges for uncovered index queries
[ https://issues.apache.org/jira/browse/PHOENIX-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737302#comment-17737302 ]

Lars Hofhansl edited comment on PHOENIX-6983 at 6/26/23 5:50 PM:
----------------------------------------------------------------

Isn't this the same as hinting not to use the index in question, or hinting a full scan? [~stoty]

> Add hint to disable server merges for uncovered index queries
> -------------------------------------------------------------
>
>                 Key: PHOENIX-6983
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6983
>             Project: Phoenix
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 5.2.0, 5.1.3
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Major
>             Fix For: 5.2.0, 5.1.4
>
> In certain cases, the new server merge code is far less efficient than the
> old skip-scan-merge code path.
> Specifically, when a large number of rows is matched on the index table,
> each of those rows has to be resolved from the data table, and the filtering
> must be done on the index region server.
> With the old code path, these filters were pushed to the data table and
> processed in parallel, with much better performance.
[jira] [Commented] (PHOENIX-6983) Add hint to disable server merges for uncovered index queries
[ https://issues.apache.org/jira/browse/PHOENIX-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737302#comment-17737302 ]

Lars Hofhansl commented on PHOENIX-6983:
----------------------------------------

Isn't this the same as hinting not to use the index in question, or hinting a full scan?

> Add hint to disable server merges for uncovered index queries
> -------------------------------------------------------------
>
>                 Key: PHOENIX-6983
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6983
>             Project: Phoenix
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 5.2.0, 5.1.3
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Major
>             Fix For: 5.2.0, 5.1.4
>
> In certain cases, the new server merge code is far less efficient than the
> old skip-scan-merge code path.
> Specifically, when a large number of rows is matched on the index table,
> each of those rows has to be resolved from the data table, and the filtering
> must be done on the index region server.
> With the old code path, these filters were pushed to the data table and
> processed in parallel, with much better performance.
[GitHub] [phoenix-omid] richardantal closed pull request #135: OMID-244 Upgrade SnakeYaml version to 2.0
richardantal closed pull request #135: OMID-244 Upgrade SnakeYaml version to 2.0
URL: https://github.com/apache/phoenix-omid/pull/135
[GitHub] [phoenix-omid] richardantal opened a new pull request, #136: OMID-244 Upgrade SnakeYaml version to 2.0
richardantal opened a new pull request, #136:
URL: https://github.com/apache/phoenix-omid/pull/136

   (no comment)