[jira] [Commented] (PHOENIX-6897) Filters on unverified index rows return wrong result

2023-06-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737374#comment-17737374
 ] 

ASF GitHub Bot commented on PHOENIX-6897:
-

tkhurana opened a new pull request, #1632:
URL: https://github.com/apache/phoenix/pull/1632

   …597)
   
   * PHOENIX-6897 Filters on unverified index rows return wrong result
   
   * Fixed checkstyle and missing license warnings
   
   * Addressed review comments
   
   -




> Filters on unverified index rows return wrong result
> 
>
> Key: PHOENIX-6897
> URL: https://issues.apache.org/jira/browse/PHOENIX-6897
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.1.2
>Reporter: Yunbo Fan
>Assignee: Tanuj Khurana
>Priority: Major
> Fix For: 5.2.0
>
>
> h4. Summary:
> An upsert goes through three phases; if it fails after phase 1, unverified index 
> rows are left in the index table. This causes wrong results for aggregate 
> queries.
> h4. Steps to reproduce
> 1. Create the table and the index
> {code}
> create table students(id integer primary key, name varchar, status integer);
> create index students_name_index on students(name, id) include (status);
> {code}
> 2. Upsert data using Phoenix
> {code}
> upsert into students values(1, 'tom', 1);
> upsert into students values(2, 'jerry', 2);
> {code}
> 3. Simulate phase 1 via the HBase shell: change the STATUS column value to 2 and 
> set the verified column (_0) to \x02, leaving the index row unverified
> {code}
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:0:STATUS', 
> "\x80\x00\x00\x02"
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:_0', "\x02"
> {code}
> Note: the HBase shell cannot parse a column name containing a second colon, such as 
> '0:0:STATUS'; you may need to comment out the following line in 
> hbase/lib/ruby/hbase/table.rb, see 
> https://issues.apache.org/jira/browse/HBASE-13788
> {code}
> # Returns family and (when it has one) qualifier for a column name
> def parse_column_name(column)
>   split = org.apache.hadoop.hbase.KeyValue.parseColumn(column.to_java_bytes)
>   # set_converter(split) if split.length > 1   # <- comment this line out
>   return split[0], (split.length > 1) ? split[1] : nil
> end
> {code}
> 4. Run a query without aggregation; the result is correct
> {code}
> 0: jdbc:phoenix:> select status from students where name = 'tom';
> +---------+
> | STATUS  |
> +---------+
> | 1       |
> +---------+
> {code}
> 5. Run an aggregate query; the result is wrong
> {code}
> 0: jdbc:phoenix:> select count(*) from students where name = 'tom' and status = 1;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {code}
> 6. With the NO_INDEX hint, the correct result is returned
> {code}
> 0: jdbc:phoenix:> select /*+ NO_INDEX */ count(*) from students where name = 'tom' and status = 1;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 1         |
> +-----------+
> {code}
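>
> For completeness, the discrepancy between steps 5 and 6 can also be checked from code. 
> This is a minimal JDBC sketch, assuming a connection URL of jdbc:phoenix:localhost; the 
> class and helper names are illustrative and not part of the original report:
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.sql.Statement;
>
> public class UnverifiedIndexRowCheck {
>     public static void main(String[] args) throws Exception {
>         try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
>              Statement stmt = conn.createStatement()) {
>             // Aggregate query served from the index; returns 0 while the bug is present.
>             long viaIndex = count(stmt,
>                 "select count(*) from students where name = 'tom' and status = 1");
>             // Same query forced onto the data table; returns the correct count of 1.
>             long viaDataTable = count(stmt,
>                 "select /*+ NO_INDEX */ count(*) from students where name = 'tom' and status = 1");
>             System.out.println("via index: " + viaIndex + ", via data table: " + viaDataTable);
>         }
>     }
>
>     private static long count(Statement stmt, String sql) throws Exception {
>         try (ResultSet rs = stmt.executeQuery(sql)) {
>             rs.next();
>             return rs.getLong(1);
>         }
>     }
> }
> {code}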





[GitHub] [phoenix] tkhurana opened a new pull request, #1632: PHOENIX-6897 Filters on unverified index rows return wrong result (#1…

2023-06-26 Thread via GitHub


tkhurana opened a new pull request, #1632:
URL: https://github.com/apache/phoenix/pull/1632

   …597)
   
   * PHOENIX-6897 Filters on unverified index rows return wrong result
   
   * Fixed checkstyle and missing license warnings
   
   * Addressed review comments
   
   -





[jira] [Commented] (PHOENIX-6985) Setting server-side masking flag default to false

2023-06-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737346#comment-17737346
 ] 

ASF GitHub Bot commented on PHOENIX-6985:
-

lokiore commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608244815

   > @lokiore I understand they are flappers after talking to @virajjasani 
@tkhurana. Is there a way to mark them as such?
   
   I am not sure we can do anything here other than Ignore, which I assume 
we don't want. Maybe some plugin like Result Test Analyser could be added to 
Jenkins, or do we already have that, @virajjasani?




> Setting server-side masking flag default to false 
> --
>
> Key: PHOENIX-6985
> URL: https://issues.apache.org/jira/browse/PHOENIX-6985
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Lokesh Khurana
>Assignee: Lokesh Khurana
>Priority: Major
>
> Currently the PhoenixTTL feature is enabled by default, which adds the 
> PhoenixTTLRegionObserver coproc to every table; set the default of 
> phoenix.ttl.server_side.masking.enabled to false.
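>
> For reference, a server-side component typically reads such a flag from the HBase 
> configuration. The following is a minimal sketch only; the property key comes from 
> this issue, while the constant name and surrounding code are illustrative and not 
> the actual Phoenix implementation:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
>
> public class MaskingFlagDefaultSketch {
>     // Property key named in this issue; the constant itself is illustrative.
>     static final String TTL_MASKING_ENABLED = "phoenix.ttl.server_side.masking.enabled";
>
>     public static void main(String[] args) {
>         Configuration conf = HBaseConfiguration.create();
>         // With the proposed change, the flag defaults to false unless it is
>         // explicitly enabled in hbase-site.xml.
>         boolean maskingEnabled = conf.getBoolean(TTL_MASKING_ENABLED, false);
>         System.out.println("Server-side TTL masking enabled: " + maskingEnabled);
>     }
> }
> {code}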





[GitHub] [phoenix] lokiore commented on pull request #1629: PHOENIX-6985 :- Setting server-side masking flag default to false

2023-06-26 Thread via GitHub


lokiore commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608244815

   > @lokiore I understand they are flappers after talking to @virajjasani 
@tkhurana. Is there a way to mark them as such?
   
   I am not sure we can do anything here other than Ignore, which I assume 
we don't want. Maybe some plugin like Result Test Analyser could be added to 
Jenkins, or do we already have that, @virajjasani?





[jira] [Comment Edited] (PHOENIX-6975) Introduce StaleMetadataCache Exception

2023-06-26 Thread Rushabh Shah (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737331#comment-17737331
 ] 

Rushabh Shah edited comment on PHOENIX-6975 at 6/26/23 8:46 PM:


Thinking about how to achieve this.

For the executeQuery() method, how will the client retry on 
StaleMetadataCacheException?
An application using Phoenix will use a query structure similar to the following to 
query the Phoenix server.

 
{code:java}
String query = "SELECT * FROM " + tableNameStr;
// Execute query
try (ResultSet rs = conn.createStatement().executeQuery(query)) {
  while (rs.next()) {   // <-- this call will throw StaleMetadataCacheException
   // Read from ResultSet
  }
}
{code}
 

In the above example, the Phoenix client will receive StaleMetadataCacheException 
while making rs.next() calls.
How will the client handle StaleMetadataCacheException and retry?
Should the Phoenix client throw StaleMetadataCacheException all the way back to the 
application, or re-create the PhoenixStatement and PhoenixResultSet and retry 
transparently?
What happens if the Phoenix client encounters StaleMetadataCacheException on the 
4th or 5th rs.next() call? When the client retries, should it skip the results it 
has already returned and resume where it failed, or retry from the beginning?

+Option 1: Throw StaleMetadataCacheException back to the application.+
 # Intercept StaleMetadataCacheException in PhoenixResultSet#next and 
invalidate the client-side cache.
 # Throw StaleMetadataCacheException (a subclass of SQLException) back to the 
application. Currently 
[PhoenixResultSet#next|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixResultSet.java#L873]
 throws SQLException.
 # Let the application decide whether to retry or not.

Pros:
 # Simple to implement.
 # The application is in control of whether to retry or not. If the application is 
going to retry, it can update its business logic to take appropriate 
actions on retry, like resetting some counters to avoid double counting, etc.

Cons:
 # The application will encounter SQLException more frequently during the schema 
upgrade process. If the application is currently NOT retrying on SQLException, 
its failure rate will increase; it will need some changes to handle 
StaleMetadataCacheException (a sketch of this option follows below).
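
To make option 1 concrete, here is a rough sketch of the application-side retry loop. 
StaleMetadataCacheException is the exception proposed in this issue; the method name, 
retry limit, and keying off plain SQLException are illustrative:
{code:java}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Option1RetrySketch {
    // Illustrative application-side retry: re-run the whole query when the proposed
    // StaleMetadataCacheException (a subclass of SQLException) surfaces from rs.next().
    static void readWithRetry(Connection conn, String query, int maxAttempts) throws SQLException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    // Read from ResultSet. On a retry the application must be prepared to
                    // re-process rows it has already seen, e.g. by resetting its counters.
                }
                return; // completed without a stale-cache error
            } catch (SQLException e) {
                // In real code the application would check specifically for
                // StaleMetadataCacheException once it exists; the Phoenix client is
                // assumed to have invalidated its metadata cache before rethrowing.
                if (attempt == maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
{code}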

+Option 2: Handle the retry logic within the Phoenix client.+
 # Intercept StaleMetadataCacheException in PhoenixResultSet#next and 
invalidate the client-side cache.
 # We will need to reset state in PhoenixResultSet and PhoenixStatement, 
particularly the QueryPlan object.
 # Run 
[PhoenixStatement#executeQuery|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java#L2151]
 to generate the QueryPlan again.

Pros:
 # Completely transparent to the end application; it doesn't need to worry about 
StaleMetadataCacheException.
 # No other pros; refer to point #1.

Cons:
 # Trust issue: what happens if PhoenixResultSet#next fails while iterating 
the results? Say the application has read 4 values and the 5th next() call fails 
with StaleMetadataCacheException. If the Phoenix client retries transparently, it 
will start iterating the PhoenixResultSet from the beginning again, re-processing 
the first 4 rows (see the sketch below).
 # The code will become too complex and very difficult to maintain. Any new logic 
we introduce in the future will have to be reset correctly on 
StaleMetadataCacheException, which is cumbersome to maintain.
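
To illustrate the trust issue above, here is a rough sketch of what a transparent retry 
inside next() would have to do. This is illustrative only, not the actual 
PhoenixResultSet/PhoenixStatement internals; the class and method names are made up:
{code:java}
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of option 2: a result set that transparently re-executes the query when the
// proposed StaleMetadataCacheException is hit mid-iteration.
public class RetryingResultSetSketch {
    private ResultSet delegate;      // current underlying result set
    private long rowsReturned = 0;   // rows already handed to the application

    public boolean next() throws SQLException {
        try {
            boolean hasNext = delegate.next();
            if (hasNext) {
                rowsReturned++;
            }
            return hasNext;
        } catch (SQLException stale) {
            // Assume this is the proposed StaleMetadataCacheException.
            // 1. Invalidate the client-side metadata cache.
            // 2. Regenerate the QueryPlan and re-execute, producing a fresh delegate.
            delegate = reExecuteQuery();
            // 3. The fresh scan starts from the first row again: the client must either
            //    re-return the first 'rowsReturned' rows (double counting) or silently
            //    skip them, both of which are problematic for the application.
            return delegate.next();
        }
    }

    private ResultSet reExecuteQuery() throws SQLException {
        // Placeholder for re-running PhoenixStatement#executeQuery with a new QueryPlan.
        throw new UnsupportedOperationException("illustrative sketch only");
    }
}
{code}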

 

Thoughts?  [~stoty]  [~gjacoby] [~kadir]  [~jisaac] [~tkhurana] 



[jira] [Commented] (PHOENIX-6975) Introduce StaleMetadataCache Exception

2023-06-26 Thread Rushabh Shah (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737331#comment-17737331
 ] 

Rushabh Shah commented on PHOENIX-6975:
---

Thinking about how to achieve this.

For the executeQuery() method, how will the client retry on 
StaleMetadataCacheException?
An application using Phoenix will use a query structure similar to the following to 
query the Phoenix server.

 
{code:java}
String query = "SELECT * FROM " + tableNameStr;
// Execute query
try (ResultSet rs = conn.createStatement().executeQuery(query)) {
  while (rs.next()) {   // <-- this call will throw StaleMetadataCacheException
   // Read from ResultSet
  }
}
{code}
 

In the above example, the Phoenix client will receive StaleMetadataCacheException 
while making rs.next() calls.
How will the client handle StaleMetadataCacheException and retry?
Should the Phoenix client throw StaleMetadataCacheException all the way back to the 
application, or re-create the PhoenixStatement and PhoenixResultSet and retry 
transparently?
What happens if the Phoenix client encounters StaleMetadataCacheException on the 
4th or 5th rs.next() call? When the client retries, should it skip the results it 
has already returned and resume where it failed, or retry from the beginning?

Option 1: Throw StaleMetadataCacheException back to the application.
 # Intercept StaleMetadataCacheException in PhoenixResultSet#next and 
invalidate the client-side cache.
 # Throw StaleMetadataCacheException (a subclass of SQLException) back to the 
application. Currently 
[PhoenixResultSet#next|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixResultSet.java#L873]
 throws SQLException.
 # Let the application decide whether to retry or not.

Pros:
 # Simple to implement.
 # The application is in control of whether to retry or not. If the application is 
going to retry, it can update its business logic to take appropriate 
actions on retry, like resetting some counters to avoid double counting, etc.

Cons:
 # The application will encounter SQLException more frequently during the schema 
upgrade process. If the application is currently NOT retrying on SQLException, 
its failure rate will increase; it will need some changes to handle 
StaleMetadataCacheException.


Option 2: Handle the retry logic within the Phoenix client.
 # Intercept StaleMetadataCacheException in PhoenixResultSet#next and 
invalidate the client-side cache.
 # We will need to reset state in PhoenixResultSet and PhoenixStatement, 
particularly the QueryPlan object.
 # Run 
[PhoenixStatement#executeQuery|https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java#L2151]
 to generate the QueryPlan again.

Pros:
 # Completely transparent to the end application; it doesn't need to worry about 
StaleMetadataCacheException.
 # No other pros; refer to point #1.

Cons:
 # Trust issue: what happens if PhoenixResultSet#next fails while iterating 
the results? Say the application has read 4 values and the 5th next() call fails 
with StaleMetadataCacheException. If the Phoenix client retries transparently, it 
will start iterating the PhoenixResultSet from the beginning again, re-processing 
the first 4 rows.
 # The code will become too complex and very difficult to maintain. Any new logic 
we introduce in the future will have to be reset correctly on 
StaleMetadataCacheException, which is cumbersome to maintain.

 

Thoughts? [~stoty]  [~gjacoby] [~kadir]  [~jisaac] [~tkhurana] 

> Introduce StaleMetadataCache Exception
> --
>
> Key: PHOENIX-6975
> URL: https://issues.apache.org/jira/browse/PHOENIX-6975
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: core
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
>
> Introduce a StaleMetadataCacheException, to be thrown when the client-provided 
> last ddl timestamp is less than the server-side last ddl timestamp, and allow 
> the client to retry the statement.
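>
> As a rough illustration of the intended check (a sketch only; the class and method 
> names are illustrative, not the actual Phoenix implementation):
> {code:java}
> import java.sql.SQLException;
>
> // Proposed in this issue: an SQLException subclass the client can key its retries on.
> class StaleMetadataCacheException extends SQLException {
>     StaleMetadataCacheException(String message) {
>         super(message);
>     }
> }
>
> class StaleMetadataCheckSketch {
>     // Illustrative server-side check: reject a request whose cached metadata is older
>     // than the server's last DDL timestamp, so the client can refresh and retry.
>     static void validate(long clientLastDdlTimestamp, long serverLastDdlTimestamp)
>             throws StaleMetadataCacheException {
>         if (clientLastDdlTimestamp < serverLastDdlTimestamp) {
>             throw new StaleMetadataCacheException("Client metadata cache is stale: "
>                 + clientLastDdlTimestamp + " < " + serverLastDdlTimestamp);
>         }
>     }
> }
> {code}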





[jira] [Commented] (PHOENIX-6985) Setting server-side masking flag default to false

2023-06-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737330#comment-17737330
 ] 

ASF GitHub Bot commented on PHOENIX-6985:
-

jpisaac commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608174524

   @lokiore I understand they are flappers after talking to @virajjasani 
@tkhurana. Is there a way to mark them as such?




> Setting server-side masking flag default to false 
> --
>
> Key: PHOENIX-6985
> URL: https://issues.apache.org/jira/browse/PHOENIX-6985
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Lokesh Khurana
>Assignee: Lokesh Khurana
>Priority: Major
>
> Currently the PhoenixTTL feature is enabled by default, which adds the 
> PhoenixTTLRegionObserver coproc to every table; set the default of 
> phoenix.ttl.server_side.masking.enabled to false.





[GitHub] [phoenix] jpisaac commented on pull request #1629: PHOENIX-6985 :- Setting server-side masking flag default to false

2023-06-26 Thread via GitHub


jpisaac commented on PR #1629:
URL: https://github.com/apache/phoenix/pull/1629#issuecomment-1608174524

   @lokiore I understand they are flappers after talking to @virajjasani 
@tkhurana. Is there a way to mark them as such?





[jira] [Comment Edited] (PHOENIX-6983) Add hint to disable server merges for uncovered index queries

2023-06-26 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737302#comment-17737302
 ] 

Lars Hofhansl edited comment on PHOENIX-6983 at 6/26/23 5:50 PM:
-

Isn't this the same as hinting not to use the index in question, or hinting a 
full scan?

[~stoty] 


was (Author: lhofhansl):
Isn't this the same as hinting not to use the index in question, or hinting a 
full scan?

> Add hint to disable server merges for uncovered index queries
> -
>
> Key: PHOENIX-6983
> URL: https://issues.apache.org/jira/browse/PHOENIX-6983
> Project: Phoenix
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0, 5.1.4
>
>
> In certain cases, the new server merge code is far less efficient than the 
> old skip-scan-merge code path.
> Specifically, when a large number of rows is matched on the index table, each 
> of those rows has to be resolved from the data table, and the filtering must 
> be done on the index region server.
> With the old code path, these filters were pushed to the data table and 
> processed in parallel, with much better performance.





[jira] [Commented] (PHOENIX-6983) Add hint to disable server merges for uncovered index queries

2023-06-26 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737302#comment-17737302
 ] 

Lars Hofhansl commented on PHOENIX-6983:


Isn't this the same as hinting not to use the index in question, or hinting a 
full scan?

> Add hint to disable server merges for uncovered index queries
> -
>
> Key: PHOENIX-6983
> URL: https://issues.apache.org/jira/browse/PHOENIX-6983
> Project: Phoenix
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0, 5.1.4
>
>
> In certain cases, the new server merge code is far less efficient than the 
> old skip-scan-merge code path.
> Specifically, when a large number of rows is matched on the index table, each 
> of those rows has to be resolved from the data table, and the filtering must 
> be done on the index region server.
> With the old code path, these filters were pushed to the data table and 
> processed in parallel, with much better performance.





[GitHub] [phoenix-omid] richardantal closed pull request #135: OMID-244 Upgrade SnakeYaml version to 2.0

2023-06-26 Thread via GitHub


richardantal closed pull request #135: OMID-244 Upgrade SnakeYaml version to 2.0
URL: https://github.com/apache/phoenix-omid/pull/135





[GitHub] [phoenix-omid] richardantal opened a new pull request, #136: OMID-244 Upgrade SnakeYaml version to 2.0

2023-06-26 Thread via GitHub


richardantal opened a new pull request, #136:
URL: https://github.com/apache/phoenix-omid/pull/136

   (no comment)

