[jira] [Comment Edited] (PHOENIX-7309) Support specifying splits.txt file while creating a table.

2024-06-03 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851823#comment-17851823
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-7309 at 6/3/24 9:37 PM:
--

bq. While Phoenix can do something similar to what HBase shell does, I believe 
rather than Phoenix having to read the whole split file (with potentially 10k 
or 50k worth of split keys) and create split keys array, it would be great if 
HBase can provide public Admin API with input as split file. 

HBase is unlikely to agree to have the master read some file from a user 
supplied (untrusted) location. My initial reaction is to veto it as 
unnecessary, because the functionality already exists in the API, and changes 
to public APIs involve some nontrivial work to avoid compatibility issues. 
Also, accepting user supplied input from the filesystem means file level attack 
risks become possible. I might be convinced if there is simply no other 
way to accomplish what you need, but that doesn't seem to be the case. Other 
committers may have similar reactions. Or not. File a JIRA if you'd like to 
pursue it, but consider that my comment over there would be the same.

bq. While Phoenix can do something similar to what HBase shell does

Yes, you can see what is implemented in JRuby for the shell. Simply reimplement 
it in Java for your purposes and modify it to fit your requirements.
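
For illustration, a minimal sketch of that reimplementation, assuming the split 
keys appear one per line and are parsed with Bytes.toBytesBinary (roughly what 
the shell does for SPLITS_FILE); the file name, table name, and column family 
are placeholders:

{noformat}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateTableFromSplitsFile {
  public static void main(String[] args) throws Exception {
    // Read one split key per line, skipping blank lines, and parse each key
    // with Bytes.toBytesBinary.
    byte[][] splits = Files.readAllLines(Paths.get("splits.txt"), StandardCharsets.UTF_8)
        .stream()
        .filter(line -> !line.trim().isEmpty())
        .map(Bytes::toBytesBinary)
        .toArray(byte[][]::new);

    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      // Hand the pre-split keys to the existing public Admin API.
      admin.createTable(
          TableDescriptorBuilder.newBuilder(TableName.valueOf("t1"))
              .setColumnFamily(ColumnFamilyDescriptorBuilder.of("f1"))
              .build(),
          splits);
    }
  }
}
{noformat}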


was (Author: apurtell):
bq. While Phoenix can do something similar to what HBase shell does, I believe 
rather than Phoenix having to read the whole split file (with potentially 10k 
or 50k worth of split keys) and create split keys array, it would be great if 
HBase can provide public Admin API with input as split file. 

HBase is unlikely to agree to have the master read some file from a user 
supplied (untrusted) location. My initial reaction is to veto it as 
unnecessary, because the functionality already exists in the API, and changes 
to public APIs involve some nontrivial work to avoid compatibility issues. 
Also, accepting user supplied input from the filesystem means attacker file 
level risks become possible.  

bq. While Phoenix can do something similar to what HBase shell does

Yes, you see what is implemented in jruby for the shell. Simply reimplement 
this in Java for your purposes, and modify for requirements.

> Support specifying splits.txt file while creating a table.
> --
>
> Key: PHOENIX-7309
> URL: https://issues.apache.org/jira/browse/PHOENIX-7309
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Rushabh Shah
>Priority: Major
>
> Currently the Phoenix grammar supports specifying split points while creating 
> a table.
> See grammar [here|https://phoenix.apache.org/language/index.html#create_table]
> {noformat}
> CREATE TABLE IF NOT EXISTS "my_case_sensitive_table"
> ( "id" char(10) not null primary key, "value" integer)
> DATA_BLOCK_ENCODING='NONE',VERSIONS=5,MAX_FILESIZE=200 split on (?, 
> ?, ?)
> {noformat}
> This works fine if you have a few split points (fewer than 10-20). 
> But if you want to specify 1,000 (or tens of thousands of) split points, this 
> API becomes very cumbersome to use.
> HBase provides an API to create a table with a split points text file.
> {noformat}
>   hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
> {noformat}
> We should also have support in Phoenix to provide split points in a text file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-7309) Support specifying splits.txt file while creating a table.

2024-06-03 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851823#comment-17851823
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-7309 at 6/3/24 9:35 PM:
--

bq. While Phoenix can do something similar to what HBase shell does, I believe 
rather than Phoenix having to read the whole split file (with potentially 10k 
or 50k worth of split keys) and create split keys array, it would be great if 
HBase can provide public Admin API with input as split file. 

HBase is unlikely to agree to have the master read some file from a user 
supplied (untrusted) location. My initial reaction is to veto it as 
unnecessary, because the functionality already exists in the API, and changes 
to public APIs involve some nontrivial work to avoid compatibility issues. 
Also, accepting user supplied input from the filesystem means attacker file 
level risks become possible.  

bq. While Phoenix can do something similar to what HBase shell does

Yes, you see what is implemented in jruby for the shell. Simply reimplement 
this in Java for your purposes, and modify for requirements.


was (Author: apurtell):
bq. While Phoenix can do something similar to what HBase shell does, I believe 
rather than Phoenix having to read the whole split file (with potentially 10k 
or 50k worth of split keys) and create split keys array, it would be great if 
HBase can provide public Admin API with input as split file. 

HBase is unlikely to agree to have the master read some file from a user 
supplied (untrusted) location. My initial reaction is to veto it as 
unnecessary, because the functionality already exists in the API, and changes 
to public APIs involve some nontrivial work to avoid compatibility issues. 
Also, accepting user supplied input from the filesystem means attacker file 
level risks become possible.  



> Support specifying splits.txt file while creating a table.
> --
>
> Key: PHOENIX-7309
> URL: https://issues.apache.org/jira/browse/PHOENIX-7309
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Rushabh Shah
>Priority: Major
>
> Currently the Phoenix grammar supports specifying split points while creating 
> a table.
> See grammar [here|https://phoenix.apache.org/language/index.html#create_table]
> {noformat}
> CREATE TABLE IF NOT EXISTS "my_case_sensitive_table"
> ( "id" char(10) not null primary key, "value" integer)
> DATA_BLOCK_ENCODING='NONE',VERSIONS=5,MAX_FILESIZE=200 split on (?, 
> ?, ?)
> {noformat}
> This works fine if you have a few split points (fewer than 10-20). 
> But if you want to specify 1,000 (or tens of thousands of) split points, this 
> API becomes very cumbersome to use.
> HBase provides an API to create a table with a split points text file.
> {noformat}
>   hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
> {noformat}
> We should also have support in Phoenix to provide split points in a text file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-7309) Support specifying splits.txt file while creating a table.

2024-06-03 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851823#comment-17851823
 ] 

Andrew Kyle Purtell commented on PHOENIX-7309:
--

bq. While Phoenix can do something similar to what HBase shell does, I believe 
rather than Phoenix having to read the whole split file (with potentially 10k 
or 50k worth of split keys) and create split keys array, it would be great if 
HBase can provide public Admin API with input as split file. 

HBase is unlikely to agree to have the master read some file from a user 
supplied (untrusted) location. My initial reaction is to veto it as 
unnecessary, because the functionality already exists in the API, and changes 
to public APIs involve some nontrivial work to avoid compatibility issues. 
Also, accepting user supplied input from the filesystem means attacker file 
level risks become possible.  



> Support specifying splits.txt file while creating a table.
> --
>
> Key: PHOENIX-7309
> URL: https://issues.apache.org/jira/browse/PHOENIX-7309
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Rushabh Shah
>Priority: Major
>
> Currently the Phoenix grammar supports specifying split points while creating 
> a table.
> See grammar [here|https://phoenix.apache.org/language/index.html#create_table]
> {noformat}
> CREATE TABLE IF NOT EXISTS "my_case_sensitive_table"
> ( "id" char(10) not null primary key, "value" integer)
> DATA_BLOCK_ENCODING='NONE',VERSIONS=5,MAX_FILESIZE=200 split on (?, 
> ?, ?)
> {noformat}
> This works fine if you have a few split points (fewer than 10-20). 
> But if you want to specify 1,000 (or tens of thousands of) split points, this 
> API becomes very cumbersome to use.
> HBase provides an API to create a table with a split points text file.
> {noformat}
>   hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
> {noformat}
> We should also have support in Phoenix to provide split points in a text file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6522) Unique Id generation support queryId

2023-05-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721958#comment-17721958
 ] 

Andrew Kyle Purtell commented on PHOENIX-6522:
--

My apologies [~sairampola6], I do not receive at-mention emails from Apache 
JIRA so was informed by someone else about this issue today.

> Unique Id generation support queryId
> 
>
> Key: PHOENIX-6522
> URL: https://issues.apache.org/jira/browse/PHOENIX-6522
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Kiran Kumar Maturi
>Assignee: Pola Sairam
>Priority: Major
>
> Sometimes a user might want a queryId to be generated for the query rather 
> than adding it themselves. This feature will be config-based; if enabled, it 
> will generate a queryId for all queries.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6522) Unique Id generation support queryId

2023-05-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721958#comment-17721958
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6522 at 5/12/23 12:30 AM:


My apologies [~sairampola6], I do not receive at-mention emails from Apache 
JIRA so was informed by someone else about this issue today. I have made you a 
contributor on the Phoenix project and assigned this issue to you. Going 
forward you should be able to self-assign issues. 


was (Author: apurtell):
My apologies [~sairampola6], I do not receive at-mention emails from Apache 
JIRA so was informed by someone else about this issue today.

> Unique Id generation support queryId
> 
>
> Key: PHOENIX-6522
> URL: https://issues.apache.org/jira/browse/PHOENIX-6522
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Kiran Kumar Maturi
>Assignee: Pola Sairam
>Priority: Major
>
> Sometimes a user might want a queryId to be generated for the query rather 
> than adding it themselves. This feature will be config-based; if enabled, it 
> will generate a queryId for all queries.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17657016#comment-17657016
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6786 at 1/11/23 12:10 AM:


apurtell commented on PR #1549:
URL: https://github.com/apache/phoenix/pull/1549#issuecomment-1378059506

   @kadirozde Is this the kind of refactoring you had in mind?
   
   This patch:
   - Adds a preBatchMutate hook to SequenceRegionObserver that will filter any 
Append or Increment operations submitted in a batch and handle them ahead of base 
HBase processing. Puts and Deletes will be passed through to normal handling as 
required. Phoenix rewrites Increments into Puts and submits them via 
Region#batchMutate, which would otherwise set up a cycle with preBatchMutate, I 
believe (see the sketch after this list).
   - Retains existing hooks for preAppend and preIncrement that are necessary 
for intercepting other APIs.
   - Refactors most logic into reusable private methods.
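
For orientation, a rough sketch of the interception pattern described in the 
first bullet; this is not the actual patch, and the class name and the 
handleAtomicOperation helper are illustrative:

{noformat}
import java.io.IOException;
import java.util.Optional;

import org.apache.hadoop.hbase.client.Append;
import org.apache.hadoop.hbase.client.Increment;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.RegionObserver;
import org.apache.hadoop.hbase.regionserver.MiniBatchOperationInProgress;
import org.apache.hadoop.hbase.regionserver.OperationStatus;

public class BatchInterceptingObserver implements RegionCoprocessor, RegionObserver {

  @Override
  public Optional<RegionObserver> getRegionObserver() {
    return Optional.of(this);
  }

  @Override
  public void preBatchMutate(ObserverContext<RegionCoprocessorEnvironment> c,
      MiniBatchOperationInProgress<Mutation> miniBatchOp) throws IOException {
    for (int i = 0; i < miniBatchOp.size(); i++) {
      Mutation m = miniBatchOp.getOperation(i);
      // Only Appends and Increments are handled here; Puts and Deletes fall
      // through untouched to normal HBase batch processing.
      if (m instanceof Increment || m instanceof Append) {
        handleAtomicOperation(c.getEnvironment(), m);
        // Mark the slot done so base processing skips it.
        miniBatchOp.setOperationStatus(i, OperationStatus.SUCCESS);
      }
    }
  }

  // Illustrative placeholder for logic shared with preIncrement/preAppend.
  private void handleAtomicOperation(RegionCoprocessorEnvironment env, Mutation m)
      throws IOException {
  }
}
{noformat}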





was (Author: githubbot):
apurtell commented on PR #1549:
URL: https://github.com/apache/phoenix/pull/1549#issuecomment-1378059506

   @kadirozde Is this the kind of refactoring you had in mind?
   
   This patch:
   - Adds a preBatchMutate hook that will filter any Append or Increment 
operations submitted in batch and handle them ahead of base HBase processing. 
   - Retains existing hooks for preAppend and preIncrement that are necessary 
for intercepting other APIs.
   - Refactor most logic to reusable private methods.  




> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-10 Thread Andrew Kyle Purtell (Jira)


[ https://issues.apache.org/jira/browse/PHOENIX-6786 ]


Andrew Kyle Purtell deleted comment on PHOENIX-6786:
--

was (Author: apurtell):
WIP: https://github.com/apache/phoenix/pull/1549
[~kozdemir] is this the sort of refactoring you had in mind?

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17657017#comment-17657017
 ] 

Andrew Kyle Purtell commented on PHOENIX-6786:
--

WIP: https://github.com/apache/phoenix/pull/1549
[~kozdemir] is this the sort of refactoring you had in mind?

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-06 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655561#comment-17655561
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6786 at 1/6/23 7:29 PM:
--

Currently I am making sure tests start from a clean place. I am finding I have 
to pass -Dskip.code-coverage=true to avoid repeatable jacoco-related build errors 
("malformed input around byte 2", like 
https://stackoverflow.com/questions/55381133/an-error-has-occured-in-jacoco-report-generation).
This is also with -Dhbase.profile=2.5, fwiw.


was (Author: apurtell):
Currently I am making sure tests start at a clean place. Finding I am having to 
do -Dskip.code-coverage=true to avoid repeatable jacoco related build errors 
("malformed input around byte 2", like 
https://stackoverflow.com/questions/55381133/an-error-has-occured-in-jacoco-report-generation).
 

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-06 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655561#comment-17655561
 ] 

Andrew Kyle Purtell commented on PHOENIX-6786:
--

Currently I am making sure tests start at a clean place. Finding I am having to 
do -Dskip.code-coverage=true to avoid repeatable jacoco related build errors 
("malformed input around byte 2", like 
https://stackoverflow.com/questions/55381133/an-error-has-occured-in-jacoco-report-generation).
 

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2023-01-05 Thread Andrew Kyle Purtell (Jira)


[ https://issues.apache.org/jira/browse/PHOENIX-6786 ]


Andrew Kyle Purtell deleted comment on PHOENIX-6786:
--

was (Author: apurtell):
[~kozdemir] Planning to make the SequenceRegionObserver changes that [~gjacoby] 
recommended. 

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6787) Server-side Sequence Update Consolidation

2022-11-07 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630041#comment-17630041
 ] 

Andrew Kyle Purtell commented on PHOENIX-6787:
--

[~kozdemir] Taking this one too. 

> Server-side Sequence Update Consolidation
> -
>
> Key: PHOENIX-6787
> URL: https://issues.apache.org/jira/browse/PHOENIX-6787
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.3.0
>
>
> For secondary indexes, we have optimizations so that if multiple mutations 
> are waiting on the same row lock, all subsequent mutations can re-use the 
> previous mutation's final state and avoid an extra Get. 
> We can apply a similar idea to Phoenix sequences. If there's a "hot" sequence 
> with multiple requests queueing for a Sequence row lock, we can consolidate 
> them down to one set of Get / Put operations, then satisfy them all. This 
> change is transparent to the clients. 
> Note that if this consolidation would cause the sequence update to fail when 
> some of the requests would have succeeded otherwise, we should not 
> consolidate. (An example is if a sequence has cycling disabled, and the first 
> request would not overflow, but the first and second combined would. In this 
> case we should let the first request go through unconsolidated, and fail the 
> second request.) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6786) SequenceRegionObserver should use batch mutation coproc hooks

2022-11-07 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630040#comment-17630040
 ] 

Andrew Kyle Purtell commented on PHOENIX-6786:
--

[~kozdemir] Planning to make the SequenceRegionObserver changes that [~gjacoby] 
recommended. 

> SequenceRegionObserver should use batch mutation coproc hooks
> -
>
> Key: PHOENIX-6786
> URL: https://issues.apache.org/jira/browse/PHOENIX-6786
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> SequenceRegionObserver uses preIncrement but could use the standard batch 
> mutation coproc hooks, similarly to how atomic upserts work after 
> PHOENIX-6387. This will simplify the code and also make it easier to re-use 
> code from secondary index generation in performance optimizations. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6768) phoenix-core build fails on jacoco during report generation

2022-08-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579197#comment-17579197
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6768 at 8/13/22 1:11 AM:
---

[~stoty] If this is still open when I am back from vacation at the end of the 
month I will look at it.


was (Author: apurtell):
[~stoty] I was going to look at this but hit PHOENIX-6769 first. If this is 
still open when I am back from vacation at the end of the month I will look at 
it.

> phoenix-core build fails on jacoco during report generation
> ---
>
> Key: PHOENIX-6768
> URL: https://issues.apache.org/jira/browse/PHOENIX-6768
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> Anyone seen this?
> {noformat}
> [ERROR] Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.7:report 
> (report) on project phoenix-core:
> An error has occurred in JaCoCo report generation.:
> Error while creating report:
> Unknown block type 0. -> [Help 1]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :phoenix-core
> {noformat}
> Related, I don't see any way to disable jacoco with a build profile.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6769) Align mockito version with Hadoop and HBase

2022-08-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579198#comment-17579198
 ] 

Andrew Kyle Purtell commented on PHOENIX-6769:
--

As for the original issue that caused me to open an earlier version of this 
ticket, that was the classic "you have not compiled HBase 2 for Phoenix" 
problem, but the mockito issues are definitely a latent concern regardless.

> Align mockito version with Hadoop and HBase
> ---
>
> Key: PHOENIX-6769
> URL: https://issues.apache.org/jira/browse/PHOENIX-6769
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.3.0
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.3.0
>
>
> There is a well known incompatibility between old versions of mockito-all and 
> mockito-core and newer versions. It manifests as 
> IncompatibleClassChangeErrors and other linkage problems. The Hadoop 
> minicluster in versions 3.x embed mockito classes in the minicluster. 
> To avoid potential problems it would be best to align Phoenix use of mockito 
> (mockito-core) with downstreamers. HBase uses mockito-core 2.28.2 on 
> branch-2.4 and branch-2.5. (Phoenix is on 1.10.19.) I checked Hadoop 
> branch-3.3 and it's also on 2.28.2.
> I recently opened a PR for OMID-226 to fix the same concern in phoenix-omid. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6769) Unit tests failing with IncompatibleClassChangeError

2022-08-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579196#comment-17579196
 ] 

Andrew Kyle Purtell commented on PHOENIX-6769:
--

[~rushabh.shah] It would be good to make this change out in open source and 
then pick it back into our internal version.

I'm out for two weeks on vacation starting basically now. I can pick this up 
when back at the end of the month if nobody has tackled it in the meantime, but 
it is not difficult... See OMID-226 for an example of how to accomplish it.

> Unit tests failing with IncompatibleClassChangeError
> 
>
> Key: PHOENIX-6769
> URL: https://issues.apache.org/jira/browse/PHOENIX-6769
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.3.0
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.3.0
>
>
> There is a well known incompatibility between old versions of mockito-all and 
> mockito-core and newer versions. I'm not exactly sure what the trigger was, 
> perhaps PHOENIX-6753, but tests fail for me when run locally exhibiting the 
> classic symptom of this problem.
> The fix is to align your use of mockito (mockito-core) with downstreamers. 
> HBase uses mockito-core 2.28.2 on branch-2.4 and branch-2.5. (Phoenix is on 
> 1.10.19.) I checked Hadoop branch-3.3 and it's also on 2.28.2.
> I recently opened a PR for OMID-226 to fix the same problem in phoenix-omid. 
> For example
> {noformat}
> [ERROR] 
> org.apache.phoenix.hbase.index.write.recovery.TestPerRegionIndexWriteCache.testMultipleAddsForSingleRegion
>   Time elapsed: 0.035 s  <<< ERROR!
> java.lang.IncompatibleClassChangeError: Found interface 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus, but class was expected
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createOutput(FanOutOneBlockAsyncDFSOutputHelper.java:535)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$400(FanOutOneBlockAsyncDFSOutputHelper.java:112)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$8.doCall(FanOutOneBlockAsyncDFSOutputHelper.java:615)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$8.doCall(FanOutOneBlockAsyncDFSOutputHelper.java:610)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.createOutput(FanOutOneBlockAsyncDFSOutputHelper.java:623)
>   at 
> org.apache.hadoop.hbase.io.asyncfs.AsyncFSOutputHelper.createOutput(AsyncFSOutputHelper.java:53)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.initOutput(AsyncProtobufLogWriter.java:190)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:160)
>   at 
> org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:116)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:723)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:129)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:833)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:547)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.init(AbstractFSWAL.java:488)
>   at 
> org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:160)
>   at 
> org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:62)
>   at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:295)
>   at 
> org.apache.phoenix.hbase.index.write.recovery.TestPerRegionIndexWriteCache.setUp(TestPerRegionIndexWriteCache.java:109)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6768) phoenix-core build fails on jacoco during report generation

2022-08-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579197#comment-17579197
 ] 

Andrew Kyle Purtell commented on PHOENIX-6768:
--

[~stoty] I was going to look at this but hit PHOENIX-6769 first. If this is 
still open when I am back from vacation at the end of the month I will look at 
it.

> phoenix-core build fails on jacoco during report generation
> ---
>
> Key: PHOENIX-6768
> URL: https://issues.apache.org/jira/browse/PHOENIX-6768
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> Anyone seen this?
> {noformat}
> [ERROR] Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.7:report 
> (report) on project phoenix-core:
> An error has occurred in JaCoCo report generation.:
> Error while creating report:
> Unknown block type 0. -> [Help 1]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :phoenix-core
> {noformat}
> Related, I don't see any way to disable jacoco with a build profile.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:24 PM:
---

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, using the 
snapshot as input, into multiple compressed textual DML files. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and repopulated on the fly as the base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 
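
For illustration, a hedged sketch of what such an import tool's inner loop might 
look like, assuming gzip-compressed dump files containing one UPSERT statement 
per line; the JDBC URL and directory name are placeholders:

{noformat}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.zip.GZIPInputStream;

public class ImportDump {
  public static void main(String[] args) throws Exception {
    // Assumes the DDL file has already been executed, so the tables, views,
    // and indexes exist and indexes are repopulated as rows are upserted.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-quorum");
         Statement stmt = conn.createStatement();
         DirectoryStream<Path> dmlFiles =
             Files.newDirectoryStream(Paths.get("dump"), "*.sql.gz")) {
      for (Path file : dmlFiles) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
            new GZIPInputStream(Files.newInputStream(file)), StandardCharsets.UTF_8))) {
          String line;
          while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) {
              stmt.executeUpdate(line); // one UPSERT statement per line
            }
          }
          conn.commit(); // Phoenix connections do not auto-commit by default
        }
      }
    }
  }
}
{noformat}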

It is very likely the user would configure output to a public cloud object 
store like S3, but because we have any/all of Hadoop's output formats 
available, it could go to a big local HDFS volume as an alternative if required.

WDYT? File an issue for this? 


was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, using the 
snapshot as input, into multiple compressed textual DML files. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:23 PM:
---

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, using the 
snapshot as input, into multiple compressed textual DML files, like mysqldump. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 


was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table into 
multiple compressed textual DML files, like mysqldump. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:23 PM:
---

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, using the 
snapshot as input, into multiple compressed textual DML files. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 


was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, using the 
snapshot as input, into multiple compressed textual DML files, like mysqldump. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:22 PM:
---

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table into 
multiple compressed textual DML files, like mysqldump. 

Indexes are ignored because they will be recreated when the DDL file is 
executed and recreated on the fly as base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 


was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring 
indexes, into multiple compressed textual DML files, like mysqldump. 

I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:21 PM:
---

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring 
indexes, into multiple compressed textual DML files, like mysqldump. 

I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 


was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. Perhaps a table snapshot 
mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring 
indexes, into multiple compressed textual DML files, like mysqldump. 

I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578555#comment-17578555
 ] 

Andrew Kyle Purtell commented on PHOENIX-6627:
--

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling followup. From what I have read on 
background for this work it is unlikely that there is a Tephra table user out 
there, but such a tool can be written and tested and made available before a 
release with a change like PHOENIX-6627 included. Perhaps a table snapshot 
mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring 
indexes, into multiple compressed textual DML files, like mysqldump. 

I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 

WDYT? File an issue for this? 

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578163#comment-17578163
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/10/22 9:52 PM:
---

Anyone who has deployed Tephra transactional tables, if such a user actually 
exists, would be at a dead end? 

MetaDataClient does not allow going from transactional to non transactional. 
This is discussed there in some comments that describe how for Tephra tables 
all cell timestamps would need to be rewritten and Tephra-specific delete 
markers would need to be removed or converted into normal tombstones to process 
such an ALTER. I think it is probably out of scope of a "remove" JIRA to try 
and implement complex one-off functionality, especially for an option suspected 
of never being deployed, but want to raise the issue. 

ALTER of an OMID transactional to nontransactional table would be problematic 
for broadly similar reasons. All of the shadow metadata should be discarded.

Any txn engine is likely to have this general class of problem. 

What would you recommend to users? "Before upgrading, if you have a Tephra 
transactional table, first run a table export operation. Then drop the table. 
Then upgrade. Then create a replacement table. Then import table data from the 
earlier export." ? 

[~stoty] [~gjacoby] [~kadir] [~jisaac]


was (Author: apurtell):
Anyone who has deployed Tephra transitional tables, if such a user actually 
exists, would be at a dead end? 

MetaDataClient does not allow going from transactional to non transactional. 
This is discussed there in some comments that describe how for Tephra tables 
all cell timestamps would need to be rewritten and Tephra-specific delete 
markers would need to be removed or concerted into normal tombstones to process 
such an ALTER. I think it is probably out of scope of a "remove" JIRA to try 
and implement complex one-off functionality, especially for an option suspected 
of never being deployed, but want to raise the issue. 

ALTER of an OMID transactional to nontransactional table would be problematic 
for broadly similar reasons. All of the shadow metadata should be discarded.

Any txn engine is likely to have this general class of problem. 

What would you recommend to users? "Before upgrading, if you have a Tephra 
transactional table, first run a table export operation. Then drop the table. 
Then upgrade. Then create a replacement table. Then import table data from the 
earlier export." ? 

[~stoty] [~gjacoby] [~kadir] [~jisaac]

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578163#comment-17578163
 ] 

Andrew Kyle Purtell commented on PHOENIX-6627:
--

Anyone who has deployed Tephra transitional tables, if such a user actually 
exists, would be at a dead end? 

MetaDataClient does not allow going from transactional to non transactional. 
This is mentioned in comments that describe how for Tephra tables all cell 
timestamps would need to be rewritten and Tephra-specific delete markers would 
need to be removed or concerted into normal tombstones to process such an 
ALTER. I think it is probably out of scope of a "remove" JIRA to try and 
implement complex one-off functionality, especially for an option suspected of 
never being deployed, but want to raise the issue. 

ALTER of an OMID transactional to nontransactional table would be problematic 
for broadly similar reasons. All of the shadow metadata should be discarded.

Any txn engine is likely to have this general class of problem. 

What would you recommend to users? "Before upgrading, if you have a Tephra 
transactional table, first run a table export operation. Then drop the table. 
Then upgrade. Then create a replacement table. Then import table data from the 
earlier export." ? 

[~stoty] [~gjacoby] [~kadir] [~jisaac]

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578163#comment-17578163
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/10/22 9:51 PM:
---

Anyone who has deployed Tephra transitional tables, if such a user actually 
exists, would be at a dead end? 

MetaDataClient does not allow going from transactional to non transactional. 
This is discussed there in some comments that describe how for Tephra tables 
all cell timestamps would need to be rewritten and Tephra-specific delete 
markers would need to be removed or concerted into normal tombstones to process 
such an ALTER. I think it is probably out of scope of a "remove" JIRA to try 
and implement complex one-off functionality, especially for an option suspected 
of never being deployed, but want to raise the issue. 

ALTER of an OMID transactional to nontransactional table would be problematic 
for broadly similar reasons. All of the shadow metadata should be discarded.

Any txn engine is likely to have this general class of problem. 

What would you recommend to users? "Before upgrading, if you have a Tephra 
transactional table, first run a table export operation. Then drop the table. 
Then upgrade. Then create a replacement table. Then import table data from the 
earlier export." ? 

[~stoty] [~gjacoby] [~kadir] [~jisaac]


was (Author: apurtell):
Anyone who has deployed Tephra transactional tables, if such a user actually 
exists, would be at a dead end? 

MetaDataClient does not allow going from transactional to non-transactional. 
This is mentioned in comments that describe how, for Tephra tables, all cell 
timestamps would need to be rewritten and Tephra-specific delete markers would 
need to be removed or converted into normal tombstones to process such an 
ALTER. I think it is probably out of scope of a "remove" JIRA to try to 
implement complex one-off functionality, especially for an option suspected of 
never being deployed, but I want to raise the issue. 

ALTER of an OMID transactional to nontransactional table would be problematic 
for broadly similar reasons. All of the shadow metadata should be discarded.

Any txn engine is likely to have this general class of problem. 

What would you recommend to users? "Before upgrading, if you have a Tephra 
transactional table, first run a table export operation. Then drop the table. 
Then upgrade. Then create a replacement table. Then import table data from the 
earlier export." ? 

[~stoty] [~gjacoby] [~kadir] [~jisaac]

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578109#comment-17578109
 ] 

Andrew Kyle Purtell commented on PHOENIX-6627:
--

Taking this, thanks [~gyzsolt]. Will ping when there is a PR ready for review.

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-09 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577481#comment-17577481
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/9/22 3:22 PM:
--

{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing downstream code will fail to compile. If that 
is not a concern, then yes, the ordinal stored into SYSCAT would not change, so 
that aspect of the change would be compatible.
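
A minimal sketch of why the constant has to stay put (names illustrative, not 
the actual Phoenix provider enum):

{code:java}
// If TEPHRA were removed or renamed, downstream code referencing it would no
// longer compile, and removing it would also shift the ordinal of any constant
// declared after it, breaking values already persisted in SYSTEM.CATALOG.
public enum TransactionProviderSketch {
    TEPHRA, // ordinal 0, kept only for source and ordinal compatibility
    OMID    // ordinal 1, must not shift
}
{code}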

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

Release notes would need to insist that any Tephra-based transactional table 
that is deployed must be ALTERed to be non-transactional first; this is a 
must in any case, and doing so removes the coprocessor name from the list. 

However, to be safe, a no-op coprocessor with the same class name can be kept in 
place so that if it is still in the table coprocessor list it will not fail to load.
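
One possible shape of such a shim, as a sketch assuming the HBase 2.x 
coprocessor API (the class and package names are illustrative, not necessarily 
the real Tephra coprocessor class):

{code:java}
package org.apache.phoenix.coprocessor; // must match the name recorded in the table descriptor

import java.util.Optional;

import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
import org.apache.hadoop.hbase.coprocessor.RegionObserver;

/** No-op stand-in so regions whose descriptors still list the old Tephra
 *  coprocessor class can open after the real implementation is removed. */
public class TephraTransactionalProcessor implements RegionCoprocessor, RegionObserver {
    @Override
    public Optional<RegionObserver> getRegionObserver() {
        // All RegionObserver hooks are inherited defaults, i.e. no-ops.
        return Optional.of(this);
    }
}
{code}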


was (Author: apurtell):
{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

Release notes would need to insist that any Tephra-based transactional table 
that is deployed must be ALTERed to be non-transactional first; this is a 
must in any case, and doing so removes the coprocessor name from the list. 

However, to be safe, a no-op coprocessor with the same class name can be kept in 
place so that if it is still in the table coprocessor list it will not fail to load.

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-09 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577481#comment-17577481
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/9/22 3:20 PM:
--

{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

Release notes would need to insist that any Tephra-based transactional table 
that is deployed must be ALTERed to be non-transactional first; this is a 
must in any case, and doing so removes the coprocessor name from the list. 

However, to be safe, a no-op coprocessor with the same class name can be kept in 
place so that if it is still in the table coprocessor list it will not fail to 
load. The implementation itself is deleted and the build dependencies are dropped.




was (Author: apurtell):
{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

The idea here is to keep a no-op coprocessor with the same class name in place 
so if it is still in the table coprocessor list it will not fail to load. 

Anyway, release notes would need to insist that any Tephra-based transactional 
table that is deployed must be ALTERed to be non-transactional first; this is 
a must in any case, and doing so removes the coprocessor name from the list.

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-09 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577481#comment-17577481
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/9/22 3:20 PM:
--

{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

Release notes would need to insist that any Tephra-based transactional table 
that is deployed must be ALTERed to be non-transactional first; this is a 
must in any case, and doing so removes the coprocessor name from the list. 

However, to be safe, a no-op coprocessor with the same class name can be kept in 
place so that if it is still in the table coprocessor list it will not fail to load.


was (Author: apurtell):
{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

Release notes would need to insist that any Tephra-based transactional table 
that is deployed must be ALTERed to be non-transactional first; this is a 
must in any case, and doing so removes the coprocessor name from the list. 

However, to be safe, a no-op coprocessor with the same class name can be kept in 
place so that if it is still in the table coprocessor list it will not fail to 
load. The implementation itself is deleted and the build dependencies are dropped.



> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-09 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577481#comment-17577481
 ] 

Andrew Kyle Purtell commented on PHOENIX-6627:
--

{quote}
I agree on keeping (but preferably renaming) the enum value.
{quote}
It cannot be renamed, or existing code will fail to compile. 

{quote} 
Come to think of it, we cannot just remove the Tephra coprocessor, we MUST keep 
it, otherwise the RS is just going to fail to start up after an upgrade.
{quote}

The idea here is to keep a no-op coprocessor with the same class name in place 
so if it is still in the table coprocessor list it will not fail to load. 

Anyway, release notes would need to insist that any Tephra-based transactional 
table that is deployed must be ALTERed to be non-transactional first; this is 
a must in any case, and doing so removes the coprocessor name from the list.

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-08 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577049#comment-17577049
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/9/22 12:06 AM:
---

[~gyzsolt] [~stoty] [~gjacoby]

I see PR #1455 has been in draft state for a while. Is this still being 
actively worked on? Do you need someone else to pick it up? I volunteer for 
that if so. 

Also wondering if backwards compatibility support is really needed, given this:

bq. And no, I've never seen a live Tephra use case in production.

Can Tephra and its libthrift dependency (0.9.x has a high scoring CVE 
associated with it) simply be unconditionally removed? The TEPHRA enum value 
would remain for compatibility as the current patch provides, but the Tephra 
transaction manager class and the Tephra coprocessor classes should be removed, 
and the no-op transaction manager should be patched so the coprocessor name and 
capabilities getters return null rather than throw an 
UnsupportedOperationException.
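
A minimal sketch of that behavior change (class and method names illustrative, 
not the exact Phoenix provider interface):

{code:java}
// Before: these getters threw UnsupportedOperationException, forcing callers to
// special-case the no-op provider. After: they return null and callers simply
// skip coprocessor installation and capability checks.
public class NoOpTransactionProviderSketch {
    public String getCoprocessorClassName() {
        return null; // nothing to install for non-transactional tables
    }

    public String getCapabilities() {
        return null; // no transactional capabilities to advertise
    }
}
{code}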


was (Author: apurtell):
[~gyzsolt] [~stoty] [~gjacoby]

I see PR #1455 has been in draft state for a while. Is this still being 
actively worked on? Do you need someone else to pick it up? I volunteer for 
that if so. 

Also wondering if backwards compatibility support is really needed, given this:

bq. And no, I've never seen a live Tephra use case in production.

Can Tephra and its libthrift dependency (0.9.x has a high scoring CVE 
associated with it) simply be unconditionally removed?

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-08 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577049#comment-17577049
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/9/22 12:06 AM:
---

[~gyzsolt] [~stoty] [~gjacoby]

I see PR #1455 has been in draft state for a while. Is this still being 
actively worked on? Do you need someone else to pick it up? I volunteer for 
that if so. 

Also wondering if backwards compatibility support is really needed, given this:

bq. And no, I've never seen a live Tephra use case in production.

Can Tephra and its libthrift dependency (0.9.x has a high scoring CVE 
associated with it) simply be unconditionally removed? The TEPHRA enum value 
would remain for compatibility as the current patch provides, but the Tephra 
transaction manager class and the Tephra coprocessor classes could be deleted, 
and the no-op transaction manager should be patched so the coprocessor name and 
capabilities getters return null rather than throw an 
UnsupportedOperationException.


was (Author: apurtell):
[~gyzsolt] [~stoty] [~gjacoby]

I see PR #1455 has been in draft state for a while. Is this still being 
actively worked on? Do you need someone else to pick it up? I volunteer for 
that if so. 

Also wondering if backwards compatibility support is really needed, given this:

bq. And no, I've never seen a live Tephra use case in production.

Can Tephra and its libthrift dependency (0.9.x has a high scoring CVE 
associated with it) simply be unconditionally removed? The TEPHRA enum value 
would remain for compatibility as the current patch provides, but the Tephra 
transaction manager class and the Tephra coprocessor classes should be removed, 
and the no-op transaction manager should be patched so the coprocessor name and 
capabilities getters return null rather than throw an 
UnsupportedOperationException.

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PHOENIX-6627) Remove all references to Tephra from 4.x and master

2022-08-08 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577049#comment-17577049
 ] 

Andrew Kyle Purtell commented on PHOENIX-6627:
--

[~gyzsolt] [~stoty] [~gjacoby]

I see PR #1455 has been in draft state for a while. Is this still being 
actively worked on? Do you need someone else to pick it up? I volunteer for 
that if so. 

Also wondering if backwards compatibility support is really needed, given this:

bq. And no, I've never seen a live Tephra use case in production.

Can Tephra and its libthrift dependency (0.9.x has a high scoring CVE 
associated with it) simply be unconditionally removed?

> Remove all references to Tephra from 4.x and master
> ---
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
>  Issue Type: Sub-task
>  Components: 4.x, tephra
>Reporter: Istvan Toth
>Assignee: Zsolt Gyulavari
>Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (PHOENIX-6692) Add HBase 2.5 support

2022-06-16 Thread Andrew Kyle Purtell (Jira)


[ https://issues.apache.org/jira/browse/PHOENIX-6692 ]


Andrew Kyle Purtell deleted comment on PHOENIX-6692:
--

was (Author: apurtell):
2.4.13 is coming soon but currently waiting a bit more to see if HBASE-27097 
can land.

> Add HBase 2.5 support
> -
>
> Key: PHOENIX-6692
> URL: https://issues.apache.org/jira/browse/PHOENIX-6692
> Project: Phoenix
>  Issue Type: New Feature
>  Components: core
>Reporter: Geoffrey Jacoby
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0
>
>
> I was talking with [~apurtell], who's RM for HBase 2.5, and he let me know 
> that HBase 2.5 will be released very soon. Since we're also planning on 
> releasing Phoenix 5.2 soon, we should make sure it releases with HBase 2.5 
> support assuming this isn't too time-consuming / complicated. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-6692) Add HBase 2.5 support

2022-06-16 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17555300#comment-17555300
 ] 

Andrew Kyle Purtell commented on PHOENIX-6692:
--

2.4.13 is coming soon but currently waiting a bit more to see if HBASE-27097 
can land.

> Add HBase 2.5 support
> -
>
> Key: PHOENIX-6692
> URL: https://issues.apache.org/jira/browse/PHOENIX-6692
> Project: Phoenix
>  Issue Type: New Feature
>  Components: core
>Reporter: Geoffrey Jacoby
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0
>
>
> I was talking with [~apurtell], who's RM for HBase 2.5, and he let me know 
> that HBase 2.5 will be released very soon. Since we're also planning on 
> releasing Phoenix 5.2 soon, we should make sure it releases with HBase 2.5 
> support assuming this isn't too time-consuming / complicated. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-6692) Add HBase 2.5 support

2022-06-16 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17555299#comment-17555299
 ] 

Andrew Kyle Purtell commented on PHOENIX-6692:
--

Thanks [~stoty]

> Add HBase 2.5 support
> -
>
> Key: PHOENIX-6692
> URL: https://issues.apache.org/jira/browse/PHOENIX-6692
> Project: Phoenix
>  Issue Type: New Feature
>  Components: core
>Reporter: Geoffrey Jacoby
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0
>
>
> I was talking with [~apurtell], who's RM for HBase 2.5, and he let me know 
> that HBase 2.5 will be released very soon. Since we're also planning on 
> releasing Phoenix 5.2 soon, we should make sure it releases with HBase 2.5 
> support assuming this isn't too time-consuming / complicated. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-1422) Stateless Sequences

2022-06-14 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554225#comment-17554225
 ] 

Andrew Kyle Purtell commented on PHOENIX-1422:
--

Sounds good. I will resolve this as WontFix.

> Stateless Sequences
> ---
>
> Key: PHOENIX-1422
> URL: https://issues.apache.org/jira/browse/PHOENIX-1422
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>Priority: Major
>
> [~apurtell] and I were chatting yesterday.
> It would be good if Phoenix had stateless sequences, i.e. sequences that give 
> out unique ids with (very) high probability.
> We can do that by starting with a timestamp, shifting it left 16 or 24 bits 
> and filling in the new bits with a random number.
> So we're guaranteed to get a new id for each millisecond, and within a 
> millisecond we break the tie with a random number. If we can make the 
> likelihood of duplicate numbers lower than (say) a data center failure, we're 
> OK. I would test this with an upsert into x ... select from x ... type query 
> inserting 100's of millions of rows.
> Need to think of a syntax too.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-6692) Add HBase 2.5 support

2022-06-14 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554223#comment-17554223
 ] 

Andrew Kyle Purtell commented on PHOENIX-6692:
--

[~stoty] [~vjasani] got us to back out the change to scanner semantics that 
broke your IT tests. Fix will be in 2.4.13. You may want to make a note 
somewhere that 2.4.11 and 2.4.12 are the affected releases. 

> Add HBase 2.5 support
> -
>
> Key: PHOENIX-6692
> URL: https://issues.apache.org/jira/browse/PHOENIX-6692
> Project: Phoenix
>  Issue Type: New Feature
>  Components: core
>Reporter: Geoffrey Jacoby
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 5.2.0
>
>
> I was talking with [~apurtell], who's RM for HBase 2.5, and he let me know 
> that HBase 2.5 will be released very soon. Since we're also planning on 
> releasing Phoenix 5.2 soon, we should make sure it releases with HBase 2.5 
> support assuming this isn't too time-consuming / complicated. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-1422) Stateless Sequences

2022-06-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553299#comment-17553299
 ] 

Andrew Kyle Purtell commented on PHOENIX-1422:
--

[~kozdemir] I am wondering if you have any thoughts on this. Is this worth 
resurrecting? 

The SYSTEM.SEQUENCE table can hotspot by design because a stateful sequence has 
to be updated with an atomic operation, even if that update is done such that a 
set of sequences can be handed out in batch. 
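
For contrast, a minimal sketch of the stateless scheme described in the issue 
below (the 16-bit shift and the class name are illustrative assumptions):

{code:java}
import java.util.concurrent.ThreadLocalRandom;

public final class StatelessIdSketch {
    private static final int RANDOM_BITS = 16; // the proposal suggests 16 or 24

    /** Millisecond timestamp shifted left, low bits filled with a random value:
     *  no shared state, unique across milliseconds, ties broken randomly. */
    public static long next() {
        long ts = System.currentTimeMillis();
        long random = ThreadLocalRandom.current().nextLong(1L << RANDOM_BITS);
        return (ts << RANDOM_BITS) | random;
    }
}
{code}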

> Stateless Sequences
> ---
>
> Key: PHOENIX-1422
> URL: https://issues.apache.org/jira/browse/PHOENIX-1422
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>Priority: Major
>
> [~apurtell] and I were chatting yesterday.
> It would be good if Phoenix had stateless sequences, i.e. sequences that give 
> out unique ids with (very) high probability.
> We can do that by starting with a timestamp, shifting it left 16 or 24 bits 
> and filling in the new bits with a random number.
> So we're guaranteed to get a new id for each millisecond, and within a 
> millisecond we break the tie with a random number. If we can make the 
> likelihood of duplicate numbers lower than (say) a data center failure, we're 
> OK. I would test this with an upsert into x ... select from x ... type query 
> inserting 100's of millions of rows.
> Need to think of a syntax too.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-6672) Move phoenix website from svn to git

2022-05-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535812#comment-17535812
 ] 

Andrew Kyle Purtell commented on PHOENIX-6672:
--

I don't know how to do this. Subscribe to the Infrastructure mailing list 
infrastruct...@apache.org or file an INFRA JIRA asking for advice. 

> Move phoenix website from svn to git
> 
>
> Key: PHOENIX-6672
> URL: https://issues.apache.org/jira/browse/PHOENIX-6672
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Aman Poonia
>Assignee: Kiran Kumar Maturi
>Priority: Minor
>
> Currently we have our website hosted from SVN. It would be good to move it to 
> git to let other developers create PRs as they do for any JIRA. This will help 
> us improve the workflow for contributing to Phoenix documentation.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (PHOENIX-6672) Move phoenix website from svn to git

2022-05-11 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17535812#comment-17535812
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6672 at 5/12/22 2:18 AM:
---

I don't know how to do this. Subscribe to the Infrastructure mailing list 
infrastruct...@apache.org and then ask there and/or file an INFRA JIRA asking 
for advice. 


was (Author: apurtell):
I don't know how to do this. Subscribe to the Infrastructure mailing list 
infrastruct...@apache.org or file an INFRA JIRA asking for advice. 

> Move phoenix website from svn to git
> 
>
> Key: PHOENIX-6672
> URL: https://issues.apache.org/jira/browse/PHOENIX-6672
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Aman Poonia
>Assignee: Kiran Kumar Maturi
>Priority: Minor
>
> Currently we have our website hosted from SVN. It would be good to move it to 
> git to let other developers create PRs as they do for any JIRA. This will help 
> us improve the workflow for contributing to Phoenix documentation.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PHOENIX-6506) Tenant Connection is not able to access/validate Global Sequences

2021-07-06 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375914#comment-17375914
 ] 

Andrew Kyle Purtell commented on PHOENIX-6506:
--

[~mihir6692] Done!

> Tenant Connection is not able to access/validate Global Sequences
> -
>
> Key: PHOENIX-6506
> URL: https://issues.apache.org/jira/browse/PHOENIX-6506
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lokesh Khurana
>Priority: Major
> Attachments: PHOENIX-6506-test.patch
>
>
> In our test environment we were using a sequence (created using a global 
> connection) in our table. Recently we updated that table to be MULTI_TENANT, 
> and both types of connections (tenant and global) need to upsert data 
> into the table.
> Tenant connections are not able to upsert data into the table, as they cannot 
> validate the sequence and get SequenceNotFoundException. 
> Adding a patch with a test failing due to this scenario.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-6438) Compilation failures after changes to Private annotated HBase class RpcControllerFactory

2021-04-08 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17317472#comment-17317472
 ] 

Andrew Kyle Purtell commented on PHOENIX-6438:
--

Pursuing a short term fix by adding back compat methods for HBASE-25735 before 
it gets released. Then, let's get the audience annotation upgraded via 
HBASE-25750. 

> Compilation failures after changes to Private annotated HBase class 
> RpcControllerFactory
> 
>
> Key: PHOENIX-6438
> URL: https://issues.apache.org/jira/browse/PHOENIX-6438
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.1.1
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> After HBASE-25735 (not released yet, but coming up in 2.4.3)
> there are compilation failures in 
> phoenix-core/src/main/java/org/apache/hadoop/hbase/ipc/controller/ClientRpcControllerFactory.java,
>  
> phoenix-core/src/main/java/org/apache/hadoop/hbase/ipc/controller/InterRegionServerMetadataRpcControllerFactory.java.
>  
> Phoenix can't be taking this dependency on a Private annotated class and/or 
> the annotation must be changed to LimitedPrivate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-6045) Delete that should qualify for index path does not use index when multiple indexes are available.

2020-08-05 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171891#comment-17171891
 ] 

Andrew Kyle Purtell commented on PHOENIX-6045:
--

Next time click the drop-down list on the merge button and choose 'Squash And 
Merge', which will do the right thing if you've made multiple commits to the PR 
branch. All of those commits will get rolled up into one and the squashed 
commit will be applied to the merge target. 

> Delete that should qualify for index path does not use index when multiple 
> indexes are available.
> -
>
> Key: PHOENIX-6045
> URL: https://issues.apache.org/jira/browse/PHOENIX-6045
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.15.0, 4.14.3
>Reporter: Daniel Wong
>Assignee: Kadir OZDEMIR
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
> Attachments: MultipleDeleteReproIT.java, PHOENIX-6045.4.x.001.patch, 
> PHOENIX-6045.4.x.002.patch, screenshot-1.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Delete that should qualify for index path does not use index when multiple 
> indexes are available.  Test case to reproduce will be below.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5883) Add HBase 1.6 compatibility module to 4.x branch

2020-06-09 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17129614#comment-17129614
 ] 

Andrew Kyle Purtell commented on PHOENIX-5883:
--

In my testing the 1.5 compat module works with HBase 1.6. The only thing that 
fails is the backwards compatibility unit test because it hasn't been taught 
about 1.6 yet, as [~stoty] mentions.

> Add HBase 1.6 compatibility module to 4.x branch
> 
>
> Key: PHOENIX-5883
> URL: https://issues.apache.org/jira/browse/PHOENIX-5883
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.15.0, 4.14.3
>Reporter: Chinmay Kulkarni
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 4.16.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should add a compatibility module for HBase 1.6 similar to the ones added 
> for HBase 1.3, 1.4 and 1.5 in the 4.x branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-4216) Figure out why tests randomly fail with master not able to initialize in 200 seconds

2020-05-26 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117167#comment-17117167
 ] 

Andrew Kyle Purtell commented on PHOENIX-4216:
--

bq. The Region server uses EnvironmentEdgeManager.currentTime to report the 
current time and HMaster uses System.currentTimeMillis() to get the current 
time for computation against the reported time by RS. 

This is a bug. There should be no direct calls to System.currentTimeMillis, 
they should all go through EnvironmentEdgeManager.currentTime. 

bq. Ideally, even the EnvironmentEdgeManager should give the same as 
System.currentTimeMillis() here unless we use some other delegate 

That's probably what is happening.
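
The fix pattern, as a sketch (assuming the HBase utility class; the surrounding 
class is illustrative):

{code:java}
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;

public class ClockUsageSketch {
    long now() {
        // Wrong in server code: bypasses any injected/test environment edge.
        // return System.currentTimeMillis();

        // Right: both master and region server observe the same pluggable clock.
        return EnvironmentEdgeManager.currentTime();
    }
}
{code}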

> Figure out why tests randomly fail with master not able to initialize in 200 
> seconds
> 
>
> Key: PHOENIX-4216
> URL: https://issues.apache.org/jira/browse/PHOENIX-4216
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.15.0, 4.14.3
>Reporter: Samarth Jain
>Priority: Major
>  Labels: phoenix-hardening, precommit, quality-improvement
> Fix For: 5.1.0, 4.16.0
>
> Attachments: Precommit-3849.log
>
>
> Sample failure:
> https://builds.apache.org/job/PreCommit-PHOENIX-Build/1450//testReport/
> [~apurtell] - Looking at the thread dump in the above link, do you see why 
> master startup failed? I couldn't see any obvious deadlocks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5883) Add HBase 1.6 compatibility module to 4.x branch

2020-05-07 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102075#comment-17102075
 ] 

Andrew Kyle Purtell commented on PHOENIX-5883:
--

[~siddhimehta] Our internal branches are different. Run the tests there and 
you'll see the minicluster starting.

Regarding Apache HBase 1.6.0 and Hadoop 2.7: since Hadoop 2.7 has been EOL for a 
long time and even 2.8 is marginal at this point, I don't see it as an issue. 

> Add HBase 1.6 compatibility module to 4.x branch
> 
>
> Key: PHOENIX-5883
> URL: https://issues.apache.org/jira/browse/PHOENIX-5883
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.15.0, 4.14.3
>Reporter: Chinmay Kulkarni
>Assignee: Istvan Toth
>Priority: Major
> Fix For: 4.16.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should add a compatibility module for HBase 1.6 similar to the ones added 
> for HBase 1.3, 1.4 and 1.5 in the 4.x branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5827) Let PQS act as a maven repo

2020-04-22 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089959#comment-17089959
 ] 

Andrew Kyle Purtell commented on PHOENIX-5827:
--

Whoa, nice idea

> Let PQS act as a maven repo
> ---
>
> Key: PHOENIX-5827
> URL: https://issues.apache.org/jira/browse/PHOENIX-5827
> Project: Phoenix
>  Issue Type: Improvement
>  Components: queryserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: queryserver-1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> PQS is already an HTTP server and we have the Phoenix client jars for PQS to 
> operate.
> How about we just let PQS host these jars as a normal Maven repository?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5520) Phoenix-level HBase ReplicationEndpoint

2020-01-22 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021624#comment-17021624
 ] 

Andrew Kyle Purtell commented on PHOENIX-5520:
--

bq. I think one solution is to prototype on branch 4.15-HBase-1.5

Make a feature branch based on 4.15-HBase-1.5, do your dev there, then merge 
from feature branch when ready - done!

> Phoenix-level HBase ReplicationEndpoint
> ---
>
> Key: PHOENIX-5520
> URL: https://issues.apache.org/jira/browse/PHOENIX-5520
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Geoffrey Jacoby
>Assignee: Bharath Vissapragada
>Priority: Major
>
> A Phoenix implementation of HBase's ReplicationEndpoint that tails the WAL 
> like a normal replication endpoint. However, rather than writing to HBase's 
> replication sink APIs (which create HBase RPCs to a remote cluster), they 
> should write to a new Phoenix Endpoint coprocessor (created in a separate 
> sub-task).
> This assumes that the WAL entries have been annotated with Phoenix metadata 
> (tenant, logical table/view name, timestamp) using the mechanism in 
> PHOENIX-5435.
> While many custom ReplicationEndpoints inherit from 
> HBaseInterClusterReplicationEndpoint and just override the filtering logic, 
> this will need to avoid HBaseInterClusterReplicationEndpoint (which uses 
> HBase RPCs and the HBase sink manager) and instead inherit from 
> BaseReplicationEndpoint, or even implement the ReplicationEndpoint interface 
> + extend AbstractService directly. This is because it has to manage its own 
> transport mechanism to the remote cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-4266) Avoid scanner caching in Phoenix

2020-01-14 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015575#comment-17015575
 ] 

Andrew Kyle Purtell commented on PHOENIX-4266:
--

Do you mean setCaching() or setBatch()? 

Would a patch that simply removes all calls to Scan#setBatch() and setCaching() 
be an acceptable first cut? [~larsh] 
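
A sketch of what such a first cut would look like at a call site (the 
surrounding class is illustrative):

{code:java}
import org.apache.hadoop.hbase.client.Scan;

public class ScanSetupSketch {
    Scan newScan() {
        Scan scan = new Scan();
        // scan.setCaching(1000); // removed: let HBase's size-based chunking decide
        // scan.setBatch(100);    // removed: only needed for very wide rows
        return scan;
    }
}
{code}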

> Avoid scanner caching in Phoenix
> 
>
> Key: PHOENIX-4266
> URL: https://issues.apache.org/jira/browse/PHOENIX-4266
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
> Fix For: 5.1.0, 4.16.0
>
>
> Phoenix tries to set caching on all scans. On HBase versions before 0.98 that 
> made sense; now it is the wrong thing to do.
> HBase will by default do size-based chunking. Setting scanner caching 
> prevents HBase from doing this work.
> We should avoid scanner caching everywhere, and only use it in cases where we 
> know the number of rows to be returned (and that number is small).
> [~sergey.soldatov], [~jamestaylor]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5672) Unable to find cached index metadata with large UPSERT/SELECT and local index.

2020-01-13 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014729#comment-17014729
 ] 

Andrew Kyle Purtell commented on PHOENIX-5672:
--

bq. IMHO - just like the server side deletes and upsert/selects - this code 
should just be removed.

+1
It seems more likely this can/will be done than that someone will pick this up 
and really fix it.

> Unable to find cached index metadata with large UPSERT/SELECT and local index.
> --
>
> Key: PHOENIX-5672
> URL: https://issues.apache.org/jira/browse/PHOENIX-5672
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.15.0
>Reporter: Lars Hofhansl
>Priority: Major
>
> Doing a very large UPSERT/SELECT back into the same table. After a while I 
> get this exception. This happens with server side mutation turned off or on 
> and regardless of the batch-size (which I have increased to 1 in this 
> last example).
> {code:java}
> 20/01/10 16:41:54 WARN client.AsyncProcess: #1, table=TEST, attempt=1/35 
> failed=1ops, last exception: 
> org.apache.hadoop.hbase.DoNotRetryIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 
> (INT10): Unable to find cached index metadata.  key=-1180967500149768360 
> region=TEST,\x80\x965g\x80\x0F@\xAA\x80Y$\xEF,1578504217187.42467236e0b49fda05fdaaf69de98832.host=lhofhansl-wsl2,16201,157870268
>  Index update failed20/01/10 16:41:54 WARN client.AsyncProcess: #1, 
> table=TEST, attempt=1/35 failed=1ops, last exception: 
> org.apache.hadoop.hbase.DoNotRetryIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 
> (INT10): Unable to find cached index metadata.  key=-1180967500149768360 
> region=TEST,\x80\x965g\x80\x0F@\xAA\x80Y$\xEF,1578504217187.42467236e0b49fda05fdaaf69de98832.host=lhofhansl-wsl2,16201,157870268
>  Index update failed at 
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:113) at 
> org.apache.phoenix.util.ServerUtil.throwIOException(ServerUtil.java:87) at 
> org.apache.phoenix.index.PhoenixIndexMetaDataBuilder.getIndexMetaDataCache(PhoenixIndexMetaDataBuilder.java:101)
>  at 
> org.apache.phoenix.index.PhoenixIndexMetaDataBuilder.getIndexMetaData(PhoenixIndexMetaDataBuilder.java:51)
>  at 
> org.apache.phoenix.index.PhoenixIndexBuilder.getIndexMetaData(PhoenixIndexBuilder.java:100)
>  at 
> org.apache.phoenix.index.PhoenixIndexBuilder.getIndexMetaData(PhoenixIndexBuilder.java:73)
>  at 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexMetaData(IndexBuildManager.java:84)
>  at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.getPhoenixIndexMetaData(IndexRegionObserver.java:594)
>  at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.preBatchMutateWithExceptions(IndexRegionObserver.java:646)
>  at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.preBatchMutate(IndexRegionObserver.java:334)
>  at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$35.call(RegionCoprocessorHost.java:1024)
>  at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1742)
>  at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1827)
>  at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1783)
>  at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preBatchMutate(RegionCoprocessorHost.java:1020)
>  at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3425)
>  at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3163) 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3105) 
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:944)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:872)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2472)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36812)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2399) at 
> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:311) at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:291)Caused
>  by: java.sql.SQLException: ERROR 2008 (INT10): Unable to find cached index 
> metadata.  key=-1180967500149768360 
> region=TEST,\x80\x965g\x80\x0F@\xAA\x80Y$\xEF,1578504217187.42467236e0b49fda05fdaaf69de98832.host=lhofhansl-wsl2,16201,157870268
>  at 
> 

[jira] [Commented] (PHOENIX-5595) Use ROW_INDEX_V1 block encoding and zSTD compression by default

2019-11-27 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983788#comment-16983788
 ] 

Andrew Kyle Purtell commented on PHOENIX-5595:
--

I think it would be fine to try for ZSTD as default if available, and fall back 
to FAST_DIFF and no compression if not. Or perhaps ROW_INDEX and GZ would still 
be less bad than other options as fallback, given GZ has a pure Java 
implementation in every JRE. We only take the decompression hit once at read 
time, while with FAST_DIFF we pay it on every scan. 
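
A sketch of the fallback idea using the HBase 2.x descriptor API (the 
availability probe and the column family name are illustrative assumptions):

{code:java}
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.io.compress.Compression;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
import org.apache.hadoop.hbase.util.Bytes;

public class DefaultEncodingSketch {
    ColumnFamilyDescriptor defaultFamily(boolean zstdAvailable) {
        ColumnFamilyDescriptorBuilder b =
            ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("0"));
        if (zstdAvailable) { // hypothetical probe for native ZSTD support
            b.setDataBlockEncoding(DataBlockEncoding.ROW_INDEX_V1)
             .setCompressionType(Compression.Algorithm.ZSTD);
        } else {
            // fall back to today's defaults
            b.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF)
             .setCompressionType(Compression.Algorithm.NONE);
        }
        return b.build();
    }
}
{code}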

> Use ROW_INDEX_V1 block encoding and zSTD compression by default
> ---
>
> Key: PHOENIX-5595
> URL: https://issues.apache.org/jira/browse/PHOENIX-5595
> Project: Phoenix
>  Issue Type: Wish
>Reporter: Lars Hofhansl
>Priority: Major
>
> Phoenix defaults to FAST_DIFF block encoding and no compression (not needed 
> with FAST_DIFF).
> I blogged about this extensively here: 
> http://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html
> We should switch the default to block encoding ROW_INDEX_V1 and compression 
> zSTD for all newly created tables (including global indexes). Local indexes 
> can stay with FAST_DIFF, but perhaps for completeness we should just switch 
> everything.
> The only wrinkle is that FAST_DIFF also does compression (i.e. the diff 
> encoding), and ROW_INDEX_V1 actually increases the block size a little bit 
> since it keeps an index of row keys so that it can do binary search inside 
> of an HFile block. Hence it needs to be paired with compression. Every test I 
> did suggests that zSTD is the best.
> The main wrinkle is that zSTD needs a Hadoop/HBase build with native zSTD 
> support compiled.
> I marked this as a Wish... Perhaps we can discuss here.
> What I do know is that FAST_DIFF has outgrown its usefulness; seeking into 
> FAST_DIFF is (naturally) slow since it would need to seek to the last known 
> fully stored key and then play all the diffs forward from there to the actual 
> row we want to seek to. This impacts GETs.
> zSTD also offers better compression and thus reduced IO even when paired with 
> ROW_INDEX_V1.
> [~apurtell] What we discussed a while back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (PHOENIX-5523) Upgrade HBase version in 4.x-HBase-1.5 to newly released 1.5.0

2019-10-14 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/PHOENIX-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951208#comment-16951208
 ] 

Andrew Kyle Purtell commented on PHOENIX-5523:
--

The 1.5.0 vote passed today. The 1.5.0 artifacts have been released to Apache's 
Maven repo. I'm waiting 24 hours to send out a release note in order for the 
convenience binaries to finish propagating to mirrors. 

> Upgrade HBase version in 4.x-HBase-1.5 to newly released 1.5.0
> --
>
> Key: PHOENIX-5523
> URL: https://issues.apache.org/jira/browse/PHOENIX-5523
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 4.15.0
>Reporter: Chinmay Kulkarni
>Assignee: Chinmay Kulkarni
>Priority: Major
> Fix For: 4.15.0
>
>
> HBase 1.5.0 will be released this Monday (Oct 14) and we should use this 
> instead of the snapshot version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)