[ 
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578555#comment-17578555
 ] 

Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:21 PM:
-----------------------------------------------------------------------

bq. The problem is that we do not have a Phoenix level logical tooling for 
doing a logical Dump/Restore that would isolate us from the data 
representation, like for example the mysqldump tool.

This seems like a reasonable enabling follow-up. From what I have read on 
background for this work, it is unlikely that there is a Tephra table user out 
there, but such a tool can be written, tested, and made available before a 
release with a change like PHOENIX-6627 included. It will be generally useful 
for a variety of migration and rescue scenarios.

Perhaps a table snapshot MapReduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring 
indexes, into multiple compressed textual DML files, like mysqldump. 
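
The per-row-range output in step 3 could be as simple as emitting Phoenix 
UPSERT statements into gzip'd part files. A rough sketch of the chunk writer 
(names like write_dml_chunk are hypothetical; the real logic would live 
inside the MapReduce job's map task):

```python
import gzip

def quote_sql(value):
    """Render a value as a SQL literal: NULL, bare numeric, or quoted string."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def write_dml_chunk(table, columns, rows, path):
    """Write one row-range as a gzip'd file of Phoenix UPSERT statements,
    in the spirit of mysqldump's INSERT output."""
    col_list = ", ".join(columns)
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for row in rows:
            values = ", ".join(quote_sql(v) for v in row)
            out.write(f"UPSERT INTO {table} ({col_list}) VALUES ({values});\n")
```

Each task would emit one such chunk per row-range, yielding a directory of 
compressed part files that can be replayed independently.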

I am fuzzy on the details necessary for proper handling of views.

So to re-import, a user would execute the DDL, then iterate over the set of 
compressed DML files and execute the statements within (perhaps in parallel). 
The export tool should have a companion import tool that automates this 
process. 
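
The replay loop on the import side could look something like the sketch 
below: stream each compressed file and push statements through a connection, 
one file per worker. The execute callable here is a stand-in for a real 
Phoenix connection (JDBC or phoenixdb cursor); the function names are 
hypothetical:

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

def replay_dml_file(path, execute):
    """Stream one compressed DML file, running each statement through the
    supplied execute callable. Returns the number of statements replayed."""
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            stmt = line.strip().rstrip(";")
            if stmt:
                execute(stmt)
                count += 1
    return count

def replay_all(paths, execute, workers=4):
    """Replay a set of DML files in parallel, one file per worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda p: replay_dml_file(p, execute), paths))
```

A real tool would also batch commits and retry failed chunks, but the 
file-per-worker shape falls out naturally from the export format.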

WDYT? File an issue for this? 


> Remove all references to Tephra from 4.x and master
> ---------------------------------------------------
>
>                 Key: PHOENIX-6627
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6627
>             Project: Phoenix
>          Issue Type: Sub-task
>          Components: 4.x, tephra
>            Reporter: Istvan Toth
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined 
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
