[
https://issues.apache.org/jira/browse/PHOENIX-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578555#comment-17578555
]
Andrew Kyle Purtell edited comment on PHOENIX-6627 at 8/11/22 4:22 PM:
-----------------------------------------------------------------------
bq. The problem is that we do not have a Phoenix level logical tooling for
doing a logical Dump/Restore that would isolate us from the data
representation, like for example the mysqldump tool.
This seems like a reasonable enabling follow-up. From the background reading I
have done for this work, it is unlikely that any Tephra table users remain, but
such a tool could be written, tested, and made available before a release that
includes a change like PHOENIX-6627. It would also be generally useful for a
variety of migration and rescue scenarios.
Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table into
multiple compressed textual DML files, like mysqldump.
Indexes are ignored because they will be recreated when the DDL file is
executed, then repopulated on the fly as the base table dump is reinserted.
I am fuzzy on the details necessary for proper handling of views.
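The DML half of step 3 could look something like the minimal sketch below. The class and method names are hypothetical, not existing Phoenix API; the idea is that each map task would format the rows it scans as Phoenix UPSERT statements (Phoenix's equivalent of mysqldump's INSERTs) before compressing them:

```java
import java.util.List;

// Hypothetical helper for step 3: format one scanned row of the base table
// as a mysqldump-style DML statement. Phoenix uses UPSERT rather than INSERT.
public final class DumpFormatter {

    // Render a value as a SQL literal: NULL for nulls, bare text for numbers,
    // single-quoted text with embedded quotes doubled for everything else.
    static String literal(Object v) {
        if (v == null) return "NULL";
        if (v instanceof Number) return v.toString();
        return "'" + v.toString().replace("'", "''") + "'";
    }

    // Build one UPSERT statement for a row, given column names and values.
    public static String buildUpsert(String table, List<String> cols, List<Object> vals) {
        StringBuilder sb = new StringBuilder("UPSERT INTO ").append(table).append(" (");
        sb.append(String.join(", ", cols));
        sb.append(") VALUES (");
        for (int i = 0; i < vals.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(literal(vals.get(i)));
        }
        return sb.append(")").toString();
    }
}
```

A real implementation would also have to handle Phoenix type encodings (dates, arrays, binary) rather than the naive quoting shown here.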
So to re-import, a user would execute the DDL, then iterate over the set of
compressed DML files and execute the statements within (perhaps in parallel).
The export tool should have a companion import tool that automates this
process.
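On the import side, the companion tool's inner loop might be as simple as streaming each compressed file back into statements. This is a hypothetical sketch; it assumes one statement per line with "--" comment lines, a layout the export format would need to guarantee:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Illustrative import-side helper: read one gzip-compressed DML file and
// return the statements to execute, skipping blank lines and "--" comments.
public final class DumpReader {

    public static List<String> readStatements(InputStream gzipped) throws IOException {
        List<String> stmts = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(gzipped), "UTF-8"))) {
            String line;
            while ((line = r.readLine()) != null) {
                String s = line.trim();
                if (s.isEmpty() || s.startsWith("--")) continue;
                stmts.add(s);
            }
        }
        return stmts;
    }
}
```

The returned statements would then be executed over a Phoenix JDBC connection, with one file per worker if run in parallel.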
WDYT? File an issue for this?
was (Author: apurtell):
bq. The problem is that we do not have a Phoenix level logical tooling for
doing a logical Dump/Restore that would isolate us from the data
representation, like for example the mysqldump tool.
This seems like a reasonable enabling followup. From what I have read on
background for this work it is unlikely that there is a Tephra table user out
there, but such a tool can be written and tested and made available before a
release with a change like PHOENIX-6627 included. It will be generally useful
for a variety of migration and rescue scenarios.
Perhaps a table snapshot mapreduce application:
1. Trigger a snapshot.
2. Write out a DDL file with CREATE statements that would recreate the schema.
3. In a distributed fashion, write out row-ranges of the base table, ignoring
indexes, into multiple compressed textual DML files, like mysqldump.
I am fuzzy on the details necessary for proper handling of views.
So to re-import, a user would execute the DDL, then iterate over the set of
compressed DML files and execute the statements within (perhaps in parallel).
The export tool should have a companion import tool that automates this
process.
WDYT? File an issue for this?
> Remove all references to Tephra from 4.x and master
> ---------------------------------------------------
>
> Key: PHOENIX-6627
> URL: https://issues.apache.org/jira/browse/PHOENIX-6627
> Project: Phoenix
> Issue Type: Sub-task
> Components: 4.x, tephra
> Reporter: Istvan Toth
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 5.2.0
>
>
> Removing Tephra from the runtime is easy, as it uses the well-defined
> TransactionProvider interfaces.
> Removing Tephra references from all the test cases is a much bigger task.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)