[
https://issues.apache.org/jira/browse/PHOENIX-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas D'Silva resolved PHOENIX-3645.
-------------------------------------
Resolution: Fixed
> Build a mechanism for creating a table and populating it with data from a
> source table
> --------------------------------------------------------------------------------------
>
> Key: PHOENIX-3645
> URL: https://issues.apache.org/jira/browse/PHOENIX-3645
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Samarth Jain
> Priority: Major
>
> As part of PHOENIX-1598, we are introducing the capability of mapping column
> names and encoding column values. For users to be able to use this new
> scheme, they would need to recreate their tables from scratch. For
> situations like this, it would be nice to have a mechanism where we can
> create a new table and fill it with the data from the existing table.
> A simple possibility is to disable the source table, take a snapshot of it,
> create a new table using the snapshot of the old table, and drop the old
> table. However, this would require downtime.
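> A minimal sketch of the raw HBase operations behind this snapshot approach,
> using the HBase Admin API (the table and snapshot names here are hypothetical
> placeholders):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Admin;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
>
> public class SnapshotCopySketch {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = HBaseConfiguration.create();
>         try (Connection conn = ConnectionFactory.createConnection(conf);
>              Admin admin = conn.getAdmin()) {
>             TableName source = TableName.valueOf("SOURCE_TABLE");
>             // Disable the source table so no new writes land while we copy.
>             admin.disableTable(source);
>             // Snapshot the disabled table and clone it into the new table.
>             admin.snapshot("source_snapshot", source);
>             admin.cloneSnapshot("source_snapshot",
>                 TableName.valueOf("TARGET_TABLE"));
>             // Drop the old (already disabled) table.
>             admin.deleteTable(source);
>         }
>     }
> }
> {code}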
> Another way would be to use an UPSERT INTO TARGET TABLE SELECT * FROM SOURCE
> TABLE statement or a map reduce job to bulk load the data. These mechanisms,
> though, have the inherent limitation that they miss updates made to the old
> table after they were kicked off or after they completed. To handle the case
> of these missing updates, a somewhat crazy idea would be to mark the new
> table as an index on the existing table. The index table would have the
> exact same schema as the data table. Incremental changes would then be
> automatically taken care of by our index maintenance mechanism. We can then
> use our existing map reduce index build job to bulk load the "old" data into
> the new table.
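> For the UPSERT SELECT route mentioned above, a minimal sketch over the
> Phoenix JDBC driver (the connection URL and table names are hypothetical
> placeholders, and the target table is assumed to already exist):
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.Statement;
>
> public class UpsertSelectSketch {
>     public static void main(String[] args) throws Exception {
>         try (Connection conn =
>                  DriverManager.getConnection("jdbc:phoenix:localhost:2181");
>              Statement stmt = conn.createStatement()) {
>             // Copy every row of the source table into the target table.
>             stmt.executeUpdate(
>                 "UPSERT INTO TARGET_TABLE SELECT * FROM SOURCE_TABLE");
>             // Phoenix connections do not auto-commit by default.
>             conn.commit();
>         }
>     }
> }
> {code}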
> There is a slight chance that we would miss updates happening to the source
> table while we are in the process of doing the index->table conversion. One
> way to handle that would be to store the physical HBase table name for a
> Phoenix table in SYSTEM.CATALOG. Then the reducer of the map reduce job
> would simply have to change this mapping in the SYSTEM.CATALOG table. This
> should cause new updates to go to the new HBase table.
> There are probably some edge cases or gotchas that I am not thinking about
> right now. [~jamestaylor] probably has more thoughts on this.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)