Re: DDL for CarbonData table backup and recovery (new feature)

Mohammad Shahid Khan Sun, 26 Nov 2017 21:55:00 -0800

Hi Ravindra & Likun,
I am freezing the design and will start coding.
Please get back to me if you see any issues.


--Regards,
   Shahid

On Fri, Nov 24, 2017 at 12:40 PM, mohdshahidkhan <
mohdshahidkhan1...@gmail.com> wrote:

> *Update to the solution:
> Instead of passing the dbLocation, the database name will be passed in the
> REGISTER DDL.*
> CarbonData table backup and recovery
> Background
> A customer has created a CarbonData table that already holds a very large
> amount of data. They now install another cluster that should use the same
> data, and they do not want to load it again because loading takes a long
> time. They want to back up the table data directly and recover it in the
> other cluster. After recovery, the user can use the data as a normal
> CarbonData table.
> Requirement Description
> It should be possible to back up a CarbonData table's data and recover it
> without loading the data again.
> To reuse a CarbonData table from another cluster, a DDL should be provided
> to create the CarbonData table from the existing carbon table schema.
> Solution
> Currently CarbonData has the following types of tables:
> 1.   Normal table
> 2.   Pre Aggregate table
> CarbonData should provide a DDL command to create a table from existing
> table data. The following DDL command could be used:
>
>   REGISTER TABLES FOR $DBName;
>
>            i.  The database path will be retrieved from the hive catalog and
>                scanned to find all tables.
>           ii.  Each table's schema will be read to get the column details.
>          iii.  Each table will be registered in the hive catalog with the
>                details below:
>
>   CREATE TABLE $tbName USING carbondata OPTIONS (
>     tableName "$dbName.$tbName",
>     dbName "$dbName",
>     tablePath "$tablePath",
>     path "$tablePath")
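
To make steps i and iii concrete, here is a rough sketch in Python, not the actual implementation. It assumes the database location is a plain local directory and that a table directory can be recognised by a Metadata/schema file inside it. Both are assumptions for illustration, and the function name build_register_statements is hypothetical. Step ii (reading the column details from the schema) is left out.

```python
from pathlib import Path
from typing import List


def build_register_statements(db_name: str, db_path: str) -> List[str]:
    """Build one CREATE TABLE statement per table directory found under db_path."""
    statements = []
    for table_dir in sorted(Path(db_path).iterdir()):
        # Assumption: a directory counts as a table only if it holds Metadata/schema.
        if not (table_dir / "Metadata" / "schema").is_file():
            continue
        tb_name = table_dir.name
        table_path = table_dir.as_posix()
        # Mirrors the CREATE TABLE template from the proposal.
        statements.append(
            f"CREATE TABLE {tb_name} USING carbondata OPTIONS ("
            f'tableName "{db_name}.{tb_name}", dbName "{db_name}", '
            f'tablePath "{table_path}", path "{table_path}")'
        )
    return statements
```

Directories without a schema file are skipped here rather than reported, which is one possible choice and not something the proposal specifies.
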
>
> Precondition:
>     i. The user has to create the database, and before executing this
>        command the old table schema and data should be copied into the
>        database location.
>    ii. If the table has aggregate tables, then all of them should also be
>        copied into the database location.
>
> Validation:
>    1. If the database does not exist, the registration will fail.
>    2. A table will be registered only if a table with the same name is not
>       already registered.
>    3. If the table has aggregate tables, all of them should be registered
>       in the hive catalog; if any aggregate table does not exist, the
>       table creation operation should fail.
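
The three validation rules could be sketched roughly as below. This is only an illustration: the names RegistrationError and tables_to_register are hypothetical, the inputs are plain Python sets instead of hive catalog lookups, and it assumes that a missing aggregate table fails the whole command, which the proposal does not state explicitly.

```python
from typing import Dict, List, Set


class RegistrationError(Exception):
    """Raised when a validation rule is violated (hypothetical name)."""


def tables_to_register(
    db_exists: bool,
    already_registered: Set[str],
    found_tables: Set[str],
    aggregates_of: Dict[str, List[str]],
) -> List[str]:
    """Return the tables that pass validation, in sorted order."""
    # Rule 1: the database must already exist.
    if not db_exists:
        raise RegistrationError("database does not exist")
    result = []
    for table in sorted(found_tables):
        # Rule 2: skip tables whose name is already registered.
        if table in already_registered:
            continue
        # Rule 3: every aggregate table of this table must be present.
        missing = [a for a in aggregates_of.get(table, []) if a not in found_tables]
        if missing:
            raise RegistrationError(f"{table}: missing aggregate tables {missing}")
        result.append(table)
    return result
```

Aggregate tables found in the database location are returned like any other table, so they get registered together with their parent.
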
>
>
>
> --
> Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>
