[ 
https://issues.apache.org/jira/browse/TAJO-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi updated TAJO-337:
------------------------------
    Description: 
Currently, Tajo uses HDFS as its primary storage. However, as a data warehouse 
system, Tajo should easily support various data sources.

For this, I propose a generic storage handler interface that provides common 
storage methods as follows:

* splitting input data
* providing data locality information
* accessing the catalog
* creating a table
* removing a table
* adding default table properties and validating properties
* committing, rolling back, and cleaning up output tables
* getting physical table information, such as table volume
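
As a rough illustration only, the operations listed above could be grouped into 
a single Java interface. Every name in this sketch (StorageHandler, InputSplit, 
InMemoryStorageHandler, and all method names) is hypothetical and is not taken 
from the Tajo code base. The in-memory backend exists only to make the sketch 
runnable, and it uses character counts as a stand-in for table volume.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; none are taken from the Tajo code base.

/** One unit of input data and the hosts that store it (locality hint). */
final class InputSplit {
    final String table;
    final List<String> hosts;

    InputSplit(String table, List<String> hosts) {
        this.table = table;
        this.hosts = hosts;
    }
}

/** Sketch of a generic storage handler covering the listed operations. */
interface StorageHandler {
    List<InputSplit> getSplits(String table);                  // splitting input data, locality
    Map<String, String> getTableProperties(String table);      // accessing the catalog
    void createTable(String table, Map<String, String> props); // creating a table
    void dropTable(String table);                              // removing a table
    Map<String, String> applyDefaults(Map<String, String> props);
    void validate(Map<String, String> props);
    void write(String table, String row);                      // stage output rows
    void commit(String table);                                 // publish staged rows
    void rollback(String table);                               // discard staged rows
    long getVolume(String table);                              // physical information
}

/** Toy in-memory backend, only to show that the interface is implementable. */
final class InMemoryStorageHandler implements StorageHandler {
    private final Map<String, Map<String, String>> catalog = new HashMap<>();
    private final Map<String, List<String>> committed = new HashMap<>();
    private final Map<String, List<String>> staged = new HashMap<>();

    private void requireTable(String table) {
        if (!catalog.containsKey(table)) {
            throw new IllegalArgumentException("No such table: " + table);
        }
    }

    public List<InputSplit> getSplits(String table) {
        requireTable(table);
        // A real backend would return one split per block, region, or partition.
        return Collections.singletonList(
                new InputSplit(table, Collections.singletonList("localhost")));
    }

    public Map<String, String> getTableProperties(String table) {
        requireTable(table);
        return catalog.get(table);
    }

    public void createTable(String table, Map<String, String> props) {
        Map<String, String> effective = applyDefaults(props);
        validate(effective);
        catalog.put(table, effective);
        committed.put(table, new ArrayList<String>());
        staged.put(table, new ArrayList<String>());
    }

    public void dropTable(String table) {
        requireTable(table);
        catalog.remove(table);
        committed.remove(table);
        staged.remove(table);
    }

    public Map<String, String> applyDefaults(Map<String, String> props) {
        Map<String, String> copy = new HashMap<>(props);
        copy.putIfAbsent("format", "text"); // assumed default, for illustration
        return copy;
    }

    public void validate(Map<String, String> props) {
        String format = props.get("format");
        if (format == null || format.isEmpty()) {
            throw new IllegalArgumentException("Property 'format' is required");
        }
    }

    public void write(String table, String row) {
        requireTable(table);
        staged.get(table).add(row);
    }

    public void commit(String table) {
        requireTable(table);
        committed.get(table).addAll(staged.get(table));
        staged.get(table).clear();
    }

    public void rollback(String table) {
        requireTable(table);
        staged.get(table).clear();
    }

    public long getVolume(String table) {
        requireTable(table);
        long total = 0;
        for (String row : committed.get(table)) {
            total += row.length(); // character count as a proxy for bytes
        }
        return total;
    }
}
```

With an interface of this shape, the query engine would call only 
StorageHandler methods, and an HDFS backend would be one implementation among 
several.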

  was:
Currently, Tajo uses HDFS as a primary storage. But, as a data warehouse 
system, Tajo should easily support various data sources.

For this, I propose a generic storage handler interface that provides common 
storage methods as follows:
* splitting input data
* locality
* accessing catalog
* creating a table
* removing a table
* adding default table properties and validating properties
* committing, rollback, and clean up output tables

The above methods are derived from the query processing mechanism for data sets 
stored in HDFS.

Query, SubQuery, and Repartition, which usually deal with Fragment, should not 
depend on concrete Fragment classes. Instead, they should use the methods of 
GenericStorageManager.


> Generic StorageManager to provide common storage methods
> --------------------------------------------------------
>
>                 Key: TAJO-337
>                 URL: https://issues.apache.org/jira/browse/TAJO-337
>             Project: Tajo
>          Issue Type: Improvement
>          Components: catalog, storage
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>
> Currently, Tajo uses HDFS as its primary storage. However, as a data warehouse 
> system, Tajo should easily support various data sources.
> For this, I propose a generic storage handler interface that provides common 
> storage methods as follows:
> * splitting input data
> * providing data locality information
> * accessing the catalog
> * creating a table
> * removing a table
> * adding default table properties and validating properties
> * committing, rolling back, and cleaning up output tables
> * getting physical table information, such as table volume



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
