[
https://issues.apache.org/jira/browse/TAJO-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyunsik Choi updated TAJO-337:
------------------------------
Description:
Currently, Tajo uses HDFS as its primary storage. As a data warehouse
system, however, Tajo should easily support various data sources.
For this, I propose a generic storage handler interface that provides common
storage methods as follows:
* splitting input data
* providing data locality information
* accessing the catalog
* creating a table
* removing a table
* adding default table properties and validating properties
* committing, rolling back, and cleaning up output tables
* getting physical table information, such as table volume
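
To make the list above concrete, here is a minimal sketch of what such an interface might look like. All names (StorageHandler, InputSplit, InMemoryStorageHandler, the "format" property) are hypothetical and chosen for illustration only; they are not existing Tajo classes. Catalog access is reduced to a simple tableExists lookup, and the toy in-memory backend exists only to exercise the methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical split: a byte range of a table plus the hosts that store it (locality).
final class InputSplit {
    final String table; final long offset; final long length; final List<String> hosts;
    InputSplit(String table, long offset, long length, List<String> hosts) {
        this.table = table; this.offset = offset; this.length = length; this.hosts = hosts;
    }
}

// Hypothetical interface; each method maps to one bullet of the proposal.
interface StorageHandler {
    List<InputSplit> getSplits(String table, long splitSize);         // splitting input data, locality
    boolean tableExists(String table);                                // stand-in for catalog access
    void createTable(String table, Map<String, String> properties);   // creating a table
    void dropTable(String table);                                     // removing a table
    Map<String, String> withDefaults(Map<String, String> properties); // default and validated properties
    void commit(String table);                                        // committing output
    void rollback(String table);                                      // rolling back and cleaning up output
    long getTableVolume(String table);                                // physical information (bytes)
}

// Toy in-memory backend, used only to exercise the interface.
final class InMemoryStorageHandler implements StorageHandler {
    private final Map<String, Long> volumes = new HashMap<>(); // committed bytes per table
    private final Map<String, Long> staged = new HashMap<>();  // uncommitted bytes per table

    public Map<String, String> withDefaults(Map<String, String> properties) {
        Map<String, String> result = new HashMap<>(properties);
        result.putIfAbsent("format", "text"); // assumed default property
        if (result.get("format").isEmpty()) {
            throw new IllegalArgumentException("format must not be empty");
        }
        return result;
    }

    public void createTable(String table, Map<String, String> properties) {
        withDefaults(properties); // validate before creating
        volumes.put(table, 0L);
    }

    public boolean tableExists(String table) { return volumes.containsKey(table); }

    public void dropTable(String table) { volumes.remove(table); staged.remove(table); }

    // Demo helper (not part of the interface): stage output bytes until commit.
    void stageOutput(String table, long bytes) { staged.merge(table, bytes, Long::sum); }

    public void commit(String table) {
        Long bytes = staged.remove(table);
        if (bytes != null) volumes.merge(table, bytes, Long::sum);
    }

    public void rollback(String table) { staged.remove(table); }

    public long getTableVolume(String table) {
        Long volume = volumes.get(table);
        if (volume == null) throw new IllegalArgumentException("no such table: " + table);
        return volume;
    }

    public List<InputSplit> getSplits(String table, long splitSize) {
        if (splitSize <= 0) throw new IllegalArgumentException("splitSize must be positive");
        long volume = getTableVolume(table);
        List<InputSplit> splits = new ArrayList<>();
        for (long offset = 0; offset < volume; offset += splitSize) {
            long length = Math.min(splitSize, volume - offset);
            splits.add(new InputSplit(table, offset, length, Collections.singletonList("localhost")));
        }
        return splits;
    }
}

final class StorageHandlerDemo {
    public static void main(String[] args) {
        InMemoryStorageHandler handler = new InMemoryStorageHandler();
        handler.createTable("t", new HashMap<String, String>());
        handler.stageOutput("t", 250L);
        handler.commit("t");
        System.out.println(handler.getSplits("t", 100L).size() + " splits");
    }
}
```

With this shape, a non-HDFS source would only need its own StorageHandler implementation; callers would not change.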
was:
Currently, Tajo uses HDFS as its primary storage. As a data warehouse
system, however, Tajo should easily support various data sources.
For this, I propose a generic storage handler interface that provides common
storage methods as follows:
* splitting input data
* providing data locality information
* accessing the catalog
* creating a table
* removing a table
* adding default table properties and validating properties
* committing, rolling back, and cleaning up output tables
The above methods are derived from the query processing mechanism for data
sets stored in HDFS.
Query, SubQuery, and Repartition, which usually deal with Fragment, should not
deal with concrete Fragment classes. Instead, they should use the methods of
GenericStorageManager.
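
The point about not depending on concrete Fragment classes can be sketched as follows. This is only an illustration under assumptions: the Fragment interface shape, the two implementations, and FragmentScheduler are hypothetical names and do not reflect the actual Tajo classes or the methods of GenericStorageManager.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: what query-side code needs to know about a piece of input.
interface Fragment {
    String getTableName();
    long getLength();
    List<String> getHosts();
}

// File-backed fragment, e.g. a byte range of a file stored in HDFS.
final class FileFragment implements Fragment {
    private final String table; private final String path;
    private final long start; private final long length; private final List<String> hosts;
    FileFragment(String table, String path, long start, long length, List<String> hosts) {
        this.table = table; this.path = path; this.start = start; this.length = length; this.hosts = hosts;
    }
    public String getTableName() { return table; }
    public long getLength() { return length; }
    public List<String> getHosts() { return hosts; }
}

// Fragment of a non-file source, e.g. a key range in a key-value store (illustrative).
final class KeyRangeFragment implements Fragment {
    private final String table; private final String startKey; private final String endKey;
    private final long estimatedLength; private final List<String> hosts;
    KeyRangeFragment(String table, String startKey, String endKey, long estimatedLength, List<String> hosts) {
        this.table = table; this.startKey = startKey; this.endKey = endKey;
        this.estimatedLength = estimatedLength; this.hosts = hosts;
    }
    public String getTableName() { return table; }
    public long getLength() { return estimatedLength; }
    public List<String> getHosts() { return hosts; }
}

final class FragmentScheduler {
    // Sums input bytes per host using only the Fragment interface,
    // so it works unchanged for any storage backend.
    static Map<String, Long> bytesPerHost(List<? extends Fragment> fragments) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Fragment fragment : fragments) {
            for (String host : fragment.getHosts()) {
                result.merge(host, fragment.getLength(), Long::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Fragment> fragments = Arrays.asList(
            new FileFragment("t", "/data/t/part-0", 0L, 100L, Arrays.asList("h1", "h2")),
            new KeyRangeFragment("t", "a", "m", 40L, Arrays.asList("h2")));
        System.out.println(bytesPerHost(fragments));
    }
}
```

Because bytesPerHost never casts to a concrete class, adding a new storage type would not require touching the scheduling code.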
> Generic StorageManager to provide common storage methods
> --------------------------------------------------------
>
> Key: TAJO-337
> URL: https://issues.apache.org/jira/browse/TAJO-337
> Project: Tajo
> Issue Type: Improvement
> Components: catalog, storage
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi