[
https://issues.apache.org/jira/browse/TAJO-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902730#comment-14902730
]
ASF GitHub Bot commented on TAJO-1493:
--------------------------------------
Github user jihoonson commented on a diff in the pull request:
https://github.com/apache/tajo/pull/772#discussion_r40093853
--- Diff: tajo-catalog/tajo-catalog-drivers/tajo-hive/src/main/java/org/apache/tajo/catalog/store/HiveCatalogStore.java ---
@@ -845,13 +856,174 @@ public boolean existPartitionMethod(String databaseName, String tableName) throw
   }
 
   @Override
-  public List<CatalogProtos.PartitionDescProto> getPartitions(String databaseName,
-                                                              String tableName) {
-    throw new UnsupportedOperationException();
+  public List<CatalogProtos.PartitionDescProto> getPartitionsOfTable(String databaseName, String tableName)
+    throws UndefinedDatabaseException, UndefinedTableException, UndefinedPartitionMethodException {
+    PartitionsByFilterProto.Builder request = PartitionsByFilterProto.newBuilder();
+    request.setDatabaseName(databaseName);
+    request.setTableName(tableName);
+    request.setFilter("");
+
+    return getPartitionsByFilter(request.build());
+  }
+
+  @Override
+  public List<PartitionDescProto> getPartitionsByAlgebra(PartitionsByAlgebraProto request) throws
+    UndefinedDatabaseException, UndefinedTableException, UndefinedPartitionMethodException, UnsupportedException {
+
+    List<PartitionDescProto> list = null;
+
+    try {
+      String databaseName = request.getDatabaseName();
+      String tableName = request.getTableName();
+
+      if (!existDatabase(databaseName)) {
+        throw new UndefinedDatabaseException(tableName);
+      }
+
+      if (!existTable(databaseName, tableName)) {
+        throw new UndefinedTableException(tableName);
+      }
+
+      if (!existPartitionMethod(databaseName, tableName)) {
+        throw new UndefinedPartitionMethodException(tableName);
+      }
+
+      TableDescProto tableDesc = getTable(databaseName, tableName);
+      String filter = getFilter(databaseName, tableName, tableDesc.getPartition().getExpressionSchema().getFieldsList()
+        , request.getAlgebra());
+      list = getPartitionsByFilterFromHiveMetaStore(databaseName, tableName, filter);
+    } catch (UnsupportedException ue) {
+      throw ue;
+    } catch (Exception se) {
+      throw new TajoInternalError(se);
+    }
+
+    return list;
+  }
+
+  private String getFilter(String databaseName, String tableName, List<ColumnProto> partitionColumns
+    , String json) throws TajoException {
+
+    Expr[] exprs = null;
+
+    if (json != null && !json.isEmpty()) {
+      Expr algebra = JsonHelper.fromJson(json, Expr.class);
+      exprs = AlgebraicUtil.toConjunctiveNormalFormArray(algebra);
+    }
+
+    PartitionFilterAlgebraVisitor visitor = new PartitionFilterAlgebraVisitor();
+    visitor.setIsHiveCatalog(true);
+
+    Expr[] filters = AlgebraicUtil.getRearrangedCNFExpressions(databaseName + "." + tableName, partitionColumns, exprs);
+
+    StringBuffer sb = new StringBuffer();
+
+    // Write join clause from second column to last column.
+    Column target;
+
+    int addedFilter = 0;
+    String result;
+    for (int i = 0; i < partitionColumns.size(); i++) {
+      target = new Column(partitionColumns.get(i));
+
+      if (!(filters[i] instanceof IsNullPredicate)) {
+        visitor.setColumn(target);
+        visitor.visit(null, new Stack<Expr>(), filters[i]);
+        result = visitor.getResult();
+
+        // If visitor build filter successfully, add filter to be used for executing hive api.
+        if (result.length() > 0) {
+          if (addedFilter > 0) {
--- End diff ---
```addedFilter``` can be replaced with ```sb.length()```.
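The suggestion above can be sketched like this (a minimal, hypothetical illustration, not the actual HiveCatalogStore code: `FilterBuilderSketch` and `buildFilter` are made-up names, and the per-column visitor results are stubbed as plain strings):

```java
import java.util.Arrays;
import java.util.List;

public class FilterBuilderSketch {
  // Joins the non-empty per-column filter fragments with AND. Instead of
  // tracking a separate addedFilter counter, sb.length() > 0 already tells
  // us whether an earlier fragment was appended.
  static String buildFilter(List<String> columnFilters) {
    StringBuilder sb = new StringBuilder();
    for (String result : columnFilters) {
      if (result != null && result.length() > 0) {
        if (sb.length() > 0) {      // replaces `if (addedFilter > 0)`
          sb.append(" AND ");
        }
        sb.append(result);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Empty fragments (e.g. a column the visitor could not translate) are skipped.
    System.out.println(buildFilter(Arrays.asList("col1 = 'a'", "", "col2 = '3'")));
    // prints: col1 = 'a' AND col2 = '3'
  }
}
```

This removes one piece of mutable state from the loop without changing the produced filter string.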
> Make partition pruning based on catalog informations
> ----------------------------------------------------
>
> Key: TAJO-1493
> URL: https://issues.apache.org/jira/browse/TAJO-1493
> Project: Tajo
> Issue Type: Sub-task
> Components: Catalog, Planner/Optimizer
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
> Fix For: 0.11.0, 0.12.0
>
> Attachments: TAJO-1493.patch, TAJO-1493_2.patch, TAJO-1493_3.patch,
> TAJO-1493_4.patch, TAJO-1493_5.patch
>
>
> Currently, PartitionedTableRewriter looks into partition directories to
> rewrite filter conditions. It gets all sub-directories of the table path
> because the catalog doesn't provide partition directories. But if there are
> lots of sub-directories on HDFS, such as more than 10,000 directories, it
> might overload the NameNode. Thus, CatalogStore needs to provide partition
> directories for the specified filter conditions. I designed a new method for
> CatalogStore as follows:
> * method name: getPartitionsWithConditionFilters
> * first parameter: database name
> * second parameter: table name
> * third parameter: where clause (includes the target column name and partition
> value)
> * return value:
> List<org.apache.tajo.catalog.proto.CatalogProtos.TablePartitionProto>
> * description: It scans the right partition directories on CatalogStore with
> the where clause.
> For example, users set the parameters as follows:
> ** first parameter: default
> ** second parameter: table1
> ** third parameter: COLUMN_NAME = 'col1' AND PARTITION_VALUE = '3'
> In the previous case, this method will create a select clause as follows.
> {code:sql}
> SELECT DISTINCT A.PATH
> FROM PARTITIONS A, (
> SELECT B.PARTITION_ID
> FROM PARTITION_KEYS B
> WHERE B.PARTITION_ID > 0
> AND (
> COLUMN_NAME = 'col1' AND PARTITION_VALUE = '3'
> )
> ) B
> WHERE A.PARTITION_ID > 0
> AND A.TID = ${table_id}
> AND A.PARTITION_ID = B.PARTITION_ID
> {code}
> At first, I considered using EvalNode instead of a where clause. But I
> can't use it because of circular-dependency problems between the tajo-catalog
> module and the tajo-plan module. So, I'll implement a utility class to convert
> EvalNode to SQL.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)