[
https://issues.apache.org/jira/browse/TAJO-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740810#comment-14740810
]
ASF GitHub Bot commented on TAJO-1493:
--------------------------------------
Github user hyunsik commented on a diff in the pull request:
https://github.com/apache/tajo/pull/624#discussion_r39271326
--- Diff:
tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/AbstractDBStore.java
---
@@ -2172,6 +2178,252 @@ private void setPartitionKeys(int pid,
PartitionDescProto.Builder partitionDesc)
return partitions;
}
+ /**
+ * Check if list of partitios exist on catalog.
+ *
+ *
+ * @param databaseId
+ * @param tableId
+ * @return
+ */
+ public boolean existPartitionsOnCatalog(int databaseId, int tableId) {
+ Connection conn = null;
+ ResultSet res = null;
+ PreparedStatement pstmt = null;
+ boolean result = false;
+
+ try {
+ String sql = "SELECT COUNT(*) CNT FROM "
+ + TB_PARTTIONS +" WHERE " + COL_TABLES_PK + " = ? ";
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug(sql);
+ }
+
+ conn = getConnection();
+ pstmt = conn.prepareStatement(sql);
+ pstmt.setInt(1, tableId);
+ res = pstmt.executeQuery();
+
+ if (res.next()) {
+ if (res.getInt("CNT") > 0) {
+ result = true;
+ }
+ }
+ } catch (SQLException se) {
+ throw new TajoInternalError(se);
+ } finally {
+ CatalogUtil.closeQuietly(pstmt, res);
+ }
+ return result;
+ }
+
+ @Override
+ public List<PartitionDescProto>
getPartitionsByFilter(PartitionsByFilterProto request) {
+ throw new TajoRuntimeException(new UnsupportedException());
+ }
+
+ @Override
+ public List<PartitionDescProto>
getPartitionsByAlgebra(PartitionsByAlgebraProto request) throws
+ UndefinedDatabaseException, UndefinedTableException,
UndefinedPartitionMethodException {
+ Connection conn = null;
+ PreparedStatement pstmt = null;
+ ResultSet res = null;
+ int currentIndex = 1;
+ String selectStatement = null;
+
+ List<PartitionDescProto> partitions = TUtil.newList();
+ List<PartitionFilterSet> filterSets = TUtil.newList();
+
+ try {
+ int databaseId = getDatabaseId(request.getDatabaseName());
+ int tableId = getTableId(databaseId, request.getDatabaseName(),
request.getTableName());
+ if (!existPartitionMethod(request.getDatabaseName(),
request.getTableName())) {
+ throw new
UndefinedPartitionMethodException(request.getTableName());
+ }
+
+ if (!existPartitionsOnCatalog(databaseId, tableId)) {
+ throw new PartitionNotFoundException(request.getTableName());
--- End diff --
Could you elaborate the purpose of this exception? Does it means that there
is no partition? If so, it does not make sense because it returns a list of
partitions. So, it should return an empty list if there is no partitions.
> Make partition pruning based on catalog informations
> ----------------------------------------------------
>
> Key: TAJO-1493
> URL: https://issues.apache.org/jira/browse/TAJO-1493
> Project: Tajo
> Issue Type: Sub-task
> Components: Catalog, Planner/Optimizer
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
> Fix For: 0.11.0, 0.12.0
>
> Attachments: TAJO-1493.patch, TAJO-1493_2.patch, TAJO-1493_3.patch,
> TAJO-1493_4.patch
>
>
> Currently, PartitionedTableRewriter take a look into partition directories
> for rewriting filter conditions. It get all sub directories of table path
> because catalog doesn’t provide partition directories. But if there are lots
> of sub directories on HDFS, such as, more than 10,000 directories, it might
> be cause overload to NameNode. Thus, CatalogStore need to provide partition
> directories for specified filter conditions. I designed new method to
> CatalogStore as follows:
> * method name: getPartitionsWithConditionFilters
> * first parameter: database name
> * second parameter: table name
> * third parameter: where clause (included target column name and partition
> value)
> * return values:
> List<org.apache.tajo.catalog.proto.CatalogProtos.TablePartitionProto>
> * description: It scan right partition directories on CatalogStore with where
> caluse.
> For examples, users set parameters as following:
> ** first parameter: default
> ** second parameter: table1
> ** third parameter: COLUMN_NAME = 'col1' AND PARTITION_VALUE = '3
> In the previous cases, this method will create select clause as follows.
> {code:xml}
> SELECT DISTINCT A.PATH
> FROM PARTITIONS A, (
> SELECT B.PARTITION_ID
> FROM PARTITION_KEYS B
> WHERE B.PARTITION_ID > 0
> AND (
> COLUMN_NAME = 'col1' AND PARTITION_VALUE = '3'
> )
> ) B
> WHERE A.PARTITION_ID > 0
> AND A.TID = ${table_id}
> AND A.PARTITION_ID = B.PARTITION_ID
> {code}
> At the first time, I considered to use EvalNode instead of where clause. But
> I can’t use it because of recursive related problems between tajo-catalog
> module and tajo-plan module. So, I’ll implement utility class to convert
> EvalNode to SQL.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)