[jira] [Work logged] (HIVE-25773) Column descriptors might not deleted via direct sql

ASF GitHub Bot (Jira) Mon, 06 Dec 2021 11:01:44 -0800


     [ https://issues.apache.org/jira/browse/HIVE-25773?focusedWorklogId=691297&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-691297 ]


ASF GitHub Bot logged work on HIVE-25773:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Dec/21 19:00
            Start Date: 06/Dec/21 19:00
    Worklog Time Spent: 10m 
      Work Description: hsnusonic commented on a change in pull request #2843:
URL: https://github.com/apache/hive/pull/2843#discussion_r763293899



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -2818,26 +2820,24 @@ private void dropDanglingColumnDescriptors(List<Object> columnDescriptorIdList)
 
     // Drop column descriptor, if no relation left
     queryText =
-        "SELECT " + SDS + ".\"CD_ID\", count(1) "
+        "SELECT " + SDS + ".\"CD_ID\" "
             + "from " + SDS + " "
             + "WHERE " + SDS + ".\"CD_ID\" in (" + colIds + ") "
             + "GROUP BY " + SDS + ".\"CD_ID\"";
-    List<Object> danglingColumnDescriptorIdList = new ArrayList<>(columnDescriptorIdList.size());
+    Set<Long> danglingColumnDescriptorIdSet = new HashSet<>(columnDescriptorIdList);
     try (QueryWrapper query = new QueryWrapper(pm.newQuery("javax.jdo.query.SQL", queryText))) {
-      List<Object[]> sqlResult = MetastoreDirectSqlUtils
-          .ensureList(executeWithArray(query, null, queryText));
+      List<Long> sqlResult = executeWithArray(query, null, queryText);
 
       if (!sqlResult.isEmpty()) {
-        for (Object[] fields : sqlResult) {
-          if (MetastoreDirectSqlUtils.extractSqlInt(fields[1]) == 0) {
-            danglingColumnDescriptorIdList.add(MetastoreDirectSqlUtils.extractSqlLong(fields[0]));
-          }
+        for (Long cdId : sqlResult) {
+          // the returned CD is not dangling, so remove it from the list
+          danglingColumnDescriptorIdSet.remove(cdId);
         }
       }
     }
-    if (!danglingColumnDescriptorIdList.isEmpty()) {
+    if (!danglingColumnDescriptorIdSet.isEmpty()) {
       try {
-        String danglingCDIds = getIdListForIn(danglingColumnDescriptorIdList);
+        String danglingCDIds = getIdListForIn(Arrays.asList(danglingColumnDescriptorIdSet.toArray()));

Review comment:
       Sounds great. I made the change.






Issue Time Tracking
-------------------

    Worklog Id:     (was: 691297)
    Time Spent: 40m  (was: 0.5h)

> Column descriptors might not deleted via direct sql
> ---------------------------------------------------
>
>                 Key: HIVE-25773
>                 URL: https://issues.apache.org/jira/browse/HIVE-25773
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. create a partitioned table
> 2. add a partition _p_
> 3. add column to the partition _p_ (a column descriptor will be created)
> 4. drop partition _p_
> The new column descriptor still exists even though no relation to it is left. 
> We currently run the SQL below and treat the rows with count = 0 as 
> dangling column descriptors. However, a GROUP BY query only returns groups 
> that contain at least one row, so it can never yield count = 0. As a result, 
> a column descriptor that is not a table's default column descriptor is never 
> deleted.
>  
> {code:java}
> SELECT SDS.CD_ID, count(1)
>   FROM SDS WHERE SDS.CD_ID in (cdIds)
>   GROUP BY SDS.CD_ID;{code}
>  
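
The reasoning above, and the approach taken in the diff, can be illustrated with a minimal sketch. It is not the code in MetaStoreDirectSql: the class name, the method name, and the in-memory list standing in for the SQL result are all hypothetical. The idea is to start from every candidate CD_ID and remove each ID that the query still finds in SDS, so whatever remains is dangling.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names are not taken from the Hive code base.
class DanglingCdSketch {

    // candidateCdIds: the CD_IDs that may have lost their last reference.
    // referencedCdIds: stand-in for the rows returned by
    //   SELECT CD_ID FROM SDS WHERE CD_ID IN (...) GROUP BY CD_ID
    // A GROUP BY result only contains IDs that still have at least one row,
    // so an unreferenced ID is simply absent rather than reported with count 0.
    static Set<Long> findDangling(List<Long> candidateCdIds, List<Long> referencedCdIds) {
        Set<Long> dangling = new HashSet<>(candidateCdIds);
        for (Long cdId : referencedCdIds) {
            // A returned ID is still in use, so it is not dangling.
            dangling.remove(cdId);
        }
        return dangling;
    }

    public static void main(String[] args) {
        List<Long> candidates = Arrays.asList(1L, 2L, 3L);
        List<Long> stillReferenced = Arrays.asList(1L, 3L);
        System.out.println(findDangling(candidates, stillReferenced));
    }
}
```

Computing the difference on the client side avoids depending on a zero-count row that the database will never produce.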



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
