[jira] [Work logged] (HIVE-21762) REPL DUMP to support new format for replication policy input to take included tables list.

ASF GitHub Bot (JIRA) Wed, 12 Jun 2019 06:49:18 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-21762?focusedWorklogId=258726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-258726
 ]


ASF GitHub Bot logged work on HIVE-21762:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Jun/19 13:48
            Start Date: 12/Jun/19 13:48
    Worklog Time Spent: 10m 
      Work Description: maheshk114 commented on pull request #664: HIVE-21762: 
REPL DUMP to support new format for replication policy input to take included 
tables list.
URL: https://github.com/apache/hive/pull/664#discussion_r292913152
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java
 ##########
 @@ -173,67 +184,38 @@ private void initReplDump(ASTNode ast) throws HiveException {
           break;
         }
         case TOK_REPL_TABLES: {
-          assert(currNode.getChildCount() <= 2);
-          if (!isValidTablesList(currNode)) {
-            LOG.error(ErrorMsg.REPL_INCORRECT_SYNTAX_FOR_REPL_POLICY.getMsg());
-            throw new SemanticException(ErrorMsg.REPL_INCORRECT_SYNTAX_FOR_REPL_POLICY.getMsg());
-          }
-
-          // Traverse the children which can be single table_name node or just include tables list
-          // or both include and exclude tables list.
-          for (int listIdx = 0; listIdx < currNode.getChildCount(); listIdx++) {
-            Tree tablesNode = currNode.getChild(listIdx);
-            if (tablesNode.getType() == TOK_TABNAME) {
-              String tableName = tablesNode.getChild(0).getText();
-              LOG.info("ReplScope: Set Table Name: {}", tableName);
-              replScope.setTableName(tableName);
-            } else {
-              List<String> tablesList = new ArrayList<>();
-              for (int child = 0; child < tablesNode.getChildCount(); child++) {
-                Tree tablePatternNode = tablesNode.getChild(child);
-                if (tablePatternNode.getType() == TOK_NULL) {
-                  LOG.error(ErrorMsg.REPL_INVALID_DB_OR_TABLE_PATTERN.getMsg());
-                  throw new SemanticException(ErrorMsg.REPL_INVALID_DB_OR_TABLE_PATTERN.getMsg());
-                }
-                tablesList.add(unescapeSQLString(tablePatternNode.getText()));
-              }
-              if (!tablesList.isEmpty()) {
-                if (listIdx == 0) {
-                  LOG.info("ReplScope: Set Included Tables List: {}", tablesList);
-                  replScope.setIncludedTablePatterns(tablesList);
-                } else {
-                  LOG.info("ReplScope: Set Excluded Tables List: {}", tablesList);
-                  replScope.setExcludedTablePatterns(tablesList);
-                }
-              }
-            }
-          }
+          setReplDumpTablesList(currNode);
           break;
         }
-        default: {
+        case TOK_FROM: {
           // TOK_FROM subtree
           Tree fromNode = currNode;
           eventFrom = Long.parseLong(PlanUtils.stripQuotes(fromNode.getChild(0).getText()));
-          // skip the first, which is always required
-          int numChild = 1;
-          while (numChild < fromNode.getChildCount()) {
-            if (fromNode.getChild(numChild).getType() == TOK_TO) {
+
+          // Skip the first, which is always required
+          int fromChildIdx = 1;
+          while (fromChildIdx < fromNode.getChildCount()) {
+            if (fromNode.getChild(fromChildIdx).getType() == TOK_TO) {
               eventTo =
-                      Long.parseLong(PlanUtils.stripQuotes(fromNode.getChild(numChild + 1).getText()));
-              // skip the next child, since we already took care of it
-              numChild++;
-            } else if (fromNode.getChild(numChild).getType() == TOK_LIMIT) {
+                      Long.parseLong(PlanUtils.stripQuotes(fromNode.getChild(fromChildIdx + 1).getText()));
+              // Skip the next child, since we already took care of it
+              fromChildIdx++;
+            } else if (fromNode.getChild(fromChildIdx).getType() == TOK_LIMIT) {
               maxEventLimit =
-                      Integer.parseInt(PlanUtils.stripQuotes(fromNode.getChild(numChild + 1).getText()));
-              // skip the next child, since we already took care of it
-              numChild++;
+                      Integer.parseInt(PlanUtils.stripQuotes(fromNode.getChild(fromChildIdx + 1).getText()));
+              // Skip the next child, since we already took care of it
+              fromChildIdx++;
             }
             // move to the next child in FROM tree
-            numChild++;
+            fromChildIdx++;
           }
+          break;
+        }
+        default: {
+          throw new SemanticException("Unrecognized token in REPL DUMP statement.");
 
 Review comment:
   Print the token in the exception message; it will make this easier to debug.
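
One way to act on this comment, as a minimal sketch: include the offending token's text and type in the exception message. The `Node` class and the use of `IllegalArgumentException` below are hypothetical stand-ins for Hive's `ASTNode` and `SemanticException`, used only so the example compiles on its own. The message wording is an assumption, not the text that was eventually committed.

```java
public class UnrecognizedTokenDemo {
  // Hypothetical stand-in for an AST node; only the token text and type are modeled.
  public static final class Node {
    private final String text;
    private final int type;

    public Node(String text, int type) {
      this.text = text;
      this.type = type;
    }

    public String getText() { return text; }
    public int getType() { return type; }
  }

  // Builds an error message that names the token the analyzer did not expect.
  public static String unrecognizedTokenMessage(Node node) {
    return "Unrecognized token in REPL DUMP statement: "
        + node.getText() + " (type " + node.getType() + ")";
  }

  // Mirrors the default branch of the switch: fail, but say which token failed.
  public static void rejectUnknown(Node node) {
    throw new IllegalArgumentException(unrecognizedTokenMessage(node));
  }

  public static void main(String[] args) {
    try {
      rejectUnknown(new Node("TOK_UNKNOWN", 999));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In the analyzer itself the same idea would presumably apply to `currNode` in the `default` branch; the token name and type number used in `main` are made up for illustration.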
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 258726)
    Time Spent: 4h 50m  (was: 4h 40m)

> REPL DUMP to support new format for replication policy input to take included 
> tables list.
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21762
>                 URL: https://issues.apache.org/jira/browse/HIVE-21762
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, Replication, pull-request-available
>         Attachments: HIVE-21762.01.patch, HIVE-21762.02.patch, 
> HIVE-21762.03.patch, HIVE-21762.04.patch, HIVE-21762.05.patch, 
> HIVE-21762.06.patch
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> - REPL DUMP syntax:
> {code}
> REPL DUMP <repl_policy> [FROM <last_repl_id>] [WITH <key_values_list>];
> {code}
> - The new format for the replication policy has 3 parts, all separated by a 
> dot (.). 
> 1. The first part is the DB name.
> 2. The second part is the included list: comma-separated table names/regexes 
> within square brackets []. If the square brackets are absent, it is treated 
> as single-table replication, which skips DB-level events.
> 3. The third part is the excluded list: comma-separated table names/regexes 
> within square brackets [].
> {code}
> <db_name> -- Full DB replication which is currently supported
> <db_name>.['.*?']  -- Full DB replication
> <db_name>.[] -- Replicate just functions and not include any tables.
> <db_name>.['t1', 't2']  -- DB replication with static list of tables t1 and 
> t2 included.
> <db_name>.['t1*', 't2', '*t3'].['t100', '5t3', 't4'] -- DB replication that 
> includes all tables with prefix t1 or suffix t3, plus table t2, and excludes 
> t100 (which has the prefix t1), 5t3 (which has the suffix t3) and t4.
> {code}
> - Regular expressions of any format need to be supported. 
> - A table is included in the dump only if it matches a regular expression in 
> the included list and doesn't match any in the excluded list.
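
The inclusion rule in the last bullet can be sketched as follows. This is a minimal illustration, not Hive's `ReplScope` implementation: the class name `TableFilter` and its methods are hypothetical, patterns are assumed to be Java regular expressions matched against the whole table name, and case-insensitive matching is an assumption. The patterns `'t1*'` and `'*t3'` from the examples are written here as `t1.*` and `.*t3`, because a pattern with a leading `*` is rejected by `java.util.regex`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

public class TableFilter {
  // null means no include list was given (plain <db_name>): every table qualifies.
  // An empty list (<db_name>.[]) matches no table at all.
  private final List<Pattern> included;
  private final List<Pattern> excluded;

  public TableFilter(List<String> includedRegexes, List<String> excludedRegexes) {
    this.included = includedRegexes == null ? null : compile(includedRegexes);
    this.excluded = excludedRegexes == null
        ? Collections.<Pattern>emptyList()
        : compile(excludedRegexes);
  }

  private static List<Pattern> compile(List<String> regexes) {
    List<Pattern> out = new ArrayList<>();
    for (String regex : regexes) {
      // Assumption: table names are compared case-insensitively.
      out.add(Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
    }
    return out;
  }

  private static boolean matchesAny(List<Pattern> patterns, String name) {
    for (Pattern p : patterns) {
      if (p.matcher(name).matches()) {
        return true;
      }
    }
    return false;
  }

  // A table is dumped only if it matches the included list and not the excluded list.
  public boolean isIncluded(String table) {
    if (included != null && !matchesAny(included, table)) {
      return false;
    }
    return !matchesAny(excluded, table);
  }

  public static void main(String[] args) {
    TableFilter filter = new TableFilter(
        Arrays.asList("t1.*", "t2", ".*t3"),
        Arrays.asList("t100", "5t3", "t4"));
    for (String t : Arrays.asList("t15", "t2", "at3", "t100", "5t3", "t4")) {
      System.out.println(t + " -> " + filter.isIncluded(t));
    }
  }
}
```

With these lists, `t100` matches the include pattern `t1.*` but is still rejected because it also appears in the excluded list, which is the behaviour the last example in the description calls for.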



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
