[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-09 Thread Reuben (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487266#comment-14487266
 ] 

Reuben commented on HIVE-10190:
---

Right, which is is why in my code, I was using the {{SetT.contains}} rather 
than {{String.contains}}. Also, the patch as is won't work because:

{code}
  for (Node child : current.getChildren()) {
fringe.add((ASTNode) child);
  }
{code}

will fail if current doesn't have any children. 

Lastly, this is my first patch, so uh ... if I could maybe get some help on how 
to commit it .. that would be cool too : ).

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Pengcheng Xiong
Priority: Trivial
  Labels: perfomance
 Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch


 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
 String astTree = ast.toStringTree();
 // if any of following tokens are present in AST, bail out
 String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
 for (String token : tokens) {
   if (astTree.contains(token)) {
 return false;
   }
 }
 return true;
   }
 {code}
 This is an issue for a SQL query which is bigger in AST form than in text 
 (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-09 Thread Reuben (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487276#comment-14487276
 ] 

Reuben commented on HIVE-10190:
---

One other thing, it looks like {{ArrayListT.remove}} has a runtime of O(N) 
(http://infotechgems.blogspot.com/2011/11/java-collections-performance-time.html).
 If we don't want to use a {{QueueT}}, maybe a {{LinkedListT}} instead?

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Pengcheng Xiong
Priority: Trivial
  Labels: perfomance
 Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch


 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
 String astTree = ast.toStringTree();
 // if any of following tokens are present in AST, bail out
 String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
 for (String token : tokens) {
   if (astTree.contains(token)) {
 return false;
   }
 }
 return true;
   }
 {code}
 This is an issue for a SQL query which is bigger in AST form than in text 
 (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-09 Thread Reuben (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488144#comment-14488144
 ] 

Reuben commented on HIVE-10190:
---

Okay, I think it's up. I really appreciate the help!

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Pengcheng Xiong
Priority: Trivial
  Labels: perfomance
 Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, 
 HIVE-10190.02.patch


 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
 String astTree = ast.toStringTree();
 // if any of following tokens are present in AST, bail out
 String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
 for (String token : tokens) {
   if (astTree.contains(token)) {
 return false;
   }
 }
 return true;
   }
 {code}
 This is an issue for a SQL query which is bigger in AST form than in text 
 (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-09 Thread Reuben (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben updated HIVE-10190:
--
Attachment: HIVE-10190.02.patch

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Pengcheng Xiong
Priority: Trivial
  Labels: perfomance
 Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, 
 HIVE-10190.02.patch


 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
 String astTree = ast.toStringTree();
 // if any of following tokens are present in AST, bail out
 String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
 for (String token : tokens) {
   if (astTree.contains(token)) {
 return false;
   }
 }
 return true;
   }
 {code}
 This is an issue for a SQL query which is bigger in AST form than in text 
 (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-08 Thread Reuben (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485983#comment-14485983
 ] 

Reuben commented on HIVE-10190:
---

I think I maybe have a solution for this, but not sure how to test a 700+ KB 
query:

{code}
  public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
StackASTNode fringe = new StackASTNode();
SetString unsupportedTokens = new HashSetString() {{
  add(TOK_CHARSETLITERAL);
  add(TOK_TABLESPLITSAMPLE);
}};

// Flat-DFS.
fringe.push(ast);
while (!fringe.isEmpty()) {
  ASTNode current = fringe.pop();
  if (current.getChildCount()  0) {
for (Node child : current.getChildren()) {
  fringe.push((ASTNode)child);
}
  }

  if (unsupportedTokens.contains(current.getText())) {
return false;
  }
}
}
{code}

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Laljo John Pullokkaran
Priority: Trivial
  Labels: perfomance

 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
 String astTree = ast.toStringTree();
 // if any of following tokens are present in AST, bail out
 String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
 for (String token : tokens) {
   if (astTree.contains(token)) {
 return false;
   }
 }
 return true;
   }
 {code}
 This is an issue for a SQL query which is bigger in AST form than in text 
 (~700kb).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)