[ 
https://issues.apache.org/jira/browse/HIVE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107018#comment-13107018
 ] 

jirapos...@reviews.apache.org commented on HIVE-2453:
-----------------------------------------------------



bq.  On 2011-09-16 21:27:59, Ning Zhang wrote:
bq.  > trunk/ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java, line 42
bq.  > <https://reviews.apache.org/r/1933/diff/1/?file=41497#file41497line42>
bq.  >
bq.  >     can you split it into 2 parts: useScriptInMapper and useScriptInReducer?
bq.  
bq.  Kevin Wilfong wrote:
bq.      Determining whether a script is used in the mapper or the reducer will require going through the operator tree added to each Map Reduce job to determine if a Transform operator is there and then setting the appropriate flag.  That is more work than I'd like to do here considering this feature will probably not be used by most users.  I would like to keep the flag here, so that it can be decided if that work needs to be performed somewhere else.

OK. My original reason for splitting this into mapper and reducer flags was 
that we could analyze the cost of the script operator based on its input size 
(mappers and reducers have different input size metrics). Let's see whether the 
separate flags are needed in the future and file a follow-up JIRA then.
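
For reference, the operator-tree walk described in the quoted reply could look roughly like the sketch below. It is only an illustration under stated assumptions: OpNode, ScriptUsage and the "Transform" kind string are hypothetical stand-ins, not Hive's operator classes, and the real per-job plumbing (finding the map-side and reduce-side roots of each job) is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for one operator in a job's operator tree.
// This is NOT Hive's Operator class.
final class OpNode {
    final String kind; // e.g. "TableScan", "Select", "Transform"
    final List<OpNode> children = new ArrayList<>();

    OpNode(String kind) {
        this.kind = kind;
    }

    // Attach a child and return this node so trees can be built inline.
    OpNode add(OpNode child) {
        children.add(child);
        return this;
    }
}

final class ScriptUsage {
    // Iterative depth-first walk: returns true if any operator reachable
    // from root has the (assumed) kind "Transform".
    static boolean containsTransform(OpNode root) {
        Deque<OpNode> stack = new ArrayDeque<>();
        if (root != null) {
            stack.push(root);
        }
        while (!stack.isEmpty()) {
            OpNode op = stack.pop();
            if ("Transform".equals(op.kind)) {
                return true;
            }
            for (OpNode child : op.children) {
                stack.push(child);
            }
        }
        return false;
    }
}
```

Under these assumptions, a mapper flag would come from calling containsTransform on the map-side root of a job and a reducer flag from calling it on the reduce-side root.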


- Ning


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1933/#review1946
-----------------------------------------------------------


On 2011-09-17 00:14:50, Kevin Wilfong wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1933/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-09-17 00:14:50)
bq.  
bq.  
bq.  Review request for hive and Ning Zhang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  The information that would be useful for categorizing queries is clearest in the Semantic Analyzer, when the data from the Parser is interpreted.  I added a new class which is designed to collect that data here, and place it ultimately in the QueryPlan where it will be available to hooks.
bq.  
bq.  The information I collect is whether or not the query has the following clauses:
bq.    Join
bq.    Group By
bq.    Order By
bq.    Sort By
bq.    Group By after a Join clause
bq.  
bq.  Also, I store whether or not a script is used for mapping or reducing.
bq.  
bq.  
bq.  This addresses bug HIVE-2453.
bq.      https://issues.apache.org/jira/browse/HIVE-2453
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java 1170719
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java PRE-CREATION
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 1170719
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1170719
bq.    trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/CheckQueryPropertiesHook.java PRE-CREATION
bq.    trunk/ql/src/test/queries/clientpositive/query_properties.q PRE-CREATION
bq.    trunk/ql/src/test/results/clientpositive/query_properties.q.out PRE-CREATION
bq.  
bq.  Diff: https://reviews.apache.org/r/1933/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I added a new test, which runs a variety of queries, such that each of the flags in QueryProperties is set by at least one query, and some are also set in combination.
bq.  I also added a hook which prints the contents of QueryProperties to the error stream on the console.
bq.  
bq.  I checked the output in the results file and verified it matched what I expected.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.
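
As a rough illustration of the flag holder described in the quoted summary, a minimal sketch follows. It is not the class from the patch: QueryPropertiesSketch, PrintPropertiesHook and every field and method name are hypothetical, and the rule used for "Group By after a Join" (a group-by recorded after a join was already recorded) is an assumption made for the example.

```java
// Hypothetical flag holder; names are illustrative and not taken from the patch.
final class QueryPropertiesSketch {
    boolean hasJoin;
    boolean hasGroupBy;
    boolean hasOrderBy;
    boolean hasSortBy;
    boolean hasJoinFollowedByGroupBy;
    boolean usesScript;

    void markJoin() {
        hasJoin = true;
    }

    // Assumption: a group-by recorded after a join counts as
    // "Group By after a Join clause".
    void markGroupBy() {
        hasGroupBy = true;
        if (hasJoin) {
            hasJoinFollowedByGroupBy = true;
        }
    }

    // One-line summary of all flags, suitable for logging.
    String describe() {
        return "join=" + hasJoin
            + " groupBy=" + hasGroupBy
            + " orderBy=" + hasOrderBy
            + " sortBy=" + hasSortBy
            + " joinFollowedByGroupBy=" + hasJoinFollowedByGroupBy
            + " usesScript=" + usesScript;
    }
}

// Hypothetical stand-in for a hook that reports the flags; a real Hive hook
// would implement one of Hive's hook interfaces instead.
final class PrintPropertiesHook {
    static void run(QueryPropertiesSketch props) {
        System.err.println(props.describe());
    }
}
```

A hook written this way only reads the flags, so it does not depend on which operators a given kind of query happens to compile to.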



> Need a way to categorize queries in hooks for improved logging
> --------------------------------------------------------------
>
>                 Key: HIVE-2453
>                 URL: https://issues.apache.org/jira/browse/HIVE-2453
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2453.1.patch.txt
>
>
> We need a way to categorize queries, such as whether or not they include a 
> join clause, a group by clause, etc., in the hooks.  This will allow for 
> better performance logging.
> Currently the only way I can find is to go through the operators in the 
> tasks, but which operators are used for the different types of queries may 
> change over time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
