[ 
https://issues.apache.org/jira/browse/PIG-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169803#comment-13169803
 ] 

Dmitriy V. Ryaboy commented on PIG-2397:
----------------------------------------

Jie, something just occurred to me about Q1 -- are you sure Hive is doing the 
right thing here?
If it's using more than one reducer for the ORDER operation, it isn't producing 
a total order. From the Hive docs:

{code}
Syntax of Order By

The ORDER BY syntax in Hive QL is similar to the syntax of ORDER BY in SQL 
language.

colOrder: ( ASC | DESC )
orderBy: ORDER BY colName colOrder? (',' colName colOrder?)*
query: SELECT expression (',' expression)* FROM src orderBy

There are some limitations in the "order by" clause. In the strict mode 
(i.e., hive.mapred.mode=strict), the order by clause has to be followed 
by a "limit" clause. The limit clause is not necessary if you set 
hive.mapred.mode to nonstrict. The reason is that in order to impose 
total order of all results, there has to be one reducer to sort the final 
output. If the number of rows in the output is too large, the single reducer 
could take a very long time to finish.
{code}
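
To make the concern concrete, here is a minimal Python sketch of a multi-reducer sort. It is a toy model, not Hive or Pig code: the function names (sort_with_reducers, hash_partition, make_range_partition) and the sample rows are hypothetical, and it assumes each reducer sorts only the rows routed to it, with the outputs concatenated in reducer order. Under those assumptions, hash-style partitioning gives sorted runs rather than a total order, while a single reducer or range partitioning gives a total order.

```python
# Toy model of a multi-reducer sort; not Hive or Pig code.
# All names below are hypothetical and exist only for illustration.

def sort_with_reducers(rows, num_reducers, partition):
    """Route each row to a reducer, sort within each reducer, then
    concatenate the reducer outputs in reducer order."""
    buckets = [[] for _ in range(num_reducers)]
    for row in rows:
        buckets[partition(row, num_reducers)].append(row)
    output = []
    for bucket in buckets:
        output.extend(sorted(bucket))
    return output


def hash_partition(row, num_reducers):
    # Hash-style partitioning: the target reducer ignores key order.
    return row % num_reducers


def make_range_partition(boundaries):
    # Range partitioning: reducer i only receives keys below boundaries[i],
    # so the concatenated outputs are globally ordered.
    def partition(row, num_reducers):
        for i, bound in enumerate(boundaries):
            if row < bound:
                return i
        return num_reducers - 1
    return partition


rows = [7, 2, 9, 4, 1, 8, 3, 6, 5, 0]

hashed = sort_with_reducers(rows, 2, hash_partition)
ranged = sort_with_reducers(rows, 2, make_range_partition([5]))
single = sort_with_reducers(rows, 1, hash_partition)

print("2 reducers, hash partition: ", hashed)
print("2 reducers, range partition:", ranged)
print("1 reducer:                  ", single)
```

So if the Hive run used several reducers without range partitioning for the sort, its output for Q1 would be sorted per reducer but not globally, and the timing would not be comparable to a true ORDER BY.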
                
> Running TPC-H on Pig
> --------------------
>
>                 Key: PIG-2397
>                 URL: https://issues.apache.org/jira/browse/PIG-2397
>             Project: Pig
>          Issue Type: Task
>            Reporter: Jie Li
>         Attachments: TPC-H_on_Pig.tgz, pig_tpch.ppt
>
>
> For a class project we developed a whole set of Pig scripts for TPC-H. Our 
> goals are:
> 1) identifying the bottlenecks of Pig's performance especially of its 
> relational operators,
> 2) studying how to write efficient scripts by making full use of Pig Latin's 
> features,
> 3) comparing with Hive's TPC-H results for verifying both 1) and 2).
> We will update the JIRA with our scripts, results and analysis soon.


        
