[ 
https://issues.apache.org/jira/browse/HIVE-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14627:
-----------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the reviews!

> Improvements to MiniMr tests
> ----------------------------
>
>                 Key: HIVE-14627
>                 URL: https://issues.apache.org/jira/browse/HIVE-14627
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14627.1.patch, HIVE-14627.2.patch, 
> HIVE-14627.3.patch
>
>
> Currently MiniMr is extremely slow, I ran udf_using.q on MiniMr and following 
> are the execution time breakdown
> Total time - 13m59s
> Junit reported time for testcase - 50s
> Most of the time is spent in creating/loading/analyzing initial tables - ~12m
> Cleanup - ~1m
> There is huge overhead for running MiniMr tests when compared to the actual 
> test runtime. 
> Ran the same test without init script.
> Total time - 2m17s
> Junit reported time for testcase - 52s
> Also I noticed some tests that doesn't have to run on MiniMr (like 
> udf_using.q that does not require MiniMr. It just reads/write to hdfs which 
> we can do in MiniTez/MiniLlap which are way faster). Most tests access only 
> very few initial tables to read few rows from it. We can fix those tests to 
> load just the table that is required for the table instead of all initial 
> tables. Also we can remove q_init_script.sql initialization for MiniMr after 
> rewriting and moving over the unwanted tests which should cut down the 
> runtime a lot.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to