[
https://issues.apache.org/jira/browse/HIVE-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prasanth Jayachandran updated HIVE-14627:
-----------------------------------------
Attachment: HIVE-14627.1.patch
> Improvements to MiniMr tests
> ----------------------------
>
> Key: HIVE-14627
> URL: https://issues.apache.org/jira/browse/HIVE-14627
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.2.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-14627.1.patch
>
>
> Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr, and the
> execution time breaks down as follows:
> Total time - 13m59s
> Junit reported time for testcase - 50s
> Most of the time is spent in creating/loading/analyzing initial tables - ~12m
> Cleanup - ~1m
> There is a huge overhead in running MiniMr tests compared to the actual
> test runtime.
> I ran the same test without the init script:
> Total time - 2m17s
> Junit reported time for testcase - 52s
> I also noticed some tests that don't have to run on MiniMr. For example,
> udf_using.q does not require MiniMr: it just reads from and writes to HDFS,
> which we can do in MiniTez/MiniLlap, which are much faster. Most tests access
> only a few of the initial tables and read only a few rows from them. We can
> fix those tests to load just the tables each test requires instead of all
> initial tables. We can also remove the q_init_script.sql initialization for
> MiniMr after rewriting and moving over the tests that do not need MiniMr,
> which should cut down the runtime a lot.
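
As a rough check on the figures quoted above, the overhead can be worked out directly from the reported timings. This is only an illustrative sketch in Python: the variable and function names are made up for this example and are not part of Hive or its test framework, and "overhead" is assumed here to mean all wall-clock time outside the JUnit-reported test time.

```python
# Timings copied from the description above, converted to seconds.
total_with_init = 13 * 60 + 59      # 13m59s, full MiniMr run with init script
junit_with_init = 50                # JUnit-reported time for the test case
total_without_init = 2 * 60 + 17    # 2m17s, same test without the init script
junit_without_init = 52             # JUnit-reported time for the test case


def overhead_share(total, test):
    """Fraction of wall-clock time spent outside the test case itself."""
    return (total - test) / total


with_init = overhead_share(total_with_init, junit_with_init)
without_init = overhead_share(total_without_init, junit_without_init)
saved = total_with_init - total_without_init  # seconds saved by skipping init

print(f"overhead with init script:    {with_init:.0%}")
print(f"overhead without init script: {without_init:.0%}")
print(f"time saved by skipping init:  {saved // 60}m{saved % 60:02d}s")
```

On these numbers, roughly 94% of the run with the init script is overhead, against roughly 62% without it, and skipping the init script saves 11m42s for this one test.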
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)