[jira] Updated: (HIVE-1408) add option to let hive automatically run in local mode based on tunable heuristics

Joydeep Sen Sarma (JIRA) Sun, 25 Jul 2010 19:22:15 -0700

     [ https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Joydeep Sen Sarma updated HIVE-1408:
------------------------------------

    Attachment: 1408.5.patch

Final patch, I hope!

I had to jump through some hoops to make the test pass on all versions. It turns 
out that not having the pfile implementation on every version makes the test 
outputs differ (ignoring pfile: in diffs is not enough, because the order of 
paths in various lists changes).

So I have ported the ProxyFileSystem to all the shims (only 17 required 
significant changes).

Tests on 17 and 20 both pass now (18 and 19 are still running).

> add option to let hive automatically run in local mode based on tunable 
> heuristics
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-1408
>                 URL: https://issues.apache.org/jira/browse/HIVE-1408
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>         Attachments: 1408.1.patch, 1408.2.patch, 1408.2.q.out.patch, 
> 1408.3.patch, 1408.4.patch, 1408.5.patch
>
>
> As a followup to HIVE-543, we should have a simple option (enabled by 
> default) to let Hive run in local mode if possible.
> Two levels of options are desirable:
> 1. hive.exec.mode.local.auto=true/false // control whether local mode is 
> automatically chosen
> 2. Options to control different heuristics; some naive examples:
>      hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode 
> if data > 1G
>      hive.exec.mode.local.auto.script.enable=true/false // control whether 
> local mode is chosen for queries with user scripts
> This can be implemented as a pre/post execution hook. It makes sense to 
> provide this as a standard hook in the Hive codebase since it's likely to 
> improve response time for many users (especially for test queries).
> The initial proposal is to choose this at the query level and not at the 
> per-task (i.e. Hadoop job) level. Choosing per job requires more changes to 
> compilation (to not pre-commit to HDFS or local scratch directories at 
> compile time).
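
The two levels of options in the description can be read as a single 
query-level check. The following is only a minimal sketch of that reading, 
not the code in the attached patches: the property names are taken from the 
description, while the function names, the size-suffix parsing, and the 
defaults for the size and script options are assumptions made for 
illustration.

```python
# Hypothetical sketch of the query-level decision described in the issue.
# Property names come from the issue text; function names, defaults for the
# heuristic options, and the size-suffix parsing are illustrative assumptions.

_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Turn a value such as '1G' or '4096' into a byte count."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)


def should_run_local(conf, input_bytes, has_user_script):
    """Decide once per query (not per Hadoop job) whether to run locally."""
    # Level 1: master switch; the issue proposes enabling it by default.
    if conf.get("hive.exec.mode.local.auto", "true") != "true":
        return False

    # Level 2a: refuse local mode when the input is larger than the limit.
    # The "1G" default here is an assumption borrowed from the example value.
    limit = parse_size(conf.get("hive.exec.mode.local.auto.input.size.max", "1G"))
    if input_bytes > limit:
        return False

    # Level 2b: queries with user scripts go local only if explicitly allowed.
    # The "true" default is an assumption; the issue does not state one.
    script_ok = conf.get("hive.exec.mode.local.auto.script.enable", "true") == "true"
    if has_user_script and not script_ok:
        return False

    return True
```

Under these assumptions, a 10 MB query with no user script would run 
locally with an empty configuration, while the same query would go to the 
cluster if hive.exec.mode.local.auto were set to false.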

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
