[jira] Commented: (HIVE-1408) add option to let hive automatically run in local mode based on tunable heuristics

Joydeep Sen Sarma (JIRA) Wed, 28 Jul 2010 00:20:49 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893088#action_12893088
 ]


Joydeep Sen Sarma commented on HIVE-1408:
-----------------------------------------

#1 - we decided that i would try to take ProxyFileSystem out of the hive jars 
in the distribution. unfortunately, i am unable to do so - all the simple ways 
seem to break the tests. i don't see much of a downside to the current 
arrangement - ProxyFileSystem is test-only code and there's no reason why anyone 
should invoke it, so it shouldn't cause any problems (even though it ships with 
the hive jars). the pfile:// -> ProxyFileSystem mapping exists only in test 
mode.

  btw - i can't use ShimLoader, because Hadoop doesn't specify a factory class 
for creating file system objects. it expects a file system class directly. that 
makes it impossible to write a portable filesystem class using the ShimLoader 
paradigm. i am beginning to appreciate factory classes more.
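
to illustrate the point about class names versus factories, here is a minimal, self-contained sketch. it does not use the Hadoop API: SketchFileSystem, SketchProxyFileSystem and SketchFsRegistry are hypothetical stand-ins, and the "fs.<scheme>.impl" key is assumed to mirror how a scheme such as pfile is mapped to a class through configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a file system base type; hypothetical, not Hadoop's FileSystem.
interface SketchFileSystem {
    String describe();
}

// Stand-in for a test-only proxy implementation (hypothetical).
final class SketchProxyFileSystem implements SketchFileSystem {
    public SketchProxyFileSystem() {}

    @Override
    public String describe() {
        return "proxy";
    }
}

// Resolves a scheme to a concrete class named in configuration.
final class SketchFsRegistry {
    private final Map<String, String> conf = new HashMap<>();

    void set(String key, String value) {
        conf.put(key, value);
    }

    SketchFileSystem create(String scheme) {
        // Assumed key layout: fs.<scheme>.impl -> fully qualified class name.
        String className = conf.get("fs." + scheme + ".impl");
        if (className == null) {
            throw new IllegalArgumentException("no file system configured for scheme: " + scheme);
        }
        try {
            // The configured class is instantiated directly; there is no factory
            // step at which a version-specific implementation could be chosen.
            Class<?> cls = Class.forName(className);
            return (SketchFileSystem) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }
}
```

in this sketch, registering SketchProxyFileSystem.class.getName() under "fs.pfile.impl" and calling create("pfile") yields the proxy instance. if the configuration named a factory instead, the factory could pick an implementation at run time, which is the hook a shim loader would need.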

#2 - not an issue - can't use ShimLoader, as per above.

#3 fixed

#4, #5, #6, #7, #8 - not an issue, as we discussed. HIVE-1484 has already been 
filed as followup work to use a local dir for intermediate data when possible.

#9 - fixed. moved one public func to Utility.java and eliminated the other.


> add option to let hive automatically run in local mode based on tunable 
> heuristics
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-1408
>                 URL: https://issues.apache.org/jira/browse/HIVE-1408
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>         Attachments: 1408.1.patch, 1408.2.patch, 1408.2.q.out.patch, 
> 1408.7.patch, hive-1408.6.patch
>
>
> as a followup to HIVE-543 - we should have a simple option (enabled by 
> default) to let hive run in local mode if possible.
> two levels of options are desirable:
> 1. hive.exec.mode.local.auto=true/false // control whether local mode is 
> automatically chosen
> 2. Options to control different heuristics, some naive examples:
>      hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode 
> if data > 1G
>      hive.exec.mode.local.auto.script.enable=true/false // choose if local 
> mode is enabled for queries with user scripts
> this can be implemented as a pre/post execution hook. It makes sense to 
> provide this as a standard hook in the hive codebase since it's likely to 
> improve response time for many users (especially for test queries).
> the initial proposal is to choose this at a query level and not at a per 
> hive-task (i.e. hadoop job) level. a per-job level requires more changes to 
> compilation (to not pre-commit to hdfs or local scratch directories at 
> compile time).
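
The query-level decision described in the issue can be sketched as follows. This is only an illustration under stated assumptions, not the code in the attached patches: the class LocalModeHeuristic and its method are hypothetical, the configuration is modeled as a plain Map, the size limit is assumed to be given in bytes (with the description's 1G example as the default), and scripts are assumed to be allowed by default.

```java
import java.util.Map;

// Hypothetical sketch of a query-level local-mode decision; not Hive's actual code.
final class LocalModeHeuristic {
    // Option names as proposed in the issue description.
    static final String AUTO = "hive.exec.mode.local.auto";
    static final String MAX_INPUT = "hive.exec.mode.local.auto.input.size.max";
    static final String SCRIPT_ENABLE = "hive.exec.mode.local.auto.script.enable";

    // Assumed default: the "1G" example from the description, in bytes.
    static final long DEFAULT_MAX_INPUT_BYTES = 1L << 30;

    private LocalModeHeuristic() {}

    static boolean shouldRunLocal(Map<String, String> conf,
                                  long totalInputBytes,
                                  boolean usesUserScript) {
        // Level 1: the master switch (the description proposes enabled by default).
        if (!Boolean.parseBoolean(conf.getOrDefault(AUTO, "true"))) {
            return false;
        }
        // Level 2a: don't choose local mode if the input exceeds the limit.
        long maxBytes = Long.parseLong(
            conf.getOrDefault(MAX_INPUT, Long.toString(DEFAULT_MAX_INPUT_BYTES)));
        if (totalInputBytes > maxBytes) {
            return false;
        }
        // Level 2b: queries with user scripts qualify only if explicitly allowed
        // (the default of "true" here is an assumption).
        boolean scriptsAllowed =
            Boolean.parseBoolean(conf.getOrDefault(SCRIPT_ENABLE, "true"));
        return !usesUserScript || scriptsAllowed;
    }
}
```

Because the check takes the total input size for the whole query, it matches the query-level proposal; a per-job variant would need the size of each job's input, which is not known until earlier jobs have run.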

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
