[
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835504#action_12835504
]
Carl Steinbach commented on HIVE-1096:
--------------------------------------
bq. Philosophically I agree. In actuality, Hive/Hadoop conf is easily
manipulated by changing your hadoop-site.xml or hive-site.xml. Users do have
unprotected access to the namespace; that is the nature of hadoop. Users of hive
are setting variables all the time.

True, but I think we should try to improve the situation. As a start we can add
code to throw an error if hive-default.xml or hive-site.xml sets a hive.*
configuration property that is not defined in HiveConf. This would protect the
hive.* namespace and at the same time make it easy to track down cases where
folks misspell a hive.* property name.
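
To make that concrete, here is a minimal sketch of the check. The class {{HivePropertyCheck}} and its methods are hypothetical, and {{knownNames}} is just a stand-in for whatever set of property names HiveConf defines; none of this is existing Hive code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of Hive: flags hive.* keys that are not
// in a known set of property names (a stand-in for the HiveConf definitions).
final class HivePropertyCheck {
    private HivePropertyCheck() {}

    /** Returns, in sorted order, the hive.* keys in fileProps missing from knownNames. */
    static List<String> unknownHiveKeys(Map<String, String> fileProps, Set<String> knownNames) {
        List<String> unknown = new ArrayList<>();
        // TreeMap only gives a stable, sorted order for the error message.
        for (String key : new TreeMap<>(fileProps).keySet()) {
            // Keys outside the hive.* namespace (e.g. mapred.*) are not checked.
            if (key.startsWith("hive.") && !knownNames.contains(key)) {
                unknown.add(key);
            }
        }
        return unknown;
    }

    /** Throws if the file sets any hive.* property that is not defined. */
    static void validate(Map<String, String> fileProps, Set<String> knownNames) {
        List<String> unknown = unknownHiveKeys(fileProps, knownNames);
        if (!unknown.isEmpty()) {
            throw new IllegalArgumentException("Unknown hive.* properties: " + unknown);
        }
    }
}
```

With this, a misspelled key such as hive.exec.scratchdri in hive-site.xml would be reported by name, while non-hive.* keys pass through untouched.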

bq. The only true difference in implementation is that you're doing it with
properties and I am doing it with HiveConf Vars. If we support both I think we
are both happy. Any ideas?

I agree that we should support access to both system properties and hiveconf
properties, but if we do, how will we resolve cases where the user references
{{${foo.bar}}} and both the system properties and the hiveconf define a
property named foo.bar? Another problem I see with using the hiveconf namespace
for user variable definitions is that user variables cease to have any meaning
past the client-side query preprocessing step, yet since they're part of the
hiveconf they will get included in the jobconf and sent to datanodes.
Here's a proposal:
* Allow users to reference variables in QL statements using the syntax
{{${namespace:variable_name}}}.
* Users can define variables on the command line using a new "{{-hivevar x=y}}"
switch. Values defined in this manner become part of the user namespace, which
is the default namespace. They can be referenced as either
{{${default:variable_name}}} or {{${variable_name}}}.
* Hive configuration properties are part of the "hiveconf" namespace, and can
be referenced as {{${hiveconf:property_name}}}.
* System properties are part of the "system" namespace, and can be referenced
as {{${system:property_name}}}.
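
As a rough sketch of how these lookup rules could work on the client side (the class {{NamespacedVariables}} is hypothetical, and leaving unresolved references untouched is my assumption, not something the proposal specifies):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed ${namespace:variable_name} lookup.
final class NamespacedVariables {
    // Matches ${name} or ${namespace:name}; group 1 is the optional namespace.
    private static final Pattern REF =
            Pattern.compile("\\$\\{(?:([A-Za-z]+):)?([^}:]+)\\}");

    private final Map<String, Map<String, String>> namespaces = new HashMap<>();

    NamespacedVariables(Map<String, String> userVars,
                        Map<String, String> hiveConf,
                        Map<String, String> systemProps) {
        namespaces.put("default", userVars);    // values from -hivevar x=y
        namespaces.put("hiveconf", hiveConf);   // Hive configuration properties
        namespaces.put("system", systemProps);  // system properties
    }

    /** Replaces every resolvable reference in the query text. */
    String substitute(String query) {
        Matcher m = REF.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // A reference without a namespace falls back to "default".
            String ns = m.group(1) == null ? "default" : m.group(1);
            Map<String, String> vars = namespaces.get(ns);
            String value = vars == null ? null : vars.get(m.group(2));
            // Assumption: unresolved references are left as written.
            String replacement = value == null ? m.group() : value;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because each namespace is a separate map, {{${foo.bar}}}, {{${hiveconf:foo.bar}}} and {{${system:foo.bar}}} can all be defined without colliding, and the user variables never need to be copied into the hiveconf (and hence the jobconf).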
What do you think?
> Hive Variables
> --------------
>
> Key: HIVE-1096
> URL: https://issues.apache.org/jira/browse/HIVE-1096
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
> Attachments: 1096-9.diff, hive-1096-2.diff, hive-1096-7.diff,
> hive-1096-8.diff, hive-1096.diff
>
>
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature
> called "Variables." Basically you can define a variable via command-line
> while invoking hive with -d DT=2009-12-09 and then refer to the variable via
> ${DT} within the hive queries. This could be extremely useful. I can't seem
> to find this feature even on trunk. Is this feature currently anywhere in the
> roadmap?--
> This could be implemented in many places.
> A simple place to put this is in Driver.compile or Driver.run: we can do
> string substitutions at that level, and further downstream need not be
> affected.
> There could be some benefits to doing this further downstream (parser, plan),
> but based on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss
> this more.