[ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317573#comment-14317573
 ] 

Anant Nag commented on HIVE-9664:
---------------------------------

The Gradle-like notation can also be extended to other commands such as ADD 
FILE and ADD ARCHIVE. The LIST and DELETE commands should continue to work 
and produce sensible output for resources added this way. 

The following syntax for the command is proposed: 

{code}add [FILE|JAR|ARCHIVE] 
<ivy://org:module:version?exclude=org1:module1&org2:module2> 
<ivy://org3:module3:version?exclude=org4:module4&org5:module5>*{code}
{code}add [FILE|JAR|ARCHIVE] <http://url_of_the_jar> 
<http://url_of_the_jar>*{code}
{code}add [FILE|JAR|ARCHIVE] <https://url_of_the_jar> 
<https://url_of_the_jar>*{code}
{code}add [FILE|JAR|ARCHIVE] <file://location_of_the_jar> 
<file://location_of_the_jar>*{code}
{code}add [FILE|JAR|ARCHIVE] <location_of_the_jar> <location_of_the_jar>*{code}

The motivation for the above syntax is to make it clear how a jar (or file) 
is obtained. A resource such as <ivy://org:module:version> tells us that the 
file is downloaded from an artifact repository such as a Maven repository, 
whereas <file:///tmp/abc.jar> tells us that the jar is added from the local 
filesystem.
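
As a rough illustration of this idea, the sketch below maps a resource string 
to the way it would be obtained, based only on its scheme. It is not Hive 
code: the names Source and classify are hypothetical, and it assumes that the 
angle brackets in the syntax above are notation rather than part of the input.

```python
from enum import Enum


class Source(Enum):
    """How a resource named in an ADD command would be obtained (hypothetical)."""
    REPOSITORY = "repository"  # ivy:// coordinates resolved from an artifact repository
    DOWNLOAD = "download"      # direct http:// or https:// link
    LOCAL = "local"            # file:// URL or a bare local path
    HDFS = "hdfs"              # hdfs: path


def classify(resource: str) -> Source:
    """Decide how a resource would be obtained by looking at its scheme."""
    scheme, sep, _ = resource.partition(":")
    # No colon at all, or a "/" before the first colon, means a bare path.
    if not sep or "/" in scheme:
        return Source.LOCAL
    scheme = scheme.lower()
    if scheme == "ivy":
        return Source.REPOSITORY
    if scheme in ("http", "https"):
        return Source.DOWNLOAD
    if scheme == "hdfs":
        return Source.HDFS
    if scheme == "file":
        return Source.LOCAL
    raise ValueError(f"unsupported scheme: {scheme}")
```

Rejecting unknown schemes instead of treating them as local paths is a choice 
made for this sketch only; the proposal itself does not say how such input 
should be handled.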

We're assuming that a jar can be added by any of the following methods:

1) A jar can be added from an artifact repository (such as a Maven 
repository). In such a case, transitive dependencies (if enabled) should also 
be downloaded and added to the classpath. If some dependencies have to be 
excluded, they should be mentioned in the command itself.
Command: 
{code}
add jar <ivy://org:module:version> 
<ivy://org:module:version?exclude=org1:module1>* 
{code}

exclude=org1:module1 denotes that the named dependency should be excluded 
when resolving transitive dependencies. 
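
To make the proposed notation concrete, here is a minimal sketch of how such 
a resource string could be split into its coordinates and exclusions. The 
names IvyCoordinate and parse_ivy are hypothetical, and the sketch assumes 
that multiple exclusions are separated by "&" exactly as written in the 
proposed syntax. Resolving the artifact through Ivy is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class IvyCoordinate:
    """Parsed form of ivy://org:module:version?exclude=... (hypothetical type)."""
    org: str
    module: str
    version: str
    excludes: list = field(default_factory=list)  # list of (org, module) pairs


def parse_ivy(resource: str) -> IvyCoordinate:
    """Split an ivy:// resource string into coordinates and exclusions."""
    prefix = "ivy://"
    if not resource.startswith(prefix):
        raise ValueError(f"not an ivy resource: {resource}")
    coords, _, query = resource[len(prefix):].partition("?")
    parts = coords.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected org:module:version, got: {coords}")
    excludes = []
    if query:
        key, _, value = query.partition("=")
        if key != "exclude" or not value:
            raise ValueError(f"unsupported query: {query}")
        # Assumption: exclusions are "&"-separated org:module pairs.
        for item in value.split("&"):
            org, sep, module = item.partition(":")
            if not sep or not org or not module:
                raise ValueError(f"expected org:module, got: {item}")
            excludes.append((org, module))
    return IvyCoordinate(parts[0], parts[1], parts[2], excludes)
```

For example, parse_ivy("ivy://org:module:version?exclude=org1:module1") 
yields the coordinates org, module and version with a single exclusion pair 
("org1", "module1").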

2) An HTTP or HTTPS URL of the jar can be provided directly. In such a case, 
the jar will be downloaded and added to the classpath. This might be useful 
in cases where a single jar is required that is not present in the artifact 
repository but for which a download link is available.
Command:
{code}
add jar <http://xyz.com/abc.jar> 
add jar <https://xyz.com/abc.jar>
{code}

3) The jar can be added from the local filesystem. This is what Hive already 
supports with the add command. The file is already present on the filesystem 
and is simply added to the classpath.
Command:
{code}
add jar jarname
add jar file:///tmp/sample.jar
{code}

4) The jar can be added from the HDFS filesystem. 

Command: 
{code}
add jar hdfs:/user/abc/dwh-udf.jar;
add jar hdfs:///user/abc/dwh-udf.jar;
{code}

Having syntax like the commands above helps us to clearly distinguish the 
location of the jar and the method used to obtain it. Please share your 
questions and thoughts in the comments.

> Hive "add jar" command should be able to download and add jars from a 
> repository
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-9664
>                 URL: https://issues.apache.org/jira/browse/HIVE-9664
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Anant Nag
>              Labels: hive
>
> Currently Hive's "add jar" command takes a local path to the dependency jar. 
> This clutters the local file-system as users may forget to remove this jar 
> later
> It would be nice if Hive supported a Gradle like notation to download the jar 
> from a repository.
> Example:  add jar org:module:version
>         
> It should also be backward compatible and should take jar from the local 
> file-system as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
