[
https://issues.apache.org/jira/browse/SQOOP-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994019#comment-14994019
]
Jarek Jarcec Cecho commented on SQOOP-2634:
-------------------------------------------
Let's step back on this one a little bit to revisit our goals and our options, as
I feel that we're rushing the implementation without fully realizing
all the consequences.
So let me try to articulate the goals (at a high level):
# Separate classpath for each connector. Each connector should run with its
own classpath that is independent of the Sqoop Server or Hadoop/YARN classpath.
# Simple deployment - in the best-case scenario the entire connector is a single
jar that users can download and simply "put" into the configured directory. We
can relax this goal to some extent.
# Same behavior for internal/external connectors - we should not have special
code paths based on whether it's a built-in connector or not. That just
complicates the code, and since all our connectors will be "internal" we would
never properly test the code path for "external" connectors.
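To make the first goal concrete, here is a minimal sketch of what "its own classpath" could look like in plain Java. This is not the Sqoop implementation: {{ConnectorClassLoaders}} and {{isolated}} are hypothetical names, and the sketch assumes Java 9+ for {{ClassLoader.getPlatformClassLoader()}}.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Sqoop code base.
final class ConnectorClassLoaders {

  private ConnectorClassLoaders() {}

  /**
   * Builds a class loader that sees only the given jars plus the JDK's
   * platform classes. The parent is the platform loader rather than the
   * application loader, so nothing from the server classpath is inherited.
   */
  static URLClassLoader isolated(List<Path> connectorJars) {
    URL[] urls = new URL[connectorJars.size()];
    for (int i = 0; i < urls.length; i++) {
      try {
        urls[i] = connectorJars.get(i).toUri().toURL();
      } catch (MalformedURLException e) {
        throw new IllegalArgumentException("Bad jar path: " + connectorJars.get(i), e);
      }
    }
    return new URLClassLoader(urls, ClassLoader.getPlatformClassLoader());
  }
}
```

Full isolation like this is probably too strict in practice: the server and the connector still have to agree on the classes they exchange, which is exactly the open question about {{connector-sdk}}, {{sqoop-common}} and {{joda-time}}.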
Needs specific to some known connectors:
* All *JDBC connectors* will need the ability to add jars from an external
location. Since we can't ship JDBC drivers with the connector due to
incompatible licensing, users will have to supply them somehow. Currently this
is achieved by uploading all additional jars into a configured directory, from
which Sqoop automatically puts them on the classpath.
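As a rough illustration of the "configured directory" approach, the sketch below collects every jar in one directory so that it could be added to a classpath. {{ExtraJars}} and {{listJars}} are hypothetical names, and this is not how Sqoop actually scans the directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Sqoop code base.
final class ExtraJars {

  private ExtraJars() {}

  /** Returns the regular files named *.jar directly inside dir, sorted by path. */
  static List<Path> listJars(Path dir) {
    List<Path> jars = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
      for (Path candidate : stream) {
        if (Files.isRegularFile(candidate)) {
          jars.add(candidate);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Sort so that the resulting classpath order does not depend on the file system.
    Collections.sort(jars);
    return jars;
  }
}
```

With a single shared directory, every connector would see every driver; scanning one such directory per connector would keep the drivers separate.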
Open questions:
* What classes/jars should be put on the connector's classpath by the Sqoop 2
server? A couple of interesting classes/jars:
** {{joda-time}} is part of our API to exchange data (e.g. all date-time
objects passed from IDFs should be encoded in {{joda-time}})?
** {{connector-sdk}}: ?
** {{sqoop-common}}: ?
** *HDFS Connector*: Since we're running on Hadoop cluster should we use HDFS
libraries from the cluster or not?
So far discussed solutions:
* Let the user configure the classpath for each connector in the
sqoop.properties file. Further described in [design
doc|https://issues.apache.org/jira/secure/attachment/12768912/design-doc-v1.pdf].
* Have all dependencies present in a separate (configured) directory. Further
described in [design
doc|https://issues.apache.org/jira/secure/attachment/12770790/design-doc-v2.pdf].
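For the first option, a minimal sketch of reading a per-connector classpath from properties could look like this. The key prefix {{org.apache.sqoop.connector.classpath.}} and the ":" separator are assumptions made up for the example; the design docs may define something different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Sqoop code base.
final class ConnectorClasspathConfig {

  // Assumed key prefix; the connector name is appended to it.
  static final String PREFIX = "org.apache.sqoop.connector.classpath.";

  private ConnectorClasspathConfig() {}

  /** Returns the classpath entries configured for one connector, or an empty list. */
  static List<String> classpathFor(Properties props, String connectorName) {
    String value = props.getProperty(PREFIX + connectorName, "");
    List<String> entries = new ArrayList<>();
    // Assumed separator: ':' between entries; blank entries are ignored.
    for (String entry : value.split(":")) {
      String trimmed = entry.trim();
      if (!trimmed.isEmpty()) {
        entries.add(trimmed);
      }
    }
    return entries;
  }
}
```

A connector without a configured entry simply gets an empty list, so the same code path would serve built-in and external connectors.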
Couple of additional ideas to explore:
* [One jar|http://one-jar.sourceforge.net] allows creating a single jar with all
dependencies inside that jar.
* [Class-Path|https://docs.oracle.com/javase/tutorial/deployment/jar/downman.html]
in the {{MANIFEST.MF}} file might also be a viable solution.
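To illustrate the {{Class-Path}} idea, the sketch below reads that attribute from a manifest with the standard {{java.util.jar}} API. {{ManifestClassPath}} and the jar names in {{main}} are hypothetical; the entries are space-separated URLs that are resolved relative to the jar that declares them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration only; not part of the Sqoop code base.
final class ManifestClassPath {

  private ManifestClassPath() {}

  /** Returns the entries of the Class-Path main attribute, or an empty list if absent. */
  static List<String> entries(Manifest manifest) {
    String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
    List<String> result = new ArrayList<>();
    if (value != null) {
      for (String entry : value.trim().split("\\s+")) {
        if (!entry.isEmpty()) {
          result.add(entry);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    // In-memory manifest with made-up dependency names.
    String text = "Manifest-Version: 1.0\n"
        + "Class-Path: lib/joda-time.jar lib/driver.jar\n\n";
    Manifest manifest =
        new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    System.out.println(entries(manifest));
  }
}
```

Since the paths are relative to the connector jar, this would still need the dependencies laid out next to it, so it trades the "single file" goal for a declarative list.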
Would that be a fair summary [~dian.fu]?
> Sqoop2: Allow connectors to express jar dependencies
> ----------------------------------------------------
>
> Key: SQOOP-2634
> URL: https://issues.apache.org/jira/browse/SQOOP-2634
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Dian Fu
> Assignee: Dian Fu
> Fix For: 1.99.7
>
> Attachments: SQOOP-2634.001.patch, SQOOP-2634.002.patch,
> SQOOP-2634.003.patch, SQOOP-2634.004.patch, SQOOP-2634.005.patch,
> SQOOP-2634.006.patch, SQOOP-2634.007.patch, SQOOP-2634.008.patch,
> SQOOP-2634.009.patch, SQOOP-2634.010.patch, design-doc-v1.pdf,
> design-doc-v2.pdf
>
>
> Sqoop 2 already provides the ability to configure jar dependencies with the
> property "org.apache.sqoop.classpath.extra". The limitation of this property
> is that all the dependencies have to be put together; it can't express jar
> dependencies for a specific connector. That capability is useful because
> some connectors may have conflicting jar dependencies, and putting the
> dependencies of different connectors together may cause problems.