[ 
https://issues.apache.org/jira/browse/SQOOP-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14998875#comment-14998875
 ] 

Jarek Jarcec Cecho commented on SQOOP-2634:
-------------------------------------------

{quote}
There are two parameters urls and systemClasses in the constructor of 
ConnectorClassLoader. From the implementation of ConnectorClassLoader, we can 
see that only classes which are in the urls and not in the systemClasses will 
be loaded by ConnectorClassLoader. Other classes will be loaded by the parent 
classloader. 
{quote}

Thanks for the explanation [~dian.fu]. My understanding of the 
{{ConnectorClassLoader}} was that it would offer only the classes that the 
connector explicitly exposed. But it seems the behavior is a bit different: it 
first looks for classes that the connector explicitly exposed and, if they are 
not found, falls back to the Sqoop classpath.
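
A minimal sketch of the lookup order described above, assuming a child-first 
loader with a system-class exclusion list. The class name 
{{ChildFirstClassLoader}} and the prefix-based matching are illustrative 
assumptions, not the actual {{ConnectorClassLoader}} code:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/**
 * Hypothetical sketch of a child-first classloader; NOT Sqoop's
 * ConnectorClassLoader. Classes found in the connector URLs win, unless
 * their name matches a "system" prefix, in which case the parent is used.
 */
public class ChildFirstClassLoader extends URLClassLoader {
    // Assumed representation: package prefixes such as "java." or "org.apache.sqoop."
    private final List<String> systemPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    /** Returns true if the class must always be loaded by the parent. */
    public boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Step 1: look in the connector's own jars first.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not shipped by the connector; fall through to the parent.
                }
            }
            if (c == null) {
                // Step 2: standard delegation to the parent (the Sqoop classpath).
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With this ordering, a JDBC driver that is only on the parent classpath is 
still visible to the connector, because step 2 finds it after step 1 misses.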

If that is indeed the case, then I feel that a lot of things will be simpler 
for us. For example, the need in the Generic JDBC Connector to inject 
additional JDBC drivers on the connector classpath is no longer a concern, 
right? Users can do exactly what they do today: put the jars on the Sqoop 
classpath, and the connector can then use them from there. Would that make 
sense?

{quote}
IMO, We don't need to do any special things for this. As the HDFS libraries are 
part of the Hadoop cluster, so when the jobs are running on the Hadoop cluster, 
they will get these libraries automatically. Thoughts?
{quote}

Yes, with my updated understanding of the {{ConnectorClassLoader}} I agree.

Looking at the latest design document:

1) I like the idea that each connector will ship its dependencies as part of 
its jar - it will simplify everything for everyone :) However, I would advise 
against using the maven shade plugin because, to the best of my knowledge, 
this plugin will relocate the libraries (e.g. {{com.dependency.Class}} will 
become {{org.apache.connector.jarcec-connector.com.dependency.Class}}), which 
is harder to debug. I would follow the "one jar" idea of just putting all the 
dependency jars into a {{lib/}} directory inside the connector's jar and 
creating a custom classloader that is able to load them from there - what do 
you think?
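
A rough sketch of the "one jar" idea, under stated assumptions. Instead of a 
classloader that reads nested jars in place, this version takes the simpler 
route of copying every {{lib/*.jar}} entry to a directory on disk and 
returning URLs that an ordinary {{URLClassLoader}} can consume. The class and 
method names ({{NestedJarExtractor}}, {{extractLibJars}}) are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper; not part of Sqoop. */
public class NestedJarExtractor {

    /**
     * Copies every lib/*.jar entry of the connector jar into targetDir.
     * Returns the connector jar URL followed by the URLs of the copies.
     */
    public static List<URL> extractLibJars(Path connectorJar, Path targetDir) throws IOException {
        List<URL> urls = new ArrayList<>();
        urls.add(connectorJar.toUri().toURL());
        try (JarFile jar = new JarFile(connectorJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith("lib/") || !name.endsWith(".jar")) {
                    continue;
                }
                // Keep only the file name so an entry cannot write outside targetDir.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                Path out = targetDir.resolve(fileName);
                try (InputStream in = jar.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                urls.add(out.toUri().toURL());
            }
        }
        return urls;
    }
}
```

The returned list could then be passed to a loader, e.g. 
{{new URLClassLoader(urls.toArray(new URL[0]), parent)}}. Class names stay 
unrelocated, which keeps stack traces readable.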

2) With a better understanding of the {{ConnectorClassLoader}}, I would suggest:
2.1) Keep the method {{Initializer.getJars}} around to allow connectors to 
express dependencies beyond what is available in {{lib/}} (or whatever we end 
up doing for 1)).
2.2) Drop the concept of 
{{org.apache.sqoop.connector.connector-name.classpath.extra}} in favor of 
retaining {{Initializer.getJars}}.

What do you think?

> Sqoop2: Allow connectors to express jar dependencies
> ----------------------------------------------------
>
>                 Key: SQOOP-2634
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2634
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>             Fix For: 1.99.7
>
>         Attachments: SQOOP-2634.001.patch, SQOOP-2634.002.patch, 
> SQOOP-2634.003.patch, SQOOP-2634.004.patch, SQOOP-2634.005.patch, 
> SQOOP-2634.006.patch, SQOOP-2634.007.patch, SQOOP-2634.008.patch, 
> SQOOP-2634.009.patch, SQOOP-2634.010.patch, design-doc-v1.pdf, 
> design-doc-v2.pdf, design-doc-v3.pdf
>
>
> Sqoop 2 already provides the ability to configure jar dependencies with the 
> property "org.apache.sqoop.classpath.extra". The limitation of this property 
> is that all the dependencies have to be put together; it can't express jar 
> dependencies for a specific connector. That capability is useful because 
> some connectors may have conflicting jar dependencies, and putting all the 
> dependencies from different connectors together may cause problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
