> On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > connector/connector-kite/src/main/java/org/apache/sqoop/connector/kite/KiteFromInitializer.java, > > lines 71-81 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131934#file1131934line71> > > > > Aren't those dependecies on inside the connector's jar as well?
kitesdk will be inside the connector's jar. For other dependencies such as hadoop, hive, thrift, datanucleus, etc, they will be provided by the hadoop cluster and hive cluster environment separately. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > core/src/main/java/org/apache/sqoop/connector/ConnectorHandler.java, lines > > 74-76 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131940#file1131940line74> > > > > Can we abstract this "magic" to standalone method and provide proper > > test coverage? > > > > It seems fragile, so having good tests will hopefully help ensuring > > that we won't break this in the future. Good point. Updated. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > core/src/main/java/org/apache/sqoop/connector/ConnectorManagerUtils.java, > > line 55 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131941#file1131941line55> > > > > Wondering why do we need to keep the classloader dependencies around? > > Can't we just store the classloder instance inside the ConnectorHandler or > > perhaps even easier call connectorInstance.getClass().getClassLoader(). Good point. Updated the patch to use connectorInstance.getClass().getClassLoader() to get the classloader. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > core/src/main/java/org/apache/sqoop/driver/JobManager.java, lines 496-500 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131942#file1131942line496> > > > > I'm wondering if this is needed? Based on preliminary research any > > "new" call inside the initializer() method should use the Connector classes > > classloader. For "new" call, that's true that it will use the connector classes classloader. But for "ServiceLoader" api, it will load service classes with the current thread's context class loader by default[1]. [1] https://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html#load(java.lang.Class) > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java, > > line 86 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131943#file1131943line86> > > > > We're not really storing the classpath here, just the jar location. > > Also I'm wondering if this is really desirable to store since the path is > > absolute on given machine, right? 1. That's true that it's actually the jar location. Good point, have updated the configuration "JOB_CONNECTOR_FROM_CLASSPATH" to "JOB_CONNECTOR_FROM_LOCATION", also updated "JOB_CONNECTOR_TO_CLASSPATH". 2. When we submit MR job, the connector jar will not be put in the property "tmpjars" of configuration, but in the property "tmpfiles" of configuration as we don't want the connector jar in the classpath of Map/Reduce Job. We want the connector class loaded by the connector's own classloader. So we need a configuration to tell us the connector jar location. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopInputFormat.java, > > lines 62-79 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131946#file1131946line62> > > > > I think that easier and more nicer way how to do this is to load the > > SqoopConnector class with given class loader. Everything else we should > > "get" for free. As I said in the above comments, for "ServiceLoader" api, it will load the service classes with the current thread's context class loader by default[1]. So we'd better set the context classloader. Actually we did encountered such cases in kite-sdk[2]. [1] https://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html#load(java.lang.Class) [2] https://github.com/kite-sdk/kite/blob/master/kite-data/kite-data-core/src/main/java/org/kitesdk/data/spi/Registration.java > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > server/pom.xml, lines 83-88 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131953#file1131953line83> > > > > I'm wondering why do we need the exclusions? They shouldn't be > > propagated because we marked them as "provided" anyway right? We want these connectors present but the dependencies of these connectors not present as they have already being packaged in the connector jar. "provided" will make the connector jar also not present, which is not expected. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > common/src/main/resources/org.apache.sqoop.connector-classloader.properties, > > line 54 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131927#file1131927line54> > > > > Wondering what is the "-" at the begging of the line mean? File "org.apache.sqoop.connector-classloader.properties" defines which classes are consided system classes and should be loaded by the system classloader. "-" means that all classes under package "org.apache.sqoop.connector." should not be considered system class as we want these classes loaded by the connector's own classloader, not the system classloader. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > connector/connector-ftp/pom.xml, line 96 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131928#file1131928line96> > > > > Let's not hardcode the project name here - it will be hard to mantain > > and prone to errors. We should be able to get the project name from > > ${project.artifactId} or similar variable. Good point. Fixed. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > connector/connector-ftp/pom.xml, line 80 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131928#file1131928line80> > > > > Can we define the assembly plugin in root pom.xml so that we don't have > > repeat the version every time in each children pom file? We do that alredy > > for pretty much all dependencies (there are few exceptions) and it's much > > cleaner. Good point. Have addressed this in the new patch. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > connector/connector-ftp/pom.xml, lines 88-102 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131928#file1131928line88> > > > > Wondering if there is a way to somehow put this fragment to some of the > > parent pom.xml files so that we don't have to repeat it every time? > > > > Perhaps we can create new parent module "sqoop-connector" that will > > contain the packaging portion and use it for all children? Good point. Addressed in the new patch. > On Dec. 5, 2015, 12:35 p.m., Jarek Cecho wrote: > > core/src/main/java/org/apache/sqoop/connector/ConnectorManagerUtils.java, > > lines 184-200 > > <https://reviews.apache.org/r/39586/diff/9/?file=1131941#file1131941line184> > > > > I'm wondering if there is a way to avoid creating the intermediate > > files on disk? > > > > How are other similar libraries solving (One jar for example) the same > > problem? Good point. Have filed JIRA SQOOP-2635 to handle this. It takes the similar approach to One Jar. - Dian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/39586/#review108543 ----------------------------------------------------------- On Nov. 19, 2015, 12:39 p.m., Dian Fu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/39586/ > ----------------------------------------------------------- > > (Updated Nov. 19, 2015, 12:39 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-2634 > https://issues.apache.org/jira/browse/SQOOP-2634 > > > Repository: sqoop-sqoop2 > > > Description > ------- > > Currently Sqoop 2 has already provided the ability to config jar dependencies > with property "org.apache.sqoop.classpath.extra". The limitation of this > property is that we have to put all the dependencies together. It can't > express jar dependencies for a specified connector. This capacity is useful > as some connectors may have conflict jar dependencies. Put all the > dependencies from different connectors together may cause problems. > > > Diffs > ----- > > assemblies/pom.xml PRE-CREATION > assemblies/src/main/resources/assemblies/sqoop-connector.xml PRE-CREATION > common/src/main/java/org/apache/sqoop/error/code/ConnectorError.java > 2f17d95 > common/src/main/java/org/apache/sqoop/utils/ConnectorClassLoader.java > 4c42a78 > common/src/main/resources/org.apache.sqoop.connector-classloader.properties > e2936a9 > connector/connector-ftp/pom.xml 41ea026 > connector/connector-generic-jdbc/pom.xml 052f06d > connector/connector-hdfs/pom.xml 022a024 > connector/connector-kafka/pom.xml e0f0684 > > connector/connector-kafka/src/main/java/org/apache/sqoop/connector/kafka/KafkaToInitializer.java > 923d1aa > connector/connector-kite/pom.xml d8eaa8e > > connector/connector-kite/src/main/java/org/apache/sqoop/connector/kite/KiteFromInitializer.java > 28c5bac > > connector/connector-kite/src/main/java/org/apache/sqoop/connector/kite/KiteToInitializer.java > 50daba0 > connector/connector-oracle-jdbc/pom.xml 8186b3a > > connector/connector-sdk/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java > 85ba8be > connector/connector-sftp/pom.xml 312ac61 > > connector/connector-sftp/src/main/java/org/apache/sqoop/connector/sftp/SftpToInitializer.java > bfb51ac > core/src/main/java/org/apache/sqoop/connector/ConnectorHandler.java 1899bb7 > core/src/main/java/org/apache/sqoop/connector/ConnectorManagerUtils.java > 9f9be57 > core/src/main/java/org/apache/sqoop/driver/JobManager.java 15ca796 > > execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java > 3acd4a1 > execution/mapreduce/src/main/java/org/apache/sqoop/job/MRJobConstants.java > 737ceda > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/MRUtils.java > PRE-CREATION > > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopInputFormat.java > 0623f7b > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java > 7d20992 > > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java > 8c8526b > > execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java > 623d1f4 > execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java > 2463643 > execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMatching.java > d0b41d1 > pom.xml 91721ce > server/pom.xml 370a6a2 > shell/src/test/java/org/apache/sqoop/shell/TestCloneCommand.java 7a139ba > shell/src/test/java/org/apache/sqoop/shell/TestCreateCommand.java 0316a60 > shell/src/test/java/org/apache/sqoop/shell/TestUpdateCommand.java d8f061a > > submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java > f396783 > > Diff: https://reviews.apache.org/r/39586/diff/ > > > Testing > ------- > > > Thanks, > > Dian Fu > >
