Hi all,

In build.xml on branch-0.13:

1. The following jars are copied into the lib directory during packaging:

    <copy todir="${tar.dist.dir}/lib">
        <fileset dir="${ivy.lib.dir}" includes="jython-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="jruby-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="groovy-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="js-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="hbase-*.jar" excludes="hbase-*tests.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="protobuf-java-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="zookeeper-*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="accumulo-*.jar" excludes="accumulo-minicluster*.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="avro-*.jar" excludes="avro-*tests.jar"/>
        <fileset dir="${ivy.lib.dir}" includes="json-simple-*.jar"/>
    </copy>

2. The following jars are bundled into the withouthadoop jar file:

    <fileset dir="${ivy.lib.dir}" id="runtime.dependencies-withouthadoop.jar">
        <patternset id="pattern.runtime.dependencies-withouthadoop.jar">
            <include name="antlr-runtime-${antlr.version}.jar"/>
            <include name="ST4-${stringtemplate.version}.jar"/>
            <include name="jline-${jline.version}.jar"/>
            <include name="jackson-mapper-asl-${jackson.version}.jar"/>
            <include name="jackson-core-asl-${jackson.version}.jar"/>
            <include name="joda-time-${joda-time.version}.jar"/>
            <include name="guava-${guava.version}.jar"/>
            <include name="automaton-${automaton.version}.jar"/>
            <include name="jansi-${jansi.version}.jar"/>
            <include name="avro-${avro.version}.jar"/>
            <include name="avro-mapred-${avro.version}.jar"/>
            <include name="trevni-core-${avro.version}.jar"/>
            <include name="trevni-avro-${avro.version}.jar"/>
            <include name="snappy-java-${snappy.version}.jar"/>
            <include name="asm*.jar"/>
        </patternset>
    </fileset>

Questions:

1. What are the jars in #1 and #2 used for, and what is the difference between the two sets?

2. It seems all the jars in both #1 and #2 are needed when Pig runs on a Hadoop cluster. Is that correct?

3. The withouthadoop jar is the one used when Pig runs on a Hadoop cluster, and it bundles the Pig core classes together with some of the dependencies. Could we instead generate a plain Pig core jar and move all the dependencies into the lib directory? Then, in the bin/pig script, we could always add the Pig core jar to the classpath, and add the dependencies to the classpath inside the 'if [ -n "$HADOOP_BIN" ]; then' branch (see the sketch below). With this approach:

    - it would be clear to users which dependency versions they are running with
    - dependencies would be easier to maintain, e.g. version upgrades; sometimes I would like to check whether Pig works with different versions of a dependency

Do you have any concerns?
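To make the proposal concrete, here is a rough sketch of what the classpath logic in bin/pig could look like under that layout. This is only a sketch: the pig-core-*.jar name and the $PIG_HOME/lib layout are assumptions for illustration, not what the current build produces.

    # Sketch only: assumes the build produces a single pig-core-*.jar
    # and copies every runtime dependency into $PIG_HOME/lib.

    # The Pig core jar always goes on the classpath.
    for f in "$PIG_HOME"/pig-core-*.jar; do
        CLASSPATH="$CLASSPATH:$f"
    done

    if [ -n "$HADOOP_BIN" ]; then
        # Running against a real Hadoop installation: add the unbundled
        # dependencies from lib/ instead of a withouthadoop fat jar.
        for f in "$PIG_HOME"/lib/*.jar; do
            CLASSPATH="$CLASSPATH:$f"
        done
    fi

    export CLASSPATH

With this, checking Pig against a different version of a dependency would just mean swapping one jar in lib/, with no need to rebuild the withouthadoop jar.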
Thanks