[jira] [Commented] (SPARK-1698) Improve spark integration
[ https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987682#comment-13987682 ]

Guoqiang Li commented on SPARK-1698:

[~srowen] Regarding [SPARK-1681|https://issues.apache.org/jira/browse/SPARK-1681], the only solution I have found is to add the DataNucleus jars to the CLASSPATH. There may be a better solution, but I did not find one. I disagree with [PR 610|https://github.com/apache/spark/pull/610]; it is not a complete fix. [PR 598|https://github.com/apache/spark/pull/598] follows the approach of [HADOOP-7939|https://issues.apache.org/jira/browse/HADOOP-7939], which I think is better. There is [another solution|https://github.com/witgo/spark/tree/standalone], based on [Invalid or corrupt JAR File built by Maven shade plugin|http://stackoverflow.com/questions/13021423/invalid-or-corrupt-jar-file-built-by-maven-shade-plugin], but it runs into [SI-6660 REPL: load transitive dependencies of JARs on classpath|https://issues.scala-lang.org/browse/SI-6660].

Improve spark integration

Key: SPARK-1698
URL: https://issues.apache.org/jira/browse/SPARK-1698
Project: Spark
Issue Type: Improvement
Components: Build, Deploy
Reporter: Guoqiang Li
Assignee: Guoqiang Li
Fix For: 1.0.0

Using the shade plugin to create one big JAR with all the dependencies can cause a few problems:
1. The original jars' META-INF information is lost.
2. Some files are overwritten by clashes, e.g. plugin.xml.
3. Different versions of the same jar may coexist.
4. The JAR becomes too big for Java 6 to handle.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1698) Improve spark integration
[ https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987686#comment-13987686 ]

Sean Owen commented on SPARK-1698:

(Copying an earlier comment that went to the mailing list but didn't make it here:)

#1 and #2 are not relevant to the issue of jar size. These can be problems in general, but I don't think there have been issues attributable to file clashes. Shading has mechanisms to deal with this anyway.

#3 is a problem in general too, but is not specific to shading. Where versions collide, build processes like Maven and shading must be used to resolve them. This happens regardless of whether you shade a fat jar.

#4 is a real problem, specific to Java 6. It does seem like it will be important to identify and remove more unnecessary dependencies to work around it. But shading per se is not the problem, and it is important to make a packaged jar for the app.

What are you proposing? Dependencies to be removed?
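The "mechanisms" Sean mentions for handling file clashes are the shade plugin's resource transformers. A minimal sketch of how they could be configured (the plugin version and the merged resource names here are illustrative assumptions, not Spark's actual build):

```xml
<!-- Sketch: maven-shade-plugin transformers that mitigate problems 1 and 2
     above. Version and resource names are assumptions for illustration. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.2</version>
  <configuration>
    <transformers>
      <!-- Merge META-INF/services entries instead of letting one jar's copy
           overwrite another's -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
      <!-- Concatenate clashing resource files rather than keeping only one -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
        <resource>reference.conf</resource>
      </transformer>
    </transformers>
  </configuration>
</plugin>
```

Transformers only address which copy of a clashing file survives; they do nothing about problem #4, the overall size of the assembled jar.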
[jira] [Commented] (SPARK-1698) Improve spark integration
[ https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987698#comment-13987698 ]

Sean Owen commented on SPARK-1698:

What is the suggested change in this particular JIRA? I saw the PR, which seems to replace the shade plugin with the assembly plugin. Given the reference to https://issues.scala-lang.org/browse/SI-6660, are you suggesting that your assembly change packages things differently, by putting jars inside jars? The issue you link to is exactly the kind of problem that can occur with that approach; it comes up a bit in Hadoop as well, even though it is in theory a fine way to do things. Is that what you're getting at?
[jira] [Commented] (SPARK-1698) Improve spark integration
[ https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987699#comment-13987699 ]

Guoqiang Li commented on SPARK-1698:

[~srowen] With [PR 598|https://github.com/apache/spark/pull/598], #1, #2, and #4 do not occur, and #3 is very easy to detect.
[jira] [Commented] (SPARK-1698) Improve spark integration
[ https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987708#comment-13987708 ]

Guoqiang Li commented on SPARK-1698:

[~srowen] In [PR 598|https://github.com/apache/spark/pull/598], the directory structure of Spark is similar to Hadoop 2.3.0's. There are three subcomponents: core, examples, and hive; their paths are share/spark/core, share/spark/examples, and share/spark/hive.
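A layout like the one above sidesteps the fat-jar problems by leaving dependency jars unpacked in a shared directory and referencing them from the application jar's manifest instead of merging them. A minimal sketch of one way to build such a jar (the `classpathPrefix` value is a hypothetical path, not what PR 598 actually does):

```xml
<!-- Sketch: a thin application jar whose MANIFEST.MF Class-Path points at
     an unpacked dependency directory. The prefix is an assumption. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <addClasspath>true</addClasspath>
        <classpathPrefix>../jars/</classpathPrefix>
      </manifest>
    </archive>
  </configuration>
</plugin>
```

With this approach each dependency keeps its own META-INF intact (#1), nothing is overwritten by a clash (#2), and no single oversized jar is produced (#4); duplicate versions of a dependency (#3) show up as two visible files in the directory rather than being silently merged.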