[jira] [Commented] (SPARK-1698) Improve spark integration

2014-05-02 Thread Guoqiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987682#comment-13987682
 ] 

Guoqiang Li commented on SPARK-1698:


[~srowen]
For [SPARK-1681|https://issues.apache.org/jira/browse/SPARK-1681], the only 
solution I have found is to add the datanucleus jars to the CLASSPATH.
There may be a better solution, but I have not found one.

I disagree with [PR 610|https://github.com/apache/spark/pull/610]; it is not 
perfect.

[PR 598|https://github.com/apache/spark/pull/598] references 
[HADOOP-7939|https://issues.apache.org/jira/browse/HADOOP-7939], and I think 
that is better.

There is [another solution|https://github.com/witgo/spark/tree/standalone], 
which references [Invalid or corrupt JAR File built by Maven shade 
plugin|http://stackoverflow.com/questions/13021423/invalid-or-corrupt-jar-file-built-by-maven-shade-plugin].
However, it runs into [SI-6660 REPL: load transitive dependencies of JARs on 
classpath|https://issues.scala-lang.org/browse/SI-6660].


 Improve spark integration
 -

 Key: SPARK-1698
 URL: https://issues.apache.org/jira/browse/SPARK-1698
 Project: Spark
  Issue Type: Improvement
  Components: Build, Deploy
Reporter: Guoqiang Li
Assignee: Guoqiang Li
 Fix For: 1.0.0


 Using the shade plugin to create a big JAR with all the dependencies can cause 
 a few problems:
 1. The jars' meta information is lost
 2. Some files are overwritten, e.g. plugin.xml
 3. Different versions of the same jar may co-exist
 4. The JAR is too big for Java 6 to support



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (SPARK-1698) Improve spark integration

2014-05-02 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987686#comment-13987686
 ] 

Sean Owen commented on SPARK-1698:
--

(Copying an earlier comment that went to the mailing list, but didn't make it 
here:)

#1 and #2 are not relevant to the issue of jar size. These can be problems in 
general, but I don't think there have been issues attributable to file clashes. 
Shading has mechanisms to deal with this anyway.
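
To check whether file clashes such as the plugin.xml case in #2 actually occur for a given set of jars, a small script can list the entry names that appear in more than one jar. This is only a sketch: the function name `find_clashing_entries` is hypothetical, and it reports name collisions without saying which copy a shaded jar would keep.

```python
import zipfile
from collections import defaultdict


def find_clashing_entries(jar_paths):
    """Map each entry name found in more than one jar to the jars containing it."""
    owners = defaultdict(list)
    for jar_path in jar_paths:
        with zipfile.ZipFile(jar_path) as jar:
            # Use a set so an entry repeated inside one jar is counted once.
            for name in sorted(set(jar.namelist())):
                # Skip directory entries; only files can overwrite each other.
                if not name.endswith("/"):
                    owners[name].append(jar_path)
    # Keep only the names that occur in at least two different jars.
    return {name: jars for name, jars in owners.items() if len(jars) > 1}
```

On real dependency jars, entries such as META-INF/MANIFEST.MF will normally show up in every jar, so the output is more useful once those expected collisions are filtered out.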

#3 is a problem in general too, but is not specific to shading. Where versions 
collide, build processes like Maven and shading must be used to resolve them. 
But this happens regardless of whether you shade a fat jar.

#4 is a real problem specific to Java 6. It does seem like it will be important 
to identify and remove more unnecessary dependencies to work around it.
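
As a rough way to see how close an assembly jar is to the problem in #4, the entries can be counted. The sketch below assumes the limit in question is the 65535-entry ceiling of the classic ZIP format (without ZIP64 extensions), which the thread itself does not spell out; the constant and function names are hypothetical.

```python
import zipfile

# Assumption: the classic ZIP format stores the entry count in a 16-bit field,
# so archives with more entries need ZIP64 extensions to be read.
CLASSIC_ZIP_MAX_ENTRIES = 65535


def exceeds_classic_zip_limit(jar_path):
    """Return (entry_count, too_many) for the given jar."""
    with zipfile.ZipFile(jar_path) as jar:
        count = len(jar.infolist())
    return count, count > CLASSIC_ZIP_MAX_ENTRIES
```

Running this before and after removing a dependency gives a quick measure of how much each removal helps.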

But shading per se is not the problem, and it is important to make a packaged 
jar for the app. What are you proposing? Dependencies to be removed?



[jira] [Commented] (SPARK-1698) Improve spark integration

2014-05-02 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987698#comment-13987698
 ] 

Sean Owen commented on SPARK-1698:
--

What is the suggested change in this particular JIRA? I saw the PR, which seems 
to replace the shade plugin with the assembly plugin. Given the reference to 
https://issues.scala-lang.org/browse/SI-6660, are you suggesting that your 
assembly change packages differently, by putting jars in jars? Yes, the issue 
you link to is exactly the kind of problem that can occur with this approach. 
It comes up a bit in Hadoop as well, even though it is in theory a fine way to 
do things. But is that what you're getting at?



[jira] [Commented] (SPARK-1698) Improve spark integration

2014-05-02 Thread Guoqiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987699#comment-13987699
 ] 

Guoqiang Li commented on SPARK-1698:


[~srowen]
In [PR 598|https://github.com/apache/spark/pull/598], #1, #2 and #4 do not 
occur, and #3 is very easy to spot.



[jira] [Commented] (SPARK-1698) Improve spark integration

2014-05-02 Thread Guoqiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987708#comment-13987708
 ] 

Guoqiang Li commented on SPARK-1698:


[~srowen]
In [PR 598|https://github.com/apache/spark/pull/598], the directory structure 
of Spark is similar to that of Hadoop 2.3.0.
There are three subcomponents: core, examples and hive. Their paths are 
share/spark/core, share/spark/examples and share/spark/hive.
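
With separate jars laid out this way, a launcher has to assemble the classpath itself. The following is a minimal sketch of that step, not the code from the PR: the function name `build_classpath` is hypothetical, and it assumes the jars sit directly inside each subcomponent directory.

```python
import glob
import os

# Subcomponent names taken from the comment above.
COMPONENTS = ("core", "examples", "hive")


def build_classpath(spark_home, components=COMPONENTS):
    """Join every jar under share/spark/<component> into a CLASSPATH string."""
    jars = []
    for component in components:
        pattern = os.path.join(spark_home, "share", "spark", component, "*.jar")
        # Sort so the resulting classpath order is deterministic.
        jars.extend(sorted(glob.glob(pattern)))
    return os.pathsep.join(jars)
```

Because each dependency stays in its own jar, duplicate versions show up as duplicate file names in these directories rather than being merged silently.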
