Re: Spark 3.2.4 pom NOT FOUND on maven

2023-04-21 Thread Enrico Minack

Hi Dongjoon,

thanks for confirmation.

I have added the Apache release repository to my project, so it fetches 
the jars from there instead of from Maven Central.
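
For anyone else hit by this, a minimal sketch of such a repository entry in a 
Maven pom.xml (the repository id is arbitrary; remove the entry once Maven 
Central serves the pom again):

```
<!-- Temporary workaround: resolve release artifacts from the Apache release
     repository while Maven Central is missing the spark-parent_2.13 pom. -->
<repositories>
  <repository>
    <id>apache-releases</id>
    <url>https://repository.apache.org/content/repositories/releases/</url>
  </repository>
</repositories>
```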


That is a great workaround until Maven Central has resolved the issue.

Cheers,
Enrico


On 19.04.23 at 03:04, Dongjoon Hyun wrote:

Thank you for reporting, Enrico.

I verified your issue report and also double-checked that both the original 
official Apache repository and the Google Maven Mirror work correctly. Given that, 
it could be due to some transient issue in Maven Central itself, because the 
artifacts are copied from the Apache repository to Maven Central and then to the 
Google Mirror, and the copy evidently worked fine all the way through to the Google Mirror.

1) 
https://repository.apache.org/content/repositories/releases/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom

2) 
https://maven-central.storage-download.googleapis.com/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom

You may want to use (1) and (2) repositories temporarily while waiting for 
`repo1.maven.org`'s recovery.

Dongjoon.


On 2023/04/18 05:38:59 Enrico Minack wrote:

Any suggestions on how to fix or use the Spark 3.2.4 (Scala 2.13) release?

Cheers,
Enrico


On 17.04.23 at 08:19, Enrico Minack wrote:

Hi,

thanks for the Spark 3.2.4 release.

I have found that Maven Central does not serve the spark-parent_2.13 pom file.
It is listed in the directory:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/

But it cannot be downloaded:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom


The 2.12 file is fine:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.12/3.2.4/spark-parent_2.12-3.2.4.pom


Any chance this can be fixed?

Cheers,
Enrico















Re: Spark 3.2.4 pom NOT FOUND on maven

2023-04-18 Thread Dongjoon Hyun
Thank you for reporting, Enrico.

I verified your issue report and also double-checked that both the original 
official Apache repository and the Google Maven Mirror work correctly. Given that, 
it could be due to some transient issue in Maven Central itself, because the 
artifacts are copied from the Apache repository to Maven Central and then to the 
Google Mirror, and the copy evidently worked fine all the way through to the Google Mirror.

1) 
https://repository.apache.org/content/repositories/releases/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom

2) 
https://maven-central.storage-download.googleapis.com/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom

You may want to use (1) and (2) repositories temporarily while waiting for 
`repo1.maven.org`'s recovery.

Dongjoon.


On 2023/04/18 05:38:59 Enrico Minack wrote:
> Any suggestions on how to fix or use the Spark 3.2.4 (Scala 2.13) release?
> 
> Cheers,
> Enrico
> 
> 
> On 17.04.23 at 08:19, Enrico Minack wrote:
> > Hi,
> >
> > thanks for the Spark 3.2.4 release.
> >
> > I have found that Maven Central does not serve the spark-parent_2.13 pom file.
> > It is listed in the directory:
> > https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/
> >
> > But it cannot be downloaded:
> > https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom
> >  
> >
> >
> > The 2.12 file is fine:
> > https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.12/3.2.4/spark-parent_2.12-3.2.4.pom
> >  
> >
> >
> > Any chance this can be fixed?
> >
> > Cheers,
> > Enrico
> >
> >
> >
> 
> 
> 
> 




Re: Spark 3.2.4 pom NOT FOUND on maven

2023-04-17 Thread Enrico Minack

Any suggestions on how to fix or use the Spark 3.2.4 (Scala 2.13) release?

Cheers,
Enrico


On 17.04.23 at 08:19, Enrico Minack wrote:

Hi,

thanks for the Spark 3.2.4 release.

I have found that Maven Central does not serve the spark-parent_2.13 pom file.
It is listed in the directory:

https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/

But it cannot be downloaded:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom 



The 2.12 file is fine:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.12/3.2.4/spark-parent_2.12-3.2.4.pom 



Any chance this can be fixed?

Cheers,
Enrico









Spark 3.2.4 pom NOT FOUND on maven

2023-04-17 Thread Enrico Minack

Hi,

thanks for the Spark 3.2.4 release.

I have found that Maven Central does not serve the spark-parent_2.13 pom file.
It is listed in the directory:

https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/

But it cannot be downloaded:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.13/3.2.4/spark-parent_2.13-3.2.4.pom

The 2.12 file is fine:
https://repo1.maven.org/maven2/org/apache/spark/spark-parent_2.12/3.2.4/spark-parent_2.12-3.2.4.pom

Any chance this can be fixed?

Cheers,
Enrico





Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-08 Thread Steve Loughran
I think the Scala plugin upgrade may be good, but the Bouncy Castle one is
needed. 1.70 is the most recent, AFAIK.
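
Putting the two suggestions together, a rough sketch of what the pinned test
dependencies could look like; 1.70 is only the "most recent" guess above, and
Spark's actual pom normally manages such versions centrally rather than inline:

```
<!-- Sketch only: Yang Jie's two test dependencies with an illustrative
     Bouncy Castle version pin; adapt to how the Spark pom manages versions. -->
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk15on</artifactId>
  <version>1.70</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcpkix-jdk15on</artifactId>
  <version>1.70</version>
  <scope>test</scope>
</dependency>
```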



On Thu, 8 Dec 2022 at 06:59, Yang,Jie(INF)  wrote:

> Steve, after some investigation, I think this problem may not be related to
> `scala-maven-plugin`. We can add the following two test dependencies to the
> `sql/core` module to make the mvn build succeed:
>
>
>
> ```
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcprov-jdk15on</artifactId>
>   <scope>test</scope>
> </dependency>
> <dependency>
>   <groupId>org.bouncycastle</groupId>
>   <artifactId>bcpkix-jdk15on</artifactId>
>   <scope>test</scope>
> </dependency>
> ```
>
>
>
> Yang Jie
>
>
>
> *From:* "Yang,Jie(INF)" 
> *Date:* Tuesday, December 6, 2022, 18:27
> *To:* Steve Loughran 
> *Cc:* Hyukjin Kwon , Apache Spark Dev <
> dev@spark.apache.org>
> *Subject:* Re: maven build failing in spark sql w/BouncyCastleProvider CNFE
>
>
>
> I think we can try scala-maven-plugin 4.8.0
>
>
>
> *From:* Steve Loughran 
> *Date:* Tuesday, December 6, 2022, 18:19
> *To:* "Yang,Jie(INF)" 
> *Cc:* Hyukjin Kwon , Apache Spark Dev <
> dev@spark.apache.org>
> *Subject:* Re: maven build failing in spark sql w/BouncyCastleProvider CNFE
>
>
>
>
>
>
>
> On Tue, 6 Dec 2022 at 04:10, Yang,Jie(INF)  wrote:
>
> Steve, did the compile failure happen when building Spark master with mvn
> against hadoop 3.4.0-SNAPSHOT?
>
>
>
> yes. doesn't happen with
>
> * branch-3.3 snapshot (3.3.9-SNAPSHOT)
>
> * branch-3.3.5 RC0 "pre-rc" in asf staging.
>
>
>
> maybe the 4.8.0 plugin would be worth trying... not something I'll do this
> week, as I'm really trying to get the RC0 out rather than anything else
>
>
>
>
>
> *From:* Hyukjin Kwon 
> *Date:* Tuesday, December 6, 2022, 10:27
> *Cc:* Apache Spark Dev 
> *Subject:* Re: maven build failing in spark sql w/BouncyCastleProvider CNFE
>
>
>
> Steve, does the lower version of the Scala plugin work for you? If that
> solves it, we could temporarily downgrade for now.
>
>
>
> On Mon, 5 Dec 2022 at 22:23, Steve Loughran 
> wrote:
>
>  trying to build spark master w/ hadoop trunk and the maven sbt plugin is
> failing. This doesn't happen with the 3.3.5 RC0;
>
>
>
> I note that the only mention of this anywhere was me in march.
>
>
>
> clearly something in hadoop trunk has changed in a way which is
> incompatible.
>
>
>
> Has anyone else tried such a build/seen this problem? any suggestions of a
> fix?
>
>
>
> Created SPARK-41392 to cover this...
>
>
>
> [INFO]
> 
> [ERROR] Failed to execute goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile
> (scala-test-compile-first) on project spark-sql_2.12: Execution
> scala-test-compile-first of goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required
> class was missing while executing
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile:
> org/bouncycastle/jce/provider/BouncyCastleProvider
> [
>
>


Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-07 Thread Yang,Jie(INF)
Steve, after some investigation, I think this problem may not be related to 
`scala-maven-plugin`. We can add the following two test dependencies to the 
`sql/core` module to make the mvn build succeed:

```
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk15on</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcpkix-jdk15on</artifactId>
  <scope>test</scope>
</dependency>
```

Yang Jie

From: "Yang,Jie(INF)" 
Date: Tuesday, December 6, 2022, 18:27
To: Steve Loughran 
Cc: Hyukjin Kwon , Apache Spark Dev 
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

I think we can try scala-maven-plugin 4.8.0

From: Steve Loughran 
Date: Tuesday, December 6, 2022, 18:19
To: "Yang,Jie(INF)" 
Cc: Hyukjin Kwon , Apache Spark Dev 
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE



On Tue, 6 Dec 2022 at 04:10, Yang,Jie(INF) 
mailto:yangji...@baidu.com>> wrote:
Steve, did the compile failure happen when building Spark master with mvn 
against hadoop 3.4.0-SNAPSHOT?

yes. doesn't happen with
* branch-3.3 snapshot (3.3.9-SNAPSHOT)
* branch-3.3.5 RC0 "pre-rc" in asf staging.

maybe the 4.8.0 plugin would be worth trying... not something I'll do this 
week, as I'm really trying to get the RC0 out rather than anything else


From: Hyukjin Kwon mailto:gurwls...@gmail.com>>
Date: Tuesday, December 6, 2022, 10:27
Cc: Apache Spark Dev mailto:dev@spark.apache.org>>
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

Steve, does the lower version of the Scala plugin work for you? If that solves 
it, we could temporarily downgrade for now.

On Mon, 5 Dec 2022 at 22:23, Steve Loughran  wrote:
 trying to build spark master w/ hadoop trunk and the maven sbt plugin is 
failing. This doesn't happen with the 3.3.5 RC0;

I note that the only mention of this anywhere was me in march.

clearly something in hadoop trunk has changed in a way which is incompatible.

Has anyone else tried such a build/seen this problem? any suggestions of a fix?

Created SPARK-41392 to cover this...

[INFO] 
[ERROR] Failed to execute goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile 
(scala-test-compile-first) on project spark-sql_2.12: Execution 
scala-test-compile-first of goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required 
class was missing while executing 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile: 
org/bouncycastle/jce/provider/BouncyCastleProvider
[


Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-06 Thread Yang,Jie(INF)
I think we can try scala-maven-plugin 4.8.0

From: Steve Loughran 
Date: Tuesday, December 6, 2022, 18:19
To: "Yang,Jie(INF)" 
Cc: Hyukjin Kwon , Apache Spark Dev 
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE



On Tue, 6 Dec 2022 at 04:10, Yang,Jie(INF) 
mailto:yangji...@baidu.com>> wrote:
Steve, did the compile failure happen when building Spark master with mvn 
against hadoop 3.4.0-SNAPSHOT?

yes. doesn't happen with
* branch-3.3 snapshot (3.3.9-SNAPSHOT)
* branch-3.3.5 RC0 "pre-rc" in asf staging.

maybe the 4.8.0 plugin would be worth trying... not something I'll do this 
week, as I'm really trying to get the RC0 out rather than anything else


From: Hyukjin Kwon mailto:gurwls...@gmail.com>>
Date: Tuesday, December 6, 2022, 10:27
Cc: Apache Spark Dev mailto:dev@spark.apache.org>>
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

Steve, does the lower version of the Scala plugin work for you? If that solves 
it, we could temporarily downgrade for now.

On Mon, 5 Dec 2022 at 22:23, Steve Loughran  wrote:
 trying to build spark master w/ hadoop trunk and the maven sbt plugin is 
failing. This doesn't happen with the 3.3.5 RC0;

I note that the only mention of this anywhere was me in march.

clearly something in hadoop trunk has changed in a way which is incompatible.

Has anyone else tried such a build/seen this problem? any suggestions of a fix?

Created SPARK-41392 to cover this...

[INFO] 
[ERROR] Failed to execute goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile 
(scala-test-compile-first) on project spark-sql_2.12: Execution 
scala-test-compile-first of goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required 
class was missing while executing 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile: 
org/bouncycastle/jce/provider/BouncyCastleProvider
[


Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-06 Thread Steve Loughran
On Tue, 6 Dec 2022 at 04:10, Yang,Jie(INF)  wrote:

> Steve, did the compile failure happen when building Spark master with mvn
> against hadoop 3.4.0-SNAPSHOT?
>

yes. doesn't happen with
* branch-3.3 snapshot (3.3.9-SNAPSHOT)
* branch-3.3.5 RC0 "pre-rc" in asf staging.

maybe the 4.8.0 plugin would be worth trying... not something I'll do this
week, as I'm really trying to get the RC0 out rather than anything else


>
>
> *From:* Hyukjin Kwon 
> *Date:* Tuesday, December 6, 2022, 10:27
> *Cc:* Apache Spark Dev 
> *Subject:* Re: maven build failing in spark sql w/BouncyCastleProvider CNFE
>
>
>
> Steve, does the lower version of the Scala plugin work for you? If that
> solves it, we could temporarily downgrade for now.
>
>
>
> On Mon, 5 Dec 2022 at 22:23, Steve Loughran 
> wrote:
>
>  trying to build spark master w/ hadoop trunk and the maven sbt plugin is
> failing. This doesn't happen with the 3.3.5 RC0;
>
>
>
> I note that the only mention of this anywhere was me in march.
>
>
>
> clearly something in hadoop trunk has changed in a way which is
> incompatible.
>
>
>
> Has anyone else tried such a build/seen this problem? any suggestions of a
> fix?
>
>
>
> Created SPARK-41392 to cover this...
>
>
>
> [INFO]
> 
> [ERROR] Failed to execute goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile
> (scala-test-compile-first) on project spark-sql_2.12: Execution
> scala-test-compile-first of goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required
> class was missing while executing
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile:
> org/bouncycastle/jce/provider/BouncyCastleProvider
> [
>
>


Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-05 Thread Yang,Jie(INF)
Steve, did the compile failure happen when building Spark master with mvn 
against hadoop 3.4.0-SNAPSHOT?

From: Hyukjin Kwon 
Date: Tuesday, December 6, 2022, 10:27
Cc: Apache Spark Dev 
Subject: Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

Steve, does the lower version of the Scala plugin work for you? If that solves 
it, we could temporarily downgrade for now.

On Mon, 5 Dec 2022 at 22:23, Steve Loughran  wrote:
 trying to build spark master w/ hadoop trunk and the maven sbt plugin is 
failing. This doesn't happen with the 3.3.5 RC0;

I note that the only mention of this anywhere was me in march.

clearly something in hadoop trunk has changed in a way which is incompatible.

Has anyone else tried such a build/seen this problem? any suggestions of a fix?

Created SPARK-41392 to cover this...

[INFO] 
[ERROR] Failed to execute goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile 
(scala-test-compile-first) on project spark-sql_2.12: Execution 
scala-test-compile-first of goal 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required 
class was missing while executing 
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile: 
org/bouncycastle/jce/provider/BouncyCastleProvider
[ERROR] -
[ERROR] realm =plugin>net.alchim31.maven:scala-maven-plugin:4.7.2
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = 
file:/Users/stevel/.m2/repository/net/alchim31/maven/scala-maven-plugin/4.7.2/scala-maven-plugin-4.7.2.jar
[ERROR] urls[1] = 
file:/Users/stevel/.m2/repository/org/apache/maven/shared/maven-dependency-tree/3.2.0/maven-dependency-tree-3.2.0.jar
[ERROR] urls[2] = 
file:/Users/stevel/.m2/repository/org/eclipse/aether/aether-util/1.0.0.v20140518/aether-util-1.0.0.v20140518.jar
[ERROR] urls[3] = 
file:/Users/stevel/.m2/repository/org/apache/maven/reporting/maven-reporting-api/3.1.1/maven-reporting-api-3.1.1.jar
[ERROR] urls[4] = 
file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.11.1/doxia-sink-api-1.11.1.jar
[ERROR] urls[5] = 
file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.11.1/doxia-logging-api-1.11.1.jar
[ERROR] urls[6] = 
file:/Users/stevel/.m2/repository/org/apache/maven/maven-archiver/3.6.0/maven-archiver-3.6.0.jar
[ERROR] urls[7] = 
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-io/3.4.0/plexus-io-3.4.0.jar
[ERROR] urls[8] = 
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.26/plexus-interpolation-1.26.jar
[ERROR] urls[9] = 
file:/Users/stevel/.m2/repository/org/apache/commons/commons-exec/1.3/commons-exec-1.3.jar
[ERROR] urls[10] = 
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-utils/3.4.2/plexus-utils-3.4.2.jar
[ERROR] urls[11] = 
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-archiver/4.5.0/plexus-archiver-4.5.0.jar
[ERROR] urls[12] = 
file:/Users/stevel/.m2/repository/commons-io/commons-io/2.11.0/commons-io-2.11.0.jar
[ERROR] urls[13] = 
file:/Users/stevel/.m2/repository/org/apache/commons/commons-compress/1.21/commons-compress-1.21.jar
[ERROR] urls[14] = 
file:/Users/stevel/.m2/repository/org/iq80/snappy/snappy/0.4/snappy-0.4.jar
[ERROR] urls[15] = 
file:/Users/stevel/.m2/repository/org/tukaani/xz/1.9/xz-1.9.jar
[ERROR] urls[16] = 
file:/Users/stevel/.m2/repository/com/github/luben/zstd-jni/1.5.2-4/zstd-jni-1.5.2-4.jar
[ERROR] urls[17] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc_2.13/1.7.1/zinc_2.13-1.7.1.jar
[ERROR] urls[18] = 
file:/Users/stevel/.m2/repository/org/scala-lang/scala-library/2.13.8/scala-library-2.13.8.jar
[ERROR] urls[19] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-core_2.13/1.7.1/zinc-core_2.13-1.7.1.jar
[ERROR] urls[20] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-apiinfo_2.13/1.7.1/zinc-apiinfo_2.13-1.7.1.jar
[ERROR] urls[21] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-bridge_2.13/1.7.1/compiler-bridge_2.13-1.7.1.jar
[ERROR] urls[22] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-classpath_2.13/1.7.1/zinc-classpath_2.13-1.7.1.jar
[ERROR] urls[23] = 
file:/Users/stevel/.m2/repository/org/scala-lang/scala-compiler/2.13.8/scala-compiler-2.13.8.jar
[ERROR] urls[24] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-interface/1.7.1/compiler-interface-1.7.1.jar
[ERROR] urls[25] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/util-interface/1.7.0/util-interface-1.7.0.jar
[ERROR] urls[26] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-persist-core-assembly/1.7.1/zinc-persist-core-assembly-1.7.1.jar
[ERROR] urls[27] = 
file:/Users/stevel/.m2/repository/org/scala-lang/modules/scala-parallel-collections_2.13/0.2.0/scala-parallel-collections_2.13-0.2.0.jar
[ERROR] urls[28] = 
file:/Users/stevel/.m2/repository/org/scala-sbt/io_2.13/1.7.0/io_2.13-1.7.0.jar
[ERROR] urls[29] = 
file:/Users/stevel/.m2/repository/com/swo

Re: maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-05 Thread Hyukjin Kwon
Steve, does the lower version of the Scala plugin work for you? If that solves
it, we could temporarily downgrade for now.

On Mon, 5 Dec 2022 at 22:23, Steve Loughran 
wrote:

>  trying to build spark master w/ hadoop trunk and the maven sbt plugin is
> failing. This doesn't happen with the 3.3.5 RC0;
>
> I note that the only mention of this anywhere was me in march.
>
> clearly something in hadoop trunk has changed in a way which is
> incompatible.
>
> Has anyone else tried such a build/seen this problem? any suggestions of a
> fix?
>
> Created SPARK-41392 to cover this...
>
> [INFO]
> 
> [ERROR] Failed to execute goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile
> (scala-test-compile-first) on project spark-sql_2.12: Execution
> scala-test-compile-first of goal
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required
> class was missing while executing
> net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile:
> org/bouncycastle/jce/provider/BouncyCastleProvider
> [ERROR] -----
> [ERROR] realm =plugin>net.alchim31.maven:scala-maven-plugin:4.7.2
> [ERROR] strategy =
> org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] =
> file:/Users/stevel/.m2/repository/net/alchim31/maven/scala-maven-plugin/4.7.2/scala-maven-plugin-4.7.2.jar
> [ERROR] urls[1] =
> file:/Users/stevel/.m2/repository/org/apache/maven/shared/maven-dependency-tree/3.2.0/maven-dependency-tree-3.2.0.jar
> [ERROR] urls[2] =
> file:/Users/stevel/.m2/repository/org/eclipse/aether/aether-util/1.0.0.v20140518/aether-util-1.0.0.v20140518.jar
> [ERROR] urls[3] =
> file:/Users/stevel/.m2/repository/org/apache/maven/reporting/maven-reporting-api/3.1.1/maven-reporting-api-3.1.1.jar
> [ERROR] urls[4] =
> file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.11.1/doxia-sink-api-1.11.1.jar
> [ERROR] urls[5] =
> file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.11.1/doxia-logging-api-1.11.1.jar
> [ERROR] urls[6] =
> file:/Users/stevel/.m2/repository/org/apache/maven/maven-archiver/3.6.0/maven-archiver-3.6.0.jar
> [ERROR] urls[7] =
> file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-io/3.4.0/plexus-io-3.4.0.jar
> [ERROR] urls[8] =
> file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.26/plexus-interpolation-1.26.jar
> [ERROR] urls[9] =
> file:/Users/stevel/.m2/repository/org/apache/commons/commons-exec/1.3/commons-exec-1.3.jar
> [ERROR] urls[10] =
> file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-utils/3.4.2/plexus-utils-3.4.2.jar
> [ERROR] urls[11] =
> file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-archiver/4.5.0/plexus-archiver-4.5.0.jar
> [ERROR] urls[12] =
> file:/Users/stevel/.m2/repository/commons-io/commons-io/2.11.0/commons-io-2.11.0.jar
> [ERROR] urls[13] =
> file:/Users/stevel/.m2/repository/org/apache/commons/commons-compress/1.21/commons-compress-1.21.jar
> [ERROR] urls[14] =
> file:/Users/stevel/.m2/repository/org/iq80/snappy/snappy/0.4/snappy-0.4.jar
> [ERROR] urls[15] =
> file:/Users/stevel/.m2/repository/org/tukaani/xz/1.9/xz-1.9.jar
> [ERROR] urls[16] =
> file:/Users/stevel/.m2/repository/com/github/luben/zstd-jni/1.5.2-4/zstd-jni-1.5.2-4.jar
> [ERROR] urls[17] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/zinc_2.13/1.7.1/zinc_2.13-1.7.1.jar
> [ERROR] urls[18] =
> file:/Users/stevel/.m2/repository/org/scala-lang/scala-library/2.13.8/scala-library-2.13.8.jar
> [ERROR] urls[19] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-core_2.13/1.7.1/zinc-core_2.13-1.7.1.jar
> [ERROR] urls[20] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-apiinfo_2.13/1.7.1/zinc-apiinfo_2.13-1.7.1.jar
> [ERROR] urls[21] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-bridge_2.13/1.7.1/compiler-bridge_2.13-1.7.1.jar
> [ERROR] urls[22] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-classpath_2.13/1.7.1/zinc-classpath_2.13-1.7.1.jar
> [ERROR] urls[23] =
> file:/Users/stevel/.m2/repository/org/scala-lang/scala-compiler/2.13.8/scala-compiler-2.13.8.jar
> [ERROR] urls[24] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-interface/1.7.1/compiler-interface-1.7.1.jar
> [ERROR] urls[25] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/util-interface/1.7.0/util-interface-1.7.0.jar
> [ERROR] urls[26] =
> file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-persist-core-assembly/1.7.1/zinc-persist-core-assembly-1.7.1.jar
> [ERROR] urls[27] =
> file:/Users/stevel/.m2/repository/org/scala-lang/modules/scala-parallel-collections_2.13/0.2.0/scala-parallel-collections_2.13-0.2.0.jar
> [ERROR]

maven build failing in spark sql w/BouncyCastleProvider CNFE

2022-12-05 Thread Steve Loughran
Trying to build spark master w/ hadoop trunk, and the maven sbt plugin is
failing. This doesn't happen with the 3.3.5 RC0.

I note that the only mention of this anywhere was me in March.

Clearly something in hadoop trunk has changed in a way which is
incompatible.

Has anyone else tried such a build/seen this problem? Any suggestions for a
fix?

Created SPARK-41392 to cover this...

[INFO]

[ERROR] Failed to execute goal
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile
(scala-test-compile-first) on project spark-sql_2.12: Execution
scala-test-compile-first of goal
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile failed: A required
class was missing while executing
net.alchim31.maven:scala-maven-plugin:4.7.2:testCompile:
org/bouncycastle/jce/provider/BouncyCastleProvider
[ERROR] -
[ERROR] realm =plugin>net.alchim31.maven:scala-maven-plugin:4.7.2
[ERROR] strategy =
org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] =
file:/Users/stevel/.m2/repository/net/alchim31/maven/scala-maven-plugin/4.7.2/scala-maven-plugin-4.7.2.jar
[ERROR] urls[1] =
file:/Users/stevel/.m2/repository/org/apache/maven/shared/maven-dependency-tree/3.2.0/maven-dependency-tree-3.2.0.jar
[ERROR] urls[2] =
file:/Users/stevel/.m2/repository/org/eclipse/aether/aether-util/1.0.0.v20140518/aether-util-1.0.0.v20140518.jar
[ERROR] urls[3] =
file:/Users/stevel/.m2/repository/org/apache/maven/reporting/maven-reporting-api/3.1.1/maven-reporting-api-3.1.1.jar
[ERROR] urls[4] =
file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.11.1/doxia-sink-api-1.11.1.jar
[ERROR] urls[5] =
file:/Users/stevel/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.11.1/doxia-logging-api-1.11.1.jar
[ERROR] urls[6] =
file:/Users/stevel/.m2/repository/org/apache/maven/maven-archiver/3.6.0/maven-archiver-3.6.0.jar
[ERROR] urls[7] =
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-io/3.4.0/plexus-io-3.4.0.jar
[ERROR] urls[8] =
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.26/plexus-interpolation-1.26.jar
[ERROR] urls[9] =
file:/Users/stevel/.m2/repository/org/apache/commons/commons-exec/1.3/commons-exec-1.3.jar
[ERROR] urls[10] =
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-utils/3.4.2/plexus-utils-3.4.2.jar
[ERROR] urls[11] =
file:/Users/stevel/.m2/repository/org/codehaus/plexus/plexus-archiver/4.5.0/plexus-archiver-4.5.0.jar
[ERROR] urls[12] =
file:/Users/stevel/.m2/repository/commons-io/commons-io/2.11.0/commons-io-2.11.0.jar
[ERROR] urls[13] =
file:/Users/stevel/.m2/repository/org/apache/commons/commons-compress/1.21/commons-compress-1.21.jar
[ERROR] urls[14] =
file:/Users/stevel/.m2/repository/org/iq80/snappy/snappy/0.4/snappy-0.4.jar
[ERROR] urls[15] =
file:/Users/stevel/.m2/repository/org/tukaani/xz/1.9/xz-1.9.jar
[ERROR] urls[16] =
file:/Users/stevel/.m2/repository/com/github/luben/zstd-jni/1.5.2-4/zstd-jni-1.5.2-4.jar
[ERROR] urls[17] =
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc_2.13/1.7.1/zinc_2.13-1.7.1.jar
[ERROR] urls[18] =
file:/Users/stevel/.m2/repository/org/scala-lang/scala-library/2.13.8/scala-library-2.13.8.jar
[ERROR] urls[19] =
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-core_2.13/1.7.1/zinc-core_2.13-1.7.1.jar
[ERROR] urls[20] =
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-apiinfo_2.13/1.7.1/zinc-apiinfo_2.13-1.7.1.jar
[ERROR] urls[21] =
file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-bridge_2.13/1.7.1/compiler-bridge_2.13-1.7.1.jar
[ERROR] urls[22] =
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-classpath_2.13/1.7.1/zinc-classpath_2.13-1.7.1.jar
[ERROR] urls[23] =
file:/Users/stevel/.m2/repository/org/scala-lang/scala-compiler/2.13.8/scala-compiler-2.13.8.jar
[ERROR] urls[24] =
file:/Users/stevel/.m2/repository/org/scala-sbt/compiler-interface/1.7.1/compiler-interface-1.7.1.jar
[ERROR] urls[25] =
file:/Users/stevel/.m2/repository/org/scala-sbt/util-interface/1.7.0/util-interface-1.7.0.jar
[ERROR] urls[26] =
file:/Users/stevel/.m2/repository/org/scala-sbt/zinc-persist-core-assembly/1.7.1/zinc-persist-core-assembly-1.7.1.jar
[ERROR] urls[27] =
file:/Users/stevel/.m2/repository/org/scala-lang/modules/scala-parallel-collections_2.13/0.2.0/scala-parallel-collections_2.13-0.2.0.jar
[ERROR] urls[28] =
file:/Users/stevel/.m2/repository/org/scala-sbt/io_2.13/1.7.0/io_2.13-1.7.0.jar
[ERROR] urls[29] =
file:/Users/stevel/.m2/repository/com/swoval/file-tree-views/2.1.9/file-tree-views-2.1.9.jar
[ERROR] urls[30] =
file:/Users/stevel/.m2/repository/net/java/dev/jna/jna/5.12.0/jna-5.12.0.jar
[ERROR] urls[31] =
file:/Users/stevel/.m2/repository/net/java/dev/jna/jna-platform/5.12.0/jna-platform-5.12.0.jar
[ERROR] urls[32] =
file:/Users/stevel/.m2/repository/org/scala-sbt/util-logging_2.13/1.7.0/util-logging_2.13-1.7.0.jar
[ERROR] urls[33] =
file:/Users/stevel/.m2/repository/

Maven Test blocks with TransportCipherSuite

2022-05-20 Thread Qian SUN
Hi, team.

I ran the Maven command to run the unit tests, and got an NPE.

command: ./build/mvn test
refer to
https://spark.apache.org/docs/latest/building-spark.html#running-tests

The NPE is as follows:
22/05/20 16:32:45.450 main WARN AbstractChannelHandlerContext: Failed to
mark a promise as failure because it has succeeded already:
DefaultChannelPromise@366ef90e(success)
java.lang.NullPointerException: null
at
org.apache.spark.network.crypto.TransportCipher$EncryptionHandler.close(TransportCipher.java:137)
~[classes/:?]
at
io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:622)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:606)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:994)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at io.netty.channel.AbstractChannel.close(AbstractChannel.java:280)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.embedded.EmbeddedChannel.close(EmbeddedChannel.java:568)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.embedded.EmbeddedChannel.close(EmbeddedChannel.java:555)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.embedded.EmbeddedChannel.finish(EmbeddedChannel.java:503)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
io.netty.channel.embedded.EmbeddedChannel.finish(EmbeddedChannel.java:483)
~[netty-transport-4.1.77.Final.jar:4.1.77.Final]
at
org.apache.spark.network.crypto.TransportCipherSuite.testBufferNotLeaksOnInternalError(TransportCipherSuite.java:78)
~[test-classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
~[?:1.8.0_291]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_291]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:1.8.0_291]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_291]
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
~[junit-4.13.2.jar:4.13.2]
at
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:364)
~[surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
~[surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:237)
~[surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:158)
~[surefire-junit4-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:428)
~[surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
~[surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
~[surefire-booter-3.0.0-M5.jar:3.0.0-M5]
at
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)
~[surefire-booter-3.0.0-M5.jar:3.0.0-M5]


Is anyone else seeing the same exception?

-- 
Best!
Qian SUN


Re: Problem building spark-catalyst_2.12 with Maven

2022-02-10 Thread Martin Grigorov
I've found the problem!
It was indeed a local thingy!

$ cat ~/.mavenrc
MAVEN_OPTS='-XX:+TieredCompilation -XX:TieredStopAtLevel=1'

I added this some time ago to optimize the build time. But it seems it
also overrides the MAVEN_OPTS env var...

Now it fails with:

[INFO] --- scala-maven-plugin:4.3.0:compile (scala-compile-first) @
spark-catalyst_2.12 ---
[INFO] Using incremental compilation using Mixed compile order
[INFO] Compiler bridge file:
/home/martin/.sbt/1.0/zinc/org.scala-sbt/org.scala-sbt-compiler-bridge_2.12-1.3.1-bin_2.12.15__52.0-1.3.1_20191012T045515.jar
[INFO] compiler plugin:
BasicArtifact(com.github.ghik,silencer-plugin_2.12.15,1.7.6,null)
[INFO] Compiling 372 Scala sources and 171 Java sources to
/home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes ...

[ERROR] [Error] : error writing
/home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes/org/apache/spark/sql/catalyst/analysis/Analyzer$ResolveGroupingAnalytics$$anonfun$org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveGroupingAnalytics$$replaceGroupingFunc$1.class:
java.nio.file.FileSystemException
/home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes/org/apache/spark/sql/catalyst/analysis/Analyzer$ResolveGroupingAnalytics$$anonfun$org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveGroupingAnalytics$$replaceGroupingFunc$1.class:
File name too long
but this is well documented:
https://spark.apache.org/docs/latest/building-spark.html#encrypted-filesystems

All works now!
Thank you, Sean!


On Thu, Feb 10, 2022 at 10:13 PM Sean Owen  wrote:

> I think it's another occurrence that I had to change or had to set
> MAVEN_OPTS. I think this occurs in a way that this setting doesn't affect,
> though I don't quite understand it. Try the stack size in test runner
> configs
>
> On Thu, Feb 10, 2022, 2:02 PM Martin Grigorov 
> wrote:
>
>> Hi Sean,
>>
>> On Thu, Feb 10, 2022 at 5:37 PM Sean Owen  wrote:
>>
>>> Yes I've seen this; the JVM stack size needs to be increased. I'm not
>>> sure if it's env specific (though you and I at least have hit it, I think
>>> others), or whether we need to change our build script.
>>> In the pom.xml file, find "-Xss..." settings and make them something
>>> like "-Xss4m", see if that works.
>>>
>>
>> It is already a much bigger value - 128m (
>> https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845
>> )
>> I've tried smaller and bigger values for all jvmArgs next to this one.
>> None helped!
>> I also have the feeling it is something in my environment that overrides
>> these values but so far I cannot identify anything.
>>
>>
>>
>>>
>>> On Thu, Feb 10, 2022 at 8:54 AM Martin Grigorov 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am not able to build Spark due to the following error :
>>>>
>>>> ERROR] ## Exception when compiling 543 sources to
>>>> /home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes
>>>> java.lang.BootstrapMethodError: call site initialization exception
>>>> java.lang.invoke.CallSite.makeSite(CallSite.java:341)
>>>>
>>>> java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307)
>>>>
>>>> java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297)
>>>> scala.tools.nsc.typechecker.Typers$Typer.typedBlock(Typers.scala:2504)
>>>>
>>>> scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103(Typers.scala:5711)
>>>>
>>>> scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1(Typers.scala:500)
>>>> scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:5746)
>>>> scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:5781)
>>>> ...
>>>> Caused by: java.lang.StackOverflowError
>>>> at java.lang.ref.Reference. (Reference.java:303)
>>>> at java.lang.ref.WeakReference. (WeakReference.java:57)
>>>> at
>>>> java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.
>>>> (MethodType.java:1269)
>>>> at java.lang.invoke.MethodType$ConcurrentWeakInternSet.get
>>>> (MethodType.java:1216)
>>>> at java.lang.invoke.MethodType.makeImpl (MethodType.java:302)
>>>> at java.lang.invoke.MethodType.dropParameterTypes
>>>> (MethodType.java:573)
>>>> at java.lang.invoke.MethodType.replaceParameterTypes
>>>> (MethodType.java:467)
>>>> at java.lang.invoke.MethodHandle.asSpreader (MethodHandle.java:875)

Re: Problem building spark-catalyst_2.12 with Maven

2022-02-10 Thread Sean Owen
I think it's another occurrence that I had to change or had to set
MAVEN_OPTS. I think this occurs in a way that this setting doesn't affect,
though I don't quite understand it. Try the stack size in test runner
configs

On Thu, Feb 10, 2022, 2:02 PM Martin Grigorov  wrote:

> Hi Sean,
>
> On Thu, Feb 10, 2022 at 5:37 PM Sean Owen  wrote:
>
>> Yes I've seen this; the JVM stack size needs to be increased. I'm not
>> sure if it's env specific (though you and I at least have hit it, I think
>> others), or whether we need to change our build script.
>> In the pom.xml file, find "-Xss..." settings and make them something like
>> "-Xss4m", see if that works.
>>
>
> It is already a much bigger value - 128m (
> https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845
> )
> I've tried smaller and bigger values for all jvmArgs next to this one.
> None helped!
> I also have the feeling it is something in my environment that overrides
> these values but so far I cannot identify anything.
>
>
>
>>
>> On Thu, Feb 10, 2022 at 8:54 AM Martin Grigorov 
>> wrote:
>>
>>> Hi,
>>>
>>> I am not able to build Spark due to the following error :
>>>
>>> ERROR] ## Exception when compiling 543 sources to
>>> /home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes
>>> java.lang.BootstrapMethodError: call site initialization exception
>>> java.lang.invoke.CallSite.makeSite(CallSite.java:341)
>>>
>>> java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307)
>>>
>>> java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297)
>>> scala.tools.nsc.typechecker.Typers$Typer.typedBlock(Typers.scala:2504)
>>>
>>> scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103(Typers.scala:5711)
>>>
>>> scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1(Typers.scala:500)
>>> scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:5746)
>>> scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:5781)
>>> ...
>>> Caused by: java.lang.StackOverflowError
>>> at java.lang.ref.Reference. (Reference.java:303)
>>> at java.lang.ref.WeakReference. (WeakReference.java:57)
>>> at
>>> java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.
>>> (MethodType.java:1269)
>>> at java.lang.invoke.MethodType$ConcurrentWeakInternSet.get
>>> (MethodType.java:1216)
>>> at java.lang.invoke.MethodType.makeImpl (MethodType.java:302)
>>> at java.lang.invoke.MethodType.dropParameterTypes
>>> (MethodType.java:573)
>>> at java.lang.invoke.MethodType.replaceParameterTypes
>>> (MethodType.java:467)
>>> at java.lang.invoke.MethodHandle.asSpreader (MethodHandle.java:875)
>>> at java.lang.invoke.Invokers.spreadInvoker (Invokers.java:158)
>>> at java.lang.invoke.CallSite.makeSite (CallSite.java:324)
>>> at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl
>>> (MethodHandleNatives.java:307)
>>> at java.lang.invoke.MethodHandleNatives.linkCallSite
>>> (MethodHandleNatives.java:297)
>>> at scala.tools.nsc.typechecker.Typers$Typer.typedBlock
>>> (Typers.scala:2504)
>>> at scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103
>>> (Typers.scala:5711)
>>> at
>>> scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1
>>> (Typers.scala:500)
>>> at scala.tools.nsc.typechecker.Typers$Typer.typed1
>>> (Typers.scala:5746)
>>> at scala.tools.nsc.typechecker.Typers$Typer.typed (Typers.scala:5781)
>>>
>>> I have played a lot with the scala-maven-plugin jvmArg settings at [1]
>>> but so far nothing helps.
>>> Same error for Scala 2.12 and 2.13.
>>>
>>> The command I use is: ./build/mvn install -Pkubernetes -DskipTests
>>>
>>> I need to create a distribution from master branch.
>>>
>>> Java: 1.8.0_312
>>> Maven: 3.8.4
>>> OS: Ubuntu 21.10
>>>
>>> Any hints ?
>>> Thank you!
>>>
>>> 1.
>>> https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845-L2849
>>>
>>


Re: Problem building spark-catalyst_2.12 with Maven

2022-02-10 Thread Martin Grigorov
Hi Sean,

On Thu, Feb 10, 2022 at 5:37 PM Sean Owen  wrote:

> Yes I've seen this; the JVM stack size needs to be increased. I'm not sure
> if it's env specific (though you and I at least have hit it, I think
> others), or whether we need to change our build script.
> In the pom.xml file, find "-Xss..." settings and make them something like
> "-Xss4m", see if that works.
>

It is already a much bigger value - 128m (
https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845
)
I've tried smaller and bigger values for all jvmArgs next to this one. None
helped!
I also have the feeling it is something in my environment that overrides
these values but so far I cannot identify anything.



>
> On Thu, Feb 10, 2022 at 8:54 AM Martin Grigorov 
> wrote:
>
>> Hi,
>>
>> I am not able to build Spark due to the following error :
>>
>> ERROR] ## Exception when compiling 543 sources to
>> /home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes
>> java.lang.BootstrapMethodError: call site initialization exception
>> java.lang.invoke.CallSite.makeSite(CallSite.java:341)
>>
>> java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307)
>>
>> java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297)
>> scala.tools.nsc.typechecker.Typers$Typer.typedBlock(Typers.scala:2504)
>>
>> scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103(Typers.scala:5711)
>>
>> scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1(Typers.scala:500)
>> scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:5746)
>> scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:5781)
>> ...
>> Caused by: java.lang.StackOverflowError
>> at java.lang.ref.Reference. (Reference.java:303)
>> at java.lang.ref.WeakReference. (WeakReference.java:57)
>> at
>> java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.
>> (MethodType.java:1269)
>> at java.lang.invoke.MethodType$ConcurrentWeakInternSet.get
>> (MethodType.java:1216)
>> at java.lang.invoke.MethodType.makeImpl (MethodType.java:302)
>> at java.lang.invoke.MethodType.dropParameterTypes
>> (MethodType.java:573)
>> at java.lang.invoke.MethodType.replaceParameterTypes
>> (MethodType.java:467)
>> at java.lang.invoke.MethodHandle.asSpreader (MethodHandle.java:875)
>> at java.lang.invoke.Invokers.spreadInvoker (Invokers.java:158)
>> at java.lang.invoke.CallSite.makeSite (CallSite.java:324)
>> at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl
>> (MethodHandleNatives.java:307)
>> at java.lang.invoke.MethodHandleNatives.linkCallSite
>> (MethodHandleNatives.java:297)
>> at scala.tools.nsc.typechecker.Typers$Typer.typedBlock
>> (Typers.scala:2504)
>> at scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103
>> (Typers.scala:5711)
>> at scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1
>> (Typers.scala:500)
>> at scala.tools.nsc.typechecker.Typers$Typer.typed1 (Typers.scala:5746)
>> at scala.tools.nsc.typechecker.Typers$Typer.typed (Typers.scala:5781)
>>
>> I have played a lot with the scala-maven-plugin jvmArg settings at [1]
>> but so far nothing helps.
>> Same error for Scala 2.12 and 2.13.
>>
>> The command I use is: ./build/mvn install -Pkubernetes -DskipTests
>>
>> I need to create a distribution from master branch.
>>
>> Java: 1.8.0_312
>> Maven: 3.8.4
>> OS: Ubuntu 21.10
>>
>> Any hints ?
>> Thank you!
>>
>> 1.
>> https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845-L2849
>>
>


Re: Problem building spark-catalyst_2.12 with Maven

2022-02-10 Thread Sean Owen
Yes I've seen this; the JVM stack size needs to be increased. I'm not sure
if it's env specific (though you and I at least have hit it, I think
others), or whether we need to change our build script.
In the pom.xml file, find "-Xss..." settings and make them something like
"-Xss4m", see if that works.

On Thu, Feb 10, 2022 at 8:54 AM Martin Grigorov 
wrote:

> Hi,
>
> I am not able to build Spark due to the following error :
>
> ERROR] ## Exception when compiling 543 sources to
> /home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes
> java.lang.BootstrapMethodError: call site initialization exception
> java.lang.invoke.CallSite.makeSite(CallSite.java:341)
>
> java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307)
>
> java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297)
> scala.tools.nsc.typechecker.Typers$Typer.typedBlock(Typers.scala:2504)
>
> scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103(Typers.scala:5711)
>
> scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1(Typers.scala:500)
> scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:5746)
> scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:5781)
> ...
> Caused by: java.lang.StackOverflowError
> at java.lang.ref.Reference. (Reference.java:303)
> at java.lang.ref.WeakReference. (WeakReference.java:57)
> at
> java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.
> (MethodType.java:1269)
> at java.lang.invoke.MethodType$ConcurrentWeakInternSet.get
> (MethodType.java:1216)
> at java.lang.invoke.MethodType.makeImpl (MethodType.java:302)
> at java.lang.invoke.MethodType.dropParameterTypes (MethodType.java:573)
> at java.lang.invoke.MethodType.replaceParameterTypes
> (MethodType.java:467)
> at java.lang.invoke.MethodHandle.asSpreader (MethodHandle.java:875)
> at java.lang.invoke.Invokers.spreadInvoker (Invokers.java:158)
> at java.lang.invoke.CallSite.makeSite (CallSite.java:324)
> at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl
> (MethodHandleNatives.java:307)
> at java.lang.invoke.MethodHandleNatives.linkCallSite
> (MethodHandleNatives.java:297)
> at scala.tools.nsc.typechecker.Typers$Typer.typedBlock
> (Typers.scala:2504)
> at scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103
> (Typers.scala:5711)
> at scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1
> (Typers.scala:500)
> at scala.tools.nsc.typechecker.Typers$Typer.typed1 (Typers.scala:5746)
> at scala.tools.nsc.typechecker.Typers$Typer.typed (Typers.scala:5781)
>
> I have played a lot with the scala-maven-plugin jvmArg settings at [1] but
> so far nothing helps.
> Same error for Scala 2.12 and 2.13.
>
> The command I use is: ./build/mvn install -Pkubernetes -DskipTests
>
> I need to create a distribution from master branch.
>
> Java: 1.8.0_312
> Maven: 3.8.4
> OS: Ubuntu 21.10
>
> Any hints ?
> Thank you!
>
> 1.
> https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845-L2849
>


Problem building spark-catalyst_2.12 with Maven

2022-02-10 Thread Martin Grigorov
Hi,

I am not able to build Spark due to the following error:

[ERROR] ## Exception when compiling 543 sources to
/home/martin/git/apache/spark/sql/catalyst/target/scala-2.12/classes
java.lang.BootstrapMethodError: call site initialization exception
java.lang.invoke.CallSite.makeSite(CallSite.java:341)
java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307)
java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297)
scala.tools.nsc.typechecker.Typers$Typer.typedBlock(Typers.scala:2504)
scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103(Typers.scala:5711)
scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1(Typers.scala:500)
scala.tools.nsc.typechecker.Typers$Typer.typed1(Typers.scala:5746)
scala.tools.nsc.typechecker.Typers$Typer.typed(Typers.scala:5781)
...
Caused by: java.lang.StackOverflowError
at java.lang.ref.Reference.<init> (Reference.java:303)
at java.lang.ref.WeakReference.<init> (WeakReference.java:57)
at java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.<init>
(MethodType.java:1269)
at java.lang.invoke.MethodType$ConcurrentWeakInternSet.get
(MethodType.java:1216)
at java.lang.invoke.MethodType.makeImpl (MethodType.java:302)
at java.lang.invoke.MethodType.dropParameterTypes (MethodType.java:573)
at java.lang.invoke.MethodType.replaceParameterTypes
(MethodType.java:467)
at java.lang.invoke.MethodHandle.asSpreader (MethodHandle.java:875)
at java.lang.invoke.Invokers.spreadInvoker (Invokers.java:158)
at java.lang.invoke.CallSite.makeSite (CallSite.java:324)
at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl
(MethodHandleNatives.java:307)
at java.lang.invoke.MethodHandleNatives.linkCallSite
(MethodHandleNatives.java:297)
at scala.tools.nsc.typechecker.Typers$Typer.typedBlock
(Typers.scala:2504)
at scala.tools.nsc.typechecker.Typers$Typer.$anonfun$typed1$103
(Typers.scala:5711)
at scala.tools.nsc.typechecker.Typers$Typer.typedOutsidePatternMode$1
(Typers.scala:500)
at scala.tools.nsc.typechecker.Typers$Typer.typed1 (Typers.scala:5746)
at scala.tools.nsc.typechecker.Typers$Typer.typed (Typers.scala:5781)

I have played a lot with the scala-maven-plugin jvmArg settings at [1] but
so far nothing helps.
Same error for Scala 2.12 and 2.13.

The command I use is: ./build/mvn install -Pkubernetes -DskipTests

I need to create a distribution from master branch.

Java: 1.8.0_312
Maven: 3.8.4
OS: Ubuntu 21.10

Any hints ?
Thank you!

1.
https://github.com/apache/spark/blob/50256bde9bdf217413545a6d2945d6c61bf4cfff/pom.xml#L2845-L2849


Re: Missing module spark-hadoop-cloud in Maven central

2021-06-21 Thread Dongjoon Hyun
Hi, Steve.

Here is the PR for publishing it as a part of Apache Spark 3.2.0+

https://github.com/apache/spark/pull/33003

Bests,
Dongjoon.

On 2021/06/01 17:09:53, Steve Loughran  wrote: 
> (can't reply to user@, so pulling @dev instead. sorry)
> 
> (can't reply to user@, so pulling @dev instead)
> 
> There is no fundamental reason why the hadoop-cloud POM and artifact isn't
> built/released by the ASF spark project; I think the effort it took to get
> the spark-hadoop-cloud module it in at all was enough to put me off trying
> to get the artifact released.
> 
> Including the AWS SDK in the spark tarball is the main thing to question.
> 
> It does contain some minimal binding classes to deal with two issues, both
> of which are actually fixable if anyone sat down to do it.
> 
> 
>1. Spark using mapreduce V1 APIs (org.apache.hadoop.mapred) vs v2
>((org.apache.hadoop.mapredreduce.{input, output,... }). That's fixable in
>spark; a shim class was just a lot less traumatic.
>2. Parquet being fussy about writing to a subclass of
>ParquetOutputCommitter. Again, a shim does that, alternative is a fix in
>Parquet. Or I modify the original Hadoop FileOutputCommitter to actually
>wrap/forward to a new committer. I chose not not to do that from the outset
>because that class scares me. Nothing has changed my opinion there. FWIW
>EMR just did their S3-only committer as a subclass of
>ParquetOutputCommitter. Simpler solution if you don't have to care about
>other committers for other stores.
> 
> Move spark to MRv2 APIs and Parquet lib to downgrade if the committer isn't
> a subclass (it wants the option to call writeMetaDataFile()), and the need
> for those shims goes away.
> 
> What the module also does is import the relevant hadoop-aws, hadoop-azure
> modules etc and strip out anything which complicates life. When published
> to the maven repo then, apps can import it downstream and get a consistent
> set of hadoop-* artifacts, and the AWS artifacts which they've been
> compiled and tested with.
> 
> They are published by both cloudera and palantir; it'd be really good for
> the world as a whole if the ASF published them too, in sync with the rest
> of the release
> 
> https://mvnrepository.com/artifact/org.apache.spark/spark-hadoop-cloud
> 
> 
> There's one other aspect of the module, which is when it is built the spark
> distribution includes the AWS SDK bundle, which is a few hundred MB and
> growing.
> 
> Why use the whole shaded JAR?" Classpaths. Jackson versions, httpclient
> versions, etc: if they weren't shaded it'd be very hard to get a consistent
> set of dependencies. There's the side benefit of having one consistent set
> of AWS libraries, so spark-kinesis will be in sync with s3a client,
> DynamoDB client, etc, etc. (
> https://issues.apache.org/jira/browse/HADOOP-17197 )
> 
> There's a very good case for excluding that SDK from the distro unless you
> are confident people really want it. Instead just say "this release
> contains all the ASF dependencies needed to work with AWS, just add
> "aws-sdk-bundle 1.11.XYZ".
> 
> I'm happy to work on that if I can get some promise of review time from
> others.
> 
> On related notes
> 
> Hadoop 3.3.1 RCs are up for testing. For S3A this includes everything in
> https://issues.apache.org/jira/browse/HADOOP-16829   big speedups in list
> calls, and you can turn off deletion of dir marking for significant IO
> gains/reduced throttling. Do play ASAP, do complain on issues: this is your
> last chance before things ship.
> 
> For everything else, yes, many benefits. And, courtesy of Huawei, native
> ARM support too. Your VM cost/hour just went down for all workloads where
> you don't need GPUs.
> 
> *The RC2 artifacts are at*:
> https://home.apache.org/~weichiu/hadoop-3.3.1-RC2/
> ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC2-arm/
> 
> 
> *The maven artifacts are hosted here:*
> https://repository.apache.org/content/repositories/orgapachehadoop-1318/
> 
> 
> Independent of that, anyone working on Azure or GCS who wants spark to
> write output in a classic Hive partitioned directory structure -there's a
> WiP committer which promises speed and correctness even when the store
> (GCS) doesn't do atomic dir renames.
> 
> https://github.com/apache/hadoop/pull/2971
> 
> Reviews and testing with private datasets strongly encouraged, and I'd love
> to get the IOStatistics parts of the _SUCCESS files to see what happened.
> This committer measures time to list/rename/mkdir in task and job commit,
> and aggregates them all into the final report.
> 
> -Steve
> 
> On Mon, 31 May 2021 at 13:35, Sean Owen

Re: Missing module spark-hadoop-cloud in Maven central

2021-06-01 Thread Steve Loughran
(can't reply to user@, so pulling @dev instead. sorry)

(can't reply to user@, so pulling @dev instead)

There is no fundamental reason why the hadoop-cloud POM and artifact isn't
built/released by the ASF spark project; I think the effort it took to get
the spark-hadoop-cloud module it in at all was enough to put me off trying
to get the artifact released.

Including the AWS SDK in the spark tarball is the main thing to question.

It does contain some minimal binding classes to deal with two issues, both
of which are actually fixable if anyone sat down to do it.


   1. Spark using mapreduce V1 APIs (org.apache.hadoop.mapred) vs v2
   (org.apache.hadoop.mapreduce.{input, output, ...}). That's fixable in
   spark; a shim class was just a lot less traumatic.
   2. Parquet being fussy about writing to a subclass of
   ParquetOutputCommitter. Again, a shim does that; the alternative is a fix in
   Parquet. Or I modify the original Hadoop FileOutputCommitter to actually
   wrap/forward to a new committer. I chose not to do that from the outset
   because that class scares me. Nothing has changed my opinion there. FWIW
   EMR just did their S3-only committer as a subclass of
   ParquetOutputCommitter. Simpler solution if you don't have to care about
   other committers for other stores.

Move spark to the MRv2 APIs, and fix the Parquet lib to downgrade gracefully
if the committer isn't a subclass (it wants the option to call
writeMetaDataFile()), and the need for those shims goes away.

What the module also does is import the relevant hadoop-aws, hadoop-azure
modules etc and strip out anything which complicates life. When published
to the maven repo then, apps can import it downstream and get a consistent
set of hadoop-* artifacts, and the AWS artifacts which they've been
compiled and tested with.

They are published by both cloudera and palantir; it'd be really good for
the world as a whole if the ASF published them too, in sync with the rest
of the release

https://mvnrepository.com/artifact/org.apache.spark/spark-hadoop-cloud
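
As a concrete, hypothetical sketch of what importing it downstream looks like,
a downstream application pom would simply declare the artifact; the Scala
suffix and version below are placeholders to be matched to your Spark build:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hadoop-cloud_2.12</artifactId>
      <!-- placeholder version: use the one matching your Spark distribution -->
      <version>3.1.2</version>
    </dependency>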


There's one other aspect of the module, which is that when it is built, the spark
distribution includes the AWS SDK bundle, which is a few hundred MB and
growing.

"Why use the whole shaded JAR?" Classpaths. Jackson versions, httpclient
versions, etc: if they weren't shaded it'd be very hard to get a consistent
set of dependencies. There's the side benefit of having one consistent set
of AWS libraries, so spark-kinesis will be in sync with s3a client,
DynamoDB client, etc, etc. (
https://issues.apache.org/jira/browse/HADOOP-17197 )

There's a very good case for excluding that SDK from the distro unless you
are confident people really want it. Instead just say "this release
contains all the ASF dependencies needed to work with AWS, just add
"aws-sdk-bundle 1.11.XYZ".

I'm happy to work on that if I can get some promise of review time from
others.

On related notes

Hadoop 3.3.1 RCs are up for testing. For S3A this includes everything in
https://issues.apache.org/jira/browse/HADOOP-16829 - big speedups in list
calls, and you can turn off deletion of dir markers for significant IO
gains/reduced throttling. Do play with it ASAP, and do complain about issues: this is your
last chance before things ship.

For everything else, yes, many benefits. And, courtesy of Huawei, native
ARM support too. Your VM cost/hour just went down for all workloads where
you don't need GPUs.

*The RC2 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC2/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC2-arm/


*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1318/
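
To try the RC from an existing Maven project, one option is to add the staging
repository and build against hadoop.version 3.3.1; a minimal sketch (the
repository id below is arbitrary):

    <repositories>
      <repository>
        <!-- arbitrary id; points at the Hadoop 3.3.1 RC2 staging repository above -->
        <id>hadoop-3.3.1-rc2-staging</id>
        <url>https://repository.apache.org/content/repositories/orgapachehadoop-1318/</url>
      </repository>
    </repositories>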


Independent of that, anyone working on Azure or GCS who wants spark to
write output in a classic Hive partitioned directory structure: there's a
WiP committer which promises speed and correctness even when the store
(GCS) doesn't do atomic dir renames.

https://github.com/apache/hadoop/pull/2971

Reviews and testing with private datasets strongly encouraged, and I'd love
to get the IOStatistics parts of the _SUCCESS files to see what happened.
This committer measures time to list/rename/mkdir in task and job commit,
and aggregates them all into the final report.

-Steve

On Mon, 31 May 2021 at 13:35, Sean Owen  wrote:

> I know it's not enabled by default when the binary artifacts are built,
> but not exactly sure why it's not built separately at all. It's almost a
> dependencies-only pom artifact, but there are two source files. Steve do
> you have an angle on that?
>
> On Mon, May 31, 2021 at 5:37 AM Erik Torres  wrote:
>
>> Hi,
>>
>> I'm following this documentation
>> <https://spark.apache.org/docs/latest/cloud-integration.html#installation> to
>> configure my Spark-based application to interact with Amazon S3. However, I
>> cannot find the spark-hadoop-cloud module in Maven central for the
>> non-commerc

[FYI] Scala 2.13 Maven Artifacts

2021-01-27 Thread Dongjoon Hyun
Hi, All.

Apache Spark community starts to publish Scala 2.13 Maven artifacts daily.


https://repository.apache.org/content/repositories/snapshots/org/apache/spark/spark-core_2.13/3.2.0-SNAPSHOT/

It aims to encourage more tests on Scala 2.13 (and Scala 3) and to identify
issues in advance for Apache Spark 3.2.0.
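
For anyone who wants to try the snapshots from a Maven project, a minimal
sketch of the wiring (the repository id is arbitrary, and the snapshot version
tracks master, so it will change over time):

    <repositories>
      <repository>
        <!-- ASF snapshot repository hosting the daily Scala 2.13 artifacts -->
        <id>apache-snapshots</id>
        <url>https://repository.apache.org/content/repositories/snapshots/</url>
        <snapshots>
          <enabled>true</enabled>
        </snapshots>
      </repository>
    </repositories>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.13</artifactId>
      <version>3.2.0-SNAPSHOT</version>
    </dependency>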

In addition, recently we upgraded from Scala 2.12.10 to Scala 2.12.13 nine
days ago
but we decided to revert this today due to
https://github.com/scala/bug/issues/12038

[SPARK-31168][SPARK-33913][BUILD] Upgrade Scala to 2.12.13 and Kafka to
2.7.0
https://github.com/apache/spark/commit/a65e86a65e39f3a61c3248b006e897effd7e4c2a

Revert "[SPARK-31168][SPARK-33913][BUILD] Upgrade Scala to 2.12.13 and
Kafka to 2.7.0"
https://github.com/apache/spark/commit/1217c8b4181d8b8f54b7f0b0510d37ba773eeaa3

The scala bug was originally opened for Scala 2.13, but we didn't downgrade
the Scala 2.13 version because it's not officially supported yet. It's
still using Scala 2.13.4.

Please file a JIRA if you hit some issues with the Scala 2.13 Maven
artifacts.

Bests,
Dongjoon.


Re: Adding Maven Central mirror from Google to the build?

2020-01-22 Thread Tom Graves
 +1 for proposal.
Tom
On Tuesday, January 21, 2020, 04:37:04 PM CST, Sean Owen  
wrote:  
 
 See https://github.com/apache/spark/pull/27307 for some context. We've
had to add, in at least one place, some settings to resolve artifacts
from a mirror besides Maven Central to work around some build
problems.

Now, we find it might be simpler to just use this mirror as the
primary repo in the build, falling back to Central if needed.

The question is: any objections to that?

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

  

Re: Adding Maven Central mirror from Google to the build?

2020-01-21 Thread Hyukjin Kwon
+1. If it becomes a problem for any reason, we can consider another option (
https://github.com/apache/spark/pull/27307#issuecomment-576951473) later

On Wed, Jan 22, 2020 at 8:23 AM, Dongjoon Hyun wrote:

> +1, I'm supporting the following proposal.
>
> > this mirror as the primary repo in the build, falling back to Central if
> needed.
>
> Thanks,
> Dongjoon.
>
>
> On Tue, Jan 21, 2020 at 14:37 Sean Owen  wrote:
>
>> See https://github.com/apache/spark/pull/27307 for some context. We've
>> had to add, in at least one place, some settings to resolve artifacts
>> from a mirror besides Maven Central to work around some build
>> problems.
>>
>> Now, we find it might be simpler to just use this mirror as the
>> primary repo in the build, falling back to Central if needed.
>>
>> The question is: any objections to that?
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


Re: Adding Maven Central mirror from Google to the build?

2020-01-21 Thread Reynold Xin
This seems reasonable!

On Tue, Jan 21, 2020 at 3:23 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > 
wrote:

> 
> +1, I'm supporting the following proposal.
> 
> 
> > this mirror as the primary repo in the build, falling back to Central if
> needed.
> 
> 
> Thanks,
> Dongjoon.
> 
> 
> 
> On Tue, Jan 21, 2020 at 14:37 Sean Owen <sro...@gmail.com> wrote:
> 
> 
>> See https://github.com/apache/spark/pull/27307 for some context. We've
>> had to add, in at least one place, some settings to resolve artifacts
>> from a mirror besides Maven Central to work around some build
>> problems.
>> 
>> Now, we find it might be simpler to just use this mirror as the
>> primary repo in the build, falling back to Central if needed.
>> 
>> The question is: any objections to that?
>> 
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> 
> 
>

Re: Adding Maven Central mirror from Google to the build?

2020-01-21 Thread Dongjoon Hyun
+1, I'm supporting the following proposal.

> this mirror as the primary repo in the build, falling back to Central if
needed.

Thanks,
Dongjoon.


On Tue, Jan 21, 2020 at 14:37 Sean Owen  wrote:

> See https://github.com/apache/spark/pull/27307 for some context. We've
> had to add, in at least one place, some settings to resolve artifacts
> from a mirror besides Maven Central to work around some build
> problems.
>
> Now, we find it might be simpler to just use this mirror as the
> primary repo in the build, falling back to Central if needed.
>
> The question is: any objections to that?
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Adding Maven Central mirror from Google to the build?

2020-01-21 Thread Sean Owen
See https://github.com/apache/spark/pull/27307 for some context. We've
had to add, in at least one place, some settings to resolve artifacts
from a mirror besides Maven Central to work around some build
problems.

Now, we find it might be simpler to just use this mirror as the
primary repo in the build, falling back to Central if needed.

The question is: any objections to that?
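
For illustration only, "mirror as primary, Central as fallback" amounts to
something like the following repository ordering in the parent pom; the ids
and mirror URL here are a sketch, not the exact change in the PR:

    <repositories>
      <repository>
        <!-- Google-hosted mirror of Maven Central, consulted first -->
        <id>gcs-maven-central-mirror</id>
        <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
      </repository>
      <repository>
        <!-- Maven Central proper, as the fallback -->
        <id>central</id>
        <url>https://repo.maven.apache.org/maven2</url>
      </repository>
    </repositories>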

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Spark master build hangs using parallel build option in maven

2020-01-18 Thread Sean Owen
I think we can remove that note from the README then, I'll do that.

On Sat, Jan 18, 2020 at 1:38 AM Dongjoon Hyun  wrote:
>
> Hi, Saurabh.
>
> It seems that you are hitting 
> https://issues.apache.org/jira/browse/SPARK-26095 .
>
> And, we disabled the parallel build via 
> https://github.com/apache/spark/pull/23061 at 3.0.0.
>
> According to the stack trace in JIRA and PR description, `maven-shade-plugin` 
> seems to be the root cause.
>
> For now, I'd like to recommend you to disable it because `Maven` itself warns 
> you already. (You know that, right?)
>
>> [INFO] [ pom 
>> ]-
>> [WARNING] *
>> [WARNING] * Your build is requesting parallel execution, but project  *
>> [WARNING] * contains the following plugin(s) that have goals not marked   *
>> [WARNING] * as @threadSafe to support parallel building.  *
>> [WARNING] * While this /may/ work fine, please look for plugin updates*
>> [WARNING] * and/or request plugins be made thread-safe.   *
>> [WARNING] * If reporting an issue, report it against the plugin in*
>> [WARNING] * question, not against maven-core  *
>> [WARNING] *
>> [WARNING] The following plugins are not marked @threadSafe in Spark Project 
>> Parent POM:
>> [WARNING] org.scalatest:scalatest-maven-plugin:2.0.0
>> [WARNING] Enable debug to see more precisely which goals are not marked 
>> @threadSafe.
>> [WARNING] *
>
>
> I respect `Maven` warnings.
>
> Bests,
> Dongjoon.
>
>
> On Fri, Jan 17, 2020 at 9:22 PM Saurabh Chawla  wrote:
>>
>> Hi Sean,
>>
>> Thanks for checking this.
>>
>> I am able to see parallel build info in the readme file 
>> https://github.com/apache/spark#building-spark
>>
>> "
>> You can build Spark using more than one thread by using the -T option with 
>> Maven, see "Parallel builds in Maven 3". More detailed documentation is 
>> available from the project site, at "Building Spark".
>> "
>>
>> This used to work while building older version of spark(2.4.3, 2.3.2 etc).
>> build/mvn -Duse.zinc.server=false -DuseZincForJdk8=false 
>> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver 
>> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5 
>> -DskipTests=true -T 4 clean package
>>
>> Also I have seen the maven version is changed from 3.5.4 to 3.6.3 in master 
>> branch compared to spark 2.4.3.
>> Not sure if it's due to some bug in maven version used in master or some new 
>> change added in the master branch that prevent the parallel build.
>>
>> Regards
>> Saurabh Chawla
>>
>>
>> On Sat, Jan 18, 2020 at 2:19 AM Sean Owen  wrote:
>>>
>>> I don't believe you can use a parallel build indeed. Some things
>>> collide with each other. Some of the suites are run in parallel inside
>>> the build though already.
>>>
>>> On Fri, Jan 17, 2020 at 1:23 PM Saurabh Chawla  
>>> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > Spark master build hangs using parallel build option in maven. On running 
>>> > build the sequentially on spark master using maven, build did not hang. 
>>> > This issue occurs on giving hadoop-provided (-Phadoop-provided 
>>> > -Dhadoop.version=2.8.5) option. Same command works fine to build 
>>> > spark-2.4.3 parallelly
>>> >
>>> > Command to build spark master sequentially - Spark build works fine
>>> > build/mvn  -Duse.zinc.server=false -DuseZincForJdk8=false 
>>> > -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver 
>>> > -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5 
>>> > -DskipTests=true  clean package
>>> >
>>> > Command to build spark master parallel - spark build hangs
>>> > build/mvn -X -Duse.zinc.server=false -DuseZincForJdk8=false 
>>> > -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver 
>>> > -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5 
>>> > -DskipTests=true -T 4 clean package
>>> >
>>> > This is the trace which keeps on repeating in maven logs
>>> >
>>> > [DEBUG] building mave

Re: Spark master build hangs using parallel build option in maven

2020-01-17 Thread Dongjoon Hyun
Hi, Saurabh.

It seems that you are hitting
https://issues.apache.org/jira/browse/SPARK-26095 .

And, we disabled the parallel build via
https://github.com/apache/spark/pull/23061 at 3.0.0.

According to the stack trace in JIRA and PR description,
`maven-shade-plugin` seems to be the root cause.

For now, I'd like to recommend you to disable it because `Maven` itself
warns you already. (You know that, right?)

[INFO] [ pom
> ]-
> [WARNING] *
> [WARNING] * Your build is requesting parallel execution, but project  *
> [WARNING] * contains the following plugin(s) that have goals not marked   *
> [WARNING] * as @threadSafe to support parallel building.  *
> [WARNING] * While this /may/ work fine, please look for plugin updates*
> [WARNING] * and/or request plugins be made thread-safe.   *
> [WARNING] * If reporting an issue, report it against the plugin in*
> [WARNING] * question, not against maven-core  *
> [WARNING] *
> [WARNING] The following plugins are not marked @threadSafe in Spark
> Project Parent POM:
> [WARNING] org.scalatest:scalatest-maven-plugin:2.0.0
> [WARNING] Enable debug to see more precisely which goals are not marked
> @threadSafe.
> [WARNING] *****


I respect `Maven` warnings.

Bests,
Dongjoon.


On Fri, Jan 17, 2020 at 9:22 PM Saurabh Chawla 
wrote:

> Hi Sean,
>
> Thanks for checking this.
>
> I am able to see parallel build info in the readme file
> https://github.com/apache/spark#building-spark
>
> "
> You can build Spark using more than one thread by using the -T option with
> Maven, see "Parallel builds in Maven 3"
> <https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3>.
> More detailed documentation is available from the project site, at "Building
> Spark" <https://spark.apache.org/docs/latest/building-spark.html>.
> "
>
> This used to work while building older version of spark(2.4.3, 2.3.2 etc).
> build/mvn -Duse.zinc.server=false -DuseZincForJdk8=false
> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
> -DskipTests=true -T 4 clean package
>
> Also I have seen the maven version is changed from 3.5.4 to 3.6.3 in
> master branch compared to spark 2.4.3.
> Not sure if it's due to some bug in maven version used in master or some
> new change added in the master branch that prevent the parallel build.
>
> Regards
> Saurabh Chawla
>
>
> On Sat, Jan 18, 2020 at 2:19 AM Sean Owen  wrote:
>
>> I don't believe you can use a parallel build indeed. Some things
>> collide with each other. Some of the suites are run in parallel inside
>> the build though already.
>>
>> On Fri, Jan 17, 2020 at 1:23 PM Saurabh Chawla 
>> wrote:
>> >
>> > Hi All,
>> >
>> > Spark master build hangs using parallel build option in maven. On
>> running build the sequentially on spark master using maven, build did not
>> hang. This issue occurs on giving hadoop-provided (-Phadoop-provided
>> -Dhadoop.version=2.8.5) option. Same command works fine to build
>> spark-2.4.3 parallelly
>> >
>> > Command to build spark master sequentially - Spark build works fine
>> > build/mvn  -Duse.zinc.server=false -DuseZincForJdk8=false
>> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
>> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
>> -DskipTests=true  clean package
>> >
>> > Command to build spark master parallel - spark build hangs
>> > build/mvn -X -Duse.zinc.server=false -DuseZincForJdk8=false
>> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
>> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
>> -DskipTests=true -T 4 clean package
>> >
>> > This is the trace which keeps on repeating in maven logs
>> >
>> > [DEBUG] building maven31 dependency graph for
>> org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT with
>> Maven31DependencyGraphBuilder
>> > [DEBUG] Dependency collection stats: {ConflictMarker.analyzeTime=60583,
>> ConflictMarker.markTime=23750, ConflictMarker.nodeCount=419,
>> ConflictIdSorter.graphTime=41262, ConflictIdSorter.topsortTime=9704,
>> ConflictIdSorter.conflictIdCount=105,
>> ConflictIdSorter.conflictIdCycleCount=0, ConflictResolver.totalTime

Re: Spark master build hangs using parallel build option in maven

2020-01-17 Thread Saurabh Chawla
Hi Sean,

Thanks for checking this.

I am able to see parallel build info in the readme file
https://github.com/apache/spark#building-spark

"
You can build Spark using more than one thread by using the -T option with
Maven, see "Parallel builds in Maven 3"
<https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3>.
More detailed documentation is available from the project site, at "Building
Spark" <https://spark.apache.org/docs/latest/building-spark.html>.
"

This used to work while building older versions of spark (2.4.3, 2.3.2 etc).
build/mvn -Duse.zinc.server=false -DuseZincForJdk8=false
-Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
-Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
-DskipTests=true -T 4 clean package

Also I have seen the maven version is changed from 3.5.4 to 3.6.3 in master
branch compared to spark 2.4.3.
Not sure if it's due to some bug in the maven version used in master or some
new change added in the master branch that prevents the parallel build.

Regards
Saurabh Chawla


On Sat, Jan 18, 2020 at 2:19 AM Sean Owen  wrote:

> I don't believe you can use a parallel build indeed. Some things
> collide with each other. Some of the suites are run in parallel inside
> the build though already.
>
> On Fri, Jan 17, 2020 at 1:23 PM Saurabh Chawla 
> wrote:
> >
> > Hi All,
> >
> > Spark master build hangs using parallel build option in maven. On
> running build the sequentially on spark master using maven, build did not
> hang. This issue occurs on giving hadoop-provided (-Phadoop-provided
> -Dhadoop.version=2.8.5) option. Same command works fine to build
> spark-2.4.3 parallelly
> >
> > Command to build spark master sequentially - Spark build works fine
> > build/mvn  -Duse.zinc.server=false -DuseZincForJdk8=false
> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
> -DskipTests=true  clean package
> >
> > Command to build spark master parallel - spark build hangs
> > build/mvn -X -Duse.zinc.server=false -DuseZincForJdk8=false
> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
> -DskipTests=true -T 4 clean package
> >
> > This is the trace which keeps on repeating in maven logs
> >
> > [DEBUG] building maven31 dependency graph for
> org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT with
> Maven31DependencyGraphBuilder
> > [DEBUG] Dependency collection stats: {ConflictMarker.analyzeTime=60583,
> ConflictMarker.markTime=23750, ConflictMarker.nodeCount=419,
> ConflictIdSorter.graphTime=41262, ConflictIdSorter.topsortTime=9704,
> ConflictIdSorter.conflictIdCount=105,
> ConflictIdSorter.conflictIdCycleCount=0, ConflictResolver.totalTime=632542,
> ConflictResolver.conflictItemCount=193,
> DefaultDependencyCollector.collectTime=1020759,
> DefaultDependencyCollector.transformTime=775495}
> > [DEBUG] org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT
> > [DEBUG]
> org.apache.spark:spark-network-shuffle_2.12:jar:3.0.0-SNAPSHOT:compile
> > [DEBUG]
>  org.apache.spark:spark-network-common_2.12:jar:3.0.0-SNAPSHOT:compile
> > [DEBUG]  io.netty:netty-all:jar:4.1.42.Final:compile (version
> managed from 4.1.42.Final)
> > [DEBUG]  org.apache.commons:commons-lang3:jar:3.9:compile
> (version managed from 3.9)
> > [DEBUG]
> org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile (version managed
> from 1.8)
> > [DEBUG]
> com.fasterxml.jackson.core:jackson-databind:jar:2.10.0:compile (version
> managed from 2.10.0)
> > [DEBUG]
>  com.fasterxml.jackson.core:jackson-core:jar:2.10.0:compile (version
> managed from 2.10.0)
> > [DEBUG]
> com.fasterxml.jackson.core:jackson-annotations:jar:2.10.0:compile (version
> managed from 2.10.0)
> > [DEBUG]  com.google.code.findbugs:jsr305:jar:3.0.0:compile
> (version managed from 3.0.0)
> > [DEBUG]  com.google.guava:guava:jar:14.0.1:provided (scope
> managed from compile) (version managed from 14.0.1)
> > [DEBUG]  org.apache.commons:commons-crypto:jar:1.0.0:compile
> (version managed from 1.0.0) (exclusions managed from
> [net.java.dev.jna:jna:*:*])
> > [DEBUG]   io.dropwizard.metrics:metrics-core:jar:4.1.1:compile
> (version managed from 4.1.1)
> > [DEBUG]org.apache.spark:spark-tags_2.12:jar:3.0.0-SNAPSHOT:test
> > [DEBUG]   org.scala-lang:scala-library:jar:2.12.10:compile (version
> managed from 2.12.10)
> > [DEBUG]org.apache.spark:spark-tags_2.12:jar:tests:3.0.0-SNAPSHOT:test
> > [DEBUG]or

Re: Spark master build hangs using parallel build option in maven

2020-01-17 Thread Sean Owen
I don't believe you can use a parallel build indeed. Some things
collide with each other. Some of the suites are run in parallel inside
the build though already.

On Fri, Jan 17, 2020 at 1:23 PM Saurabh Chawla  wrote:
>
> Hi All,
>
> Spark master build hangs using parallel build option in maven. On running 
> build the sequentially on spark master using maven, build did not hang. This 
> issue occurs on giving hadoop-provided (-Phadoop-provided 
> -Dhadoop.version=2.8.5) option. Same command works fine to build spark-2.4.3 
> parallelly
>
> Command to build spark master sequentially - Spark build works fine
> build/mvn  -Duse.zinc.server=false -DuseZincForJdk8=false 
> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver 
> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5 
> -DskipTests=true  clean package
>
> Command to build spark master parallel - spark build hangs
> build/mvn -X -Duse.zinc.server=false -DuseZincForJdk8=false 
> -Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver 
> -Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5 
> -DskipTests=true -T 4 clean package
>
> This is the trace which keeps on repeating in maven logs
>
> [DEBUG] building maven31 dependency graph for 
> org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT with 
> Maven31DependencyGraphBuilder
> [DEBUG] Dependency collection stats: {ConflictMarker.analyzeTime=60583, 
> ConflictMarker.markTime=23750, ConflictMarker.nodeCount=419, 
> ConflictIdSorter.graphTime=41262, ConflictIdSorter.topsortTime=9704, 
> ConflictIdSorter.conflictIdCount=105, 
> ConflictIdSorter.conflictIdCycleCount=0, ConflictResolver.totalTime=632542, 
> ConflictResolver.conflictItemCount=193, 
> DefaultDependencyCollector.collectTime=1020759, 
> DefaultDependencyCollector.transformTime=775495}
> [DEBUG] org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT
> [DEBUG]
> org.apache.spark:spark-network-shuffle_2.12:jar:3.0.0-SNAPSHOT:compile
> [DEBUG]   
> org.apache.spark:spark-network-common_2.12:jar:3.0.0-SNAPSHOT:compile
> [DEBUG]  io.netty:netty-all:jar:4.1.42.Final:compile (version managed 
> from 4.1.42.Final)
> [DEBUG]  org.apache.commons:commons-lang3:jar:3.9:compile (version 
> managed from 3.9)
> [DEBUG]  org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile 
> (version managed from 1.8)
> [DEBUG]  
> com.fasterxml.jackson.core:jackson-databind:jar:2.10.0:compile (version 
> managed from 2.10.0)
> [DEBUG] 
> com.fasterxml.jackson.core:jackson-core:jar:2.10.0:compile (version managed 
> from 2.10.0)
> [DEBUG]  
> com.fasterxml.jackson.core:jackson-annotations:jar:2.10.0:compile (version 
> managed from 2.10.0)
> [DEBUG]  com.google.code.findbugs:jsr305:jar:3.0.0:compile (version 
> managed from 3.0.0)
> [DEBUG]  com.google.guava:guava:jar:14.0.1:provided (scope managed 
> from compile) (version managed from 14.0.1)
> [DEBUG]  org.apache.commons:commons-crypto:jar:1.0.0:compile (version 
> managed from 1.0.0) (exclusions managed from [net.java.dev.jna:jna:*:*])
> [DEBUG]   io.dropwizard.metrics:metrics-core:jar:4.1.1:compile (version 
> managed from 4.1.1)
> [DEBUG]org.apache.spark:spark-tags_2.12:jar:3.0.0-SNAPSHOT:test
> [DEBUG]   org.scala-lang:scala-library:jar:2.12.10:compile (version 
> managed from 2.12.10)
> [DEBUG]org.apache.spark:spark-tags_2.12:jar:tests:3.0.0-SNAPSHOT:test
> [DEBUG]org.apache.hadoop:hadoop-client:jar:2.8.5:provided (exclusions 
> managed from [org.fusesource.leveldbjni:leveldbjni-all:*:*, asm:asm:*:*, 
> org.codehaus.jackson:jackson-mapper-asl:*:*, org.ow2.asm:asm:*:*, 
> org.jboss.netty:netty:*:*, io.netty:netty:*:*, 
> commons-beanutils:commons-beanutils-core:*:*, 
> commons-logging:commons-logging:*:*, org.mockito:mockito-all:*:*, 
> org.mortbay.jetty:servlet-api-2.5:*:*, javax.servlet:servlet-api:*:*, 
> junit:junit:*:*, com.sun.jersey:*:*:*, 
> com.sun.jersey.jersey-test-framework:*:*:*, com.sun.jersey.contribs:*:*:*, 
> net.java.dev.jets3t:jets3t:*:*, javax.ws.rs:jsr311-api:*:*, 
> org.eclipse.jetty:jetty-webapp:*:*])
> [DEBUG]   org.apache.hadoop:hadoop-common:jar:2.8.5:provided
> [DEBUG]  com.hadoop.gplcompression:hadoop-lzo:jar:0.4.19:provided
> [DEBUG]  commons-cli:commons-cli:jar:1.2:provided
> [DEBUG]  org.apache.commons:commons-math3:jar:3.4.1:provided (version 
> managed from 3.1.1)
> [DEBUG]  org.apache.httpcomponents:httpclient:jar:4.5.6:provided 
> (version managed from 4.5.2)
> [DEBUG] org.apache.httpcomponents:httpcore:jar:4.4.12:provided 
> (version managed from 4.4.10)
> [DEBUG]  commons-codec:common

Spark master build hangs using parallel build option in maven

2020-01-17 Thread Saurabh Chawla
Hi All,

Spark master build hangs when using the parallel build option in maven. When
running the build sequentially on spark master using maven, the build did not
hang. This issue occurs when giving the hadoop-provided (*-Phadoop-provided
-Dhadoop.version=2.8.5*) option. The same command works fine to build
spark-2.4.3 in parallel.

*Command to build spark master sequentially - *Spark build works fine
build/mvn  -Duse.zinc.server=false -DuseZincForJdk8=false
-Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
-Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
-DskipTests=true  clean package

*Command to build spark master parallel - *spark build hangs
build/mvn -X -Duse.zinc.server=false -DuseZincForJdk8=false
-Dmaven.javadoc.skip=true -DskipSource=true -Phive -Phive-thriftserver
-Phive-provided -Pyarn -Phadoop-provided -Dhadoop.version=2.8.5
-DskipTests=true -T 4 clean package

This is the trace which keeps on repeating in maven logs

[DEBUG] building maven31 dependency graph for
org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT with
Maven31DependencyGraphBuilder
[DEBUG] Dependency collection stats: {ConflictMarker.analyzeTime=60583,
ConflictMarker.markTime=23750, ConflictMarker.nodeCount=419,
ConflictIdSorter.graphTime=41262, ConflictIdSorter.topsortTime=9704,
ConflictIdSorter.conflictIdCount=105,
ConflictIdSorter.conflictIdCycleCount=0, ConflictResolver.totalTime=632542,
ConflictResolver.conflictItemCount=193,
DefaultDependencyCollector.collectTime=1020759,
DefaultDependencyCollector.transformTime=775495}
[DEBUG] org.apache.spark:spark-network-yarn_2.12:jar:3.0.0-SNAPSHOT
[DEBUG]
 org.apache.spark:spark-network-shuffle_2.12:jar:3.0.0-SNAPSHOT:compile
[DEBUG]
org.apache.spark:spark-network-common_2.12:jar:3.0.0-SNAPSHOT:compile
[DEBUG]  io.netty:netty-all:jar:4.1.42.Final:compile (version
managed from 4.1.42.Final)
[DEBUG]  org.apache.commons:commons-lang3:jar:3.9:compile (version
managed from 3.9)
[DEBUG]  org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
(version managed from 1.8)
[DEBUG]
 com.fasterxml.jackson.core:jackson-databind:jar:2.10.0:compile (version
managed from 2.10.0)
[DEBUG]
com.fasterxml.jackson.core:jackson-core:jar:2.10.0:compile (version managed
from 2.10.0)
[DEBUG]
 com.fasterxml.jackson.core:jackson-annotations:jar:2.10.0:compile (version
managed from 2.10.0)
[DEBUG]  com.google.code.findbugs:jsr305:jar:3.0.0:compile (version
managed from 3.0.0)
[DEBUG]  com.google.guava:guava:jar:14.0.1:provided (scope managed
from compile) (version managed from 14.0.1)
[DEBUG]  org.apache.commons:commons-crypto:jar:1.0.0:compile
(version managed from 1.0.0) (exclusions managed from
[net.java.dev.jna:jna:*:*])
[DEBUG]   io.dropwizard.metrics:metrics-core:jar:4.1.1:compile (version
managed from 4.1.1)
[DEBUG]org.apache.spark:spark-tags_2.12:jar:3.0.0-SNAPSHOT:test
[DEBUG]   org.scala-lang:scala-library:jar:2.12.10:compile (version
managed from 2.12.10)
[DEBUG]org.apache.spark:spark-tags_2.12:jar:tests:3.0.0-SNAPSHOT:test
[DEBUG]org.apache.hadoop:hadoop-client:jar:2.8.5:provided (exclusions
managed from [org.fusesource.leveldbjni:leveldbjni-all:*:*, asm:asm:*:*,
org.codehaus.jackson:jackson-mapper-asl:*:*, org.ow2.asm:asm:*:*,
org.jboss.netty:netty:*:*, io.netty:netty:*:*,
commons-beanutils:commons-beanutils-core:*:*,
commons-logging:commons-logging:*:*, org.mockito:mockito-all:*:*,
org.mortbay.jetty:servlet-api-2.5:*:*, javax.servlet:servlet-api:*:*,
junit:junit:*:*, com.sun.jersey:*:*:*,
com.sun.jersey.jersey-test-framework:*:*:*, com.sun.jersey.contribs:*:*:*,
net.java.dev.jets3t:jets3t:*:*, javax.ws.rs:jsr311-api:*:*,
org.eclipse.jetty:jetty-webapp:*:*])
[DEBUG]   org.apache.hadoop:hadoop-common:jar:2.8.5:provided
[DEBUG]  com.hadoop.gplcompression:hadoop-lzo:jar:0.4.19:provided
[DEBUG]  commons-cli:commons-cli:jar:1.2:provided
[DEBUG]  org.apache.commons:commons-math3:jar:3.4.1:provided
(version managed from 3.1.1)
[DEBUG]  org.apache.httpcomponents:httpclient:jar:4.5.6:provided
(version managed from 4.5.2)
[DEBUG] org.apache.httpcomponents:httpcore:jar:4.4.12:provided
(version managed from 4.4.10)
[DEBUG]  commons-codec:commons-codec:jar:1.10:provided (version
managed from 1.11)
[DEBUG]  commons-io:commons-io:jar:2.4:provided (version managed
from 2.5)
[DEBUG]  commons-net:commons-net:jar:3.1:provided (version managed
from 3.6)
[DEBUG]  commons-collections:commons-collections:jar:3.2.2:provided
(version managed from 3.2.2)
[DEBUG]
 org.eclipse.jetty:jetty-servlet:jar:9.4.18.v20190429:provided (scope
managed from compile) (version managed from 9.3.19.v20170502)
[DEBUG]
org.eclipse.jetty:jetty-security:jar:9.4.18.v20190429:provided (scope
managed from compile) (version managed from 9.4.18.v20190429)
[DEBUG]  javax.servlet.jsp:jsp-api:jar:2.1:provided
[DEBUG]  log4j:log4j:jar:1.2.17:provided (scope managed from

Re: [build system] maven master branch builds timing out en masse...

2019-10-07 Thread Sean Owen
Moving the conversation here -- yes, why on earth are they taking this
long all of a sudden? We'll have to look again when they come back
online.  The last successful build took 6 hours, of which 4:45 were
the unit tests themselves.

It's mostly SQL tests; SQLQuerySuite is approaching an hour.
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/lastStableBuild/testReport/org.apache.spark.sql/SQLQueryTestSuite/

Well, at least it seems like we legitimately need these full
integration tests to run that long, and that's just life right now, so
8 hours is better than failing. But yeah, seems like we just test a
whole lot more recently and it's taking a long time. Kudos to recent
efforts to parallelize some of that.


On Mon, Oct 7, 2019 at 1:52 PM Shane Knapp  wrote:
>
> just chatted w/sean privately and i'm going to up the test timeouts to
> 480mins (8 hours).
>
> i still don't like this but at least it should hopefully get things green 
> again.
>
> On Mon, Oct 7, 2019 at 11:31 AM Shane Knapp  wrote:
> >
> > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/
> > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7-ubuntu-testing/
> > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/
> > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-jdk-11/
> >
> > spark-master-test-maven-hadoop-3.2 looks like it's been timing out for
> > ~2 weeks, but the others only for a few days.
> >
> > all of these builds have a 335 minute (5 hours, 35 mins) timeout configured.
> >
> > honestly, i feel that 5.5 hours is already an insane amount of time
> > for these builds to run...   do we need to increase it?  can
> > someone(s) here figure out what's taking so long and refactor some of
> > the tests?
> >
> > shane
> > --
> > Shane Knapp
> > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > https://rise.cs.berkeley.edu
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [build system] maven master branch builds timing out en masse...

2019-10-07 Thread Shane Knapp
just chatted w/sean privately and i'm going to up the test timeouts to
480mins (8 hours).

i still don't like this but at least it should hopefully get things green again.

On Mon, Oct 7, 2019 at 11:31 AM Shane Knapp  wrote:
>
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7-ubuntu-testing/
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-jdk-11/
>
> spark-master-test-maven-hadoop-3.2 looks like it's been timing out for
> ~2 weeks, but the others only for a few days.
>
> all of these builds have a 335 minute (5 hours, 35 mins) timeout configured.
>
> honestly, i feel that 5.5 hours is already an insane amount of time
> for these builds to run...   do we need to increase it?  can
> someone(s) here figure out what's taking so long and refactor some of
> the tests?
>
> shane
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu



-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[build system] maven master branch builds timing out en masse...

2019-10-07 Thread Shane Knapp
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7/
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7-ubuntu-testing/
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-jdk-11/

spark-master-test-maven-hadoop-3.2 looks like it's been timing out for
~2 weeks, but the others only for a few days.

all of these builds have a 335 minute (5 hours, 35 mins) timeout configured.

honestly, i feel that 5.5 hours is already an insane amount of time
for these builds to run...   do we need to increase it?  can
someone(s) here figure out what's taking so long and refactor some of
the tests?

shane
-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[build system] maven snapshot builds moved to ubuntu workers

2019-10-04 Thread Shane Knapp
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-maven-snapshots/
https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.4-maven-snapshots/

i created dry-run test builds and everything looked great.  please
file a JIRA if anything published by these jobs looks fishy.

shane
-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: maven 3.6.1 removed from apache maven repo

2019-09-03 Thread Felix Cheung
(Hmm, what is spark-...@apache.org?)


From: Sean Owen 
Sent: Tuesday, September 3, 2019 11:58:30 AM
To: Xiao Li 
Cc: Tom Graves ; spark-...@apache.org 

Subject: Re: maven 3.6.1 removed from apache maven repo

It's because build/mvn only queries ASF mirrors, and they remove non-current 
releases from mirrors regularly (we do the same).
This may help avoid this in the future: 
https://github.com/apache/spark/pull/25667

On Tue, Sep 3, 2019 at 1:41 PM Xiao Li <lix...@databricks.com> wrote:
Hi, Tom,

To unblock the build, I merged the upgrade to master. 
https://github.com/apache/spark/pull/25665

Thanks!

Xiao


On Tue, Sep 3, 2019 at 10:58 AM Tom Graves  wrote:
It looks like maven 3.6.1 was removed from the repo - see SPARK-28960.  It 
looks like they pushed 3.6.2,  but I don't see any release notes on the maven 
page for it 3.6.2

Seems like we had this happen before, can't remember if it was maven or 
something else, anyone remember or know if they are about to release 3.6.2?

Tom


--
[Databricks Summit - Watch the 
talks]<https://databricks.com/sparkaisummit/north-america>


Re: maven 3.6.1 removed from apache maven repo

2019-09-03 Thread Sean Owen
It's because build/mvn only queries ASF mirrors, and they remove
non-current releases from mirrors regularly (we do the same).
This may help avoid this in the future:
https://github.com/apache/spark/pull/25667

On Tue, Sep 3, 2019 at 1:41 PM Xiao Li  wrote:

> Hi, Tom,
>
> To unblock the build, I merged the upgrade to master.
> https://github.com/apache/spark/pull/25665
>
> Thanks!
>
> Xiao
>
>
> On Tue, Sep 3, 2019 at 10:58 AM Tom Graves 
> wrote:
>
>> It looks like maven 3.6.1 was removed from the repo - see SPARK-28960.
>> It looks like they pushed 3.6.2,  but I don't see any release notes on the
>> maven page for it 3.6.2
>>
>> Seems like we had this happen before, can't remember if it was maven or
>> something else, anyone remember or know if they are about to release 3.6.2?
>>
>> Tom
>>
>
>
> --
> [image: Databricks Summit - Watch the talks]
> <https://databricks.com/sparkaisummit/north-america>
>


Re: maven 3.6.1 removed from apache maven repo

2019-09-03 Thread Xiao Li
Hi, Tom,

To unblock the build, I merged the upgrade to master.
https://github.com/apache/spark/pull/25665

Thanks!

Xiao


On Tue, Sep 3, 2019 at 10:58 AM Tom Graves 
wrote:

> It looks like maven 3.6.1 was removed from the repo - see SPARK-28960.  It
> looks like they pushed 3.6.2,  but I don't see any release notes on the
> maven page for it 3.6.2
>
> Seems like we had this happen before, can't remember if it was maven or
> something else, anyone remember or know if they are about to release 3.6.2?
>
> Tom
>


-- 
[image: Databricks Summit - Watch the talks]
<https://databricks.com/sparkaisummit/north-america>


maven 3.6.1 removed from apache maven repo

2019-09-03 Thread Tom Graves
It looks like maven 3.6.1 was removed from the repo - see SPARK-28960. It
looks like they pushed 3.6.2, but I don't see any release notes for 3.6.2 on
the maven page.
Seems like we had this happen before, can't remember if it was maven or 
something else, anyone remember or know if they are about to release 3.6.2?
Tom

Re: Master maven build failing for 6 days -- may need some more eyes

2019-05-30 Thread Xiao Li
Thanks! Yuming and Gengliang are working on this.

On Thu, May 30, 2019 at 8:21 AM Sean Owen  wrote:

> I might need some help figuring this out. The master Maven build has
> been failing for almost a week, and I'm having trouble diagnosing why.
> Of course, the PR builder has been fine.
>
>
> First one seems to be:
>
>
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6393/
>
> [SPARK-27831][SQL][TEST][test-hadoop3.2] Move Hive test jars to maven
> (detail / githubweb)
> [MINOR][DOCS][R] Use actual version in SparkR Arrow guide for (detail
> / githubweb)
> [SPARK-26356][SQL] remove SaveMode from data source v2 (detail / githubweb)
> [SPARK-27824][SQL] Make rule EliminateResolvedHint idempotent (detail
> / githubweb)
> [SPARK-27677][CORE] Disable by default fetching of disk persisted RDD
> (detail / githubweb)
> ...
> = FINISHED o.a.s.sql.hive.thriftserver.HiveThriftBinaryServerSuite:
> 'test add jar' =
>
> *** RUN ABORTED ***
>   java.lang.NoClassDefFoundError:
> org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax
>   at
> org.apache.spark.sql.hive.test.HiveTestUtils$.(HiveTestUtils.scala:28)
>   at
> org.apache.spark.sql.hive.test.HiveTestUtils$.(HiveTestUtils.scala)
>   at
> org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite.$anonfun$new$55(HiveThriftServer2Suites.scala:488)
>
>
>
>
> but scanning later failed builds I'm seeing things like:
>
>
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6412
>
> - conjuncts before nondeterministic
> DataSourceV2ScanExecRedactionSuite:
> - treeString is redacted *** FAILED ***
>   "== Parsed Logical Plan ==
>   RelationV2[a#863554L, foo#863555] orc
>
> *(redacted).7/sql/core/target/tmp/spark-9a065829-75ff-450e-90eb-ec8e344851da
>
>
> This might be caused by https://github.com/apache/spark/pull/24719
>
>
>
> I posted some comments on related PRs already but heads up that others
> may want to look at these failures too.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

-- 
[image: Databricks Summit - Watch the talks]
<https://databricks.com/sparkaisummit/north-america>


Master maven build failing for 6 days -- may need some more eyes

2019-05-30 Thread Sean Owen
I might need some help figuring this out. The master Maven build has
been failing for almost a week, and I'm having trouble diagnosing why.
Of course, the PR builder has been fine.


First one seems to be:

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6393/

[SPARK-27831][SQL][TEST][test-hadoop3.2] Move Hive test jars to maven
(detail / githubweb)
[MINOR][DOCS][R] Use actual version in SparkR Arrow guide for (detail
/ githubweb)
[SPARK-26356][SQL] remove SaveMode from data source v2 (detail / githubweb)
[SPARK-27824][SQL] Make rule EliminateResolvedHint idempotent (detail
/ githubweb)
[SPARK-27677][CORE] Disable by default fetching of disk persisted RDD
(detail / githubweb)
...
= FINISHED o.a.s.sql.hive.thriftserver.HiveThriftBinaryServerSuite:
'test add jar' =

*** RUN ABORTED ***
  java.lang.NoClassDefFoundError:
org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax
  at 
org.apache.spark.sql.hive.test.HiveTestUtils$.(HiveTestUtils.scala:28)
  at org.apache.spark.sql.hive.test.HiveTestUtils$.(HiveTestUtils.scala)
  at 
org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite.$anonfun$new$55(HiveThriftServer2Suites.scala:488)




but scanning later failed builds I'm seeing things like:

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6412

- conjuncts before nondeterministic
DataSourceV2ScanExecRedactionSuite:
- treeString is redacted *** FAILED ***
  "== Parsed Logical Plan ==
  RelationV2[a#863554L, foo#863555] orc
*(redacted).7/sql/core/target/tmp/spark-9a065829-75ff-450e-90eb-ec8e344851da


This might be caused by https://github.com/apache/spark/pull/24719



I posted some comments on related PRs already but heads up that others
may want to look at these failures too.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [build system] speeding up maven build building only changed modules compared to master branch

2019-01-28 Thread Reynold Xin
This might be useful to do.

BTW, based on my experience with different build systems in the past few years 
(extensively SBT/Maven/Bazel, and to a lesser extent Gradle/Cargo), I think the 
longer term solution is to move to Bazel. It is so much easier to understand 
and use, and also much more feature rich with great support for multiple 
languages. It also supports distributed/local build cache so builds can be much 
faster.

On Mon, Jan 28, 2019 at 1:28 AM, Gabor Somogyi < gabor.g.somo...@gmail.com > 
wrote:

> 
> Do you have some numbers how much is this faster? I'm asking it because
> previously I've evaluated another plugin and found the following:
> - Incremental build didn't bring too much even in bigger than spark
> projects
> - Incremental test was buggy and sometimes the required tests were not
> executed which caused several issues
> All in all a single tiny little bug in the incremental test could cause
> horror for developers so it must be rock solid.
> Is this project used somewhere in production?
> 
> On Sat, Jan 26, 2019 at 4:03 PM Sean Owen <sro...@gmail.com> wrote:
> 
> 
>> Sounds interesting; would it be able to handle R and Python modules built
>> by this project ? The home grown solution here does I think and that is
>> helpful. 
>> 
>> On Sat, Jan 26, 2019, 6:57 AM vaclavkosar <ad...@vaclavkosar.com> wrote:
>> 
>> 
>>> 
>>> 
>>> I think it would be good idea to use gitflow-incremental-builder maven
>>> plugin for Spark builds. It saves resources by building only modules that
>>> are impacted by changes compared to git master branch via
>>> gitflow-incremental-builder maven plugin. For example if there is only a
>>> change introduced into on of files of spark-avro_2.11 then only that maven
>>> module and its maven dependencies and dependents would be build or tested.
>>> If there are no disagreements, I can submit a pull request for that.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Project URL: https://github.com/vackosar/gitflow-incremental-builder
>>> 
>>> 
>>> 
>>> Disclaimer: I originally created the project. But most of recent
>>> improvements and maintenance were deved by Falko.
>>> 
>>> 
>>> 
>> 
>> 
> 
>

Re: [build system] speeding up maven build building only changed modules compared to master branch

2019-01-28 Thread Gabor Somogyi
Do you have some numbers on how much faster this is? I'm asking because
previously I've evaluated another plugin and found the following:
- Incremental build didn't bring too much even in bigger than spark projects
- Incremental test was buggy and sometimes the required tests were not
executed which caused several issues
All in all a single tiny little bug in the incremental test could cause
horror for developers so it must be rock solid.
Is this project used somewhere in production?

On Sat, Jan 26, 2019 at 4:03 PM Sean Owen  wrote:

> Sounds interesting; would it be able to handle R and Python modules built
> by this project ? The home grown solution here does I think and that is
> helpful.
>
On Sat, Jan 26, 2019, 6:57 AM vaclavkosar wrote:
>> I think it would be good idea to use gitflow-incremental-builder maven
>> plugin for Spark builds. It saves resources by building only modules
>> that are impacted by changes compared to git master branch via
>> gitflow-incremental-builder maven plugin. For example if there is only a
>> change introduced into on of files of spark-avro_2.11 then only that maven
>> module and its maven dependencies and dependents would be build or tested.
>> If there are no disagreements, I can submit a pull request for that.
>>
>>
>> Project URL: https://github.com/vackosar/gitflow-incremental-builder
>>
>> Disclaimer: I originally created the project. But most of recent
>> improvements and maintenance were deved by Falko.
>>
>


Re: [build system] speeding up maven build building only changed modules compared to master branch

2019-01-26 Thread Sean Owen
Sounds interesting; would it be able to handle R and Python modules built
by this project ? The home grown solution here does I think and that is
helpful.

On Sat, Jan 26, 2019, 6:57 AM vaclavkosar wrote:
> I think it would be good idea to use gitflow-incremental-builder maven
> plugin for Spark builds. It saves resources by building only modules that
> are impacted by changes compared to git master branch via
> gitflow-incremental-builder maven plugin. For example if there is only a
> change introduced into on of files of spark-avro_2.11 then only that maven
> module and its maven dependencies and dependents would be build or tested.
> If there are no disagreements, I can submit a pull request for that.
>
>
> Project URL: https://github.com/vackosar/gitflow-incremental-builder
>
> Disclaimer: I originally created the project. But most of recent
> improvements and maintenance were deved by Falko.
>


[build system] speeding up maven build building only changed modules compared to master branch

2019-01-26 Thread vaclavkosar
I think it would be a good idea to use the gitflow-incremental-builder maven
plugin for Spark builds. It saves resources by building only the modules
that are impacted by changes compared to the git master branch. For example,
if there is only a change introduced into one of the files of spark-avro_2.11,
then only that maven module and its maven dependencies and dependents would
be built or tested. If there are no disagreements, I can submit a pull request
for that.



Project URL: https://github.com/vackosar/gitflow-incremental-builder

Disclaimer: I originally created the project, but most of the recent
improvements and maintenance were done by Falko.
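
For reference, the plugin is typically wired in as a Maven core extension
(for example in .mvn/extensions.xml); the coordinates and version below are
from memory and may be outdated, so check the project README before using them:

    <extensions>
      <extension>
        <!-- coordinates and version from memory; verify against the project README -->
        <groupId>com.vackosar.gitflowincrementalbuilder</groupId>
        <artifactId>gitflow-incremental-builder</artifactId>
        <version>3.8</version>
      </extension>
    </extensions>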




Re: Maven

2018-11-20 Thread Sean Owen
Sure, if you published Spark artifacts in a local repo (even your
local file system) as com.foo:spark-core_2.12, etc, just depend on
those artifacts, not the org.apache ones.
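
As a sketch, assuming the fork was installed into the local repository under
custom coordinates (for example via ./build/mvn -DskipTests install after
changing the groupId and version in the build), the application pom would
depend on it like this; com.foo and the version string are placeholders:

    <dependency>
      <!-- placeholder coordinates for a locally installed Spark fork -->
      <groupId>com.foo</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>2.4.0-custom</version>
    </dependency>
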
On Tue, Nov 20, 2018 at 3:21 PM Jack Kolokasis  wrote:
>
> Hello,
>
> is there any way to use my local custom - Spark as dependency while
> I am using maven to compile my applications ?
>
> Thanks for your reply,
> --Iacovos
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Maven

2018-11-20 Thread Jack Kolokasis

Hello,

   is there any way to use my local custom Spark build as a dependency while
I am using maven to compile my applications?


Thanks for your reply,
--Iacovos

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Spark scala development in Sbt vs Maven

2018-03-05 Thread Anthony May
We use sbt for easy cross project dependencies with multiple scala versions
in a mono-repo, for which it is pretty good albeit with some quirks. As our
projects have matured and change less, we moved away from cross project
dependencies, but it was extremely useful early in the projects. We knew
that a lot of this was possible in maven/gradle but did not want to go
through the hackery required to get it working.

On Mon, 5 Mar 2018 at 09:49 Sean Owen <sro...@gmail.com> wrote:

> Spark uses Maven as the primary build, but SBT works as well. It reads the
> Maven build to some extent.
>
> Zinc incremental compilation works with Maven (with the Scala plugin for
> Maven).
>
> Myself, I prefer Maven, for some of the reasons it is the main build in
> Spark: declarative builds end up being a good thing. You want builds very
> standard. I think the flexibility of writing code to express your build
> just gives a lot of rope to hang yourself with, and recalls the old days of
> Ant builds, where no two builds you'd encounter looked alike when doing the
> same thing.
>
> If by cross publishing you mean handling different scala versions, yeah
> SBT is more aware of that. The Spark Maven build manages to handle that
> with some hacking.
>
>
> On Mon, Mar 5, 2018 at 9:56 AM Jörn Franke <jornfra...@gmail.com> wrote:
>
>> I think most of the scala development in Spark happens with sbt - in the
>> open source world.
>>
>>  However, you can do it with Gradle and Maven as well. It depends on your
>> organization etc. what is your standard.
>>
>> Some things might be more cumbersome too reach in non-sbt scala
>> scenarios, but this is more and more improving.
>>
>> > On 5. Mar 2018, at 16:47, Swapnil Shinde <swapnilushi...@gmail.com>
>> wrote:
>> >
>> > Hello
>> >SBT's incremental compilation was a huge plus to build spark+scala
>> applications in SBT for some time. It seems Maven can also support
>> incremental compilation with Zinc server. Considering that, I am interested
>> to know communities experience -
>> >
>> > 1. Spark documentation says SBT is being used by many contributors for
>> day to day development mainly because of incremental compilation.
>> Considering Maven is supporting incremental compilation through Zinc, do
>> contributors prefer to change from SBT to maven?
>> >
>> > 2. Any issues /learning experiences with Maven + Zinc?
>> >
>> > 3. Any other reasons to use SBT over Maven for scala development.
>> >
>> > I understand SBT has many other advantages over Maven like cross
>> version publishing etc. but incremental compilation is major need for us. I
>> am more interested to know why Spark contributors/committers prefer SBT for
>> day to day development.
>> >
>> > Any help and advice would help us to direct our evaluations in right
>> direction,
>> >
>> > Thanks
>> > Swapnil
>>
>> -
>>
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


Re: Spark scala development in Sbt vs Maven

2018-03-05 Thread Sean Owen
Spark uses Maven as the primary build, but SBT works as well. It reads the
Maven build to some extent.

Zinc incremental compilation works with Maven (with the Scala plugin for
Maven).
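
For the curious, that support is configured on the plugin itself; a rough
sketch follows, with the plugin version and option names as I recall them for
the 3.x line of the plugin, so verify against the plugin documentation:

    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <!-- version and options below are from memory for the 3.x plugin line -->
      <version>3.2.2</version>
      <configuration>
        <recompileMode>incremental</recompileMode>
        <useZincServer>true</useZincServer>
      </configuration>
    </plugin>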

Myself, I prefer Maven, for some of the reasons it is the main build in
Spark: declarative builds end up being a good thing. You want builds very
standard. I think the flexibility of writing code to express your build
just gives a lot of rope to hang yourself with, and recalls the old days of
Ant builds, where no two builds you'd encounter looked alike when doing the
same thing.

If by cross publishing you mean handling different scala versions, yeah SBT
is more aware of that. The Spark Maven build manages to handle that with
some hacking.


On Mon, Mar 5, 2018 at 9:56 AM Jörn Franke <jornfra...@gmail.com> wrote:

> I think most of the scala development in Spark happens with sbt - in the
> open source world.
>
>  However, you can do it with Gradle and Maven as well. It depends on your
> organization etc. what is your standard.
>
> Some things might be more cumbersome too reach in non-sbt scala scenarios,
> but this is more and more improving.
>
> > On 5. Mar 2018, at 16:47, Swapnil Shinde <swapnilushi...@gmail.com>
> wrote:
> >
> > Hello
> >SBT's incremental compilation was a huge plus to build spark+scala
> applications in SBT for some time. It seems Maven can also support
> incremental compilation with Zinc server. Considering that, I am interested
> to know communities experience -
> >
> > 1. Spark documentation says SBT is being used by many contributors for
> day to day development mainly because of incremental compilation.
> Considering Maven is supporting incremental compilation through Zinc, do
> contributors prefer to change from SBT to maven?
> >
> > 2. Any issues /learning experiences with Maven + Zinc?
> >
> > 3. Any other reasons to use SBT over Maven for scala development.
> >
> > I understand SBT has many other advantages over Maven like cross version
> publishing etc. but incremental compilation is major need for us. I am more
> interested to know why Spark contributors/committers prefer SBT for day to
> day development.
> >
> > Any help and advice would help us to direct our evaluations in right
> direction,
> >
> > Thanks
> > Swapnil
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: Spark scala development in Sbt vs Maven

2018-03-05 Thread Jörn Franke
I think most of the scala development in Spark happens with sbt - in the open 
source world.

 However, you can do it with Gradle and Maven as well. It depends on your 
organization etc. and what your standard is.

Some things might be more cumbersome to reach in non-sbt Scala scenarios, but 
this is steadily improving.

> On 5. Mar 2018, at 16:47, Swapnil Shinde <swapnilushi...@gmail.com> wrote:
> 
> Hello
>SBT's incremental compilation was a huge plus to build spark+scala 
> applications in SBT for some time. It seems Maven can also support 
> incremental compilation with the Zinc server. Considering that, I am interested 
> to know the community's experience -
> 
> 1. Spark documentation says SBT is being used by many contributors for day to 
> day development mainly because of incremental compilation. Considering Maven 
> supports incremental compilation through Zinc, do contributors prefer to 
> change from SBT to Maven?
> 
> 2. Any issues /learning experiences with Maven + Zinc?
> 
> 3. Any other reasons to use SBT over Maven for scala development.
> 
> I understand SBT has many other advantages over Maven like cross version 
> publishing etc., but incremental compilation is a major need for us. I am more 
> interested to know why Spark contributors/committers prefer SBT for day to 
> day development.
> 
> Any help and advice would help us to direct our evaluations in the right 
> direction,
> 
> Thanks
> Swapnil

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Object in compiler mirror not found - maven build

2017-11-26 Thread Mark Hamstra
Or you just have zinc running but in a bad state. `zinc -shutdown` should
kill it off and let you try again.
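
Something like the following (assuming the zinc copy that build/mvn downloads;
the exact path depends on the Zinc version in your checkout):

```
./build/zinc-0.3.9/bin/zinc -shutdown    # stop the wedged server
./build/mvn -DskipTests clean package    # build/mvn starts a fresh Zinc instance
```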

On Sun, Nov 26, 2017 at 2:12 PM, Sean Owen <so...@cloudera.com> wrote:

> I'm not seeing that on OS X or Linux. It sounds a bit like you have an old
> version of zinc or scala or something installed.
>
> On Sun, Nov 26, 2017 at 3:55 PM Tomasz Dudek <
> megatrontomaszdu...@gmail.com> wrote:
>
>> Hello everyone,
>>
>> I would love to help develop Apache Spark. I have run into a (very
>> basic?) issue which holds me back in that mission.
>>
>> I followed the `how to contribute` guide, however running ./build/mvn
>> -DskipTests clean package fails with:
>>
>> [INFO] Using zinc server for incremental compilation
>> [info] 'compiler-interface' not yet compiled for Scala 2.11.8.
>> Compiling...
>> error: scala.reflect.internal.MissingRequirementError: object
>> java.lang.Object in compiler mirror not found.
>> at scala.reflect.internal.MissingRequirementError$.signal(
>> MissingRequirementError.scala:17)
>> at scala.reflect.internal.MissingRequirementError$.notFound(
>> MissingRequirementError.scala:18)
>> at scala.reflect.internal.Mirrors$RootsBase.
>> getModuleOrClass(Mirrors.scala:53)
>>
>> Is it perhaps a compatibility issue? The versions I use are as follows:
>>
>> ➜  spark git:(master) ✗ ./build/mvn --version
>> Using `mvn` from path: /Users/tdudek/Programming/
>> spark/build/apache-maven-3.3.9/bin/mvn
>> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5;
>> 2015-11-10T17:41:47+01:00)
>> Maven home: /Users/tdudek/Programming/spark/build/apache-maven-3.3.9
>> Java version: 1.8.0_152, vendor: Oracle Corporation
>> Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_
>> 152.jdk/Contents/Home/jre
>> Default locale: en_PL, platform encoding: US-ASCII
>> OS name: "mac os x", version: "10.13.1", arch: "x86_64", family: "mac"
>>
>> I just lost a few hours mindlessly trying to make it work. I hate to waste
>> other people's time and I'm REALLY ashamed of my question, but I think I am
>> missing something fundamental.
>>
>> Cheers,
>> Tomasz
>>
>


Re: Object in compiler mirror not found - maven build

2017-11-26 Thread Sean Owen
I'm not seeing that on OS X or Linux. It sounds a bit like you have an old
version of zinc or scala or something installed.

On Sun, Nov 26, 2017 at 3:55 PM Tomasz Dudek <megatrontomaszdu...@gmail.com>
wrote:

> Hello everyone,
>
> I would love to help develop Apache Spark. I have run into a (very basic?)
> issue which holds me back in that mission.
>
> I followed the `how to contribute` guide, however running ./build/mvn
> -DskipTests clean package fails with:
>
> [INFO] Using zinc server for incremental compilation
> [info] 'compiler-interface' not yet compiled for Scala 2.11.8. Compiling...
> error: scala.reflect.internal.MissingRequirementError: object
> java.lang.Object in compiler mirror not found.
> at
> scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:17)
> at
> scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:18)
> at
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:53)
>
> Is it perhaps a compatibility issue? The versions I use are as follows:
>
> ➜  spark git:(master) ✗ ./build/mvn --version
> Using `mvn` from path:
> /Users/tdudek/Programming/spark/build/apache-maven-3.3.9/bin/mvn
> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5;
> 2015-11-10T17:41:47+01:00)
> Maven home: /Users/tdudek/Programming/spark/build/apache-maven-3.3.9
> Java version: 1.8.0_152, vendor: Oracle Corporation
> Java home:
> /Library/Java/JavaVirtualMachines/jdk1.8.0_152.jdk/Contents/Home/jre
> Default locale: en_PL, platform encoding: US-ASCII
> OS name: "mac os x", version: "10.13.1", arch: "x86_64", family: "mac"
>
> I just lost a few hours mindlessly trying to make it work. I hate to waste
> other people's time and I'm REALLY ashamed of my question, but I think I am
> missing something fundamental.
>
> Cheers,
> Tomasz
>


Object in compiler mirror not found - maven build

2017-11-26 Thread Tomasz Dudek
Hello everyone,

I would love to help develop Apache Spark. I have run into a (very basic?)
issue which holds me back in that mission.

I followed the `how to contribute` guide, however running ./build/mvn
-DskipTests clean package fails with:

[INFO] Using zinc server for incremental compilation
[info] 'compiler-interface' not yet compiled for Scala 2.11.8. Compiling...
error: scala.reflect.internal.MissingRequirementError: object
java.lang.Object in compiler mirror not found.
at
scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:17)
at
scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:18)
at
scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:53)

Is it perhaps a compatibility issue? The versions I use are as follows:

➜  spark git:(master) ✗ ./build/mvn --version
Using `mvn` from path:
/Users/tdudek/Programming/spark/build/apache-maven-3.3.9/bin/mvn
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5;
2015-11-10T17:41:47+01:00)
Maven home: /Users/tdudek/Programming/spark/build/apache-maven-3.3.9
Java version: 1.8.0_152, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.8.0_152.jdk/Contents/Home/jre
Default locale: en_PL, platform encoding: US-ASCII
OS name: "mac os x", version: "10.13.1", arch: "x86_64", family: "mac"

I just lost a few hours mindlessly trying to make it work. I hate to waste
other people's time and I'm REALLY ashamed of my question, but I think I am
missing something fundamental.

Cheers,
Tomasz


Re: In Intellij, maven failed to build Catalyst project

2017-02-20 Thread Armin Braun
I think the reason you're seeing this (and it then disappearing in Sean's
case) is likely that there was a change in another module that required a
recompile of a module dependency.
Maven doesn't do this automatically by default. So it eventually goes away
when you do a full build either with Maven or SBT.

You could add `--also-make` (`-am`) as a build flag to fix this behavior and
rebuild the module's dependencies too.
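
For the Catalyst case discussed here, that would look roughly like:

```
# rebuild spark-catalyst together with the in-tree modules it depends on
./build/mvn -pl sql/catalyst -am -DskipTests install
```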

On Tue, Feb 21, 2017 at 6:35 AM, Sean Owen <so...@cloudera.com> wrote:

> I saw this too yesterday but not today. It may have been fixed by some
> recent commits.
>
> On Mon, Feb 20, 2017 at 6:52 PM ron8hu <ron...@huawei.com> wrote:
>
> I am using Intellij IDEA 15.0.6.   I used to use Maven to compile Spark
> project Catalyst inside Intellij without problem.
>
> A couple of days ago, I fetched latest Spark code from its master
> repository.  There was a change in CreateJacksonParser.scala.  So I used
> Maven to compile Catalyst project again.  Then I got this error:
> ./spark/sql/catalyst/json/CreateJacksonParser.scala:33: value
> getByteBuffer is not a member of org.apache.spark.unsafe.types.UTF8String
>
> Note that I was able to compile the source code using sbt.  This compile
> error happens inside Intellij/Maven.  Below is the detailed message.   Any
> advice will be appreciated.
> ***    ***
> [INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @
> spark-catalyst_2.11 ---
> [INFO] Using zinc server for incremental compilation
>
> Compiling 203 Scala sources and 26 Java sources to
> /home/rhu/spark-20/sql/catalyst/target/scala-2.11/classes
> /home/rhu/spark-20/sql/catalyst/src/main/scala/org/
> apache/spark/sql/catalyst/json/CreateJacksonParser.scala:33:
> value getByteBuffer is not a member of
> org.apache.spark.unsafe.types.UTF8String
> val bb = record.getByteBuffer
> ^ not found
>
> One error found.
> [INFO] Compile failed at Feb 20, 2017
> [INFO]
> 
> [INFO] BUILD FAILURE
> [INFO]
> 
> [INFO] Total time: 01:53 min
> [INFO] Finished at: 2017-02-20T17:37:55-08:00
> [INFO] Final Memory: 38M/795M
> [INFO]
> 
> [WARNING] The requested profile "hive" could not be activated because it
> does not exist.
> [ERROR] Failed to execute goal
> net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first)
> on
> project spark-catalyst_2.11: Execution scala-compile-first of goal
> net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed
> ->
> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please
> read the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
>
> Process finished with exit code 1
>
>
>
>
> --
> View this message in context: http://apache-spark-
> developers-list.1001551.n3.nabble.com/In-Intellij-maven-
> failed-to-build-Catalyst-project-tp21036.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: In Intellij, maven failed to build Catalyst project

2017-02-20 Thread Sean Owen
I saw this too yesterday but not today. It may have been fixed by some
recent commits.

On Mon, Feb 20, 2017 at 6:52 PM ron8hu <ron...@huawei.com> wrote:

I am using Intellij IDEA 15.0.6.   I used to use Maven to compile Spark
project Catalyst inside Intellij without problem.

A couple of days ago, I fetched latest Spark code from its master
repository.  There was a change in CreateJacksonParser.scala.  So I used
Maven to compile Catalyst project again.  Then I got this error:
./spark/sql/catalyst/json/CreateJacksonParser.scala:33: value
getByteBuffer is not a member of org.apache.spark.unsafe.types.UTF8String

Note that I was able to compile the source code using sbt.  This compile
error happens inside Intellij/Maven.  Below is the detailed message.   Any
advice will be appreciated.
***    ***
[INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @
spark-catalyst_2.11 ---
[INFO] Using zinc server for incremental compilation

Compiling 203 Scala sources and 26 Java sources to
/home/rhu/spark-20/sql/catalyst/target/scala-2.11/classes
/home/rhu/spark-20/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/CreateJacksonParser.scala:33:
value getByteBuffer is not a member of
org.apache.spark.unsafe.types.UTF8String
val bb = record.getByteBuffer
^ not found

One error found.
[INFO] Compile failed at Feb 20, 2017
[INFO]

[INFO] BUILD FAILURE
[INFO]

[INFO] Total time: 01:53 min
[INFO] Finished at: 2017-02-20T17:37:55-08:00
[INFO] Final Memory: 38M/795M
[INFO]

[WARNING] The requested profile "hive" could not be activated because it
does not exist.
[ERROR] Failed to execute goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on
project spark-catalyst_2.11: Execution scala-compile-first of goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed ->
[Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please
read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException

Process finished with exit code 1




--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/In-Intellij-maven-failed-to-build-Catalyst-project-tp21036.html
Sent from the Apache Spark Developers List mailing list archive at
Nabble.com.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


In Intellij, maven failed to build Catalyst project

2017-02-20 Thread ron8hu
I am using Intellij IDEA 15.0.6.   I used to use Maven to compile Spark
project Catalyst inside Intellij without problem.  

A couple of days ago, I fetched latest Spark code from its master
repository.  There was a change in CreateJacksonParser.scala.  So I used
Maven to compile Catalyst project again.  Then I got this error:
./spark/sql/catalyst/json/CreateJacksonParser.scala:33: value
getByteBuffer is not a member of org.apache.spark.unsafe.types.UTF8String

Note that I was able to compile the source code using sbt.  This compile
error happens inside Intellij/Maven.  Below is the detailed message.   Any
advice will be appreciated.  
***    ***
[INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @
spark-catalyst_2.11 ---
[INFO] Using zinc server for incremental compilation

Compiling 203 Scala sources and 26 Java sources to
/home/rhu/spark-20/sql/catalyst/target/scala-2.11/classes
/home/rhu/spark-20/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/CreateJacksonParser.scala:33:
value getByteBuffer is not a member of
org.apache.spark.unsafe.types.UTF8String
val bb = record.getByteBuffer
^ not found

One error found.
[INFO] Compile failed at Feb 20, 2017
[INFO]

[INFO] BUILD FAILURE
[INFO]

[INFO] Total time: 01:53 min
[INFO] Finished at: 2017-02-20T17:37:55-08:00
[INFO] Final Memory: 38M/795M
[INFO]

[WARNING] The requested profile "hive" could not be activated because it
does not exist.
[ERROR] Failed to execute goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on
project spark-catalyst_2.11: Execution scala-compile-first of goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed ->
[Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please
read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException

Process finished with exit code 1




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/In-Intellij-maven-failed-to-build-Catalyst-project-tp21036.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Issues in compiling spark 2.0.0 code using scala-maven-plugin

2016-09-30 Thread satyajit vegesna
>
>
> I am trying to compile code using Maven, which was working with Spark
> 1.6.2, but when I try with Spark 2.0.0 I get the error below:
>
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
> goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (default) on
> project NginxLoads-repartition: wrap: 
> org.apache.commons.exec.ExecuteException:
> Process exited with an error: 1 (Exit value: 1)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:212)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:153)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:145)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.
> buildProject(LifecycleModuleBuilder.java:116)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.
> buildProject(LifecycleModuleBuilder.java:80)
> at org.apache.maven.lifecycle.internal.builder.singlethreaded.
> SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.
> execute(LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
> at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
> at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> launchEnhanced(Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> launch(Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> mainWithExitCode(Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: wrap:
> org.apache.commons.exec.ExecuteException: Process exited with an error: 1
> (Exit value: 1)
> at scala_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:490)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(
> DefaultBuildPluginManager.java:134)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:207)
> ... 20 more
> Caused by: org.apache.commons.exec.ExecuteException: Process exited with
> an error: 1 (Exit value: 1)
> at org.apache.commons.exec.DefaultExecutor.executeInternal(
> DefaultExecutor.java:377)
> at org.apache.commons.exec.DefaultExecutor.execute(
> DefaultExecutor.java:160)
> at org.apache.commons.exec.DefaultExecutor.execute(
> DefaultExecutor.java:147)
> at scala_maven_executions.JavaMainCallerByFork.run(
> JavaMainCallerByFork.java:100)
> at scala_maven.ScalaCompilerSupport.compile(ScalaCompilerSupport.java:161)
> at scala_maven.ScalaCompilerSupport.doExecute(
> ScalaCompilerSupport.java:99)
> at scala_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:482)
> ... 22 more
>
>
> Please find below the pom.xml that I am using; any help would be appreciated.
>
> 
> http://maven.apache.org/POM/4.0.0;
>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;
>  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
> http://maven.apache.org/xsd/maven-4.0.0.xsd;>
> 4.0.0
>
> NginxLoads-repartition
> NginxLoads-repartition
> 1.1-SNAPSHOT
> ${project.artifactId}
> This is a boilerplate maven project to start using Spark in 
> Scala
> 2010
>
> 
> 1.6
> 1.6
> UTF-8
> 2.11
> 2.11
>     
> 2.11.8
> 
>
> 
> 
> 
> cloudera-repo-releases
> https://repository.cloudera.com/artifactory/repo/
> 
> 
>
> 
> src/main/scala
> src/test/scala
> 
> 
> 
> maven-assembly-plugin
> 
> 
> package
> 
> single
> 
> 
> 
> 
> 
>     jar-with-dependencies
> 
>   

Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-30 Thread Steve Loughran

On 29 Sep 2016, at 10:37, Olivier Girardot 
<o.girar...@lateral-thoughts.com> wrote:

I know that the code itself would not be the same, but it would be useful to at 
least have the pom/build.sbt transitive dependencies differ when fetching 
the artifact with a specific classifier, don't you think?
For now I've overridden them myself using the dependency versions defined in the 
pom.xml of Spark.
So it's not a blocker issue; it may be useful to document it, but a blog post 
would be sufficient, I think.



The problem here is that it's not directly something that a Maven repo is set up 
to deal with. What could be done would be to publish multiple pom-only 
artifacts, such as spark-scala-2.11-hadoop-2.6.pom, which would declare the transitive 
stuff appropriately for the right version. You wouldn't need to actually 
rebuild everything, just declare a dependency on the spark 2.2 artifacts 
excluding all of hadoop 2.2, pulling in 2.6.

This wouldn't even need to be an org.apache.spark artifact, just something any 
can build and publish under their own name.

Volunteers?
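
Roughly, such a pom-only artifact's dependency section could look like this
(coordinates and versions below are illustrative assumptions, not published
artifacts):

```
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.0</version>
    <exclusions>
      <!-- drop the Hadoop 2.2 that Spark pulls in transitively -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>*</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- and pull in the Hadoop line this pom-only artifact is named for -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.4</version>
  </dependency>
</dependencies>
```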


Issues in compiling spark 2.0.0 code using scala-maven-plugin

2016-09-29 Thread satyajit vegesna
Hi ALL,

I am trying to compile code using Maven, which was working with Spark
1.6.2, but when I try with Spark 2.0.0 I get the error below:

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (default) on
project NginxLoads-repartition: wrap:
org.apache.commons.exec.ExecuteException: Process exited with an error: 1
(Exit value: 1)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
at
org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
at
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException: wrap:
org.apache.commons.exec.ExecuteException: Process exited with an error: 1
(Exit value: 1)
at scala_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:490)
at
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
... 20 more
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an
error: 1 (Exit value: 1)
at
org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:377)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:160)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:147)
at
scala_maven_executions.JavaMainCallerByFork.run(JavaMainCallerByFork.java:100)
at scala_maven.ScalaCompilerSupport.compile(ScalaCompilerSupport.java:161)
at scala_maven.ScalaCompilerSupport.doExecute(ScalaCompilerSupport.java:99)
at scala_maven.ScalaMojoSupport.execute(ScalaMojoSupport.java:482)
... 22 more


Please find below the pom.xml that I am using; any help would be appreciated.


http://maven.apache.org/POM/4.0.0;
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
4.0.0

NginxLoads-repartition
NginxLoads-repartition
1.1-SNAPSHOT
${project.artifactId}
This is a boilerplate maven project to start using
Spark in Scala
2010


1.6
1.6
UTF-8
2.11
2.11

2.11.8





cloudera-repo-releases
https://repository.cloudera.com/artifactory/repo/




src/main/scala
src/test/scala



    maven-assembly-plugin


package

single





jar-with-dependencies




org.apache.maven.plugins
    maven-compiler-plugin
3.5.1

1.7
1.7




net.alchim31.maven
    scala-maven-plugin
3.2.2







compile
testCompile



-ma
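
The quoted pom is cut off above. For reference, a minimal sketch of the Spark
dependencies such a project would declare for Spark 2.0.0 with Scala 2.11 (a
sketch based on the versions mentioned in this thread, not the original
contents of the truncated pom):

```
<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.8</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.0.0</version>
  </dependency>
</dependencies>
```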

Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-29 Thread Sean Owen
No, I think that's what dependencyManagement (or equivalent) is definitely for.

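For example, a minimal sketch of forcing a newer Hadoop from the consuming
project (the 2.6.x version shown is just an example):

```
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.5</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```
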
On Thu, Sep 29, 2016 at 5:37 AM, Olivier Girardot
 wrote:
> I know that the code itself would not be the same, but it would be useful to
> at least have the pom/build.sbt transitive dependencies different when
> fetching the artifact with a specific classifier, don't you think?
> For now I've overridden them myself using the dependency versions defined in
> the pom.xml of spark.
> So it's not a blocker issue, it may be useful to document it, but a blog
> post would be sufficient I think.
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-29 Thread Olivier Girardot
I know that the code itself would not be the same, but it would be useful to at
least have the pom/build.sbt transitive dependencies differ when fetching the
artifact with a specific classifier, don't you think? For now I've overridden
them myself using the dependency versions defined in the pom.xml of Spark. So
it's not a blocker issue; it may be useful to document it, but a blog post would
be sufficient, I think.
 





On Wed, Sep 28, 2016 7:21 PM, Sean Owen so...@cloudera.com
wrote:
I guess I'm claiming the artifacts wouldn't even be different in the first
place, because the Hadoop APIs that are used are all the same across these
versions. That would be the thing that makes you need multiple versions of the
artifact under multiple classifiers.
On Wed, Sep 28, 2016 at 1:16 PM, Olivier Girardot <
o.girar...@lateral-thoughts.com>  wrote:
ok, don't you think it could be published with just different classifiers:
hadoop-2.6, hadoop-2.4,
hadoop-2.2 being the current default.
So for now, I should just override spark 2.0.0's dependencies with the ones
defined in the pom profile

 





On Thu, Sep 22, 2016 11:17 AM, Sean Owen so...@cloudera.com
wrote:
There can be just one published version of the Spark artifacts and they have to
depend on something, though in truth they'd be binary-compatible with anything
2.2+. So you merely manage the dependency versions up to the desired version in
your <dependencyManagement>.
On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com>  wrote:
Hi, when we fetch Spark 2.0.0 as a Maven dependency we automatically end up
with hadoop 2.2 as a transitive dependency. I know multiple profiles are used to
generate the different tar.gz bundles that we can download. Are there by any
chance publications of Spark 2.0.0 with different classifiers according to
different versions of Hadoop available?
Thanks for your time!
Olivier Girardot

 


Olivier Girardot | Associé
o.girar...@lateral-thoughts.com
+33 6 24 09 17 94
 


Olivier Girardot | Associé
o.girar...@lateral-thoughts.com
+33 6 24 09 17 94

Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-28 Thread Sean Owen
I guess I'm claiming the artifacts wouldn't even be different in the first
place, because the Hadoop APIs that are used are all the same across these
versions. That would be the thing that makes you need multiple versions of
the artifact under multiple classifiers.

On Wed, Sep 28, 2016 at 1:16 PM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> ok, don't you think it could be published with just different classifiers
> hadoop-2.6
> hadoop-2.4
> hadoop-2.2 being the current default.
>
> So for now, I should just override spark 2.0.0's dependencies with the
> ones defined in the pom profile
>
>
>
> On Thu, Sep 22, 2016 11:17 AM, Sean Owen so...@cloudera.com wrote:
>
>> There can be just one published version of the Spark artifacts and they
>> have to depend on something, though in truth they'd be binary-compatible
>> with anything 2.2+. So you merely manage the dependency versions up to the
>> desired version in your <dependencyManagement>.
>>
>> On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>> Hi,
>> when we fetch Spark 2.0.0 as maven dependency then we automatically end
>> up with hadoop 2.2 as a transitive dependency, I know multiple profiles are
>> used to generate the different tar.gz bundles that we can download, Is
>> there by any chance publications of Spark 2.0.0 with different classifier
>> according to different versions of Hadoop available ?
>>
>> Thanks for your time !
>>
>> *Olivier Girardot*
>>
>>
>>
>
> *Olivier Girardot* | Associé
> o.girar...@lateral-thoughts.com
> +33 6 24 09 17 94
>


Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-28 Thread Olivier Girardot
ok, don't you think it could be published with just different classifiers:
hadoop-2.6, hadoop-2.4,
hadoop-2.2 being the current default.
So for now, I should just override spark 2.0.0's dependencies with the ones
defined in the pom profile
 





On Thu, Sep 22, 2016 11:17 AM, Sean Owen so...@cloudera.com
wrote:
There can be just one published version of the Spark artifacts and they have to
depend on something, though in truth they'd be binary-compatible with anything
2.2+. So you merely manage the dependency versions up to the desired version in
your <dependencyManagement>.
On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com>  wrote:
Hi, when we fetch Spark 2.0.0 as a Maven dependency we automatically end up
with hadoop 2.2 as a transitive dependency. I know multiple profiles are used to
generate the different tar.gz bundles that we can download. Are there by any
chance publications of Spark 2.0.0 with different classifiers according to
different versions of Hadoop available?
Thanks for your time!
Olivier Girardot

 


Olivier Girardot | Associé
o.girar...@lateral-thoughts.com
+33 6 24 09 17 94

Re: Using Spark as a Maven dependency but with Hadoop 2.6

2016-09-22 Thread Sean Owen
There can be just one published version of the Spark artifacts and they
have to depend on something, though in truth they'd be binary-compatible
with anything 2.2+. So you merely manage the dependency versions up to the
desired version in your <dependencyManagement>.

On Thu, Sep 22, 2016 at 7:05 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> Hi,
> when we fetch Spark 2.0.0 as maven dependency then we automatically end up
> with hadoop 2.2 as a transitive dependency, I know multiple profiles are
> used to generate the different tar.gz bundles that we can download, Is
> there by any chance publications of Spark 2.0.0 with different classifier
> according to different versions of Hadoop available ?
>
> Thanks for your time !
>
> *Olivier Girardot*
>


Re: Mesos is now a maven module

2016-08-30 Thread Dongjoon Hyun
Thank you all for the quick fix! :D

Dongjoon.

On Tuesday, August 30, 2016, Michael Gummelt  wrote:

> https://github.com/apache/spark/pull/14885
>
> Thanks
>
> On Tue, Aug 30, 2016 at 11:36 AM, Marcelo Vanzin wrote:
>
>> On Tue, Aug 30, 2016 at 11:32 AM, Sean Owen wrote:
>> > Ah, I helped miss that. We don't enable -Pyarn for YARN because it's
>> > already always set? I wonder if it makes sense to make that optional
>> > in order to speed up builds, or, maybe I'm missing a reason it's
>> > always essential.
>>
>> YARN is currently handled as part of the Hadoop profiles in
>> dev/run-tests.py; it could potentially be changed to behave like the
>> others (e.g. only enabled when the YARN code changes).
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> 
>>
>>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>


Re: Mesos is now a maven module

2016-08-30 Thread Michael Gummelt
https://github.com/apache/spark/pull/14885

Thanks

On Tue, Aug 30, 2016 at 11:36 AM, Marcelo Vanzin 
wrote:

> On Tue, Aug 30, 2016 at 11:32 AM, Sean Owen  wrote:
> > Ah, I helped miss that. We don't enable -Pyarn for YARN because it's
> > already always set? I wonder if it makes sense to make that optional
> > in order to speed up builds, or, maybe I'm missing a reason it's
> > always essential.
>
> YARN is currently handled as part of the Hadoop profiles in
> dev/run-tests.py; it could potentially be changed to behave like the
> others (e.g. only enabled when the YARN code changes).
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos is now a maven module

2016-08-30 Thread Marcelo Vanzin
On Tue, Aug 30, 2016 at 11:32 AM, Sean Owen  wrote:
> Ah, I helped miss that. We don't enable -Pyarn for YARN because it's
> already always set? I wonder if it makes sense to make that optional
> in order to speed up builds, or, maybe I'm missing a reason it's
> always essential.

YARN is currently handled as part of the Hadoop profiles in
dev/run-tests.py; it could potentially be changed to behave like the
others (e.g. only enabled when the YARN code changes).

-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Mesos is now a maven module

2016-08-30 Thread Sean Owen
Ah, I helped miss that. We don't enable -Pyarn for YARN because it's
already always set? I wonder if it makes sense to make that optional
in order to speed up builds, or, maybe I'm missing a reason it's
always essential.

I think it's not setting -Pmesos indeed because no Mesos code was
changed but I think that script change is necessary as a follow up
yes.

Yeah, nothing is in the Jenkins config itself.

On Tue, Aug 30, 2016 at 6:05 PM, Marcelo Vanzin  wrote:
> A quick look shows that maybe dev/sparktestsupport/modules.py needs to
> be modified, and a "build_profile_flags" added to the mesos section
> (similar to hive / hive-thriftserver).
>
> Note not all PR builds will trigger mesos currently, since it's listed
> as an independent module in the above file.
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Mesos is now a maven module

2016-08-30 Thread Dongjoon Hyun
Thank you for confirming, Sean and Marcelo!

Bests,
Dongjoon.

On Tue, Aug 30, 2016 at 10:05 AM, Marcelo Vanzin <van...@cloudera.com>
wrote:

> A quick look shows that maybe dev/sparktestsupport/modules.py needs to
> be modified, and a "build_profile_flags" added to the mesos section
> (similar to hive / hive-thriftserver).
>
> Note not all PR builds will trigger mesos currently, since it's listed
> as an independent module in the above file.
>
> On Tue, Aug 30, 2016 at 10:01 AM, Sean Owen <so...@cloudera.com> wrote:
> > I have the heady power to modify Jenkins jobs now, so I will carefully
> take
> > a look at them and see if any of the config needs -Pmesos. But yeah I
> > thought this should be baked into the script.
> >
> > On Tue, Aug 30, 2016 at 5:56 PM, Dongjoon Hyun <dongj...@apache.org>
> wrote:
> >>
> >> Hi, Michael.
> >>
> >> It's great news!
> >>
> >> BTW, I'm wondering if the Jenkins (SparkPullRequestBuilder) knows this
> new
> >> profile, -Pmesos.
> >>
> >> The PR was passed with the following Jenkins build arguments without
> >> `-Pmesos` option. (at the last test)
> >> ```
> >> [info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:
> >> -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive
> test:package
> >> streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly
> >> streaming-kinesis-asl-assembly/assembly
> >> ```
> >>
> >> https://amplab.cs.berkeley.edu/jenkins/job/
> SparkPullRequestBuilder/64435/consoleFull
> >>
> >> Also, up to now, Jenkins seems not to use '-Pmesos' for all PRs.
> >>
> >> Bests,
> >> Dongjoon.
> >>
> >>
> >> On Fri, Aug 26, 2016 at 3:19 PM, Michael Gummelt <
> mgumm...@mesosphere.io>
> >> wrote:
> >>>
> >>> If it's separable, then sure.  Consistency is nice.
> >>>
> >>> On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl>
> wrote:
> >>>>
> >>>> Hi Michael,
> >>>>
> >>>> Congrats!
> >>>>
> >>>> BTW What I like about the change the most is that it uses the
> >>>> pluggable interface for TaskScheduler and SchedulerBackend (as
> >>>> introduced by YARN). Think Standalone should follow the steps. WDYT?
> >>>>
> >>>> Pozdrawiam,
> >>>> Jacek Laskowski
> >>>> 
> >>>> https://medium.com/@jaceklaskowski/
> >>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> >>>> Follow me at https://twitter.com/jaceklaskowski
> >>>>
> >>>>
> >>>> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
> >>>> <mgumm...@mesosphere.io> wrote:
> >>>> > Hello devs,
> >>>> >
> >>>> > Much like YARN, Mesos has been refactored into a Maven module.  So
> >>>> > when
> >>>> > building, you must add "-Pmesos" to enable Mesos support.
> >>>> >
> >>>> > The pre-built distributions from Apache will continue to enable
> Mesos.
> >>>> >
> >>>> > PR: https://github.com/apache/spark/pull/14637
> >>>> >
> >>>> > Cheers
> >>>> >
> >>>> > --
> >>>> > Michael Gummelt
> >>>> > Software Engineer
> >>>> > Mesosphere
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Michael Gummelt
> >>> Software Engineer
> >>> Mesosphere
> >>
> >>
> >
>
>
>
> --
> Marcelo
>


Re: Mesos is now a maven module

2016-08-30 Thread Marcelo Vanzin
A quick look shows that maybe dev/sparktestsupport/modules.py needs to
be modified, and a "build_profile_flags" added to the mesos section
(similar to hive / hive-thriftserver).

Note not all PR builds will trigger mesos currently, since it's listed
as an independent module in the above file.
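
Sketching it out, the change would look something like this in
dev/sparktestsupport/modules.py (field names are assumptions modeled on the
existing hive entry in that file, not a tested patch):

```python
mesos = Module(
    name="mesos",
    dependencies=[],
    source_file_regexes=["mesos/"],
    build_profile_flags=["-Pmesos"],
    sbt_test_goals=["mesos/test"],
)
```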

On Tue, Aug 30, 2016 at 10:01 AM, Sean Owen <so...@cloudera.com> wrote:
> I have the heady power to modify Jenkins jobs now, so I will carefully take
> a look at them and see if any of the config needs -Pmesos. But yeah I
> thought this should be baked into the script.
>
> On Tue, Aug 30, 2016 at 5:56 PM, Dongjoon Hyun <dongj...@apache.org> wrote:
>>
>> Hi, Michael.
>>
>> It's great news!
>>
>> BTW, I'm wondering if the Jenkins (SparkPullRequestBuilder) knows this new
>> profile, -Pmesos.
>>
>> The PR was passed with the following Jenkins build arguments without
>> `-Pmesos` option. (at the last test)
>> ```
>> [info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:
>> -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive test:package
>> streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly
>> streaming-kinesis-asl-assembly/assembly
>> ```
>>
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64435/consoleFull
>>
>> Also, up to now, Jenkins seems not to use '-Pmesos' for all PRs.
>>
>> Bests,
>> Dongjoon.
>>
>>
>> On Fri, Aug 26, 2016 at 3:19 PM, Michael Gummelt <mgumm...@mesosphere.io>
>> wrote:
>>>
>>> If it's separable, then sure.  Consistency is nice.
>>>
>>> On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> Congrats!
>>>>
>>>> BTW What I like about the change the most is that it uses the
>>>> pluggable interface for TaskScheduler and SchedulerBackend (as
>>>> introduced by YARN). Think Standalone should follow the steps. WDYT?
>>>>
>>>> Pozdrawiam,
>>>> Jacek Laskowski
>>>> 
>>>> https://medium.com/@jaceklaskowski/
>>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>
>>>>
>>>> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
>>>> <mgumm...@mesosphere.io> wrote:
>>>> > Hello devs,
>>>> >
>>>> > Much like YARN, Mesos has been refactored into a Maven module.  So
>>>> > when
>>>> > building, you must add "-Pmesos" to enable Mesos support.
>>>> >
>>>> > The pre-built distributions from Apache will continue to enable Mesos.
>>>> >
>>>> > PR: https://github.com/apache/spark/pull/14637
>>>> >
>>>> > Cheers
>>>> >
>>>> > --
>>>> > Michael Gummelt
>>>> > Software Engineer
>>>> > Mesosphere
>>>
>>>
>>>
>>>
>>> --
>>> Michael Gummelt
>>> Software Engineer
>>> Mesosphere
>>
>>
>



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Mesos is now a maven module

2016-08-30 Thread Sean Owen
I have the heady power to modify Jenkins jobs now, so I will carefully take
a look at them and see if any of the config needs -Pmesos. But yeah I
thought this should be baked into the script.

On Tue, Aug 30, 2016 at 5:56 PM, Dongjoon Hyun <dongj...@apache.org> wrote:

> Hi, Michael.
>
> It's great news!
>
> BTW, I'm wondering if the Jenkins (SparkPullRequestBuilder) knows this new
> profile, -Pmesos.
>
> The PR was passed with the following Jenkins build arguments without
> `-Pmesos` option. (at the last test)
> ```
> [info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:
>  -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive test:package
> streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly
> streaming-kinesis-asl-assembly/assembly
> ```
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64435/
> consoleFull
>
> Also, up to now, Jenkins seems not to use '-Pmesos' for all PRs.
>
> Bests,
> Dongjoon.
>
>
> On Fri, Aug 26, 2016 at 3:19 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> If it's separable, then sure.  Consistency is nice.
>>
>> On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>>> Hi Michael,
>>>
>>> Congrats!
>>>
>>> BTW What I like about the change the most is that it uses the
>>> pluggable interface for TaskScheduler and SchedulerBackend (as
>>> introduced by YARN). Think Standalone should follow the steps. WDYT?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> 
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>>
>>> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
>>> <mgumm...@mesosphere.io> wrote:
>>> > Hello devs,
>>> >
>>> > Much like YARN, Mesos has been refactored into a Maven module.  So when
>>> > building, you must add "-Pmesos" to enable Mesos support.
>>> >
>>> > The pre-built distributions from Apache will continue to enable Mesos.
>>> >
>>> > PR: https://github.com/apache/spark/pull/14637
>>> >
>>> > Cheers
>>> >
>>> > --
>>> > Michael Gummelt
>>> > Software Engineer
>>> > Mesosphere
>>>
>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>>
>
>


Re: Mesos is now a maven module

2016-08-30 Thread Marcelo Vanzin
Michael added the profile to the build scripts, but maybe some script
or code path was missed...

On Tue, Aug 30, 2016 at 9:56 AM, Dongjoon Hyun <dongj...@apache.org> wrote:
> Hi, Michael.
>
> It's great news!
>
> BTW, I'm wondering if the Jenkins (SparkPullRequestBuilder) knows this new
> profile, -Pmesos.
>
> The PR was passed with the following Jenkins build arguments without
> `-Pmesos` option. (at the last test)
> ```
> [info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:  -Pyarn
> -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive test:package
> streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly
> streaming-kinesis-asl-assembly/assembly
> ```
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64435/consoleFull
>
> Also, up to now, Jenkins seems not to use '-Pmesos' for all PRs.
>
> Bests,
> Dongjoon.
>
>
> On Fri, Aug 26, 2016 at 3:19 PM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>>
>> If it's separable, then sure.  Consistency is nice.
>>
>> On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>
>>> Hi Michael,
>>>
>>> Congrats!
>>>
>>> BTW What I like about the change the most is that it uses the
>>> pluggable interface for TaskScheduler and SchedulerBackend (as
>>> introduced by YARN). Think Standalone should follow the steps. WDYT?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> 
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>>
>>> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
>>> <mgumm...@mesosphere.io> wrote:
>>> > Hello devs,
>>> >
>>> > Much like YARN, Mesos has been refactored into a Maven module.  So when
>>> > building, you must add "-Pmesos" to enable Mesos support.
>>> >
>>> > The pre-built distributions from Apache will continue to enable Mesos.
>>> >
>>> > PR: https://github.com/apache/spark/pull/14637
>>> >
>>> > Cheers
>>> >
>>> > --
>>> > Michael Gummelt
>>> > Software Engineer
>>> > Mesosphere
>>
>>
>>
>>
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere
>
>



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Mesos is now a maven module

2016-08-30 Thread Dongjoon Hyun
Hi, Michael.

It's great news!

BTW, I'm wondering if the Jenkins (SparkPullRequestBuilder) knows this new
profile, -Pmesos.

The PR was passed with the following Jenkins build arguments without
`-Pmesos` option. (at the last test)
```
[info] Building Spark (w/Hive 1.2.1) using SBT with these arguments:
 -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive test:package
streaming-kafka-0-8-assembly/assembly streaming-flume-assembly/assembly
streaming-kinesis-asl-assembly/assembly
```
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64435/consoleFull

Also, up to now, Jenkins seems not to use '-Pmesos' for all PRs.

Bests,
Dongjoon.


On Fri, Aug 26, 2016 at 3:19 PM, Michael Gummelt <mgumm...@mesosphere.io>
wrote:

> If it's separable, then sure.  Consistency is nice.
>
> On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi Michael,
>>
>> Congrats!
>>
>> BTW What I like about the change the most is that it uses the
>> pluggable interface for TaskScheduler and SchedulerBackend (as
>> introduced by YARN). Think Standalone should follow the steps. WDYT?
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
>> <mgumm...@mesosphere.io> wrote:
>> > Hello devs,
>> >
>> > Much like YARN, Mesos has been refactored into a Maven module.  So when
>> > building, you must add "-Pmesos" to enable Mesos support.
>> >
>> > The pre-built distributions from Apache will continue to enable Mesos.
>> >
>> > PR: https://github.com/apache/spark/pull/14637
>> >
>> > Cheers
>> >
>> > --
>> > Michael Gummelt
>> > Software Engineer
>> > Mesosphere
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>


Re: Mesos is now a maven module

2016-08-26 Thread Michael Gummelt
If it's separable, then sure.  Consistency is nice.

On Fri, Aug 26, 2016 at 2:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi Michael,
>
> Congrats!
>
> BTW What I like about the change the most is that it uses the
> pluggable interface for TaskScheduler and SchedulerBackend (as
> introduced by YARN). Think Standalone should follow the steps. WDYT?
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
> <mgumm...@mesosphere.io> wrote:
> > Hello devs,
> >
> > Much like YARN, Mesos has been refactored into a Maven module.  So when
> > building, you must add "-Pmesos" to enable Mesos support.
> >
> > The pre-built distributions from Apache will continue to enable Mesos.
> >
> > PR: https://github.com/apache/spark/pull/14637
> >
> > Cheers
> >
> > --
> > Michael Gummelt
> > Software Engineer
> > Mesosphere
>



-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Mesos is now a maven module

2016-08-26 Thread Jacek Laskowski
Hi Michael,

Congrats!

BTW What I like about the change the most is that it uses the
pluggable interface for TaskScheduler and SchedulerBackend (as
introduced by YARN). Think Standalone should follow the steps. WDYT?

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Aug 26, 2016 at 10:20 PM, Michael Gummelt
<mgumm...@mesosphere.io> wrote:
> Hello devs,
>
> Much like YARN, Mesos has been refactored into a Maven module.  So when
> building, you must add "-Pmesos" to enable Mesos support.
>
> The pre-built distributions from Apache will continue to enable Mesos.
>
> PR: https://github.com/apache/spark/pull/14637
>
> Cheers
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Mesos is now a maven module

2016-08-26 Thread Reynold Xin
This is great!


On Fri, Aug 26, 2016 at 1:20 PM, Michael Gummelt <mgumm...@mesosphere.io>
wrote:

> Hello devs,
>
> Much like YARN, Mesos has been refactored into a Maven module.  So when
> building, you must add "-Pmesos" to enable Mesos support.
>
> The pre-built distributions from Apache will continue to enable Mesos.
>
> PR: https://github.com/apache/spark/pull/14637
>
> Cheers
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>


Mesos is now a maven module

2016-08-26 Thread Michael Gummelt
Hello devs,

Much like YARN, Mesos has been refactored into a Maven module.  So when
building, you must add "-Pmesos" to enable Mesos support.
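
For example (profiles other than -Pmesos are just illustrative; pick the ones
your environment needs):

```
./build/mvn -Pmesos -Pyarn -Phive -DskipTests clean package
```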

The pre-built distributions from Apache will continue to enable Mesos.

PR: https://github.com/apache/spark/pull/14637

Cheers

-- 
Michael Gummelt
Software Engineer
Mesosphere


Re: Spark 2.0.1 / 2.1.0 on Maven

2016-08-15 Thread Jacek Laskowski
Thanks Sean. That reflects my sentiments so well!

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Mon, Aug 15, 2016 at 1:08 AM, Sean Owen  wrote:
> I believe Chris was being a bit facetious.
>
> The ASF guidance is right, that it's important people don't consume
> non-blessed snapshot builds as like other releases. The intended
> audience is developers and so the easiest default policy is to only
> advertise the snapshots where only developers are likely to be
> looking.
>
> That said, they're not secret or confidential, and while this probably
> should go to dev@, it's not a sin to mention the name of snapshots on
> user@, as long as these disclaimers are clear too. I'd rather a user
> understand the full picture, than find the snapshots and not
> understand any of the context.
>
> On Mon, Aug 15, 2016 at 2:11 AM, Jacek Laskowski  wrote:
>> Hi Chris,
>>
>> With my ASF member hat on...
>>
>> Oh, come on, Chris. It's not "in violation of ASF policies"
>> whatsoever. Policies are for ASF developers, not for users. Honestly, I
>> was surprised to read the note in Mark Hamstra's email. It's very
>> restrictive, but it is about what committers and PMC members should do, not
>> users:
>>
>> "Do not include any links on the project website that might encourage
>> non-developers to download and use nightly builds, snapshots, release
>> candidates, or any other similar package."
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Spark 2.0.1 / 2.1.0 on Maven

2016-08-15 Thread Steve Loughran

As well as the legal issue 'nightly builds haven't been through the strict 
review and license check process for ASF releases', and the engineering issue 
'release off a nightly and your users will hate you', there's an ASF community 
one: ASF projects want to build a dev community as well as a user one, and 
encouraging users to jump to coding for/near a project is a way to do this.


1. Anyone on the user@ list is strongly encouraged to get on the dev@ list, not 
just to contribute code but to contribute to discourse on project direction, 
comment on issues, and review other code.

2. It's really good if someone builds/tests their downstream apps against 
pre-releases; catching problems early is the best way to fix them.

3. It really, really helps if the people doing builds/tests of downstream apps 
have their own copy of the source code, in sync with the snapshots they use. 
That puts them in the place to start debugging any problems which surface, 
identify whether it is a bug in their own code surfacing vs. a regression in the 
dependency, and, if it is the latter, they are in the position to start working 
on a fix *and test it in the exact environment where the problem arises*.

That's why you want to restrict these snapshots to developers: it's not "go 
away, user", it's "come and join the developers".

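For a downstream build that does want to test against them, a minimal sketch of
wiring the ASF snapshots repository into a pom (the snapshot version shown is
an assumption; check the repository for what is actually published):

```
<repositories>
  <repository>
    <id>apache-snapshots</id>
    <url>https://repository.apache.org/content/repositories/snapshots/</url>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.1.0-SNAPSHOT</version>
  </dependency>
</dependencies>
```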

> On 15 Aug 2016, at 09:08, Sean Owen  wrote:
> 
> I believe Chris was being a bit facetious.
> 
> The ASF guidance is right, that it's important people don't consume
> non-blessed snapshot builds as like other releases. The intended
> audience is developers and so the easiest default policy is to only
> advertise the snapshots where only developers are likely to be
> looking.
> 
> That said, they're not secret or confidential, and while this probably
> should go to dev@, it's not a sin to mention the name of snapshots on
> user@, as long as these disclaimers are clear too. I'd rather a user
> understand the full picture, than find the snapshots and not
> understand any of the context.
> 
> On Mon, Aug 15, 2016 at 2:11 AM, Jacek Laskowski  wrote:
>> Hi Chris,
>> 
>> With my ASF member hat on...
>> 
>> Oh, come on, Chris. It's not "in violation of ASF policies"
>> whatsoever. Policies are for ASF developers, not for users. Honestly, I
>> was surprised to read the note in Mark Hamstra's email. It's very
>> restrictive, but it is about what committers and PMC members should do, not
>> users:
>> 
>> "Do not include any links on the project website that might encourage
>> non-developers to download and use nightly builds, snapshots, release
>> candidates, or any other similar package."
>> 
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
> 
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> 
> 



Re: Spark 2.0.1 / 2.1.0 on Maven

2016-08-15 Thread Sean Owen
I believe Chris was being a bit facetious.

The ASF guidance is right, that it's important people don't consume
non-blessed snapshot builds as like other releases. The intended
audience is developers and so the easiest default policy is to only
advertise the snapshots where only developers are likely to be
looking.

That said, they're not secret or confidential, and while this probably
should go to dev@, it's not a sin to mention the name of snapshots on
user@, as long as these disclaimers are clear too. I'd rather a user
understand the full picture, than find the snapshots and not
understand any of the context.

On Mon, Aug 15, 2016 at 2:11 AM, Jacek Laskowski  wrote:
> Hi Chris,
>
> With my ASF member hat on...
>
> Oh, come on, Chris. It's not "in violation of ASF policies"
> whatsoever. Policies are for ASF developers, not for users. Honestly, I
> was surprised to read the note in Mark Hamstra's email. It's very
> restrictive, but it is about what committers and PMC members should do, not
> users:
>
> "Do not include any links on the project website that might encourage
> non-developers to download and use nightly builds, snapshots, release
> candidates, or any other similar package."
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Spark 2.0.1 / 2.1.0 on Maven

2016-08-14 Thread Jacek Laskowski
Hi Chris,

With my ASF member hat on...

Oh, come on, Chris. It's not "in violation of ASF policies"
whatsoever. Policies are for ASF developers, not for users. Honestly, I
was surprised to read the note in Mark Hamstra's email. It's very
restrictive, but it is about what committers and PMCs should do, not
what users should do:

"Do not include any links on the project website that might encourage
non-developers to download and use nightly builds, snapshots, release
candidates, or any other similar package."

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, Aug 9, 2016 at 11:52 AM, Chris Fregly <ch...@fregly.com> wrote:
> alrighty then!
>
> bcc'ing user list.  cc'ing dev list.
>
> @user list people:  do not read any further or you will be in violation of
> ASF policies!
>
> On Tue, Aug 9, 2016 at 11:50 AM, Mark Hamstra <m...@clearstorydata.com>
> wrote:
>>
>> That's not going to happen on the user list, since that is against ASF
>> policy (http://www.apache.org/dev/release.html):
>>
>>> During the process of developing software and preparing a release,
>>> various packages are made available to the developer community for testing
>>> purposes. Do not include any links on the project website that might
>>> encourage non-developers to download and use nightly builds, snapshots,
>>> release candidates, or any other similar package. The only people who are
>>> supposed to know about such packages are the people following the dev list
>>> (or searching its archives) and thus aware of the conditions placed on the
>>> package. If you find that the general public are downloading such test
>>> packages, then remove them.
>>
>>
>> On Tue, Aug 9, 2016 at 11:32 AM, Chris Fregly <ch...@fregly.com> wrote:
>>>
>>> this is a valid question.  there are many people building products and
>>> tooling on top of spark and would like access to the latest snapshots and
>>> such.  today's ink is yesterday's news to these people - including myself.
>>>
>>> what is the best way to get snapshot releases including nightly and
>>> specially-blessed "preview" releases so that we, too, can say "try the
>>> latest release in our product"?
>>>
>>> there was a lot of chatter during the 2.0.0/2.0.1 release that i largely
>>> ignored because of conflicting/confusing/changing responses.  and i'd rather
>>> not dig through jenkins builds to figure this out as i'll likely get it
>>> wrong.
>>>
>>> please provide the relevant snapshot/preview/nightly/whatever repos (or
>>> equivalent) that we need to include in our builds to have access to the
>>> absolute latest build assets for every major and minor release.
>>>
>>> thanks!
>>>
>>> -chris
>>>
>>>
>>> On Tue, Aug 9, 2016 at 10:00 AM, Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>>
>>>> LOL
>>>>
>>>> Ink has not dried on Spark 2 yet so to speak :)
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>>
>>>>
>>>> LinkedIn
>>>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>>
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>>
>>>> Disclaimer: Use it at your own risk. Any and all responsibility for any
>>>> loss, damage or destruction of data or any other property which may arise
>>>> from relying on this email's technical content is explicitly disclaimed. 
>>>> The
>>>> author will in no case be liable for any monetary damages arising from such
>>>> loss, damage or destruction.
>>>>
>>>>
>>>>
>>>>
>>>> On 9 August 2016 at 17:56, Mark Hamstra <m...@clearstorydata.com> wrote:
>>>>>
>>>>> What are you expecting to find?  There currently are no releases beyond
>>>>> Spark 2.0.0.
>>>>>
>>>>> On Tue, Aug 9, 2016 at 9:55 AM, Jestin Ma <jestinwith.a...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> If we want to use versions of Spark beyond the official 2.0.0 release,
>>>>>> specifically on Maven + Java, what steps should we take to upgrade? I 
>>>>>> can't
>>>>>> find the newer versions on Maven central.
>>>>>>
>>>>>> Thank you!
>>>>>> Jestin
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Chris Fregly
>>> Research Scientist @ PipelineIO
>>> San Francisco, CA
>>> pipeline.io
>>> advancedspark.com
>>>
>>
>
>
>
> --
> Chris Fregly
> Research Scientist @ PipelineIO
> San Francisco, CA
> pipeline.io
> advancedspark.com
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Spark 2.0.1 / 2.1.0 on Maven

2016-08-09 Thread Chris Fregly
alrighty then!

bcc'ing user list.  cc'ing dev list.

@user list people:  do not read any further or you will be in violation of
ASF policies!

On Tue, Aug 9, 2016 at 11:50 AM, Mark Hamstra <m...@clearstorydata.com>
wrote:

> That's not going to happen on the user list, since that is against ASF
> policy (http://www.apache.org/dev/release.html):
>
> During the process of developing software and preparing a release, various
>> packages are made available to the developer community for testing
>> purposes. Do not include any links on the project website that might
>> encourage non-developers to download and use nightly builds, snapshots,
>> release candidates, or any other similar package. The only people who
>> are supposed to know about such packages are the people following the dev
>> list (or searching its archives) and thus aware of the conditions placed on
>> the package. If you find that the general public are downloading such test
>> packages, then remove them.
>>
>
> On Tue, Aug 9, 2016 at 11:32 AM, Chris Fregly <ch...@fregly.com> wrote:
>
>> this is a valid question.  there are many people building products and
>> tooling on top of spark and would like access to the latest snapshots and
>> such.  today's ink is yesterday's news to these people - including myself.
>>
>> what is the best way to get snapshot releases including nightly and
>> specially-blessed "preview" releases so that we, too, can say "try the
>> latest release in our product"?
>>
>> there was a lot of chatter during the 2.0.0/2.0.1 release that i largely
>> ignored because of conflicting/confusing/changing responses.  and i'd
>> rather not dig through jenkins builds to figure this out as i'll likely get
>> it wrong.
>>
>> please provide the relevant snapshot/preview/nightly/whatever repos (or
>> equivalent) that we need to include in our builds to have access to the
>> absolute latest build assets for every major and minor release.
>>
>> thanks!
>>
>> -chris
>>
>>
>> On Tue, Aug 9, 2016 at 10:00 AM, Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> LOL
>>>
>>> Ink has not dried on Spark 2 yet so to speak :)
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn * 
>>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>> On 9 August 2016 at 17:56, Mark Hamstra <m...@clearstorydata.com> wrote:
>>>
>>>> What are you expecting to find?  There currently are no releases beyond
>>>> Spark 2.0.0.
>>>>
>>>> On Tue, Aug 9, 2016 at 9:55 AM, Jestin Ma <jestinwith.a...@gmail.com>
>>>> wrote:
>>>>
>>>>> If we want to use versions of Spark beyond the official 2.0.0 release,
>>>>> specifically on Maven + Java, what steps should we take to upgrade? I 
>>>>> can't
>>>>> find the newer versions on Maven central.
>>>>>
>>>>> Thank you!
>>>>> Jestin
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Chris Fregly*
>> Research Scientist @ PipelineIO
>> San Francisco, CA
>> pipeline.io
>> advancedspark.com
>>
>>
>


-- 
*Chris Fregly*
Research Scientist @ PipelineIO
San Francisco, CA
pipeline.io
advancedspark.com


Re: spark-packages with maven

2016-07-15 Thread Ismaël Mejía
Thanks for the info Burak, I will check the repo you mention. Do you know
concretely what 'magic' spark-packages needs, or whether there is any
document with info about it?

On Fri, Jul 15, 2016 at 10:12 PM, Luciano Resende <luckbr1...@gmail.com>
wrote:

>
> On Fri, Jul 15, 2016 at 10:48 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> +1000
>>
>> Thanks Ismael for bringing this up! I meant to have sent it earlier too
>> since I've been struggling with an sbt-based Scala project for a Spark
>> package myself this week and haven't yet found out how to do local
>> publishing.
>>
>> If such a guide existed for Maven I could use it for sbt easily too :-)
>>
>> Ping me Ismael if you don't hear back from the group so I feel invited
>> for digging into the plugin's sources.
>>
>> Best,
>> Jacek
>>
>> On 15 Jul 2016 2:29 p.m., "Ismaël Mejía" <ieme...@gmail.com> wrote:
>>
>> Hello, I would like to know if there is an easy way to package a new
>> spark-package
>> with maven, I just found this repo, but I am not an sbt user.
>>
>> https://github.com/databricks/sbt-spark-package
>>
>> One more question: is there a formal specification or documentation of
>> what you need to include in a spark-package (any special file, manifest,
>> etc.)? I have not found any doc on the website.
>>
>> Thanks,
>> Ismael
>>
>>
>>
>
> I was under the impression that spark-packages was more like a place for
> one to list/advertise their extensions,  but when you do spark submit with
> --packages, it will use maven to resolve your package
> and as long as it succeeds, it will use it (e.g. you can do mvn clean
> install for your local packages, and use --packages with a spark server
> running on that same machine).
>
> From sbt, I think you can just use publishTo and define a local
> repository, something like
>
> publishTo := Some("Local Maven Repository" at 
> "file://"+Path.userHome.absolutePath+"/.m2/repository")
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: spark-packages with maven

2016-07-15 Thread Luciano Resende
On Fri, Jul 15, 2016 at 10:48 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> +1000
>
> Thanks Ismael for bringing this up! I meant to have sent it earlier too
> since I've been struggling with an sbt-based Scala project for a Spark
> package myself this week and haven't yet found out how to do local
> publishing.
>
> If such a guide existed for Maven I could use it for sbt easily too :-)
>
> Ping me Ismael if you don't hear back from the group so I feel invited for
> digging into the plugin's sources.
>
> Best,
> Jacek
>
> On 15 Jul 2016 2:29 p.m., "Ismaël Mejía" <ieme...@gmail.com> wrote:
>
> Hello, I would like to know if there is an easy way to package a new
> spark-package
> with maven, I just found this repo, but I am not an sbt user.
>
> https://github.com/databricks/sbt-spark-package
>
> One more question: is there a formal specification or documentation of
> what you need to include in a spark-package (any special file, manifest,
> etc.)? I have not found any doc on the website.
>
> Thanks,
> Ismael
>
>
>

I was under the impression that spark-packages was more like a place for
one to list/advertise their extensions,  but when you do spark submit with
--packages, it will use maven to resolve your package
and as long as it succeeds, it will use it (e.g. you can do mvn clean
install for your local packages, and use --packages with a spark server
running on that same machine).

From sbt, I think you can just use publishTo and define a local repository,
something like

publishTo := Some("Local Maven Repository" at
"file://"+Path.userHome.absolutePath+"/.m2/repository")



-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: spark-packages with maven

2016-07-15 Thread Burak Yavuz
Hi Ismael and Jacek,

If you use Maven for building your applications, you may use the
spark-package command line tool (
https://github.com/databricks/spark-package-cmd-tool) to perform packaging.
It requires you to build your jar using maven first, and then does all the
extra magic that Spark Package requires.

Please contact me directly if you have any issues.

Best,
Burak

On Fri, Jul 15, 2016 at 10:48 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> +1000
>
> Thanks Ismael for bringing this up! I meant to have sent it earlier too
> since I've been struggling with an sbt-based Scala project for a Spark
> package myself this week and haven't yet found out how to do local
> publishing.
>
> If such a guide existed for Maven I could use it for sbt easily too :-)
>
> Ping me Ismael if you don't hear back from the group so I feel invited for
> digging into the plugin's sources.
>
> Best,
> Jacek
>
> On 15 Jul 2016 2:29 p.m., "Ismaël Mejía" <ieme...@gmail.com> wrote:
>
> Hello, I would like to know if there is an easy way to package a new
> spark-package
> with maven, I just found this repo, but I am not an sbt user.
>
> https://github.com/databricks/sbt-spark-package
>
> One more question: is there a formal specification or documentation of
> what you need to include in a spark-package (any special file, manifest,
> etc.)? I have not found any doc on the website.
>
> Thanks,
> Ismael
>
>
>


Re: spark-packages with maven

2016-07-15 Thread Jacek Laskowski
+1000

Thanks Ismael for bringing this up! I meant to have sent it earlier too
since I've been struggling with an sbt-based Scala project for a Spark
package myself this week and haven't yet found out how to do local
publishing.

If such a guide existed for Maven I could use it for sbt easily too :-)

Ping me Ismael if you don't hear back from the group so I feel invited for
digging into the plugin's sources.

Best,
Jacek

On 15 Jul 2016 2:29 p.m., "Ismaël Mejía" <ieme...@gmail.com> wrote:

Hello, I would like to know if there is an easy way to package a new
spark-package
with maven, I just found this repo, but I am not an sbt user.

https://github.com/databricks/sbt-spark-package

One more question: is there a formal specification or documentation of what
you need to include in a spark-package (any special file, manifest, etc.)? I
have not found any doc on the website.

Thanks,
Ismael


spark-packages with maven

2016-07-15 Thread Ismaël Mejía
Hello, I would like to know if there is an easy way to package a new
spark-package
with maven, I just found this repo, but I am not an sbt user.

https://github.com/databricks/sbt-spark-package

One more question: is there a formal specification or documentation of what
you need to include in a spark-package (any special file, manifest, etc.)? I
have not found any doc on the website.

Thanks,
Ismael


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-07 Thread Shivaram Venkataraman
As far as I know the process is just to copy docs/_site from the build
to the appropriate location in the SVN repo (i.e.
site/docs/2.0.0-preview).

Thanks
Shivaram

On Tue, Jun 7, 2016 at 8:14 AM, Sean Owen  wrote:
> As a stop-gap, I can edit that page to have a small section about
> preview releases and point to the nightly docs.
>
> Not sure who has the power to push 2.0.0-preview to site/docs, but, if
> that's done then we can symlink "preview" in that dir to it and be
> done, and update this section about preview docs accordingly.
>
> On Tue, Jun 7, 2016 at 4:10 PM, Tom Graves  wrote:
>> Thanks Sean, you were right, hard refresh made it show up.
>>
>> Seems like we should at least link to the preview docs from
>> http://spark.apache.org/documentation.html.
>>
>> Tom
>>
>>
>> On Tuesday, June 7, 2016 10:04 AM, Sean Owen  wrote:
>>
>>
>> It's there (refresh maybe?). See the end of the downloads dropdown.
>>
>> For the moment you can see the docs in the nightly docs build:
>> https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/
>>
>> I don't know, what's the best way to put this into the main site?
>> under a /preview root? I am not sure how that process works.
>>
>> On Tue, Jun 7, 2016 at 4:01 PM, Tom Graves  wrote:
>>> I just checked and I don't see the 2.0 preview release at all anymore on
>>> http://spark.apache.org/downloads.html, is it in transition? The only
>>> place I can see it is at
>>> http://spark.apache.org/news/spark-2.0.0-preview.html
>>>
>>>
>>> I would like to see docs there too.  My opinion is it should be as easy to
>>> use/try out as any other spark release.
>>>
>>> Tom
>>
>>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-07 Thread Tom Graves
Thanks Sean, you were right, hard refresh made it show up.

Seems like we should at least link to the preview docs from
http://spark.apache.org/documentation.html.

Tom

On Tuesday, June 7, 2016 10:04 AM, Sean Owen  wrote:

It's there (refresh maybe?). See the end of the downloads dropdown.

For the moment you can see the docs in the nightly docs build:
https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/

I don't know, what's the best way to put this into the main site?
under a /preview root? I am not sure how that process works.

On Tue, Jun 7, 2016 at 4:01 PM, Tom Graves  wrote:
> I just checked and I don't see the 2.0 preview release at all anymore on
> http://spark.apache.org/downloads.html, is it in transition? The only
> place I can see it is at
> http://spark.apache.org/news/spark-2.0.0-preview.html
>
>
> I would like to see docs there too.  My opinion is it should be as easy to
> use/try out as any other spark release.
>
> Tom
>

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



  

Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-07 Thread Sean Owen
As a stop-gap, I can edit that page to have a small section about
preview releases and point to the nightly docs.

Not sure who has the power to push 2.0.0-preview to site/docs, but, if
that's done then we can symlink "preview" in that dir to it and be
done, and update this section about preview docs accordingly.

On Tue, Jun 7, 2016 at 4:10 PM, Tom Graves  wrote:
> Thanks Sean, you were right, hard refresh made it show up.
>
> Seems like we should at least link to the preview docs from
> http://spark.apache.org/documentation.html.
>
> Tom
>
>
> On Tuesday, June 7, 2016 10:04 AM, Sean Owen  wrote:
>
>
> It's there (refresh maybe?). See the end of the downloads dropdown.
>
> For the moment you can see the docs in the nightly docs build:
> https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/
>
> I don't know, what's the best way to put this into the main site?
> under a /preview root? I am not sure how that process works.
>
> On Tue, Jun 7, 2016 at 4:01 PM, Tom Graves  wrote:
>> I just checked and I don't see the 2.0 preview release at all anymore on
>> http://spark.apache.org/downloads.html, is it in transition? The only
>> place I can see it is at
>> http://spark.apache.org/news/spark-2.0.0-preview.html
>>
>>
>> I would like to see docs there too.  My opinion is it should be as easy to
>> use/try out as any other spark release.
>>
>> Tom
>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-07 Thread Sean Owen
It's there (refresh maybe?). See the end of the downloads dropdown.

For the moment you can see the docs in the nightly docs build:
https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/

I don't know, what's the best way to put this into the main site?
under a /preview root? I am not sure how that process works.

On Tue, Jun 7, 2016 at 4:01 PM, Tom Graves  wrote:
> I just checked and I don't see the 2.0 preview release at all anymore on
> http://spark.apache.org/downloads.html, is it in transition? The only
> place I can see it is at
> http://spark.apache.org/news/spark-2.0.0-preview.html
>
>
> I would like to see docs there too.  My opinion is it should be as easy to
> use/try out as any other spark release.
>
> Tom
>

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Imran Rashid
I've been a bit on the fence on this, but I agree that Luciano makes a
compelling case for why we really should publish things to Maven
Central.  Sure, we slightly increase the risk that somebody refers to the
preview release too late, but really that is their own fault.

And I also I agree with comments from Sean and Mark that this is *not* a
"Databricks vs The World" scenario at all.

On Mon, Jun 6, 2016 at 2:13 PM, Luciano Resende <luckbr1...@gmail.com>
wrote:

>
>
> On Mon, Jun 6, 2016 at 12:05 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> The bahir one was a good argument actually. I just clicked the button to
>> push it into Maven central.
>>
>>
> Thank You !!!
>
>


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Luciano Resende
On Mon, Jun 6, 2016 at 12:05 PM, Reynold Xin <r...@databricks.com> wrote:

> The bahir one was a good argument actually. I just clicked the button to
> push it into Maven central.
>
>
Thank You !!!


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Reynold Xin
The bahir one was a good argument actually. I just clicked the button to
push it into Maven central.


On Mon, Jun 6, 2016 at 12:00 PM, Mark Hamstra <m...@clearstorydata.com>
wrote:

> Fine.  I don't feel strongly enough about it to continue to argue against
> putting the artifacts on Maven Central.
>
> On Mon, Jun 6, 2016 at 11:48 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Artifacts can't be removed from Maven in any normal circumstance, but,
>> it's no problem.
>>
>> The argument that people might keep using it goes for any older
>> release. Why would anyone use 1.6.0 when 1.6.1 exists? yet we keep
>> 1.6.0 just for the record and to not break builds. It may be that
>> Foobar 3.0-beta depends on 2.0.0-preview and 3.0 will shortly depend
>> on 2.0.0, but, killing the -preview artifact breaks that other
>> historical release/branch.
>>
>> I agree that "-alpha-1" would have been better. But we're talking
>> about working around pretty bone-headed behavior, to not notice what
>> version of Spark they build against, or not understand what
>> 2.0.0-preview vs 2.0.0 means in a world of semver.
>>
>> BTW Maven sorts 2.0.0-preview before 2.0.0, so 2.0.0 would show up as
>> the latest, when released, in tools like mvn
>> versions:display-dependency-updates. You could exclude the preview
>> release by requiring version [2.0.0,).
>>
>> On Mon, Jun 6, 2016 at 7:19 PM, Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>> > Precisely because the naming of the preview artifacts has to fall
>> outside of
>> > the normal versioning, I can easily see incautious Maven users a few
>> months
>> > from now mistaking the preview artifacts as spark-2.0-something-special
>> > instead of spark-2.0-something-stale.
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
>


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Sean Owen
Artifacts can't be removed from Maven in any normal circumstance, but,
it's no problem.

The argument that people might keep using it goes for any older
release. Why would anyone use 1.6.0 when 1.6.1 exists? yet we keep
1.6.0 just for the record and to not break builds. It may be that
Foobar 3.0-beta depends on 2.0.0-preview and 3.0 will shortly depend
on 2.0.0, but, killing the -preview artifact breaks that other
historical release/branch.

I agree that "-alpha-1" would have been better. But we're talking
about working around pretty bone-headed behavior, to not notice what
version of Spark they build against, or not understand what
2.0.0-preview vs 2.0.0 means in a world of semver.

BTW Maven sorts 2.0.0-preview before 2.0.0, so 2.0.0 would show up as
the latest, when released, in tools like mvn
versions:display-dependency-updates. You could exclude the preview
release by requiring version [2.0.0,).
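
A rough sbt/Ivy rendering of that exclusion, assuming Ivy honours the same
Maven-style range syntax, would be something like:

// build.sbt sketch: "[2.0.0,)" asks for 2.0.0 or any later version; per the
// ordering described above, 2.0.0-preview sorts before 2.0.0 and falls
// outside the range
libraryDependencies += "org.apache.spark" %% "spark-core" % "[2.0.0,)"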

On Mon, Jun 6, 2016 at 7:19 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
> Precisely because the naming of the preview artifacts has to fall outside of
> the normal versioning, I can easily see incautious Maven users a few months
> from now mistaking the preview artifacts as spark-2.0-something-special
> instead of spark-2.0-something-stale.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Ovidiu-Cristian MARCU
+1 for moving this discussion to a proactive new (alpha/beta) release of Apache 
Spark 2.0!

> On 06 Jun 2016, at 20:25, Ovidiu Cristian Marcu <oma...@inria.fr> wrote:
> 
> Any chance of preparing a new alpha/beta release for 2.0 this month, or
> will the preview be pushed to Maven and considered an alpha?
> 
> Sent from TypeApp <http://www.typeapp.com/r>
> 
> On Jun 6, 2016, at 20:12, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> Is there any way to remove artifacts from Maven Central? Maybe that would 
> help clean these things up long-term, though it would create problems for 
> users who for some reason decide to rely on these previews. 
> 
> In any case, if people are *really* concerned about this, we should just put 
> it there. My thought was that it's better for users to do something special 
> to link to this release (e.g. add a reference to the staging repo) so that 
> they are more likely to know that it's a special, unstable thing. Same thing 
> they do to use snapshots. 
> 
> Matei
> 
> On Mon, Jun 6, 2016 at 10:49 AM, Luciano Resende <luckbr1...@gmail.com> wrote: 
> 
> 
> On Mon, Jun 6, 2016 at 10:08 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> I still don't know where this "severely compromised builds of limited 
> usefulness" thing comes from? what's so bad? You didn't veto its 
> release, after all.
> 
> I simply mean that it was released with the knowledge that there are still 
> significant bugs in the preview that definitely would warrant a veto if this 
> were intended to be on a par with other releases.  There have been repeated 
> announcements to that effect, but developers finding the preview artifacts on 
> Maven Central months from now may well not also see those announcements and 
> related discussion.  The artifacts will be very stale and no longer useful 
> for their limited testing purpose, but will persist in the repository.  
> 
> 
> A few months from now, why would a developer choose a preview, alpha, beta 
> compared to the GA 2.0 release ? 
> 
> As for the being stale part, this is true for every release anyone put out 
> there. 
> 
> 
> -- 
> Luciano Resende 
> http://twitter.com/lresende1975 
> http://lresende.blogspot.com/



Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Luciano Resende
On Mon, Jun 6, 2016 at 11:12 AM, Matei Zaharia <matei.zaha...@gmail.com>
wrote:

> Is there any way to remove artifacts from Maven Central? Maybe that would
> help clean these things up long-term, though it would create problems for
> users who for some reason decide to rely on these previews.
>
> In any case, if people are *really* concerned about this, we should just
> put it there. My thought was that it's better for users to do something
> special to link to this release (e.g. add a reference to the staging repo)
> so that they are more likely to know that it's a special, unstable thing.
> Same thing they do to use snapshots.
>
> Matei
>
>
So, consider this thread started on another project :
https://www.mail-archive.com/dev@bahir.apache.org/msg00038.html

What would be your recommendation ?
   - Start a release based on Apache Spark 2.0.0 preview staging repo ? I
would  reject that...
   - Start a release on a set of artifacts that are going to be deleted ? I
would also reject that

To me, if companies are using the release on their products, and other
projects are relying on the release to provide a way for users to test,
this should be considered as any other release, published permanently,
which at some point will become obsolete and users will move on to more
stable releases.

Thanks



-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Matei Zaharia
Is there any way to remove artifacts from Maven Central? Maybe that would
help clean these things up long-term, though it would create problems for
users who for some reason decide to rely on these previews.

In any case, if people are *really* concerned about this, we should just
put it there. My thought was that it's better for users to do something
special to link to this release (e.g. add a reference to the staging repo)
so that they are more likely to know that it's a special, unstable thing.
Same thing they do to use snapshots.
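
As a sketch of what that "something special" could look like in an sbt build
(the staging repository URL here is an assumed placeholder, not a specific
published location):

// build.sbt sketch: explicitly opt in to a staging repository to pick up
// the 2.0.0-preview artifacts (repository URL assumed for illustration)
resolvers += "Apache Staging" at
  "https://repository.apache.org/content/repositories/staging/"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0-preview"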

Matei

On Mon, Jun 6, 2016 at 10:49 AM, Luciano Resende <luckbr1...@gmail.com>
wrote:

>
>
> On Mon, Jun 6, 2016 at 10:08 AM, Mark Hamstra <m...@clearstorydata.com>
> wrote:
>
>> I still don't know where this "severely compromised builds of limited
>>> usefulness" thing comes from? what's so bad? You didn't veto its
>>> release, after all.
>>
>>
>> I simply mean that it was released with the knowledge that there are
>> still significant bugs in the preview that definitely would warrant a veto
>> if this were intended to be on a par with other releases.  There have been
>> repeated announcements to that effect, but developers finding the preview
>> artifacts on Maven Central months from now may well not also see those
>> announcements and related discussion.  The artifacts will be very stale and
>> no longer useful for their limited testing purpose, but will persist in the
>> repository.
>>
>>
> A few months from now, why would a developer choose a preview, alpha, beta
> compared to the GA 2.0 release ?
>
> As for the being stale part, this is true for every release anyone put out
> there.
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Luciano Resende
On Mon, Jun 6, 2016 at 10:08 AM, Mark Hamstra <m...@clearstorydata.com>
wrote:

> I still don't know where this "severely compromised builds of limited
>> usefulness" thing comes from? what's so bad? You didn't veto its
>> release, after all.
>
>
> I simply mean that it was released with the knowledge that there are still
> significant bugs in the preview that definitely would warrant a veto if
> this were intended to be on a par with other releases.  There have been
> repeated announcements to that effect, but developers finding the preview
> artifacts on Maven Central months from now may well not also see those
> announcements and related discussion.  The artifacts will be very stale and
> no longer useful for their limited testing purpose, but will persist in the
> repository.
>
>
A few months from now, why would a developer choose a preview, alpha, beta
compared to the GA 2.0 release ?

As for the being stale part, this is true for every release anyone put out
there.


-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Luciano Resende
On Mon, Jun 6, 2016 at 9:51 AM, Sean Owen  wrote:

> I still don't know where this "severely compromised builds of limited
> usefulness" thing comes from? what's so bad? You didn't veto its
> release, after all. And rightly so: a release doesn't mean "definitely
> works"; it means it was created the right way. It's OK to say it's
> buggy alpha software; this isn't an argument to not really release it.
>
> But aside from that: if it should be used by someone, then who did you
> have in mind?
>
> It would be coherent at least to decide not to make alpha-like
> release, but, we agreed to, which is why this argument sort of
> surprises me.
>
> I share some concerns about piling on Databricks. Nothing here is by
> nature about an organization. However, this release really began in
> response to a thread (which not everyone here can see) about
> Databricks releasing a "2.0.0 preview" option in their product before
> it existed. I presume employees of that company sort of endorse this,
> which has put this same release into the hands of not just developers
> or admins but end users -- even with caveats and warnings.
>
> (And I think that's right!)
>
>

In this case, I would only expect the 2.0.0 preview to be treated as just
any other release, period.


-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-06 Thread Mark Hamstra
>
> I still don't know where this "severely compromised builds of limited
> usefulness" thing comes from? what's so bad? You didn't veto its
> release, after all.


I simply mean that it was released with the knowledge that there are still
significant bugs in the preview that definitely would warrant a veto if
this were intended to be on a par with other releases.  There have been
repeated announcements to that effect, but developers finding the preview
artifacts on Maven Central months from now may well not also see those
announcements and related discussion.  The artifacts will be very stale and
no longer useful for their limited testing purpose, but will persist in the
repository.

On Mon, Jun 6, 2016 at 9:51 AM, Sean Owen <so...@cloudera.com> wrote:

> I still don't know where this "severely compromised builds of limited
> usefulness" thing comes from? what's so bad? You didn't veto its
> release, after all. And rightly so: a release doesn't mean "definitely
> works"; it means it was created the right way. It's OK to say it's
> buggy alpha software; this isn't an argument to not really release it.
>
> But aside from that: if it should be used by someone, then who did you
> have in mind?
>
> It would be coherent at least to decide not to make alpha-like
> release, but, we agreed to, which is why this argument sort of
> surprises me.
>
> I share some concerns about piling on Databricks. Nothing here is by
> nature about an organization. However, this release really began in
> response to a thread (which not everyone here can see) about
> Databricks releasing a "2.0.0 preview" option in their product before
> it existed. I presume employees of that company sort of endorse this,
> which has put this same release into the hands of not just developers
> or admins but end users -- even with caveats and warnings.
>
> (And I think that's right!)
>
> While I'd like to see your reasons before I'd agree with you Mark,
> yours is a feasible position; I'm not as sure how people who work for
> Databricks can argue at the same time however that this should be
> carefully guarded as an ASF release -- even with caveats and warnings.
>
> We don't need to assume bad faith -- I don't. The appearance alone is
> enough to act to make this consistent.
>
> But, I think the resolution is simple: it's not 'dangerous' to release
> this and I don't think people who say they think this really do. So
> just finish this release normally, and we're done. Even if you think
> there's an argument against it, weigh vs the problems above.
>
>
> On Mon, Jun 6, 2016 at 4:00 PM, Mark Hamstra <m...@clearstorydata.com>
> wrote:
> > This is not a Databricks vs. The World situation, and the fact that some
> > persist in forcing every issue into that frame is getting annoying.
> There
> > are good engineering and project-management reasons not to populate the
> > long-term, canonical repository of Maven artifacts with what are known
> to be
> > severely compromised builds of limited usefulness, particularly over
> time.
> > It is a legitimate dispute over whether these preview artifacts should be
> > deployed to Maven Central, not one that must be seen as Databricks
> seeking
> > improper advantage.
> >
>

