Re: [VOTE] Release Apache Spark 1.2.0 (RC1)
+1 (non-binding)
- built from source
- fired up a spark-shell against a YARN cluster
- ran some jobs using parallelize
- ran some jobs that read files
- clicked around the web UI

On Sun, Nov 30, 2014 at 1:10 AM, GuoQiang Li wi...@qq.com wrote:
+1 (non-binding)

-- Original --
From: Patrick Wendell pwend...@gmail.com
Date: Sat, Nov 29, 2014 01:16 PM
To: dev@spark.apache.org
Subject: [VOTE] Release Apache Spark 1.2.0 (RC1)

Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1048/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.2.0! The vote is open until Tuesday, December 02, at 05:15 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.1.X. Bugs already present in 1.1.X, minor regressions, or bugs related to new features will not block this release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been changed to netty -- old behavior can be restored by switching to nio.
2. The default value of spark.shuffle.manager has been changed to sort -- old behavior can be restored by setting spark.shuffle.manager to hash.

== Other notes ==
Because this vote is occurring over a weekend, I will likely extend the vote if this RC survives until the end of the vote period.

- Patrick
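For anyone testing against the old code paths, a minimal sketch of reverting the two defaults named in the vote email when building a SparkConf (the application name here is just a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    // Revert the two shuffle defaults that changed in 1.2.
    val conf = new SparkConf()
      .setAppName("shuffle-default-check")               // placeholder name
      .set("spark.shuffle.blockTransferService", "nio")  // 1.2 default: netty
      .set("spark.shuffle.manager", "hash")              // 1.2 default: sort
    val sc = new SparkContext(conf)

The same properties can also be passed on the command line, e.g. --conf spark.shuffle.manager=hash with spark-submit.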
Re: Creating a SchemaRDD from an existing API
Hi Michael,

About this new data source API, what type of data sources would it support? Does it necessarily have to be an RDBMS?

Cheers

On Sat, Nov 29, 2014 at 12:57 AM, Michael Armbrust mich...@databricks.com wrote:

You probably don't need to create a new kind of SchemaRDD. Instead I'd suggest taking a look at the data sources API that we are adding in Spark 1.2. There is not a ton of documentation, but the test cases show how to implement the various interfaces (https://github.com/apache/spark/tree/master/sql/core/src/test/scala/org/apache/spark/sql/sources), and there is an example library for reading Avro data (https://github.com/databricks/spark-avro).

On Thu, Nov 27, 2014 at 10:31 PM, Niranda Perera nira...@wso2.com wrote:

Hi,

I am evaluating Spark for an analytics component where we do batch processing of data using SQL. So, I am particularly interested in Spark SQL and in creating a SchemaRDD from an existing API [1]. This API exposes elements in a database as data sources. Using the methods allowed by this data source, we can access and edit data. So, I want to create a custom SchemaRDD using the methods and provisions of this API. I tried going through the Spark documentation and the Java docs, but unfortunately, I was unable to come to a final conclusion as to whether this was actually possible.

I would like to ask the Spark devs:
1. As of the current Spark release, can we make a custom SchemaRDD?
2. What is the extension point for a custom SchemaRDD? Or are there particular interfaces?
3. Could you please point me to the specific docs regarding this matter?

Your help in this regard is highly appreciated.

Cheers

[1] https://github.com/wso2-dev/carbon-analytics/tree/master/components/xanalytics

--
Niranda Perera
Software Engineer, WSO2 Inc.
Mobile: +94-71-554-8430
Twitter: @n1r44 (https://twitter.com/N1R44)
Spark Summit East CFP - 5 days until deadline
The inaugural Spark Summit East (spark-summit.org/east), an event to bring the Apache Spark community together, will be held in New York City on March 18, 2015. The call for submissions is currently open and will close this Friday, December 5, at 11:59pm PST.

The summit is looking for talks covering topics including applications, development, research, and data science. At the Summit you can look forward to hearing from committers, developers, CEOs, and companies who are solving real-world big data challenges with Spark. All submissions will be reviewed by a Program Committee made up of the creators, top committers, and individuals who have contributed heavily to the Spark project. No speaker slots are being sold to sponsors, in an effort to keep the Summit a community-driven event.

To submit your abstracts, please visit: spark-summit.org/east/2015/cfp

Looking forward to seeing you there!

Best,
Scott
The Spark Summit Organizers
Re: [VOTE] Release Apache Spark 1.2.0 (RC1)
Hi everyone,

There's an open bug report related to Spark standalone which could be a potential release-blocker (pending investigation / a bug fix): https://issues.apache.org/jira/browse/SPARK-4498. This issue seems non-deterministic and only affects long-running Spark standalone deployments, so it may be hard to reproduce. I'm going to work on a patch to add additional logging in order to help with debugging. I just wanted to give an early heads-up about this issue and to get more eyes on it in case anyone else has run into it or wants to help with debugging.

- Josh

On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com) wrote:
> Please vote on releasing the following candidate as Apache Spark version 1.2.0! [...]
Re: Creating a SchemaRDD from an existing API
No, it should support any data source that has a schema and can produce rows.

On Mon, Dec 1, 2014 at 1:34 AM, Niranda Perera nira...@wso2.com wrote:
> Hi Michael, About this new data source API, what type of data sources would it support? [...]
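For archive readers, a rough sketch of what implementing that 1.2-era data sources API might look like, modeled on the test cases Michael linked. The provider/relation names here are invented for illustration, and the exact interface shapes and import paths should be verified against those tests:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql._          // Row, SQLContext, StructType, ...
    import org.apache.spark.sql.sources._  // RelationProvider, TableScan

    // Hypothetical provider; instantiated by name when the source is used.
    class DefaultSource extends RelationProvider {
      override def createRelation(
          sqlContext: SQLContext,
          parameters: Map[String, String]): BaseRelation =
        IntegerRelation(parameters("size").toInt)(sqlContext)
    }

    // A relation exposing the integers 0 until size as a one-column table.
    case class IntegerRelation(size: Int)(@transient val sqlContext: SQLContext)
      extends TableScan {

      override def schema: StructType =
        StructType(StructField("i", IntegerType, nullable = false) :: Nil)

      // Anything that can produce an RDD[Row] matching the schema works here --
      // a JDBC cursor, a REST call, or the carbon-analytics API from the question.
      override def buildScan(): RDD[Row] =
        sqlContext.sparkContext.parallelize(0 until size).map(Row(_))
    }

If memory serves, such a relation could then be registered from SQL in 1.2 with something like CREATE TEMPORARY TABLE ints USING the.provider.package OPTIONS (size '100'); again, verify the exact DDL against the linked test suites.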
Re: [VOTE] Release Apache Spark 1.2.0 (RC1)
+0.9 from me. Tested it on Mac and Windows (someone has to do it), and while things work, I noticed a few recent scripts don't have Windows equivalents, namely https://issues.apache.org/jira/browse/SPARK-4683 and https://issues.apache.org/jira/browse/SPARK-4684. The first one at least would be good to fix if we do another RC. Not blocking the release, but useful to fix in docs is https://issues.apache.org/jira/browse/SPARK-4685.

Matei

On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:
> Hi everyone, There's an open bug report related to Spark standalone which could be a potential release-blocker (pending investigation / a bug fix): https://issues.apache.org/jira/browse/SPARK-4498. [...]
jenkins downtime: 730-930am, 12/12/14
I'll send out a reminder next week, but I wanted to give a heads-up: I'll be bringing down the entire Jenkins infrastructure for reboots and system updates. Please let me know if there are any conflicts with this. Thanks!

shane
Required file not found in building
It seems there are some additional settings required to build Spark now. This should be a snap for most of you out there to spot what I am missing.

Here is the command line I have traditionally used:

mvn -Pyarn -Phadoop-2.3 -Phive install compile package -DskipTests

That command line is however failing with the latest from HEAD:

[INFO] --- scala-maven-plugin:3.2.0:compile (scala-compile-first) @ spark-network-common_2.10 ---
[INFO] Using zinc server for incremental compilation
[INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.10.4,2.0.1,null)
[error] Required file not found: scala-compiler-2.10.4.jar
[error] See zinc -help for information about locating necessary files
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM .. SUCCESS [4.077s]
[INFO] Spark Project Networking .. FAILURE [0.445s]

OK, let's try zinc -help:

18:38:00/spark2 $ zinc -help
Nailgun server running with 1 cached compiler
Version = 0.3.5.1
Zinc compiler cache limit = 5
Resident scalac cache limit = 0
Analysis cache limit = 5
Compiler(Scala 2.10.4) [74ff364f]
Setup = {
  scala compiler = /Users/steve/.m2/repository/org/scala-lang/scala-compiler/2.10.4/scala-compiler-2.10.4.jar
  scala library = /Users/steve/.m2/repository/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar
  scala extra = {
    /Users/steve/.m2/repository/org/scala-lang/scala-reflect/2.10.4/scala-reflect-2.10.4.jar
    /shared/zinc-0.3.5.1/lib/scala-reflect.jar
  }
  sbt interface = /shared/zinc-0.3.5.1/lib/sbt-interface.jar
  compiler interface sources = /shared/zinc-0.3.5.1/lib/compiler-interface-sources.jar
  java home =
  fork java = false
  cache directory = /Users/steve/.zinc/0.3.5.1
}

Does that compiler jar exist? Yes!

18:39:34/spark2 $ ll /Users/steve/.m2/repository/org/scala-lang/scala-compiler/2.10.4/scala-compiler-2.10.4.jar
-rw-r--r-- 1 steve staff 14445780 Apr 9 2014 /Users/steve/.m2/repository/org/scala-lang/scala-compiler/2.10.4/scala-compiler-2.10.4.jar
Re: packaging spark run time with osgi service
Already tried the solutions they provided. They did not work out.

On 12/2/14 8:17 AM, Dinesh J. Weerakkody wrote:

Hi Lochana, can you please go through this mail thread [1]? I haven't tried it, but it can be useful.

[1] http://apache-spark-user-list.1001560.n3.nabble.com/Packaging-a-spark-job-using-maven-td5615.html

On Mon, Dec 1, 2014 at 4:28 PM, Lochana Menikarachchi locha...@gmail.com wrote:

I have spark core and mllib as dependencies for a Spark-based OSGi service. When I call the model-building method through a unit test (without OSGi) it works OK. When I call it through the OSGi service, nothing happens. I tried adding the Spark assembly jar. Now it throws the following error:

An error occurred while building supervised machine learning model: No configuration setting found for key 'akka.version'
com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:197)

What is the correct way to include Spark runtime dependencies in an OSGi service?

Thanks.

Lochana

--
Thanks Best Regards,
Dinesh J. Weerakkody
www.dineshjweerakkody.com
Re: Required file not found in building
I tried the same command on a MacBook and didn't experience the same error. Which OS are you using?

Cheers

On Mon, Dec 1, 2014 at 6:42 PM, Stephen Boesch java...@gmail.com wrote:
> It seems there are some additional settings required to build Spark now. [...]
Re: Required file not found in building
Mac as well. Just found the problem: I had created an alias to zinc a couple of months back. Apparently that is not happy with the build anymore. No problem now that the issue has been isolated - just need to fix my zinc alias.

2014-12-01 18:55 GMT-08:00 Ted Yu yuzhih...@gmail.com:
> I tried the same command on a MacBook and didn't experience the same error. [...]
Re: Required file not found in building
The zinc src zip for 0.3.5.3 was downloaded and exploded. Then I ran sbt dist/create. zinc is being launched from dist/target/zinc-0.3.5.3/bin/zinc

2014-12-01 20:12 GMT-08:00 Ted Yu yuzhih...@gmail.com:

I use zinc 0.2.0 and started zinc with the same command shown below. I don't observe such an error. How did you install zinc-0.3.5.3?

Cheers

On Mon, Dec 1, 2014 at 8:00 PM, Stephen Boesch java...@gmail.com wrote:

Can anyone assist on how to run zinc with the latest maven build? I am starting zinc as follows:

/shared/zinc-0.3.5.3/dist/target/zinc-0.3.5.3/bin/zinc -scala-home $SCALA_HOME -nailed -start

The pertinent env vars are:

19:58:11/lib $ echo $SCALA_HOME
/shared/scala
19:58:14/lib $ which scala
/shared/scala/bin/scala
19:58:16/lib $ scala -version
Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL

When I do *not* start zinc, then the maven build works .. but very slowly, since no incremental compiler is available. When zinc is started as shown above, the error occurs in all of the modules except the parent:

[INFO] Using zinc server for incremental compilation
[INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.10.4,2.0.1,null)
[error] Required file not found: scala-compiler-2.10.4.jar
[error] See zinc -help for information about locating necessary files

2014-12-01 19:02 GMT-08:00 Stephen Boesch java...@gmail.com:
> Mac as well. Just found the problem: I had created an alias to zinc a couple of months back. [...]
Re: Required file not found in building
I used the following for brew:
http://repo.typesafe.com/typesafe/zinc/com/typesafe/zinc/dist/0.3.0/zinc-0.3.0.tgz

After starting zinc, I issued the same mvn command but didn't encounter the error you saw.

FYI

On Mon, Dec 1, 2014 at 8:18 PM, Stephen Boesch java...@gmail.com wrote:
> The zinc src zip for 0.3.5.3 was downloaded and exploded. Then I ran sbt dist/create. [...]
Monitoring Spark
Hello,

I'm running Spark on a cluster and I want to monitor how many nodes/cores are active at different (specific) points in the program. Is there any way to do this?

Thanks,
Isca
Can the Scala classes in the Spark source code be inherited by Java classes?
Hi,

Can the Scala classes in the Spark source code be inherited by (and otherwise used with OOP concepts from) Java classes? I want to customize some parts of the code, but I would like to do it in a Java environment.

Rgds

--
Niranda Perera
Software Engineer, WSO2 Inc.
Mobile: +94-71-554-8430
Twitter: @n1r44 (https://twitter.com/N1R44)
Re: Required file not found in building
I'm having no problems with the build or zinc on my Mac. I use zinc from "brew install zinc".

On Tue, Dec 2, 2014 at 3:02 AM, Stephen Boesch java...@gmail.com wrote:
> Mac as well. Just found the problem: I had created an alias to zinc a couple of months back. [...]
Re: Can the Scala classes in the Spark source code be inherited by Java classes?
Yes, they are compiled to classes in JVM bytecode just the same. You may find the generated code from Scala looks a bit strange and uses Scala-specific classes, but it's certainly possible to treat them like other Java classes.

On Tue, Dec 2, 2014 at 5:22 AM, Niranda Perera nira...@wso2.com wrote:
> Hi, Can the Scala classes in the Spark source code be inherited by Java classes? [...]
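To make the interop concrete, a minimal sketch (the class names here are invented for illustration, not taken from the Spark codebase): scalac emits ordinary JVM class files, so a Java class can extend a Scala class directly.

    // Greeter.scala -- an ordinary Scala class; scalac compiles it to a normal JVM class.
    class Greeter(val name: String) {
      def greet(): String = s"Hello, $name"
    }

    // A hypothetical Java file in the same project can then subclass it,
    // just as Sean describes:
    //
    //   public class LoudGreeter extends Greeter {
    //       public LoudGreeter(String name) { super(name); }
    //       @Override public String greet() { return super.greet().toUpperCase(); }
    //   }

The "looks a bit strange" part shows up mostly with traits carrying method bodies, companion objects, and default arguments, which compile to less obvious constructs (e.g. a synthetic Greeter$ class for a companion object).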
Re: Can the Scala classes in the Spark source code be inherited by Java classes?
Oops, my previous response wasn't sent properly to the dev list. Here you go, for archiving.

Yes, you can. Scala classes are compiled down to classes in bytecode. Take a look at this: https://twitter.github.io/scala_school/java.html

Note that questions like this are not exactly what this dev list is meant for ...

On Mon, Dec 1, 2014 at 9:22 PM, Niranda Perera nira...@wso2.com wrote:
> Hi, Can the Scala classes in the Spark source code be inherited by Java classes? [...]
Re: [VOTE] Release Apache Spark 1.2.0 (RC1)
Hey All,

Just an update: Josh, Andrew, and others are working to reproduce SPARK-4498 and fix it. Other than that issue, no serious regressions have been reported so far. If we are able to get a fix in for that soon, we'll likely cut another RC with the patch. Continued testing of RC1 is definitely appreciated!

I'll leave this vote open to allow folks to continue posting comments. It's fine to still give +1 from your own testing... i.e. you can assume at this point SPARK-4498 will be fixed before releasing.

- Patrick

On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
> +0.9 from me. Tested it on Mac and Windows (someone has to do it), and while things work, I noticed a few recent scripts don't have Windows equivalents. [...]