spark-ec2 script does not install necessary files to launch Spark
Hello,

I followed the instructions for launching Spark 1.5.1 on AWS EC2, but the script is not installing all the folders and files required to initialize Spark. Since the log output is long, I have put it in a gist: https://gist.github.com/Emaasit/696145959bbbd989bfe1

Please help. I have been at this for more than 6 hours now with no success.

Daniel Emaasit,
Ph.D. Research Assistant
Transportation Research Center (TRC)
University of Nevada, Las Vegas
Las Vegas, NV 89154-4015
Cell: 615-649-2489
www.danielemaasit.com

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-script-doest-not-install-necessary-files-to-launch-spark-tp25311.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: includePackage() deprecated?
Got it. Ignore my similar question in the GitHub comments.

On Thu, Jun 4, 2015 at 11:48 AM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> Yeah - We don't have support for running UDFs on DataFrames yet. There is
> an open issue to track this
> https://issues.apache.org/jira/browse/SPARK-6817
>
> Thanks
> Shivaram
>
> On Thu, Jun 4, 2015 at 3:10 AM, Daniel Emaasit wrote:
>
>> Hello Shivaram,
>> Was the includePackage() function deprecated in SparkR 1.4.0?
>> I don't see it in the documentation? If it was, does that mean that we
>> can use R packages on Spark DataFrames the usual way we do for local R
>> dataframes?
>>
>> Daniel
includePackage() deprecated?
Hello Shivaram,

Was the includePackage() function deprecated in SparkR 1.4.0? I don't see it in the documentation. If it was, does that mean that we can use R packages on Spark DataFrames the usual way we do for local R data frames?

Daniel
Re: Spark 1.4.0 build Error on Windows
ven-shared-archive-resources\META-INF\NOTICE (The system cannot find the path specified) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

C:\Program Files\Apache Software Foundation\spark-branch-1.4>

On Tue, Jun 2, 2015 at 7:17 PM, Shivaram Venkataraman <shivaram.venkatara...@gmail.com> wrote:

> No worries - Also cc'ing user@spark.apache.org might get faster responses!
>
> Shivaram
>
> On Tue, Jun 2, 2015 at 6:05 PM, Daniel Emaasit wrote:
>
>> Oops, My bad. I was building from the wrong Directory.
>>
>> On Tue, Jun 2, 2015 at 5:57 PM, Daniel Emaasit wrote:
>>
>>> Hello Shivaram,
>>> While I was able to build Spark 1.3.0. I am getting errors building
>>> Spark 1.4.0. I was trying to build from the 1.4 branch from
>>> https://github.com/apache/spark/tree/branch-1.4
>>> Here is the log file.
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4>cd build
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>ls
>>> mvn sbt sbt-launch-lib.bash
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
>>> [INFO] Scanning for projects...
>>> [INFO]
>>> [INFO] BUILD FAILURE
>>> [INFO]
>>> [INFO] Total time: 0.469 s
>>> [INFO] Finished at: 2015-06-02T17:47:28-07:00
>>> [INFO] Final Memory: 4M/121M
>>> [INFO]
>>> [WARNING] The requested profile "sparkr" could not be activated because it does not exist.
>>> [WARNING] The requested profile "yarn" could not be activated because it does not exist.
>>> [WARNING] The requested profile "hadoop-2.4" could not be activated because it does not exist.
>>> [ERROR] The goal you specified requires a project to execute but there is no POM in this directory (C:\Program Files\Apache Software Foundation\spark-branch-1.4\build). Please verify you invoked Maven from the correct directory. -> [Help 1]
>>> [ERROR]
>>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>> [ERROR]
>>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MissingProjectException
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>
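For anyone hitting the same MissingProjectException: the quoted log shows Maven being run from the build directory, which has no pom.xml, so the profiles and the project cannot be found. Below is a minimal sketch of a guard against that, assuming a POSIX shell (for example Git Bash or Cygwin on Windows). find_pom_dir is a hypothetical helper written for this reply, not something that ships with Spark or Maven.

```shell
#!/bin/sh
# Hypothetical helper: walk upward from a starting directory (default: the
# current one) until a directory containing pom.xml is found, then print it.
find_pom_dir() {
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/pom.xml" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Give up once the filesystem root has been checked.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

Usage would look something like `root=$(find_pom_dir) && cd "$root"` before running the same mvn command as in the log, so that Maven starts in the directory that holds the top-level POM.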
Error: Building Spark 1.4.0 from Github-1.4 release branch
tem cannot find the path specified) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

C:\Program Files\Apache Software Foundation\spark-branch-1.4>

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-Building-Spark-1-4-0-from-Github-1-4-release-branch-tp23132.html
Re: DataFrames coming in SparkR in Apache Spark 1.4.0
You can build Spark from the 1.4 release branch yourself: https://github.com/apache/spark/tree/branch-1.4

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116p23131.html
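In case it helps, here is a rough sketch of what building from that branch could look like in a POSIX shell. The Maven flags are the ones used in the Windows build thread on this list, and they are an assumption about what you want (SparkR, YARN, Hadoop 2.4); adjust them for your setup. The names run, build_spark_branch and DRY_RUN are hypothetical, made up for this sketch.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: clone branch-1.4 into a target directory and build it.
build_spark_branch() {
    dest=${1:-spark-branch-1.4}
    run git clone --branch branch-1.4 https://github.com/apache/spark.git "$dest" || return 1
    # Maven must start in the repository root, where the top-level pom.xml lives.
    run cd "$dest" || return 1
    run mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
}
```

Setting DRY_RUN=1 and calling build_spark_branch first lets you see the three commands without cloning or compiling anything; without it the function clones and builds for real, which takes a while.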
DataFrames coming in SparkR in Apache Spark 1.4.0
For the impatient R user, here is a link to get started working with DataFrames using SparkR: http://people.apache.org/~pwendell/spark-nightly/spark-1.4-docs/latest/sparkr.html

Happy coding,
Daniel

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116.html
Re: IDE for sparkR
RStudio is the best IDE for running SparkR. Instructions for this can be found at this link <https://github.com/apache/spark/tree/branch-1.4/R>. You will need to set some environment variables as described below.

*Using SparkR from RStudio*

If you wish to use SparkR from RStudio or other R frontends, you will need to set some environment variables which point SparkR to your Spark installation. For example:

    # Set this to where Spark is installed
    Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
    # This line loads SparkR from the installed directory
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
    library(SparkR)
    sc <- sparkR.init(master="local")

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/IDE-for-sparkR-tp4764p23115.html
Book: Data Analysis with SparkR
Is there a book on SparkR for the absolute and terrified beginner? I use R for my daily analysis, and I am interested in a detailed guide to using SparkR for data analytics, such as a book or online tutorials. If there are any, please point me to them.

Thanks,
Daniel

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Book-Data-Analysis-with-SparkR-tp19529.html