spark-ec2 script does not install necessary files to launch Spark

2015-11-06 Thread Emaasit
Hello,
I followed the instructions for launching Spark 1.5.1 on AWS EC2, but the
script is not installing all the folders/files required to initialize Spark.
Since the log message is long, I have created a gist here:
https://gist.github.com/Emaasit/696145959bbbd989bfe1

Please help. I have been at this for more than 6 hours now with no
success.



-
Daniel Emaasit, 
Ph.D. Research Assistant
Transportation Research Center (TRC)
University of Nevada, Las Vegas
Las Vegas, NV 89154-4015
Cell: 615-649-2489
www.danielemaasit.com 
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-script-doest-not-install-necessary-files-to-launch-spark-tp25311.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: includePackage() deprecated?

2015-06-04 Thread Daniel Emaasit
Got it. Please ignore my similar question in the GitHub comments.

On Thu, Jun 4, 2015 at 11:48 AM, Shivaram Venkataraman <
shiva...@eecs.berkeley.edu> wrote:

> Yeah - We don't have support for running UDFs on DataFrames yet. There is
> an open issue to track this
> https://issues.apache.org/jira/browse/SPARK-6817
>
> Thanks
> Shivaram
>
> On Thu, Jun 4, 2015 at 3:10 AM, Daniel Emaasit 
> wrote:
>
>> Hello Shivaram,
>> Was the includePackage() function deprecated in SparkR 1.4.0?
>> I don't see it in the documentation. If it was, does that mean that we
>> can use R packages on Spark DataFrames the usual way we do for local R
>> data frames?
>>
>> Daniel
>>
>




includePackage() deprecated?

2015-06-04 Thread Daniel Emaasit
Hello Shivaram,
Was the includePackage() function deprecated in SparkR 1.4.0?
I don't see it in the documentation. If it was, does that mean that we can
use R packages on Spark DataFrames the usual way we do for local R
data frames?

Daniel



Re: Spark 1.4.0 build Error on Windows

2015-06-03 Thread Daniel Emaasit
ven-shared-archive-resources\META-INF\NOTICE (The system cannot find the path specified) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
C:\Program Files\Apache Software Foundation\spark-branch-1.4>

On Tue, Jun 2, 2015 at 7:17 PM, Shivaram Venkataraman <
shivaram.venkatara...@gmail.com> wrote:

> No worries - Also cc'ing user@spark.apache.org might get faster responses!
>
> Shivaram
>
> On Tue, Jun 2, 2015 at 6:05 PM, Daniel Emaasit 
> wrote:
>
>> Oops, my bad. I was building from the wrong directory.
>>
>>
>> On Tue, Jun 2, 2015 at 5:57 PM, Daniel Emaasit 
>> wrote:
>>
>>> Hello Shivaram,
>>> While I was able to build Spark 1.3.0, I am getting errors building
>>> Spark 1.4.0. I was trying to build from the 1.4 branch at
>>> https://github.com/apache/spark/tree/branch-1.4
>>> Here is the log:
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4>cd build
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>ls
>>> mvn  sbt  sbt-launch-lib.bash
>>>
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
>>> [INFO] Scanning for projects...
>>> [INFO]
>>> [INFO] BUILD FAILURE
>>> [INFO]
>>> [INFO] Total time: 0.469 s
>>> [INFO] Finished at: 2015-06-02T17:47:28-07:00
>>> [INFO] Final Memory: 4M/121M
>>> [INFO]
>>> [WARNING] The requested profile "sparkr" could not be activated because it does not exist.
>>> [WARNING] The requested profile "yarn" could not be activated because it does not exist.
>>> [WARNING] The requested profile "hadoop-2.4" could not be activated because it does not exist.
>>> [ERROR] The goal you specified requires a project to execute but there is no POM in this directory (C:\Program Files\Apache Software Foundation\spark-branch-1.4\build). Please verify you invoked Maven from the correct directory. -> [Help 1]
>>> [ERROR]
>>> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>> [ERROR]
>>> [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MissingProjectException
>>> C:\Program Files\Apache Software Foundation\spark-branch-1.4\build>
>>>
>
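
The failure in the quoted log comes from invoking Maven inside the build directory, which has no pom.xml. A minimal sketch of a guard against that mistake follows. It is written in POSIX shell, so on Windows it assumes a Unix-like shell such as Git Bash or Cygwin rather than cmd.exe, and the function name find_pom_root is hypothetical (it is not part of Spark or Maven).

```shell
# find_pom_root: hypothetical helper, not part of Spark or Maven.
# Walks upward from a starting directory (default: current directory)
# and prints the first directory that contains a pom.xml.
# Returns 1 if the start directory is invalid or no pom.xml is found
# before reaching the filesystem root.
find_pom_root() {
    # Resolve the start directory to an absolute path in a subshell,
    # so the caller's working directory is left unchanged.
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -f "$dir/pom.xml" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop once the root directory has been checked.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

One possible use is to change into the printed directory before running the same Maven command as in the log, for example: cd "$(find_pom_root)" && mvn -Psparkr -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package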




Error: Building Spark 1.4.0 from Github-1.4 release branch

2015-06-03 Thread Emaasit
tem cannot find the path specified) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
C:\Program Files\Apache Software Foundation\spark-branch-1.4>



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Error-Building-Spark-1-4-0-from-Github-1-4-release-branch-tp23132.html



Re: DataFrames coming in SparkR in Apache Spark 1.4.0

2015-06-03 Thread Emaasit
You can build Spark from the 1.4 release branch yourself:
https://github.com/apache/spark/tree/branch-1.4



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116p23131.html



DataFrames coming in SparkR in Apache Spark 1.4.0

2015-06-02 Thread Emaasit
For the impatient R user, here is a link to get started working with
DataFrames using SparkR:
http://people.apache.org/~pwendell/spark-nightly/spark-1.4-docs/latest/sparkr.html

Happy coding,
Daniel



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/DataFrames-coming-in-SparkR-in-Apache-Spark-1-4-0-tp23116.html



Re: IDE for sparkR

2015-06-02 Thread Emaasit
RStudio is the best IDE for running SparkR.
Instructions can be found at
https://github.com/apache/spark/tree/branch-1.4/R . You will need to set
some environment variables, as described below.

*Using SparkR from RStudio*

If you wish to use SparkR from RStudio or other R frontends, you will need to
set some environment variables which point SparkR to your Spark
installation. For example:

# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local")
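
Before starting RStudio, it can help to confirm that the directory added to .libPaths() above actually holds a SparkR package. The following is a rough shell sketch, assuming the built package sits at $SPARK_HOME/R/lib/SparkR (an assumption about the install layout, not something stated in this thread). The function name check_sparkr_lib is hypothetical.

```shell
# check_sparkr_lib: hypothetical helper, not part of Spark.
# Takes a Spark install directory (default: $SPARK_HOME) and prints the
# R library path that the R snippet passes to .libPaths().
# Returns 1 if no directory is given and SPARK_HOME is unset, or if no
# SparkR directory is found (assumed layout: <spark home>/R/lib/SparkR).
check_sparkr_lib() {
    home=${1:-${SPARK_HOME:-}}
    if [ -z "$home" ]; then
        echo "SPARK_HOME is not set" >&2
        return 1
    fi
    lib="$home/R/lib"
    if [ -d "$lib/SparkR" ]; then
        printf '%s\n' "$lib"
        return 0
    fi
    echo "no SparkR package under $lib" >&2
    return 1
}
```

If the check fails, the likely causes are a SPARK_HOME that points at the wrong directory or a Spark build that did not include SparkR.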



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/IDE-for-sparkR-tp4764p23115.html



Book: Data Analysis with SparkR

2014-11-21 Thread Emaasit
Is there a book on SparkR for the absolute and terrified beginner?
I use R for my daily analysis and I am interested in a detailed guide to
using SparkR for data analytics, such as a book or online tutorials. If there
is one, please point me to it.

Thanks,
Daniel



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Book-Data-Analysis-with-SparkR-tp19529.html