Konstantin,

   1. You need to install the Hadoop RPMs on all nodes. With Hadoop 2,
   each node then has HDFS & YARN.
   2. Then you need to install Spark on all nodes. I haven't had experience
   with HDP, but the tech preview might have installed Spark as well.
   3. In the end, you should have HDFS, YARN & Spark installed on all
   nodes.
   4. After installation, check the web consoles to make sure HDFS, YARN &
   Spark are running.
   5. Then you are ready to start experimenting with and developing Spark
   applications (see the smoke-test sketch below).
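
For a quick smoke test once everything is up, something along these lines
should work (a sketch only - adjust paths and versions to your install; I'm
assuming the Spark 1.0 download layout here):

  # check that HDFS and YARN respond (assumes the usual client configs)
  hdfs dfsadmin -report
  yarn node -list

  # run the bundled SparkPi example on YARN (jar name is illustrative)
  cd /path/to/spark
  ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 10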

HTH.
Cheers
<k/>


On Mon, Jul 7, 2014 at 2:34 AM, Konstantin Kudryavtsev <
kudryavtsev.konstan...@gmail.com> wrote:

> Guys, I'm not talking about running Spark on a VM; I have no problem with
> that.
>
> What confuses me is the following:
> 1) Hortonworks describes the installation process as RPMs on each node
> 2) the Spark home page says that everything I need is YARN
>
> And I'm stuck on understanding what I need to do to run Spark on YARN
> (do I need the RPM installations, or only to build Spark on an edge node?)
>
>
> Thank you,
> Konstantin Kudryavtsev
>
>
> On Mon, Jul 7, 2014 at 4:34 AM, Robert James <srobertja...@gmail.com>
> wrote:
>
>> I can say from my experience that getting Spark to work with Hadoop 2
>> is not for the beginner; after solving one problem after another
>> (dependencies, scripts, etc.), I went back to Hadoop 1.
>>
>> Spark's Maven build, EC2 scripts, and others all default to Hadoop 1 -
>> I'm not sure why, but, given that, Hadoop 2 has too many bumps.
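>>
>> (You can point the build at Hadoop 2 - per the Spark 1.0 "Building with
>> Maven" docs it's something like the line below, with versions matching
>> your cluster - but I still hit bumps afterwards:)
>>
>>   mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package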
>>
>> On 7/6/14, Marco Shaw <marco.s...@gmail.com> wrote:
>> > That is confusing based on the context you provided.
>> >
>> > This might take more time than I can spare to try to understand.
>> >
>> > For sure, you need to add Spark yourself to run it on the HDP 2.1 express VM.
>> >
>> > Cloudera's CDH 5 express VM includes Spark, but the service isn't
>> > running by default.
>> >
>> > I can't remember for MapR...
>> >
>> > Marco
>> >
>> >> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>> >> <kudryavtsev.konstan...@gmail.com> wrote:
>> >>
>> >> Marco,
>> >>
>> >> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that
>> >> you can try from
>> >> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>> >> HDP 2.1 means YARN, yet at the same time they propose to install an RPM.
>> >>
>> >> On the other hand, http://spark.apache.org/ says: "Integrated with
>> >> Hadoop - Spark can run on Hadoop 2's YARN cluster manager, and can read
>> >> any existing Hadoop data. If you have a Hadoop 2 cluster, you can run
>> >> Spark without any installation needed."
>> >>
>> >> And this is confusing to me... do I need the RPM installation or not?...
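>> >>
>> >> As far as I understand it, the "without any installation" option would
>> >> mean only something like the following from a single edge node that has
>> >> the Hadoop client configs (just my sketch; paths are illustrative):
>> >>
>> >>   export HADOOP_CONF_DIR=/etc/hadoop/conf
>> >>   ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
>> >>     --master yarn-cluster \
>> >>     ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 10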
>> >>
>> >>
>> >> Thank you,
>> >> Konstantin Kudryavtsev
>> >>
>> >>
>> >>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <marco.s...@gmail.com>
>> >>> wrote:
>> >>> Can you provide links to the sections that are confusing?
>> >>>
>> >>> My understanding is that the HDP1 binaries do not need YARN, while
>> >>> the HDP2 binaries do.
>> >>>
>> >>> Now, you can also install the Hortonworks Spark RPM...
>> >>>
>> >>> For production, in my opinion, RPMs are better for manageability.
>> >>>
>> >>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>> >>>> <kudryavtsev.konstan...@gmail.com> wrote:
>> >>>>
>> >>>> Hello, thanks for your message... I'm confused: Hortonworks suggests
>> >>>> installing the Spark RPM on each node, but the Spark main page says
>> >>>> that YARN is enough and I don't need to install it... What's the
>> >>>> difference?
>> >>>>
>> >>>> sent from my HTC
>> >>>>
>> >>>>> On Jul 6, 2014 8:34 PM, "vs" <vinayshu...@gmail.com> wrote:
>> >>>>> Konstantin,
>> >>>>>
>> >>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
>> >>>>> can try from
>> >>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>> >>>>>
>> >>>>> Let me know if you see issues with the tech preview.
>> >>>>>
>> >>>>> "spark PI example on HDP 2.0
>> >>>>>
>> >>>>> I downloaded spark 1.0 pre-build from
>> >>>>> http://spark.apache.org/downloads.html
>> >>>>> (for HDP2)
>> >>>>> The run example from spark web-site:
>> >>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi
>> >>>>> --master
>> >>>>> yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory
>> 2g
>> >>>>> --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>> >>>>>
>> >>>>> I got this error:
>> >>>>> Application application_1404470405736_0044 failed 3 times due to AM
>> >>>>> Container for appattempt_1404470405736_0044_000003 exited with
>> >>>>> exitCode: 1
>> >>>>> due to: Exception from container-launch:
>> >>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>> >>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>> >>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>> >>>>> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>> >>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> >>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >>>>> at java.lang.Thread.run(Thread.java:744)
>> >>>>> .Failing this attempt.. Failing the application.
>> >>>>>
>> >>>>> Unknown/unsupported param List(--executor-memory, 2048,
>> >>>>> --executor-cores, 1, --num-executors, 3)
>> >>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>> >>>>> Options:
>> >>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>> >>>>>   --class CLASS_NAME   Name of your application's main class (required)
>> >>>>> ...bla-bla-bla
>> >>>>> "
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> View this message in context:
>> >>>>>
>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>> >>>>> Sent from the Apache Spark User List mailing list archive at
>> >>>>> Nabble.com.
>> >>
>> >
>>
>
>
