ally rather than compile them inside of their program. That's
>>> the one you mention here. You can choose to use this feature or not.
>>> If you know your configs are not going to change, then you don't need
>>> to set them with spark-submit.
>>>
>>
I have a Spark app which runs well on local master. I'm now ready to
put it on a cluster. What needs to be installed on the master? What
needs to be installed on the workers?
If the cluster already has Hadoop or YARN or Cloudera, does it still
need an install of Spark?
What is the purpose of spark-submit? Does it do anything outside of
the standard val conf = new SparkConf ... val sc = new SparkContext
... ?
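For concreteness, here is a minimal sketch of that standard setup, with a placeholder app name and a toy job. My current understanding (which may be wrong) is that spark-submit mostly wraps this: it puts the Spark assembly and the application jar on the classpath, fills in the master and any --conf settings, and distributes the jar to the workers.

import org.apache.spark.{SparkConf, SparkContext}

object SimpleJob {
  def main(args: Array[String]): Unit = {
    // Leave the master unset so spark-submit (or the MASTER variable) can supply it.
    val conf = new SparkConf().setAppName("simple-job")
    val sc = new SparkContext(conf)
    val count = sc.parallelize(1 to 1000).filter(_ % 3 == 0).count()
    println(s"multiples of 3: $count")
    sc.stop()
  }
}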
As a new user, I can definitely say that my experience with Spark has
been rather raw. The appeal of interactive, batch, and in between all
using more or less straight Scala is unarguable. But the experience
of deploying Spark has been quite painful, mainly because of gaps between
compile time and runtime.
> front of classpath, which should do the trick.
> However, I had no luck with this; see here:
>
> https://issues.apache.org/jira/browse/SPARK-1863
>
>
>
> On Mon, Jul 7, 2014 at 1:31 PM, Robert James
> wrote:
>
>> spark-submit includes a spark-assembly uber jar
iner need to provide the dependency at runtime.
>
> This assumes that Spark will work with the new versions of the common libraries.
>
> Of course, this is not a general solution even if it works (and it may not work).
>
> Chester
>
>
>
>
> On Mon, Jul 7, 2014 at 10:31 AM, Robert James
spark-submit includes a spark-assembly uber jar, which has older
versions of many common libraries. These conflict with some of the
dependencies we need. I have been racking my brain trying to find a
solution (including experimenting with ProGuard), but haven't been
able to find one: when we use spark-sub
When I use spark-submit (along with spark-ec2), I get dependency
conflicts: spark-assembly includes older versions of Apache Commons
Codec and HttpClient, and these conflict with many of the libs our
software uses.
Is there any way to resolve these? Or, if we use the precompiled
Spark, can we si
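One approach I have seen suggested for this kind of collision is shading: rename our own copies of the conflicting packages inside the assembly so they can never clash with the older ones in spark-assembly. A rough build.sbt sketch, assuming a version of sbt-assembly that supports shade rules (the renamed prefixes are arbitrary):

// Requires the sbt-assembly plugin; package names below are illustrative.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.apache.commons.codec.**" -> "shaded.commons.codec.@1").inAll,
  ShadeRule.rename("org.apache.http.**" -> "shaded.apache.http.@1").inAll
)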
I can say from my experience that getting Spark to work with Hadoop 2
is not for the beginner; after solving one problem after another
(dependencies, scripts, etc.), I went back to Hadoop 1.
Spark's Maven artifacts, EC2 scripts, and others all use Hadoop 1 - not
sure why, but, given that, Hadoop 2 has too man
If I've created a Spark EC2 cluster, how can I add or take away workers?
Also: If I use EC2 spot instances, what happens when Amazon removes
them? Will my computation be saved in any way, or will I need to
restart from scratch?
Finally: The spark-ec2 scripts seem to use Hadoop 1. How can I
confi
possible to make a jar assembly using your approach? How? If
not: How do you distribute the jars to the workers?
>
> On Sun, Jun 29, 2014 at 12:20 PM, Robert James
> wrote:
>
>> Although Spark's home page offers binaries for Spark 1.0.0 with Hadoop
>> 2, the Maven r
Although Spark's home page offers binaries for Spark 1.0.0 with Hadoop
2, the Maven repository only seems to have one version, which uses
Hadoop 1.
Is there a Maven artifact built against Hadoop 2? If so, what is its id?
If not: how can I use the prebuilt binaries with Hadoop 2? Do I just
copy the lib/
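One workaround I am considering is to compile against the prebuilt Hadoop-2 distribution's jars directly as unmanaged sbt dependencies, rather than using the Maven artifact. A sketch, where the path is a placeholder for wherever the download was unpacked (copying the assembly jar into the project's own lib/ directory would be the simpler variant):

// build.sbt
unmanagedBase := file("/opt/spark-1.0.0-bin-hadoop2/lib")   // placeholder path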
ne solve this problem? (Surely I'm not the only one
using Hadoop 2 and sbt or Maven or Ivy!)
> On Jun 26, 2014 11:07 AM, "Robert James" wrote:
>
>> Yes. As far as I can tell, Spark seems to be including Hadoop 1 via
>> its transitive dependency:
>> http://mvnre
nary for Hadoop 2, since it was compiled
> expecting that TaskAttemptContext is an interface. So the error
> indicates that Spark is also seeing Hadoop 1 classes somewhere.
>
> On Wed, Jun 25, 2014 at 4:41 PM, Robert James
> wrote:
>> After upgrading to Spark 1.0.0, I
After upgrading to Spark 1.0.0, I get this error:
ERROR org.apache.spark.executor.ExecutorUncaughtExceptionHandler -
Uncaught exception in thread Thread[Executor task launch
worker-2,5,main]
java.lang.IncompatibleClassChangeError: Found interface
org.apache.hadoop.mapreduce.TaskAttemptContext, bu
In case anyone else is having this problem: deleting all of Ivy's cache,
then doing an sbt clean, and then recompiling, repackaging, and
reassembling everything seems to have solved the problem. (From the sbt
docs, it seems that having to delete Ivy's cache indicates a bug in sbt.)
On 6/25/14, Ro
According to
http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.0.0
, Spark depends on Hadoop 1.0.4. What about the versions of Spark that
work with Hadoop 2? Do they also depend on Hadoop 1.0.4?
How does everyone handle this?
To add Spark to an sbt project, I do:
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"
How do I make sure that the Spark version which will be downloaded
will depend on, and use, Hadoop 2 and not Hadoop 1?
Even with a line:
libraryDependencies += "org.apache.h
browse/SPARK-2075
>
> -- Paul
>
> —
> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>
>
> On Wed, Jun 25, 2014 at 6:28 AM, Robert James
> wrote:
>
>> On 6/24/14, Robert James wrote:
>> > My app works fine under Spark 0.9. I just trie
On 6/24/14, Robert James wrote:
> My app works fine under Spark 0.9. I just tried upgrading to Spark
> 1.0, by downloading the Spark distro to a dir, changing the sbt file,
> and running sbt assembly, but now I get NoSuchMethodErrors when trying
> to use spark-submit.
>
>
On 6/24/14, Peng Cheng wrote:
> I got a 'NoSuchFieldError', which is of the same type. It's definitely a
> dependency jar conflict: the Spark driver loads its own jars, which in
> recent versions pull in many dependencies that are 1-2 years old. And if your
> newer-version dependency is in the same packag
My app works fine under Spark 0.9. I just tried upgrading to Spark
1.0, by downloading the Spark distro to a dir, changing the sbt file,
and running sbt assembly, but now I get NoSuchMethodErrors when trying
to use spark-submit.
I copied in the SimpleApp example from
http://spark.apache.org/docs/
We need a centralized Spark logging solution. Ideally, it should:
* Allow any Spark process to log at multiple levels (info, warn,
debug) using a single line, similar to log4j
* All logs should go to a central location - so, to read the logs, we
don't need to check each worker by itself
* Ideally
How can I write to Spark's logs from my client code?
What are the options to view those logs?
Besides the Web console, is there a way to read and grep the file?
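For the first question, a minimal sketch of writing to the same log4j logging that Spark itself uses (whether a message ends up in the driver's output or an executor's stderr depends on where the code runs):

import org.apache.log4j.Logger
import org.apache.spark.SparkContext

class Pipeline extends Serializable {
  // @transient + lazy so the logger is rebuilt on each worker instead of being serialized.
  @transient lazy val log = Logger.getLogger(getClass.getName)

  def run(sc: SparkContext): Unit = {
    log.info("starting run")                      // driver-side log line
    sc.parallelize(1 to 10).foreach { i =>
      log.debug(s"processing item $i")            // executor-side log line
    }
  }
}

As far as I can tell, in standalone mode each executor's stdout/stderr lands under that worker's work/ directory, which is what the web UI's log links point at, so grepping is possible but still per-machine.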
iableYouWantToUse` here
> )
>
>
>
> Sincerely,
>
> DB Tsai
> ---
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Fri, May 16, 2014 at 1:59 PM, Robert James
> wrote
I have Spark code which runs beautifully when MASTER=local. When I
run it with MASTER set to a spark ec2 cluster, the workers seem to
run, but the results, which are supposed to be written to AWS S3, don't
appear on S3. I'm at a loss for how to debug this. I don't see any
S3 exceptions anywhere.
Ca
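One thing I want to rule out is whether the credentials and the save are actually reaching Hadoop's s3n connector. A minimal sketch of how I understand that wiring (the bucket name, path, and credential environment variables are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("s3-debug")
val sc = new SparkContext(conf)
// Credentials go through hadoopConfiguration; the env var names are placeholders.
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))

val result = sc.parallelize(1 to 100).map(_ * 2)
println("records to write: " + result.count())         // force an action first, to confirm the job runs at all
result.saveAsTextFile("s3n://my-bucket/output/run-1")   // placeholder bucket and path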
What is the difference between a Spark Worker and a Spark Slave?
I've experienced the same bug, which I had to workaround manually. I
posted the details here:
http://stackoverflow.com/questions/23687081/spark-workers-unable-to-find-jar-on-ec2-cluster
On 5/15/14, DB Tsai wrote:
> Hi guys,
>
> I think it maybe a bug in Spark. I wrote some code to demonstrate th
What is a good way to pass config variables to workers?
I've tried setting them as environment variables via spark-env.sh, but, as
far as I can tell, the environment variables set there don't appear in
workers' environments. If I want to be able to configure all workers,
what's a good way to do i
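Two options I am experimenting with that avoid spark-env.sh entirely, sketched with a hypothetical setting name: having Spark set an environment variable in each executor, or broadcasting a plain config map and reading it inside closures.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("config-demo")
  // Option 1: Spark sets this variable in every executor's environment.
  .setExecutorEnv("MYAPP_API_ENDPOINT", "http://example.com")   // placeholder name and value
val sc = new SparkContext(conf)

// Option 2: broadcast a config map; closures read it on the workers.
val settings = sc.broadcast(Map("api.endpoint" -> "http://example.com"))
val seen = sc.parallelize(1 to 4).map(_ => settings.value("api.endpoint")).collect()
println(seen.mkString(", "))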
I'm using spark-ec2 to run some Spark code. When I set master to
"local", then it runs fine. However, when I set master to $MASTER,
the workers immediately fail, with java.lang.NoClassDefFoundError for
the classes.
I've used sbt-assembly to make a jar with the classes, confirmed using
jar tvf th
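In case it is relevant: the sketch below shows how I understand the jar can be handed to the SparkContext explicitly so the workers fetch it, rather than assuming it is already on their classpath (the path is a placeholder for whatever sbt-assembly produced):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-job")
  .setMaster(sys.env("MASTER"))
  .setJars(Seq("target/scala-2.10/myapp-assembly-0.1.jar"))   // placeholder path
val sc = new SparkContext(conf)
// Equivalent after construction: sc.addJar("target/scala-2.10/myapp-assembly-0.1.jar")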