Can you check in your RM's web UI how much of each resource Yarn
thinks you have available? You can also check that directly in the
Yarn configuration.
Perhaps it's not configured to use all of the available resources. (If
it was set up with Cloudera Manager, CM will reserve some room for
Hi Anson,
We've seen this error when incompatible classes are used in the driver
and executors (e.g., same class name, but the classes are different
and thus the serialized data is different). This can happen for
example if you're including some 3rd party libraries in your app's
jar, or changing
are using
CDH's version of Spark, not trying to run an Apache Spark release on
top of CDH, right? (If that's the case, then we could probably move
this conversation to cdh-us...@cloudera.org, since it would be
CDH-specific.)
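To illustrate the failure mode above with a sketch in plain Python (pickle stands in for Java serialization here; the class names are made up): the serialized bytes are interpreted using whatever class definition the receiving side has loaded, so "same name, different class" yields inconsistent objects.

```python
import pickle

class Point:
    def __init__(self, x):
        self.x = x

# "Driver" side: serialize an instance of the original class.
data = pickle.dumps(Point(42))

# Simulate the "executor" having a different class with the same name,
# e.g. pulled in from a conflicting 3rd-party jar bundled into the app.
class Point:  # same name, different definition
    def __init__(self, x, y=0):
        self.x, self.y = x, y

restored = pickle.loads(data)   # deserialized against the new definition
print(restored.x)               # 42
print(hasattr(restored, "y"))   # False: the instance no longer matches its class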
On Wed Nov 19 2014 at 4:52:51 PM Marcelo Vanzin van...@cloudera.com wrote:
Hi Yiming,
On Wed, Nov 19, 2014 at 5:35 PM, Yiming (John) Zhang sdi...@gmail.com wrote:
Thank you for your reply. I was wondering whether there is a method of
reusing locally-built components without installing them? That is, if I have
successfully built the spark project as a whole, how
Check the --files argument in the output of spark-submit -h.
On Thu, Nov 20, 2014 at 7:51 AM, Matt Narrell matt.narr...@gmail.com wrote:
How do I configure the files to be uploaded to YARN containers? So far, I’ve
only seen --conf spark.yarn.jar=hdfs://… which allows me to specify the
HDFS
Hi Tobias,
With the current Yarn code, packaging the configuration in your app's
jar and adding the -Dlog4j.configuration=log4jConf.xml argument to
the extraJavaOptions configs should work.
That's not the recommended way to get it to work, though, since this
behavior may change in the future.
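For reference, the configuration mentioned above could look like this as spark-defaults.conf entries (a sketch; log4jConf.xml is assumed to sit at the root of the application jar, and the file name is just an example):

```properties
spark.driver.extraJavaOptions    -Dlog4j.configuration=log4jConf.xml
spark.executor.extraJavaOptions  -Dlog4j.configuration=log4jConf.xml
```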
Hello,
On Mon, Nov 24, 2014 at 12:07 PM, aecc alessandroa...@gmail.com wrote:
This is the stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task not
serializable: java.io.NotSerializableException: $iwC$$iwC$$iwC$$iwC$AAA
- field (class
On Mon, Nov 24, 2014 at 1:56 PM, aecc alessandroa...@gmail.com wrote:
I checked sqlContext; they use it in the same way I would like to use my
class, they make the class Serializable with transient. Does this somehow
affect the whole pipeline of data movement? I mean, will I get performance
That's an interesting question for which I do not know the answer.
Probably a question for someone with more knowledge of the internals
of the shell interpreter...
On Mon, Nov 24, 2014 at 2:19 PM, aecc alessandroa...@gmail.com wrote:
Ok, great, I'm gonna do it that way, thanks :). However I
Hello,
What exactly are you trying to see? Workers don't generate any events
that would be logged by enabling that config option. Workers generate
logs, and those are captured and saved to disk by the cluster manager,
generally, without you having to do anything.
On Mon, Nov 24, 2014 at 7:46 PM,
On Tue, Dec 2, 2014 at 11:22 AM, Judy Nash
judyn...@exchange.microsoft.com wrote:
Any suggestion on how a user with a custom Hadoop jar can solve this issue?
You'll need to include all the dependencies for that custom Hadoop jar
to the classpath. Those will include Guava (which is not included in
wrote:
Thank you, Marcelo and Sean, mvn install is a good answer for my demands.
-----Original Message-----
From: Marcelo Vanzin [mailto:van...@cloudera.com]
Sent: November 21, 2014 1:47
To: yiming zhang
Cc: Sean Owen; user@spark.apache.org
Subject: Re: How to incrementally compile spark examples using mvn
Hi
and got weird
errors because some toy version I once built was stuck in my local Maven
repo and it somehow got priority over a real Maven repo).
On Fri, Dec 5, 2014 at 5:28 PM, Marcelo Vanzin van...@cloudera.com
wrote:
You can set SPARK_PREPEND_CLASSES=1 and it should pick your new mllib
Hello,
In CDH 5.2 you need to manually add Hive classes to the classpath of
your Spark job if you want to use the Hive integration. Also, be aware
that since Spark 1.1 doesn't really support the version of Hive
shipped with CDH 5.2, this combination is to be considered extremely
experimental.
On
Hello,
What do you mean by app that uses 2 cores and 8G of RAM?
Spark apps generally involve multiple processes. The command line
options you used affect only one of them (the driver). You may want to
take a look at similar configuration for executors. Also, check the
documentation:
it as a public API, but mostly for internal Hive use.
It can give you a few ideas, though. Also, SPARK-3215.
On Thu, Dec 11, 2014 at 5:41 PM, Marcelo Vanzin van...@cloudera.com wrote:
Hi Manoj,
I'm not aware of any public projects that do something like that,
except for the Ooyala server which you say
Hi Manoj,
I'm not aware of any public projects that do something like that,
except for the Ooyala server which you say doesn't cover your needs.
We've been playing with something like that inside Hive, though:
On Thu, Dec 11, 2014 at 5:33 PM, Manoj Samel manojsamelt...@gmail.com wrote:
Hi,
Hi Anton,
That could solve some of the issues (I've played with that a little
bit). But there are still some areas where this would be sub-optimal,
because Spark still uses system properties in some places and those
are global, not per-class loader.
(SparkSubmit is the biggest offender here, but
On Fri, Dec 19, 2014 at 4:05 PM, Haopu Wang hw...@qilinsoft.com wrote:
My application doesn’t depend on hadoop-client directly.
It only depends on spark-core_2.10 which depends on hadoop-client 1.0.4.
This can be checked by Maven repository at
How many cores / memory do you have available per NodeManager, and how
many cores / memory are you requesting for your job?
Remember that in Yarn mode, Spark launches num executors + 1
containers. The extra container, by default, reserves 1 core and about
1g of memory (more if running in cluster
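As a back-of-envelope illustration of the sizing above (the 10%-with-384MB-floor overhead rule and the ~1g AM container are assumptions about defaults, which vary by Spark version):

```python
# Rough YARN capacity math for a hypothetical request of 3 executors.
num_executors = 3
executor_mem_mb = 4096
# Per-executor overhead YARN accounts for (assumed: max(384 MB, 10%)).
overhead_mb = max(384, executor_mem_mb // 10)
am_mem_mb = 1024  # the extra container for the AM (~1g, 1 core by default)

total_mb = num_executors * (executor_mem_mb + overhead_mb) + am_mem_mb
print(total_mb)  # 14539 -- memory YARN must have free across NodeManagers
```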
If you don't specify your own log4j.properties, Spark will load the
default one (from
core/src/main/resources/org/apache/spark/log4j-defaults.properties,
which ends up being packaged with the Spark assembly).
You can easily override the config file if you want to, though; check
the Debugging
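A minimal override, placed in conf/log4j.properties or shipped with --files (a sketch; the pattern and the quieted logger below are just examples):

```properties
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Example: quiet a chatty package
log4j.logger.org.apache.spark.storage=WARN
```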
Hi Corey,
When you run on Yarn, Yarn's libraries are placed in the classpath,
and they have precedence over your app's. So, with Spark 1.2, you'll
get Guava 11 in your classpath (with Spark 1.1 and earlier you'd get
Guava 14 from Spark, so still a problem for you).
Right now, the option Markus
Hi Koert,
On Wed, Feb 4, 2015 at 11:35 AM, Koert Kuipers ko...@tresata.com wrote:
do i understand it correctly that on yarn the custom jars are truly
placed before the yarn and spark jars on the classpath? meaning at container
construction time, on the same classloader? that would be great
On Wed, Feb 4, 2015 at 1:12 PM, Koert Kuipers ko...@tresata.com wrote:
about putting stuff on classpath before spark or yarn... yeah you can shoot
yourself in the foot with it, but since the container is isolated it should
be ok, no? we have been using HADOOP_USER_CLASSPATH_FIRST forever with
Hi Corey,
On Wed, Feb 4, 2015 at 12:44 PM, Corey Nolet cjno...@gmail.com wrote:
Another suggestion is to build Spark by yourself.
I'm having trouble seeing what you mean here, Marcelo. Guava is already
shaded to a different package for the 1.2.0 release. It shouldn't be causing
conflicts.
As the error message says...
On Wed, Jan 14, 2015 at 3:14 PM, freedafeng freedaf...@yahoo.com wrote:
Error: Cluster deploy mode is currently not supported for python
applications.
Use yarn-client instead of yarn-cluster for pyspark apps.
--
Marcelo
You're specifying the queue in the spark-submit command line:
--queue thequeue
Are you sure that queue exists?
On Thu, Jan 15, 2015 at 11:23 AM, Manoj Samel manojsamelt...@gmail.com wrote:
Hi,
Setup is as follows
Hadoop Cluster 2.3.0 (CDH5.0)
- Namenode HA
- Resource manager HA
-
Hi Kane,
What's the complete command line you're using to submit the app? Where
do you expect these options to appear?
On Fri, Jan 16, 2015 at 11:12 AM, Kane Kim kane.ist...@gmail.com wrote:
I want to add some java options when submitting application:
--conf
Hi Kane,
Here's the command line you sent me privately:
./spark-1.2.0-bin-hadoop2.4/bin/spark-submit --class
SimpleApp --conf
spark.executor.extraJavaOptions=-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder --master local simpleapp.jar ./test.log
You're running the app in local mode. In that
On Thu, Jan 22, 2015 at 10:21 AM, Sean Owen so...@cloudera.com wrote:
I think a Spark site would have a lot less traffic. One annoyance is
that people can't figure out when to post on SO vs Data Science vs
Cross Validated.
Another is that a lot of the discussions we see on the Spark users
list
Hello,
On Tue, Feb 17, 2015 at 8:53 PM, dgoldenberg dgoldenberg...@gmail.com wrote:
I've tried setting spark.files.userClassPathFirst to true in SparkConf in my
program, also setting it to true in $SPARK_HOME/conf/spark-defaults.conf as
Is the code in question running on the driver or in some
https://issues.apache.org/jira/browse/SPARK-2356
Take a look through the comments, there are some workarounds listed there.
On Wed, Jan 28, 2015 at 1:40 PM, Wang, Ningjun (LNG-NPV)
ningjun.w...@lexisnexis.com wrote:
Has anybody successfully install and run spark-1.2.0 on windows 2008 R2 or
Hi Capitão,
Since you're using CDH, your question is probably more appropriate for
the cdh-u...@cloudera.org list.
The problem you're seeing is most probably an artifact of the way CDH
is currently packaged. You have to add Hive jars manually to your Spark
app's classpath if you want to use the
Spark doesn't really shade akka; it pulls a different build (kept
under the org.spark-project.akka group and, I assume, with some
build-time differences from upstream akka?), but all classes are still
in the original location.
The upgrade is a little more unfortunate than just changing akka,
Hi Alessandro,
You can look for a log line like this in your driver's output:
15/01/12 10:51:01 INFO storage.DiskBlockManager: Created local
directory at
/data/yarn/nm/usercache/systest/appcache/application_1421081007635_0002/spark-local-20150112105101-4f3d
If you're deploying your application
Short answer: yes.
Take a look at: http://spark.apache.org/docs/latest/running-on-yarn.html
Look for memoryOverhead.
On Mon, Jan 12, 2015 at 2:06 PM, Michael Albert
m_albert...@yahoo.com.invalid wrote:
Greetings!
My executors apparently are being terminated because they are
running beyond
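The setting referenced above, in spark-defaults.conf form (a sketch; the value is in MB, and 1024 is just an example to raise the headroom YARN accounts for per executor):

```properties
spark.yarn.executor.memoryOverhead  1024
```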
Hi Manoj,
As long as you're logged in (i.e. you've run kinit), everything should
just work. You can run klist to make sure you're logged in.
On Thu, Jan 8, 2015 at 3:49 PM, Manoj Samel manojsamelt...@gmail.com wrote:
Hi,
For running spark 1.2 on Hadoop cluster with Kerberos, what spark
I ran this with CDH 5.2 without a problem (sorry don't have 5.3
readily available at the moment):
$ HBASE='/opt/cloudera/parcels/CDH/lib/hbase/\*'
$ spark-submit --driver-class-path $HBASE --conf
spark.executor.extraClassPath=$HBASE --master yarn --class
org.apache.spark.examples.HBaseTest
On Thu, Jan 8, 2015 at 4:09 PM, Manoj Samel manojsamelt...@gmail.com wrote:
Some old communication (Oct 14) says Spark is not certified with Kerberos.
Can someone comment on this aspect ?
Spark standalone doesn't support kerberos. Spark running on top of
Yarn works fine with kerberos.
--
On Thu, Jan 8, 2015 at 3:33 PM, freedafeng freedaf...@yahoo.com wrote:
I installed the custom build in standalone mode as normal. The master and slaves
started successfully.
However, I got an error when I ran a job. It seems to me from the error message
that some library was compiled against hadoop1,
Disclaimer: this seems more of a CDH question, I'd suggest sending
these to the CDH mailing list in the future.
CDH 5.2 actually has Spark 1.1. It comes with SparkSQL built-in, but
it does not include the thrift server because of incompatibilities
with the CDH version of Hive. To use Hive
Disclaimer: CDH questions are better handled at cdh-us...@cloudera.org.
But the question I'd like to ask is: why do you need your own Spark
build? What's wrong with CDH's Spark that it doesn't work for you?
On Thu, Jan 8, 2015 at 3:01 PM, freedafeng freedaf...@yahoo.com wrote:
Could anyone come
the func1 and func2 from jars that are
already cached into local nodes?
Thanks,
Yitong
2015-02-09 14:35 GMT-08:00 Marcelo Vanzin van...@cloudera.com:
`func1` and `func2` never get serialized. They must exist on the other
end in the form of a class loaded by the JVM.
What gets serialized
`func1` and `func2` never get serialized. They must exist on the other
end in the form of a class loaded by the JVM.
What gets serialized is an instance of a particular closure (the
argument to your map function). That's a separate class. The
instance of that class that is serialized contains
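A rough analogue in plain Python, since pickle behaves much like Java serialization here: a module-level function is stored by reference (module plus name), not by value, so the receiving process must already have the code loaded, just as the executors' JVMs must have the class on their classpath.

```python
import math
import pickle

# Pickle a function: what travels is a reference (module name + attribute
# name), not the code object itself.
payload = pickle.dumps(math.sqrt)

print(b"sqrt" in payload)   # the module-qualified name is in the stream...
print(len(payload) < 100)   # ...but not the function's implementation
```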
and the user
that runs Spark in our case is a unix ID called mapr (in the mapr group).
Therefore, it can't read my job event logs as shown above.
Thanks,
Michael
-Original Message-
From: Marcelo Vanzin [mailto:van...@cloudera.com]
Sent: 07 January 2015 18:10
To: England, Michael (IT/UK
Nevermind my last e-mail. HDFS complains about not understanding 3777...
On Thu, Jan 8, 2015 at 9:46 AM, Marcelo Vanzin van...@cloudera.com wrote:
Hmm. Can you set the permissions of /apps/spark/historyserver/logs
to 3777? I'm not sure HDFS respects the group id bit, but it's worth a
try. (BTW
This could be cause by many things including wrong configuration. Hard
to tell with just the info you provided.
Is there any reason why you want to use your own Spark instead of the
one shipped with CDH? CDH 5.3 has Spark 1.2, so unless you really need
to run Spark 1.1, you should be better off
This particular case shouldn't cause problems since both of those
libraries are java-only (the scala version appended there is just for
helping the build scripts).
But it does look weird, so it would be nice to fix it.
On Wed, Jan 7, 2015 at 12:25 AM, Aniket Bhatnagar
aniket.bhatna...@gmail.com
The Spark code generates the log directory with 770 permissions. On
top of that you need to make sure of two things:
- all directories up to /apps/spark/historyserver/logs/ are readable
by the user running the history server
- the user running the history server belongs to the group that owns
Sorry for the noise; but I just remembered you're actually using MapR
(and not HDFS), so maybe the 3777 trick could work...
On Thu, Jan 8, 2015 at 10:32 AM, Marcelo Vanzin van...@cloudera.com wrote:
Nevermind my last e-mail. HDFS complains about not understanding 3777...
On Thu, Jan 8, 2015
Just to add to Sandy's comment, check your client configuration
(generally in /etc/spark/conf). If you're using CM, you may need to
run the Deploy Client Configuration command on the cluster to update
the configs to match the new version of CDH.
On Thu, Jan 8, 2015 at 11:38 AM, Sandy Ryza
You'll need to look at your application's logs. You can use yarn logs
-applicationId [id] to see them.
On Wed, Feb 18, 2015 at 2:39 AM, sachin Singh sachin.sha...@gmail.com wrote:
Hi,
I want to run my spark Job in Hadoop yarn Cluster mode,
I am using below command -
spark-submit --master
Those classes are not part of standard Spark. You may want to contact
Hortonworks directly if they're suggesting you use those.
On Wed, Mar 18, 2015 at 3:30 AM, patcharee patcharee.thong...@uni.no wrote:
Hi,
I am using spark 1.3. I would like to use Spark Job History Server. I added
the
I assume you're running YARN given the exception.
I don't know if this is covered in the documentation (I took a quick
look at the config document and didn't see references to it), but you
need to configure Spark's external shuffle service as an auxiliary
NodeManager service in your YARN
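A yarn-site.xml fragment for the NodeManager along these lines should do it (a sketch; it assumes the Spark YARN shuffle service jar is already on the NodeManager classpath, and keeps the usual mapreduce_shuffle entry):

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```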
Instead of opening a tunnel to the Spark web ui port, could you open a
tunnel to the YARN RM web ui instead? That should allow you to
navigate to the Spark application's web ui through the RM proxy, and
hopefully that will work better.
On Fri, Feb 6, 2015 at 9:08 PM, yangqch
IIRC you have to set that configuration on the Worker processes (for
standalone). The app can't override it (only for a client-mode
driver). YARN has a similar configuration, but I don't know the name
(shouldn't be hard to find, though).
On Thu, Mar 19, 2015 at 11:56 AM, Davies Liu
On Fri, Mar 6, 2015 at 2:47 PM, nitinkak001 nitinkak...@gmail.com wrote:
I am trying to run a Hive query from Spark using HiveContext. Here is the
code
/ val conf = new SparkConf().setAppName(HiveSparkIntegrationTest)
conf.set(spark.executor.extraClassPath,
It seems from the excerpt below that your cluster is set up to use the
Yarn ATS, and the code is failing in that path. I think you'll need to
apply the following patch to your Spark sources if you want this to
work:
https://github.com/apache/spark/pull/3938
On Thu, Mar 5, 2015 at 10:04 AM, Todd
+ ALT + V for copying
commands in the shell) and that results in closing my shell. In order to
solve this I was wondering if I could just deactivate the CTRL + C combination
altogether. Any ideas?
// Adamantios
On Fri, Mar 13, 2015 at 7:37 PM, Marcelo Vanzin van...@cloudera.com wrote:
You can type :quit.
On Fri, Mar 13, 2015 at 10:29 AM, Adamantios Corais
adamantios.cor...@gmail.com wrote:
Hi,
I want change the default combination of keys that exit the Spark shell
(i.e. CTRL + C) to something else, such as CTRL + H?
Thank you in advance.
// Adamantios
--
Marcelo
I've never tried it, but I'm pretty sure in the very least you want
-Pscala-2.11 (not -D).
On Thu, Mar 5, 2015 at 4:46 PM, Night Wolf nightwolf...@gmail.com wrote:
Hey guys,
Trying to build Spark 1.3 for Scala 2.11.
I'm running with the following Maven command:
-DskipTests -Dscala-2.11
Ah, and you may have to use dev/change-version-to-2.11.sh. (Again,
never tried compiling with scala 2.11.)
On Thu, Mar 5, 2015 at 4:52 PM, Marcelo Vanzin van...@cloudera.com wrote:
I've never tried it, but I'm pretty sure in the very least you want
-Pscala-2.11 (not -D).
On Thu, Mar 5, 2015
spark-submit --files /path/to/hive-site.xml
On Tue, Mar 24, 2015 at 10:31 AM, Udit Mehta ume...@groupon.com wrote:
Another question related to this, how can we propagate the hive-site.xml to
all workers when running in the yarn cluster mode?
On Tue, Mar 24, 2015 at 10:09 AM, Marcelo Vanzin
It does neither. If you provide a Hive configuration to Spark,
HiveContext will connect to your metastore server, otherwise it will
create its own metastore in the working directory (IIRC).
On Tue, Mar 24, 2015 at 8:58 AM, nitinkak001 nitinkak...@gmail.com wrote:
I am wondering if HiveContext
/spark-submit --class App1 --conf
spark.driver.userClassPathFirst=true --conf
spark.executor.userClassPathFirst=true
$HOME/projects/sparkapp/target/scala-2.10/sparkapp-assembly-1.0.jar
Thanks,
Alexey
On Tue, Mar 24, 2015 at 5:03 AM, Marcelo Vanzin van...@cloudera.com wrote:
You could build
That probably means there are not enough free resources in your cluster
to run the AM for the Spark job. Check your RM's web ui to see the
resources you have available.
On Wed, Mar 25, 2015 at 12:08 PM, Khandeshi, Ami
ami.khande...@fmr.com.invalid wrote:
I am seeing the same behavior. I have
Are those config values in spark-defaults.conf? I don't think you can
use ~ there - IIRC it does not do any kind of variable expansion.
On Mon, Mar 30, 2015 at 3:50 PM, Tom thubregt...@gmail.com wrote:
I have set
spark.eventLog.enabled true
as I try to preserve log files. When I run, I get
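In other words, spell the directory out with an absolute path (a sketch; the path is just an example, and the file: scheme makes the intent explicit):

```properties
spark.eventLog.enabled  true
spark.eventLog.dir      file:/home/hduser/spark/spark-events
```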
This sounds like SPARK-6532.
On Mon, Mar 30, 2015 at 1:34 PM, ARose ashley.r...@telarix.com wrote:
So, I am trying to build Spark 1.3.0 (standalone mode) on Windows 7 using
Maven, but I'm getting a build failure.
java -version
java version 1.8.0_31
Java(TM) SE Runtime Environment (build
and spark-env:
Log directory /home/hduser/spark/spark-events does not exist.
(Also, in the default /tmp/spark-events it also did not work)
On 30 March 2015 at 18:03, Marcelo Vanzin van...@cloudera.com wrote:
Are those config values in spark-defaults.conf? I don't think you can
use ~ there - IIRC
So, the error below is still showing the invalid configuration.
You mentioned in the other e-mails that you also changed the
configuration, and that the directory really, really exists. Given the
exception below, the only ways you'd get the error with a valid
configuration would be if (i) the
a text file, closed it and
viewed it, and deleted it (iii). My findings were reconfirmed by my
colleague. Any other ideas?
Thanks,
Tom
On 30 March 2015 at 19:19, Marcelo Vanzin van...@cloudera.com wrote:
So, the error below is still showing the invalid configuration.
You mentioned
?
Thanks a lot for the help
-AJ
On Mon, Mar 2, 2015 at 3:50 PM, Marcelo Vanzin van...@cloudera.com wrote:
What are you calling masternode? In yarn-cluster mode, the driver
is running somewhere in your cluster, not on the machine where you run
spark-submit.
The easiest way to get to the Spark UI
What are you calling masternode? In yarn-cluster mode, the driver
is running somewhere in your cluster, not on the machine where you run
spark-submit.
The easiest way to get to the Spark UI when using Yarn is to use the
Yarn RM's web UI. That will give you a link to the application's UI
.compute.amazonaws.com:9026 shows
me all the applications.
Do I have to do anything for port 8088, or is whatever I am seeing at port
9026 good? Attached is a screenshot.
Thanks
AJ
On Mon, Mar 2, 2015 at 4:24 PM, Marcelo Vanzin van...@cloudera.com wrote:
That's the RM's RPC port, not the web UI port
.
--
Kannan
On Thu, Feb 26, 2015 at 6:08 PM, Marcelo Vanzin van...@cloudera.com wrote:
On Thu, Feb 26, 2015 at 5:12 PM, Kannan Rajah kra...@maprtech.com wrote:
Also, I would like to know if there is a localization overhead when we
use
spark.executor.extraClassPath. Again, in the case
(URLClassLoader.java:355)
...
On Feb 25, 2015, at 5:24 PM, Marcelo Vanzin van...@cloudera.com wrote:
Guava is not in Spark. (Well, long version: it's in Spark but it's
relocated to a different package except for some special classes
leaked through the public API.)
If your app needs
On Fri, Feb 27, 2015 at 1:30 PM, Pat Ferrel p...@occamsmachete.com wrote:
@Marcelo do you mean by modifying spark.executor.extraClassPath on all
workers, that didn’t seem to work?
That's an app configuration, not a worker configuration, so if you're
trying to set it on the worker configuration
On Fri, Feb 27, 2015 at 1:42 PM, Pat Ferrel p...@occamsmachete.com wrote:
I changed in the spark master conf, which is also the only worker. I added a
path to the jar that has guava in it. Still can’t find the class.
Sorry, I'm still confused about what config you're changing. I'm
suggesting
Spark applications shown in the RM's UI should have an Application
Master link when they're running. That takes you to the Spark UI for
that application where you can see all the information you're looking
for.
If you're running a history server and add
spark.yarn.historyServer.address to your
On Wed, Mar 4, 2015 at 10:08 AM, Srini Karri skarri@gmail.com wrote:
spark.executor.extraClassPath
D:\\Apache\\spark-1.2.1-bin-hadoop2\\spark-1.2.1-bin-hadoop2.4\\bin\\classes
spark.eventLog.dir
D:/Apache/spark-1.2.1-bin-hadoop2/spark-1.2.1-bin-hadoop2.4/bin/tmp/spark-events
Seems like someone set up m2.mines.com as a mirror in your pom file
or ~/.m2/settings.xml, and it doesn't mirror Spark 1.2 (or does but is
in a messed up state).
On Wed, Mar 4, 2015 at 3:49 PM, kpeng1 kpe...@gmail.com wrote:
Hi All,
I am currently having problem with the maven dependencies for
Weird python errors like this generally mean you have different
versions of python in the nodes of your cluster. Can you check that?
On Tue, Mar 3, 2015 at 4:21 PM, subscripti...@prismalytics.io
subscripti...@prismalytics.io wrote:
Hi Friends:
We noticed the following in 'pyspark' happens when
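A quick sanity check for the version-mismatch suggestion above (a sketch: run locally this reports the driver's interpreter; shipped inside a map() closure on a real cluster, the same line would report each worker's interpreter):

```python
import sys

# Major.minor of the interpreter actually executing this code.
version = "%d.%d" % sys.version_info[:2]
print(version)
```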
On Wed, Feb 25, 2015 at 8:42 PM, Jim Kleckner j...@cloudphysics.com wrote:
So, should the userClassPathFirst flag work and there is a bug?
Sorry for jumping in the middle of conversation (and probably missing
some of it), but note that this option applies only to executors. If
you're trying to
Hi Anny,
You could play with creating your own log4j.properties that will write
the output somewhere else (e.g. to some remote mount, or remote
syslog). Sorry, but I don't have an example handy.
Alternatively, if you can use Yarn, it will collect all logs after the
job is finished and make them
SPARK_CLASSPATH is definitely deprecated, but my understanding is that
spark.executor.extraClassPath is not, so maybe the documentation needs
fixing.
I'll let someone who might know otherwise comment, though.
On Thu, Feb 26, 2015 at 2:43 PM, Kannan Rajah kra...@maprtech.com wrote:
Hi Dan,
This is a CDH issue, so I'd recommend using cdh-u...@cloudera.org for
those questions.
This is an issue with fixed in recent CM 5.3 updates; if you're not
using CM, or want a workaround, you can manually configure
spark.driver.extraLibraryPath and spark.executor.extraLibraryPath
to
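Something along these lines (a sketch; the native-library path is an assumption, so point it wherever your CDH parcel puts libhadoop):

```properties
spark.driver.extraLibraryPath    /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
spark.executor.extraLibraryPath  /opt/cloudera/parcels/CDH/lib/hadoop/lib/native
```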
On Thu, Feb 26, 2015 at 5:12 PM, Kannan Rajah kra...@maprtech.com wrote:
Also, I would like to know if there is a localization overhead when we use
spark.executor.extraClassPath. Again, in the case of hbase, these jars would
be typically available on all nodes. So there is no need to localize
bcc: user@, cc: cdh-user@
I recommend using CDH's mailing list whenever you have a problem with CDH.
That being said, you haven't provided enough info to debug the
problem. Since you're using CM, you can easily go look at the History
Server's logs and see what the underlying error is.
On Thu,
Hi there,
On Tue, Mar 24, 2015 at 1:40 PM, Manoj Samel manojsamelt...@gmail.com wrote:
When I run any query, it gives java.lang.NoSuchMethodError:
com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
Are you running a custom-compiled Spark by any chance?
Does your application actually fail?
That message just means there's another application listening on that
port. Spark should try to bind to a different one after that and keep
going.
On Tue, Mar 24, 2015 at 12:43 PM, , Roy rp...@njit.edu wrote:
I get following message for each time I run spark
Since you're using YARN, you should be able to download a Spark 1.3.0
tarball from Spark's website and use spark-submit from that
installation to launch your app against the YARN cluster.
So effectively you would have 1.2.0 and 1.3.0 side-by-side in your cluster.
On Wed, Mar 18, 2015 at 11:09
On Mon, Mar 23, 2015 at 2:15 PM, Manoj Samel manojsamelt...@gmail.com wrote:
Found the issue with the above error - the setting for spark_shuffle was incomplete.
Now it is able to ask for and get additional executors. The issue is once they
are released, it is not able to proceed with the next query.
That
You could build a far jar for your application containing both your
code and the json4s library, and then run Spark with these two
options:
spark.driver.userClassPathFirst=true
spark.executor.userClassPathFirst=true
Both only work in 1.3. (1.2 has spark.files.userClassPathFirst, but
that
This happens most probably because the Spark 1.3 you have downloaded
is built against an older version of the Hadoop libraries than those
used by CDH, and those libraries cannot parse the container IDs
generated by CDH.
You can try to work around this by manually adding CDH jars to the
front of
FYI I wrote a small test to try to reproduce this, and filed
SPARK-6688 to track the fix.
On Tue, Mar 31, 2015 at 1:15 PM, Marcelo Vanzin van...@cloudera.com wrote:
Hmmm... could you try to set the log dir to
file:/home/hduser/spark/spark-events?
I checked the code and it might be the case
On top of what's been said...
On Wed, Apr 22, 2015 at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
1) I can go to Spark UI and see the status of the APP but cannot see the
logs as the job progresses. How can i see logs of executors as they progress
?
Spark 1.3 should have links to the
You'd have to use spark.{driver,executor}.extraClassPath to modify the
system class loader. But that also means you have to manually
distribute the jar to the nodes in your cluster, into a common
location.
On Thu, Apr 23, 2015 at 6:38 PM, Night Wolf nightwolf...@gmail.com wrote:
Hi guys,
No, those have to be local paths.
On Thu, Apr 23, 2015 at 6:53 PM, Night Wolf nightwolf...@gmail.com wrote:
Thanks Marcelo, can this be a path on HDFS?
On Fri, Apr 24, 2015 at 11:52 AM, Marcelo Vanzin van...@cloudera.com
wrote:
You'd have to use spark.{driver,executor}.extraClassPath