If you want to process the data locally, why do you need to use sc.parallelize?
Store the data in regular Scala collections and use their methods to
process them (they have pretty much the same set of methods as Spark
RDDs). Then when you're happy, finally use Spark to process the
pre-processed in
(-dev@)
Try using the "yarn logs" command to read logs for finished
applications. You can also browse the RM UI to find more information
about the applications you ran.
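As a sketch, the command takes the application ID, which you can copy from the RM UI or the spark-submit output (the ID below is a made-up placeholder):

```shell
# Fetch the aggregated logs for a finished YARN application.
yarn logs -applicationId application_1443500000000_0001

# Optionally page through them:
yarn logs -applicationId application_1443500000000_0001 | less
```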
On Mon, Sep 28, 2015 at 11:37 PM, Rachana Srivastava
wrote:
> Hello all,
>
>
>
> I am trying to test JavaKafkaWordCount on Yarn
Seems like you have "hive.server2.enable.doAs" enabled; you can either
disable it, or configure hs2 so that the user running the service
("hadoop" in your case) can impersonate others.
See:
https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/Superusers.html
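A minimal core-site.xml sketch of the impersonation settings described at that link, assuming the service runs as the "hadoop" user (the wildcard values are the permissive form; restrict hosts and groups in production):

```xml
<!-- core-site.xml: allow the "hadoop" user to impersonate others -->
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
```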
On Fri, Sep 25, 201
On Fri, Sep 25, 2015 at 10:05 AM, Garry Chen wrote:
> In spark-defaults.conf the spark.master is spark://hostname:7077. From
> hive-site.xml
> spark.master
> hostname
>
That's not a valid value for spark.master (as the error indicates).
You should set it to "spark://hostname:7077"
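In hive-site.xml that looks like the following ("hostname" stands in for the real master host):

```xml
<!-- hive-site.xml: spark.master must be a full Spark URL, not a bare hostname -->
<property>
  <name>spark.master</name>
  <value>spark://hostname:7077</value>
</property>
```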
But that's not the complete application log. You say the streaming
context is initialized, but can you show that in the logs? There's
something happening that is causing the SparkContext to not be
registered with the YARN backend, and that's why your application is
being killed.
If you can share t
Did you look at your application's logs (using the "yarn logs" command?).
That error means your application is failing to create a SparkContext.
So either you have a bug in your code, or there will be some error in
the log pointing at the actual reason for the failure.
On Tue, Sep 22, 2015 at 5:4
What Spark package are you using? In particular, which hadoop version?
On Mon, Sep 21, 2015 at 9:14 AM, ekraffmiller
wrote:
> Hi,
> I’m trying to run a simple test program to access Spark though Java. I’m
> using JDK 1.8, and Spark 1.5. I’m getting an Exception from the
> JavaSparkContext const
On Mon, Sep 14, 2015 at 6:55 AM, Adrian Bridgett wrote:
> 15/09/14 13:00:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> 10.1.200.245): java.lang.IllegalArgumentException:
> java.net.UnknownHostException: nameservice1
> at
> org.apache.hadoop.security.SecurityUtil.buildTokenServic
Hi,
Just "spark.executor.userClassPathFirst" is not enough. You should
also set "spark.driver.userClassPathFirst". Also note that I don't
think this was really tested with the shell, but that should work with
regular apps started using spark-submit.
If that doesn't work, I'd recommend shading, as
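Passing both flags to a regular spark-submit run would look roughly like this (jar and class names are placeholders):

```shell
# Both the driver and executor sides need the flag; these options
# were marked experimental in the Spark 1.x line.
spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars my-deps.jar \
  --class com.example.MyApp my-app.jar
```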
On Thu, Sep 3, 2015 at 5:15 PM, Matei Zaharia wrote:
> Even simple Spark-on-YARN should run as the user that submitted the job,
> yes, so HDFS ACLs should be enforced. Not sure how it plays with the rest of
> Ranger.
It's slightly more complicated than that (without kerberos, the
underlying proce
On Wed, Aug 26, 2015 at 2:03 PM, Jerry wrote:
> Assuming your submitting the job from terminal; when main() is called, if I
> try to open a file locally, can I assume the machine is always the one I
> submitted the job from?
See the "--deploy-mode" option. "client" works as you describe;
"cluster
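The difference shows up on the spark-submit command line (app and class names are placeholders):

```shell
# client mode: the driver (and its main()) runs on the machine you
# submit from, so local file paths in driver code resolve there.
spark-submit --master yarn --deploy-mode client --class com.example.MyApp my-app.jar

# cluster mode: the driver runs inside the cluster, so "local" files
# are local to whichever node hosts the driver.
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar
```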
On Tue, Aug 25, 2015 at 1:50 PM, Utkarsh Sengar wrote:
> So do I need to manually copy these 2 jars on my spark executors?
Yes. I can think of a way to work around that if you're using YARN,
but not with other cluster managers.
> On Tue, Aug 25, 2015 at 10:51 AM, Marcelo Vanz
On Tue, Aug 25, 2015 at 10:48 AM, Utkarsh Sengar wrote:
> Now I am going to try it out on our mesos cluster.
> I assumed "spark.executor.extraClassPath" takes csv as jars the way "--jars"
> takes it but it should be ":" separated like a regular classpath jar.
Ah, yes, those options are just raw c
This probably means your app is failing and the second attempt is
hitting that issue. You may fix the "directory already exists" error
by setting
spark.eventLog.overwrite=true in your conf, but most probably that
will just expose the actual error in your app.
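In spark-defaults.conf that setting is:

```
# Allow a re-run to overwrite a previous attempt's event log directory.
# Most likely this just unmasks the original failure in the app.
spark.eventLog.overwrite  true
```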
On Tue, Aug 25, 2015 at 9:37 AM, Varad
On Mon, Aug 24, 2015 at 3:58 PM, Utkarsh Sengar wrote:
> That didn't work since "extraClassPath" flag was still appending the jars at
> the end, so its still picking the slf4j jar provided by spark.
Out of curiosity, how did you verify this? The "extraClassPath"
options are supposed to prepend en
a/com/opentable/logging/AssimilateForeignLogging.java#L68
>
> Thanks,
> -Utkarsh
>
>
>
>
> On Mon, Aug 24, 2015 at 3:04 PM, Marcelo Vanzin wrote:
>>
>> Hi Utkarsh,
>>
>> Unfortunately that's not going to be easy. Since Spark bundles all
>>
Hi Utkarsh,
Unfortunately that's not going to be easy. Since Spark bundles all
dependent classes into a single fat jar file, to remove that
dependency you'd need to modify Spark's assembly jar (potentially on
all your nodes). Doing that per-job is even trickier, because you'd
probably need some ki
That was only true until Spark 1.3. Spark 1.4 can be built with JDK7
and pyspark will still work.
On Fri, Aug 21, 2015 at 8:29 AM, Chen Song wrote:
> Thanks Sean.
>
> So how PySpark is supported. I thought PySpark needs jdk 1.6.
>
> Chen
>
> On Fri, Aug 21, 2015 at 11:16 AM, Sean Owen wrote:
>>
example.
> -Original Message-
> From: Marcelo Vanzin [mailto:van...@cloudera.com]
> Sent: Tuesday, August 18, 2015 5:15 PM
> To: Ellafi, Saif A.
> Cc: wrbri...@gmail.com; user@spark.apache.org
> Subject: Re: Scala: How to match a java object
>
> On Tue, Aug
On Tue, Aug 18, 2015 at 12:59 PM, wrote:
>
> 5 match { case java.math.BigDecimal => 2 }
5 match { case _: java.math.BigDecimal => 2 }
--
Marcelo
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional comman
On Fri, Aug 14, 2015 at 2:11 PM, Varadhan, Jawahar <
varad...@yahoo.com.invalid> wrote:
> And hence, I was planning to use Spark Streaming with Kafka or Flume with
> Kafka. But flume runs on a JVM and may not be the best option as the huge
> file will create memory issues. Please suggest someway t
That should not be a fatal error, it's just a noisy exception.
Anyway, it should go away if you add YARN gateways to those nodes (aside
from Spark gateways).
On Mon, Aug 3, 2015 at 7:10 PM, Upen N wrote:
> Hi,
> I recently installed Cloudera CDH 5.4.4. Sparks comes shipped with this
> version.
Hi Namit,
There's no need to assign a bug to yourself to say you're working on it.
The recommended way is to just post a PR on github - the bot will update
the bug saying that you have a patch open to fix the issue.
On Mon, Aug 3, 2015 at 3:50 PM, Namit Katariya
wrote:
> My username on the Apa
On Sat, Aug 1, 2015 at 9:25 AM, Akmal Abbasov
wrote:
> When I running locally(./run-example SparkPi), the event logs are being
> created, and I can start history server.
> But when I am trying
> ./spark-submit --class org.apache.spark.examples.SparkPi --master
> yarn-cluster file:///opt/hadoop/sp
"file" can be a directory (look at all children) or even a glob
("/path/*.ext", for example).
On Fri, Jul 31, 2015 at 11:35 AM, swetha wrote:
> Hi,
>
> How to add multiple sequence files from HDFS to a Spark Context to do Batch
> processing? I have something like the following in my code. Do I h
Can you share the part of the code in your script where you create the
SparkContext instance?
On Thu, Jul 30, 2015 at 7:19 PM, fordfarline wrote:
> Hi All,
>
> I`m having an issue when lanching an app (python) against a stand alone
> cluster, but runs in local, as it doesn't reach the cluster.
>
Can you run the windows batch files (e.g. spark-submit.cmd) from the cygwin
shell?
On Tue, Jul 28, 2015 at 7:26 PM, Proust GZ Feng wrote:
> Hi, Owen
>
> Add back the cygwin classpath detection can pass the issue mentioned
> before, but there seems lack of further support in the launch lib, see
>
First, it's kinda confusing to change subjects in the middle of a thread...
On Tue, Jul 28, 2015 at 1:44 PM, Elkhan Dadashov
wrote:
> @Marcelo
> *Question1*:
> Do you know why launching Spark job through SparkLauncher in Java, stdout
> logs (i.e., INFO Yarn.Client) are written into error stream
BTW this is most probably caused by this line in PythonRunner.scala:
System.exit(process.waitFor())
The YARN backend doesn't like applications calling System.exit().
On Tue, Jul 28, 2015 at 12:00 PM, Marcelo Vanzin
wrote:
> This might be an issue with how pyspark propagates t
This might be an issue with how pyspark propagates the error back to the
AM. I'm pretty sure this does not happen for Scala / Java apps.
Have you filed a bug?
On Tue, Jul 28, 2015 at 11:17 AM, Elkhan Dadashov
wrote:
> Thanks Corey for your answer,
>
> Do you mean that "final status : SUCCEEDED"
Hi Stephen,
There is no such directory currently. If you want to add an existing jar to
every app's classpath, you need to modify two config values:
spark.driver.extraClassPath and spark.executor.extraClassPath.
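For example, in spark-defaults.conf (the jar path is a placeholder; in YARN mode it must be valid on every node):

```
# Prepend an existing jar to the classpath of every app's driver and executors.
spark.driver.extraClassPath    /opt/libs/myjar.jar
spark.executor.extraClassPath  /opt/libs/myjar.jar
```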
On Mon, Jul 27, 2015 at 10:22 PM, Stephen Boesch wrote:
> when using spark-submit:
(bcc: user@spark, cc: cdh-user@cloudera)
This is a CDH issue, so I'm moving it to the CDH mailing list.
We're taking a look at how we're packaging dependencies so that these
issues happen less when running on CDH. But in the meantime, instead of
using "--jars", you could instead add the newer jar
On Wed, Jul 15, 2015 at 5:36 AM, Jeskanen, Elina
wrote:
> I have Spark 1.4 on my local machine and I would like to connect to our
> local 4 nodes Cloudera cluster. But how?
>
>
>
> In the example it says text_file = spark.textFile("hdfs://..."), but can
> you advise me in where to get this "hdfs
That has never been the correct way to set your app's classpath.
Instead, look at http://spark.apache.org/docs/latest/configuration.html and
search for "extraClassPath".
On Wed, Jul 15, 2015 at 9:43 AM, lokeshkumar wrote:
> Hi forum
>
> I have downloaded the latest spark version 1.4.0 and starte
On Tue, Jul 14, 2015 at 3:42 PM, Elkhan Dadashov
wrote:
> I looked into Virtual memory usage (jmap+jvisualvm) does not show that
> 11.5 g Virtual Memory usage - it is much less. I get 11.5 g Virtual memory
> usage using top -p pid command for SparkSubmit process.
>
If you're looking at top you w
On Tue, Jul 14, 2015 at 12:03 PM, Shushant Arora
wrote:
> Can a container have multiple JVMs running in YARN?
>
Yes and no. A container runs a single command, but that process can start
other processes, and those also count towards the resource usage of the
container (mostly memory). For example
On Tue, Jul 14, 2015 at 11:13 AM, Shushant Arora
wrote:
> spark-submit --class classname --num-executors 10 --executor-cores 4
> --master masteradd jarname
>
> Will it allocate 10 containers throughout the life of streaming
> application on same nodes until any node failure happens and
>
It will
On Tue, Jul 14, 2015 at 10:55 AM, Shushant Arora
wrote:
> Is yarn.scheduler.maximum-allocation-vcores the setting for max vcores per
> container?
>
I don't remember YARN config names by heart, but that sounds promising. I'd
look at the YARN documentation for details.
> Whats the setting for ma
On Tue, Jul 14, 2015 at 9:53 AM, Elkhan Dadashov
wrote:
> While the program is running, these are the stats of how much memory each
> process takes:
>
> SparkSubmit process : 11.266 *gigabyte* Virtual Memory
>
> ApplicationMaster process: 2303480 *byte *Virtual Memory
>
That SparkSubmit number l
On Tue, Jul 14, 2015 at 10:40 AM, Shushant Arora
wrote:
> My understanding was --executor-cores(5 here) are maximum concurrent
> tasks possible in an executor and --num-executors (10 here)are no of
> executors or containers demanded by Application master/Spark driver program
> to yarn RM.
>
--e
On Tue, Jul 14, 2015 at 9:57 AM, Shushant Arora
wrote:
> When I specify --executor-cores > 4 it fails to start the application.
> When I give --executor-cores as 4 , it works fine.
>
Do you have any NM that advertises more than 4 available cores?
Also, it's always worth it to check if there's a
You cannot run Spark in cluster mode by instantiating a SparkContext like
that.
You have to launch it with the "spark-submit" command line script.
On Thu, Jul 9, 2015 at 2:23 PM, jegordon wrote:
> Hi to all,
>
> Is there any way to run pyspark scripts with yarn-cluster mode without
> using
> th
SIGTERM on YARN generally means the NM is killing your executor because
it's running over its requested memory limits. Check your NM logs to make
sure. And then take a look at the "memoryOverhead" setting for driver and
executors (http://spark.apache.org/docs/latest/running-on-yarn.html).
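A sketch of bumping the overhead on the command line (values are illustrative; in Spark 1.x the settings were spark.yarn.{driver,executor}.memoryOverhead, in MB):

```shell
# Give each container headroom beyond the JVM heap so the NM does not
# kill executors for exceeding their requested memory allocation.
spark-submit \
  --master yarn \
  --executor-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --conf spark.yarn.driver.memoryOverhead=512 \
  --class com.example.MyApp my-app.jar
```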
On Tue,
when starting the process. You can check the Hadoop
sources for details. Not sure if there's another way.
>
> *From: *Marcelo Vanzin
> *Sent: *Friday, June 26, 2015 6:20 PM
> *To: *Dave Ariens
> *Cc: *Tim Chen; Olivier Girardot; user@spark.apache.org
> *Subject:
On Fri, Jun 26, 2015 at 3:09 PM, Dave Ariens wrote:
> Would there be any way to have the task instances in the slaves call the
> UGI login with a principal/keytab provided to the driver?
>
That would only work with a very small number of executors. If you have
many login requests in a short per
all work with Mesos.
> On Fri, Jun 26, 2015 at 1:20 PM, Marcelo Vanzin
> wrote:
>
>> On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen wrote:
>>
>>> So correct me if I'm wrong, sounds like all you need is a principal user
>>> name and also a keytab file dow
On Fri, Jun 26, 2015 at 1:13 PM, Tim Chen wrote:
> So correct me if I'm wrong, sounds like all you need is a principal user
> name and also a keytab file downloaded right?
>
I'm not familiar with Mesos so don't know what kinds of features it has,
but at the very least it would need to start cont
What master are you using? If this is not a "local" master, you'll need to
set LD_LIBRARY_PATH on the executors also (using
spark.executor.extraLibraryPath).
If you are using local, then I don't know what's going on.
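A sketch of setting the library path on both sides (the native-library directory is a placeholder):

```shell
# Native libraries must be visible to the executors too, not just
# the driver.
spark-submit \
  --driver-library-path /opt/native/lib \
  --conf spark.executor.extraLibraryPath=/opt/native/lib \
  --class com.example.MyApp my-app.jar
```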
On Fri, Jun 26, 2015 at 1:39 AM, Arunabha Ghosh
wrote:
> Hi,
> I'm having
y are local
> files, kmeans_data.txt is in HDFS.
>
>
> Thanks.
>
>
> On Thu, Jun 25, 2015 at 12:22 PM, Marcelo Vanzin
> wrote:
>
>> That sounds like SPARK-5479 which is not in 1.4...
>>
>> On Thu, Jun 25, 2015 at 12:17 PM, Elkhan Dadashov
>
That sounds like SPARK-5479 which is not in 1.4...
On Thu, Jun 25, 2015 at 12:17 PM, Elkhan Dadashov
wrote:
> In addition to previous emails, when i try to execute this command from
> command line:
>
> ./bin/spark-submit --verbose --master yarn-cluster --py-files
> mypython/libs/numpy-1.9.2.zip
That's not supported. You could use wget / curl to download the file to a
temp location before running spark-submit, though.
On Thu, Jun 11, 2015 at 12:48 PM, Gary Ogden wrote:
> I have a properties file that is hosted at a url. I would like to be able
> to use the url in the --properties-file p
I don't think it's propagated automatically. Try this:
spark-submit --conf "spark.executorEnv.PYTHONPATH=..." ...
On Wed, Jun 10, 2015 at 8:15 AM, Bob Corsaro wrote:
> I'm setting PYTHONPATH before calling pyspark, but the worker nodes aren't
> inheriting it. I've tried looking through the cod
So, I don't have an explicit solution to your problem, but...
On Wed, Jun 10, 2015 at 7:13 AM, Kostas Kougios <
kostas.koug...@googlemail.com> wrote:
> I am profiling the driver. It currently has 564MB of strings which might be
> the 1mil file names. But also it has 2.34 GB of long[] ! That's so
stop working, it's broken for good.
>
> On Tue, Jun 9, 2015 at 4:12 PM, Marcelo Vanzin
> wrote:
>
>> Apologies, I see you already posted everything from the RM logs that
>> mention your stuck app.
>>
>> Have you tried restarting the YARN cluster to see if
t seen anything like
this.
On Tue, Jun 9, 2015 at 1:01 PM, Marcelo Vanzin wrote:
> On Tue, Jun 9, 2015 at 11:31 AM, Matt Kapilevich
> wrote:
>
>> Like I mentioned earlier, I'm able to execute Hadoop jobs fine even now
>> - this problem is specific to Spark.
>>
>
On Tue, Jun 9, 2015 at 11:31 AM, Matt Kapilevich wrote:
> Like I mentioned earlier, I'm able to execute Hadoop jobs fine even now -
> this problem is specific to Spark.
>
That doesn't necessarily mean anything. Spark apps have different resource
requirements than Hadoop apps.
Check your RM log
If your application is stuck in that state, it generally means your cluster
doesn't have enough resources to start it.
In the RM logs you can see how many vcores / memory the application is
asking for, and then you can check your RM configuration to see if that's
currently available on any single
On Fri, Jun 5, 2015 at 12:55 PM, Lee McFadden wrote:
> Regarding serialization, I'm still confused as to why I was getting a
> serialization error in the first place as I'm executing these Runnable
> classes from a java thread pool. I'm fairly new to Scala/JVM world and
> there doesn't seem to b
Ignoring the serialization thing (seems like a red herring):
On Fri, Jun 5, 2015 at 11:48 AM, Lee McFadden wrote:
> 15/06/05 11:35:32 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> localhost): java.lang.NoSuchMethodError:
> org.apache.spark.executor.TaskMetrics.inputMetrics_$eq(Lscala
On Fri, Jun 5, 2015 at 11:48 AM, Lee McFadden wrote:
> Initially I had issues passing the SparkContext to other threads as it is
> not serializable. Eventually I found that adding the @transient annotation
> prevents a NotSerializableException.
>
This is really puzzling. How are you passing the
I talked to Don outside the list and he says that he's seeing this issue
with Apache Spark 1.3 too (not just CDH Spark), so it seems like there is a
real issue here.
On Wed, Jun 3, 2015 at 1:39 PM, Don Drake wrote:
> As part of upgrading a cluster from CDH 5.3.x to CDH 5.4.x I noticed that
> Spa
(bcc: user@spark, cc:cdh-user@cloudera)
If you're using CDH, Spark SQL is currently unsupported and mostly
untested. I'd recommend not trying to use it in CDH. You could try an upstream
version of Spark instead.
On Wed, Jun 3, 2015 at 1:39 PM, Don Drake wrote:
> As part of upgrading a cluster from
That code hasn't changed at all between 1.3 and 1.4; it also has been
working fine for me.
Are you sure you're using exactly the same Hadoop libraries (since you're
building with -Phadoop-provided) and Hadoop configuration in both cases?
On Tue, Jun 2, 2015 at 5:29 PM, Night Wolf wrote:
> Hi al
Take a look at the Spark History Server (see documentation).
On Mon, Jun 1, 2015 at 8:36 PM, Haopu Wang wrote:
> When I start the Spark master process, the old records are not shown in
> the monitoring UI.
>
> How to show the old records? Thank you very much!
>
>
> --
at 9:10 AM, Jianshi Huang
wrote:
> Yes, all written to the same directory on HDFS.
>
> Jianshi
>
> On Wed, May 27, 2015 at 11:57 PM, Marcelo Vanzin
> wrote:
>
>> You may be the only one not seeing all the logs. Are you sure all the
>> users are writing to the same
You may be the only one not seeing all the logs. Are you sure all the users
are writing to the same log directory? The HS can only read from a single
log directory.
On Wed, May 27, 2015 at 5:33 AM, Jianshi Huang
wrote:
> No one using History server? :)
>
> Am I the only one need to see all user'
Is it just me or does that look completely unrelated to
Spark-the-Apache-project?
On Tue, May 26, 2015 at 10:55 AM, Ted Yu wrote:
> Have you looked at https://github.com/spark/sparkjs ?
>
> Cheers
>
> On Tue, May 26, 2015 at 10:17 AM, marcos rebelo wrote:
>
>> Hi all,
>>
>> My first message on
Seems like there might be a mismatch between your Spark jars and your
cluster's HDFS version. Make sure you're using the Spark jar that matches
the hadoop version of your cluster.
On Thu, May 21, 2015 at 8:48 AM, roy wrote:
> Hi,
>
>After restarting Spark HistoryServer, it failed to come up,
15/05/19 14:10:47 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage/rdd,null}
> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped
> o.s.j.s.ServletContextHandler{/storage/json,null}
> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped
> o.s
Hi Shay,
Yeah, that seems to be a bug; it doesn't seem to be related to the default
FS nor compareFs either - I can reproduce this with HDFS when copying files
from the local fs too. In yarn-client mode things seem to work.
Could you file a bug to track this? If you don't have a jira account I ca
hould behave exactly the same as SPARK_CLASSPATH. It would be nice
to know whether that is also the case in 1.4 (I took a quick look at the
related code and it seems correct), but I don't have Mesos around to test.
>
> On Fri, May 15, 2015 at 12:04 PM, Marcelo Vanzin
> wrote:
y.
It would be really weird if those options worked differently from
SPARK_CLASSPATH, since they were meant to replace it.
On Fri, May 15, 2015 at 11:54 AM, Marcelo Vanzin
> wrote:
>
>> Ah, I see. yeah, it sucks that Spark has to expose Optional (and things
>> it depend
resolution is actually backport to 1.2.2, which is working
> fine.
>
>
>
>
>
> *From:* Marcelo Vanzin [mailto:van...@cloudera.com]
> *Sent:* Thursday, May 14, 2015 6:27 PM
> *To:* Anton Brazhnyk
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark's Guava pieces ca
What version of Spark are you using?
The bug you mention is only about the Optional class (and a handful of
others, but none of the classes you're having problems with). All other
Guava classes should be shaded since Spark 1.2, so you should be able to
use your own version of Guava with no problem
Are you actually running anything that requires all those slots? e.g.,
locally, I get this with "local[16]", but only after I run something that
actually uses those 16 slots:
"Executor task launch worker-15" daemon prio=10 tid=0x7f4c80029800
nid=0x8ce waiting on condition [0x7f4c62493000]
Note that `object` is equivalent to a class full of static fields / methods
(in Java), so the data it holds will not be serialized, ever.
What you want is a config class instead, so you can instantiate it, and
that instance can be serialized. Then you can easily do (1) or (3).
On Mon, May 11, 201
On Thu, May 7, 2015 at 7:39 PM, felicia wrote:
> we tried to add /usr/lib/parquet/lib & /usr/lib/parquet to SPARK_CLASSPATH
> and it doesn't seems to work,
>
To add the jars to the classpath you need to use "/usr/lib/parquet/lib/*",
otherwise you're just adding the directory (and not the files w
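As a sketch (the directory below is a throwaway placeholder), the distinction looks like this; note that "dir/*" is expanded by the JVM as a classpath entry, not by the shell, so it should stay quoted:

```shell
# A bare directory entry only picks up .class files; the /* wildcard
# entry makes the JVM load every jar inside the directory.
mkdir -p /tmp/parquet-demo/lib
touch /tmp/parquet-demo/lib/a.jar /tmp/parquet-demo/lib/b.jar

# Wrong: only the directory itself (no jars) ends up on the classpath
export SPARK_CLASSPATH="/tmp/parquet-demo/lib"

# Right: the wildcard entry covers every jar in the directory
export SPARK_CLASSPATH="/tmp/parquet-demo/lib/*"
echo "$SPARK_CLASSPATH"
```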
bci=5, line=45 (Interpreted frame)
> - java.lang.reflect.Constructor.newInstance(java.lang.Object[]) @bci=79,
> line=526 (Interpreted frame)
> - org.apache.spark.deploy.history.HistoryServer$.main(java.lang.String[])
> @bci=89, line=185 (Interpreted frame)
> - org.apache.spark.deploy.histo
2015-05-07 11:03 GMT-07:00 Marcelo Vanzin :
>
> Can you get a jstack for the process? Maybe it's stuck somewhere.
>>
>> On Thu, May 7, 2015 at 11:00 AM, Koert Kuipers wrote:
>>
>>> i am trying to launch the spark 1.3.1 history server on a secure cluster.
>>
Can you get a jstack for the process? Maybe it's stuck somewhere.
On Thu, May 7, 2015 at 11:00 AM, Koert Kuipers wrote:
> i am trying to launch the spark 1.3.1 history server on a secure cluster.
>
> i can see in the logs that it successfully logs into kerberos, and it is
> replaying all the log
What Spark tarball are you using? You may want to try the one for hadoop
2.6 (the one for hadoop 2.4 may cause that issue, IIRC).
On Tue, May 5, 2015 at 6:54 PM, felicia wrote:
> Hi all,
>
> We're trying to implement SparkSQL on CDH5.3.0 with cluster mode,
> and we get this error either using ja
Are you using a Spark build that matches your YARN cluster version?
That seems like it could happen if you're using a Spark built against
a newer version of YARN than you're running.
On Thu, Apr 2, 2015 at 12:53 AM, 董帅阳 <917361...@qq.com> wrote:
> spark 1.3.0
>
>
> spark@pc-zjqdyyn1:~> tail /etc/
On top of what's been said...
On Wed, Apr 22, 2015 at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
> 1) I can go to Spark UI and see the status of the APP but cannot see the
> logs as the job progresses. How can i see logs of executors as they progress
> ?
Spark 1.3 should have links to the executor logs in t
No, those have to be local paths.
On Thu, Apr 23, 2015 at 6:53 PM, Night Wolf wrote:
> Thanks Marcelo, can this be a path on HDFS?
>
> On Fri, Apr 24, 2015 at 11:52 AM, Marcelo Vanzin
> wrote:
>>
>> You'd have to use spark.{driver,executor}.extraClassPath to modi
You'd have to use spark.{driver,executor}.extraClassPath to modify the
system class loader. But that also means you have to manually
distribute the jar to the nodes in your cluster, into a common
location.
On Thu, Apr 23, 2015 at 6:38 PM, Night Wolf wrote:
> Hi guys,
>
> Having a problem build a
I think you want to take a look at:
https://issues.apache.org/jira/browse/SPARK-6207
On Mon, Apr 20, 2015 at 1:58 PM, Andrew Lee wrote:
> Hi All,
>
> Affected version: spark 1.2.1 / 1.2.2 / 1.3-rc1
>
> Posting this problem to user group first to see if someone is encountering
> the same problem.
I think Michael is referring to this:
"""
Exception in thread "main" java.lang.IllegalArgumentException: You
must specify at least 1 executor!
Usage: org.apache.spark.deploy.yarn.Client [options]
"""
spark-submit --conf spark.dynamicAllocation.enabled=true --conf
spark.dynamicAllocation.minExecut
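A complete invocation along those lines would look roughly like this (values are illustrative; on YARN, dynamic allocation also requires the external shuffle service on the NodeManagers):

```shell
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.shuffle.service.enabled=true \
  --class com.example.MyApp my-app.jar
```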
You can copy the dependencies to all nodes in your cluster, and then
use "spark.{executor,driver}.extraClassPath" to add them to the
classpath of your executors / driver.
On Mon, Apr 13, 2015 at 4:15 AM, Michael Weir
wrote:
> My app works fine with the single, "uber" jar file containing my app
Set spark.yarn.maxAppAttempts=1 if you don't want retries.
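Either on the command line (app and class names are placeholders) or in spark-defaults.conf:

```shell
# Fail fast instead of letting YARN retry the whole application.
spark-submit --master yarn \
  --conf spark.yarn.maxAppAttempts=1 \
  --class com.example.MyApp my-app.jar
```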
On Thu, Apr 9, 2015 at 10:31 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
> Hello,
> I have a spark job with 5 stages. After it runs 3rd stage, the console shows
>
>
> 15/04/09 10:25:57 INFO yarn.Client: Application report for
> application_1427705526386_127
"spark.eventLog.dir" should contain the full HDFS URL. In general,
this should be sufficient:
spark.eventLog.dir=hdfs:/user/spark/applicationHistory
On Wed, Apr 8, 2015 at 6:45 AM, Vijayasarathy Kannan wrote:
> I am trying to run a Spark application using spark-submit on a cluster using
> Cloud
Maybe you have some sbt-built 1.3 version in your ~/.ivy2/ directory that's
masking the maven one? That's the only explanation I can come up with...
On Tue, Apr 7, 2015 at 12:22 PM, Jacek Lewandowski <
jacek.lewandow...@datastax.com> wrote:
> So weird, as I said - I created a new empty project wh
The Spark history server does not have the ability to serve executor
logs currently. You need to use the "yarn logs" command for that.
On Tue, Apr 7, 2015 at 2:51 AM, donhoff_h <165612...@qq.com> wrote:
> Hi, Experts
>
> I run my Spark Cluster on Yarn. I used to get executors' Logs from Spark's
>
BTW, just out of curiosity, I checked both the 1.3.0 release assembly
and the spark-core_2.10 artifact downloaded from
http://mvnrepository.com/, and neither contain any references to
anything under "org.eclipse" (all referenced jetty classes are the
shaded ones under org.spark-project.jetty).
On
FYI I wrote a small test to try to reproduce this, and filed
SPARK-6688 to track the fix.
On Tue, Mar 31, 2015 at 1:15 PM, Marcelo Vanzin wrote:
> Hmmm... could you try to set the log dir to
> "file:/home/hduser/spark/spark-events"?
>
> I checked the code and it migh
Try "sbt assembly" instead.
On Wed, Apr 1, 2015 at 10:09 AM, Vijayasarathy Kannan wrote:
> Why do I get
> "Failed to find Spark assembly JAR.
> You need to build Spark before running this program." ?
>
> I downloaded "spark-1.2.1.tgz" from the downloads page and extracted it.
> When I do "sbt pac
d in the error message (i, ii), created a text file, closed it an
> viewed it, and deleted it (iii). My findings were reconfirmed by my
> colleague. Any other ideas?
>
> Thanks,
>
> Tom
>
>
> On 30 March 2015 at 19:19, Marcelo Vanzin wrote:
>>
>> So, the error
So, the error below is still showing the invalid configuration.
You mentioned in the other e-mails that you also changed the
configuration, and that the directory really, really exists. Given the
exception below, the only ways you'd get the error with a valid
configuration would be if (i) the dire
k-env:
> "Log directory /home/hduser/spark/spark-events does not exist."
> (Also, in the default /tmp/spark-events it also did not work)
>
> On 30 March 2015 at 18:03, Marcelo Vanzin wrote:
>>
>> Are those config values in spark-defaults.conf? I don't think you
Are those config values in spark-defaults.conf? I don't think you can
use "~" there - IIRC it does not do any kind of variable expansion.
On Mon, Mar 30, 2015 at 3:50 PM, Tom wrote:
> I have set
> spark.eventLog.enabled true
> as I try to preserve log files. When I run, I get
> "Log directory /tm
This sounds like SPARK-6532.
On Mon, Mar 30, 2015 at 1:34 PM, ARose wrote:
> So, I am trying to build Spark 1.3.0 (standalone mode) on Windows 7 using
> Maven, but I'm getting a build failure.
>
> java -version
> java version "1.8.0_31"
> Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
> Jav