Hi Ranjan,
Whatever code is passed as a closure to Spark operations like map,
flatMap, filter, etc. runs as part of a task.
Everything else runs in the driver.
Thanks,
Sourav
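For illustration, a minimal sketch of the distinction Sourav describes, written in Scala (the code in the thread is Java); the object name, input path, and loop are made up and not taken from the thread:

import org.apache.spark.SparkContext

object DriverVsTaskSketch {
  def main(args: Array[String]) {
    val sc = new SparkContext("local", "driver-vs-task")   // created in the driver
    var done = false                                        // a plain driver-side variable
    val lines = sc.textFile("input.txt")                    // defining the RDD also happens in the driver
    while (!done) {                                         // loop control runs in the driver
      val pairs = lines.map(line => (line, line.length))    // the closure passed to map runs inside tasks on the executors
      println(pairs.count())                                // count() launches the tasks; this println runs in the driver
      done = true                                           // updating the flag happens in the driver
    }
    sc.stop()
  }
}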
On Mon, Mar 10, 2014 at 12:03 PM, Sen, Ranjan [USA] wrote:
> Hi Patrick
>
> How do I know which part of the code is in the driver and
Just a correction - the strange auto-generated symbol stands for ". . . .".
Thanks again,
Ranjan
On 3/9/14, 11:33 PM, "Sen, Ranjan [USA]" wrote:
>Hi Patrick
>
>How do I know which part of the code is in the driver and which in task?
>The structure of my code is as below-
>
>Š
>
>S
Hi Patrick
How do I know which part of the code is in the driver and which in task?
The structure of my code is as below-
. . . .
static boolean done = false;
. . . .
public static void main(..
..
JavaRDD lines = ..
..
while (!done) {
..
while (..) {
JavaPairRDD<..> labs1 = labs.map(new PairFunction<.. );
Hey Sen,
Is your code in the driver or inside one of the tasks?
If it's in the tasks, the place you would expect these to be is in the
stdout file under //work/[stdout/stderr]. Are you seeing
at least the stderr logs in that folder? If not, then the tasks might not
be running on the worker machines.
There was a related issue fixed recently:
https://github.com/apache/spark/pull/103
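For illustration, a minimal sketch of where the output lands; the master URL, app name, and data below are made up. The println inside the closure is written to each worker's stdout file under its work directory, while the one outside appears on the driver console:

import org.apache.spark.SparkContext

object WhereDoesPrintlnGo {
  def main(args: Array[String]) {
    val sc = new SparkContext("spark://master:7077", "println-demo")  // hypothetical standalone master URL
    val data = sc.parallelize(1 to 100)
    data.foreach(x => println("task-side: " + x))   // runs in the tasks: lands in each worker's stdout file
    println("driver-side count: " + data.count())   // runs in the driver: shows up on the driver console
    sc.stop()
  }
}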
On Sun, Mar 9, 2014 at 8:40 PM, Koert Kuipers wrote:
> edit last line of sbt/sbt, after which i run:
> sbt/sbt test
>
>
> On Sun, Mar 9, 2014 at 10:24 PM, Sean Owen wrote:
>
>> How are you specifying th
I edit the last line of sbt/sbt, after which I run:
sbt/sbt test
On Sun, Mar 9, 2014 at 10:24 PM, Sean Owen wrote:
> How are you specifying these args?
> On Mar 9, 2014 8:55 PM, "Koert Kuipers" wrote:
>
>> i just checkout out the latest 0.9
>>
>> no matter what java options i use in sbt/sbt (i tried
I just checked out the latest 0.9.
No matter what java options I use in sbt/sbt (I tried -Xmx6G
-XX:MaxPermSize=2000m -XX:ReservedCodeCacheSize=300m) I keep getting
"java.lang.OutOfMemoryError: PermGen space" errors when running the tests.
Curiously, I managed to run the tests with the default dep
Hi, I am running CDH5b2. I have installed the hadoop2 version of Spark
0.9.0 for CDH5. I want to know if there is a compatible version of Shark that
will run with this combination.
Hi Aurelian,
First, Docker is not ready for production unless you know what you are doing
and are prepared for some risk.
Also, in my opinion, there is a lot of hard-coded configuration in the Spark
docker scripts; you would have to modify them for your goal.
Yours respectfully, Xuefeng Wu (吴雪峰)
> On March 10, 2014, at 12:33 AM, Aurelian
Hi All,
I have already set up Spark-0.9.0-incubating on our school's cluster. I
successfully ran the Spark PageRank demo located in
/spark-0.9.0-incubating/examples/src/main/scala/org/apache/spark/examples.
Problem 1. I want to run the TriangleCount example, whose source code is located
in /spark-0.9.0
Hi
I have some System.out.println calls in my Java code that work fine in a local
environment. But when I run the same code in standalone mode on an EC2
cluster, I do not see them in the worker stdout (on the worker node under /work) or on the driver console. Could you help me understand how do
Whoa, wait, the docker scripts are only used for testing purposes right
now. They have not been designed with the intention of replacing the
spark-ec2 scripts. For instance, there isn't an SSH server running, so you
cannot stop and restart the cluster (like sbin/stop-all.sh). Also, we
currently mount S
Hi,
I've installed Spark 0.8.1 on IDH 3.0.2, running on YARN.
My cluster has 3 servers: 1 is both NN and DN, the other 2 are DN only.
I managed to launch spark-shell and run the MLlib k-means.
The problem is that it is using only one node (the NN) and not running on the
other 2 DNs.
Please advise
My spark-env.sh file
Hi Dana,
It’s hard to tell exactly what is consuming time, but I’d suggest starting by
profiling the single application first. Three things to look at there:
1) How many stages and how many tasks per stage is Spark launching (in the
application web UI at http://:4040)? If you have hundreds of t
Hi,
Is the spark docker script now mature enough to substitute for the spark-ec2
script? Is anyone here using the docker script in production?
YARN also has this scheduling option.
The problem is that all of our applications have the same flow, where the first
stage is the heaviest and the rest are very small.
So when several requests (applications) start to run at the same time,
the first stage of each is scheduled in parallel, and fo
Yes TD,
I can use tcpdump to see whether the data are being accepted by the receiver
and whether they are arriving in the IP packets.
Thanks
On 3/8/14, 4:19, Tathagata Das wrote:
I am not sure how to debug this without any more information about the
source. Can you monitor on the receiver side
Hi,
Does GraphX currently support Giraph/Pregel's "aggregator" feature? I
was thinking of implementing a PageRank version that correctly
handles dangling vertices (i.e., vertices with no outlinks). Therefore I
would have to globally sum up the rank associated with them in every
iteration,
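If a Pregel-style aggregator is not available, one way to get the same effect is a plain RDD aggregate on every iteration. A minimal sketch of that idea, with a made-up graph and names (not from the thread):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.graphx._

object DanglingMassSketch {
  def main(args: Array[String]) {
    val sc = new SparkContext("local", "dangling-mass")
    // Hypothetical 4-vertex graph in which vertex 4 has no outlinks (dangling).
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1.0), Edge(2L, 3L, 1.0), Edge(3L, 4L, 1.0)))
    val graph = Graph.fromEdges(edges, 0.25)       // every vertex starts with rank 1/N
    val ranks: VertexRDD[Double] = graph.vertices

    // "Aggregator" step: globally sum the rank sitting on dangling vertices.
    // graph.outDegrees only lists vertices with at least one out-edge, so a
    // vertex missing from it is dangling.
    val danglingMass = ranks
      .leftJoin(graph.outDegrees) { (id, rank, deg) => if (deg.isEmpty) rank else 0.0 }
      .map(_._2)
      .sum()
    println("dangling mass this iteration: " + danglingMass)
    // ...redistribute this mass across all vertices in the rank-update step...

    sc.stop()
  }
}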
Hi Kane,
In the sequence file, the class is org.apache.hadoop.io.Text. You need to
convert Text to String. There are two approaches:
1. Use implicit conversions to convert Text to String automatically. I
recommend this one. E.g.,
val t2 = sc.sequenceFile[String, String]("/user/hdfs/e1Mseq")
t2.g
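The second approach is cut off above; as an illustrative sketch (not necessarily what was originally written), an explicit variant reads the raw Text keys and values and converts them by hand, using the same hypothetical path:

import org.apache.hadoop.io.Text

val raw = sc.sequenceFile("/user/hdfs/e1Mseq", classOf[Text], classOf[Text])  // assumes the spark-shell `sc`
val asStrings = raw.map { case (k, v) => (k.toString, v.toString) }           // toString copies out of the reused Writables
asStrings.take(5).foreach(println)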