Get attempt number in a closure

2014-10-20 Thread Yin Huai
Hello, Is there any way to get the attempt number in a closure? Seems TaskContext.attemptId actually returns the taskId of a task (see this https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L181 and this

Re: Get attempt number in a closure

2014-10-20 Thread Reynold Xin
I also ran into this earlier. It is a bug. Do you want to file a jira? I think part of the problem is that we don't actually have the attempt id on the executors. If we do, that's great. If not, we'd need to propagate that over. On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai huaiyin@gmail.com

Re: Get attempt number in a closure

2014-10-20 Thread Yin Huai
Yeah, seems we need to pass the attempt id to executors through TaskDescription. I have created https://issues.apache.org/jira/browse/SPARK-4014. On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin r...@databricks.com wrote: I also ran into this earlier. It is a bug. Do you want to file a jira? I

Re: Get attempt number in a closure

2014-10-20 Thread Patrick Wendell
There is a deeper issue here which is AFAIK we don't even store a notion of attempt inside of Spark, we just use a new taskId with the same index. On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai huaiyin@gmail.com wrote: Yeah, seems we need to pass the attempt id to executors through

Re: Get attempt number in a closure

2014-10-20 Thread Kay Ousterhout
Are you guys sure this is a bug? In the task scheduler, we keep two identifiers for each task: the index, which uniquely identifies the computation+partition, and the taskId which is unique across all tasks for that Spark context (See

Re: Get attempt number in a closure

2014-10-20 Thread Kay Ousterhout
Sorry to clarify, there are two issues here: (1) attemptId has different meanings in the codebase (2) we currently don't propagate the 0-based per-task attempt identifier to the executors. (1) should definitely be fixed. It sounds like Yin's original email was requesting that we add (2). On
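To make the distinction in this thread concrete, here is a toy model of the identifiers being discussed: a taskId that is unique across every launch, an index that names the computation+partition, and a 0-based attempt number counted per index. This is only a sketch under stated assumptions. ToyScheduler, TaskLaunch, and launch are hypothetical names made up for illustration and are not Spark's scheduler code or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TaskLaunch:
    """One launch of a task (hypothetical type, not a Spark class)."""
    task_id: int         # unique across all launches in this scheduler
    index: int           # identifies the computation + partition
    attempt_number: int  # 0-based count of launches for this index


class ToyScheduler:
    """Hands out identifiers the way the thread describes them (illustrative only)."""

    def __init__(self):
        self._next_task_id = count()                 # global, never reused
        self._attempts_per_index = defaultdict(int)  # per-index launch counter

    def launch(self, index):
        attempt = self._attempts_per_index[index]
        self._attempts_per_index[index] += 1
        return TaskLaunch(next(self._next_task_id), index, attempt)


scheduler = ToyScheduler()
first = scheduler.launch(index=0)  # first try of partition 0
other = scheduler.launch(index=1)  # first try of partition 1
retry = scheduler.launch(index=0)  # partition 0 is retried
print(first, other, retry)
```

In this model the retry gets a brand new task_id but keeps index 0, and only attempt_number says that it is the second try. That per-index counter is the value item (2) asks to have propagated to the executors.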

Re: Get attempt number in a closure

2014-10-20 Thread Reynold Xin
Yes, as I understand it this is for (2). Imagine a use case in which I want to save some output. In order to make this atomic, the program uses part_[index]_[attempt].dat, and once it finishes writing, it renames this to part_[index].dat. Right now [attempt] is just the TID, which could show up
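The write-then-rename pattern described in this message can be sketched as follows, on a local filesystem only. write_partition_atomically and its parameters are hypothetical names chosen for illustration, and the attempt value is simply passed in by the caller; how a task would obtain it inside Spark is exactly the open question of this thread.

```python
import os
import tempfile


def write_partition_atomically(out_dir, index, attempt, data):
    """Write to part_[index]_[attempt].dat, then rename to part_[index].dat.

    Hypothetical helper for illustration; not part of Spark.
    """
    tmp_path = os.path.join(out_dir, f"part_{index}_{attempt}.dat")
    final_path = os.path.join(out_dir, f"part_{index}.dat")

    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before publishing

    # os.replace overwrites the destination if it already exists.
    os.replace(tmp_path, final_path)
    return final_path


out_dir = tempfile.mkdtemp()
final = write_partition_atomically(out_dir, index=3, attempt=0, data=b"hello")
print(final)
```

Because each attempt writes to its own temporary name, two concurrent attempts of the same partition do not clobber each other's partial output. The rename is atomic on POSIX filesystems when both paths are on the same filesystem; other storage systems may give different guarantees, so treat that part as an assumption.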

Re: Get attempt number in a closure

2014-10-20 Thread Yin Huai
Yes, it is for (2). I was confused because the doc of TaskContext.attemptId (release 1.1) http://spark.apache.org/docs/1.1.0/api/scala/index.html#org.apache.spark.TaskContext says it is the number of attempts to execute this task. Seems the per-task attempt id used to populate the attempt field in the UI is

something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
Hi, I just submitted a patch https://github.com/apache/spark/pull/2864/files with a one-line change, but Jenkins told me it failed to compile on unrelated files? https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21935/console Best, Nan

Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
If one were to put together a short but comprehensive guide to setting up Spark to run locally on OS X, would it look like this? # Install Maven. On OS X, we suggest using Homebrew. brew install maven # Set some important Java and Maven environment variables. export

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Ted Yu
I performed build on latest master branch but didn't get compilation error. FYI On Mon, Oct 20, 2014 at 3:51 PM, Nan Zhu zhunanmcg...@gmail.com wrote: Hi, I just submitted a patch https://github.com/apache/spark/pull/2864/files with one line change but the Jenkins told me it's failed to

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
yes, I can compile locally, too but it seems that Jenkins is not happy now... https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/ All failed to compile Best, -- Nan Zhu On Monday, October 20, 2014 at 7:56 PM, Ted Yu wrote: I performed build on latest master branch but

Re: Building and Running Spark on OS X

2014-10-20 Thread Reynold Xin
I usually use SBT on Mac and that one doesn't require any setup ... On Mon, Oct 20, 2014 at 4:43 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: If one were to put together a short but comprehensive guide to setting up Spark to run locally on OS X, would it look like this? # Install

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
Yeah, I would use sbt too, but I thought if I wanted to publish a little reference page for OS X users then I probably should use the “official” (https://github.com/apache/spark#building-spark) build instructions. Nick On Mon, Oct 20, 2014 at 8:00 PM, Reynold Xin r...@databricks.com wrote: I

Re: Building and Running Spark on OS X

2014-10-20 Thread Denny Lee
+1 huge fan of sbt with OSX On Oct 20, 2014, at 17:00, Reynold Xin r...@databricks.com wrote: I usually use SBT on Mac and that one doesn't require any setup ... On Mon, Oct 20, 2014 at 4:43 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: If one were to put together a short

Re: Building and Running Spark on OS X

2014-10-20 Thread Sean Owen
Maven is at least built in to OS X (well, with dev tools). You don't even have to brew install it. Surely SBT isn't in the dev tools even? I recall I had to install it. I'd be surprised to hear it required zero setup. On Mon, Oct 20, 2014 at 8:04 PM, Nicholas Chammas nicholas.cham...@gmail.com

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
The failure is in the Kinesis component, can you reproduce this if you build with -Pkinesis-asl? - Patrick On Mon, Oct 20, 2014 at 5:08 PM, shane knapp skn...@berkeley.edu wrote: hmm, strange. i'll take a look. On Mon, Oct 20, 2014 at 5:11 PM, Nan Zhu zhunanmcg...@gmail.com wrote: yes, I

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
I think starting in Mavericks, Maven is no longer included by default http://stackoverflow.com/questions/19678594/maven-not-found-in-mac-osx-mavericks . On Mon, Oct 20, 2014 at 8:15 PM, Sean Owen so...@cloudera.com wrote: Maven is at least built in to OS X (well, with dev tools). You don't

Re: Building and Running Spark on OS X

2014-10-20 Thread Hari Shreedharan
The sbt executable that is in the spark repo can be used to build sbt without any other set up (it will download the sbt jars etc). Thanks, Hari On Mon, Oct 20, 2014 at 5:16 PM, Sean Owen so...@cloudera.com wrote: Maven is at least built in to OS X (well, with dev tools). You don't even

Re: Building and Running Spark on OS X

2014-10-20 Thread Sean Owen
Oh right, we're talking about the bundled sbt of course. And I didn't know Maven wasn't installed anymore! On Mon, Oct 20, 2014 at 8:20 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: The sbt executable that is in the spark repo can be used to build sbt without any other set up (it will

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread shane knapp
ok, so earlier today i installed a 2nd JDK within jenkins (7u71), which fixed the SparkR build but apparently made Spark itself quite unhappy. i removed that JDK, triggered a build ( https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21943/console), and it compiled kinesis w/o

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
So back to my original question... :) If we wanted to post this guide to the user list or to a gist for easy reference, would we rather have Maven or SBT listed? And is there anything else about the steps that should be modified? Nick On Mon, Oct 20, 2014 at 8:25 PM, Sean Owen

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
Thanks Shane - we should fix the source code issues in the Kinesis code that made stricter Java compilers reject it. - Patrick On Mon, Oct 20, 2014 at 5:28 PM, shane knapp skn...@berkeley.edu wrote: ok, so earlier today i installed a 2nd JDK within jenkins (7u71), which fixed the SparkR build

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
I created an issue to fix this: https://issues.apache.org/jira/browse/SPARK-4021 On Mon, Oct 20, 2014 at 5:32 PM, Patrick Wendell pwend...@gmail.com wrote: Thanks Shane - we should fix the source code issues in the Kinesis code that made stricter Java compilers reject it. - Patrick On Mon,

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread shane knapp
thanks, patrick! :) On Mon, Oct 20, 2014 at 5:35 PM, Patrick Wendell pwend...@gmail.com wrote: I created an issue to fix this: https://issues.apache.org/jira/browse/SPARK-4021 On Mon, Oct 20, 2014 at 5:32 PM, Patrick Wendell pwend...@gmail.com wrote: Thanks Shane - we should fix the

Re: Building and Running Spark on OS X

2014-10-20 Thread Jeremy Freeman
I also prefer sbt on Mac. You might want to add checking for / getting Python 2.6+ (though most modern Macs should have it), and maybe numpy as an optional dependency. I often just point people to Anaconda. — Jeremy - jeremyfreeman.net @thefreemanlab On Oct 20, 2014,