Re: running the Terasort example

2014-12-16 Thread Ewan Higgs
Hi Tim, run-example is here: https://github.com/ehiggs/spark/blob/terasort/bin/run-example It should be in the repository that you cloned. So if you were at the top level of the checkout, run-example would be run as ./bin/run-example. Yours, Ewan Higgs On 12/12/14 01:06, Tim Harsch wrote:

Data Loss - Spark streaming

2014-12-16 Thread Jeniba Johnson
Hi, I need a clarification. While running the streaming examples, suppose the batch interval is set to 5 minutes. After the data is collected from the input source (Flume) and processed for those 5 minutes, what will happen to the data that is flowing continuously from the input source to Spark

RDD data flow

2014-12-16 Thread Madhu
I was looking at some of the Partition implementations in core/rdd and getOrCompute(...) in CacheManager. It appears that getOrCompute(...) returns an InterruptibleIterator, which delegates to a wrapped Iterator. That would imply that Partitions should extend Iterator, but that is not always the

Re: running the Terasort example

2014-12-16 Thread Tim Harsch
Hi Ewan, Thanks. I think I was just a bit confused at the time; I was looking at the spark-perf repo when there was the problem (uh.. ok)… I notice now, after a pull just minutes ago, that I still get a compile problem. [ERROR]

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
News flash! From the latest version of the GitHub API https://developer.github.com/v3/repos/statuses/: Note that the repo:status OAuth scope https://developer.github.com/v3/oauth/#scopes grants targeted access to Statuses *without* also granting access to repository code, while the repo scope

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
Actually, reading through the existing issue opened for this https://issues.apache.org/jira/browse/INFRA-7367 back in February, I don’t see any explanation from ASF Infra as to why they won’t grant permission against the Status API. They just recommended transitioning to the Apache Jenkins

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Reynold Xin
This was the ticket: https://issues.apache.org/jira/browse/INFRA-7918 On Tue, Dec 16, 2014 at 6:23 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Actually, reading through the existing issue opened for this https://issues.apache.org/jira/browse/INFRA-7367 back in February, I don’t

Re: Scala's Jenkins setup looks neat

2014-12-16 Thread Nicholas Chammas
I see. That’s a separate discussion about closing PRs vs. just updating the CI status on individual commits. I’ll comment on INFRA-7367 https://issues.apache.org/jira/browse/INFRA-7367. Nick On Tue Dec 16 2014 at 9:38:04 PM Reynold Xin r...@databricks.com wrote: This was the ticket:

Re: Interested in contributing to GraphX in Python

2014-12-16 Thread GregBowyer
I have been thinking about this for a little while, and I wonder if it makes sense to look at forcing off-heap mmap storage that can be shared with Python. The idea would be that Java makes a DirectByteBuffer (or similar), with Python doing a memoryview over that buffer. Then for all except for real

[RESULT] [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell
This vote has PASSED with 12 +1 votes (8 binding) and no 0 or -1 votes. +1: Matei Zaharia*, Madhu Siddalingaiah, Reynold Xin*, Sandy Ryza, Josh Rosen*, Mark Hamstra*, Denny Lee, Tom Graves*, GuiQiang Li, Nick Pentreath*, Sean McNamara*, Patrick Wendell*. 0: none. -1: none. I'll finalize and package this release in

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell
I'm closing this vote now, will send results in a new thread. On Sat, Dec 13, 2014 at 12:47 PM, Sean McNamara sean.mcnam...@webtrends.com wrote: +1 tested on OS X and deployed+tested our apps via YARN into our staging cluster. Sean On Dec 11, 2014, at 10:40 AM, Reynold Xin

[ANNOUNCE] Requiring JIRA for inclusion in release credits

2014-12-16 Thread Patrick Wendell
Hey All, Due to the very high volume of contributions, we're switching to an automated process for generating release credits. This process relies on JIRA for categorizing contributions, so it's not possible for us to provide credits in the case where users submit pull requests with no associated