Re: About Spark job web ui persist(JIRA-969)

2014-01-16 Thread Pillis Work
Hello, I wanted to write down, at a high level, the changes I was thinking of. Please feel free to critique and suggest changes. SparkContext: Starting a SparkContext will no longer start the UI. Instead, it will launch a SparkContextObserver (with the SparkListener trait) which will generate a SparkContext

Re: JavaNetworkWordCount Researches

2014-01-16 Thread Eduardo Costa Alfaia
Hi Tathagata, Thank you very much for the explanation. Another curiosity: I ran some tests with this code yesterday using three machines as workers, and I saw that RAM usage on one of these machines rose to about 90%, while on the others it hasn't changed dra

Changing print()

2014-01-16 Thread Eduardo Costa Alfaia
Hi guys, How could I change the def print() to print more lines in JavaNetworkWordCount.java? Thanks all

RE: About Spark job web ui persist(JIRA-969)

2014-01-16 Thread Xia, Junluan
Hi Pillis, It sounds good. 1. For SparkContextData, I think we could persist to HDFS rather than local disk (one SparkUI service may show more than one SparkContext). 2. We could also consider SparkContextData as one metrics input (MetricsSource); for a long-running Spark job, SparkContextData will be shown in

Re: About Spark job web ui persist(JIRA-969)

2014-01-16 Thread Pillis Work
Hi Junluan, 1. Yes, we could persist to HDFS or any FS. I think at a minimum we should persist it to local disk - keeps the core simple. We can think of HDFS interactions as level-2 functionality that can be implemented once we have a good local implementation. The persistence/hydration layer of a
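
To make the local-disk-first idea above concrete, here is a minimal sketch of a persistence/hydration layer hidden behind an interface, so that an HDFS-backed store could be added later without touching callers. Every name in it (ContextSnapshot, SnapshotStore, LocalDiskStore) is hypothetical and chosen for illustration; none of them is a Spark class, and the thread does not specify the actual contents of SparkContextData. Plain Java serialization stands in for whatever format the real design would use.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the data a UI would render; not a Spark class. */
class ContextSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;
    final String appName;
    final List<String> events = new ArrayList<>();

    ContextSnapshot(String appName) {
        this.appName = appName;
    }
}

/** Hypothetical persistence/hydration layer; other file systems could implement it later. */
interface SnapshotStore {
    void persist(ContextSnapshot snapshot) throws IOException;

    ContextSnapshot hydrate() throws IOException, ClassNotFoundException;
}

/** Local-disk implementation: writes the whole snapshot to a single file. */
class LocalDiskStore implements SnapshotStore {
    private final Path file;

    LocalDiskStore(Path file) {
        this.file = file;
    }

    @Override
    public void persist(ContextSnapshot snapshot) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(snapshot);
        }
    }

    @Override
    public ContextSnapshot hydrate() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (ContextSnapshot) in.readObject();
        }
    }
}

class SnapshotDemo {
    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("context-snapshot", ".bin");
        ContextSnapshot snapshot = new ContextSnapshot("demo-app");
        snapshot.events.add("job-0 started");

        SnapshotStore store = new LocalDiskStore(tmp);
        store.persist(snapshot);
        System.out.println(store.hydrate().events);
        Files.deleteIfExists(tmp);
    }
}
```

Because callers only see SnapshotStore, a UI service could hydrate snapshots from several applications without knowing where they were written.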

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Patrick Wendell
Hey Alex, Thanks for testing out this rc. Would you mind forking this into a different thread so we can discuss there? Also, does your application build and run correctly with Spark 0.8.1? That would determine whether the problem is specifically with this rc... Patrick

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Patrick Wendell
I also ran your example locally and it worked with 0.8.1 and 0.9.0-rc1. So it's possible somehow you are pulling in an older version of Spark or an incompatible version of Hadoop. - Patrick On Thu, Jan 16, 2014 at 9:39 AM, Patrick Wendell wrote: > Hey Alex, > > Thanks for testing out this rc. Wo

GC tuning for Spark

2014-01-16 Thread Kay Ousterhout
Hi all, I'm finding that Java GC can be a major performance bottleneck when running Spark at high (>50% or so) memory utilization. What GC tuning have people tried for Spark and how effective has it been? Thanks! Kay
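
Before tuning, it can help to quantify how much time a JVM actually spends in garbage collection. The sketch below uses the standard java.lang.management API to estimate the fraction of wall-clock time attributed to GC while a task runs. It is a generic JVM probe, not a Spark facility: the class name GcTimeProbe and the toy allocation workload are assumptions made for illustration, and the reported collection time is approximate (with concurrent collectors it can overlap application work).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: estimates the share of elapsed time spent in GC. */
class GcTimeProbe {

    /** Sums the approximate GC time (ms) reported by all collectors in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the task and returns GC time divided by wall-clock time for that run. */
    static double gcFraction(Runnable task) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = Math.max(1, (System.nanoTime() - start) / 1_000_000);
        return (double) (totalGcMillis() - gcBefore) / elapsedMillis;
    }

    public static void main(String[] args) {
        // Toy workload that churns through short-lived 64 KB buffers.
        double fraction = gcFraction(() -> {
            List<byte[]> buffers = new ArrayList<>();
            for (int i = 0; i < 2000; i++) {
                buffers.add(new byte[64 * 1024]);
                if (buffers.size() > 200) {
                    buffers.clear();
                }
            }
        });
        System.out.printf("Approximate GC fraction: %.3f%n", fraction);
    }
}
```

Comparing this fraction at different memory utilization levels gives a rough baseline against which any GC setting can be judged.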

Re: JavaNetworkWordCount Researches

2014-01-16 Thread Tathagata Das
Hi Eduardo, If the streaming data is sent to Worker X, then the data is stored in the memory of Worker X and another worker Y. If replication is disabled through the StorageLevel in the input stream, then it is stored only on Worker X. That is why you could be seeing one of the machines have high memory usage. The

Re: GC tuning for Spark

2014-01-16 Thread Tathagata Das
There are a bunch of tricks noted in the Tuning Guide. You may have seen them already, but I thought it's still worth mentioning for the record. Besides those, if you are concerned about consistent latency (that is, low variab

Re: GC tuning for Spark

2014-01-16 Thread Mark Hamstra
And, of course, there are the bigger-hammer-than-GC-tuning approaches using some combination of unchecked, off-heap and Tachyon. On Thu, Jan 16, 2014 at 11:54 AM, Tathagata Das wrote: > There are a bunch of tricks noted in the Tuning > Guide< > http://spark.incubator.apache.org/docs/latest/tuni

Re: GC tuning for Spark

2014-01-16 Thread Binh Nguyen
I think incorporating https://github.com/amplab/tachyon/wiki is a better solution. I remember Matei saying that it was in his plans, but I'm not sure about the ETA for it to happen. On Thu, Jan 16, 2014 at 12:30 PM, Mark Hamstra wrote: > And, of course, there are the bigger-hammer-than-GC-tuning a

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Patrick Wendell
I'll kick this vote off with a +1. On Thu, Jan 16, 2014 at 10:43 AM, Patrick Wendell wrote: > I also ran your example locally and it worked with 0.8.1 and > 0.9.0-rc1. So it's possible somehow you are pulling in an older > version of Spark or an incompatible version of Hadoop. > > - Patrick > > O

testing 0.9.0-incubating and maven

2014-01-16 Thread Alex Cozzi
Hi Patrick, thank you for testing. I think I found out what is wrong: I am trying to build my own examples, which also depend on another library which in turn depends on Hadoop 2.2. What was happening is that my library brings in Hadoop 2.2, while Spark depends on Hadoop 1.0.4 and then I think I g

Re: testing 0.9.0-incubating and maven

2014-01-16 Thread Patrick Wendell
Hey Alex, Maven profiles only affect the Spark build itself. They do not transitively affect your own build. Check out the docs on how to deploy applications on YARN: http://spark.incubator.apache.org/docs/latest/running-on-yarn.html When compiling your application, you should explicitly add th

Re: testing 0.9.0-incubating and maven

2014-01-16 Thread Alex Cozzi
Thanks for the help. I am making progress, but I found I need to do a bit of fiddling with excluding dependencies from Spark in order to have mine take effect. As soon as I have a working pom I will post it here as an example. Alex Cozzi alexco...@gmail.com
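
When two libraries pull in different Hadoop versions, as described in this thread, one quick check is to ask the running JVM which jar a class was actually loaded from. The sketch below does that with reflection only, so it compiles and runs even when Hadoop or Spark is absent. The class name ClasspathProbe and its methods are hypothetical helpers written for illustration; org.apache.hadoop.util.VersionInfo.getVersion() is Hadoop's own static method and is invoked reflectively here.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical helper for diagnosing which dependency version won on the classpath. */
class ClasspathProbe {

    /** Reports the jar or directory a class was loaded from, or that it is missing. */
    static String whereIs(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return className + " <- " + (source == null ? "bootstrap/JDK" : source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not on classpath";
        }
    }

    /** Returns the Hadoop version visible at runtime, if Hadoop is on the classpath. */
    static String hadoopVersion() {
        try {
            Class<?> versionInfo = Class.forName("org.apache.hadoop.util.VersionInfo");
            Method getVersion = versionInfo.getMethod("getVersion");
            return String.valueOf(getVersion.invoke(null));
        } catch (ReflectiveOperationException | LinkageError e) {
            return "unknown (Hadoop not on classpath)";
        }
    }

    public static void main(String[] args) {
        System.out.println(whereIs("org.apache.hadoop.util.VersionInfo"));
        System.out.println(whereIs("org.apache.spark.SparkContext"));
        System.out.println("Hadoop version: " + hadoopVersion());
    }
}
```

Running this from inside the application, with the same classpath the job uses, shows which version took effect; Maven's `mvn dependency:tree` gives the complementary build-time view of where each transitive dependency comes from.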

SparkR developer release

2014-01-16 Thread Shivaram Venkataraman
I'm happy to announce the developer preview of SparkR, an R frontend for Spark. SparkR presents Spark's API in R and allows you to write code in R and run the computation on a Spark cluster. You can try out SparkR today by installing it from our github repo at https://github.com/amplab-extras/Spark

Re: SparkR developer release

2014-01-16 Thread Chester Chen
This is something I have been looking for; I will definitely take a look. Chester On Jan 16, 2014, at 2:14 PM, Shivaram Venkataraman wrote: > I'm happy to announce the developer preview of SparkR, an R frontend > for Spark. SparkR presents Spark's API in R and allows you to write > c

Re : SparkR developer release

2014-01-16 Thread andy.petre...@gmail.com
Cool, that's awesome and something I'll surely investigate in the coming weeks. Great job! Sent from my HTC - Reply message - From: "Shivaram Venkataraman" To: , Cc: "Zongheng Yang" , "Matthew L Massie" Subject: SparkR developer release Date: Thu, Jan 16, 2014 23:14 I'm ha

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Matei Zaharia
+1 for me as well. I built and tested this on Mac OS X, and looked through the new docs. Matei On Jan 15, 2014, at 5:48 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark > (incubating) version 0.9.0. > > A draft of the release notes along with the c

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Reynold Xin
+1 On Thu, Jan 16, 2014 at 3:23 PM, Matei Zaharia wrote: > +1 for me as well. > > I built and tested this on Mac OS X, and looked through the new docs. > > Matei > > On Jan 15, 2014, at 5:48 PM, Patrick Wendell wrote: > > > Please vote on releasing the following candidate as Apache Spark > > (i

Re: SparkR developer release

2014-01-16 Thread Raja Pasupuleti
Nice! On Thu, Jan 16, 2014 at 5:14 PM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > I'm happy to announce the developer preview of SparkR, an R frontend > for Spark. SparkR presents Spark's API in R and allows you to write > code in R and run the computation on a Spark cluster. Y

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Ali Ghodsi
+1 Builds fine on Maverick, runs tests, spark-shell, sbt assembly, maven build, etc. --Ali On Thu, Jan 16, 2014 at 3:33 PM, Reynold Xin wrote: > +1 > > > On Thu, Jan 16, 2014 at 3:23 PM, Matei Zaharia >wrote: > > > +1 for me as well. > > > > I built and tested this on Mac OS X, and looked th

Re: About Spark job web ui persist(JIRA-969)

2014-01-16 Thread Pillis Work
Hello, If the changes are acceptable, I would like to request that the JIRA be assigned to me for implementation. Regards, pillis On Thu, Jan 16, 2014 at 9:28 AM, Pillis Work wrote: > Hi Junluan, > 1. Yes, we could persist to HDFS or any FS. I think at a minimum we should > persist it to local disk - keeps

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Henry Saputra
NOTICE and LICENSE look good. No more executable binaries/jars packaged with source - good. Hashes look good. Signatures look good. Untarred and ran sbt compile, assembly, test. Ran some examples in local mode and they ran OK. +1 - Henry On Wed, Jan 15, 2014 at 5:48 PM, Patrick Wendell wrote: > Please vo

Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-16 Thread Sean McNamara
+1 Compiled and ran fine on Mavericks and Precise. Sean On 1/16/14, 9:17 PM, "Henry Saputra" wrote: >NOTICE and LICENSE look good. > >No more executable binaries/jars packaged with source - good. > >Hashes look good. >Signatures look good. >Untar and run sbt compile, assembly, test. >Run some

RE: About Spark job web ui persist(JIRA-969)

2014-01-16 Thread Xia, Junluan
Hi Pillis, Would you mind submitting a more detailed design document? It might cover: 1. what data structures will be exposed to external UI/metrics/third parties 2. what the new UI looks like if you want to persist SparkContextData periodically 3. . Any other suggestions? -Original Message- From: