help me with setting up IntelliJ Idea development IDE for Spark

2013-10-27 Thread dachuan
spark. I count on this for jumping, otherwise I can simply use Vim. And I am pretty new to maven, embarrassing to say. thanks, dachuan. -- Dachuan Huang Cellphone: 614-390-7234 2015 Neil Avenue Ohio State University Columbus, Ohio U.S.A. 43210

Re: help me with setting up IntelliJ Idea development IDE for Spark

2013-10-27 Thread dachuan
thanks for the help! sbt/sbt gen-idea works perfectly in RHEL6, but it doesn't work in my cygwin (which is the machine I run IntelliJ Idea ..) It reports: $ sbt/sbt gen-idea Error: Unable to access jarfile /home/Dachuan/incubator-spark/sbt/sbt-launch-0.11.3-2.jar and sbt/sbt assembly, sb

Re: help me with setting up IntelliJ Idea development IDE for Spark

2013-10-28 Thread dachuan
't use cygwin for these jobs. cheers. On Sun, Oct 27, 2013 at 11:51 PM, dachuan wrote: > thanks for the help! > > sbt/sbt gen-idea works perfectly in RHEL6, but it doesn't work in my > cygwin (which is the machine I run IntelliJ Idea ..) > > It reports: > $ sbt/sbt ge

a question about lineage graphs in streaming

2013-11-02 Thread dachuan
itely? when you say "grow indefinitely", do you refer to lineage graph's width or the number of lineage graphs? thanks, dachuan.

Re: a question about lineage graphs in streaming

2013-11-02 Thread dachuan
ve no idea what application figure 3 is talking about. Mark, sorry I don't quite understand what you've said. thanks, dachuan. On Sat, Nov 2, 2013 at 4:35 PM, Mark Hamstra wrote: > You're coming at the paper from a different context than that in which it > was written. T

Re: a question about lineage graphs in streaming

2013-11-02 Thread dachuan
auses pre-checkpoint lineage to be forgotten, so > checkpointing is an effective means to control the growth of RDD state. > > > On Sat, Nov 2, 2013 at 2:24 PM, dachuan wrote: > > > It seems what Christopher said makes certain sense, because this round's > > RDD depends on

Re: hadoop configuration

2013-11-04 Thread dachuan
n of Configuration under > /core/src/main/scala/org/apache/hadoop/ > The only subdirectories in this directory are mapred and mapreduce. Does > anybody know where 'Configuration' is defined? >

a question about RDD.checkpoint()

2013-11-08 Thread dachuan
s RDD scala objects and RDD's data (which is managed by BlockManager) will be garbage collected? And could you please point me to the relevant source code region, if possible? thanks, dachuan.

a question about FetchFailed

2013-11-11 Thread dachuan
nd its parent stage will both need to be re-executed. If my understanding is correct, then why does Spark need to materialize the intermediate data? This is "restart" fault tolerance mechanism. thanks for your help, dachuan.

Fwd: real world streaming code

2014-01-27 Thread dachuan
This email, which includes my questions about spark streaming, is forwarded from user@mailing-list. Sorry about this, because I haven't got any reply yet. thanks, dachuan. -- Forwarded message -- From: dachuan Date: Fri, Jan 24, 2014 at 10:28 PM Subject: real world stre

Re: real world streaming code

2014-01-27 Thread dachuan
> > Mayur Rustagi > > Ph: +919632149971 > > http://www.sigmoidanalytics.com > > https://twitter.com/mayur_rustagi > > > > > > > > On Mon, Jan 27, 2014 at 10:52 PM, dachuan wrote: > > >

possible log info bug

2014-02-17 Thread dachuan
Hi, In spark-0.9.0-incubating, Master.scala, line 170 logInfo("Registering worker %s:%d with %d cores, %s RAM".format( host, workerPort, cores, Utils.megabytesToString(memory))) might need to be corrected to: logInfo("Registering worker %s:%d with %d cores, %s RAM".format(

Re: possible log info bug

2014-02-17 Thread dachuan
right, exactly. thanks! On Mon, Feb 17, 2014 at 12:18 PM, Andrew Ash wrote: > Hi dachuan, > > At first glance that does look like a bug. I've opened a pull request with > the change here: > > https://github.com/apache/incubator-spark/pull/608 > > Is that the fix

Re: Stackoverflow after a small change by me

2013-12-24 Thread Dachuan Huang
wrote: > Hello Dachuan, > > RDDs generated by StateDStream are checkpointed because the tree of RDD > dependencies (i.e. the RDD lineage) can grow indefinitely as each state RDD > depends on the state RDD from the previous batch of data. Checkpointing > save an RDD to HDFS to
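The point quoted above (each state RDD depends on the previous batch's state RDD, so lineage grows until a checkpoint cuts it) can be illustrated with a toy model. This is only a sketch in plain Python, not Spark code: `ToyRDD`, `run_batches`, and `checkpoint_every` are hypothetical names invented for illustration, and "checkpointing" is modeled simply as dropping the parent link.

```python
class ToyRDD:
    """Toy stand-in for an RDD: tracks only its parent link, no data."""

    def __init__(self, parent=None):
        self.parent = parent

    def checkpoint(self):
        # Assumption: once the data is saved to reliable storage,
        # the dependency on the parent is no longer needed.
        self.parent = None

    def lineage_depth(self):
        # Count how many ancestors must be kept to recompute this RDD.
        depth, node = 0, self
        while node.parent is not None:
            depth += 1
            node = node.parent
        return depth


def run_batches(num_batches, checkpoint_every=None):
    """Return the deepest lineage seen across num_batches state updates."""
    state = ToyRDD()
    max_depth = 0
    for batch in range(1, num_batches + 1):
        # Each batch's state depends on the previous batch's state.
        state = ToyRDD(parent=state)
        if checkpoint_every and batch % checkpoint_every == 0:
            state.checkpoint()
        max_depth = max(max_depth, state.lineage_depth())
    return max_depth


if __name__ == "__main__":
    print(run_batches(100))                       # no checkpointing
    print(run_batches(100, checkpoint_every=10))  # periodic checkpointing
```

Without checkpointing the depth in this model grows with the number of batches; with periodic checkpointing it stays bounded by the checkpoint interval.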