Re: Is there any way to stop a jenkins build

2015-12-29 Thread Josh Rosen
Yeah, I thought that my quick fix might address the HiveThriftBinaryServerSuite hanging issue, but it looks like it didn't work, so I'll now have to do the more principled fix of using a UDF that sleeps for some amount of time. In order to stop builds, you need to have a Jenkins account with the
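The actual patch is not included in this digest; the following is only a rough sketch of the "UDF which sleeps" idea for a Spark 1.6 spark-shell, with the UDF name and duration invented for illustration.

// Sketch only: register a UDF that blocks for a given number of milliseconds, so a
// test query runs for a predictable minimum amount of time. "sleep_ms" and the
// 5000 ms value are assumptions, not the names used in the real fix.
sqlContext.udf.register("sleep_ms", (ms: Long) => { Thread.sleep(ms); ms })

sqlContext.range(1).registerTempTable("t")

// A query that takes at least ~5 seconds, long enough to exercise cancellation paths.
sqlContext.sql("SELECT sleep_ms(5000) FROM t").collect()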

Re: Is there any way to stop a jenkins build

2015-12-29 Thread Herman van Hövell tot Westerflier
Thanks. I'll merge the most recent master... Still curious if we can stop a build. Kind regards, Herman van Hövell tot Westerflier 2015-12-29 18:59 GMT+01:00 Ted Yu : > HiveThriftBinaryServerSuite got stuck. > > I thought Josh has fixed this issue: > > [SPARK-11823][SQL]

Re: Spark streaming 1.6.0-RC4 NullPointerException using mapWithState

2015-12-29 Thread Shixiong Zhu
Could you create a JIRA? We can continue the discussion there. Thanks! Best Regards, Shixiong Zhu 2015-12-29 3:42 GMT-08:00 Jan Uyttenhove : > Hi guys, > > I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new > mapWithState API, after previously reporting issue

Is there any way to stop a jenkins build

2015-12-29 Thread Herman van Hövell tot Westerflier
My AMPLAB jenkins build has been stuck for a few hours now: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48414/consoleFull Is there a way for me to stop the build? Kind regards, Herman van Hövell

Re: Is there any way to stop a jenkins build

2015-12-29 Thread Herman van Hövell tot Westerflier
Hi Josh, Your HiveThriftBinaryServerSuite fix wasn't in the build I was running (I forgot to merge the latest master). So it might actually work. As for stopping the build, it is understandable that you cannot do that without the proper permissions. It would still be cool to be able to issue a

Re: RDD[Vector] Immutability issue

2015-12-29 Thread ai he
Hi salexln, RDD's immutability depends on the underlying structure. I have the following example. -- scala> val m = Array.fill(2, 2)(0) m: Array[Array[Int]] = Array(Array(0, 0),

Re: RDD[Vector] Immutability issue

2015-12-29 Thread ai he
Same thing. Say your underlying structure is like Array(ArrayBuffer(1, 2), ArrayBuffer(3, 4)). Then you can add/remove data in the ArrayBuffers and the change will be reflected in the RDD. On Tue, Dec 29, 2015 at 11:19 AM, salexln wrote: > I see, so in order the RDD to
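A fuller sketch of that behaviour, runnable in spark-shell (local mode; variable names are illustrative):

import scala.collection.mutable.ArrayBuffer

// Build an RDD over mutable ArrayBuffers held on the driver.
val data = Array(ArrayBuffer(1, 2), ArrayBuffer(3, 4))
val rdd = sc.parallelize(data)

println(rdd.collect().map(_.mkString(",")).mkString(" | "))  // 1,2 | 3,4

// The RDD's partitions still point at the same driver-side buffers, so this
// "backdoor" mutation shows up on the next action -- exactly why it is discouraged.
data(0) += 5
println(rdd.collect().map(_.mkString(",")).mkString(" | "))  // 1,2,5 | 3,4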

Re: RDD[Vector] Immutability issue

2015-12-29 Thread Mark Hamstra
You can, but you shouldn't. Using backdoors to mutate the data in an RDD is a good way to produce confusing and inconsistent results when, e.g., an RDD's lineage needs to be recomputed or a Task is resubmitted on fetch failure. On Tue, Dec 29, 2015 at 11:24 AM, ai he wrote:

Re: RDD[Vector] Immutability issue

2015-12-29 Thread Vivekananda Venkateswaran
An RDD is a collection of objects, and if those objects are mutable and get changed, the changes will be reflected in the RDD; for immutable objects they will not. Changing mutable objects that live inside an RDD is not good practice. The RDD is immutable in the sense that any transformation on the RDD will result
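To make that last point concrete, a trivial spark-shell sketch (values invented): transformations leave the original RDD untouched and return a new one.

val original = sc.parallelize(Seq(1, 2, 3))

// map does not modify `original`; it returns a new RDD with its own lineage.
val doubled = original.map(_ * 2)

println(original.collect().mkString(","))  // 1,2,3
println(doubled.collect().mkString(","))   // 2,4,6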

Re: Spark streaming 1.6.0-RC4 NullPointerException using mapWithState

2015-12-29 Thread Shixiong(Ryan) Zhu
Hi Jan, could you post your codes? I could not reproduce this issue in my environment. Best Regards, Shixiong Zhu 2015-12-29 10:22 GMT-08:00 Shixiong Zhu : > Could you create a JIRA? We can continue the discussion there. Thanks! > > Best Regards, > Shixiong Zhu > >

Re: running lda in spark throws exception

2015-12-29 Thread Joseph Bradley
Hi Li, I'm wondering if you're running into the same bug reported here: https://issues.apache.org/jira/browse/SPARK-12488 I haven't figured out yet what is causing it. Do you have a small corpus which reproduces this error, and which you can share on the JIRA? If so, that would help a lot in
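For reference, a corpus along these lines is usually small enough to attach to a JIRA; this is only a spark-shell sketch, with the vocabulary size, counts, and parameters made up for illustration:

import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.linalg.Vectors

// Toy corpus of (documentId, termCountVector) pairs over a 5-term vocabulary.
// A real reproduction would of course use the corpus that actually triggers the error.
val corpus = sc.parallelize(Seq(
  (0L, Vectors.dense(1.0, 0.0, 2.0, 0.0, 1.0)),
  (1L, Vectors.dense(0.0, 3.0, 0.0, 1.0, 0.0)),
  (2L, Vectors.dense(2.0, 0.0, 1.0, 1.0, 0.0))
))

val model = new LDA().setK(2).setMaxIterations(10).run(corpus)
model.describeTopics(3).foreach { case (terms, weights) =>
  println(terms.mkString(",") + " -> " + weights.mkString(","))
}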

IndentationCheck of checkstyle

2015-12-29 Thread Ted Yu
Hi, I noticed that there are a lot of checkstyle warnings in the following form: To my knowledge, we use two spaces for each tab. Not sure why all of a sudden we have so many IndentationCheck warnings: grep 'hild have incorrect indentati' trunkCheckstyle.xml | wc 3133 52645 678294 If

problem with reading source code - pull out nondeterministic expressions

2015-12-29 Thread 汪洋
Hi fellas, I am new to Spark and I have a newbie question. I am currently reading the source code of the Spark SQL Catalyst analyzer. I do not quite understand the partial function in PullOutNondeterministic. What does it mean by "pull out"? Why do we have to do the "pulling out"? I would really

Re: IndentationCheck of checkstyle

2015-12-29 Thread Reynold Xin
OK to close the loop - this thread has nothing to do with Spark? On Tue, Dec 29, 2015 at 9:55 PM, Ted Yu wrote: > Oops, wrong list :-) > > On Dec 29, 2015, at 9:48 PM, Reynold Xin wrote: > > +Herman > > Is this coming from the newly merged Hive

Re: IndentationCheck of checkstyle

2015-12-29 Thread Ted Yu
Oops, wrong list :-) > On Dec 29, 2015, at 9:48 PM, Reynold Xin wrote: > > +Herman > > Is this coming from the newly merged Hive parser? > > > >> On Tue, Dec 29, 2015 at 9:46 PM, Allen Zhang wrote: >> >> >> format issue I think, go ahead >> >>

Re: running lda in spark throws exception

2015-12-29 Thread Li Li
I will use a portion of the data and try. Will the HDFS block affect Spark? (If so, it's hard to reproduce.) On Wed, Dec 30, 2015 at 3:22 AM, Joseph Bradley wrote: > Hi Li, > > I'm wondering if you're running into the same bug reported here: >

Partitioning of RDD across worker machines

2015-12-29 Thread Disha Shrivastava
Hi, Suppose I have a file locally on my master machine and the same file is also present at the same path on all the worker machines, say /home/user_name/Desktop. I wanted to know whether, when we partition the data using sc.parallelize, Spark actually broadcasts parts of the RDD to all the worker
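A small spark-shell sketch of how to see what parallelize does with the data (the file name below is an assumption; only the directory is mentioned above). Note that sc.parallelize distributes the driver-side collection to the executors; it does not read the workers' local copies of the file (sc.textFile on a shared path or HDFS is the usual way to read data that lives on every node).

// Read the file on the driver; "/home/user_name/Desktop/data.txt" is an assumed name.
val lines = scala.io.Source.fromFile("/home/user_name/Desktop/data.txt").getLines().toList

// Split the driver-side collection into 4 partitions; the slices are shipped to
// executors as tasks run.
val rdd = sc.parallelize(lines, numSlices = 4)

// Inspect how the elements were grouped into partitions.
rdd.glom().collect().zipWithIndex.foreach { case (part, i) =>
  println(s"partition $i has ${part.length} elements")
}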

Spark streaming 1.6.0-RC4 NullPointerException using mapWithState

2015-12-29 Thread Jan Uyttenhove
Hi guys, I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new mapWithState API, after previously reporting issue SPARK-11932 (https://issues.apache.org/jira/browse/SPARK-11932). My Spark streaming job involves reading data from a Kafka topic (using
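For context, a minimal mapWithState sketch against the 1.6 API (a queue-backed stream stands in for the Kafka source, and all names, durations, and paths are invented, since the actual job is not shown in this digest):

import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

// Sketch only: keep a running count per key with the new mapWithState API.
val conf = new SparkConf().setAppName("mapWithState-sketch").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(1))
ssc.checkpoint("/tmp/mapWithState-sketch")  // mapWithState requires checkpointing

// A queue-backed stream standing in for the Kafka input described above.
val queue = mutable.Queue(ssc.sparkContext.parallelize(Seq("a" -> 1, "b" -> 1, "a" -> 1)))
val pairs = ssc.queueStream(queue)

// (key, new value, state) => emitted record; the state carries the running count.
val trackingFunc = (key: String, value: Option[Int], state: State[Int]) => {
  val newCount = state.getOption.getOrElse(0) + value.getOrElse(0)
  state.update(newCount)
  (key, newCount)
}

val counts = pairs.mapWithState(StateSpec.function(trackingFunc))
counts.print()

ssc.start()
ssc.awaitTerminationOrTimeout(5000)
ssc.stop()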