[RESULT] [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-12 Thread Patrick Wendell
This vote passes with 14 +1 (7 binding) votes and no 0 or -1 votes. +1 (14): Patrick Wendell, Reynold Xin, Sean Owen, Burak Yavuz, Mark Hamstra, Michael Armbrust, Andrew Or, Brennon York, Krishna Sankar, Luciano Resende, Holden Karau, Tom Graves, Denny Lee, Sean McNamara - Patrick On Wed, Jul 8, 2015 at 10:

Re: ./dev/run-tests fail on master

2015-07-12 Thread Xiaoyu Ma
Hi Ted, The Maven build/test part seems to work fine for me. Thanks! I forgot to provide more info: I’m using Python 2.7.6, MacOS 10.10.3, JDK 1.7.0_79, Maven 3.3.1 马晓宇 / Xiaoyu Ma hzmaxia...@corp.netease.com > On Jul 13, 2015, at 11:34 AM, Ted Yu wrote: > > When I ran dev/run-tests , I got : > >

Re: jenkins downtime 7/13/15, 7am PDT

2015-07-12 Thread shane knapp
reminder: this is happening tomorrow morning! On Thu, Jul 9, 2015 at 1:07 PM, shane knapp wrote: > i'll be taking jenkins down for system and jenkins app updates. this > should be pretty quick and i'm expecting to have everything back up > and building by 9am. > > i will send a reminder email t

Re: ./dev/run-tests fail on master

2015-07-12 Thread Ted Yu
When I ran dev/run-tests , I got : File "./dev/run-tests.py", line 68, in __main__.identify_changed_files_from_git_commits Failed example: 'root' in [x.name for x in determine_modules_for_files( identify_changed_files_from_git_commits("50a0496a43", target_ref="6765ef9"))] Exception raised:

Re: Foundation policy on releases and Spark nightly builds

2015-07-12 Thread Patrick Wendell
Hey Sean B, Would you mind outlining for me how we go about changing this policy - I think it's outdated and doesn't make much sense. Ideally I'd like to propose a vote to modify the text slightly such that our current behavior is seen as compliant. Specifically: - What concrete steps can I take

./dev/run-tests fail on master

2015-07-12 Thread Xiaoyu Ma
Hi guys, I was trying to rerun the tests using run-tests on master, but I got the errors below. I was able to build using Maven, though. Any advice? [error]^ [error] /Users/ilovesoup1/workspace/eclipseWS/spark/network/common/src/main/java/org/apache/spark/network/server/TransportReq

Re: pyspark.sql.tests: is test_time_with_timezone a flaky test?

2015-07-12 Thread Davies Liu
Will be fixed by https://github.com/apache/spark/pull/7363 On Sun, Jul 12, 2015 at 7:45 PM, Davies Liu wrote: > Thanks for reporting this, I'm working on it. It turned out that it's > a bug when run with Python 3.4; I will send out a fix soon. > > On Sun, Jul 12, 2015 at 1:33 PM, Cheolsoo Park

Re: Foundation policy on releases and Spark nightly builds

2015-07-12 Thread Sean Busbey
Please note that when the policy refers to "developers" it means the developers of the project at hand, that is participants on the dev@spark mailing list. As I stated in my original email, you're welcome to continue the discussion on the policy including the definition of developers on general@in

Re: pyspark.sql.tests: is test_time_with_timezone a flaky test?

2015-07-12 Thread Davies Liu
Thanks for reporting this, I'm working on it. It turned out that it's a bug when run with Python 3.4; I will send out a fix soon. On Sun, Jul 12, 2015 at 1:33 PM, Cheolsoo Park wrote: > Hi devs, > > For some reason, I keep getting this test failure (3 out of 4 builds) in my > PR- > > =

Re: Are These Issues Suitable for our Senior Project?

2015-07-12 Thread emrehan
Thanks for all the kind responses! I'd love to experience the contribution flow with a small task first, but I couldn't find any unassigned interesting tickets for 1.5. I'm hoping to get assigned to a small ticket by the end of the summer though. If nobody would suggest anything else it seems lik

Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-12 Thread Patrick Wendell
I think we can close this vote soon. Any additional votes/testing would be much appreciated! On Fri, Jul 10, 2015 at 11:30 AM, Sean McNamara wrote: > +1 > > Sean > >> On Jul 8, 2015, at 11:55 PM, Patrick Wendell wrote: >> >> Please vote on releasing the following candidate as Apache Spark version

Re: Foundation policy on releases and Spark nightly builds

2015-07-12 Thread Patrick Wendell
Thanks Sean O. I was thinking something like "NOTE: Nightly builds are meant for development and testing purposes. They do not go through Apache's release auditing process and are not official releases." - Patrick On Sun, Jul 12, 2015 at 3:39 PM, Sean Owen wrote: > (This sounds pretty good to me

pyspark.sql.tests: is test_time_with_timezone a flaky test?

2015-07-12 Thread Cheolsoo Park
Hi devs, For some reason, I keep getting this test failure (3 out of 4 builds) in my PR - == FAIL: test_time_with_timezone (__main__.SQLTests) ---

Re: Foundation policy on releases and Spark nightly builds

2015-07-12 Thread Sean Owen
(This sounds pretty good to me. Mark it developers-only, not formally tested by the community, etc.) On Sun, Jul 12, 2015 at 7:50 PM, Patrick Wendell wrote: > Hey Sean B., > > Thanks for bringing this to our attention. I think putting them on the > developer wiki would substantially decrease visi

Re: [PySpark DataFrame] When a Row is not a Row

2015-07-12 Thread Davies Liu
We finally fixed this in 1.5 (next release), see https://github.com/apache/spark/pull/7301 On Sat, Jul 11, 2015 at 10:32 PM, Jerry Lam wrote: > Hi guys, > > I just hit the same problem. It is very confusing when Row is not the same > Row type at runtime. The worst thing is that when I use Spark in

Re: Foundation policy on releases and Spark nightly builds

2015-07-12 Thread Patrick Wendell
Hey Sean B., Thanks for bringing this to our attention. I think putting them on the developer wiki would substantially decrease visibility in a way that is not beneficial to the project - this feature was specifically requested by developers from other projects that integrate with Spark. If the c

Spark development under Windows

2015-07-12 Thread Olivier Delalleau
Hi, New to Spark here, trying to look into issue https://issues.apache.org/jira/browse/SPARK-8976 . Ideally I'd develop under Windows rather than in a Linux VM, since this is the environment I want to test Spark in. My first step before changing any code would be to run the test suite (dev/run

Re: Spark master broken?

2015-07-12 Thread Josh Rosen
I think it is just broken for 2.11 since pull requests are building properly. Sent from my phone > On Jul 12, 2015, at 8:22 AM, René Treffer wrote: > > Java 8, make-distribution > > Jenkins does show the same error, though: > https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-Snaps

Re: Spark master broken?

2015-07-12 Thread René Treffer
Java 8, make-distribution Jenkins does show the same error, though: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-Snapshots/325/console On Sun, Jul 12, 2015 at 4:32 PM, Ted Yu wrote: > Jenkins shows green builds. > > What Java version did you use ? > > Cheers > > On Sun, Jul 12,

Re: Spark master broken?

2015-07-12 Thread Ted Yu
Jenkins shows green builds. What Java version did you use ? Cheers On Sun, Jul 12, 2015 at 3:49 AM, René Treffer wrote: > Hi *, > > I'm currently trying to build master but it fails with > > [error] Picked up JAVA_TOOL_OPTIONS: >> -javaagent:/usr/share/java/jayatanaag.jar >> [error] >> /home/

Spark master broken?

2015-07-12 Thread René Treffer
Hi *, I'm currently trying to build master but it fails with [error] Picked up JAVA_TOOL_OPTIONS: > -javaagent:/usr/share/java/jayatanaag.jar > [error] > /home/rtreffer/work/spark-master/sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java:135: > > error: org.a

question related partitions of the DataFrame

2015-07-12 Thread Gil Vernik
Hi, DataFrame extends RDDApi, which provides RDD-like methods. My question is: is a DataFrame a sort of standalone RDD with its own partitions, or does it depend on the underlying RDD that was used to load the data into its partitions? It's written that DataFrame has the ability to scale from kilobyt

Re: Should spark-ec2 get its own repo?

2015-07-12 Thread Sean Owen
I agree with these points. The ec2 support is substantially a separate project, and would likely be better managed as one. People can much more rapidly iterate on it and release it. I suggest: 1. Pick a new repo location. amplab/spark-ec2 ? spark-ec2/spark-ec2 ? 2. Add interested parties as owner