Re: File JIRAs for all flaky test failures

2017-03-28 Thread Saikat Kanjilal
I'm happy to help out in this effort and will look at that label and see what tests I can look into and/or fix. From: Kay Ousterhout Sent: Monday, March 27, 2017 9:47 PM To: Reynold Xin Cc: Saikat Kanjilal; Sean Owen;

Re: Output Committers for S3

2017-03-28 Thread Ryan Blue
Steve is right that the S3 committer isn't a ParquetOutputCommitter. I think that the reason that check exists is to make sure Parquet writes _metadata summary files to an output directory. But, I think the summary files are a bad idea, so we bypass that logic and use the committer directly if

Re: Outstanding Spark 2.1.1 issues

2017-03-28 Thread Michael Armbrust
We just fixed the build yesterday. I'll kick off a new RC today. On Tue, Mar 28, 2017 at 8:04 AM, Asher Krim wrote: > Hey Michael, > any update on this? We're itching for a 2.1.1 release (specifically > SPARK-14804 which is currently blocking us) > > Thanks, > Asher Krim >

Re: Outstanding Spark 2.1.1 issues

2017-03-28 Thread Asher Krim
Hey Michael, any update on this? We're itching for a 2.1.1 release (specifically SPARK-14804 which is currently blocking us) Thanks, Asher Krim Senior Software Engineer On Wed, Mar 22, 2017 at 7:44 PM, Michael Armbrust wrote: > An update: I cut the tag for RC1 last

Re: Output Committers for S3

2017-03-28 Thread Steve Loughran
> On 28 Mar 2017, at 05:20, sririshindra wrote: > > Hi > > I have a job which saves a dataframe as a parquet file to s3. > > I built a jar using your repository https://github.com/rdblue/s3committer. > > I added the following config to the Spark Session >

Re: Fwd: [SparkSQL] Project using NamedExpression

2017-03-28 Thread Liang-Chi Hsieh
I am not sure why you want to transform rows in the dataframe using mapPartitions like that. If you want to project the rows with some expressions, you can use an API like selectExpr and let Spark SQL resolve the expressions. To resolve expressions manually, you need to (at least) deal with a