Re: Bringing up JDBC Tests to trunk

2015-11-22 Thread Luciano Resende
Hey Josh,

Thanks for helping to bring this up to trunk. I have just pushed a WIP PR
that moves the DB2 tests to run on Docker, and I have a question about how
the JDBC drivers are actually set up for the other data sources (MySQL and
PostgreSQL): are they installed directly on the Jenkins slaves? I didn't
see the jars or anything specific in the pom or other files...
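
To make the question concrete, here is a rough sketch of the shape of such
a Dockerized JDBC test; the image tag, connection details, and names below
are purely illustrative assumptions on my part, not code from the PR:

    import java.sql.DriverManager

    object Db2DockerTestSketch {
      // Pin an exact Docker image tag (instead of "latest") so every
      // test run uses the same image. Tag below is an assumed example.
      val imageName = "ibmcom/db2express-c:10.5.0.5-3.10.0"

      def main(args: Array[String]): Unit = {
        // Throws ClassNotFoundException if the driver jar is not on the
        // classpath, which is exactly the setup question above.
        Class.forName("com.ibm.db2.jcc.DB2Driver")
        val conn = DriverManager.getConnection(
          "jdbc:db2://localhost:50000/testdb", "db2inst1", "password")
        try {
          val rs = conn.createStatement()
            .executeQuery("SELECT 1 FROM SYSIBM.SYSDUMMY1")
          assert(rs.next())
        } finally {
          conn.close()
        }
      }
    }

The Class.forName line is where a missing driver jar would surface, which
is why I am asking where the MySQL and PostgreSQL jars come from.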


Thanks

On Wed, Oct 21, 2015 at 1:26 PM, Josh Rosen  wrote:

> Hey Luciano,
>
> This sounds like a reasonable plan to me. One of my colleagues has written
> some Dockerized MySQL testing utilities, so I'll take a peek at those to
> see if there are any specifics of their solution that we should adapt for
> Spark.
>
> On Wed, Oct 21, 2015 at 1:16 PM, Luciano Resende 
> wrote:
>
>> I have started looking into PR-8101 [1] and what is required to merge it
>> into trunk, which will also unblock me on SPARK-10521 [2].
>>
>> So here is the minimal plan I was thinking about:
>>
>> - pin the docker image version so we are sure to use the same image
>> every time
>> - pre-pull the required images on the Jenkins executors so tests are not
>> delayed or timed out waiting for docker images to download
>> - create a profile to run the JDBC tests
>> - create daily jobs for running the JDBC tests
>>
>>
>> In parallel, I learned that Alan Chin from my team is working with the
>> AmpLab team to expand the build capacity for Spark, so I will use some of
>> the nodes he is preparing to test/run these builds for now.
>>
>> Please let me know if there is anything else needed around this.
>>
>>
>> [1] https://github.com/apache/spark/pull/8101
>> [2] https://issues.apache.org/jira/browse/SPARK-10521
>>
>> --
>> Luciano Resende
>> http://people.apache.org/~lresende
>> http://twitter.com/lresende1975
>> http://lresende.blogspot.com/
>>
>
>


-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


[ANNOUNCE] Spark 1.6.0 Release Preview

2015-11-22 Thread Michael Armbrust
In order to facilitate community testing of Spark 1.6.0, I'm excited to
announce the availability of an early preview of the release. This is not a
release candidate, so there is no voting involved. However, it'd be awesome
if community members could start testing with this preview package and
report any problems they encounter.

This preview package contains all the commits to branch-1.6 up to commit
308381420f51b6da1007ea09a02d740613a226e0.

The staging maven repository for this preview build can be found here:
https://repository.apache.org/content/repositories/orgapachespark-1162

Binaries for this preview build can be found here:
http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-bin/

A build of the docs can also be found here:
http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-docs/

The full change log for this release can be found on JIRA.

*== How can you help? ==*

If you are a Spark user, you can help us test this release by taking a
Spark workload, running it on this preview release, and reporting any
regressions.

*== Major Features ==*

When testing, we'd appreciate it if users could focus on areas that have
changed in this release.  Some notable new features include:

SPARK-11787 *Parquet Performance* - Improve Parquet scan performance when
using flat schemas.
SPARK-10810 *Session Management* - Multiple users of the Thrift (JDBC/ODBC)
server now have isolated sessions, including their own default database
(e.g. USE mydb), even on shared clusters.
SPARK- *Dataset API* - A new, experimental type-safe API (similar to RDDs)
that performs many operations on serialized binary data and uses code
generation (i.e. Project Tungsten). A sketch follows this list.
SPARK-1 *Unified Memory Management* - Shared memory for execution and
caching instead of an exclusive division of the regions.
SPARK-10978 *Datasource API Avoid Double Filter* - When implementing a
datasource with filter pushdown, developers can now tell Spark SQL to avoid
double-evaluating a pushed-down filter. A sketch follows this list.
SPARK-2629 *New Improved State Management* - trackStateByKey, a DStream
transformation for stateful stream processing that supersedes
updateStateByKey in functionality and performance. A sketch follows this
list.
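
To make the Dataset API concrete, here is a minimal sketch of the kind of
typed program it enables (the case class and values are illustrative; the
API is experimental and details may change):

    case class Person(name: String, age: Long)

    // Assumes an existing SQLContext named sqlContext, as in the shell.
    import sqlContext.implicits._

    // A typed Dataset: the lambdas below operate on Person objects
    // rather than untyped Rows, and are checked at compile time.
    val ds = Seq(Person("Ann", 30), Person("Bob", 25)).toDS()
    val names = ds.filter(_.age > 26).map(_.name)
    names.show()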
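
For the datasource change, a relation can now report which pushed-down
filters it could not fully handle; anything omitted from that list is
trusted as applied and is not re-evaluated. A minimal sketch, using a
hypothetical relation that fully handles equality pushdown:

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.sources.{BaseRelation, EqualTo, Filter}
    import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

    // Hypothetical single-column relation, for illustration only.
    class MyRelation(val sqlContext: SQLContext) extends BaseRelation {
      override def schema: StructType =
        StructType(Seq(StructField("id", IntegerType)))

      // Return only the filters this source could NOT handle; Spark SQL
      // will skip re-evaluating the ones omitted here.
      override def unhandledFilters(filters: Array[Filter]): Array[Filter] =
        filters.filterNot(_.isInstanceOf[EqualTo])
    }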
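
And for trackStateByKey, a rough sketch of a running per-key count (the
wordCounts stream is assumed to already exist; this API is still
experimental, so the exact signatures may change):

    import org.apache.spark.streaming.{State, StateSpec}

    // Tracking function: fold each new value into the per-key state and
    // emit the updated (key, count) pair downstream.
    val trackingFunc = (key: String, value: Option[Int], state: State[Int]) => {
      val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
      state.update(sum)
      (key, sum)
    }

    // wordCounts: DStream[(String, Int)]
    val counts = wordCounts.trackStateByKey(StateSpec.function(trackingFunc))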

Happy testing!

Michael