Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Takeshi Yamamuro
+1 (non-binding)

I ran tests on an EC2 m4.2xlarge instance:
[ec2-user]$ java -version
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-b10)
OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)


On Thu, Jul 19, 2018 at 5:29 AM Ryan Blue  wrote:

> +1 (non-binding)
>
> On Wed, Jul 18, 2018 at 10:38 AM Denny Lee  wrote:
>
>> +1 (non-binding)
>> On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
>>> wrote:
>>>
 I will put my +1 on this RC.

 For the test failure fix, I will include it if there's another RC.

 Sean Owen  wrote on Mon, Jul 16, 2018 at 10:47 PM:

>>> OK, hm, will try to get to the bottom of it. But if others can build
> this module successfully, I give a +1 . The test failure is inevitable 
> here
> and should not block release.
>
> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
> wrote:
>
 Hi Sean,
>>
>> I just did a clean build with mvn/sbt on 2.3.2; I didn't hit the
>> errors you pasted here. I'm not sure how they happen.
>>
>> Sean Owen  wrote on Mon, Jul 16, 2018 at 6:30 AM:
>>
> Looks good to me, with the following caveats.
>>>
>>> First see the discussion on
>>> https://issues.apache.org/jira/browse/SPARK-24813 ; the
>>> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
>>> right now. That's not a regression and is a test-only issue, so don't 
>>> think
>>> it must block the release. However if this fix holds up, and we need
>>> another RC, worth pulling in for sure.
>>>
>>> Also is anyone seeing this while building and testing the Spark
>>> SQL + Kafka module? I see this error even after a clean rebuild. I sort 
>>> of
>>> get what the error is saying but can't figure out why it would only 
>>> happen
>>> at test/runtime. Haven't seen it before.
>>>
>>> [error] missing or invalid dependency detected while loading class
>>> file 'MetricsSystem.class'.
>>>
>>> [error] Could not access term eclipse in package org,
>>>
>>> [error] because it (or its dependencies) are missing. Check your
>>> build definition for
>>>
>>> [error] missing or conflicting dependencies. (Re-run with
>>> `-Ylog-classpath` to see the problematic classpath.)
>>>
>>> [error] A full rebuild may help if 'MetricsSystem.class' was
>>> compiled against an incompatible version of org.
>>>
>>> [error] missing or invalid dependency detected while loading class
>>> file 'MetricsSystem.class'.
>>>
>>> [error] Could not access term jetty in value org.eclipse,
>>>
>>> [error] because it (or its dependencies) are missing. Check your
>>> build definition for
>>>
>>> [error] missing or conflicting dependencies. (Re-run with
>>> `-Ylog-classpath` to see the problematic classpath.)
>>>
>>> [error] A full rebuild may help if 'MetricsSystem.class' was
>>> compiled against an incompatible version of org.eclipse
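
Both failures above point at org.eclipse.jetty classes that MetricsSystem references but that are missing from the module's test classpath. As a diagnostic sketch only (not a confirmed fix — the jetty coordinates and version below are assumptions and should be aligned with the jetty version pinned in Spark's parent pom.xml), one can surface the classpath the compiler saw and pin the dependency explicitly in the affected module's sbt settings:

```scala
// build.sbt fragment -- illustrative only.
// Print the classpath the compiler actually resolved, as the error suggests:
scalacOptions += "-Ylog-classpath"

// Pin eclipse jetty explicitly; the version here is an assumption --
// match it to the jetty version property in Spark's parent pom.xml.
libraryDependencies ++= Seq(
  "org.eclipse.jetty" % "jetty-util"   % "9.3.24.v20180605" % "provided",
  "org.eclipse.jetty" % "jetty-server" % "9.3.24.v20180605" % "provided"
)
```

If the extra dependency makes the error go away, that would confirm a test-classpath gap rather than a source-level problem.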
>>>
>>> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
>>> wrote:
>>>
>> Please vote on releasing the following candidate as Apache Spark
 version 2.3.2.

 The vote is open until July 20 PST and passes if a majority +1 PMC
 votes are cast, with a minimum of 3 +1 votes.

 [ ] +1 Release this package as Apache Spark 2.3.2
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 The tag to be voted on is v2.3.2-rc3
 (commit b3726dadcf2997f20231873ec6e057dba433ae64):
 https://github.com/apache/spark/tree/v2.3.2-rc3

 The release files, including signatures, digests, etc. can be found
 at:
 https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/

 Signatures used for Spark RCs can be found in this file:
 https://dist.apache.org/repos/dist/dev/spark/KEYS
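
For reference, verifying a downloaded artifact against those signatures usually looks like the sketch below; the tarball name is an assumption following the usual RC layout, so substitute a file actually listed under v2.3.2-rc3-bin/:

```shell
# File names below are assumptions -- use the ones actually published
# under v2.3.2-rc3-bin/.
curl -O https://dist.apache.org/repos/dist/dev/spark/KEYS
gpg --import KEYS

BASE=https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin
curl -O "$BASE/spark-2.3.2-bin-hadoop2.7.tgz"
curl -O "$BASE/spark-2.3.2-bin-hadoop2.7.tgz.asc"
gpg --verify spark-2.3.2-bin-hadoop2.7.tgz.asc spark-2.3.2-bin-hadoop2.7.tgz
```

A "Good signature" line from gpg, from a key listed in KEYS, is the expected result.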

 The staging repository for this release can be found at:

 https://repository.apache.org/content/repositories/orgapachespark-1278/
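
To build a JVM project against this staging repository, a minimal resolver sketch (shown for sbt; the spark-sql artifact is chosen for illustration):

```scala
// build.sbt fragment -- illustrative only.
resolvers += "spark-2.3.2-rc3-staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1278/"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.2"
```

Remember to drop the resolver and clear the local artifact cache afterwards so later builds don't keep resolving the RC.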

 The documentation corresponding to this release can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/

 The list of bug fixes going into 2.3.2 can be found at the
 following URL:
 https://issues.apache.org/jira/projects/SPARK/versions/12343289

 Note. RC2 was cancelled because of one blocking issue SPARK-24781
 during release preparation.

 FAQ

 =
 How can I help test this release?
 =

 If you are a Spark user, you can help us test this release by taking
 an existing Spark workload and running on this 

Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Felix Cheung
+1



From: Bruce Robbins 
Sent: Wednesday, July 18, 2018 3:02 PM
To: Ryan Blue
Cc: Spark Dev List
Subject: Re: [VOTE] SPIP: Standardize SQL logical plans

+1 (non-binding)

On Tue, Jul 17, 2018 at 10:59 AM, Ryan Blue  wrote:
Hi everyone,

From discussion on the proposal doc and the discussion thread, I think we
have consensus around the plan to standardize logical write operations for
DataSourceV2. I would like to call a vote on the proposal.

The proposal doc is here: SPIP: Standardize SQL logical plans.

This vote is for the plan in that doc. The related SPIP with APIs to 
create/alter/drop tables will be a separate vote.

Please vote in the next 72 hours:

[+1]: Spark should adopt the SPIP
[-1]: Spark should not adopt the SPIP because . . .

Thanks for voting, everyone!

--
Ryan Blue



Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Bruce Robbins
+1 (non-binding)

On Tue, Jul 17, 2018 at 10:59 AM, Ryan Blue  wrote:

> Hi everyone,
>
> From discussion on the proposal doc and the discussion thread, I think we
> have consensus around the plan to standardize logical write operations for
> DataSourceV2. I would like to call a vote on the proposal.
>
> The proposal doc is here: SPIP: Standardize SQL logical plans.
>
> This vote is for the plan in that doc. The related SPIP with APIs to
> create/alter/drop tables will be a separate vote.
>
> Please vote in the next 72 hours:
>
> [+1]: Spark should adopt the SPIP
> [-1]: Spark should not adopt the SPIP because . . .
>
> Thanks for voting, everyone!
>
> --
> Ryan Blue
>


Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Dongjoon Hyun
+1 (non-binding).

Bests,
Dongjoon.

On Wed, Jul 18, 2018 at 11:32 AM Henry Robinson 
wrote:

> +1 (non-binding)
> On Wed, Jul 18, 2018 at 9:12 AM Reynold Xin  wrote:
>
>> +1 on this, on the condition that we can come up with a design that will
>> remove the existing plans.
>>
>>
>> On Tue, Jul 17, 2018 at 11:00 AM Ryan Blue  wrote:
>>
>>> Hi everyone,
>>>
>>> From discussion on the proposal doc and the discussion thread, I think
>>> we have consensus around the plan to standardize logical write operations
>>> for DataSourceV2. I would like to call a vote on the proposal.
>>>
>>> The proposal doc is here: SPIP: Standardize SQL logical plans.
>>>
>>> This vote is for the plan in that doc. The related SPIP with APIs to
>>> create/alter/drop tables will be a separate vote.
>>>
>>> Please vote in the next 72 hours:
>>>
>>> [+1]: Spark should adopt the SPIP
>>> [-1]: Spark should not adopt the SPIP because . . .
>>>
>>> Thanks for voting, everyone!
>>>
>>> --
>>> Ryan Blue
>>>
>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Ryan Blue
+1 (non-binding)

On Wed, Jul 18, 2018 at 10:38 AM Denny Lee  wrote:

> +1 (non-binding)
> On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:
>
>> +1 (non-binding)
>>
>> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
>> wrote:
>>
>>> I will put my +1 on this RC.
>>>
>>> For the test failure fix, I will include it if there's another RC.
>>>
>>> Sean Owen  wrote on Mon, Jul 16, 2018 at 10:47 PM:
>>>
>> OK, hm, will try to get to the bottom of it. But if others can build this
 module successfully, I give a +1 . The test failure is inevitable here and
 should not block release.

 On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
 wrote:

>>> Hi Sean,
>
> I just did a clean build with mvn/sbt on 2.3.2; I didn't hit the
> errors you pasted here. I'm not sure how they happen.
>
> Sean Owen  wrote on Mon, Jul 16, 2018 at 6:30 AM:
>
 Looks good to me, with the following caveats.
>>
>> First see the discussion on
>> https://issues.apache.org/jira/browse/SPARK-24813 ; the
>> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
>> right now. That's not a regression and is a test-only issue, so don't 
>> think
>> it must block the release. However if this fix holds up, and we need
>> another RC, worth pulling in for sure.
>>
>> Also is anyone seeing this while building and testing the Spark SQL +
>> Kafka module? I see this error even after a clean rebuild. I sort of get
>> what the error is saying but can't figure out why it would only happen at
>> test/runtime. Haven't seen it before.
>>
>> [error] missing or invalid dependency detected while loading class
>> file 'MetricsSystem.class'.
>>
>> [error] Could not access term eclipse in package org,
>>
>> [error] because it (or its dependencies) are missing. Check your
>> build definition for
>>
>> [error] missing or conflicting dependencies. (Re-run with
>> `-Ylog-classpath` to see the problematic classpath.)
>>
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> against an incompatible version of org.
>>
>> [error] missing or invalid dependency detected while loading class
>> file 'MetricsSystem.class'.
>>
>> [error] Could not access term jetty in value org.eclipse,
>>
>> [error] because it (or its dependencies) are missing. Check your
>> build definition for
>>
>> [error] missing or conflicting dependencies. (Re-run with
>> `-Ylog-classpath` to see the problematic classpath.)
>>
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> against an incompatible version of org.eclipse
>>
>> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
>> wrote:
>>
> Please vote on releasing the following candidate as Apache Spark
>>> version 2.3.2.
>>>
>>> The vote is open until July 20 PST and passes if a majority +1 PMC
>>> votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.3.2
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>>
>>> The tag to be voted on is v2.3.2-rc3
>>> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
>>> https://github.com/apache/spark/tree/v2.3.2-rc3
>>>
>>> The release files, including signatures, digests, etc. can be found
>>> at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>>
>>> https://repository.apache.org/content/repositories/orgapachespark-1278/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>>>
>>> The list of bug fixes going into 2.3.2 can be found at the following
>>> URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>>>
>>> Note. RC2 was cancelled because of one blocking issue SPARK-24781
>>> during release preparation.
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate,
>>> then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark you can set up a virtual env and install
>>> the current RC and see if anything important breaks; in Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up 

Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Henry Robinson
+1 (non-binding)
On Wed, Jul 18, 2018 at 9:12 AM Reynold Xin  wrote:

> +1 on this, on the condition that we can come up with a design that will
> remove the existing plans.
>
>
> On Tue, Jul 17, 2018 at 11:00 AM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> From discussion on the proposal doc and the discussion thread, I think we
>> have consensus around the plan to standardize logical write operations for
>> DataSourceV2. I would like to call a vote on the proposal.
>>
>> The proposal doc is here: SPIP: Standardize SQL logical plans.
>>
>> This vote is for the plan in that doc. The related SPIP with APIs to
>> create/alter/drop tables will be a separate vote.
>>
>> Please vote in the next 72 hours:
>>
>> [+1]: Spark should adopt the SPIP
>> [-1]: Spark should not adopt the SPIP because . . .
>>
>> Thanks for voting, everyone!
>>
>> --
>> Ryan Blue
>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Denny Lee
+1 (non-binding)
On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:

> +1 (non-binding)
>
> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
> wrote:
>
>> I will put my +1 on this RC.
>>
>> For the test failure fix, I will include it if there's another RC.
>>
>> Sean Owen  wrote on Mon, Jul 16, 2018 at 10:47 PM:
>>
> OK, hm, will try to get to the bottom of it. But if others can build this
>>> module successfully, I give a +1 . The test failure is inevitable here and
>>> should not block release.
>>>
>>> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
>>> wrote:
>>>
>> Hi Sean,

 I just did a clean build with mvn/sbt on 2.3.2; I didn't hit the
 errors you pasted here. I'm not sure how they happen.

 Sean Owen  wrote on Mon, Jul 16, 2018 at 6:30 AM:

>>> Looks good to me, with the following caveats.
>
> First see the discussion on
> https://issues.apache.org/jira/browse/SPARK-24813 ; the
> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
> right now. That's not a regression and is a test-only issue, so don't 
> think
> it must block the release. However if this fix holds up, and we need
> another RC, worth pulling in for sure.
>
> Also is anyone seeing this while building and testing the Spark SQL +
> Kafka module? I see this error even after a clean rebuild. I sort of get
> what the error is saying but can't figure out why it would only happen at
> test/runtime. Haven't seen it before.
>
> [error] missing or invalid dependency detected while loading class
> file 'MetricsSystem.class'.
>
> [error] Could not access term eclipse in package org,
>
> [error] because it (or its dependencies) are missing. Check your build
> definition for
>
> [error] missing or conflicting dependencies. (Re-run with
> `-Ylog-classpath` to see the problematic classpath.)
>
> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> against an incompatible version of org.
>
> [error] missing or invalid dependency detected while loading class
> file 'MetricsSystem.class'.
>
> [error] Could not access term jetty in value org.eclipse,
>
> [error] because it (or its dependencies) are missing. Check your build
> definition for
>
> [error] missing or conflicting dependencies. (Re-run with
> `-Ylog-classpath` to see the problematic classpath.)
>
> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> against an incompatible version of org.eclipse
>
> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
> wrote:
>
 Please vote on releasing the following candidate as Apache Spark
>> version 2.3.2.
>>
>> The vote is open until July 20 PST and passes if a majority +1 PMC
>> votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 2.3.2
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.3.2-rc3
>> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
>> https://github.com/apache/spark/tree/v2.3.2-rc3
>>
>> The release files, including signatures, digests, etc. can be found
>> at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>>
>> https://repository.apache.org/content/repositories/orgapachespark-1278/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>>
>> The list of bug fixes going into 2.3.2 can be found at the following
>> URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>>
>> Note. RC2 was cancelled because of one blocking issue SPARK-24781
>> during release preparation.
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks; in Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
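
A minimal sketch of the PySpark route; the pyspark sdist name is an assumption based on the usual layout under v2.3.2-rc3-bin/, so substitute the file actually published there:

```shell
# Fresh virtual env so the RC doesn't touch an existing installation.
python3 -m venv spark-232-rc3 && . spark-232-rc3/bin/activate
pip install \
  "https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/pyspark-2.3.2.tar.gz"
python -c "import pyspark; print(pyspark.__version__)"  # expect 2.3.2
```

Deleting the venv afterwards also covers the "clean up before/after" step for the Python side.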
>>
>> ===
>> What should happen to JIRA tickets still targeting 2.3.2?
>> ===
>>

Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Reynold Xin
+1 on this, on the condition that we can come up with a design that will
remove the existing plans.


On Tue, Jul 17, 2018 at 11:00 AM Ryan Blue  wrote:

> Hi everyone,
>
> From discussion on the proposal doc and the discussion thread, I think we
> have consensus around the plan to standardize logical write operations for
> DataSourceV2. I would like to call a vote on the proposal.
>
> The proposal doc is here: SPIP: Standardize SQL logical plans.
>
> This vote is for the plan in that doc. The related SPIP with APIs to
> create/alter/drop tables will be a separate vote.
>
> Please vote in the next 72 hours:
>
> [+1]: Spark should adopt the SPIP
> [-1]: Spark should not adopt the SPIP because . . .
>
> Thanks for voting, everyone!
>
> --
> Ryan Blue
>


Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Alessandro Solimando
+1 (non-binding)

On 18 July 2018 at 17:32, Xiao Li  wrote:

> +1 (binding)
>
> As Ryan and I discussed offline, the contents of the implementation
> sketch are not part of this vote.
>
> Cheers,
>
> Xiao
>
> 2018-07-18 8:00 GMT-07:00 Russell Spitzer :
>
>> +1 (non-binding)
>>
>> On Wed, Jul 18, 2018 at 1:32 AM Marco Gaido 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>>
>>> On Wed, 18 Jul 2018, 07:43 Takeshi Yamamuro, 
>>> wrote:
>>>
 +1 (non-binding)

 On Wed, Jul 18, 2018 at 2:41 PM John Zhuge  wrote:

> +1 (non-binding)
>
> On Tue, Jul 17, 2018 at 8:06 PM Wenchen Fan 
> wrote:
>
>> +1 (binding). I think this is clearer to both users and developers,
>> compared to the existing one, which only supports append/overwrite and
>> doesn't work well with tables in data sources (like JDBC tables).
>>
>> On Wed, Jul 18, 2018 at 2:06 AM Ryan Blue  wrote:
>>
>>> +1 (not binding)
>>>
>>> On Tue, Jul 17, 2018 at 10:59 AM Ryan Blue  wrote:
>>>
 Hi everyone,

 From discussion on the proposal doc and the discussion thread, I
 think we have consensus around the plan to standardize logical write
 operations for DataSourceV2. I would like to call a vote on the 
 proposal.

 The proposal doc is here: SPIP: Standardize SQL logical plans.

 This vote is for the plan in that doc. The related SPIP with APIs
 to create/alter/drop tables will be a separate vote.

 Please vote in the next 72 hours:

 [+1]: Spark should adopt the SPIP
 [-1]: Spark should not adopt the SPIP because . . .

 Thanks for voting, everyone!

 --
 Ryan Blue

>>>
>>>
>>> --
>>> Ryan Blue
>>>
>>> --
>>> John Zhuge
>>>
>>

 --
 ---
 Takeshi Yamamuro

>>>
>


Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Xiao Li
+1 (binding)

As Ryan and I discussed offline, the contents of the implementation
sketch are not part of this vote.

Cheers,

Xiao

2018-07-18 8:00 GMT-07:00 Russell Spitzer :

> +1 (non-binding)
>
> On Wed, Jul 18, 2018 at 1:32 AM Marco Gaido 
> wrote:
>
>> +1 (non-binding)
>>
>>
>> On Wed, 18 Jul 2018, 07:43 Takeshi Yamamuro, 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Wed, Jul 18, 2018 at 2:41 PM John Zhuge  wrote:
>>>
 +1 (non-binding)

 On Tue, Jul 17, 2018 at 8:06 PM Wenchen Fan 
 wrote:

> +1 (binding). I think this is clearer to both users and developers,
> compared to the existing one, which only supports append/overwrite and
> doesn't work well with tables in data sources (like JDBC tables).
>
> On Wed, Jul 18, 2018 at 2:06 AM Ryan Blue  wrote:
>
>> +1 (not binding)
>>
>> On Tue, Jul 17, 2018 at 10:59 AM Ryan Blue  wrote:
>>
>>> Hi everyone,
>>>
>>> From discussion on the proposal doc and the discussion thread, I
>>> think we have consensus around the plan to standardize logical write
>>> operations for DataSourceV2. I would like to call a vote on the 
>>> proposal.
>>>
>>> The proposal doc is here: SPIP: Standardize SQL logical plans.
>>>
>>> This vote is for the plan in that doc. The related SPIP with APIs to
>>> create/alter/drop tables will be a separate vote.
>>>
>>> Please vote in the next 72 hours:
>>>
>>> [+1]: Spark should adopt the SPIP
>>> [-1]: Spark should not adopt the SPIP because . . .
>>>
>>> Thanks for voting, everyone!
>>>
>>> --
>>> Ryan Blue
>>>
>>
>>
>> --
>> Ryan Blue
>>
>> --
>> John Zhuge
>>
>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>


Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Russell Spitzer
+1 (non-binding)

On Wed, Jul 18, 2018 at 1:32 AM Marco Gaido  wrote:

> +1 (non-binding)
>
>
> On Wed, 18 Jul 2018, 07:43 Takeshi Yamamuro, 
> wrote:
>
>> +1 (non-binding)
>>
>> On Wed, Jul 18, 2018 at 2:41 PM John Zhuge  wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Tue, Jul 17, 2018 at 8:06 PM Wenchen Fan  wrote:
>>>
 +1 (binding). I think this is clearer to both users and developers,
 compared to the existing one, which only supports append/overwrite and
 doesn't work well with tables in data sources (like JDBC tables).

 On Wed, Jul 18, 2018 at 2:06 AM Ryan Blue  wrote:

> +1 (not binding)
>
> On Tue, Jul 17, 2018 at 10:59 AM Ryan Blue  wrote:
>
>> Hi everyone,
>>
>> From discussion on the proposal doc and the discussion thread, I
>> think we have consensus around the plan to standardize logical write
>> operations for DataSourceV2. I would like to call a vote on the proposal.
>>
>> The proposal doc is here: SPIP: Standardize SQL logical plans.
>>
>> This vote is for the plan in that doc. The related SPIP with APIs to
>> create/alter/drop tables will be a separate vote.
>>
>> Please vote in the next 72 hours:
>>
>> [+1]: Spark should adopt the SPIP
>> [-1]: Spark should not adopt the SPIP because . . .
>>
>> Thanks for voting, everyone!
>>
>> --
>> Ryan Blue
>>
>
>
> --
> Ryan Blue
>
> --
> John Zhuge
>

>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>


Re: Pyspark access to scala/java libraries

2018-07-18 Thread HARSH TAKKAR
Hi

You can access your Java packages from PySpark using the following:

obj = sc._jvm.yourPackage.className()


Kind Regards
Harsh Takkar
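
A slightly fuller sketch of that pattern, assuming a pyspark installation; it uses a JDK class so it runs without a custom jar, and the custom-class line is hypothetical (it requires your jar on the driver classpath, e.g. via --jars):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
sc = spark.sparkContext

# sc._jvm is the Py4J gateway into the driver JVM; any class on the
# driver classpath can be instantiated through it.
counter = sc._jvm.java.util.concurrent.atomic.AtomicLong(41)
counter.incrementAndGet()
print(counter.get())  # 42

# Hypothetical custom class -- needs your jar on the driver classpath:
# obj = sc._jvm.yourPackage.className()
```

One caveat: these proxies exist only in the driver process, so they cannot be captured inside lambdas that are shipped to executors — which is exactly the problem discussed further down this thread.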

On Wed, Jul 18, 2018 at 4:00 AM Mohit Jaggi  wrote:

> Thanks 0xF0F0F0 and Ashutosh for the pointers.
>
> Holden,
> I am trying to look into sparklingml...what am I looking for? Also which
> chapter/page of your book should I look at?
>
> Mohit.
>
> On Sun, Jul 15, 2018 at 3:02 AM Holden Karau 
> wrote:
>
>> If you want to see some examples in a library shows a way to do it -
>> https://github.com/sparklingpandas/sparklingml and high performance
>> spark also talks about it.
>>
>> On Sun, Jul 15, 2018, 11:57 AM <0xf0f...@protonmail.com.invalid> wrote:
>>
>>> Check
>>> https://stackoverflow.com/questions/31684842/calling-java-scala-function-from-a-task
>>>
>>> Sent with ProtonMail Secure Email.
>>>
>>> ‐‐‐ Original Message ‐‐‐
>>>
>>> On July 15, 2018 8:01 AM, Mohit Jaggi  wrote:
>>>
>>> > Trying again…anyone know how to make this work?
>>> >
>>> > > On Jul 9, 2018, at 3:45 PM, Mohit Jaggi mohitja...@gmail.com wrote:
>>> > >
>>> > > Folks,
>>> > >
>>> > > I am writing some Scala/Java code and want it to be usable from
>>> pyspark.
>>> > >
>>> > > For example:
>>> > >
>>> > > class MyStuff(addend: Int) {
>>> > >
>>> > > def myMapFunction(x: Int) = x + addend
>>> > >
>>> > > }
>>> > >
>>> > > I want to call it from pyspark as:
>>> > >
>>> > > df = ...
>>> > >
>>> > > mystuff = sc._jvm.MyStuff(5)
>>> > >
>>> > > df['x'].map(lambda x: mystuff.myMapFunction(x))
>>> > >
>>> > > How can I do this?
>>> > >
>>> > > Mohit.
>>> >
>>> > --
>>> >
>>> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>


Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Marco Gaido
+1 (non-binding)

On Wed, 18 Jul 2018, 07:43 Takeshi Yamamuro,  wrote:

> +1 (non-binding)
>
> On Wed, Jul 18, 2018 at 2:41 PM John Zhuge  wrote:
>
>> +1 (non-binding)
>>
>> On Tue, Jul 17, 2018 at 8:06 PM Wenchen Fan  wrote:
>>
>>> +1 (binding). I think this is more clear to both users and developers,
>>> compared to the existing one which only supports append/overwrite and
>>> doesn't work with tables in data source(like JDBC table) well.
>>>
>>> On Wed, Jul 18, 2018 at 2:06 AM Ryan Blue  wrote:
>>>
 +1 (not binding)

 On Tue, Jul 17, 2018 at 10:59 AM Ryan Blue  wrote:

> Hi everyone,
>
> From discussion on the proposal doc and the discussion thread, I think
> we have consensus around the plan to standardize logical write operations
> for DataSourceV2. I would like to call a vote on the proposal.
>
> The proposal doc is here: SPIP: Standardize SQL logical plans.
>
> This vote is for the plan in that doc. The related SPIP with APIs to
> create/alter/drop tables will be a separate vote.
>
> Please vote in the next 72 hours:
>
> [+1]: Spark should adopt the SPIP
> [-1]: Spark should not adopt the SPIP because . . .
>
> Thanks for voting, everyone!
>
> --
> Ryan Blue
>


 --
 Ryan Blue

 --
 John Zhuge

>>>
>
> --
> ---
> Takeshi Yamamuro
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread John Zhuge
+1 (non-binding)

On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao  wrote:

> I will put my +1 on this RC.
>
> For the test failure fix, I will include it if there's another RC.
>
> Sean Owen  wrote on Mon, Jul 16, 2018 at 10:47 PM:
>
>> OK, hm, will try to get to the bottom of it. But if others can build this
>> module successfully, I give a +1 . The test failure is inevitable here and
>> should not block release.
>>
>> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
>> wrote:
>>
>>> Hi Sean,
>>>
>>> I just did a clean build with mvn/sbt on 2.3.2; I didn't hit the errors
>>> you pasted here. I'm not sure how they happen.
>>>
>>> Sean Owen  wrote on Mon, Jul 16, 2018 at 6:30 AM:
>>>
 Looks good to me, with the following caveats.

 First see the discussion on
 https://issues.apache.org/jira/browse/SPARK-24813 ; the
 flaky HiveExternalCatalogVersionsSuite will probably fail all the time
 right now. That's not a regression and is a test-only issue, so don't think
 it must block the release. However if this fix holds up, and we need
 another RC, worth pulling in for sure.

 Also is anyone seeing this while building and testing the Spark SQL +
 Kafka module? I see this error even after a clean rebuild. I sort of get
 what the error is saying but can't figure out why it would only happen at
 test/runtime. Haven't seen it before.

 [error] missing or invalid dependency detected while loading class file
 'MetricsSystem.class'.

 [error] Could not access term eclipse in package org,

 [error] because it (or its dependencies) are missing. Check your build
 definition for

 [error] missing or conflicting dependencies. (Re-run with
 `-Ylog-classpath` to see the problematic classpath.)

 [error] A full rebuild may help if 'MetricsSystem.class' was compiled
 against an incompatible version of org.

 [error] missing or invalid dependency detected while loading class file
 'MetricsSystem.class'.

 [error] Could not access term jetty in value org.eclipse,

 [error] because it (or its dependencies) are missing. Check your build
 definition for

 [error] missing or conflicting dependencies. (Re-run with
 `-Ylog-classpath` to see the problematic classpath.)

 [error] A full rebuild may help if 'MetricsSystem.class' was compiled
 against an incompatible version of org.eclipse

 On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
 wrote:

> Please vote on releasing the following candidate as Apache Spark
> version 2.3.2.
>
> The vote is open until July 20 PST and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.3.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.2-rc3
> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
> https://github.com/apache/spark/tree/v2.3.2-rc3
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1278/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>
> The list of bug fixes going into 2.3.2 can be found at the following
> URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>
> Note. RC2 was cancelled because of one blocking issue SPARK-24781
> during release preparation.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks; in Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.2?
> ===
>
> The current list of open tickets targeted at 2.3.2 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 2.3.2
>
> Committers should