Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Dongjoon Hyun
+1

Bests,
Dongjoon

On Wed, May 26, 2021 at 7:55 PM Kent Yao  wrote:

> +1, non-binding
>
> *Kent Yao*
> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
> *a spark enthusiast*
> *kyuubi: a unified multi-tenant JDBC interface for large-scale data
> processing and analytics, built on top of Apache Spark.*
> *spark-authorizer: a Spark SQL extension which provides SQL Standard
> Authorization for Apache Spark.*
> *spark-postgres: a library for reading data from and transferring data to
> Postgres / Greenplum with Spark SQL and DataFrames, 10~100x faster.*
> *itatchi: a library that brings useful functions from various modern
> database management systems to Apache Spark.*
>
>
>
> On 05/27/2021 10:44, Yuming Wang wrote:
>
> +1 (non-binding)
>
> On Wed, May 26, 2021 at 11:27 PM Maxim Gekk 
> wrote:
>
>> +1 (non-binding)
>>
>> On Mon, May 24, 2021 at 9:14 AM Dongjoon Hyun 
>> wrote:
>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 3.1.2.
>>>
>>> The vote is open until May 27th 1AM (PST) and passes if a majority +1
>>> PMC votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 3.1.2
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see https://spark.apache.org/
>>>
>>> The tag to be voted on is v3.1.2-rc1 (commit
>>> de351e30a90dd988b133b3d00fa6218bfcaba8b8):
>>> https://github.com/apache/spark/tree/v3.1.2-rc1
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1384/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-docs/
>>>
>>> The list of bug fixes going into 3.1.2 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12349602
>>>
>>> This release is using the release script of the tag v3.1.2-rc1.
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running it on this release candidate,
>>> then reporting any regressions.
>>>
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks. In Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
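The Java/Scala path above can be sketched as a two-line build.sbt fragment. The staging URL is the orgapachespark-1384 one listed earlier in this email; the resolver name is arbitrary, and this is an illustrative sketch rather than official instructions:

```scala
// build.sbt sketch: resolve the 3.1.2 RC artifacts from the staging
// repository above, then build/test your project against them.
resolvers += "spark-rc-staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1384/"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.1.2"
```

Remember to remove the resolver and clear the local artifact caches (e.g. the org.apache.spark entries under ~/.ivy2 and ~/.m2) once the vote concludes, per the caveat above.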
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 3.1.2?
>>> ===
>>>
>>> The current list of open tickets targeted at 3.1.2 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 3.1.2
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>>
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something which is a regression
>>> that has not been correctly targeted please ping me or a committer to
>>> help target the issue.
>>>
>>


Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Kent Yao
+1, non-binding

Kent Yao
@ Data Science Center, Hangzhou Research Institute, NetEase Corp.
a spark enthusiast
kyuubi: a unified multi-tenant JDBC interface for large-scale data
processing and analytics, built on top of Apache Spark.
spark-authorizer: a Spark SQL extension which provides SQL Standard
Authorization for Apache Spark.
spark-postgres: a library for reading data from and transferring data to
Postgres / Greenplum with Spark SQL and DataFrames, 10~100x faster.
itatchi: a library that brings useful functions from various modern
database management systems to Apache Spark.

On 05/27/2021 10:44, Yuming Wang wrote:








-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Yuming Wang
+1 (non-binding)

On Wed, May 26, 2021 at 11:27 PM Maxim Gekk 
wrote:

> +1 (non-binding)
>
> On Mon, May 24, 2021 at 9:14 AM Dongjoon Hyun 
> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 3.1.2.
>>
>> The vote is open until May 27th 1AM (PST) and passes if a majority +1 PMC
>> votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.1.2
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see https://spark.apache.org/
>>
>> The tag to be voted on is v3.1.2-rc1 (commit
>> de351e30a90dd988b133b3d00fa6218bfcaba8b8):
>> https://github.com/apache/spark/tree/v3.1.2-rc1
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1384/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-docs/
>>
>> The list of bug fixes going into 3.1.2 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12349602
>>
>> This release is using the release script of the tag v3.1.2-rc1.
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running it on this release candidate,
>> then reporting any regressions.
>>
>> If you're working in PySpark, you can set up a virtual env, install
>> the current RC, and see if anything important breaks. In Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.1.2?
>> ===
>>
>> The current list of open tickets targeted at 3.1.2 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.1.2
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>


Re: [VOTE] SPIP: Catalog API for view metadata

2021-05-26 Thread John Zhuge
Looks like we are going in circles. Should we have an online meeting to
get this sorted out?

Thanks,
John

On Wed, May 26, 2021 at 12:01 AM Wenchen Fan  wrote:

> OK, then I'd vote for TableViewCatalog, because
> 1. This is how Hive catalog works, and we need to migrate Hive catalog to
> the v2 API sooner or later.
> 2. Because of 1, TableViewCatalog is easy to support in the current
> table/view resolution framework.
> 3. It's better to avoid name conflicts between tables and views at the API
> level, instead of relying on the catalog implementation.
> 4. Cache invalidation is always a tricky problem.
>
> On Tue, May 25, 2021 at 3:09 AM Ryan Blue 
> wrote:
>
>> I don't think that it makes sense to discuss a different approach in the
>> PR rather than in the vote. Let's discuss this now since that's the purpose
>> of an SPIP.
>>
>> On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:
>>
>>> Hi everyone, I’d like to start a vote for the ViewCatalog design
>>> proposal (SPIP).
>>>
>>> The proposal is to add a ViewCatalog interface that can be used to load,
>>> create, alter, and drop views in DataSourceV2.
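As a rough illustration of the proposed surface, a hedged sketch follows. The type and method names below are stand-ins invented for this sketch; the SPIP's actual DataSourceV2 signatures (identifiers, view metadata, change objects) are in the linked doc and may differ:

```scala
// Hedged sketch of a view catalog: load, create, alter, and drop views.
// ViewDefinition and the String identifiers are simplifications; the real
// proposal works in terms of DSv2 identifiers and richer view metadata.
final case class ViewDefinition(sql: String, comment: Option[String] = None)

trait ViewCatalog {
  def loadView(ident: String): Option[ViewDefinition]
  def createView(ident: String, view: ViewDefinition): Unit
  def alterView(ident: String, change: ViewDefinition => ViewDefinition): Unit
  def dropView(ident: String): Boolean
}

// Minimal in-memory implementation, only to show the contract in action.
class InMemoryViewCatalog extends ViewCatalog {
  private val views = scala.collection.mutable.Map.empty[String, ViewDefinition]
  def loadView(ident: String): Option[ViewDefinition] = views.get(ident)
  def createView(ident: String, view: ViewDefinition): Unit = views(ident) = view
  def alterView(ident: String, change: ViewDefinition => ViewDefinition): Unit =
    views.get(ident).foreach(v => views(ident) = change(v))
  def dropView(ident: String): Boolean = views.remove(ident).isDefined
}
```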
>>>
>>> The full SPIP doc is here:
>>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>>
>>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>>> update the PR for review.
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
John Zhuge


Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Maxim Gekk
+1 (non-binding)

On Mon, May 24, 2021 at 9:14 AM Dongjoon Hyun 
wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 3.1.2.
>
> The vote is open until May 27th 1AM (PST) and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.1.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v3.1.2-rc1 (commit
> de351e30a90dd988b133b3d00fa6218bfcaba8b8):
> https://github.com/apache/spark/tree/v3.1.2-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1384/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-docs/
>
> The list of bug fixes going into 3.1.2 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12349602
>
> This release is using the release script of the tag v3.1.2-rc1.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running it on this release candidate,
> then reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.1.2?
> ===
>
> The current list of open tickets targeted at 3.1.2 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.1.2
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>


Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Thomas Graves
+1


Tom Graves

On Mon, May 24, 2021 at 1:14 AM Dongjoon Hyun  wrote:
>
> Please vote on releasing the following candidate as Apache Spark version 
> 3.1.2.
>
> The vote is open until May 27th 1AM (PST) and passes if a majority +1 PMC 
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.1.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v3.1.2-rc1 (commit 
> de351e30a90dd988b133b3d00fa6218bfcaba8b8):
> https://github.com/apache/spark/tree/v3.1.2-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1384/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.1.2-rc1-docs/
>
> The list of bug fixes going into 3.1.2 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12349602
>
> This release is using the release script of the tag v3.1.2-rc1.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running it on this release candidate,
> then reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.1.2?
> ===
>
> The current list of open tickets targeted at 3.1.2 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target 
> Version/s" = 3.1.2
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.




Re: Should AggregationIterator.initializeBuffer be moved down to SortBasedAggregationIterator?

2021-05-26 Thread Cheng Su
Yes I think it should be okay to move.

A quick check of the history: the method was introduced in
https://github.com/apache/spark/pull/7813, where it was used by both
SortBasedAggregationIterator and UnsafeHybridAggregationIterator. But
UnsafeHybridAggregationIterator has since been refactored away and no longer exists.
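The shape of the move can be sketched like this. The class and method names follow the thread, but the bodies are toy stand-ins, not Spark's actual internals (where initializeBuffer resets an aggregation buffer row between groups):

```scala
// Sketch: initializeBuffer lives only on the single subclass that calls
// it, instead of on the shared AggregationIterator base class.
abstract class AggregationIterator {
  // shared machinery (processRow, generateOutput, ...) stays up here
}

class SortBasedAggregationIterator extends AggregationIterator {
  // Moved down: with UnsafeHybridAggregationIterator gone, this is the
  // only iterator that resets its buffer between groups.
  def initializeBuffer(buffer: Array[Any]): Unit = {
    var i = 0
    while (i < buffer.length) { buffer(i) = null; i += 1 }
  }
}
```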

Cheng Su

From: Jacek Laskowski 
Date: Tuesday, May 25, 2021 at 6:35 AM
To: dev 
Subject: Should AggregationIterator.initializeBuffer be moved down to 
SortBasedAggregationIterator?

Hi,

Just found out that the only purpose of AggregationIterator.initializeBuffer is 
to keep SortBasedAggregationIterator happy [1].

Shouldn't it be moved down to SortBasedAggregationIterator to make things 
clear(er)?

[1] https://github.com/apache/spark/search?q=initializeBuffer

Pozdrawiam,
Jacek Laskowski

https://about.me/JacekLaskowski
"The Internals Of" Online Books
Follow me on 
https://twitter.com/jaceklaskowski




Re: [VOTE] SPIP: Catalog API for view metadata

2021-05-26 Thread Wenchen Fan
OK, then I'd vote for TableViewCatalog, because
1. This is how Hive catalog works, and we need to migrate Hive catalog to
the v2 API sooner or later.
2. Because of 1, TableViewCatalog is easy to support in the current
table/view resolution framework.
3. It's better to avoid name conflicts between tables and views at the API
level, instead of relying on the catalog implementation.
4. Cache invalidation is always a tricky problem.
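Point 3 can be illustrated with a toy single-catalog shape, where tables and views share one namespace and one lookup, so a conflict cannot be created in the first place. The types here are invented for the sketch and are not the v2 API:

```scala
// Toy TableViewCatalog: one namespace for both kinds of relation.
sealed trait Relation { def name: String }
final case class Table(name: String) extends Relation
final case class View(name: String, sql: String) extends Relation

class InMemoryTableViewCatalog {
  private val relations = scala.collection.mutable.Map.empty[String, Relation]

  // A single create path enforces "no table/view name conflicts" at the
  // API level rather than in each catalog implementation.
  def create(r: Relation): Unit = {
    require(!relations.contains(r.name), s"name conflict: ${r.name}")
    relations(r.name) = r
  }

  // One lookup: a name resolves to at most one relation, table or view.
  def load(name: String): Option[Relation] = relations.get(name)
}
```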

On Tue, May 25, 2021 at 3:09 AM Ryan Blue  wrote:

> I don't think that it makes sense to discuss a different approach in the
> PR rather than in the vote. Let's discuss this now since that's the purpose
> of an SPIP.
>
> On Mon, May 24, 2021 at 11:22 AM John Zhuge  wrote:
>
>> Hi everyone, I’d like to start a vote for the ViewCatalog design proposal
>> (SPIP).
>>
>> The proposal is to add a ViewCatalog interface that can be used to load,
>> create, alter, and drop views in DataSourceV2.
>>
>> The full SPIP doc is here:
>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>>
>> Please vote on the SPIP in the next 72 hours. Once it is approved, I’ll
>> update the PR for review.
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>