Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"
SGTM. Could you or another PMC member start one? Thanks, Micah

On Saturday, July 13, 2019, Wes McKinney wrote: > Micah -- I would suggest that -- absent more opinions -- we vote about > adopting the versioning scheme you described here (Format Version and > Library Version) > [earlier quoted messages trimmed; the full thread appears below]
[jira] [Created] (ARROW-5946) [Rust] [DataFusion] Projection push down with aggregate producing incorrect results
Andy Grove created ARROW-5946:

Summary: [Rust] [DataFusion] Projection push down with aggregate producing incorrect results
Key: ARROW-5946
URL: https://issues.apache.org/jira/browse/ARROW-5946
Project: Apache Arrow
Issue Type: Bug
Components: Rust, Rust - DataFusion
Affects Versions: 0.14.0
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 1.0.0

I was testing some queries with the 0.14 release and noticed that the projected schema for a table scan is completely wrong (however, the results of the query are not necessarily wrong):

{code:java}
// schema for NYC taxi csv files
let schema = Schema::new(vec![
    Field::new("VendorID", DataType::Utf8, true),
    Field::new("tpep_pickup_datetime", DataType::Utf8, true),
    Field::new("tpep_dropoff_datetime", DataType::Utf8, true),
    Field::new("passenger_count", DataType::Utf8, true),
    Field::new("trip_distance", DataType::Float64, true),
    Field::new("RatecodeID", DataType::Utf8, true),
    Field::new("store_and_fwd_flag", DataType::Utf8, true),
    Field::new("PULocationID", DataType::Utf8, true),
    Field::new("DOLocationID", DataType::Utf8, true),
    Field::new("payment_type", DataType::Utf8, true),
    Field::new("fare_amount", DataType::Float64, true),
    Field::new("extra", DataType::Float64, true),
    Field::new("mta_tax", DataType::Float64, true),
    Field::new("tip_amount", DataType::Float64, true),
    Field::new("tolls_amount", DataType::Float64, true),
    Field::new("improvement_surcharge", DataType::Float64, true),
    Field::new("total_amount", DataType::Float64, true),
]);

let mut ctx = ExecutionContext::new();
ctx.register_csv("tripdata", "file.csv", &schema, true);

let optimized_plan = ctx.create_logical_plan(
    "SELECT passenger_count, MIN(fare_amount), MAX(fare_amount) \
     FROM tripdata GROUP BY passenger_count").unwrap();
{code}

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
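The core of the report is the projected schema computed during projection push-down. As a minimal, self-contained illustration (the `Schema`/`Field` stand-ins below are simplified types of mine, not the arrow crate's), projecting a schema by column indices should simply select those fields, in projection order:

```rust
// Minimal stand-in types; DataFusion's real Schema/Field live in the arrow crate.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

impl Schema {
    // The projected schema must contain exactly the selected fields, in
    // projection order. The bug report above says the schema attached to
    // the optimized table scan diverged from the selected columns.
    fn project(&self, indices: &[usize]) -> Schema {
        Schema {
            fields: indices.iter().map(|&i| self.fields[i].clone()).collect(),
        }
    }
}

fn main() {
    let schema = Schema {
        fields: vec![
            Field { name: "passenger_count".into(), data_type: "Utf8".into() },
            Field { name: "trip_distance".into(), data_type: "Float64".into() },
            Field { name: "fare_amount".into(), data_type: "Float64".into() },
        ],
    };
    // SELECT passenger_count, fare_amount -> project columns 0 and 2.
    let projected = schema.project(&[0, 2]);
    assert_eq!(projected.fields.len(), 2);
    assert_eq!(projected.fields[1].name, "fare_amount");
    println!("{:?}", projected);
}
```

This is only a sketch of the invariant the optimizer should preserve, not DataFusion's actual push-down code.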
Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"
Micah -- I would suggest that -- absent more opinions -- we vote about adopting the versioning scheme you described here (Format Version and Library Version) On Wed, Jul 10, 2019 at 8:46 AM Wes McKinney wrote: > > On Wed, Jul 10, 2019 at 12:43 AM Micah Kornfield > wrote: > > > > Hi Eric, > > Short answer: I think your understanding matches what I was proposing. > > Longer answer below. > > > >> So, for example, we release library v1.0.0 in a few months and then > >> library v2.0.0 a few months after that. In v2.0.0, C++, Python, and Java > >> didn't make any breaking API changes from 1.0.0. But C# made 3 API > >> breaking changes. This would be acceptable? > > > > Yes. I think all language bindings are undergoing rapid enough iteration > > that we are making at least a few small breaking API changes on each > > release even though we try to avoid it. I think it will be worth having > > further discussions on the release process once at least a few languages > > get to a more stable point. > > > > I agree with this. I think we are a pretty long way away from making > API stability _guarantees_ in any of the implementations, though we > certainly should try to be courteous about the changes we do make, to > allow for graceful transitions over a period of 1-2 releases if > possible. > > > Thanks, > > Micah > > > > On Tue, Jul 9, 2019 at 2:26 PM Eric Erhardt > > wrote: > >> > >> Just to be sure I fully understand the proposal: > >> > >> For the Library Version, we are going to increment the MAJOR version on > >> every normal release, and increment the MINOR version if we need to > >> release a patch/bug fix type of release. > >> > >> Since SemVer allows for API breaking changes on MAJOR versions, this > >> basically means, each library (C++, Python, C#, Java, etc) _can_ introduce > >> API breaking changes on every normal release (like we have been with the > >> 0.x.0 releases). 
> >> > >> So, for example, we release library v1.0.0 in a few months and then > >> library v2.0.0 a few months after that. In v2.0.0, C++, Python, and Java > >> didn't make any breaking API changes from 1.0.0. But C# made 3 API > >> breaking changes. This would be acceptable? > >> > >> If my understanding above is correct, then I think this is a good plan. > >> Initially I was concerned that the C# library wouldn't be free to make API > >> breaking changes after making the version `1.0.0`. The C# library is still > >> pretty inadequate, and I have a feeling there are a few things that will > >> need to change about it in the future. But with the above plan, this > >> concern won't be a problem. > >> > >> Eric > >> > >> -----Original Message----- > >> From: Micah Kornfield > >> Sent: Monday, July 1, 2019 10:02 PM > >> To: Wes McKinney > >> Cc: dev@arrow.apache.org > >> Subject: Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0" > >> > >> Hi Wes, > >> Thanks for your response. In regards to the protocol negotiation, your > >> description of feature reporting (snipped below) is along the lines of > >> what I was thinking. It might not be necessary for 1.0.0, but at some > >> point might become useful. > >> > >> > >> > Note that we don't really have a mechanism for clients and servers to > >> > report to each other what features they support, so this could help > >> > with that for applications where it might matter. > >> > >> > >> Thanks, > >> Micah > >> > >> > >> On Mon, Jul 1, 2019 at 12:54 PM Wes McKinney wrote: > >> > >> > hi Micah, > >> > > >> > Sorry for the delay in feedback. I looked at the document and it seems > >> > like a reasonable perspective about forward- and > >> > backward-compatibility. > >> > > >> > It seems like the main thing you are proposing is to apply Semantic > >> > Versioning to Format and Library versions separately. 
That's an > >> > interesting idea, my thought had been to have a version number that is > >> > FORMAT_VERSION.LIBRARY_VERSION.PATCH_VERSION. But your proposal is > >> > more flexible in some ways, so let me clarify for others reading > >> > > >> > In what you are proposing, the next release would be: > >> > > >> > Format version: 1.0.0 > >> > Library version: 1.0.0 > >> > > >> > Suppose that 20 major versions down the road we stand at > >> > > >> > Format version: 1.5.0 > >> > Library version: 20.0.0 > >> > > >> > The minor version of the Format would indicate that there are > >> > additions, like new elements in the Type union, but otherwise backward > >> > and forward compatible. So the Minor version means "new things, but > >> > old clients will not be disrupted if those new things are not used". > >> > We've already been doing this since the V4 Format iteration but we > >> > have not had a way to signal that there may be new features. As a > >> > corollary to this, I wonder if we should create a dual version in the > >> > metadata > >> > > >> > PROTOCOL VERSION: (what is currently MetadataVersion, V2) FEATURE > >> > VERSION: not
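The compatibility rule being discussed can be stated mechanically. Below is a hedged Rust sketch (the `FormatVersion` type and `can_read` function are mine, not part of any Arrow library): under the proposal, a minor format bump only adds optional features ("new things, but old clients will not be disrupted if those new things are not used"), so a reader only needs to match the format's major version, no matter how far the library version has advanced.

```rust
/// (major, minor) pair for the Arrow *format* version, which under the
/// proposal is versioned independently of the library version.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FormatVersion {
    major: u32,
    minor: u32,
}

/// Sketch of the rule from the thread: minor format bumps are backward and
/// forward compatible (new optional features only), so a reader can consume
/// any file whose format major version matches its own -- assuming the file
/// does not actually use features the reader lacks.
fn can_read(reader: FormatVersion, file: FormatVersion) -> bool {
    reader.major == file.major
}

fn main() {
    let reader = FormatVersion { major: 1, minor: 0 };
    // Library version 20.0.0 might still emit format version 1.5.0 ...
    assert!(can_read(reader, FormatVersion { major: 1, minor: 5 }));
    // ... but a format 2.x file would not be readable by a 1.x reader.
    assert!(!can_read(reader, FormatVersion { major: 2, minor: 0 }));
    println!("compatibility checks passed");
}
```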
Re: [DISCUSS] Need for 0.14.1 release due to Python package problems, Parquet forward compatibility problems
On Sat, Jul 13, 2019 at 12:57 PM Wes McKinney wrote: > > OK, that's been merged and updated. Here's a Crossbow build > https://github.com/ursa-labs/crossbow/branches/all?utf8=%E2%9C%93=build-665 I'll keep an eye on CI. If there's anything else I can do to help get an RC out, please let me know.
Re: [DISCUSS] Need for 0.14.1 release due to Python package problems, Parquet forward compatibility problems
Sorry, spoke too soon: https://github.com/apache/arrow/pull/4856 is the last patch to go in; I'm reviewing that now.

On Sat, Jul 13, 2019 at 12:06 PM Wes McKinney wrote: > > Thanks Kou. > > I've updated the patch release script [1], pushed the maint-0.14.x > branch [2], and just submitted a Crossbow packaging run [3] > > If all looks good, I think this branch can be used to create an RC > > [1]: https://gist.github.com/wesm/1e4ac14baaa8b27bf13b071d2d715014 > [2]: https://github.com/apache/arrow/tree/maint-0.14.x > [3]: https://github.com/ursa-labs/crossbow/branches/all?utf8=%E2%9C%93=build-664 > [earlier quoted messages trimmed; the full thread appears below]
Re: [DISCUSS] Need for 0.14.1 release due to Python package problems, Parquet forward compatibility problems
Thanks Kou. I've updated the patch release script [1], pushed the maint-0.14.x branch [2], and just submitted a Crossbow packaging run [3] If all looks good, I think this branch can be used to create an RC [1]: https://gist.github.com/wesm/1e4ac14baaa8b27bf13b071d2d715014 [2]: https://github.com/apache/arrow/tree/maint-0.14.x [3]: https://github.com/ursa-labs/crossbow/branches/all?utf8=%E2%9C%93=build-664 On Fri, Jul 12, 2019 at 5:22 PM Sutou Kouhei wrote: > > Hi, > > I've created pull requests that were used to release 0.14.0: > > ARROW-5937: [Release] Stop parallel binary upload > https://github.com/apache/arrow/pull/4868 > > ARROW-5938: [Release] Create branch for adding release note automatically > https://github.com/apache/arrow/pull/4869 > > ARROW-5939: [Release] Add support for generating vote email template > separately > https://github.com/apache/arrow/pull/4870 > > ARROW-5940: [Release] Add support for re-uploading sign/checksum for binary > artifacts > https://github.com/apache/arrow/pull/4871 > > ARROW-5941: [Release] Avoid re-uploading already uploaded binary artifacts > https://github.com/apache/arrow/pull/4872 > (This will conflict with https://github.com/apache/arrow/pull/4868.) > > > They will be useful to release 0.14.1. > > > Thanks, > -- > kou > > In > "Re: [DISCUSS] Need for 0.14.1 release due to Python package problems, > Parquet forward compatibility problems" on Fri, 12 Jul 2019 13:27:41 -0500, > Wes McKinney wrote: > > > I updated https://gist.github.com/wesm/1e4ac14baaa8b27bf13b071d2d715014 > > to include all the cited patches, as well as the Parquet forward > > compatibility fix. > > > > I'm waiting on CI to be able to pass ARROW-5921 (fuzzing-discovered > > IPC crash) and the ARROW-5889 (Parquet backwards compatibility with > > 0.13) needs to be rebased > > > > https://github.com/apache/arrow/pull/4856 > > > > I think those are the last 2 patches that should go into the branch > > unless something else comes up. 
Once those land I'll update the > > commands and then push up the patch release branch (hopefully > > everything will cherry pick cleanly) > > > > On Fri, Jul 12, 2019 at 12:34 PM Francois Saint-Jacques > > wrote: > >> > >> There's also ARROW-5921 (I tagged it 0.14.1) if it passes travis. This > >> one fixes a segfault found via fuzzing. > >> > >> François > >> > >> On Fri, Jul 12, 2019 at 6:54 AM Krisztián Szűcs > >> wrote: > >> > > >> > PRs touching the wheel packaging scripts: > >> > - https://github.com/apache/arrow/pull/4828 (lz4) > >> > - https://github.com/apache/arrow/pull/4833 (uriparser - only if > >> > https://github.com/apache/arrow/commit/88fcb096c4f24861bc7f8181cba1ad8be0e4048a > >> > is cherry picked as well) > >> > - https://github.com/apache/arrow/pull/4834 (zlib) > >> > > >> > On Fri, Jul 12, 2019 at 11:49 AM Hatem Helal > >> > wrote: > >> > > >> > > Thanks François, I closed PARQUET-1623 this morning. It would be nice > >> > > to > >> > > include the PR in the patch release: > >> > > > >> > > https://github.com/apache/arrow/pull/4857 > >> > > > >> > > This bug has been around for a few releases but I think it should be a > >> > > low > >> > > risk change to include. > >> > > > >> > > Hatem > >> > > > >> > > > >> > > On 7/12/19, 2:27 AM, "Francois Saint-Jacques" > >> > > > >> > > wrote: > >> > > > >> > > I just merged PARQUET-1623, I think it's worth including since it > >> > > fixes an invalid memory write. Note that I couldn't resolve/close > >> > > the > >> > > parquet issue, do I have to be a contributor to the project? 
> >> > > > >> > > François > >> > > > >> > > On Thu, Jul 11, 2019 at 6:10 PM Wes McKinney > >> > > wrote: > >> > > > > >> > > > I just merged Eric's 2nd patch ARROW-5908 and I went through all > >> > > the > >> > > > patches since the release commit and have come up with the > >> > > following > >> > > > list of 32 fix-only patches to pick into a maintenance branch: > >> > > > > >> > > > https://gist.github.com/wesm/1e4ac14baaa8b27bf13b071d2d715014 > >> > > > > >> > > > Note there's still unresolved Parquet forward/backward > >> > > compatibility > >> > > > issues in C++ that we haven't merged patches for yet, so that is > >> > > > pending. > >> > > > > >> > > > Are there any other patches / JIRA issues people would like to > >> > > see > >> > > > resolved in a patch release? > >> > > > > >> > > > Thanks > >> > > > > >> > > > On Thu, Jul 11, 2019 at 3:03 PM Wes McKinney > >> > > > >> > > wrote: > >> > > > > > >> > > > > Eric -- you are free to set the Fix Version prior to the patch > >> > > being merged > >> > > > > > >> > > > > On Thu, Jul 11, 2019 at 3:01 PM Eric Erhardt > >> > > > > wrote: > >> > > > > > > >> > > > > > The two C# fixes I'd like in the 0.14.1 release are: > >> > > > > > > >> > > > > > https://issues.apache.org/jira/browse/ARROW-5887 - already > >> > >
Re: [DISCUSS] Format additions for encoding/compression (Was: [Discuss] Format additions to Arrow for sparse data and data integrity)
On Sat, Jul 13, 2019 at 11:23 AM Antoine Pitrou wrote:
>
> On Fri, 12 Jul 2019 20:37:15 -0700
> Micah Kornfield wrote:
> > > If the latter, I wonder why Parquet cannot simply be used instead of
> > > reinventing something similar but different.
> >
> > This is a reasonable point. However, there is a continuum here between file
> > size and read and write times. Parquet will likely always be the smallest
> > with the largest times to convert to and from Arrow. An uncompressed
> > Feather/Arrow file will likely always take the most space but will have much
> > faster conversion times.
>
> I'm curious whether the Parquet conversion times are inherent to the
> Parquet format or due to inefficiencies in the implementation.

Parquet is fundamentally more complex to decode. Consider the several layers of logic that must execute before values end up in the right place:

* Data pages are usually compressed, and a column consists of many data pages, each with a Thrift header that must be deserialized
* Values are usually dictionary-encoded, and dictionary indices are encoded using a hybrid bit-packed / RLE scheme
* Null/not-null is encoded in definition levels
* Only non-null values are stored, so when decoding to Arrow, values have to be "moved into place"

The current C++ implementation could certainly be made faster. One consideration with Parquet is that the files are much smaller, so when you are reading them over the network, the effective end-to-end time including IO and deserialization will frequently win.

> Regards
>
> Antoine.
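To make those layers concrete, here is a deliberately simplified Rust sketch of two of them: expanding run-length-encoded dictionary indices back into values, then "moving values into place" around nulls using a validity mask that stands in for definition levels. This is illustrative only; the real Parquet encoding is a hybrid bit-packed/RLE format, and the `Run` type and `decode` function below are assumptions of mine, not the parquet crate's API.

```rust
// One simplified "run": `count` repetitions of dictionary index `idx`.
// (Real Parquet interleaves bit-packed groups with RLE runs.)
struct Run {
    idx: usize,
    count: usize,
}

// Expand runs of dictionary indices into values, then place the decoded
// values into an Option vector according to a validity mask. Parquet stores
// only the non-null values, so decoding to Arrow has to re-insert null slots.
fn decode(dict: &[&str], runs: &[Run], valid: &[bool]) -> Vec<Option<String>> {
    let mut values = runs
        .iter()
        .flat_map(|r| std::iter::repeat(dict[r.idx].to_string()).take(r.count));
    valid
        .iter()
        .map(|&v| if v { values.next() } else { None })
        .collect()
}

fn main() {
    let dict = ["cash", "credit"];
    // Three "cash" values then one "credit", with a null in the second slot.
    let runs = [Run { idx: 0, count: 3 }, Run { idx: 1, count: 1 }];
    let valid = [true, false, true, true, true];
    let out = decode(&dict, &runs, &valid);
    assert_eq!(out[1], None);
    assert_eq!(out[4].as_deref(), Some("credit"));
    println!("{:?}", out);
}
```

An uncompressed Arrow IPC buffer skips all of this: values and validity bitmaps are already laid out in their final in-memory form, which is the size/speed trade-off Micah describes above.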
Re: [DISCUSS] Format additions for encoding/compression (Was: [Discuss] Format additions to Arrow for sparse data and data integrity)
On Fri, 12 Jul 2019 20:37:15 -0700 Micah Kornfield wrote: > > If the latter, I wonder why Parquet cannot simply be used instead of > > reinventing something similar but different. > > This is a reasonable point. However, there is a continuum here between file > size and read and write times. Parquet will likely always be the smallest > with the largest times to convert to and from Arrow. An uncompressed > Feather/Arrow file will likely always take the most space but will have much > faster conversion times. I'm curious whether the Parquet conversion times are inherent to the Parquet format or due to inefficiencies in the implementation. Regards Antoine.
Re: [DISCUSS] Release cadence and release vote conventions
I would like to volunteer to help with Java and Rust release process work, especially nightly releases. Although I'm not that familiar with the Java implementation of Arrow, I have been using Java and Maven for a very long time. Do we envisage a single nightly release process that releases all languages simultaneously? Or do we want a separate process per language, with different maintainers? On Wed, Jul 10, 2019 at 8:18 AM Wes McKinney wrote: > On Sun, Jul 7, 2019 at 7:40 PM Sutou Kouhei wrote: > > > > Hi, > > > > > in future releases we should > > > institute a minimum 24-hour "quiet period" after any community > > > feedback on a release candidate to allow issues to be examined > > > further. > > > > I agree with this. I'll do so when I act as release manager in > > the future. > > > > > To be able to release more often, two things have to happen: > > > > > > * More PMC members must engage with the release management role, > > > process, and tools > > > * Continued improvements to release tooling to make the process less > > > painful for the release manager. For example, it seems we may want to > > > find a different place than Bintray to host binary artifacts > > > temporarily during release votes > > > > My opinion is that we need to build a nightly release system. > > > > It uses dev/release/NN-*.sh to build .tar.gz and binary > > artifacts from the .tar.gz. > > It also uses dev/release/verify-release-candidate.* to > > verify the built .tar.gz and binary artifacts. > > It also uses dev/release/post-NN-*.sh to do post-release > > tasks. (Some tasks such as uploading a package to a packaging > > system will be dry-run.) > > > > I agree that having a turn-key release system that's capable of > producing nightly packages is the way to go. That way any problems > that would block a release will come up as they happen rather than > piling up until the very end like they are now. > > > I needed 10 or more changes for dev/release/ to create > > 0.14.0 RC0. 
(Some of them are still in my local stashes. I > > don't have time to create pull requests for them > > yet, because I postponed some tasks of my main > > business. I'll create pull requests after I finish the > > postponed tasks of my main business.) > > > > Thanks. I'll follow up on the 0.14.1/0.15.0 thread -- since we need to > release again soon because of problems with 0.14.0, please let us know > what patches will be needed to make another release. > > > If we fix problems related to dev/release/ in our normal > > development process, the release process will be less painful. > > > > The biggest problem for 0.14.0 RC0 was java/pom.xml related: > > https://github.com/apache/arrow/pull/4717 > > > > It was difficult for me because I don't have Java > > knowledge. A release manager needs help from many developers > > because the release manager may not have knowledge of all > > supported languages. Apache Arrow supports over 10 > > languages. > > > > > > For the Bintray API limit problem, we'll be able to resolve it. > > I was added to https://bintray.com/apache/ members: > > > > https://issues.apache.org/jira/browse/INFRA-18698 > > > > I'll be able to use the Bintray API without limitation in the > > future. Release managers should also request the same thing. > > > > This is good, I will add myself. Other PMC members should also add > themselves. > > > > Thanks, > > -- > > kou > > > > In > > "[DISCUSS] Release cadence and release vote conventions" on Sat, 6 Jul 2019 16:28:50 -0500, > > Wes McKinney wrote: > > > > > hi folks, > > > > > > As a reminder, particularly since we have many new community members > > > (some of whom have never been involved with an ASF project before), > > > releases are approved exclusively by the PMC and in general releases > > > cannot be vetoed. In spite of that, we strive to make releases that > > > have unanimous (either by explicit +1 or lazy consent) support of the > > > PMC. 
So it is better to have unanimous 5 +1 votes than 6 +1 votes with > > > a -1 dissenting vote. > > > > > > On the 0.14.0 vote, as with previous release votes, some issues with > > > the release were raised by members of the community, whether build or > > > test-related problems or other failures. Technically speaking, such > > > issues have no _direct_ bearing on whether a release vote passes, only > > > on whether PMC members vote +1, 0, or -1. A PMC member is allowed to > > > change their vote based on new information -- for example, if I voted > > > +1 on a release and then someone reported a serious licensing issue, > > > then I would revise my vote to -1. > > > > > > On the RC0 vote thread, Jacques wrote [1] > > > > > > "A release vote should last until we arrive at consensus. When an > > > issue is potentially identified, those that have voted should be given > > > ample time to change their vote and others that may have been lazy > > > consenters should be given time to chime in. There is no maximum > > > amount of time a vote can be open. Allowing
[jira] [Created] (ARROW-5945) [Rust] [DataFusion] Table trait should support building complete queries
Andy Grove created ARROW-5945:

Summary: [Rust] [DataFusion] Table trait should support building complete queries
Key: ARROW-5945
URL: https://issues.apache.org/jira/browse/ARROW-5945
Project: Apache Arrow
Issue Type: New Feature
Components: Rust, Rust - DataFusion
Affects Versions: 0.14.0
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 1.0.0

DataFusion 0.13 included a preview of the Table trait, which provides a DataFrame-style method of building a logical query plan, but it was not usable for any real-world queries. I would now like the trait to support building real queries, especially aggregate queries.
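For illustration, a DataFrame-style builder over a logical plan might look like the following hypothetical sketch. The `Table` method names (`select`, `aggregate`) and the string-based `Plan` enum are assumptions of mine for this example, not DataFusion's actual API (which builds plans from arrow schemas and typed logical expressions):

```rust
// Hypothetical, simplified logical plan; real DataFusion plans hold
// schemas and expression trees, not strings.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Projection { input: Box<Plan>, columns: Vec<String> },
    Aggregate { input: Box<Plan>, group_by: Vec<String>, aggs: Vec<String> },
}

// A DataFrame-style builder trait: each method wraps the current plan in a
// new logical operator and returns the builder so calls can be chained.
trait Table {
    fn select(self, columns: &[&str]) -> Self;
    fn aggregate(self, group_by: &[&str], aggs: &[&str]) -> Self;
    fn to_plan(&self) -> &Plan;
}

struct TableImpl {
    plan: Plan,
}

impl Table for TableImpl {
    fn select(self, columns: &[&str]) -> Self {
        TableImpl {
            plan: Plan::Projection {
                input: Box::new(self.plan),
                columns: columns.iter().map(|s| s.to_string()).collect(),
            },
        }
    }
    fn aggregate(self, group_by: &[&str], aggs: &[&str]) -> Self {
        TableImpl {
            plan: Plan::Aggregate {
                input: Box::new(self.plan),
                group_by: group_by.iter().map(|s| s.to_string()).collect(),
                aggs: aggs.iter().map(|s| s.to_string()).collect(),
            },
        }
    }
    fn to_plan(&self) -> &Plan {
        &self.plan
    }
}

fn main() {
    // SELECT passenger_count, MIN(fare_amount), MAX(fare_amount)
    // FROM tripdata GROUP BY passenger_count
    let t = TableImpl { plan: Plan::Scan { table: "tripdata".into() } }
        .aggregate(&["passenger_count"], &["MIN(fare_amount)", "MAX(fare_amount)"]);
    match t.to_plan() {
        Plan::Aggregate { group_by, .. } => assert_eq!(group_by[0], "passenger_count"),
        _ => panic!("expected an aggregate at the plan root"),
    }
    println!("{:?}", t.to_plan());
}
```

The point of the sketch is the shape of the API: once the trait can express aggregates (not just projections), the query from ARROW-5946 becomes expressible without going through SQL.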
[jira] [Created] (ARROW-5944) Remove 'div' alias for 'divide'
Prudhvi Porandla created ARROW-5944:

Summary: Remove 'div' alias for 'divide'
Key: ARROW-5944
URL: https://issues.apache.org/jira/browse/ARROW-5944
Project: Apache Arrow
Issue Type: Task
Reporter: Prudhvi Porandla
Assignee: Prudhvi Porandla

div and divide are two different operators.