I intend to cut a new arrow release later this week; I would prefer we wait
for that.

On 07/11/2023 11:39, Andrew Lamb wrote:
Perhaps we can create an arrow 48.1.0 patch release to include the fix?

On Tue, Nov 7, 2023 at 12:48 AM Will Jones <will.jones...@gmail.com> wrote:

Thanks for the clarification, Raphael. That likely narrows the scope of who
is affected. If this bug is present in DataFusion 33, then delta-rs will
likely skip upgrading until 34. If we're the only downstream project this
parsing issue affects, then I think it's fine to release.

On Mon, Nov 6, 2023 at 8:22 PM Raphael Taylor-Davies
<r.taylordav...@googlemail.com.invalid> wrote:

Hi,

To further clarify, the bug concerns the serde compatibility feature that
allows converting a serde-compatible data structure to arrow [1]. It will
not impact workloads reading JSON.

I am not sure this is a sufficiently fundamental bug to warrant special
concern, but happy to defer to others.

Kind Regards,

Raphael

[1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility

On 7 November 2023 03:20:59 GMT, Will Jones <will.jones...@gmail.com>
wrote:
Hello,

There is an upstream bug in arrow-json that can cause the JSON reader to
return incorrect data for large integers [1]. Raphael fixed it within the
last 24 hours [2], but the fix is not included in any release. The bug was
introduced in Arrow 48, which this DataFusion release will expose users to.

Not sure what the precedent here is, but I think we should consider either
(a) seeing if we can release and upgrade Arrow to include the fix, or else
(b) calling out the regression as a known bug so downstream projects can
include the patch in their applications.

Best,

Will Jones

[1] https://github.com/apache/arrow-rs/issues/5038
[2] https://github.com/apache/arrow-rs/pull/5042

On Mon, Nov 6, 2023 at 12:25 PM Andrew Lamb <al...@influxdata.com>
wrote:
+1 (the tests passed for me). I have left a comment on
https://github.com/apache/arrow-datafusion/issues/8069

On Mon, Nov 6, 2023 at 2:02 PM Andy Grove <andygrov...@gmail.com>
wrote:
I filed https://github.com/apache/arrow-datafusion/issues/8069

On Mon, Nov 6, 2023 at 11:59 AM Andy Grove <andygrov...@gmail.com>
wrote:
I see the same error when I run on my M1 MacBook Air with 16 GB RAM.

---- aggregates::tests::run_first_last_multi_partitions stdout ----
Error: ResourcesExhausted("Failed to allocate additional 632 bytes for GroupedHashAggregateStream[0] with 1829 bytes already allocated - maximum available is 605")

It worked fine on my workstation with 128 GB RAM.



On Mon, Nov 6, 2023 at 11:23 AM L. C. Hsieh <vii...@gmail.com>
wrote:
Hmm, I ran the verification script and got one failure:

failures:

---- aggregates::tests::run_first_last_multi_partitions stdout ----
Error: ResourcesExhausted("Failed to allocate additional 632 bytes for GroupedHashAggregateStream[0] with 1829 bytes already allocated - maximum available is 605")

failures:
     aggregates::tests::run_first_last_multi_partitions

test result: FAILED. 557 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 2.21s



On Mon, Nov 6, 2023 at 6:57 AM Andy Grove <andygrov...@gmail.com>
wrote:
Hi,

I would like to propose a release of Apache Arrow DataFusion
Implementation, version 33.0.0.

This release candidate is based on commit:
262f08778b8ec231d96792c01fc3e051640eb5d4 [1]
The proposed release tarball and signatures are hosted at [2].
The changelog is located at [3].

Please download, verify checksums and signatures, run the unit tests, and
vote on the release. The vote will be open for at least 72 hours.

Only votes from PMC members are binding, but all members of the community
are encouraged to test the release and vote with "(non-binding)".

The standard verification procedure is documented at
https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
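
For anyone checking the tarball by hand, the checksum part of that request
can be sketched as below. This is only an illustration, not the project's
verification script: the helper names are made up for the example, and it
assumes a SHA-256 checksum file in the usual "<hex digest>  <file name>"
layout. It does not cover the signature check, which needs GPG and the
project's KEYS file, and the documented procedure above remains the
authoritative one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact's digest with the first token of its checksum file."""
    # Assumption: the checksum file starts with the hex digest, followed by
    # whitespace and the file name.
    expected = checksum_file.read_text().split()[0].lower()
    return sha256_of(artifact) == expected
```

Usage would look like checksum_matches(Path("release.tar.gz"),
Path("release.tar.gz.sha256")), where both file names are placeholders for
whatever was downloaded from [2].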

[ ] +1 Release this as Apache Arrow DataFusion 33.0.0
[ ] +0
[ ] -1 Do not release this as Apache Arrow DataFusion 33.0.0 because...

Here is my vote:

+1

[1]: https://github.com/apache/arrow-datafusion/tree/262f08778b8ec231d96792c01fc3e051640eb5d4
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-33.0.0-rc1
[3]: https://github.com/apache/arrow-datafusion/blob/262f08778b8ec231d96792c01fc3e051640eb5d4/CHANGELOG.md
