Hi,

To further clarify, the bug concerns the serde compatibility feature that allows
converting a serde-compatible data structure to Arrow [1]. It will not impact
workloads reading JSON.

I am not sure this is a sufficiently fundamental bug to warrant special 
concern, but happy to defer to others.

Kind Regards,

Raphael

[1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility

On 7 November 2023 03:20:59 GMT, Will Jones <will.jones...@gmail.com> wrote:
>Hello,
>
>There is an upstream bug in arrow-json that can cause the JSON reader to
>return incorrect data for large integers [1]. Raphael fixed it within the
>last 24 hours [2], but the fix is not included in any release. The bug was
>introduced in Arrow 48, which this DataFusion release will expose users to.
>
>Not sure what the precedent here is, but I think we should consider
>either (a) seeing if we can release and upgrade Arrow to include the fix,
>or else (b) calling out the regression as a known bug so downstream
>projects can include the patch in their applications.
>
>Best,
>
>Will Jones
>
>[1] https://github.com/apache/arrow-rs/issues/5038
>[2] https://github.com/apache/arrow-rs/pull/5042
>
>On Mon, Nov 6, 2023 at 12:25 PM Andrew Lamb <al...@influxdata.com> wrote:
>
>> +1 (the tests passed for me). I have left a comment on
>> https://github.com/apache/arrow-datafusion/issues/8069
>>
>> On Mon, Nov 6, 2023 at 2:02 PM Andy Grove <andygrov...@gmail.com> wrote:
>>
>> > I filed https://github.com/apache/arrow-datafusion/issues/8069
>> >
>> > On Mon, Nov 6, 2023 at 11:59 AM Andy Grove <andygrov...@gmail.com> wrote:
>> >
>> > > I see the same error when I run on my M1 Macbook Air with 16 GB RAM.
>> > >
>> > > ---- aggregates::tests::run_first_last_multi_partitions stdout ----
>> > > Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
>> > > GroupedHashAggregateStream[0] with 1829 bytes already allocated -
>> > > maximum available is 605")
>> > >
>> > > It worked fine on my workstation with 128 GB RAM.
>> > >
>> > >
>> > >
>> > > On Mon, Nov 6, 2023 at 11:23 AM L. C. Hsieh <vii...@gmail.com> wrote:
>> > >
>> > >> Hmm, ran verification script and got one failure:
>> > >>
>> > >> failures:
>> > >>
>> > >> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
>> > >> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
>> > >> GroupedHashAggregateStream[0] with 1829 bytes already allocated -
>> > >> maximum available is 605")
>> > >>
>> > >> failures:
>> > >>     aggregates::tests::run_first_last_multi_partitions
>> > >>
>> > >> test result: FAILED. 557 passed; 1 failed; 1 ignored; 0 measured; 0
>> > >> filtered out; finished in 2.21s
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Nov 6, 2023 at 6:57 AM Andy Grove <andygrov...@gmail.com> wrote:
>> > >> >
>> > >> > Hi,
>> > >> >
>> > >> > I would like to propose a release of Apache Arrow DataFusion
>> > >> > Implementation, version 33.0.0.
>> > >> >
>> > >> > This release candidate is based on commit:
>> > >> > 262f08778b8ec231d96792c01fc3e051640eb5d4 [1]
>> > >> > The proposed release tarball and signatures are hosted at [2].
>> > >> > The changelog is located at [3].
>> > >> >
>> > >> > Please download, verify checksums and signatures, run the unit
>> > >> > tests, and vote on the release. The vote will be open for at
>> > >> > least 72 hours.
>> > >> >
>> > >> > Only votes from PMC members are binding, but all members of the
>> > >> > community are encouraged to test the release and vote with
>> > >> > "(non-binding)".
>> > >> >
>> > >> > The standard verification procedure is documented at
>> > >> > https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
>> > >> >
>> > >> > [ ] +1 Release this as Apache Arrow DataFusion 33.0.0
>> > >> > [ ] +0
>> > >> > [ ] -1 Do not release this as Apache Arrow DataFusion 33.0.0 because...
>> > >> >
>> > >> > Here is my vote:
>> > >> >
>> > >> > +1
>> > >> >
>> > >> > [1]: https://github.com/apache/arrow-datafusion/tree/262f08778b8ec231d96792c01fc3e051640eb5d4
>> > >> > [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-33.0.0-rc1
>> > >> > [3]: https://github.com/apache/arrow-datafusion/blob/262f08778b8ec231d96792c01fc3e051640eb5d4/CHANGELOG.md
>> > >>
>> > >
>> >
>>
