I filed https://github.com/apache/arrow-rs/issues/5050 to discuss some
different possibilities (for example, doing an arrow patch release or holding
the datafusion release).

Please share your thoughts on the ticket.

On Tue, Nov 7, 2023 at 6:51 AM Andrew Lamb <al...@influxdata.com> wrote:

> I think we should create a ticket to discuss releasing 48.1.0 (in addition
> to 49.0.0) -- I can do so later today if no one beats me to it
>
> On Tue, Nov 7, 2023 at 6:47 AM Raphael Taylor-Davies
> <r.taylordav...@googlemail.com.invalid> wrote:
>
>> It will contain breaking dependency updates, including object_store.
>>
>> I hope to cut it today.
>>
>> On 07/11/2023 11:43, Andrew Lamb wrote:
>> > If the release later in the week doesn't have any breaking API changes,
>> > perhaps it can be 48.1.0 (and thus also get the bugfix to datafusion)
>> >
>> > On Tue, Nov 7, 2023 at 6:41 AM Raphael Taylor-Davies
>> > <r.taylordav...@googlemail.com.invalid> wrote:
>> >
>> >> I intend to cut a new arrow release later this week; I would prefer we
>> >> wait for this.
>> >>
>> >> On 07/11/2023 11:39, Andrew Lamb wrote:
>> >>> Perhaps we can create an arrow 48.1.0 patch release to include the fix?
>> >>>
>> >>> On Tue, Nov 7, 2023 at 12:48 AM Will Jones <will.jones...@gmail.com> wrote:
>> >>>> Thanks for the clarification, Raphael. That likely narrows the scope of who
>> >>>> is affected. If this bug is present in DataFusion 33, then delta-rs will
>> >>>> likely skip upgrading until 34. If we're the only downstream project this
>> >>>> parsing issue affects, then I think it's fine to release.
>> >>>>
>> >>>> On Mon, Nov 6, 2023 at 8:22 PM Raphael Taylor-Davies
>> >>>> <r.taylordav...@googlemail.com.invalid> wrote:
>> >>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> To further clarify, the bug concerns the serde compatibility feature that
>> >>>>> allows converting a serde-compatible data structure to arrow [1]. It will
>> >>>>> not impact workloads reading JSON.
>> >>>>>
>> >>>>> I am not sure this is a sufficiently fundamental bug to warrant special
>> >>>>> concern, but happy to defer to others.
>> >>>>>
>> >>>>> Kind Regards,
>> >>>>>
>> >>>>> Raphael
>> >>>>>
>> >>>>> [1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility
>> >>>>>
>> >>>>> On 7 November 2023 03:20:59 GMT, Will Jones <will.jones...@gmail.com> wrote:
>> >>>>>> Hello,
>> >>>>>>
>> >>>>>> There is an upstream bug in arrow-json that can cause the JSON reader to
>> >>>>>> return incorrect data for large integers [1]. It was fixed by Raphael
>> >>>>>> within the last 24 hours, but the fix is not included in any release.
>> >>>>>> The bug was introduced in Arrow 48, which this DataFusion release will
>> >>>>>> expose users to.
>> >>>>>>
>> >>>>>> Not sure what the precedent here is, but I think we should consider
>> >>>>>> either (a) seeing if we can release and upgrade Arrow to include the fix,
>> >>>>>> or else (b) calling out the regression as a known bug so downstream
>> >>>>>> projects can include the patch in their applications.
>> >>>>>>
>> >>>>>> Best,
>> >>>>>>
>> >>>>>> Will Jones
>> >>>>>>
>> >>>>>> [1] https://github.com/apache/arrow-rs/issues/5038
>> >>>>>> [2] https://github.com/apache/arrow-rs/pull/5042
>> >>>>>>
>> >>>>>> On Mon, Nov 6, 2023 at 12:25 PM Andrew Lamb <al...@influxdata.com> wrote:
>> >>>>>>> +1 (the tests passed for me). I have left a comment on
>> >>>>>>> https://github.com/apache/arrow-datafusion/issues/8069
>> >>>>>>>
>> >>>>>>> On Mon, Nov 6, 2023 at 2:02 PM Andy Grove <andygrov...@gmail.com> wrote:
>> >>>>>>>> I filed https://github.com/apache/arrow-datafusion/issues/8069
>> >>>>>>>>
>> >>>>>>>> On Mon, Nov 6, 2023 at 11:59 AM Andy Grove <andygrov...@gmail.com> wrote:
>> >>>>>>>>> I see the same error when I run on my M1 Macbook Air with 16 GB RAM.
>> >>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
>> >>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
>> >>>>>>>>> GroupedHashAggregateStream[0] with 1829 bytes already allocated -
>> >>>>>>>>> maximum available is 605")
>> >>>>>>>>>
>> >>>>>>>>> It worked fine on my workstation with 128 GB RAM.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Nov 6, 2023 at 11:23 AM L. C. Hsieh <vii...@gmail.com> wrote:
>> >>>>>>>>>> Hmm, ran verification script and got one failure:
>> >>>>>>>>>>
>> >>>>>>>>>> failures:
>> >>>>>>>>>>
>> >>>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
>> >>>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
>> >>>>>>>>>> GroupedHashAggregateStream[0] with 1829 bytes already allocated -
>> >>>>>>>>>> maximum available is 605")
>> >>>>>>>>>>
>> >>>>>>>>>> failures:
>> >>>>>>>>>>       aggregates::tests::run_first_last_multi_partitions
>> >>>>>>>>>>
>> >>>>>>>>>> test result: FAILED. 557 passed; 1 failed; 1 ignored; 0 measured; 0
>> >>>>>>>>>> filtered out; finished in 2.21s
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Nov 6, 2023 at 6:57 AM Andy Grove <andygrov...@gmail.com> wrote:
>> >>>>>>>>>>> Hi,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I would like to propose a release of Apache Arrow DataFusion
>> >>>>>>>>>>> Implementation, version 33.0.0.
>> >>>>>>>>>>>
>> >>>>>>>>>>> This release candidate is based on commit:
>> >>>>>>>>>>> 262f08778b8ec231d96792c01fc3e051640eb5d4 [1]
>> >>>>>>>>>>> The proposed release tarball and signatures are hosted at [2].
>> >>>>>>>>>>> The changelog is located at [3].
>> >>>>>>>>>>>
>> >>>>>>>>>>> Please download, verify checksums and signatures, run the unit
>> >>>>>>>>>>> tests, and vote on the release. The vote will be open for at
>> >>>>>>>>>>> least 72 hours.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Only votes from PMC members are binding, but all members of the
>> >>>>>>>>>>> community are encouraged to test the release and vote with
>> >>>>>>>>>>> "(non-binding)".
>> >>>>>>>>>>>
>> >>>>>>>>>>> The standard verification procedure is documented at:
>> >>>>>>>>>>> https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
>> >>>>>>>>>>>
>> >>>>>>>>>>> [ ] +1 Release this as Apache Arrow DataFusion 33.0.0
>> >>>>>>>>>>> [ ] +0
>> >>>>>>>>>>> [ ] -1 Do not release this as Apache Arrow DataFusion 33.0.0 because...
>> >>>>>>>>>>>
>> >>>>>>>>>>> Here is my vote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> +1
>> >>>>>>>>>>>
>> >>>>>>>>>>> [1]: https://github.com/apache/arrow-datafusion/tree/262f08778b8ec231d96792c01fc3e051640eb5d4
>> >>>>>>>>>>> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-33.0.0-rc1
>> >>>>>>>>>>> [3]: https://github.com/apache/arrow-datafusion/blob/262f08778b8ec231d96792c01fc3e051640eb5d4/CHANGELOG.md
>> >>
>>
>
