I think we should create a ticket to discuss releasing 48.1.0 (in addition
to 49.0.0) -- I can do so later today if no one beats me to it

On Tue, Nov 7, 2023 at 6:47 AM Raphael Taylor-Davies
<r.taylordav...@googlemail.com.invalid> wrote:

> It will contain breaking dependency updates, including object_store.
>
> I hope to cut it today.
>
> On 07/11/2023 11:43, Andrew Lamb wrote:
> > If the release later in the week doesn't have any breaking API changes,
> > perhaps it can be 48.1.0 (and thus also get the bugfix to datafusion)
> >
> > On Tue, Nov 7, 2023 at 6:41 AM Raphael Taylor-Davies
> > <r.taylordav...@googlemail.com.invalid> wrote:
> >
> >> I intend to cut a new arrow release later this week; I would prefer we
> >> wait for that.
> >>
> >> On 07/11/2023 11:39, Andrew Lamb wrote:
> >>> Perhaps we can create an arrow 48.1.0 patch release to include the fix?
> >>>
> >>> On Tue, Nov 7, 2023 at 12:48 AM Will Jones <will.jones...@gmail.com> wrote:
> >>>> Thanks for the clarification, Raphael. That likely narrows the scope of
> >>>> who is affected. If this bug is present in DataFusion 33, then delta-rs
> >>>> will likely skip upgrading until 34. If we're the only downstream
> >>>> project this parsing issue affects, then I think it's fine to release.
> >>>>
> >>>> On Mon, Nov 6, 2023 at 8:22 PM Raphael Taylor-Davies
> >>>> <r.taylordav...@googlemail.com.invalid> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> To further clarify, the bug concerns the serde compatibility feature
> >>>>> that allows converting a serde-compatible data structure to arrow [1].
> >>>>> It will not impact workloads reading JSON.
> >>>>>
> >>>>> I am not sure this is a sufficiently fundamental bug to warrant
> >>>>> special concern, but I am happy to defer to others.
> >>>>>
> >>>>> Kind Regards,
> >>>>>
> >>>>> Raphael
> >>>>>
> >>>>> [1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility
> >>>>>
> >>>>> On 7 November 2023 03:20:59 GMT, Will Jones <will.jones...@gmail.com>
> >>>>> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> There is an upstream bug in arrow-json that can cause the JSON reader
> >>>>>> to return incorrect data for large integers [1]. Raphael fixed it
> >>>>>> within the last 24 hours [2], but the fix is not included in any
> >>>>>> release. The bug was introduced in Arrow 48, which this DataFusion
> >>>>>> release will expose users to.
> >>>>>>
> >>>>>> Not sure what the precedent here is, but I think we should consider
> >>>>>> either (a) seeing if we can release and upgrade Arrow to include the
> >>>>>> fix, or else (b) calling out the regression as a known bug so
> >>>>>> downstream projects can include the patch in their applications.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Will Jones
> >>>>>>
> >>>>>> [1] https://github.com/apache/arrow-rs/issues/5038
> >>>>>> [2] https://github.com/apache/arrow-rs/pull/5042
> >>>>>>
> >>>>>> On Mon, Nov 6, 2023 at 12:25 PM Andrew Lamb <al...@influxdata.com> wrote:
> >>>>>>> +1 (the tests passed for me). I have left a comment on
> >>>>>>> https://github.com/apache/arrow-datafusion/issues/8069
> >>>>>>>
> >>>>>>> On Mon, Nov 6, 2023 at 2:02 PM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>> I filed https://github.com/apache/arrow-datafusion/issues/8069
> >>>>>>>>
> >>>>>>>> On Mon, Nov 6, 2023 at 11:59 AM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>>> I see the same error when I run on my M1 MacBook Air with 16 GB RAM.
> >>>>>>>>>
> >>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
> >>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for GroupedHashAggregateStream[0] with 1829 bytes already allocated - maximum available is 605")
> >>>>>>>>>
> >>>>>>>>> It worked fine on my workstation with 128 GB RAM.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Nov 6, 2023 at 11:23 AM L. C. Hsieh <vii...@gmail.com> wrote:
> >>>>>>>>>> Hmm, I ran the verification script and got one failure:
> >>>>>>>>>>
> >>>>>>>>>> failures:
> >>>>>>>>>>
> >>>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
> >>>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for GroupedHashAggregateStream[0] with 1829 bytes already allocated - maximum available is 605")
> >>>>>>>>>>
> >>>>>>>>>> failures:
> >>>>>>>>>>       aggregates::tests::run_first_last_multi_partitions
> >>>>>>>>>>
> >>>>>>>>>> test result: FAILED. 557 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 2.21s
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Nov 6, 2023 at 6:57 AM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to propose a release of Apache Arrow DataFusion
> >>>>>>>>>>> Implementation, version 33.0.0.
> >>>>>>>>>>>
> >>>>>>>>>>> This release candidate is based on commit:
> >>>>>>>>>>> 262f08778b8ec231d96792c01fc3e051640eb5d4 [1]
> >>>>>>>>>>> The proposed release tarball and signatures are hosted at [2].
> >>>>>>>>>>> The changelog is located at [3].
> >>>>>>>>>>>
> >>>>>>>>>>> Please download, verify checksums and signatures, run the unit
> >>>>>>>>>>> tests, and vote on the release. The vote will be open for at least
> >>>>>>>>>>> 72 hours.
> >>>>>>>>>>>
> >>>>>>>>>>> Only votes from PMC members are binding, but all members of the
> >>>>>>>>>>> community are encouraged to test the release and vote with
> >>>>>>>>>>> "(non-binding)".
> >>>>>>>>>>>
> >>>>>>>>>>> The standard verification procedure is documented at:
> >>>>>>>>>>> https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
> >>>>>>>>>>>
> >>>>>>>>>>> [ ] +1 Release this as Apache Arrow DataFusion 33.0.0
> >>>>>>>>>>> [ ] +0
> >>>>>>>>>>> [ ] -1 Do not release this as Apache Arrow DataFusion 33.0.0 because...
> >>>>>>>>>>>
> >>>>>>>>>>> Here is my vote:
> >>>>>>>>>>>
> >>>>>>>>>>> +1
> >>>>>>>>>>>
> >>>>>>>>>>> [1]: https://github.com/apache/arrow-datafusion/tree/262f08778b8ec231d96792c01fc3e051640eb5d4
> >>>>>>>>>>> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-33.0.0-rc1
> >>>>>>>>>>> [3]: https://github.com/apache/arrow-datafusion/blob/262f08778b8ec231d96792c01fc3e051640eb5d4/CHANGELOG.md
