Hi,
I have some PRs intended to improve the Dataset API's Java implementation
that have not been reviewed for months. Could someone help me review
them? Thanks in advance.
1. https://github.com/apache/arrow/pull/10201
ARROW-11776: [Java][Dataset] Support writing to files within dataset
scanner via
I think if someone wants to build a plugin model for datasets / file
formats (and refactor the existing "built-in" formats to use those
plugin APIs), that sounds like a fine idea to me. I don't think the
idea was for the API to be closed only to the formats that are
implemented inside the Arrow
Another Flatbuffers/Message.fbs project we should rekindle soon, in
addition to the schema evolution/replacement question which has been
raised with Flight, is that of sparse/compressed data (e.g. RLE). I
have a vacation plus some travel coming up so won't be able to devote
meaningful attention to
Hi Jorge,
I see value in consolidating development in a single repo and releasing under
the existing arrow crate. Regarding versioning, I think that as long as we
follow semantic versioning we are fine. I don't think it's worth migrating to a
different repo and crate to comply with the de-facto
I'd break things into (at least) four subproblems.
# Nested fork/join Deadlock
The original problem I set out to solve was the problem of nested
fork/joins leading to deadlock. In particular, the parquet reader
issues a fork/join per column and the dataset scanner issues a
fork/join per file.
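
A minimal sketch of how nested fork/joins can starve a fixed-size pool. It uses Python's ThreadPoolExecutor as a stand-in for Arrow's thread pool, which is an assumption made for illustration only. The names scan, read_file and read_column are hypothetical and are not Arrow APIs. The inner join uses a timeout so that the sketch reports starvation instead of hanging forever.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def read_column(col):
    # Stand-in for decoding one column; trivially fast.
    return f"col{col}"


def read_file(pool, n_columns, timeout):
    # Inner fork/join: one task per column, submitted to the SAME pool
    # that is currently running this file task.
    futures = [pool.submit(read_column, c) for c in range(n_columns)]
    try:
        # Blocking here keeps this worker occupied while it waits.
        return [f.result(timeout=timeout) for f in futures]
    except TimeoutError:
        # No worker was free to run the column tasks. A join without a
        # timeout would block at this point indefinitely.
        for f in futures:
            f.cancel()
        return None


def scan(n_files, n_workers, n_columns=2, timeout=1.0):
    # Outer fork/join: one task per file.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        outer = [
            pool.submit(read_file, pool, n_columns, timeout)
            for _ in range(n_files)
        ]
        return [f.result() for f in outer]


if __name__ == "__main__":
    # One worker: each file task holds the only worker while waiting on
    # column tasks that can never be scheduled.
    print(scan(n_files=2, n_workers=1, timeout=0.2))
    # One more worker than files: a worker is always free for column tasks.
    print(scan(n_files=2, n_workers=3, timeout=5.0))
```

With a single worker, every file task times out and scan returns None for each file. With one more worker than files, the column tasks always have somewhere to run and the nested joins complete.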
hi all,
We've had some discussions in the past about our approach to nested
parallelism (for example, reading multiple Parquet or CSV files or
compressed Arrow IPC files in parallel, each of which can benefit from
internal parallelism for faster parsing / decoding performance). Since
then, there
Hi Paddy,
> What do you think about moving Arrow2 into the main Arrow repo where it
> is only enabled via an "experimental" feature flag?
AFAIK this is already possible:
* add `arrow2 = { version = "0.2.0", optional = true }` to Cargo.toml
* add `#[cfg(feature = "arrow2")]\npub mod arrow2;\n` to
Hello everyone,
Our biweekly sync call is tomorrow (3 August) at 12:00 noon Eastern time.
For today's call, let's please use this Google Meet URL (different from the
usual one):
https://meet.google.com/vbq-yufg-zwr?authuser=0
All are welcome to join. Notes will be shared with the mailing list
flatc does have the option to disable warnings (--no-warnings)
On Tue, Aug 3, 2021 at 2:26 PM Micah Kornfield wrote:
>
> >
> > Is it something that can be done in a major version release?
>
>
> This seems like it would be a major version release of the specification,
> which I think we were
>
> Is it something that can be done in a major version release?
This seems like it would be a major version release of the specification,
which I think we were trying to essentially avoid in any reasonable time
frame. Is there no way to turn the warnings off?
On Mon, Aug 2, 2021 at 2:11 PM
great idea!
On Tue, Aug 3, 2021 at 8:49 AM Andy Grove wrote:
> I also like the idea of moving arrow2/parquet2 into the official repos.
> This is effectively what we did with Ballista, which is still experimental.
> Ballista was simpler because it depends on DataFusion rather than the other
>
We should post the 5.0.0 release blog post soon. If anyone would like
to review the content or make changes or additions, please do so as
soon as possible:
https://github.com/apache/arrow-site/pull/127
Thanks,
Ian
On Fri, Jul 16, 2021 at 1:44 PM Neal Richardson
wrote:
>
> I've started a draft
I also like the idea of moving arrow2/parquet2 into the official repos.
This is effectively what we did with Ballista, which is still experimental.
Ballista was simpler because it depends on DataFusion rather than the other
way around, but I like the idea of using feature flags to enable
Hi Jorge,
What do you think about moving Arrow2 into the main Arrow repo where it is only
enabled via an "experimental" feature flag? This would allow development of
Arrow2 to proceed in the main repo but also this would be a clear signal that
Arrow2 is <1.0. When we feel ready (i.e. Arrow2