Hi Kou,
Thanks for pushing for this!
On 06/06/2024 at 11:27, Sutou Kouhei wrote:
4. Standardize Apache Arrow schema for statistics and
transmit statistics via separated API call that uses the
C data interface
[...]
I think that 4. is the best approach in these candidates.
I
Hello,
Arrow C++ features a MemoryPool abstraction that allows using different
allocators interchangeably. Several MemoryPool implementations are
provided with Arrow C++ (though one can also build their own):
- a jemalloc-based implementation, currently the default on Linux
- a
(Gang Wu, Antoine Pitrou, Wes McKinney)
9x +1 non-binding (Micah Kornfield, Felipe Oliveira Carvalho, Fokko
Driesprong, Alenka Frim, Andy Grove, Raúl Cumplido, Sutou Kouhei, Jiashen
Zhang, Rok Mihevc)
Arrow:
6x +1 binding (Micah Kornfield, Antoine Pitrou, Andy Grove, Raúl Cumplido,
Wes McKinney
Hi Li!
Sorry for the delay.
It seems the problem lies here:
https://github.com/apache/arrow/blob/9f5899019d23b2b1eae2fedb9f6be8827885d843/cpp/src/arrow/filesystem/s3fs.cc#L1858
The Future is marked finished with the ObjectOutputStream's mutex taken,
and the Future's callback then triggers a
+1 (binding).
Thanks for taking this up, Rok!
Regards
Antoine.
On 29/05/2024 at 16:14, Rok Mihevc wrote:
# sending this to both dev@arrow and dev@parquet
Hi all,
Following the ML discussion [1] I would like to propose a vote for
parquet-cpp issues to be moved from Parquet Jira [2] to
Is it somehow possible to be a "member" of this account to indicate that
we have PMC status, or is that not possible within the LinkedIn
membership/permissions model?
On 24/05/2024 at 18:04, Ian Cook wrote:
Following the discussion [1] earlier this year about the status of the
Apache
", "min",
> > >"byte_width" and "distinct_count" but users can also use
> > >application specific keys.
> > > 3. If true, then the value is approximate or best-effort.
> > >
> > > VALUE_SCHEMA is a dense union with
On 23/05/2024 at 16:09, Felipe Oliveira Carvalho wrote:
Protocols that produce/consume statistics might want to use the C Data
Interface as a primitive for passing Arrow arrays of statistics.
This is also my opinion.
I think what we are slowly converging on is the need for a spec to
Hi Kou,
I agree with Dewey that this is overstretching the capabilities of the C
Data Interface. In particular, stuffing a pointer as metadata value and
decreeing it immortal doesn't sound like a good design decision.
Why not simply pass the statistics ArrowArray separately in your
I think these flags should be advisory and consumers should be free to
ignore them. However, some consumers apparently would benefit from them
to more faithfully represent the producer's intention.
For example, in Arrow C++, we could perhaps have an ImportDatum function
whose actual return
+1 (binding)
On 19/04/2024 at 22:22, Rok Mihevc wrote:
Hi all,
Following initial requests [1][2] and recent tangential ML discussion [3] I
would like to propose a vote to add language for UUID canonical extension
type to CanonicalExtensions.rst as in PR [4] and written below.
A draft C++
+1 (binding) for the current proposal, i.e. with the RFC 8259
requirement and the 3 current String types allowed.
Regards
Antoine.
On 30/04/2024 at 19:26, Rok Mihevc wrote:
Hi all, thanks for the votes and comments so far.
I've amended [1] the proposed language with the RFC-8259
o we could use this in that context).
I think that I would still prefer a canonical extension type (with storage
type null) over a new dedicated type.
On Wed, Apr 17, 2024 at 5:39 AM Antoine Pitrou wrote:
Ah! Well, I think this could be an interesting proposal, but someone
should put a mor
Ah! Well, I think this could be an interesting proposal, but someone
should put forward a more formal proposal, perhaps as a draft PR.
Regards
Antoine.
On 17/04/2024 at 11:57, David Li wrote:
For an unsupported/other extension type.
On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
What
Out of curiosity, did you notice this by chance or do you have some kind
of script that processes ASF mailing-list archives for possible voting
irregularities?
Regards
Antoine.
On 17/04/2024 at 10:44, Christofer Dutz wrote:
When looking at whimsy, I can’t see any person named Sutou
ne-off nominal types for
very specific use-cases?
—
Felipe
On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou wrote:
Yes, JSON and UUID are obvious candidates for new canonical extension
types. XML also comes to mind, but I'm not sure there's much of a use
case for it.
Regards
Antoine.
On 10/04/2024 at
:06 Antoine Pitrou wrote:
Yes, JSON and UUID are obvious candidates for new canonical extension
types. XML also comes to mind, but I'm not sure there's much of a use
case for it.
Regards
Antoine.
On 10/04/2024 at 22:55, Wes McKinney wrote:
In the past we have discussed adding a canonical
Yes, JSON and UUID are obvious candidates for new canonical extension
types. XML also comes to mind, but I'm not sure there's much of a use
case for it.
Regards
Antoine.
On 10/04/2024 at 22:55, Wes McKinney wrote:
In the past we have discussed adding a canonical type for UUID and
Hello John,
Arrow IPC files can be backed quite naturally by shared memory, simply
by memory-mapping them for reading. So if you have some pieces of shared
memory containing Arrow IPC files, and they are reachable using a
filesystem mount point, you're pretty much done.
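As a rough illustration of the mechanism (not of the Arrow API), the sketch below memory-maps a file read-only with Python's standard `mmap` module. The file contents are placeholder bytes, not a real Arrow IPC file, and the path is a temporary one made up for the example. With PyArrow one would presumably use `pyarrow.memory_map` together with `pyarrow.ipc.open_file` instead.

```python
import mmap
import os
import tempfile

# Placeholder payload standing in for an Arrow IPC file (assumption).
payload = b"placeholder bytes, not a real Arrow IPC file"

# Hypothetical location; in the scenario above this would live under a
# shared-memory mount point reachable by all processes.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(payload)

# Map the whole file read-only. The pages come from the OS page cache, so
# several processes mapping the same file share the same physical memory.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        size = len(mapped)
        prefix = mapped[:11]  # slicing copies just these bytes out
```

Nothing is read eagerly here: only the pages that are actually touched get loaded, which is what makes this approach attractive for large files.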
You can see an
It seems that perhaps this discussion should be rebooted for each
individual component, one at a time?
Let's start with something simple and obvious, with some frequent
contribution activity, such as perhaps Go?
On 09/04/2024 at 14:27, Joris Van den Bossche wrote:
I am also in favor
On 28/03/2024 at 21:42, Jacob Wujciak wrote:
For Arrow C++ bindings like Arrow R and PyArrow having distinct versions
would require additional work to both enable the use of different versions
and ensure version compatibility is monitored and potentially updated if
needed.
We could simply
Thanks. The Arrow spec does support multiple union members with the same
type, but not all implementations do. The C++ implementation should
support it, though to my surprise we do not seem to have any tests for it.
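To make the "multiple union members with the same type" case concrete, here is a minimal sketch of a dense union lookup in plain Python. It is a model of the layout, not Arrow code: the child names `usd` and `eur` are made up, and it assumes that type ids coincide with child indices.

```python
# Two union members that share the same underlying type (integers).
# Names are hypothetical; type id == child index is an assumption.
children = [
    {"name": "usd", "values": [10, 30]},
    {"name": "eur", "values": [20]},
]

# Dense union: one type id and one offset per logical slot.
type_ids = [0, 1, 0]
offsets = [0, 0, 1]


def union_get(i):
    """Return (member name, value) for logical slot i."""
    child = children[type_ids[i]]
    return child["name"], child["values"][offsets[i]]


slots = [union_get(i) for i in range(len(type_ids))]
```

The member name is what distinguishes the two children here, since their types are identical.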
If the Java implementation doesn't, then you can probably open an issue
Can you explain what ADT means?
On 02/04/2024 at 11:31, Finn Völkel wrote:
Hi,
my question primarily concerns the union layout described at
https://arrow.apache.org/docs/format/Columnar.html#union-layout
There are two ways to use unions:
- polymorphic vectors (world 1)
- ADT
Regardless of whether they have different compression ratios, it doesn't
explain why you would want a different compression *algorithm* altogether.
The choice of a compression algorithm should basically be driven by two
concerns: the acceptable space/time tradeoff (do you want to minimize
Hello Andrei,
On 23/03/2024 at 13:23, Andrei Lazăr wrote:
At this very moment, specifying different compression algorithms per column
is supported and in my use case it is extremely helpful, as I have some
columns (mostly containing floats), for which a compression algorithm like
Snappy
Also, with ADBC driver implementations currently in flux (none of them
has reached the "stable" status in
https://arrow.apache.org/adbc/main/driver/status.html), it might be a
disservice to users to implicitly fetch drivers from potentially
outdated DLLs on the current system.
Regards
Congratulations Bryce, and keep up the good work!
Regards
Antoine.
On 18/03/2024 at 03:21, Nic Crane wrote:
On behalf of the Arrow PMC, I'm happy to announce that Bryce Mecum has
accepted an invitation to become a committer on Apache Arrow. Welcome, and
thank you for your contributions!
I didn't run the release script but I'm +1 on this (binding).
Regards
Antoine.
On 04/03/2024 at 10:05, Raúl Cumplido wrote:
Hi,
I would like to propose the following release candidate (RC0) of Apache
Arrow version 15.0.1. This is a release consisting of 37
resolved GitHub issues[1].
want as many
parties in the community as possible to be part of this.
Thanks everyone.
--Matt
On Tue, Feb 27, 2024 at 12:48 PM Antoine Pitrou wrote:
Hello,
I'd really like to see more engagement and criticism from non-Voltron
Data parties before this is formally adopted as an Arrow spec
Hello,
I'd really like to see more engagement and criticism from non-Voltron
Data parties before this is formally adopted as an Arrow spec.
Regards
Antoine.
On 27/02/2024 at 18:35, Matt Topol wrote:
Hey all,
I'd like to propose a vote for us to officially adopt the protocol
described
for today's bi-weekly call.
Thanks,
Raúl
On Tue, 13 Feb 2024 at 23:20, Antoine Pitrou wrote:
Well, https://github.com/apache/arrow/issues/20379 makes me wonder if
anyone is using the Java Dataset bridge seriously.
On 13/02/2024 at 21:10, Dane Pitkin wrote:
Hi all,
Arrow Java identified
Well, https://github.com/apache/arrow/issues/20379 makes me wonder if
anyone is using the Java Dataset bridge seriously.
On 13/02/2024 at 21:10, Dane Pitkin wrote:
Hi all,
Arrow Java identified an issue[1] in the 15.0.0 release. There is an
undefined symbol in the dataset module that
ed semantics? If so, is there a way to include the
original service in the list of locations without the implied precedence?
Thanks,
Joel
On Mon, Feb 12, 2024 at 11:52 James Duong wrote:
This seems like a good idea, and also improves consistency with clients
that erroneously assumed that th
Hi Dewey,
On 12/02/2024 at 15:01, Dewey Dunnington wrote:
Apache Arrow nanoarrow is a small C library for building and
interpreting Arrow C Data interface structures with bindings for users
of the R programming language.
Do you want to reconsider this sentence? It seems nanoarrow is
Hello,
This looks fine to me.
Regards
Antoine.
On 12/02/2024 at 14:46, David Li wrote:
Hello,
I'd like to propose a slight update to Flight RPC to make Flight SQL work
better in different deployment scenarios. Comments on the doc would be
appreciated:
I think we should find a proper descriptive name for the
"high-performance protocol", because "high-performance" is vague and
context-dependent, and also spreads unnecessary confusion about existing
alternatives such as regular Arrow IPC.
I would for example propose "Dissociated Arrow IPC"
My 2 cents: I don't understand what an open source project gains by
publishing on a microblogging platform.
As for Twitter specifically, its recent governance changes would be a good
reason for terminating the @ApacheArrow account, IMHO.
Regards
Antoine.
On 27/01/2024 at 23:06, Bryce
Hello,
My own answers:
1) isDelta should be true only when a delta is being transmitted (to be
appended to the existing dictionary with the same id); it should be
false when a full dictionary is being transmitted (to replace the
existing dictionary with the same id, if any)
2) yes, it
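The semantics in answer 1) can be sketched as a small state update in plain Python. This models the behaviour only and is not taken from any Arrow implementation; the function name is hypothetical, and raising an error for a delta that targets an unknown id is an assumption of this sketch.

```python
def apply_dictionary_batch(dictionaries, dict_id, values, is_delta):
    """Update the id -> values mapping for one incoming dictionary batch."""
    if is_delta:
        # Delta: append to the existing dictionary with the same id.
        if dict_id not in dictionaries:
            # Assumption: a delta without a prior dictionary is an error.
            raise KeyError(f"delta for unknown dictionary id {dict_id}")
        dictionaries[dict_id] = dictionaries[dict_id] + list(values)
    else:
        # Full dictionary: replace the existing one with the same id, if any.
        dictionaries[dict_id] = list(values)
    return dictionaries


state = {}
apply_dictionary_batch(state, 0, ["a", "b"], is_delta=False)
apply_dictionary_batch(state, 0, ["c"], is_delta=True)
after_delta = list(state[0])
apply_dictionary_batch(state, 0, ["x"], is_delta=False)
after_replacement = list(state[0])
```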
Impressive, thank you!
On 23/01/2024 at 14:06, Andrew Lamb wrote:
If anyone is interested, here is a new blog post about the last 6 months in
DataFusion[1] and where we are heading this year.
Andrew
[1]: https://arrow.apache.org/blog/2024/01/19/datafusion-34.0.0/
Well, if the main objective is to just follow the ASF Release
guidelines, then our verification process can be simplified drastically.
The ASF indeed just requires:
"""
Every ASF release MUST contain one or more source packages, which MUST
be sufficient for a user to build and test the
Go verification fails on Ubuntu 22.04:
```
# google.golang.org/grpc
../../gopath/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:2096:14:
undefined: atomic.Int64
note: module requires Go 1.19
# github.com/apache/arrow/go/v15/arrow/avro
arrow/avro/reader_types.go:594:16: undefined:
```
Hi,
For now, I would suggest that each implementation decides on their own
strategy, because we don't have a clear idea of which is better (and
extension types are probably not getting a lot of use yet).
Regards
Antoine.
On 13/12/2023 at 17:39, Benjamin Kietzman wrote:
The main
Hi Curt,
Yes, it's a problem in the Java implementation of these tests. Ideally
this should be fixed, but doing so would require some amount of scaffolding.
Regards
Antoine.
On 09/12/2023 at 21:47, Curt Hagenlocher wrote:
I've (mostly) fixed the C# implementation of dictionary IPC but
+1 (binding)
On 08/12/2023 at 20:42, David Li wrote:
Let's start a formal vote just so we're on the same page now that we've
discussed a few things.
I would like to propose we remove 'experimental' from Flight SQL and make it
stable:
- Remove the 'experimental' option from the Protobuf
Hi,
While this looks like a nice start, I would expect more precise
recommendations for writing non-trivial services. Especially, one
question is how to send both an application-specific POST request and an
Arrow stream, or an application-specific GET response and an Arrow
stream. This
Given that MCJIT is deprecated and there doesn't seem to be a downside
to the new APIs, migrating to ORC v2 sounds fine to me.
Just a question: does it raise the minimum supported LLVM version?
Regards
Antoine.
On 05/12/2023 at 03:35, Yue Ni wrote:
Hi there,
I'd like to initiate a
For the sake of clarity, it seems this is talking about the Conference
on Innovative Data Systems Research:
https://www.cidrdb.org/cidr2024/
Regards
Antoine.
On 06/12/2023 at 01:15, Wes McKinney wrote:
I will also be there.
On Mon, Dec 4, 2023 at 12:58 PM Tony Wang wrote:
I am
Get
Hello,
On 21/11/2023 at 22:59, Chris Thomas wrote:
I apologize if this is not the appropriate venue for this request; if
that's the case, please let me know where I should be asking:
Earlier this month Dependabot flagged a security vulnerability with PyArrow
which prompted us to do an
I also agree that an informal spec "how to efficiently transfer Arrow
data over HTTP" makes sense.
Probably with several aspects:
- one-shot GET data
- streaming GET
- one-shot PUT or POST
- streaming POST
- non-Arrow prologue and epilogue (for example JSON-based metadata)
- conventions for
Welcome Raúl, we're glad to have you!
Regards
Antoine.
On 13/11/2023 at 20:27, Andrew Lamb wrote:
The Project Management Committee (PMC) for Apache Arrow has invited
Raúl Cumplido to become a PMC member and we are pleased to announce
that Raúl Cumplido has accepted.
Please join me in
/CanonicalExtensions.html
On Thu, Nov 9, 2023, at 11:56, Antoine Pitrou wrote:
Or they could trivially use an int64 column for that, since the scale is
fixed anyway, and you're probably not going to multiply money values
together.
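A minimal sketch of the int64 approach, in plain Python rather than Arrow: amounts are stored as integer counts of the smallest unit implied by a fixed scale. The scale of 4 and the helper names `to_fixed` and `from_fixed` are assumptions made for the example.

```python
from decimal import Decimal

SCALE = 4  # assumed fixed scale: 4 digits after the decimal point
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def to_fixed(text):
    """Convert a decimal string to an integer count of 10**-SCALE units."""
    value = Decimal(text).scaleb(SCALE)
    if value != value.to_integral_value():
        raise ValueError(f"{text} has more than {SCALE} fractional digits")
    units = int(value)
    if not INT64_MIN <= units <= INT64_MAX:
        raise OverflowError(f"{text} does not fit in an int64 at scale {SCALE}")
    return units


def from_fixed(units):
    """Convert an integer count of units back to a Decimal."""
    return Decimal(units).scaleb(-SCALE)


# Addition and subtraction are exact integer operations at a fixed scale.
prices = [to_fixed("19.99"), to_fixed("0.0001"), to_fixed("100")]
total = sum(prices)
```

Multiplying two such values would change the scale, which is why the fixed-scale trick works best for quantities that are only added or compared.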
On 09/11/2023 at 17:54, Curt Hagenlocher wrote:
If Arrow had a decimal64 type
, at 11:56, Antoine Pitrou wrote:
Or they could trivially use an int64 column for that, since the scale is
fixed anyway, and you're probably not going to multiply money values
together.
On 09/11/2023 at 17:54, Curt Hagenlocher wrote:
If Arrow had a decimal64 type, someone could choose to use
column knowing that there are edge cases where they may
get an undesired result.
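One such edge case can be checked with a few lines of plain Python (this is not PyArrow, and the helper names are hypothetical). The sketch assumes a scale of 4 and a declared precision of 18, and uses a value with 15 integer digits: its unscaled integer fits in 64 bits but has more than 18 decimal digits.

```python
from decimal import Decimal

INT64_MAX = 2**63 - 1


def unscaled(text, scale):
    """Integer that would be stored for `text` at the given scale.

    Thousands separators are stripped; extra fractional digits would be
    truncated by int(), which is acceptable for this illustration.
    """
    return int(Decimal(text.replace(",", "")).scaleb(scale))


def fits_precision(text, precision, scale):
    """True if the unscaled value has at most `precision` decimal digits."""
    return len(str(abs(unscaled(text, scale)))) <= precision


value = "111,111,111,111,111."
stored = unscaled(value, 4)
fits_in_int64 = stored <= INT64_MAX
fits_in_decimal_18_4 = fits_precision(value, 18, 4)
```

So the 64-bit storage could hold the value, while a type declared with precision 18 and scale 4 would not accept it.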
On Thu, Nov 9, 2023 at 8:42 AM Antoine Pitrou wrote:
On 09/11/2023 at 17:23, Curt Hagenlocher wrote:
Or more succinctly,
"111,111,111,111,111." will fit into a decimal64; would you prevent
it
On 09/11/2023 at 17:23, Curt Hagenlocher wrote:
Or more succinctly,
"111,111,111,111,111." will fit into a decimal64; would you prevent it
from being stored in one so that you can describe the column as
"decimal(18, 4)"?
That's what we do for other decimal types, see PyArrow below:
For the record, the correct PR link seems to be
https://github.com/apache/arrow/pull/38385
On 08/11/2023 at 21:49, David Li wrote:
Hello,
Joel Lubi has proposed adding bulk ingestion support to Arrow Flight SQL [1].
This provides a path for uploading an Arrow dataset to a Flight SQL
Severity: critical
Affected versions:
- PyArrow 0.14.0 through 14.0.0
Description:
Deserialization of untrusted data in IPC and Parquet readers in PyArrow
versions 0.14.0 to 14.0.0 allows arbitrary code execution. An application is
vulnerable if it reads Arrow
On 26/10/2023 at 20:02, Benjamin Kietzman wrote:
Is this buffer lengths buffer only present if the array type is Utf8View?
IIUC, the proposal would add the buffer lengths buffer for all types if the
schema's
flags include ARROW_FLAG_BUFFER_LENGTHS. I do find it appealing to avoid
the
On 26/10/2023 at 18:59, Dewey Dunnington wrote:
That sounds a bit hackish to me.
Including only *some* buffer sizes in array->buffers[array->n_buffers]
special-cased for only two types (or altering the number of buffers
required by the IPC format vs. the number of buffers required by the
On 26/10/2023 at 17:45, Dewey Dunnington wrote:
The lack of buffer sizes is something that has come up for me a few
times working with nanoarrow (which dedicates a significant amount of
code to calculating buffer sizes, which it uses to do validation and
more efficient copying).
By the
On 26/10/2023 at 17:45, Dewey Dunnington wrote:
> A potential alternative might be to allow any ArrowArray to declare
> its buffer sizes in array->buffers[array->n_buffers], perhaps with a
> new flag in schema->flags to advertise that capability.
That sounds a bit hackish to me.
I'd rather
Hello,
We might want to keep the variadic buffers at the end and instead export
the buffer sizes as buffer #2? Though that's mostly stylistic...
Regards
Antoine.
On 25/10/2023 at 18:36, Benjamin Kietzman wrote:
Hello all,
The C ABI does not store buffer lengths explicitly, which
Welcome Xuwei!
On 23/10/2023 at 05:28, Sutou Kouhei wrote:
On behalf of the Arrow PMC, I'm happy to announce that Xuwei Fu
has accepted an invitation to become a committer on Apache
Arrow. Welcome, and thank you for your contributions!
active the community
is being, I'm reasonably confident that they'll come to it soon :)
Regards
Antoine.
On 26/09/2023 at 14:46, Antoine Pitrou wrote:
Hello,
We have added some infrastructure for integration testing of the C Data
Interface between Arrow implementations. We are now testing
The fact that they describe Arrow and Feather as distinct formats
(they're not!) with different characteristics is a bit of a bummer.
On 18/10/2023 at 22:20, Andrew Lamb wrote:
If you are looking for a more formal discussion and empirical analysis of
the differences, I suggest reading "A
+1
On 18/10/2023 at 19:02, Benjamin Kietzman wrote:
Hello all,
I propose "vu" and "vz" as format strings for the Utf8View and
BinaryView types in the Arrow C data interface [1].
The vote will be open for at least 72 hours.
[ ] +1 - I'm in favor of these new C data format strings
[ ] +0
[ ]
Welcome to the PMC, Jon!
On 14/10/2023 at 19:42, David Li wrote:
Congrats Jon!
On Sat, Oct 14, 2023, at 13:25, Ian Cook wrote:
Congratulations Jonathan!
On Sat, Oct 14, 2023 at 13:24 Andrew Lamb wrote:
The Project Management Committee (PMC) for Apache Arrow has invited
Jonathan Keane
PM Antoine Pitrou wrote:
Hi Alva,
I'll let others give their opinions on the repo.
Regards
Antoine.
On 10/10/2023 at 19:25, Alva Bandy wrote:
Hi Antoine,
Thanks for the reply.
It would be great to get the Swift implementation added to the
integration test. I have a task for adding
not looked into Julia’s implementation.
Thank you,
Alva Bandy
On 2023/10/10 08:54:30 Antoine Pitrou wrote:
Hello Alva,
This is a reasonable request, but it might come with its own drawbacks
as well.
One significant drawback is that adding the Swift implementation to the
cross-implementation integration
Hello Alva,
This is a reasonable request, but it might come with its own drawbacks
as well.
One significant drawback is that adding the Swift implementation to the
cross-implementation integration tests will be slightly more complicated.
It is very important that all Arrow implementations
+1 from me.
But I also reiterate my plea that these existing parsers get fixed so as
to entirely validate the format string instead of stopping early.
Regards
Antoine.
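The point about validating the whole format string, rather than stopping at the first recognized prefix, can be sketched as follows in plain Python. This is not the Arrow C++ parser: the table covers only a small subset of C Data Interface format strings, the type names on the right are informal labels, and `parse_format` is a hypothetical helper.

```python
import re

# Small subset of format strings, for illustration only.
SIMPLE_FORMATS = {
    "u": "utf8",
    "U": "large_utf8",
    "z": "binary",
    "Z": "large_binary",
    "vu": "utf8_view",
    "vz": "binary_view",
    "+l": "list",
    "+L": "large_list",
    "+vl": "list_view",
    "+vL": "large_list_view",
}
FIXED_SIZE_BINARY = re.compile(r"w:([0-9]+)")


def parse_format(fmt):
    """Return a type label, rejecting any string not matched in full."""
    if fmt in SIMPLE_FORMATS:
        return SIMPLE_FORMATS[fmt]
    # fullmatch() refuses trailing characters, unlike match().
    match = FIXED_SIZE_BINARY.fullmatch(fmt)
    if match is not None:
        return f"fixed_size_binary[{int(match.group(1))}]"
    raise ValueError(f"invalid format string: {fmt!r}")
```

With this shape, a string such as `+vlx` or `w:16junk` is rejected instead of being silently accepted as a list view or a fixed-size binary.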
On 06/10/2023 at 23:26, Felipe Oliveira Carvalho wrote:
Hello,
I'm writing to propose "+vl" and "+vL" as format
);
+} else {
+ type_ = list_view(field);
+}
+ } else {
+return f_parser_.Invalid();
+ }
+}
+
return Status::OK();
}
--
Felipe
On Thu, Oct 5, 2023 at 5:26 PM Antoine Pitrou wrote:
I don't think the parsing will be a problem even in C. It's not like
I don't think the parsing will be a problem even in C. It's not like you
have to backtrack anyway.
+1 from me on Felipe's proposal.
Regards
Antoine.
On 05/10/2023 at 20:33, Felipe Oliveira Carvalho wrote:
This mailing list thread is going to be the discussion.
The union types also
+1 from me. It might be worth spelling out whether any relationship is
expected between the `app_metadata` for a FlightInfo and any of the
corresponding `FlightEndpoint`s and `FlightData` chunks.
On 12/09/2023 at 17:48, Matt Topol wrote:
Hey all,
I would like to propose adding a new
On 03/10/2023 at 01:36, Matt Topol wrote:
The cost of conversion is actually significantly higher than the actual
overhead of simply accessing the values in either representation, leading
to a high potential for bottleneck. For systems like Velox and DuckDB where
it's important to be able
approach be willing to meet us in the middle and switch to
an offset based encoding? This to me feels like it would be the best
outcome for the ecosystem as a whole.
Kind Regards,
Raphael
On 02/10/2023 13:50, Antoine Pitrou wrote:
On 01/10/2023 at 16:21, Micah Kornfield wrote:
I would also
Hello,
+1 and thanks for working on this!
There'll probably be some minor comments on the format PR, but those
shouldn't deter us from accepting these new layouts into the standard.
Regards
Antoine.
On 29/09/2023 at 14:09, Felipe Oliveira Carvalho wrote:
Hello,
I'd like to propose adding
On 01/10/2023 at 16:21, Micah Kornfield wrote:
I would also assert that another way to reduce this risk is to add
some prose to the relevant sections of the columnar format
specification doc to clearly explain that a raw pointers variant of
the layout, while not part of the official spec,
be clearly flagged as being non-Arrow compliant.
It could be by naming (e.g. `arrow::non_arrow_string_view()`) or by
specific namespacing (e.g. `non_arrow::raw_pointers_string_view()`).
But they could also be provided by a distinct library.
Regards
Antoine.
On 28/09/2023 at 09:01, Antoine
Hi Ben,
On 27/09/2023 at 23:25, Benjamin Kietzman wrote:
@Antoine
What this PR is creating is an "unofficial" Arrow format, with data
types exposed in Arrow C++ that are not part of the Arrow standard, but
are exposed as if they were.
We already do this in every implementation of the
Hello,
What this PR is creating is an "unofficial" Arrow format, with data
types exposed in Arrow C++ that are not part of the Arrow standard, but
are exposed as if they were. Most users will probably not read the
official format spec, but will simply trust the official Arrow
Hello,
We have added some infrastructure for integration testing of the C Data
Interface between Arrow implementations. We are now testing the C++ and
Go implementations, but the goal in the future is for all major
implementations to be tested there (perhaps including nanoarrow).
- PR to
Hi Yue,
On 25/09/2023 at 18:15, Yue Ni wrote:
a CMake entrypoint (for example a function) making it easy for
third-party projects to compile their own functions
I can come up with a minimum CMake template so that users can compile C++
based functions, and I think if the integration
Hello,
Making Gandiva more extensible sounds like a worthwhile improvement.
However, I'm not sure why we would need to choose a JSON-based format
for this. Instead, I think Gandiva could simply provide the two
following building blocks:
1. a CMake entrypoint (for example a function)
On 13/09/2023 at 02:37, Rok Mihevc wrote:
* **ragged_dimensions** = indices of ragged dimensions whose sizes may
differ. Dimensions where all elements have the same size are called
uniform dimensions. Indices are a subset of all possible dimension
indices ([0, 1, ..,
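The distinction between ragged and uniform dimensions can be illustrated with a short plain-Python sketch. In the proposal `ragged_dimensions` is declared by the producer; here it is inferred from a list of example shapes purely to show the definition, and the function name is hypothetical.

```python
def infer_ragged_dimensions(shapes):
    """Return indices of dimensions whose size differs between tensors."""
    if not shapes:
        return []
    ndim = len(shapes[0])
    if any(len(shape) != ndim for shape in shapes):
        raise ValueError("all tensors must have the same number of dimensions")
    # A dimension is uniform when every tensor has the same size along it.
    return [d for d in range(ndim) if len({shape[d] for shape in shapes}) > 1]


shapes = [(2, 3, 4), (5, 3, 4), (1, 3, 7)]
ragged = infer_ragged_dimensions(shapes)
uniform = [d for d in range(3) if d not in ragged]
```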
Hi Li,
On 06/09/2023 at 17:55, Li Jin wrote:
Hello,
I have been testing "What is the max rss needed to scan through ~100G of
data in a parquet stored in gcs using Arrow C++".
The current answer is about ~6G of memory which seems a bit high so I
looked into it. What I observed during the
Hello Jonas,
What is the standardization model you are after? PEP 249 is marked final
and therefore won't be updated (except for minutiae such as typos,
markup, etc.).
Are you planning to submit a new PEP for this extension? If so, I would
suggest starting a discussion on
+1 on the format additions
The implementations will probably need a bit more review back-and-forth.
Regards
Antoine.
On 28/06/2023 at 21:34, Benjamin Kietzman wrote:
Hello,
I'd like to propose adding Utf8View arrays to the arrow format.
Previous discussion in [1], columnar format
Hello,
Arrow C++ comes with execution facilities (such as thread pools, async
generators...) meant to unlock higher performance by hiding IO latencies
and exploiting several CPU cores. These execution facilities also
obscure the context in which a task is executed: you cannot simply use
latbuffers output from the
release package only.
“Caches”, multi stage compilation etc should be ok.
Best regards,
Adam Lippai
On Tue, Aug 22, 2023 at 10:40 Antoine Pitrou wrote:
If the main impetus for the verification script is to comply with ASF
requirements, probably the script can be made mu
cripts don't need much maintenance
so we just continue the ceremony. However, I certainly don't think we would
lose much/any test coverage if we stopped their use.
Andrew
On Tue, Aug 22, 2023 at 4:54 AM Antoine Pitrou wrote:
Hello,
Abiding by the Apache Software Foundation's guidelines,
+1 from me (binding). The verification script failed for me, but I
consider it not a problem (see separate discussion thread).
Regards
Antoine.
On 18/08/2023 at 10:00, Raúl Cumplido wrote:
Hi,
I would like to propose the following release candidate (RC3) of Apache
Arrow version
Hello,
Abiding by the Apache Software Foundation's guidelines, every Arrow
release is voted on and requires at least 3 "binding" votes to be approved.
Also, every Arrow release vote is accompanied by a little ceremony
where contributors and core developers run a release verification
Hello,
It seems the verification instructions are not up to date?
https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
I've tried to run the suggested command:
$ dev/release/verify-release-candidate.sh source 13.0.0 3
and I get the following error message:
"""
Or you can simply call the "sort_indices" compute function:
https://arrow.apache.org/docs/cpp/compute.html#sorts-and-partitions
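For readers unfamiliar with what a sort-indices operation returns, here is a pure-Python sketch of the semantics. It is not the Arrow implementation and does not call Arrow; the second helper, named `take` here, stands in for the gather step that applies the indices.

```python
def sort_indices(values, descending=False):
    """Return the positions that would put `values` in sorted order."""
    return sorted(range(len(values)), key=values.__getitem__, reverse=descending)


def take(values, indices):
    """Gather `values` at the given positions."""
    return [values[i] for i in indices]


keys = [30, 10, 20]
names = ["c", "a", "b"]

order = sort_indices(keys)
sorted_keys = take(keys, order)
sorted_names = take(names, order)  # the same indices reorder other columns
```

Computing the indices once and applying them to every column is how a whole table can be sorted by one key.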
On 17/08/2023 at 23:20, Ian Cook wrote:
Li,
Here's a standalone C++ example that constructs a Table and executes
an Acero ExecPlan to sort it:
.)
This vote will be open for at least 72 hours.
[ ] +1 Adopt the ADBC 1.1.0 specification
[ ] 0
[ ] -1 Do not adopt the specification because...
Thanks to Sutou Kouhei, Matt Topol, Dewey Dunnington, Antoine Pitrou, Will Ayd,
and Will Jones for feedback on the design and various work-in-progress PRs.
[1
ion of metadata to a string, different
encoder-implementations might still produce non-comparable strings,
resulting in falsely reported datatype mismatches, but at least avoiding
the case of false positives.
On Wed, Aug 16, 2023 at 5:19 PM Antoine Pitrou wrote:
Hi Jeremy,
A single key ma
Hi Jeremy,
A single key makes it easier for generic code to recreate extension
types it does not know about.
Here is an example in the C++ IPC layer:
https://github.com/apache/arrow/blob/641201416c1075edfd05d78b539275065daac31d/cpp/src/arrow/ipc/metadata_internal.cc#L823-L845
Here is
+1 from me (binding).
It would be nice to get approval from authors of other implementations
such as Rust, C#, Javascript...
Thanks for doing this!
On 16/08/2023 at 16:16, Matt Topol wrote:
Hey All,
As proposed by Felipe [1] I'm starting a vote on the proposed update to the
Format
I think we should.
Regards
Antoine.
On 15/08/2023 at 19:58, Matt Topol wrote:
I'm in favor of this as the C Data format string. Though since this is
technically a format/spec change do others think we should take a vote on
this?
--Matt
On Tue, Aug 15, 2023, 12:19 PM Felipe Oliveira