If I may, I would be really interested to be kept in the loop as well. I
have been working on a small library that makes it easy to declare Python
types and automatically get them supported in PyArrow as extension types (and
then benefit from vectorized ops): https://github.com/balancap/arrowbic
Th
>
> I do not know if we voted on a naming convention, but we may want to
> reserve a namespace for us (e.g. "arrow").
+1 to calling out in docs that the arrow namespace should be reserved.
maybe "apache.arrow" to lower the possibility of collisions with people who
already have extension types? (I
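A minimal sketch of how the reserved-namespace suggestion above could be checked in code. The helper name, the prefix list, and the "." separator are all assumptions made for illustration; the thread has not settled on a convention.

```python
# Hypothetical prefixes: the thread floats "arrow" and "apache.arrow",
# but neither the final name nor the "." separator has been agreed.
RESERVED_PREFIXES = ("arrow.", "apache.arrow.")


def is_reserved_extension_name(name: str) -> bool:
    """Return True if an extension type name falls in a reserved namespace."""
    # str.startswith accepts a tuple and matches any of its entries.
    return name.startswith(RESERVED_PREFIXES)


# A third-party library could use this to reject names it should not claim.
print(is_reserved_extension_name("arrow.tensor"))
print(is_reserved_extension_name("mylib.tensor"))
```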
Note that we do not have tests on tensor arrays, so testing the extension
type on these may be hindered by divergences between implementations. I do
not think we even have json integration files for them.
If the focus is extension types, maybe it would be best to cover types
whose physical represe
On Mon, 7 Feb 2022 at 21:02, Rok Mihevc wrote:
> To follow up the discussion from the bi-weekly Arrow sync:
>
> - JSON seems the most suitable candidate for the extension metadata.
> E.g.: TensorArray
> {"key": "ARROW:extension:name", "value": "tensor 3, 4), strides=(12, 4, 1)>"},
> {"key": "ARRO
To follow up the discussion from the bi-weekly Arrow sync:
- JSON seems the most suitable candidate for the extension metadata.
E.g.: TensorArray
{"key": "ARROW:extension:name", "value": "tensor"},
{"key": "ARROW:extension:metadata", "value": "{'type': 'int64',
'shape': [3, 3, 4], 'strides': [12,
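A minimal sketch of the JSON proposal above, using only Python's standard json module. The helper name make_tensor_extension_metadata is hypothetical, the field names ("type", "shape", "strides") follow the example in this message, and the stride values are taken from the quoted version of it; none of this is an agreed Arrow format. Note that json.dumps emits double-quoted strings, which strict JSON parsers require, unlike the single quotes in the example.

```python
import json


def make_tensor_extension_metadata(value_type, shape, strides):
    """Build the two key/value entries for a hypothetical tensor extension type."""
    # Serialize the type parameters as a JSON string for the metadata value.
    payload = json.dumps(
        {"type": value_type, "shape": list(shape), "strides": list(strides)}
    )
    return [
        {"key": "ARROW:extension:name", "value": "tensor"},
        {"key": "ARROW:extension:metadata", "value": payload},
    ]


entries = make_tensor_extension_metadata("int64", (3, 3, 4), (12, 4, 1))

# A consumer in any language can recover the parameters with a JSON parser.
decoded = json.loads(entries[1]["value"])
print(decoded["shape"], decoded["strides"])
```

Keeping the parameters in the metadata value rather than in the extension name means a reader only needs a JSON parser, not a custom name parser, to reconstruct the type.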
Le 25/01/2022 à 10:12, Joris Van den Bossche a écrit :
On Sat, 22 Jan 2022 at 20:27, Rok Mihevc wrote:
Thanks for the input Weston!
How about arrow/experimental/format/ExtensionTypes.fbs or
arrow/format/ExtensionTypes.fbs for language independent schema and
loosely arrow//extensions for imp
On Sat, 22 Jan 2022 at 20:27, Rok Mihevc wrote:
>
> Thanks for the input Weston!
>
> How about arrow/experimental/format/ExtensionTypes.fbs or
> arrow/format/ExtensionTypes.fbs for language independent schema and
> loosely arrow//extensions for implementations?
>
> Having machine readable definiti
Sorry meant to add, that I think the C++ implementation should go
where-ever is most convenient to make it work well in the system (unless
the type requires heavy third-party dependencies).
On Sat, Jan 22, 2022 at 8:53 PM Micah Kornfield
wrote:
> Do we need a vote on this?
>
> I was imagining w
>
> Do we need a vote on this?
I was imagining well known types would follow roughly the same process that
new types follow (requiring two different language implementations and an
integration test). I don't think we need to stick to java as the second
language though.
On Sat, Jan 22, 2022 at 1
Thanks for the input Weston!
How about arrow/experimental/format/ExtensionTypes.fbs or
arrow/format/ExtensionTypes.fbs for language independent schema and
loosely arrow//extensions for implementations?
Having machine readable definitions could perhaps be useful for
generating implementations in s
Those all seem to be C++ locations. If we want to define
cross-implementation "Well Known Extension Types" then it seems we
would want to come up with some kind of language independent agreement
(could just be a markdown file but maybe there is some advantage to
having something programmatically c
To continue the ExtensionType part of this thread - I would like to
add TensorArray [1] as an ExtensionType to Arrow but we have not yet
agreed on an "official" location for "Well Known Extension Types".
Where could we put these? Some suggestions:
* implementation folders (e.g. arrow/cpp/extensio
I agree with others on this thread. Thanks for writing this down Micah
On Fri, Apr 30, 2021 at 11:16 AM Antoine Pitrou wrote:
>
> I concur with both what Wes and Micah said.
>
> As for temporal types, they have wide-spread use and their semantics
> require dedicated treatment for arithmetic and
I concur with both what Wes and Micah said.
As for temporal types, they have wide-spread use and their semantics
require dedicated treatment for arithmetic and conversion, so it's
helpful to define dedicated types for them, as opposed to mere annotations.
Regards
Antoine.
Le 30/04/2021 à
I agree that the bar for adding new types to the Type union in Schema.fbs
should be quite high going forward. Using extension types increasingly for
adding specializations of built-in types will mean less burden for
implementations to simply "propagate forward" this data (by preserving the
extra me
+1 this looks good to me.
My only concern is with criterion #3, "Is the underlying encoding of the
type already semantically supported by a type?". I think this is a good
criterion, but it's inconsistent with the current spec. By that criterion
some existing types (Timestamp, Time, Duration, Date) sh
Thanks for writing this.
I agree. That is a good decision tree. +1
Best,
Jorge
On Thu, Apr 29, 2021 at 6:08 PM Micah Kornfield
wrote:
> The discussion around adding another interval type to the Schema.fbs raises
> the issue of when do we decide to add a new type to the Schema.fbs vs using
> o
The discussion around adding another interval type to the Schema.fbs raises
the issue of when do we decide to add a new type to the Schema.fbs vs using
other means (primarily extension types [1]).
A few criteria come to mind that could help decide (feedback welcome):
1. Is the type a new paramet