Thanks for referencing this, Antoine. The concepts and principles seem to be
pretty concrete so I
may take some time to read it in detail.
BTW, I noticed that in the current discussion in ticket ARROW-7272[1] it's
still unclear whether
this one or IPC flatbuffers would be the better approach for
Dear all,
I need your help regarding the pyarrow.table.schema.
I tried to create a schema and use with_metadata/add_metadata functions to
add the metadata (a Python dict) to the schema. However, nothing shows up
when I run 'schema.metadata'. I can't get the metadata added to the schema.
This
Hi Francois,
Thanks for the proposal and your effort.
I made a simple JNI PoC before for RecordBatch/VectorSchemaRoot interaction
between Java and C++[1][2].
This may help a little.
Thanks,
Ji Liu
[1] https://github.com/tianchen92/jni-poc-java
[2] https://github.com/tianchen92/jni-poc-cpp
Hello Hongze,
The C++ implementation of dataset, notably the Dataset, DataSource,
DataSourceDiscovery, and Scanner classes, is not ready/designed for
distributed computing. They don't serialize and they reference by
pointer all around, thus I highly doubt that you can implement parts
in Java, and
Attendees:
- Micah Kornfield, Google
- Praveen Kumar, Dremio
- Todd Hendricks
- François Saint-Jacques, RStudio/Ursa Labs
Subject
- Bazel. Micah wants feedback on the PR. This is first aimed at
developer productivity, notably shorter link times and sandboxed builds.
As a first PoC, parts of the
Francois Saint-Jacques created ARROW-7272:
-
Summary: [C++][Java] JNI bridge between RecordBatch and
VectorSchemaRoot
Key: ARROW-7272
URL: https://issues.apache.org/jira/browse/ARROW-7272
On Tue, Nov 26, 2019 at 9:40 AM Maarten Breddels
wrote:
>
> On Tue, Nov 26, 2019 at 15:02, Wes McKinney wrote:
>
> > hi Maarten
> >
> > I opened https://issues.apache.org/jira/browse/ARROW-7245 in part based
> > on this.
> >
> > I think that normalizing to a common type (which would require
>
> I don't get how this is a cycle. It only means Bazel is too limited to
> distinguish between a header dependency and a C++ module?
Agreed, this isn't a true cycle, but bazel is opinionated about this (i.e.
forces workarounds). In the example I highlighted it might have been
cleaner to
Fair enough. I'm okay with the bytes approach and the proposal looks good
to me.
On Fri, Nov 8, 2019 at 11:37 AM David Li wrote:
> I've updated the proposal.
>
> On the subject of Protobuf Any vs bytes, and how to handle
> errors/metadata, I still think using bytes is preferable:
> - It doesn't
https://meet.google.com/vtm-teks-phx
I'm unable to join on account of the Thanksgiving holiday, but others
are welcome to discuss and share call notes after
On 27/11/2019 at 06:16, Micah Kornfield wrote:
>
>> Can you give an example of circular dependency? Can this be solved by
>> having more "type_fwd.h" headers for forward declarations of opaque types?
>
> I think the type_fwd.h might contribute to the problem. The solution would
> be more
The Flight compilation errors occurring in the Conda builds
are caused by a recent protobuf conda-forge update and
should be fixed by https://github.com/apache/arrow/pull/5917
On Wed, Nov 27, 2019 at 2:01 PM Crossbow wrote:
>
> Arrow Build Report for Job nightly-2019-11-27-0
>
> All tasks:
>
Krisztian Szucs created ARROW-7271:
--
Summary: [C++][Flight] Use the single parameter version of
SetTotalBytesLimit
Key: ARROW-7271
URL: https://issues.apache.org/jira/browse/ARROW-7271
Project:
Arrow Build Report for Job nightly-2019-11-27-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-27-0
Failed Tasks:
- homebrew-cpp:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-11-27-0-travis-homebrew-cpp
- test-conda-cpp:
To set up bridges between Java and C++, the C data interface
specification may help:
https://github.com/apache/arrow/pull/5442
There's an implementation for C++ here, and it also includes a Python-R
bridge able to share Arrow data between two different runtimes (i.e.
PyArrow and R-Arrow were
Hi Micah,
Regarding our use cases, we'd use the API on Parquet files with some pushed
filters and projectors, and we'd extend the C++ Datasets code to provide
necessary support for our own data formats.
> If JNI is seen as too cumbersome, another possible avenue to pursue is
> writing a gRPC
Sebastien Binet created ARROW-7270:
--
Summary: [Go] preserve CSV reading behaviour, improve memory usage
Key: ARROW-7270
URL: https://issues.apache.org/jira/browse/ARROW-7270
Project: Apache Arrow
Hi Hongze,
I have a strong preference for not porting non-trivial logic from one
language to another, especially if the main goal is performance. I think
this will replicate bugs and cause confusion if inconsistencies occur. It
is also a non-trivial amount of work to develop, review, set up CI,