Re: Flight/FlightSQL Optimization for Small Results?

2022-03-04 Thread Micah Kornfield
I put together a straw-man proposal in PR [1] for the Flight changes. Ultimately, based on the use-cases discussed, inlining the data on the Ticket seemed to make the most sense. This might be overly complex (I'm not sure how I feel about an enum indicating partial vs full results) but welcome

Re: [FlightSQL] Structured/Serialized representation of query (like JSON) rather than SQL string possible?

2022-03-04 Thread Gavin Ray
To touch on the question about supported features -- is it possible to advertise arbitrary/custom "capabilities" in GetSqlInfo? Say that you want to represent some set of behaviors that FlightSQL services can support. Stuff like "Supports grouping by multiple distinct aggregates", "Supports

Integration test failing C++ / Rust on master (was Re: [Rust] Arrow 10.0.0 release)

2022-03-04 Thread Andrew Lamb
There appears to be an integration test failure happening in the arrow repository[1] as well as in the arrow-rs and arrow2 repositories. Given the data I have collected so far it seems potentially related to third-party dependencies (either in C++ or Rust). More details on the ticket[2]. Has

Re: [FlightSQL] Structured/Serialized representation of query (like JSON) rather than SQL string possible?

2022-03-04 Thread David Li
We could also add, say, CommandSubstraitQuery as a distinct message, and older servers would just reject it as an unknown request type. -David On Fri, Mar 4, 2022, at 17:01, Micah Kornfield wrote: >> >> 1. How does a server report that it supports each command type? Initial >> thought is a
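
The compatibility behavior described above can be sketched as follows. This is a minimal illustration, not the Flight SQL implementation: the Command class, make_server, UnknownCommandError, and the handler table are all hypothetical names, and only the message names CommandStatementQuery and CommandSubstraitQuery come from the thread. The point is that a server which dispatches on the command type rejects a type it does not know, so an older server fails cleanly when it receives the new message.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class UnknownCommandError(Exception):
    """Raised when a server receives a command type it does not recognize."""


@dataclass
class Command:
    # Hypothetical stand-in for a typed protobuf message.
    type_name: str
    payload: bytes


def make_server(handlers: Dict[str, Callable[[bytes], str]]):
    """Build a dispatch function from a table of supported command types."""
    def handle(cmd: Command) -> str:
        handler = handlers.get(cmd.type_name)
        if handler is None:
            # An older server has no entry for newer message types.
            raise UnknownCommandError(cmd.type_name)
        return handler(cmd.payload)
    return handle


# Older server: only knows SQL-string queries.
old_server = make_server({
    "CommandStatementQuery": lambda p: "sql:" + p.decode(),
})

# Newer server: also accepts the proposed distinct Substrait message.
new_server = make_server({
    "CommandStatementQuery": lambda p: "sql:" + p.decode(),
    "CommandSubstraitQuery": lambda p: "substrait:%d bytes" % len(p),
})
```

A client could fall back to a SQL string when the older server raises UnknownCommandError, though how a client should discover support in advance is exactly the open question in this thread.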

Re: [FlightSQL] Structured/Serialized representation of query (like JSON) rather than SQL string possible?

2022-03-04 Thread Micah Kornfield
> > 1. How does a server report that it supports each command type? Initial > thought is a property in GetSqlInfo. This sounds reasonable. > What happens to client code written prior to changing the command type > to be a oneOf field? Same for servers. It is transparent from older clients

Re: [FlightSQL] Structured/Serialized representation of query (like JSON) rather than SQL string possible?

2022-03-04 Thread James Duong
It sounds like an interesting and useful project to use Substrait as an alternative to SQL strings. Important aspects to spec out are: 1. How does a server report that it supports each command type? Initial thought is a property in GetSqlInfo. 2. What happens to client code written prior to

Re: [C++] Ways to make CMake more robust across environments?

2022-03-04 Thread Will Jones
Thanks for the reply Antoine. Admittedly, not the good news part of me wanted to hear, but aligned with what I expected. Just wanted to confirm with the community. > And I'm not sure we want to expand the scope of our CI testing even > more, unless we also have motivated people to fix the

[Rust] Arrow 10.0.0 release

2022-03-04 Thread Andrew Lamb
Today is the normal day we would create a release candidate for the next version of arrow-rs. We are tracking progress of the 10.0.0 release on [1]. Unfortunately, the integration tests have started failing [2] on our CI jobs, which I think is blocking the release. I am looking into it, but would

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
> On Mar 4, 2022, at 9:08 AM, Antoine Pitrou wrote: > > > I opened https://issues.apache.org/jira/browse/ARROW-15846 > Regards > > Antoine. > > > On 04/03/2022 at 15:05, Antoine Pitrou wrote: >> On 04/03/2022 at 15:01, Hanqi Wu wrote: >>> Hi Antoine, >>> >>> I agree n_buffers should

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
I opened https://issues.apache.org/jira/browse/ARROW-15846 Regards Antoine. On 04/03/2022 at 15:05, Antoine Pitrou wrote: On 04/03/2022 at 15:01, Hanqi Wu wrote: Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
On 04/03/2022 at 15:01, Hanqi Wu wrote: Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if there are no null values in a struct array. This is what confuses me. "A struct array does not have any additional allocated physical

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if there are no null values in a struct array. This is what confuses me. "A struct array does not have any additional allocated physical storage for its values. A struct array must

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
Hi Hanqi, On 04/03/2022 at 14:53, Hanqi Wu wrote: Hi Antoine, I agree. But my question is for Arrow StructArray with no null values. In this case, as per the documentation, n_buffers should be set to 0. Well, no. As I said, it should still be 1. You can also take a look at the fields
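
To make the point above concrete, here is a minimal sketch using Python's ctypes. It mirrors the field layout of the ArrowArray struct from the Arrow C data interface and fills in a header for a struct array with no nulls: n_buffers stays 1 (the validity bitmap slot), and the pointer in that slot is NULL. The helper name make_struct_array_header is hypothetical, the release callback is left as an opaque pointer, and children are omitted (a zero-field struct) to keep the sketch short, so this is an illustration of the layout rather than something ready to pass to an importer.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """Field layout of struct ArrowArray in the Arrow C data interface."""


ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.c_void_p),  # function pointer, left opaque in this sketch
    ("private_data", ctypes.c_void_p),
]


def make_struct_array_header(length):
    """Hypothetical helper: header for a struct array without nulls."""
    # One buffer slot for the validity bitmap; the pointer itself is NULL
    # because there are no nulls to record.
    buffers = (ctypes.c_void_p * 1)(None)
    arr = ArrowArray()
    arr.length = length
    arr.null_count = 0
    arr.offset = 0
    arr.n_buffers = 1   # still 1, not 0
    arr.n_children = 0  # assumption: zero-field struct, children omitted
    arr.buffers = ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p))
    # Return the backing array too so the caller keeps it alive.
    return arr, buffers


header, _keepalive = make_struct_array_header(3)
print(header.n_buffers, header.buffers[0])
```

The distinction is between the number of buffer slots (fixed by the layout) and whether a given slot points at allocated memory (optional for the validity bitmap when null_count is 0).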

Re: [C++] Ways to make CMake more robust across environments?

2022-03-04 Thread Antoine Pitrou
Hello Will, On 04/03/2022 at 01:27, Will Jones wrote: I've come across several different environments where Arrow either fails to configure with CMake or fails to link libraries. Some recent examples I've come across: * (Just fixed [1]) Windows, RTools4 (MSYS2), Debug, dynamic libraries

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
Hi Antoine, I agree. But my question is for Arrow StructArray with no null values. In this case, as per the documentation, n_buffers should be set to 0. However, “import_from_c” expects StructArray to always have at least 1 buffer allocated, otherwise it throws an exception. Best, Hanqi >

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
On 04/03/2022 at 04:17, Hanqi Wu wrote: Hello community, As per the below documentation, for an Arrow StructArray, it won’t have any physical buffers backing it if it doesn’t contain any null value: https://arrow.apache.org/docs/format/Columnar.html#struct-layout However, in PyArrow, it

[PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
Hello community, As per the below documentation, for an Arrow StructArray, it won’t have any physical buffers backing it if it doesn’t contain any null value: https://arrow.apache.org/docs/format/Columnar.html#struct-layout However, in PyArrow, it complains if you try to import from C an

Re: [Rust] Encourage all community members to review PRs

2022-03-04 Thread Andrew Lamb
Thank you xudong, that is a great perspective. A little more backstory is that I am acutely aware of the backup of PRs in need of review in DataFusion, and less so in Arrow. As the contributions grow, we also need to grow our capacity to review them. We are working in various ways to grow our