I think this unlocks a bunch of use cases. My guess is that people are generally using Arrow in simpler, non-streaming ways right now, hence the quiet. An iterator pattern is the logical next step as you move to streams of smaller chunks (common in distributed and multi-tenant systems).
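To make the iterator idea concrete, here is a minimal sketch in C of a producer that hands out a fixed number of chunks and a consumer that drains them. It is not the proposed ABI: `SketchArray` and `SketchStream` are hypothetical stand-ins for `ArrowArray` and `ArrowArrayStream`, and `make_chunk_stream` and `drain_total_length` are names invented for illustration. Two things are assumptions of this sketch rather than anything settled in the thread quoted below: each callback receives the stream itself (so it can reach `private_data`), and end of stream is signalled by an output array whose `release` is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ArrowArray: only what this sketch needs. */
struct SketchArray {
  int64_t length;
  void (*release)(struct SketchArray*); /* NULL once released / at end of stream */
};

/* Hypothetical variant of the proposed stream struct. Assumption: callbacks
   take the stream as their first argument so they can read private_data. */
struct SketchStream {
  int (*get_next)(struct SketchStream*, struct SketchArray* out); /* 0 = success */
  void (*release)(struct SketchStream*);
  void* private_data;
};

/* Producer-side state: how many chunks are left and how long each one is. */
struct ChunkState {
  int remaining;
  int64_t chunk_length;
};

static void sketch_array_release(struct SketchArray* array) {
  array->release = NULL; /* mark the array as released */
}

static int chunks_get_next(struct SketchStream* stream, struct SketchArray* out) {
  struct ChunkState* state = stream->private_data;
  if (state->remaining == 0) {
    out->length = 0;
    out->release = NULL; /* assumption: a released array means end of stream */
    return 0;
  }
  state->remaining--;
  out->length = state->chunk_length;
  out->release = sketch_array_release;
  return 0;
}

static void chunks_release(struct SketchStream* stream) {
  free(stream->private_data);
  stream->private_data = NULL;
  stream->release = NULL; /* mark the stream as released */
}

/* Producer: fill the struct with pointers to its own implementations. */
int make_chunk_stream(struct SketchStream* out, int n_chunks, int64_t chunk_length) {
  struct ChunkState* state = malloc(sizeof *state);
  if (state == NULL) return -1;
  state->remaining = n_chunks;
  state->chunk_length = chunk_length;
  out->get_next = chunks_get_next;
  out->release = chunks_release;
  out->private_data = state;
  return 0;
}

/* Consumer: pull chunks until end of stream, then release the stream.
   Returns the summed chunk lengths, or -1 if get_next reported an error. */
int64_t drain_total_length(struct SketchStream* stream) {
  assert(stream->release != NULL);
  int64_t total = 0;
  int status = 0;
  for (;;) {
    struct SketchArray chunk;
    status = stream->get_next(stream, &chunk);
    if (status != 0 || chunk.release == NULL) break;
    total += chunk.length;
    chunk.release(&chunk);
  }
  stream->release(stream);
  return status == 0 ? total : -1;
}
```

A database driver would play the producer role here, and the consumer loop is all a client needs to know: it never sees how the chunks are generated, only the function pointers.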

On Mon, Aug 10, 2020 at 11:56 AM Wes McKinney <wesmck...@gmail.com> wrote:

> I'm still in need of it. I'd be interested in developing a solution
> that can be used in some database APIs, e.g. using it for the result
> interface for an embedded SQL database like SQLite or DuckDB would be
> an interesting motivating use case.
>
> One approach would be to create something unofficial and used only in
> the C++ library's implementation of the C API so that it can make
> breaking changes for a time and then propose to formalize it in the
> ABI later.
>
> On Mon, Aug 10, 2020 at 9:22 AM Antoine Pitrou <solip...@pitrou.net> wrote:
> >
> > From the absence of response, it would seem there isn't much interest
> > in this. Please speak up if you think this would be useful to you.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Tue, 7 Jul 2020 07:49:17 -0500
> > Wes McKinney <wesmck...@gmail.com> wrote:
> > > Any opinions about this? It seems the next steps would be a concrete
> > > API proposal and perhaps a reference implementation thereof.
> > >
> > > On Sun, Jun 28, 2020 at 11:26 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > > In ARROW-8301 [1] and elsewhere we've been discussing how to
> > > > communicate what amounts to a sequence of arrays or a sequence of
> > > > RecordBatch objects using the C data interface.
> > > >
> > > > Example use cases:
> > > >
> > > > * Returning a sequence of record / row batches from a database driver
> > > > * Sending a C++ arrow::ChunkedArray or arrow::Table to a consumer
> > > >   using only the C interface
> > > >
> > > > Applications could define their own custom iterator interfaces to
> > > > communicate what amounts to a sequence of the ArrowArray C interface
> > > > objects, but it is likely a common enough use case to have an
> > > > off-the-shelf solution so that we can support this solution in our
> > > > reference libraries (e.g. Arrow C++, pyarrow, Arrow R)
> > > >
> > > > I suggested a C structure as follows
> > > >
> > > > struct ArrowArrayStream {
> > > >   void (*get_schema)(struct ArrowSchema*);
> > > >   // Non-zero return value indicates an error?
> > > >   int (*get_next)(struct ArrowArray*);
> > > >   void (*get_error)(... ERROR HANDLING TODO );
> > > >   void (*release)(struct ArrowArrayStream*);
> > > >   void* private_data;
> > > > };
> > > >
> > > > The producer would populate this object with pointers to its
> > > > implementations of these functions.
> > > >
> > > > Thoughts about this?
> > > >
> > > > Thanks,
> > > > Wes
> > > >
> > > > [1]: https://issues.apache.org/jira/browse/ARROW-8301