paleolimbot opened a new pull request, #336:
URL: https://github.com/apache/arrow-nanoarrow/pull/336

   An early commit implemented `nanoarrow::EmptyArrayStream` and 
`nanoarrow::VectorArrayStream` in the nanoarrow.hpp C++ helpers. The intention 
at the time was to make it "easy"/idiomatic to use a C++ object to back an 
`ArrowArrayStream`; however, it was in practice difficult to actually make work 
(I tried briefly and gave up when writing a [dummy ADBC 
driver](https://github.com/voltrondata-labs/blog-posts-code/blob/main/2023-08-nanoarrow-adbc/simple_csv_reader.cc#L52-L177)
 for [this blog 
post](https://voltrondata.com/resources/nanoarrow-lightweight-embeddable-arrow-implementation-data-pipelines).
   
   This PR deprecates the original syntax and migrates it to the following.
   
   Implementation:
   
   ```cpp
   class StreamImpl {
    public:
     // Public methods (e.g., constructor) used from C++ to initialize relevant 
data
   
     // Idiomatic exporter to move data + lifecycle responsibility to an 
instance
     // managed by the ArrowArrayStream callbacks
     void ToArrayStream(struct ArrowArrayStream* out) {
       ArrayStreamFactory<StreamImpl>::InitArrayStream(new StreamImpl(...), 
out);
     }
   
    private:
     // Make relevant methods available to the ArrayStreamFactory
     friend class ArrayStreamFactory<StreamImpl>;
   
     // Method implementations (called from C, not normally interacted with 
from C++)
     int GetSchema(struct ArrowSchema* schema) { return ENOTSUP; }
     int GetNext(struct ArrowArray* array) { return ENOTSUP; }
     const char* GetLastError() { nullptr; }
   };
   ```
   
   Usage:
   
   ```cpp
   // Call constructor and/or public methods to initialize relevant data
   StreamImpl impl;
   
   // Export to ArrowArrayStream after data are finalized
   UniqueArrayStream stream;
   impl.ToArrayStream(stream.get());
   ```
   
   I'm open to suggestions on how to make that better! It might be that the 
`ToArrayStream()` bit is confusing (i.e., just use 
`ArrayStreamFactory<>::InitArrayStream(new StreamImpl())` directory) but it 
also seemed better to keep the lines where a raw pointer was floating around to 
be entirely contained within the `StreamImpl` class.
   
   It also fixes an issue with the `XXX_pointer()` functions, which because of 
the way they were declared, required that `nanoarrow.hpp` be confusingly 
included *after* `nanoarrow_ipc.hpp`. The correct way to do this (I think) was 
to declare the template and add implementations (rather than use overloads).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to