Hi,

How do you send the written data over the network? Do you
use raw socket(2) and write(2)? If you use a raw socket, can
you wrap it with GUnixOutputStream[1]? You can create one by
passing the file descriptor of the raw socket to
g_unix_output_stream_new()[2].

[1] https://docs.gtk.org/gio/class.UnixOutputStream.html
[2] https://docs.gtk.org/gio/ctor.UnixOutputStream.new.html

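If we can wrap the raw socket with GUnixOutputStream, we
don't need to create a GArrowBuffer for the serialized record
batches. We can pass the stream to the record batch writer
(for example via garrow_gio_output_stream_new()) and write
serialized record batches to the raw socket directly.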
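
If you keep using plain write(2) on the raw socket instead,
note that write(2) and read(2) may transfer fewer bytes than
requested, so both sides need a loop. The following is only a
minimal sketch under assumptions: write_all() and read_all()
are hypothetical helper names (they are not part of Arrow or
GLib), and fd is assumed to be a connected, blocking socket or
pipe.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly `size` bytes to `fd`.
 * Retries on short writes and on EINTR.
 * Returns 0 on success, -1 on error (errno is set by write(2)). */
int
write_all(int fd, const void *data, size_t size)
{
  const unsigned char *cursor = data;
  assert(data != NULL || size == 0);
  while (size > 0) {
    ssize_t written = write(fd, cursor, size);
    if (written < 0) {
      if (errno == EINTR) {
        continue; /* interrupted before writing anything: retry */
      }
      return -1;
    }
    cursor += written;
    size -= (size_t)written;
  }
  return 0;
}

/* Hypothetical helper: read exactly `size` bytes from `fd`.
 * Returns 0 on success, -1 on error or if the peer closed the
 * connection before `size` bytes arrived. */
int
read_all(int fd, void *data, size_t size)
{
  unsigned char *cursor = data;
  assert(data != NULL || size == 0);
  while (size > 0) {
    ssize_t n = read(fd, cursor, size);
    if (n < 0) {
      if (errno == EINTR) {
        continue; /* interrupted before reading anything: retry */
      }
      return -1;
    }
    if (n == 0) {
      return -1; /* end of stream before all bytes arrived */
    }
    cursor += n;
    size -= (size_t)n;
  }
  return 0;
}
```

With helpers like these, the sender could pass the raw bytes
obtained via g_bytes_get_data() to write_all(). The receiver
still has to know how many bytes to expect, which is one more
reason to let a stream wrapper handle this instead.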

I created examples that send and receive record batches over
the network: https://github.com/apache/arrow/pull/13590

This may help you.


Thanks,
-- 
kou

In <[email protected]>
  "Re: [C/GLib] Trying (and failing) to send RecordBatches between Client and Server in C" on Mon, 11 Jul 2022 15:48:38 +0200,
  Joel Ziegler <[email protected]> wrote:

> Man, I should stop assuming that return types with "data" in the name
> are just bytes and instead read up on the data type. Sorry about that,
> it's my first time with GLib and Arrow.
> 
> Thanks a lot for the help, again! I am able to send RecordBatches over
> the network now.
> 
> A new problem arose and I was able to solve it, but I am not sure
> whether my solution is appropriate. I would be glad if you could give
> me your opinion.
> 
> I started splitting bigger Tables into multiple RecordBatches, sending
> them over the network and reading them with
> garrow_record_batch_reader_read_next(). But I created a new
> GArrowRecordBatchStreamWriter for each RecordBatch and closed it with
> g_object_unref() before sending the data over, because I wanted to
> "close" the writer before reading from the output buffer. This led to
> the StreamReader seeing only the first RecordBatch in the stream,
> probably because the writer writes an EOS when it is closed. So I
> stopped calling g_object_unref() on the StreamWriter and just read
> from the buffer, which seems to work fine. Am I just lucky? Or is
> there another way of safely reading parts of the buffer, even though
> more RecordBatches will be written in the future?
> 
> I also wanted to ask: where can I find documentation on the usage of
> these Arrow Writer classes? The GLib classes are well documented, and
> I was just blind in not finding the information you provided, because
> of false assumptions. But I can't find the Arrow documentation that
> explains the usage of the Writer classes the way you did for me.
> 
> Sorry if I am asking too much. I am also fine if you just send some
> direction or links with which I can find the solution by myself. You
> don't have to build my code :)
> 
> 
> Sincerely, Joel Ziegler
> 
> 
> On 09.07.22 04:48, Sutou Kouhei wrote:
>> Hi,
>>
>>>      GBytes *data = garrow_buffer_get_data(GARROW_BUFFER(buffer));
>>>      gint64 length = garrow_buffer_get_size(GARROW_BUFFER(buffer));
>>>
>>>      GArrowBuffer *receivingBuffer = garrow_buffer_new(data, length);
>> The data is a GBytes *, not a const char *. You need to get the raw
>> data from the GBytes *:
>>
>>    GBytes *data = garrow_buffer_get_data(GARROW_BUFFER(buffer));
>>
>>    gsize data_size;
>>    gconstpointer data_raw = g_bytes_get_data(data, &data_size);
>>    GArrowBuffer *receivingBuffer =
>>      garrow_buffer_new(data_raw, data_size);
>>
>> And you need to call g_bytes_unref() on the data when it is
>> no longer needed:
>>
>>    g_bytes_unref(data);
>>
>>
>> Thanks,
