Thank you, I will look into that.

The real problem is that I read the data in chunks, and the end of each
chunk can be truncated (not a complete line). I have to wait for the next
chunk to get the rest of the line.

Is there a way you would suggest to process the chunks smoothly, one at a
time, as they arrive?
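
To make the problem concrete, here is a minimal sketch of the carry-over
idea in plain C++: keep the trailing partial line of each chunk and prepend
it to the next one, so that only complete lines are passed on. LineAssembler,
Feed and Finish are hypothetical names, not part of Arrow, and the sketch
assumes records end with '\n' and that no quoted CSV field contains a
newline.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not an Arrow class): accumulates raw chunks and
// returns only the bytes that form complete, newline-terminated lines.
// Assumption: no quoted CSV field contains an embedded '\n'.
class LineAssembler {
 public:
  // Appends a chunk and returns every complete line available so far,
  // each with its trailing '\n'. A trailing partial line stays buffered
  // until a later chunk completes it.
  std::string Feed(std::string_view chunk) {
    pending_.append(chunk.data(), chunk.size());
    const std::size_t last_newline = pending_.rfind('\n');
    if (last_newline == std::string::npos) {
      return {};  // no complete line yet, keep everything buffered
    }
    std::string complete = pending_.substr(0, last_newline + 1);
    pending_.erase(0, last_newline + 1);
    return complete;
  }

  // Call once after the last chunk: returns whatever is still buffered
  // (a final line that had no trailing '\n'), or an empty string.
  std::string Finish() {
    std::string rest = std::move(pending_);
    pending_.clear();
    return rest;
  }

 private:
  std::string pending_;  // bytes of the current incomplete line
};
```

With the example file, feeding "a1,b1,c1,d1\nak,bk" would return only the
first line and hold back "ak,bk" until the next chunk supplies ",ck,dk\n".
Each returned block could then be wrapped in an arrow::io::BufferReader as
in my snippet below. This copies the data once per chunk, which may or may
not be acceptable.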

Thank you


On Fri, 8 Jul 2022 at 03:37, Sutou Kouhei <[email protected]> wrote:

> Answered on dev@:
> https://lists.apache.org/thread/5rpykkfoz416mq889pcpx9rwrrtjog60
>
> In <CAJdzkC04+Uxa6bdmozPQFDkQ07M4Q=fmuhh2gvqzz-na2lm...@mail.gmail.com>
>   "StreamReader" on Sat, 2 Jul 2022 16:04:45 +0200,
>   L Ait <[email protected]> wrote:
>
> > Hi,
> >
> > I need help to integrate arrow cpp in my current project. In fact I built
> > cpp library and can call api.
> >
> > What I need is that:
> >
> > I have a C++ project that reads data by chunks then uses some erasure code
> > to rebuild original data.
> >
> > The rebuild is done in chunks. At each iteration I can access a buffer of
> > rebuilt data.
> >
> > My need is to pass this data as a stream to arrow process then send the
> > processed stream.
> >
> > For example if my original file is a csv and I would like to filter and
> > save first column:
> >
> > file
> >
> > col1,col2, col3, col3
> > a1,b1,c1,d1
> > an,bn,cn,dn
> >
> > split to 6 chunks of equal sizes
> >
> > chunk1:
> >
> > a1,b1,c1,d1
> > ak,bk
> >
> > chunk2:
> >
> > ck,dk
> > ...
> > am,bm,cm,dm
> >
> > and so on.
> >
> > My question is how to use the right StreamReader in arrow and how this
> > deals with incomplete records (lines) at the beginning and end of each
> > chunk?
> >
> > Here a snippet of code I use :
> > buffer_type_t res = fut.get0();
> > BOOST_LOG_TRIVIAL(trace) <<
> > "RawxBackendReader: Got result with buffer size: " << res.size();
> > std::shared_ptr<arrow::io::InputStream> input;
> >
> > std::shared_ptr<arrow::io::BufferReader> buffer(new arrow::io::BufferReader(
> > reinterpret_cast<const uint8_t*>(res.get()), res.size()));
> > input = buffer;
> > BOOST_LOG_TRIVIAL(trace) << "laa type input" << input.get();
> >
> > ArrowFilter arrow_filter = ArrowFilter(input);
> > arrow_filter.ToCsv();
> >
> >
> > result.push_back(std::move(res));
> >
> > Thank you
>
