Thank you, I will look into that. The real problem is that I read the data in chunks, and the end of a chunk is often truncated (not a complete line), so I need to wait for the next chunk to complete the line.
Is there a way you would suggest to process the chunks smoothly in that case?

Thank you

On Fri, 8 Jul 2022 at 03:37, Sutou Kouhei <[email protected]> wrote:

> Answered on dev@:
> https://lists.apache.org/thread/5rpykkfoz416mq889pcpx9rwrrtjog60
>
> In <CAJdzkC04+Uxa6bdmozPQFDkQ07M4Q=fmuhh2gvqzz-na2lm...@mail.gmail.com>
>   "StreamReader" on Sat, 2 Jul 2022 16:04:45 +0200,
>   L Ait <[email protected]> wrote:
>
> > Hi,
> >
> > I need help to integrate arrow cpp in my current project. In fact I built
> > cpp library and can call api.
> >
> > What I need is that:
> >
> > I have a c++ project that reads data by chunks then uses some erasure code
> > to rebuild original data.
> >
> > The rebuild is done in chunks. At each iteration I can access a buffer of
> > rebuilt data.
> >
> > My need is to pass this data as a stream to arrow process then send the
> > processed stream.
> >
> > For example if my original file is a csv and I would like to filter and
> > save first column:
> >
> > file
> >
> > col1,col2, col3, col3
> > a1,b1,c1,d1
> > an,bn,cn,dn
> >
> > split to 6 chunks of equal sizes
> >
> > chunk1:
> >
> > a1,b1,c1,d1
> > ak,bk
> >
> > chunk2:
> >
> > ck,dk
> > ...
> > am,bm,cm,dm
> >
> > and so on.
> >
> > My question is how to use the right StreamReader in arrow and how this
> > deals with incomplete records (lines) at the beginning and end of each
> > chunk?
> >
> > Here a snippet of code I use:
> >
> > buffer_type_t res = fut.get0();
> > BOOST_LOG_TRIVIAL(trace) <<
> >     "RawxBackendReader: Got result with buffer size: " << res.size();
> > std::shared_ptr<arrow::io::InputStream> input;
> >
> > std::shared_ptr<arrow::io::BufferReader> buffer(new arrow::io::BufferReader(
> >     reinterpret_cast<const uint8_t*>(res.get()), res.size()));
> > input = buffer;
> > BOOST_LOG_TRIVIAL(trace) << "laa type input" << input.get();
> >
> > ArrowFilter arrow_filter = ArrowFilter(input);
> > arrow_filter.ToCsv();
> >
> > result.push_back(std::move(res));
> >
> > Thank you
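
For illustration, the "wait for the next chunk" idea from the top of this message could be sketched as a small carry-over buffer. This is only a sketch under stated assumptions: `LineCarryBuffer`, `Feed` and `Flush` are hypothetical names and not part of Arrow, the code uses only the C++ standard library, and it assumes records are separated by `'\n'` with no newlines inside quoted CSV fields. Each chunk is appended to the pending data, everything up to the last newline is released, and the truncated tail is kept for the next call.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not an Arrow class): accumulates raw chunks and
// releases only the bytes that form complete, newline-terminated lines.
// Assumption: no quoted CSV field contains an embedded newline.
class LineCarryBuffer {
 public:
  // Appends a chunk and returns all complete lines buffered so far as one
  // block ending in '\n'. A trailing partial line stays buffered.
  std::string Feed(std::string_view chunk) {
    pending_.append(chunk.data(), chunk.size());
    const std::size_t last_newline = pending_.rfind('\n');
    if (last_newline == std::string::npos) {
      return {};  // no complete line yet, keep waiting
    }
    std::string complete = pending_.substr(0, last_newline + 1);
    pending_.erase(0, last_newline + 1);
    return complete;
  }

  // Returns whatever remains after the final chunk, for example a last
  // line that has no terminating newline, and empties the buffer.
  std::string Flush() {
    std::string rest = std::move(pending_);
    pending_.clear();
    return rest;
  }

 private:
  std::string pending_;  // truncated tail carried over between chunks
};
```

Each non-empty block returned by `Feed` contains only whole lines, so it could then be wrapped in an `arrow::io::BufferReader` as in the quoted snippet. After the last chunk, `Flush` returns any unterminated final line.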
