I'm struggling to get a simple Parquet writer working using the C++
library. The source is here:

https://gist.github.com/tnarg/8878a38d4a22104328c4d289319f9ac1

and I'm compiling it like so:

g++ --std=c++11 -o writer writer.cc -lparquet -larrow -larrow_io

When I run the program, I get the following error:

gmonroe@foo:~$ ./writer
terminate called after throwing an instance of 'parquet::ParquetException'
  what():  Less than the number of expected rows written in the current column chunk
Aborted (core dumped)

If I set NUM_ROWS_PER_ROW_GROUP=3, the writer succeeds. This suggests
that every column must contain N values such that N > 0 and
N % NUM_ROWS_PER_ROW_GROUP == 0. If that is right, then for a set of
values whose size is not known in advance, the only safe choice for
NUM_ROWS_PER_ROW_GROUP is 1, since 1 is the only divisor common to every N.
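
To make the constraint I think I'm seeing concrete, here is a small
standalone sketch. It makes no parquet calls at all; fills_row_groups is
a hypothetical helper name of my own, and the rule it encodes is only my
guess from the behaviour above, not something I have confirmed in the
library.

```cpp
#include <cstdint>

// Hypothetical helper, NOT part of the parquet C++ library.
// Encodes the rule I am inferring: a column is accepted only if it holds
// N > 0 values and N is an exact multiple of the row group size.
bool fills_row_groups(std::int64_t num_values,
                      std::int64_t rows_per_row_group) {
  if (rows_per_row_group <= 0) {
    return false;  // a row group must hold at least one row
  }
  return num_values > 0 && num_values % rows_per_row_group == 0;
}
```

Under this rule, fills_row_groups(n, 1) is true for every n > 0, which
is why 1 looks like the only value that always works.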

Is this a bug in the C++ library, or am I missing something in the API?

Regards,
Grant Monroe
