I think it is OK to have a separate PR for random nested data generation. We wanted to do this for Parquet as well, but never got to it. Instead, we constructed a very detailed set of nesting-level tests.
On Sun, Jan 31, 2021 at 9:12 PM Ying Zhou <yzhou7...@gmail.com> wrote:

> Hi,
>
> As a part of the process of reducing test size in this pull request
> https://github.com/apache/arrow/pull/8648 which contains the ORC writer
> for C++ and Python I wrote a random chunked array generator and a random
> table generator. To reduce test size to ideal levels it will be necessary
> to improve arrow::random::RandomArrayGenerator::ArrayOf to support nested
> types. I really don’t think such work really belongs to the ORC writer PR.
> Shall I first try to get this PR to pass and then file a separate one with
> improvements in arrow/testing/random or shall I file them together as one
> PR? Thanks!
>
> Ying