On 10 Jan 2023, at 18:36, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
> It sounds interesting. Are you writing and reading ORC files programmatically via the ORC library? Or do you use Spark/Flink/PyArrow/Dask?

Writing will be done using the ORC C++ library. Reading will be done using Spark with Python.

//hinko

> Dongjoon
>
> On Tue, Jan 10, 2023 at 8:23 AM Hinko Kocevar <hinko.koce...@ess.eu> wrote:
>> I would like to use an ORC file to hold several columns of data. One of the columns will be a list (array) of floats that could span 10 000 - 50 000 elements in length. The other columns will not be lists, but will be of different data types.
>>
>> Is having such long lists in any way an issue, in terms of performance or otherwise, for the ORC file?
>>
>> Thank you in advance!
>> //Hinko
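
To put rough numbers on the list sizes mentioned in the question, here is a back-of-the-envelope sketch in Python. It only estimates the raw, uncompressed payload of the list column per row, and how many such rows would fit in one stripe. The 4-byte element width (32-bit floats) and the 64 MiB stripe size are assumptions made for illustration, not values taken from this thread, and the helper names are hypothetical. Actual on-disk sizes depend on encoding, compression, and the writer options in use.

```python
# Rough, uncompressed size estimate for a list-of-floats column.
# Assumptions (not from the thread): 32-bit floats and a 64 MiB stripe.
FLOAT_BYTES = 4                          # use 8 if the elements are doubles
ASSUMED_STRIPE_BYTES = 64 * 1024 * 1024  # check your writer options


def list_bytes_per_row(n_elements: int, elem_bytes: int = FLOAT_BYTES) -> int:
    """Raw bytes needed for one row's list, ignoring lengths and encoding."""
    return n_elements * elem_bytes


def rows_per_stripe(
    n_elements: int,
    stripe_bytes: int = ASSUMED_STRIPE_BYTES,
    elem_bytes: int = FLOAT_BYTES,
) -> int:
    """Whole rows of raw list data that fit in one stripe of the assumed size."""
    return stripe_bytes // list_bytes_per_row(n_elements, elem_bytes)


if __name__ == "__main__":
    for n in (10_000, 50_000):
        print(n, list_bytes_per_row(n), rows_per_stripe(n))
```

Under these assumptions a single row carries roughly 40 kB to 200 kB of raw list data, so a stripe would hold on the order of a few hundred to a couple of thousand rows before compression.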