Hi Richard,

It is possible - I've created an example in this gist showing how to loop through a list of files and write to a Parquet file one row at a time: https://gist.github.com/thisisnic/5bdb85d2742bc318433f2f14b8bd77cf.
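Roughly, the idea is to open a ParquetFileWriter and write each one-row Table as its own row group. Here's a minimal sketch of that approach (the gist may differ in detail; load_as_row() below is a placeholder for your own nifti-to-data-frame conversion, and the file paths are made up):

library(arrow)

# Placeholder inputs: the directory name and load_as_row() stand in for
# your own code that turns one nifti file into a single-row data frame.
files <- list.files("niftis", full.names = TRUE)
load_as_row <- function(path) {
  # ... your nifti -> single-row data frame conversion ...
}

# Use the first row to fix the schema; every later row must match it.
first_row <- Table$create(load_as_row(files[[1]]))

sink <- FileOutputStream$create("all_rows.parquet")
writer <- ParquetFileWriter$create(first_row$schema, sink)

# Write the first row, then append each remaining file as its own row group.
writer$WriteTable(first_row, chunk_size = 1)
for (f in files[-1]) {
  writer$WriteTable(Table$create(load_as_row(f)), chunk_size = 1)
}

# Close the writer and the sink so the Parquet footer gets written.
writer$Close()
sink$close()

Writing one row per row group like this will make a fairly inefficient file for 30K columns, so you may want to batch a few rows per WriteTable() call, but it does avoid holding everything in memory at once.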
Does this solve your problem?

On Thu, 27 Jul 2023 at 12:22, Richard Beare <[email protected]> wrote:

> Hi arrow experts,
>
> I have what I think should be a standard problem, but I'm not seeing the
> correct solution.
>
> I have data in a nonstandard form (nifti neuroimaging files) that I can
> load into R and transform into a single-row dataframe (which is 30K
> columns). In a small example I can load about 80 of these into a single
> dataframe and save as feather or parquet without problem. I'd like to
> address the problem where I have thousands.
>
> The approach of loading a collection (e.g. 10) into a dataframe and saving
> with a hive standard name, then repeating, does work, but doesn't seem like
> the right way to do it.
>
> Is there a way to stream data, one row at a time, into a feather or
> parquet file?
> I've attempted to use write_feather with a FileOutputStream sink, but
> without luck.
