Hi all,

I've made the pip/conda package npy-append-array for exactly this purpose, see

https://github.com/xor2k/npy-append-array

It works with one-dimensional arrays, too, of course. The key challenge is to 
initialize the header properly and keep it up to date as the array grows, which 
my module takes care of. I'd like to integrate this functionality directly into 
NumPy, see PR

https://github.com/numpy/numpy/pull/20321/

but I have been busy and have not received any feedback recently. A more 
direct integration into NumPy would make it possible to skip or simplify the 
header update, e.g. by introducing a new file format version. This could turn 
.npy into a sort of binary CSV equivalent where the size of the array is 
determined by the file size.
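
To illustrate the header handling, here is a minimal sketch. It is not the 
actual npy-append-array code, and the names header_bytes and append_chunks are 
made up for this example. It assumes one-dimensional chunks and relies on the 
padding NumPy adds to the header, so the final shape can be written in place 
as long as the header length does not change.

```python
import io

import numpy as np
from numpy.lib import format as npy_format


def header_bytes(dtype, shape):
    """Return the complete .npy (version 2.0) header for dtype and shape."""
    buf = io.BytesIO()
    npy_format.write_array_header_2_0(buf, {
        "descr": npy_format.dtype_to_descr(np.dtype(dtype)),
        "fortran_order": False,
        "shape": shape,
    })
    return buf.getvalue()


def append_chunks(path, chunks, dtype=np.float64):
    """Write 1-D chunks one after another, then fix up the shape in the header."""
    total = 0
    with open(path, "wb") as fp:
        # Placeholder header; the real length is not known yet.
        placeholder = header_bytes(dtype, (0,))
        fp.write(placeholder)
        for chunk in chunks:
            chunk = np.ascontiguousarray(chunk, dtype=dtype)
            chunk.tofile(fp)  # raw data only, no per-chunk header
            total += chunk.shape[0]
        final = header_bytes(dtype, (total,))
        # Assumption of this sketch: the padded header keeps its length.
        # If it does not, overwriting it would clobber the data.
        if len(final) != len(placeholder):
            raise RuntimeError("header length changed; cannot update in place")
        fp.seek(0)
        fp.write(final)
    return total
```

If the final length is known up front, the header can be written once with 
the correct shape and the rewrite at the end is not needed.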

Best, Michael

> On 24. Aug 2022, at 03:04, Robert Kern <robert.k...@gmail.com> wrote:
> 
> On Tue, Aug 23, 2022 at 8:47 PM <bross_phobr...@sonic.net> wrote:
>> I want to calc multiple ndarrays at once and lack memory, so want to write 
>> in chunks (here sized to GPU batch capacity). It seems there should be an 
>> interface to write the header, then write a number of elements cyclically, 
>> then add any closing rubric and close the file. 
>> 
>> Is it as simple as lib.format.write_array_header_2_0(fp, d) 
>> then writing multiple shape(N,) arrays of float by fp.write(item.tobytes())?
>  
> `item.tofile(fp)` is more efficient, but yes, that's the basic scheme. There 
> is no footer after the data.
> 
> The alternative is to use `np.lib.format.open_memmap(filename, mode='w+', 
> dtype=dtype, shape=shape)`, then assign slices sequentially to the returned 
> memory-mapped array. A memory-mapped array is usually going to be friendlier 
> to whatever memory limits you are running into than a nominally "in-memory" 
> array.
> 
> -- 
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: michael.sieber...@gmail.com