And a part of the issue there is that the format specifies that some of that metadata is executable Python.
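(For context on that point: the .npy header embeds the array metadata as a Python dict literal, which NumPy's own reader evaluates. A minimal sketch of reading that header without executing arbitrary code — using `ast.literal_eval` rather than `eval` — assuming a version 1.0 or 2.0 file; `read_npy_header` is a hypothetical helper name, not part of any library:)

```python
# Sketch: parse a .npy header safely, without executing the metadata.
# Assumes format version 1.0 (2-byte header length) or 2.0 (4-byte).
import ast
import struct

def read_npy_header(f):
    magic = f.read(6)
    if magic != b'\x93NUMPY':
        raise ValueError('not a .npy file')
    major, minor = f.read(2)          # two version bytes
    if major == 1:
        (hlen,) = struct.unpack('<H', f.read(2))
    else:                             # version 2.0+ uses a 4-byte length
        (hlen,) = struct.unpack('<I', f.read(4))
    header = f.read(hlen).decode('latin1')
    # The header text is a dict literal like
    # {'descr': '<f8', 'fortran_order': False, 'shape': (3, 4), }
    # literal_eval parses it without running arbitrary Python.
    return ast.literal_eval(header)
```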
(I took a look at implementing this myself, in a library, ran into that issue, and decided to spend my time elsewhere.)

That said, I wouldn't worry about the efficiency of memory allocation for something that's parsing a file. J routinely makes copies of data while operating on it, and that has proved to be significantly faster than naive optimization approaches would suggest. It turns out that cache coherency matters a lot more than micro-optimizations.

For something like this, it's much more important that it works correctly than that it be "fast". Otherwise you just get something that does the wrong thing, really fast.

Put differently: you should first parse the file into a representation that has all the information you need to send a serialized form of the file to J. The overhead of reading the file will be orders of magnitude greater than the cost of a serialize/deserialize step. And maintaining clean interfaces is critical to the long-term survival of a system like this.

Good luck,

--
Raul

On Mon, Oct 5, 2020 at 12:48 AM 'Zhihao Yuan' via Source <[email protected]> wrote:
>
> On Fri, Oct 2, 2020 at 12:45 PM Alex Shroyer <[email protected]> wrote:
>
> > My 2 cents, this should not be part of the J engine, but implemented as a
> > library.
>
> It appears that NumPy itself implements .npy conversion as a library:
>
> https://github.com/numpy/numpy/blob/master/numpy/lib/format.py
>
> A part of the reason is that since the code is
> written in Python, they don't need to write a
> parser or serializer to handle .npy metadata,
> but we do.
>
> --
> Zhihao
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
