Every format has its problems, and you need to know the tradeoffs to make 
the right choice.

If speed of serialization and deserialization is important, and you don't 
intend to share the file or hold on to it for an extended period of time, 
the native Julia serialization format should be your best bet. It is 
designed specifically for Julia, and does not require conversion to match 
different type systems.
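
To make the first case concrete, here is a minimal sketch of a round trip 
through the native format. It assumes a Julia 1.x release where serialize 
and deserialize live in the Serialization standard library and accept a 
file name; the sample data and the temporary path are made up for 
illustration.

```julia
using Serialization  # standard library in Julia 1.x (assumption)

# Hypothetical sample data: a Dict holding two differently typed arrays.
data = Dict("ids" => UInt8[1, 2, 3], "values" => [1.5, 2.5])

path = tempname()             # throwaway file for this sketch
serialize(path, data)         # write in Julia's native format
restored = deserialize(path)  # read it back in the same session

# The element types come back as they went in, with no conversion step.
@show typeof(restored["ids"])
@show typeof(restored["values"])
```

Nothing has to be mapped onto another type system here, which is what 
makes this convenient for short-lived files.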

If you need portability and medium-term storage, if speed matters, or if 
you want the reloaded data to preserve type information and other metadata, 
you should use a standardized binary format, such as HDF5, that promises to 
maintain backwards compatibility.

If you want to store and share the data indefinitely, and use it on any 
platform, you should go with a simple format that is easy to write a 
parser for. CSV is a common choice, as are JSON and XML. 
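
The price of such a simple format is that the text carries no type 
information. A small sketch of that, assuming a Julia installation where 
the DelimitedFiles package (writedlm and readdlm) is available; the array 
and the temporary path are made up for illustration.

```julia
using DelimitedFiles  # provides writedlm/readdlm (assumed to be installed)

original = UInt8[1, 2, 3]      # hypothetical sample data

path = tempname()              # throwaway file for this sketch
writedlm(path, original, ',')  # write the values as plain text
reloaded = readdlm(path, ',')  # read back without naming a type

# The values survive, but the element type is inferred from the text,
# so it need not match the type that was written.
@show eltype(original)
@show eltype(reloaded)
```

Whoever reads the file has to know the intended type from somewhere else, 
for example from documentation that travels with the data.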

Ivar

On Thursday, 3 April 2014 at 12:25:05 UTC+2, Tim Holy wrote:
>
> Two problems with CSV: 
> - parsing text is slow compared to binary 
> - csv doesn't support arbitrary user-defined types. 
>
> CSV is fine for arrays of numbers and other simple structures, although 
> I'm not sure whether you can reliably distinguish between a 
> Vector{Uint8} and a Vector{Int}. So even for simple objects your types 
> might change when you read it back in. 
>
> --Tim 
>
> On Thursday, April 03, 2014 01:16:03 AM Ivar Nesje wrote: 
> > Portable and open source are definitely good properties to ensure 
> > long-term trust. If the standard has several implementations, it helps 
> > to ensure that the standard is consistent and usable in different 
> > settings, and discourages incompatible changes. It is also a good sign 
> > of popularity. I have not looked at HDF5, but I suspect there are many 
> > settings where a zipped directory of CSV files is easier to use, as 
> > well as cases where the opposite is true. As we don't bundle the 
> > library with Julia, installation is an additional issue. 
> > 
> > In any case, this is only relevant for a long-term STORAGE format. That 
> > is a different requirement from a file used to transfer data between 
> > machines, or between work sessions on different days. 
>
