Markus Metz wrote:

> I have read "The issues" and understand the problem, but some sort of
> implementation of G_fseek and G_ftell is needed, otherwise modules
> and libraries need a workaround like the iostream library is doing
> now. Instead of having many (potentially different) workarounds, one
> proper solution is preferable. This may not be easy, and as much as I
> like tackling difficult problems, here I can only say: please do it!

I have added G_fseek() and G_ftell() to 7.0 in r35818.
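For illustration, a minimal sketch of what such wrappers can look
like, assuming a POSIX system where fseeko()/ftello() take a 64-bit
off_t (e.g. compiled with _FILE_OFFSET_BITS=64). This is only the
general shape, not the code committed in r35818, which may differ and
also has to cover platforms without fseeko()/ftello():

#include <stdio.h>
#include <sys/types.h>
#include <grass/gis.h>

/* seek in a file using an off_t offset, so positions beyond 2GiB
 * are representable when off_t is 64 bits */
void G_fseek(FILE *fp, off_t offset, int whence)
{
    if (fseeko(fp, offset, whence) != 0)
        G_fatal_error("G_fseek(): unable to seek");
}

/* return the current file position as an off_t */
off_t G_ftell(FILE *fp)
{
    off_t pos = ftello(fp);

    if (pos < 0)
        G_fatal_error("G_ftell(): unable to get file position");

    return pos;
}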
> >> As you suggested, 2 32bit reads can be done, and depending on the
> >> endian-ness of the host system either the high word value or the
> >> low word value used.
> >
> > The low word is always used. That might be the first word or the
> > second word, but it's always the low word.
>
> I got confused by this endian-ness and confused low/high word with
> first/second word. With the current code, the low word would be the
> second word when doing 2 32bit reads on a 64bit sized buffer,
> independently of any endian-ness mismatch. In this case, the libs
> would have to check whether the high word is != 0 and then exit with
> an ERROR message, right?

Right. The files are always written big-endian, so the high word will
always be first in the file. As well as checking that the high word is
zero, you also need to check that the low word is <= 0x7fffffff (off_t
is signed, hence the limit is 2GiB rather than 4GiB).
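A minimal sketch of that read path, assuming a host with a 32-bit
off_t; the helper names (read_u32_be(), read_offset_32()) are made up
for illustration, and the real port-reading code in diglib is
organised differently:

#include <stdio.h>
#include <sys/types.h>
#include <grass/gis.h>

/* read one 32-bit word stored big-endian in the file */
static unsigned long read_u32_be(FILE *fp)
{
    unsigned char b[4];

    if (fread(b, 1, 4, fp) != 4)
        G_fatal_error("unexpected end of file");

    return ((unsigned long) b[0] << 24) | ((unsigned long) b[1] << 16) |
           ((unsigned long) b[2] << 8)  |  (unsigned long) b[3];
}

/* read a 64-bit offset as two 32-bit words; the high word comes
 * first in the file, and the value must fit in a signed 32-bit
 * off_t, i.e. high word zero and low word <= 0x7fffffff */
off_t read_offset_32(FILE *fp)
{
    unsigned long hi = read_u32_be(fp);
    unsigned long lo = read_u32_be(fp);

    if (hi != 0 || lo > 0x7fffffffUL)
        G_fatal_error("offset exceeds 2GiB: large file support "
                      "(64-bit off_t) is required to read this file");

    return (off_t) lo;
}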
> >> When writing offsets, it would be easiest (also safest?) to always
> >> use sizeof(off_t) of the libs. There will be no mix of different
> >> offset sizes because topo and cidx are currently written anew when
> >> the vector is updated.
> >
> > It would be both easiest and safest. Although it would be
> > preferable to use 32 bits if that is known to be sufficient, I
> > don't know whether this is feasible.
>
> I don't think so. With v.in.ogr, you have no chance to estimate the
> coor file size. Coming back to my test shapefile for v.in.ogr with a
> total size below 5MB, that thing results in a coor file > 8GB with
> cleaning and > 4GB without cleaning. When working on a GRASS vector,
> each module would have to estimate the increase of the coor file.
> Most modules copy the input vector to the output vector, do the
> requested modifications on the output vector and write out the output
> vector. You would have to do some very educated guessing about the
> size of the final coor file, considering the expected number of dead
> lines and the expected number of additional vertices, to decide
> whether a 32bit off_t would be sufficient. Instead I would prefer to
> use 64 bits whenever possible. Personally, I would regard 32bit
> support as a courtesy, but please don't start a discussion about
> that.

The issue is whether the coor file size is known at the point where
you start writing the topo/cidx files. If the files are generated
concurrently, then it isn't feasible. If the coor file is generated
first, then it is.

-- 
Glynn Clements <[email protected]>