On Monday 06 April 2015 23:11:21, Dmitriy Baryshnikov wrote:
> Hi Even,
>
> It seems to me that this duplicates RFC 50: OGR field subtypes.
> For example, we have the master field type DateTime and the subtype Year.
> So the internal structure for date/time representation may be adapted to
> such a technique.

The subtype is defined at the field definition level. In all the formats we currently handle, the date/time precision is only known when reading values (and it might differ between records), that is, after the layer and field definitions have been created.

>
> Best regards,
> Dmitry
>
> 06.04.2015 15:02, Even Rouault wrote:
> > On Monday 06 April 2015 13:48:47, Even Rouault wrote:
> >> On Monday 06 April 2015 11:32:33, Dmitriy Baryshnikov wrote:
> >>> The first solution looks reasonable. But the precision field lacks
> >>> values for the case where only the time is significant:
> >>>
> >>> ODTP_HMSm
> >>> ODTP_HMS
> >>> ODTP_HM
> >>> ODTP_H
> >>
> >> As I didn't want to multiply the values in the enumeration, my intent
> >> was to reuse the ODTP_YMDxxxx values for OFTTime only.
> >
> > I meant "for OFTTime too"
> >
> >> This was what I wanted
> >> to convey with the precision between parentheses in the comment of
> >> ODTP_YMDH "Year, month, day (if OFTDateTime) and hour"
> >>
> >> Or perhaps the enumeration should capture the most precise part of the
> >> (date)time structure?
> >> ODTP_Year
> >> ODTP_Month
> >> ODTP_Day
> >> ODTP_Hour
> >> ODTP_Minute
> >> ODTP_Second
> >> ODTP_Millisecond
> >>
> >>> etc.
> >>>
> >>> Best regards,
> >>>
> >>> Dmitry
> >>>
> >>> 05.04.2015 22:25, Even Rouault wrote:
> >>>> Hi,
> >>>>
> >>>> In an effort to revisit http://trac.osgeo.org/gdal/ticket/2680,
> >>>> which is about the lack of precision of the current datetime structure,
> >>>> I've imagined different ways to modify the OGRField
> >>>> structure, and failed to pick one that would be the obvious
> >>>> solution, so opinions are welcome.
> >>>>
> >>>> The issue is how to add (at least) microsecond accuracy to the
> >>>> datetime structure, as a few formats support it explicitly or
> >>>> implicitly: MapInfo, GPX, Atom (GeoRSS driver), GeoPackage, SQLite,
> >>>> PostgreSQL, CSV, GeoJSON, ODS, XLSX, KML (potentially GML too)...
> >>>>
> >>>> Below are a few potential solutions:
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 1) : Millisecond accuracy, second becomes a float
> >>>>
> >>>> This is the solution I've prototyped.
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16  Year;
> >>>>         GByte   Month;
> >>>>         GByte   Day;
> >>>>         GByte   Hour;
> >>>>         GByte   Minute;
> >>>>         GByte   TZFlag;
> >>>>         GByte   Precision; /* value in OGRDateTimePrecision */
> >>>>         float   Second;    /* from 00.000 to 60.999 (millisecond
> >>>>                               accuracy) */
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> So sub-second precision is represented with a single-precision
> >>>> floating point number, storing both the integral and decimal parts.
> >>>> (We could theoretically have hundredth-of-millisecond accuracy,
> >>>> 10^-5 s, since 6099999 fits in the 23 bits of the mantissa.)
> >>>>
> >>>> Another addition is the Precision field that indicates which parts of
> >>>> the datetime structure are significant.
> >>>>
> >>>> /** Enumeration that defines the precision of a DateTime.
> >>>>  * @since GDAL 2.0
> >>>>  */
> >>>> typedef enum
> >>>> {
> >>>>     ODTP_Undefined, /**< Undefined */
> >>>>     ODTP_Guess,     /**< Only valid when setting through
> >>>>                          SetField(i,year,month...) where OGR will
> >>>>                          guess */
> >>>>     ODTP_Y,         /**< Year is significant */
> >>>>     ODTP_YM,        /**< Year and month are significant */
> >>>>     ODTP_YMD,       /**< Year, month and day are significant */
> >>>>     ODTP_YMDH,      /**< Year, month, day (if OFTDateTime) and hour
> >>>>                          are significant */
> >>>>     ODTP_YMDHM,     /**< Year, month, day (if OFTDateTime), hour and
> >>>>                          minute are significant */
> >>>>     ODTP_YMDHMS,    /**< Year, month, day (if OFTDateTime), hour,
> >>>>                          minute and integral second are significant */
> >>>>     ODTP_YMDHMSm,   /**< Year, month, day (if OFTDateTime), hour,
> >>>>                          minute and second with milliseconds are
> >>>>                          significant */
> >>>> } OGRDateTimePrecision;
> >>>>
> >>>> I think this is important since "2015/04/05 17:12:34" and "2015/04/05
> >>>> 17:12:34.000" do not really mean the same thing and it might be good
> >>>> to be able to preserve the original accuracy when converting between
> >>>> formats.
> >>>>
> >>>> A drawback of this solution is that the size of the OGRField structure
> >>>> increases from 8 bytes to 12 on 32-bit builds (and remains 16 bytes on
> >>>> 64-bit). This is probably not that important since in most cases not
> >>>> that many OGRField structures are instantiated at one time (typically,
> >>>> you iterate over features one at a time).
> >>>> This could be more of a problem for use cases that involve the MEM
> >>>> driver, as it keeps all features in memory.
> >>>>
> >>>> Another drawback is that the change of the structure might not be
> >>>> directly noticed by application developers, as the Second field name is
> >>>> preserved but a new Precision field is added, so there's a risk that
> >>>> Precision is left uninitialized if the field is set through
> >>>> OGRFeature::SetField(int iFieldIndex, OGRField* psRawField). That
> >>>> could lead to unexpected formatting (but hopefully not crashes, with
> >>>> defensive programming). However, I'd think it is unlikely that many
> >>>> applications manipulate OGRField directly, instead of using
> >>>> the getters and setters of OGRFeature.
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 2) : Millisecond accuracy, second and milliseconds in
> >>>> distinct fields
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16  Year;
> >>>>         GByte   Month;
> >>>>         GByte   Day;
> >>>>         GByte   Hour;
> >>>>         GByte   Minute;
> >>>>         GByte   TZFlag;
> >>>>         GByte   Precision;   /* value in OGRDateTimePrecision */
> >>>>         GByte   Second;      /* from 0 to 60 */
> >>>>         GUInt16 Millisecond; /* from 0 to 999 */
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> Same structure size as in 1).
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 3) : Millisecond accuracy, pack all fields
> >>>>
> >>>> Conceptually, this would use bit fields to avoid wasting unused bits:
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16   Year;
> >>>>         GUIntBig Month:4;
> >>>>         GUIntBig Day:5;
> >>>>         GUIntBig Hour:5;
> >>>>         GUIntBig Minute:6;
> >>>>         GUIntBig Second:6;
> >>>>         GUIntBig Millisecond:10; /* 0-999 */
> >>>>         GUIntBig TZFlag:8;
> >>>>         GUIntBig Precision:4;
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> This was proposed in the above-mentioned ticket. And as there were
> >>>> enough remaining bits, I've also added the Precision field (and in all
> >>>> other solutions).
> >>>>
> >>>> The advantage is that sizeof(mydate) remains 8 bytes on 32-bit
> >>>> builds.
> >>>>
> >>>> But the C standard only defines bitfields of int/unsigned int, so this
> >>>> is not portable; moreover, the way bits are packed is not
> >>>> defined by the standard, so different compilers could come up with
> >>>> different packing. A workaround is to do the bit manipulation through
> >>>> macros:
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GUIntBig opaque;
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> #define GET_BITS(x,y_bits,shift) \
> >>>>     (int)(((x).Date.opaque >> (shift)) & ((1 << (y_bits))-1))
> >>>>
> >>>> #define GET_YEAR(x)        (short)GET_BITS(x,16,64-16)
> >>>> #define GET_MONTH(x)       GET_BITS(x,4,64-16-4)
> >>>> #define GET_DAY(x)         GET_BITS(x,5,64-16-4-5)
> >>>> #define GET_HOUR(x)        GET_BITS(x,5,64-16-4-5-5)
> >>>> #define GET_MINUTE(x)      GET_BITS(x,6,64-16-4-5-5-6)
> >>>> #define GET_SECOND(x)      GET_BITS(x,6,64-16-4-5-5-6-6)
> >>>> #define GET_MILLISECOND(x) GET_BITS(x,10,64-16-4-5-5-6-6-10)
> >>>> #define GET_TZFLAG(x)      GET_BITS(x,8,64-16-4-5-5-6-6-10-8)
> >>>> #define GET_PRECISION(x)   GET_BITS(x,4,64-16-4-5-5-6-6-10-8-4)
> >>>>
> >>>> #define SET_BITS(x,y,y_bits,shift) \
> >>>>     (x).Date.opaque = ((x).Date.opaque & \
> >>>>         (~( (GUIntBig)((1 << (y_bits))-1) << (shift) )) | \
> >>>>         ((GUIntBig)(y) << (shift)))
> >>>>
> >>>> #define SET_YEAR(x,val)        SET_BITS(x,val,16,64-16)
> >>>> #define SET_MONTH(x,val)       SET_BITS(x,val,4,64-16-4)
> >>>> #define SET_DAY(x,val)         SET_BITS(x,val,5,64-16-4-5)
> >>>> #define SET_HOUR(x,val)        SET_BITS(x,val,5,64-16-4-5-5)
> >>>> #define SET_MINUTE(x,val)      SET_BITS(x,val,6,64-16-4-5-5-6)
> >>>> #define SET_SECOND(x,val)      SET_BITS(x,val,6,64-16-4-5-5-6-6)
> >>>> #define SET_MILLISECOND(x,val) SET_BITS(x,val,10,64-16-4-5-5-6-6-10)
> >>>> #define SET_TZFLAG(x,val)      SET_BITS(x,val,8,64-16-4-5-5-6-6-10-8)
> >>>> #define SET_PRECISION(x,val)   SET_BITS(x,val,4,64-16-4-5-5-6-6-10-8-4)
> >>>>
> >>>> Main advantage: the size of OGRField remains unchanged (so 8 bytes on
> >>>> 32-bit builds).
> >>>>
> >>>> Drawback: manipulation of datetime members is less natural, but there
> >>>> are not that many places in the GDAL code base where the OGRField.Date
> >>>> members are used, so it is not much of a problem.
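
The macro workaround quoted above could also be written with small functions, which makes the masking easier to check. The following is only a minimal sketch under stated assumptions: PackedDateTime, field_mask, get_bits, set_bits, get_year and set_year are hypothetical names invented for illustration, not GDAL API, and the shifts simply follow the 16+4+5+5+6+6+10+8+4 = 64 bit layout from Solution 3. Unlike the macros, it builds the mask from a 64-bit one and asserts that the value fits in its field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the "opaque" member of OGRField.Date. */
typedef struct { uint64_t opaque; } PackedDateTime;

/* Bit offsets of each field, most significant field first:
   64-16=48, 48-4=44, 44-5=39, 39-5=34, 34-6=28, 28-6=22, 22-10=12, 12-8=4, 4-4=0 */
enum {
    SHIFT_YEAR = 48, SHIFT_MONTH = 44, SHIFT_DAY = 39, SHIFT_HOUR = 34,
    SHIFT_MINUTE = 28, SHIFT_SECOND = 22, SHIFT_MILLISECOND = 12,
    SHIFT_TZFLAG = 4, SHIFT_PRECISION = 0
};

/* Mask with the low `width` bits set (width is at most 16 in this layout). */
uint64_t field_mask(unsigned width)
{
    return (UINT64_C(1) << width) - 1;
}

/* Read an unsigned field of `width` bits located at bit offset `shift`. */
unsigned get_bits(const PackedDateTime *dt, unsigned width, unsigned shift)
{
    return (unsigned)((dt->opaque >> shift) & field_mask(width));
}

/* Overwrite a field: clear its old bits, then OR in the new value. */
void set_bits(PackedDateTime *dt, unsigned value, unsigned width, unsigned shift)
{
    assert(((uint64_t)value & ~field_mask(width)) == 0); /* value must fit */
    dt->opaque = (dt->opaque & ~(field_mask(width) << shift))
               | ((uint64_t)value << shift);
}

/* The year is the only signed field: store its 16-bit two's complement pattern. */
void set_year(PackedDateTime *dt, int16_t year)
{
    set_bits(dt, (uint16_t)year, 16, SHIFT_YEAR);
}

/* Rebuild the signed year from the stored 16-bit pattern. */
int16_t get_year(const PackedDateTime *dt)
{
    unsigned raw = get_bits(dt, 16, SHIFT_YEAR);
    return (int16_t)(raw >= 0x8000u ? (int)raw - 0x10000 : (int)raw);
}
```

A caller would zero-initialize a PackedDateTime, then use set_year() and set_bits() with the matching width and SHIFT_* constant for the other fields, for example set_bits(&dt, 999, 10, SHIFT_MILLISECOND).
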
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 4) : Microsecond accuracy with one field
> >>>>
> >>>> Solution 1) used a float for second and sub-second, but a float has
> >>>> only 23 bits of mantissa, which is enough to represent seconds with
> >>>> millisecond accuracy, but not with microsecond accuracy (you need
> >>>> 26 bits for that). So use a 32-bit integer instead of a 32-bit
> >>>> floating point.
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16  Year;
> >>>>         GByte   Month;
> >>>>         GByte   Day;
> >>>>         GByte   Hour;
> >>>>         GByte   Minute;
> >>>>         GByte   TZFlag;
> >>>>         GByte   Precision;    /* value in OGRDateTimePrecision */
> >>>>         GUInt32 Microseconds; /* 00000000 to 59999999 */
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> Same as solution 1: sizeof(OGRField) becomes 12 bytes on 32-bit builds
> >>>> (and remains 16 bytes on 64-bit builds).
> >>>>
> >>>> We would need to add an extra value in OGRDateTimePrecision to mean
> >>>> microsecond accuracy.
> >>>>
> >>>> It is not really clear that we need microsecond accuracy... Most
> >>>> formats that support sub-second accuracy use the ISO 8601
> >>>> representation (e.g. YYYY-MM-DDTHH:MM:SS.sssssZ), which doesn't define
> >>>> the maximal number of decimals beyond the second. From
> >>>> http://www.postgresql.org/docs/9.1/static/datatype-datetime.html,
> >>>> PostgreSQL supports microsecond accuracy.
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 5) : Microsecond with 3 fields
> >>>>
> >>>> A variant where we split second into 3 integer parts:
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16  Year;
> >>>>         GByte   Month;
> >>>>         GByte   Day;
> >>>>         GByte   Hour;
> >>>>         GByte   Minute;
> >>>>         GByte   TZFlag;
> >>>>         GByte   Precision;   /* value in OGRDateTimePrecision */
> >>>>         GByte   Second;      /* 0 to 59 */
> >>>>         GUInt16 Millisecond; /* 0 to 999 */
> >>>>         GUInt16 Microsecond; /* 0 to 999 */
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> Drawback: due to alignment, sizeof(OGRField) becomes 16 bytes on
> >>>> 32-bit builds (and remains 16 bytes on 64-bit builds).
> >>>>
> >>>> ---------------------------------------
> >>>> Solution 6) : Nanosecond accuracy and beyond!
> >>>>
> >>>> Now that we are using 16 bytes, why not have nanosecond accuracy?
> >>>>
> >>>> typedef union {
> >>>>     [...]
> >>>>     struct {
> >>>>         GInt16  Year;
> >>>>         GByte   Month;
> >>>>         GByte   Day;
> >>>>         GByte   Hour;
> >>>>         GByte   Minute;
> >>>>         GByte   TZFlag;
> >>>>         GByte   Precision; /* value in OGRDateTimePrecision */
> >>>>         double  Second;    /* 0.000000000 to 60.999999999 */
> >>>>     } Date;
> >>>> } OGRField;
> >>>>
> >>>> Actually we even have picosecond accuracy! (For picoseconds, we
> >>>> need 46 bits and a double has 52 bits of mantissa.) And if we use a
> >>>> 64-bit integer instead of a double, we can have femtosecond accuracy
> >>>> ;-)
> >>>>
> >>>> Any preference?
> >>>>
> >>>> Even
> >>>
> >>> _______________________________________________
> >>> gdal-dev mailing list
> >>> [email protected]
> >>> http://lists.osgeo.org/mailman/listinfo/gdal-dev

--
Spatialys - Geospatial professional services
http://www.spatialys.com
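
As a rough check of the bit counts quoted in Solutions 1, 4 and 6, the integer-count argument can be reproduced in a few lines. This is a sketch only: bits_needed, max_ticks and check_thread_claims are hypothetical helper names, and the assumption is that seconds run from 0 up to just under 61 (so that a leap second 60.999... is representable), counted in whole ticks of the chosen resolution.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to store every integer in [0, max_value]. */
unsigned bits_needed(uint64_t max_value)
{
    unsigned bits = 0;
    while (max_value != 0) {
        bits++;
        max_value >>= 1;
    }
    return bits;
}

/* Largest tick count within one minute, leap second included:
   60.999... seconds expressed in units of 1/ticks_per_second. */
uint64_t max_ticks(uint64_t ticks_per_second)
{
    return 61 * ticks_per_second - 1;
}

/* Asserts the figures given in the thread. */
void check_thread_claims(void)
{
    /* 10^-5 s: 6099999 fits in the 23 mantissa bits of a float. */
    assert(max_ticks(100000) == 6099999);
    assert(bits_needed(max_ticks(100000)) == 23);
    /* Microseconds need 26 bits, more than a float mantissa offers. */
    assert(bits_needed(max_ticks(1000000)) == 26);
    /* Picoseconds need 46 bits, within the 52 mantissa bits of a double. */
    assert(bits_needed(max_ticks(UINT64_C(1000000000000))) == 46);
    /* Femtoseconds still fit in a 64-bit integer. */
    assert(bits_needed(max_ticks(UINT64_C(1000000000000000))) <= 64);
}
```

Calling check_thread_claims() passes silently when the figures hold. By the same count, millisecond ticks (at most 60999) need only 16 bits.
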
