tustvold commented on code in PR #5323:
URL: https://github.com/apache/arrow-rs/pull/5323#discussion_r1461698330
##########
arrow-schema/src/datatype.rs:
##########
@@ -145,10 +145,30 @@ pub enum DataType {
/// ```
Timestamp(TimeUnit, Option<Arc<str>>),
/// A signed 32-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in days (32 bits).
+ /// in days.
Date32,
/// A signed 64-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in milliseconds (64 bits). Values are evenly divisible by 86400000.
+ /// in milliseconds.
+ ///
+ /// According to the specification (see [Schema.fbs]), this should be
treated as the number of
+ /// days, in milliseconds, since the UNIX epoch. Therefore, values must be
evenly divisible by
+ /// `86_400_000` (the number of milliseconds in a standard day).
+ ///
+ /// The reason for this is historically for compatibility with other
languages native libraries,
Review Comment:
```suggestion
/// The reason for this is for compatibility with other language's
native libraries,
```
##########
arrow-schema/src/datatype.rs:
##########
@@ -145,10 +145,30 @@ pub enum DataType {
/// ```
Timestamp(TimeUnit, Option<Arc<str>>),
/// A signed 32-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in days (32 bits).
+ /// in days.
Date32,
/// A signed 64-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in milliseconds (64 bits). Values are evenly divisible by 86400000.
+ /// in milliseconds.
+ ///
+ /// According to the specification (see [Schema.fbs]), this should be
treated as the number of
+ /// days, in milliseconds, since the UNIX epoch. Therefore, values must be
evenly divisible by
+ /// `86_400_000` (the number of milliseconds in a standard day).
+ ///
+ /// The reason for this is historically for compatibility with other
languages native libraries,
+ /// such as Java when it used to store dates as timestamps.
Review Comment:
```suggestion
/// such as Java, which historically lacked a dedicated date type
/// and only supported timestamps.
```
##########
arrow-schema/src/datatype.rs:
##########
@@ -145,10 +145,30 @@ pub enum DataType {
/// ```
Timestamp(TimeUnit, Option<Arc<str>>),
/// A signed 32-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in days (32 bits).
+ /// in days.
Date32,
/// A signed 64-bit date representing the elapsed time since UNIX epoch
(1970-01-01)
- /// in milliseconds (64 bits). Values are evenly divisible by 86400000.
+ /// in milliseconds.
+ ///
+ /// According to the specification (see [Schema.fbs]), this should be
treated as the number of
+ /// days, in milliseconds, since the UNIX epoch. Therefore, values must be
evenly divisible by
+ /// `86_400_000` (the number of milliseconds in a standard day).
+ ///
+ /// The reason for this is historically for compatibility with other
languages native libraries,
+ /// such as Java when it used to store dates as timestamps.
+ ///
+ /// Practically, validation that values of this type are evenly divisible
by `86_400_000` is not enforced
+ /// by this library for performance and usability reasons. Date64 values
will be treated similar to the
Review Comment:
```suggestion
/// by this library for performance and usability reasons. Date64 values
will be treated similarly to the
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]