Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
On 15/12/2007 5:17 PM, Martin Maechler wrote: TP == Tony Plate [EMAIL PROTECTED] on Fri, 14 Dec 2007 13:58:30 -0700 writes: TP Duncan Murdoch wrote: On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31 ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt)) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale(LC_TIME, en). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors... You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? 
The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write. Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. 
Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? Do the ones that provide all 3 methods do so in a consistent way, i.e. does length(x) mean the same thing in all of them?

    TP> I'm not sure doing something like this would be so bad even if it is
    TP> already widely
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
Duncan Murdoch wrote: On 15/12/2007 5:17 PM, Martin Maechler wrote: TP == Tony Plate [EMAIL PROTECTED] on Fri, 14 Dec 2007 13:58:30 -0700 writes: TP Duncan Murdoch wrote: On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31 ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt)) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale(LC_TIME, en). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors... You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? 
The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write. Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. 
Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? Do the ones that provide all 3 methods do so in a consistent way, i.e. does length(x) mean the same thing in all of them?

    TP> I'm not sure doing something like this would be so bad even if it
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
TP == Tony Plate [EMAIL PROTECTED] on Fri, 14 Dec 2007 13:58:30 -0700 writes: TP Duncan Murdoch wrote: On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31 ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt)) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale(LC_TIME, en). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors... You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... 
S functions and data organization make applications easy to write. Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? 
Do the ones that provide all 3 methods do so in a consistent way, i.e. does length(x) mean the same thing in all of them?

    TP> I'm not sure doing something like this would be so bad even if it is
    TP> already widely violated. R has evolved significantly over time, and
    TP> many rough edges have been cleaned up, sometimes in ways that were not
    TP> backward compatible.
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
If it were simply deprecated and then changed, everyone using it would get a warning during the period of deprecation, so it would not be so bad. Given that its current behavior is not very useful, I suspect it's not widely used anyway.

I haven't followed the whole discussion, so sorry if these points have already been made.

On Dec 15, 2007 5:17 PM, Martin Maechler [EMAIL PROTECTED] wrote: TP == Tony Plate [EMAIL PROTECTED] on Fri, 14 Dec 2007 13:58:30 -0700 writes: TP Duncan Murdoch wrote: On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv <- structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXt", "POSIXlt")) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale("LC_TIME", "en"). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors...
You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write. Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). 
Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? Do the ones that provide all 3 methods do so in a
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
Duncan Murdoch wrote: On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31 ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt)) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale(LC_TIME, en). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors... You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write. 
Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? 
Do the ones that provide all 3 methods do so in a consistent way, i.e. does length(x) mean the same thing in all of them?

I'm not sure doing something like this would be so bad even if it is already widely violated. R has evolved significantly over time, and many rough edges have been cleaned up, sometimes in ways that were not backward compatible. This is a great thing, and my thanks go to the people working on R.

If some base or CRAN or Bioconductor packages currently don't implement vector operations consistently, wouldn't it be good to know that? Wouldn't it be useful to have an automatic way of determining whether a particular vector-like class is consistent with a generally agreed set of principles for how basic vector operations should work, things like length(x)+length(y)==length(c(x,y))? This could help developers check, document, and improve their code, and it could help users understand how to use
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
Duncan Murdoch wrote:
> On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote:
>> Full_Name: Petr Simecek
>> Version: 2.5.1, 2.6.1
>> OS: Windows XP
>> Submission from: (NULL) (195.113.231.2)
>>
>> Several times I have experienced that a length of a POSIXt vector has not
>> been computed right.
>>
>> Example:
>>
>> tv <- structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31),
>>     min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L),
>>     hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L),
>>     mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L),
>>     mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L),
>>     year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L),
>>     wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L),
>>     yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L),
>>     isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)),
>>     .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"),
>>     class = c("POSIXt", "POSIXlt"))
>>
>> print(tv)   # print 11 time points (right)
>> length(tv)  # returns 9 (wrong)
>
> tv is a list of length 9. The answer is right, your expectation is wrong.
>
>> I have tried that on several computers with/without switching to English
>> locales, i.e. Sys.setlocale("LC_TIME", "en"). I have searched a help pages
>> but I cannot imagine how that could be OK.
>
> See this in ?POSIXt:
>
>   Class 'POSIXlt' is a named list of vectors...
>
> You could define your own length measurement as
>
>   length.POSIXlt <- function(x) length(x$sec)
>
> and you'll get the answer you expect, but be aware that length.XXX
> methods are quite rare, and you may surprise some of your users.

On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt?

The back of "The New S Language" says "Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write."

Now, POSIXlt has methods for c() and vector subsetting "[" (and many other vector-manipulation methods; see methods(class="POSIXlt")). Hence, from the point of view of intending to supply "easy-to-use facilities ... [for] all sorts of data", isn't it a little incongruous that length() is not also provided, as these 3 functions (any others?) comprise a core set of vector-manipulation functions?

Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least one of the functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test suite that should be included in the 'tests' directory of a package implementing such a class, with some tests of the basic vector-manipulation functionality, such as:

  # at this point, x0, x1, x3, x10 should exist, as vectors of the
  # class being tested, of length 0, 1, 3, and 10, and they should
  # contain no duplicate elements
  > length(x0)
  [1] 0
  > length(c(x0, x1))
  [1] 1
  > length(c(x1,x10))
  [1] 11
  > all(x3 == x3[seq(len=length(x3))])
  [1] TRUE
  > all(x3 == c(x3[1], x3[2], x3[3]))
  [1] TRUE
  > length(c(x3[2], x10[5:7]))
  [1] 4

It would also be possible to describe a larger set of vector-manipulation functions that should be implemented together, including e.g. 'rep', 'unique', 'duplicated', '==', 'sort', '[<-', 'is.na', 'head', 'tail' ... (many of which are provided for POSIXlt).

Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt?

-- Tony Plate

> Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
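The test suite proposed above can be sketched on a toy class. This is only an illustration under stated assumptions: the class "vec_demo", its constructor new_vec_demo() and the helper check_concat() are hypothetical names invented here, not part of R or of any package mentioned in the thread. The class stores parallel vectors in a list (the same layout as POSIXlt) and supplies 'length', '[' and 'c' together, so the length invariants from the message can be checked.

```r
# Hypothetical toy class "vec_demo": a named list of parallel vectors,
# mimicking the POSIXlt layout. All names here are illustrative assumptions.
new_vec_demo <- function(hour = integer(), sec = numeric()) {
  stopifnot(length(hour) == length(sec))
  structure(list(hour = hour, sec = sec), class = "vec_demo")
}

# length() reports the number of elements, not the number of components.
length.vec_demo <- function(x) length(unclass(x)$sec)

# Subsetting applies the index to every component.
`[.vec_demo` <- function(x, i) {
  f <- unclass(x)
  new_vec_demo(hour = f$hour[i], sec = f$sec[i])
}

# Concatenation joins each component across all arguments.
c.vec_demo <- function(...) {
  parts <- lapply(list(...), unclass)
  new_vec_demo(hour = unlist(lapply(parts, function(p) p$hour)),
               sec  = unlist(lapply(parts, function(p) p$sec)))
}

# Invariant from the thread: length(x) + length(y) == length(c(x, y)).
check_concat <- function(x, y) length(x) + length(y) == length(c(x, y))

x0  <- new_vec_demo()
x1  <- new_vec_demo(12L, 50)
x3  <- new_vec_demo(rep(12L, 3), c(0, 55, 12))
x10 <- new_vec_demo(rep(12L, 10), as.numeric(1:10))

results <- c(
  empty         = length(x0) == 0,
  concat_0_1    = length(c(x0, x1)) == 1,
  concat_1_10   = length(c(x1, x10)) == 11,
  subset_concat = length(c(x3[2], x10[5:7])) == 4,
  additive      = check_concat(x3, x10)
)
print(results)
```

The checks involving '==' from the message are left out because the toy class defines no comparison method; a real class would need one before those lines could run.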
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
On 12/13/2007 1:59 PM, Tony Plate wrote: Duncan Murdoch wrote: On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote: Full_Name: Petr Simecek Version: 2.5.1, 2.6.1 OS: Windows XP Submission from: (NULL) (195.113.231.2) Several times I have experienced that a length of a POSIXt vector has not been computed right. Example: tv-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31 ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L), mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt)) print(tv) # print 11 time points (right) length(tv) # returns 9 (wrong) tv is a list of length 9. The answer is right, your expectation is wrong. I have tried that on several computers with/without switching to English locales, i.e. Sys.setlocale(LC_TIME, en). I have searched a help pages but I cannot imagine how that could be OK. See this in ?POSIXt: Class 'POSIXlt' is a named list of vectors... You could define your own length measurement as length.POSIXlt - function(x) length(x$sec) and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users. On the other hand, isn't the fact that length() currently always returns 9 for POSIXlt objects likely to be a surprise to many users of POSIXlt? The back of The New S Language says Easy-to-use facilities allow you to organize, store and retrieve all sorts of data. ... S functions and data organization make applications easy to write. 
Now, POSIXlt has methods for c() and vector subsetting [ (and many other vector-manipulation methods - see methods(class=POSIXlt)). Hence, from the point of view of intending to supply easy-to-use facilities ... [for] all sorts of data, isn't it a little incongruous that length() is not also provided -- as 3 functions (any others?) comprise a core set of vector-manipulation functions? Would it make sense to have an informal prescription (e.g., in R-exts) that a class that implements a vector-like object and provides at least of one of functions 'c', '[' and 'length' should provide all three? It would also be easy to describe a test-suite that should be included in the 'test' directory of a package implementing such a class, that had some tests of the basic vector-manipulation functionality, such as: # at this point, x0, x1, x3, x10 should exist, as vectors of the # class being tested, of length 0, 1, 3, and 10, and they should # contain no duplicate elements length(x0) [1] 1 length(c(x0, x1)) [1] 2 length(c(x1,x10)) [1] 11 all(x3 == x3[seq(len=length(x3))]) [1] TRUE all(x3 == c(x3[1], x3[2], x3[3])) [1] TRUE length(c(x3[2], x10[5:7])) [1] 4 It would also be possible to describe a larger set of vector manipulation functions that should be implemented together, including e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[-', 'is.na', head, tail ... (many of which are provided for POSIXlt). Or is there some good reason that length() cannot be provided (while 'c' and '[' can) for some vector-like classes such as POSIXlt? What you say sounds good in general, but the devil is in the details. Changing the meaning of length(x) for some objects has fairly widespread effects. Are they all positive? I don't know. Adding a prescription like the one you suggest would be good if it's easy to implement, but bad if it's already widely violated. How many base or CRAN or Bioconductor packages violate it currently? 
Do the ones that provide all 3 methods do so in a consistent way, i.e. does length(x) mean the same thing in all of them?

I agree that the current state is less than perfect, but making it better would really be a lot of work. I suspect there are better ways to spend my time, so I'm not going to volunteer to do it. I'm not even going to invite someone else to do it, or offer to review your work if you volunteer. I think this falls into the class of "next time we write a language, let's handle this better" problems.

Duncan Murdoch
[Rd] Wrong length of POSIXt vectors (PR#10507)
Full_Name: Petr Simecek
Version: 2.5.1, 2.6.1
OS: Windows XP
Submission from: (NULL) (195.113.231.2)

Several times I have found that the length of a POSIXt vector is not computed correctly.

Example:

tv <- structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31),
    min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L),
    hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L),
    mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L),
    mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L),
    year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L),
    wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L),
    yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L),
    isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)),
    .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"),
    class = c("POSIXt", "POSIXlt"))

print(tv)   # prints 11 time points (right)
length(tv)  # returns 9 (wrong)

I have tried this on several computers, with and without switching to the English locale, i.e. Sys.setlocale("LC_TIME", "en"). I have searched the help pages, but I cannot imagine how that could be OK.
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
[EMAIL PROTECTED] wrote:
> Full_Name: Petr Simecek
> Version: 2.5.1, 2.6.1
> OS: Windows XP
> Submission from: (NULL) (195.113.231.2)
>
> Several times I have experienced that a length of a POSIXt vector has not
> been computed right.
>
> Example:
>
> tv <- structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31),
>     min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L),
>     hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L),
>     mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L),
>     mon = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L),
>     year = c(105L, 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L),
>     wday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L),
>     yday = c(163L, 163L, 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L),
>     isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)),
>     .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"),
>     class = c("POSIXt", "POSIXlt"))
>
> print(tv)   # print 11 time points (right)
> length(tv)  # returns 9 (wrong)
>
> I have tried that on several computers with/without switching to English
> locales, i.e. Sys.setlocale("LC_TIME", "en"). I have searched a help pages
> but I cannot imagine how that could be OK.

Given the way you define it, you should be able to imagine it! It's a list of length 9: sec, min, hour,..., isdst.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
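The "list of length 9" point can be seen by dropping the class attribute and looking at the underlying list. The sketch below is a shortened, hand-built object with three time points instead of the eleven in the report; the variable names raw, n_components and n_times are chosen here for illustration only. It deliberately calls length() on the unclassed list, so the result does not depend on whether a length method for POSIXlt is available.

```r
# Hand-built list with the nine POSIXlt component names (three time points).
tv <- structure(
  list(sec = c(50, 0, 55), min = c(1L, 10L, 11L), hour = c(12L, 12L, 12L),
       mday = c(13L, 13L, 13L), mon = c(5L, 5L, 5L),
       year = c(105L, 105L, 105L), wday = c(1L, 1L, 1L),
       yday = c(163L, 163L, 163L), isdst = c(1L, 1L, 1L)),
  class = c("POSIXt", "POSIXlt"))

raw <- unclass(tv)            # plain list, no class attribute
n_components <- length(raw)   # 9: sec, min, hour, ..., isdst
n_times <- length(raw$sec)    # 3: one entry per time point

print(names(raw))
print(c(components = n_components, times = n_times))
```

The number of components stays at nine however many time points are stored, which is the value the default length() sees.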
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote:
> Several times I have found that the length of a POSIXt vector is not
> computed correctly.
> [...]
> print(tv)   # prints 11 time points (right)
> length(tv)  # returns 9 (wrong)

tv is a list of length 9. The answer is right, your expectation is wrong.

> I have tried this on several computers, with and without switching to an
> English locale, i.e. Sys.setlocale("LC_TIME", "en"). I have searched the
> help pages but I cannot imagine how that could be OK.

See this in ?POSIXt:

    Class 'POSIXlt' is a named list of vectors...

You could define your own length measurement as

    length.POSIXlt <- function(x) length(x$sec)

and you'll get the answer you expect, but be aware that length.XXX methods are quite rare, and you may surprise some of your users.

Duncan Murdoch
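As a rough illustration of the suggestion above, here is a small sketch of such a length method. The object tv3 and the names n_points and n_fields are hypothetical, made up for this example. The method body uses unclass(x)$sec rather than x$sec, which is an assumption on my part to keep the component lookup independent of any POSIXlt-specific handling of `$`.

```r
# Three hypothetical time points stored the POSIXlt way: nine parallel vectors.
tv3 <- structure(
  list(sec   = c(50, 0, 55),
       min   = c(1L, 10L, 11L),
       hour  = c(12L, 12L, 12L),
       mday  = c(13L, 13L, 13L),
       mon   = c(5L, 5L, 5L),
       year  = c(105L, 105L, 105L),
       wday  = c(1L, 1L, 1L),
       yday  = c(163L, 163L, 163L),
       isdst = c(1L, 1L, 1L)),
  class = c("POSIXt", "POSIXlt"))

# S3 method for length(): report the number of time points, taken from
# the 'sec' component, instead of the number of list components.
length.POSIXlt <- function(x) length(unclass(x)$sec)

n_points <- length(tv3)           # 3: dispatches on the POSIXlt class
n_fields <- length(unclass(tv3))  # 9: the underlying list is unchanged
```

As the message warns, any code that relies on length() returning 9 for such an object would see different behaviour once a method like this is defined.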
Re: [Rd] Wrong length of POSIXt vectors (PR#10507)
It is right: it is a list of length 9. You even constructed it as such a list!

On Tue, 11 Dec 2007, [EMAIL PROTECTED] wrote:
> Several times I have found that the length of a POSIXt vector is not
> computed correctly.
> [...]
> print(tv)   # prints 11 time points (right)
> length(tv)  # returns 9 (wrong)
> [...]
> I have searched the help pages but I cannot imagine how that could be OK.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595