David Winsemius dwinsemius at comcast.net writes:
On Feb 8, 2013, at 10:55 AM, Edwin Isensee wrote:
I'm using the read.xls function from gdata package to read one Excel file,
like the example below:
library(gdata)
my_file - '/Users/Desktop/Project.xlsx'
valores - read.xls(my_file)
The problem is: one of the columns at the Excel file holds date information
like 1-Jan-13, 5-Jan-13, 25-Jan-13.
Actually it holds them as number of days and only displays them in
that format.
At Excel these information are treated
as dates. When I read the file into a dataframe the corresponding data
frame column holds numeric information like 41275, 41279, 41299. How can I
convert these numeric information into the original date information?
The easiest way would be to create a format in Excel. -mm-dd
should work well. Otherwise you should read the documentation about
date encoding. You can take those values and add them to something
like: as.Date(1900-01-01). I say something like because Excel
date calculations have always had a strange bug that MS refuses to
acknowledge or fix that may make the date one or two days more or
less.
as.Date(1900-01-01) +c( 41275, 41279, 41299)
[1] 2013-01-03 2013-01-07 2013-01-27
The HFWutils package, now archived, had a function that
did this. I extracted just that function: below I also
post some information about where (I think) the
Excel date bug referred to above comes from -- an
interesting historical story.
If you are using dates before Feb 1900, watch out (and
read below)!
## from http://cran.r-project.org/src/contrib/Archive/
## HFWutils/HFWutils_0.9.2008.05.17.tar.gz
excelDate2Date - function(excelDate) {
Date - excelDate + as.Date(1900-01-01) - 2
## FIXME: add if 1900-Feb-28 switch?
return(Date)
}
## http://www.cpearson.com/excel/datetime.htm
## Dates
## The integer portion of the number, d, represents the number of
## days since 1900-Jan-0. For example, the date 19-Jan-2000 is stored
## as 36,544, since 36,544 days have passed since 1900-Jan-0. The
## number 1 represents 1900-Jan-1. It should be noted that the number
## 0 does not represent 1899-Dec-31. It does not. If you use the
## MONTH function with the date 0, it will return January, not
## December. Moreover, the YEAR function will return 1900, not 1899.
## Actually, this number is one greater than the actual number of
## days. This is because Excel behaves as if the date 1900-Feb-29
## existed. It did not. The year 1900 was not a leap year (the year
## 2000 is a leap year). In Excel, the day after 1900-Feb-28 is
## 1900-Feb-29. In reality, the day after 1900-Feb-28 was 1900-Mar-1.
## This is not a bug. Indeed, it is by design. Excel works this
## way because it was truly a bug in Lotus 123. When Excel was
## introduced, 123 has nearly the entire market for spreadsheet
## software. Microsoft decided to continue Lotus' bug, in order to
## fully compatible. Users who switched from 123 to Excel would not
## have to make any changes to their data. As long as all your dates
## later than 1900-Mar-1, this should be of no concern.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.