John Machin wrote:
> On Tue, 17 May 2005 17:38:30 -0500, Terry Hancock
> <[EMAIL PROTECTED]> wrote:
>
>
>>What do you do when a date or time is
>>incompletely specified? ISTM, that as it is, there is no
>>formal way to store this --- you have to guess, and there's
>>no way to indicate that the guess is different from solid
>>information. As a result, I have sometimes had to abandon
>>datetime, even though it seemed like the logical choice for
>>representing data.
>>
>>E.g. I might have information like "this paper was published
>>in May 1997". There's no way to write that with datetime,
>>is there? Even if I just use the "date" object instead of
>>datetime, I still have to actually specify something like
>>May 1, 1997 --- fabricating data, which is frequently
>>undesirable (later on, I might find information saying that
>>it was actually published May 23, 1997 and I might want
>>to update the earlier one, or simply evaluate them as
>>"equal" since they are, to within the precision given ---
>>for example, I might be trying to decide that two database
>>entries are really duplicate references to the same paper).
>>
>>I know that this is somewhat theoretically stated, but I
>>have run into concrete problems along the lines of
>>the above.
>>
>>I'd say this is analogous to how you might use "None"
>>rather than "0" to represent an integer if you don't know
>>its value (rather than knowing that it is zero). ISTM, you
>>ought to be able to specify a date as, e.g.:
>>
>>d = datetime.date(2005, 5, None)
>>
>>I realize there might be some complexity with deciding
>>how to handle datestamp math, but as this situation
>>occurs frequently in real life, it seems like it shouldn't
>>be avoided.
>>
>>How do other people deal with this kind of problem?
>
>
> Mostly, badly :-(
>
> Real-life example: due to war-time disruption etc, in some countries
> it is common enough to find that the date of birth of someone born in
> the 1940s is not known precisely. E.g. on the Hong Kong identity card,
> it is possible to find only the year and month of birth, and sometimes
> even only the year. Depending on the purpose, legislation and
> convention will take the first day of the vague period or the last day
> when a calculation is required. Badly == entering into a database the
> "exact" date that was used for the purpose du jour, with no indication
> that the source was vague. Consequently a person can have DOB recorded
> as 1945-01-01 on one database and 1945-12-31 on another.
>
> Suggested approach in Python (sketch): Don't try to get the datetime
> module to solve the problem. Define a fuzzydate class. Internal
> representation: I'd suggest earliest possible date and latest possible
> date. That way you have valid date instances for doing date
> arithmetic. May have different constructors depending on how the
> incoming vagueness is specified.
>
> HTH,
> John
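For what it's worth, the fuzzydate idea sketched above might look roughly like the following. This is only a minimal sketch: the class name FuzzyDate, the from_parts constructor, and the overlaps method are all made up for illustration, and treating "overlapping ranges" as "possibly the same date" is just one assumption about how the duplicate-paper comparison could work.

```python
import calendar
import datetime


class FuzzyDate:
    """A date known only to lie between two real dates (hypothetical class)."""

    def __init__(self, earliest, latest):
        if earliest > latest:
            raise ValueError("earliest must not be after latest")
        # Both ends are ordinary datetime.date objects, so date
        # arithmetic on either bound keeps working as usual.
        self.earliest = earliest
        self.latest = latest

    @classmethod
    def from_parts(cls, year, month=None, day=None):
        """Build from a year plus optional month and day (None = unknown)."""
        if month is None:
            if day is not None:
                raise ValueError("a day without a month is not supported")
            return cls(datetime.date(year, 1, 1), datetime.date(year, 12, 31))
        if day is None:
            # monthrange returns (weekday of the 1st, number of days in month)
            last_day = calendar.monthrange(year, month)[1]
            return cls(datetime.date(year, month, 1),
                       datetime.date(year, month, last_day))
        exact = datetime.date(year, month, day)
        return cls(exact, exact)

    def is_exact(self):
        return self.earliest == self.latest

    def overlaps(self, other):
        """True if both could refer to the same calendar day."""
        return self.earliest <= other.latest and other.earliest <= self.latest

    def __repr__(self):
        return "FuzzyDate(%s, %s)" % (self.earliest.isoformat(),
                                      self.latest.isoformat())


# "Published in May 1997" versus a later, more precise report.
vague = FuzzyDate.from_parts(1997, 5)
precise = FuzzyDate.from_parts(1997, 5, 23)
print(vague, precise, vague.overlaps(precise))
```

Because the vagueness is kept in the object, the 1945-01-01 versus 1945-12-31 situation would show up as one range rather than as two conflicting "exact" dates.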
This is a very common problem in genealogy research, as well as in other sciences that deal with history, such as geology, geography, and archeology. I agree that some standard way of dealing with fuzzy dates would be a good thing.

I think looking at how others do it would be the way to start. A Google search found the following reference buried in a long reference page on MySQL:

http://www.dreamlink.net/mysql/manual_Functions.html

> The reason the ranges for the month and day specifiers begin with zero
> is that MySQL allows incomplete dates such as '2004-00-00' to be stored
> as of MySQL 3.23.

So it seems that using 0's for the missing day or month may be how to do it.

Cheers,
_Ron
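P.S. Python's datetime.date won't accept a zero month or day (it raises ValueError), so a string in that zero-filled style would need converting before use. Here is a rough sketch of one way to do it, turning the string into an (earliest, latest) pair of real dates. The function name parse_incomplete_date is made up, and this is only my guess at a sensible reading of the zeros, not a description of what MySQL itself does with such values.

```python
import calendar
import datetime


def parse_incomplete_date(text):
    """Return (earliest, latest) dates for a 'YYYY-MM-DD' string.

    A month or day of 00 is read as "unknown" (an assumption for this
    sketch), and the result spans the whole unknown period.
    """
    year, month, day = (int(part) for part in text.split("-"))
    if month == 0:
        if day != 0:
            raise ValueError("day given without a month: %r" % text)
        return datetime.date(year, 1, 1), datetime.date(year, 12, 31)
    if day == 0:
        # monthrange returns (weekday of the 1st, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        return (datetime.date(year, month, 1),
                datetime.date(year, month, last_day))
    exact = datetime.date(year, month, day)
    return exact, exact


for raw in ("2004-00-00", "2004-02-00", "1997-05-23"):
    print(raw, parse_incomplete_date(raw))
```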