Bugs item #947906, was opened at 2004-05-04 20:38
Message generated for change (Comment added) made by doerwalter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=947906&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 7
Submitted By: Leonardo Rochael Almeida (rochael)
Assigned to: Nobody/Anonymous (nobody)
Summary: calendar.weekheader(n): n should mean chars not bytes

Initial Comment:
calendar.weekheader(n) is locale aware, which is good in principle. The parameter n, however, is interpreted as meaning bytes, not chars, which can generate broken strings for, e.g. localized weekday names:

>>> calendar.weekheader(2)
'Mo Tu We Th Fr Sa Su'
>>> locale.setlocale(locale.LC_ALL, "pt_BR.UTF-8")
'pt_BR.UTF-8'
>>> calendar.weekheader(2)
'Se Te Qu Qu Se S\xc3 Do'

Notice how "Sábado" (Saturday) above is missing the second utf-8 byte for the encoding of "á":

>>> u"Sá".encode("utf-8")
'S\xc3\xa1'

The implementation of weekheader (and of all of calendar.py, it seems) is based on localized 8 bit strings. I suppose the correct fix for this bug will involve a roundtrip thru unicode.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-31 17:14

Message:
Logged In: YES
user_id=89016

Here's a new version of the patch with documentation for the Calendar classes and a new test. The script interface isn't documented in the TeX file (python -mcalendar --help should be enough).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2004-07-21 21:17

Message:
Logged In: YES
user_id=89016

The following patch doesn't fix the unicode problem, but it should enable us to have both 8bit and unicode calendars. It reimplements the calendar functionality as classes. This makes it possible to reuse the date calculation logic and extend or replace the string formatting logic.
Implementing a unicode version would be done by subclassing TextCalendar and overriding formatweekday() and formatmonthname().

The patch adds several other features:

An HTML version of a calendar can be output. (An example output can be found at http://styx.livinglogic.de/~walter/calendar/calendar.html).

The calendar module can be used as a script from the command line. Various options are available.

It's possible to specify the number of months per row (they were fixed at 3 in the old version).

If this patch is accepted I can provide documentation and tests.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-06-03 06:43

Message:
Logged In: YES
user_id=21627

Adding a ucalendar module would be reasonable, IMO. Introducing ustrftime is not necessary - we could just apply the "unicode in/unicode out" procedure (i.e. if the format is a Unicode string, return a Unicode result). The tricky part of that is to convert the strftime result to Unicode. We could try mbstowcs, but that would fail if the locale doesn't use Unicode for wchar_t.

Once ucalendar is written, we could document that the calendar module has known problems if the locale's encoding is not Latin-1.

However, I'm not going to implement that any time soon, so unassigning.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2004-06-02 21:08

Message:
Logged In: YES
user_id=89016

Maybe we should have a second version of calendar (named ucalendar?) that works with unicode strings? Could those two modules be rewritten to use as much common functionality as possible? Or we could use a module global to configure whether str or unicode should be returned?
Most of the localization functionality in calendar seems to come from datetime.datetime.strftime(), so it probably would help to have a method datetime.datetime.ustrftime() that returns the formatted string as unicode (using the locale encoding).

Assigning to MvL as the locale/unicode expert.

----------------------------------------------------------------------

Comment By: Hye-Shik Chang (perky)
Date: 2004-05-08 01:57

Message:
Logged In: YES
user_id=55188

I think the argument to calendar.weekheader should mean neither chars nor bytes but width, because the function is currently used for fixed width representations of calendars. Yes, they are the same for western alphabets. But many CJK characters are full width, so they need only 1 character for calendar.weekheader(2), and that is conventional in real life, too. But we don't have unicode.width() support to implement the feature yet.

----------------------------------------------------------------------
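
The three readings of n raised in this thread (bytes, characters, display columns) can be compared in a short sketch. It is written for Python 3, not for the Python 2.3 this report targets, so treat it as an illustration and not as the patch attached to this item. The names truncate_bytes, truncate_chars, display_width, truncate_width and WidthCalendar are made up for the example. unicodedata.east_asian_width() is used as a rough stand-in for the width support mentioned above, and every character that is not wide or fullwidth is assumed to take one column, which is wrong for combining marks.

```python
import calendar
import unicodedata


def truncate_bytes(name, n, encoding="utf-8"):
    """Cut the encoded name after n bytes (the behaviour reported as a bug)."""
    return name.encode(encoding)[:n]


def truncate_chars(name, n):
    """Cut the name after n characters (code points)."""
    return name[:n]


def _columns(ch):
    # Assumption: wide (W) and fullwidth (F) characters take two columns,
    # everything else takes one. Combining marks are not handled.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def display_width(text):
    """Approximate number of terminal columns occupied by text."""
    return sum(_columns(ch) for ch in text)


def truncate_width(name, width):
    """Keep the leading characters that fit in width columns, pad the rest."""
    kept = []
    used = 0
    for ch in name:
        cols = _columns(ch)
        if used + cols > width:
            break
        kept.append(ch)
        used += cols
    return "".join(kept) + " " * (width - used)


class WidthCalendar(calendar.TextCalendar):
    """Hypothetical subclass: weekday cells are cut by display width."""

    def formatweekday(self, day, width):
        # Always starts from the abbreviated name to keep the sketch short.
        return truncate_width(calendar.day_abbr[day], width)


if __name__ == "__main__":
    print(truncate_bytes("S\u00e1bado", 2))   # b'S\xc3' (broken UTF-8 sequence)
    print(truncate_chars("S\u00e1bado", 2))   # 'Sá'
    print(WidthCalendar().formatweekheader(2))
```

With a CJK weekday name such as "\u571f\u66dc\u65e5", truncate_width(name, 2) keeps a single character, which matches the convention described in the last comment.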