Re: [Persian Locale d6 Feedback] Short Format Dates
Hooman,

There is an application called Unibook that may save you from writing your own software for browsing the database. It depends on your needs, of course: http://www.unicode.org/unibook/

roozbeh

On Sat, 2004-06-26 at 06:38, Hooman Mehr wrote:
> Hi Behdad,
>
> On Jun 26, 2004, at 1:50 AM, Behdad Esfahbod wrote:
>> I'm confused now. What do you expect in PropList.txt about U+060D? If you read UCD.html, it says that files like PropList.txt just list those code points that hold a true value for the binary property.
>>
>> Why don't they list them all? Why should they? There are more than a million code points, while the points of interest usually number fewer than a thousand...
>>
>> behdad
>
> You are right, that was my mistake. I had some wrong perceptions about U+060D that made me believe it would belong there.
>
> I am starting to feel I need to import all those data files into a database for quick reference. I am getting tired of having to find information scattered across so many different places (book, charts and various data files). I still feel there should be a better way of organizing all the information in Unicode.
>
> - Hooman

___
PersianComputing mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/persiancomputing
Re: [Persian Locale d6 Feedback] Short Format Dates
On Sat, 26 Jun 2004, Hooman Mehr wrote:
> Hi Behdad,
>
> You are right, that was my mistake. I had some wrong perceptions about U+060D that made me believe it would belong there.
>
> I am starting to feel I need to import all those data files into a database for quick reference. I am getting tired of having to find information scattered across so many different places (book, charts and various data files). I still feel there should be a better way of organizing all the information in Unicode.
>
> - Hooman

There are applications out there that do this. Under Linux, gucharmap is one, though it is not really that Unicode-oriented. For Windows, unicode.org releases an application for this, but unfortunately I don't recall the name right now, nor can I find it on their site. But I'm sure there is one; I downloaded it last month (and couldn't run it!).

--behdad
behdad.org
Re: [Persian Locale d6 Feedback] Short Format Dates
On Fri, 25 Jun 2004, Hooman Mehr wrote:
> Hi Behdad,

Hello,

> Glad to hear the good news. Is there anything that may impact end users? If there is, please provide a non-technical overview of the changes that will affect normal users of Persian text on computers.

No, not really.

> What I meant about U+060D is that I expected to find something about it in /UNIDATA/PropList.txt but it wasn't there. That is the reason I asked. Now I have figured it out: both the applicable defaults and also explicitly in UnicodeData.txt.
>
> Sometimes I find the UCD (Unicode Character Database) files confusing. Is there any hope they will be cleaned up further? For example, why not explicitly include characters in all expected places instead of relying on fallback and default properties?

I'm confused now. What do you expect in PropList.txt about U+060D? If you read UCD.html, it says that files like PropList.txt just list those code points that hold a true value for the binary property.

Why don't they list them all? Why should they? There are more than a million code points, while the points of interest usually number fewer than a thousand...

behdad

> - Hooman
>
> On Jun 24, 2004, at 12:17 AM, Behdad Esfahbod wrote:
>> On Wed, 23 Jun 2004, Behdad Esfahbod wrote:
>>> On Tue, 22 Jun 2004, Hooman Mehr wrote:
>>>> Excellent news. While talking about clarifications, I couldn't find the properties for U+060D. Do you have information in this regard?
>>>
>>> No idea. What kind of information are you looking for? If this is what you like to hear, yes, using that character instead of slash solves your problem of entering short dates. :-)
>>
>> OK, here is more info from Chapter 8 of Unicode, available online at:
>> http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf#G20596
>>
>> It says:
>>
>> Date Separator. U+060D ARABIC DATE SEPARATOR is used in Pakistan and India between the numeric date and the month name when writing out a date. This sign is distinct from U+002F SOLIDUS, which is used, for example, as a separator in currency amounts.
>>
>> --behdad
>> behdad.org

--behdad
behdad.org
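The point about PropList.txt can be illustrated with a small sketch. It assumes the usual UCD line layout (`range ; Property # comment`) and works on a few sample rows embedded in the code rather than on a real release file; the function names are hypothetical. A code point that is absent from the file simply has the value "false" for that binary property.

```python
# A few sample rows written in the PropList.txt layout (assumption: they are
# only illustrative; check the actual file of the UCD release you use).
SAMPLE = """\
# comment line
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
002D          ; Dash # Pd       HYPHEN-MINUS
"""


def parse_binary_properties(text):
    """Map each property name to the list of (first, last) ranges listed for it."""
    props = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        rng, name = (part.strip() for part in line.split(";"))
        start, _, end = rng.partition("..")
        first = int(start, 16)
        last = int(end, 16) if end else first
        props.setdefault(name, []).append((first, last))
    return props


def has_property(props, name, cp):
    """True only if cp is listed; unlisted code points default to False."""
    return any(first <= cp <= last for first, last in props.get(name, []))


PROPS = parse_binary_properties(SAMPLE)
print(has_property(PROPS, "White_Space", 0x0020))  # listed in the sample
print(has_property(PROPS, "White_Space", 0x060D))  # not listed, so False
```

Listing only the "true" code points keeps the file small; the default for everything else is implied.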
Re: [Persian Locale d6 Feedback] Short Format Dates
Hi Behdad,

On Jun 26, 2004, at 1:50 AM, Behdad Esfahbod wrote:
> I'm confused now. What do you expect in PropList.txt about U+060D? If you read UCD.html, it says that files like PropList.txt just list those code points that hold a true value for the binary property.
>
> Why don't they list them all? Why should they? There are more than a million code points, while the points of interest usually number fewer than a thousand...
>
> behdad

You are right, that was my mistake. I had some wrong perceptions about U+060D that made me believe it would belong there.

I am starting to feel I need to import all those data files into a database for quick reference. I am getting tired of having to find information scattered across so many different places (book, charts and various data files). I still feel there should be a better way of organizing all the information in Unicode.

- Hooman
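The idea of importing the data files into a database could look roughly like the sketch below. It assumes the semicolon-separated UnicodeData.txt layout (code point, name, general category, combining class, bidi class, ...) and loads two sample rows into an in-memory SQLite table; the table and function names are hypothetical, and the sample rows should be checked against the UCD release actually in use.

```python
import sqlite3

# Two sample rows in the UnicodeData.txt layout (illustrative; verify against
# the real file before relying on the values).
SAMPLE = """\
002F;SOLIDUS;Po;0;CS;;;;;N;SLASH;;;;
060D;ARABIC DATE SEPARATOR;Po;0;AL;;;;;N;;;;;
"""


def load_unicode_data(text):
    """Load code point, name, general category and bidi class into SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chars (cp INTEGER PRIMARY KEY, name TEXT, gc TEXT, bidi TEXT)"
    )
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        # fields[0]=code point, [1]=name, [2]=general category, [4]=bidi class
        rows.append((int(fields[0], 16), fields[1], fields[2], fields[4]))
    conn.executemany("INSERT INTO chars VALUES (?, ?, ?, ?)", rows)
    return conn


def lookup(conn, cp):
    """Return (name, gc, bidi) for a code point, or None if it was not loaded."""
    return conn.execute(
        "SELECT name, gc, bidi FROM chars WHERE cp = ?", (cp,)
    ).fetchone()


conn = load_unicode_data(SAMPLE)
print(lookup(conn, 0x060D))
```

The same loop could be pointed at a downloaded copy of the file; range entries and the other UCD files would need their own handling.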
Re: [Persian Locale d6 Feedback] Short Format Dates
On Wed, 23 Jun 2004, Behdad Esfahbod wrote:
> On Tue, 22 Jun 2004, Hooman Mehr wrote:
>> Excellent news. While talking about clarifications, I couldn't find the properties for U+060D. Do you have information in this regard?
>
> No idea. What kind of information are you looking for? If this is what you like to hear, yes, using that character instead of slash solves your problem of entering short dates. :-)

OK, here is more info from Chapter 8 of Unicode, available online at:
http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf#G20596

It says:

Date Separator. U+060D ARABIC DATE SEPARATOR is used in Pakistan and India between the numeric date and the month name when writing out a date. This sign is distinct from U+002F SOLIDUS, which is used, for example, as a separator in currency amounts.

--behdad
behdad.org
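For a quick look at the properties of the two characters discussed here, Python's standard `unicodedata` module can be queried directly. This is only a sketch: the values reported come from whatever Unicode version the running Python ships with, which is newer than the Unicode 4.0 text quoted above, and `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(ch):
    """Return (code point, name, general category, bidi class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


# U+060D ARABIC DATE SEPARATOR versus U+002F SOLIDUS
for ch in ("\u060d", "/"):
    print(describe(ch))
```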
Re: [Persian Locale d6 Feedback] Short Format Dates
On Tue, 22 Jun 2004, Hooman Mehr wrote:
>> BTW, Behdad is attending the Unicode Consortium's Technical Committee meeting right now, and later the ISO JTC1/SC2 ones. I'm sure the UTC meeting (which will be the first with a FarsiWeb member present) will have good news for us (which may include more changes and clarifications to the Bidirectional algorithm).

Yeah, I'm exceptionally happy with the outcome. Since the changes are highly technical, I won't go over them on this list.

> Excellent news. While talking about clarifications, I couldn't find the properties for U+060D. Do you have information in this regard?

No idea. What kind of information are you looking for? If this is what you like to hear, yes, using that character instead of slash solves your problem of entering short dates. :-)

--behdad
behdad.org
[Persian Locale d6 Feedback] Short Format Dates
Dear Roozbeh,

On page 2 (physical page 3) of the Locale draft, the short format locale is specified in a table with some examples and explanation. The missing information is this:

We know that the correct way to read (pronounce) a short format date that looks like 1358/1/12 is 12-e Farvardin-e 1358, just like the long format (please don't get into the kasre-ye ezafe debate, this is not what I mean). Since Unicode assumes that text is entered in the same way that it is intended to be read (pronounced or processed, which is called the logical order), one expects to be able to do the following data entry: 12 followed by / followed by 1 followed by / followed by 1358. I suspect that you didn't type it like that, because normal software would result in a display of 12/1/1358. The reason is that / (slash, U+002F) is a neutral character, and when surrounded by digits it gets left-to-right directionality according to the Unicode bidi algorithm.

In short, there is no mention of how you get the display results that you are showing in the tables. There are many ways that you can enter data and embed or assume different directionality and get the same visual results. I think you should be specific about directionality assumptions. The logical short format in Persian is day, month, year, but with normal delimiters and digits this is not how you get the visually correct result of year/month/day.

The best solution in my opinion is to provide exact format strings (as arrays of Unicode characters with specific placeholders for date elements). This will avoid any possible ambiguity in the specification.

I sincerely hope that you won't tell me that you expect the users to type 1383 then / then 1 then / then 12 to enter a date in short format, because it would be unnatural and non-obvious (although currently it may be the only way to get a correct result with the available software applications).

The debate here is whether we should turn workarounds that are logically questionable into standards that are assumed to have a sound logical foundation. As I have seen, you have defended going back to using the correct yeh and correcting the faulty software/fonts, so I hope you choose the right thing to do this time as well.

All right, I know, you may say: It is impossible any other way! What is the solution?

Answer: Nothing is impossible, but the answer is gonna cost you!

- Hooman Mehr
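The display behaviour described in this message can be inspected with a small sketch that prints the bidirectional class of each character in the logically typed string. It does not run the bidi algorithm itself; it only reports the classes that Python's `unicodedata` module knows, which reflect a Unicode version newer than the one under discussion. With European-number digits on both sides, rule W4 of the bidi algorithm treats a single separator between them as part of the number, so the whole date stays one left-to-right run and is displayed in the order typed. The helper name and the digit translation table are assumptions made for the illustration.

```python
import unicodedata

# Translation table from ASCII digits to Persian digits (U+06F0..U+06F9).
PERSIAN_DIGITS = str.maketrans(
    "0123456789", "".join(chr(0x06F0 + i) for i in range(10))
)


def bidi_classes(text):
    """Pair every character with its bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Day, month, year typed in logical (spoken) order.
typed_ascii = "12/1/1358"
typed_persian = typed_ascii.translate(PERSIAN_DIGITS)

for label, text in (("ASCII digits", typed_ascii), ("Persian digits", typed_persian)):
    print(label, [cls for _, cls in bidi_classes(text)])
```

Both variants report number classes around the separators, which is consistent with the typed order day/month/year also being the displayed order.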
Re: [Persian Locale d6 Feedback] Short Format Dates
On Sat, 2004-06-19 at 18:41, Hooman Mehr wrote:
> [...] The best solution in my opinion is to provide exact format strings (as arrays of Unicode characters with specific placeholders for date elements). This will avoid any possible ambiguity in the specification.

That will be specified in a coming appendix, which will have the locale data for ICU and the GNU C library.

Anyway, the situation is worse than you may guess. The Unicode Consortium has changed the bidirectional category of a few characters, including Slash, in Unicode 4.0.1. For Slash and its brethren, it's not just Neutral or things like that. We have things like European Terminators and Common Separators in the Unicode Bidi algorithm.

> I sincerely hope that you won't tell me that you expect the users to type 1383 then / then 1 then / then 12 to enter a date in short format, because it would be unnatural and non-obvious (although currently it may be the only way to get a correct result with the available software applications).

I'm not implying anything about users here. We are specifying how the final text should be displayed. We have not specified how to encode it (of course that doesn't mean one is allowed to encode it however he likes). If we did that, we might not remain conformant to Unicode if Unicode changes yet another bidirectional category in a later version.

> As I have seen, you have defended going back to using the correct yeh and correcting the faulty software/fonts, so I hope you choose the right thing to do this time as well.

I always do the right thing, don't I? ;-)

> All right, I know, you may say: It is impossible any other way! What is the solution?

I say: the answer is too technical to be included in the locale specification. There will be different answers for different situations, in different contexts, or in different Unicode versions.

BTW, Behdad is attending the Unicode Consortium's Technical Committee meeting right now, and later the ISO JTC1/SC2 ones. I'm sure the UTC meeting (which will be the first with a FarsiWeb member present) will have good news for us (which may include more changes and clarifications to the Bidirectional algorithm).

roozbeh