Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-28 Thread Roozbeh Pournader
Hooman,

There is this application called Unibook that may help you NOT write the
software for browsing the database. Depends on your needs of course:

http://www.unicode.org/unibook/

roozbeh

On Sat, 2004-06-26 at 06:38, Hooman Mehr wrote:
 Hi Behdad,
 
 On Jun 26, 2004, at 1:50 AM, Behdad Esfahbod wrote:
 
 
  I'm confused now.  What do you expect in PropList.txt about
  U+060D?  If you read UCD.html, it says that files like
  PropList.txt just list those code points that hold a true value
  for the binary property.  Why don't they list them all?  Why
  should they?  There are more than a million of them, while
  points of interest are usually fewer than a thousand...
 
  behdad
 
 You are right, that was my mistake. I had some wrong perceptions about 
 U+060D that made me believe it would belong there. I am starting to 
 feel I need to import all those data files into a database for quick 
 reference. I am getting tired of having to find information scattered 
 across so many different places (book, charts and various data files). I
 still feel there should be a better way for organizing all the 
 information in Unicode.
 
 - Hooman
 
 ___
 PersianComputing mailing list
 [EMAIL PROTECTED]
 http://lists.sharif.edu/mailman/listinfo/persiancomputing



Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-26 Thread Behdad Esfahbod
On Sat, 26 Jun 2004, Hooman Mehr wrote:

 Hi Behdad,

 You are right, that was my mistake. I had some wrong perceptions about
 U+060D that made me believe it would belong there. I am starting to
 feel I need to import all those data files into a database for quick
 reference. I am getting tired of having to find information scattered
 across so many different places (book, charts and various data files). I
 still feel there should be a better way for organizing all the
 information in Unicode.

 - Hooman


There are applications out there that do this.  Under Linux,
gucharmap is one, though it is not really that Unicode-oriented.
For Windows, unicode.org releases an application for this, but
unfortunately I don't recall its name right now, nor can I find
it on their site.  But I'm sure there is one; I downloaded it
last month (and couldn't run it!).

--behdad
  behdad.org


Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-25 Thread Behdad Esfahbod
On Fri, 25 Jun 2004, Hooman Mehr wrote:

 Hi Behdad,

Hello,

 Glad to hear the good news. Is there anything that may impact end
 users? If there is, please provide a non-technical overview of the
 changes that will affect normal users of Persian text on computers.

No, not really.

 What I meant about U+060D is that I expected to find something about it
 in /UNIDATA/PropList.txt but it wasn't there. That is the reason I
 asked. Now I have figured it out: its properties come both from the
 applicable defaults and from the explicit entry in UnicodeData.txt.
 Sometimes I find the UCD (Unicode Character Database) files confusing.
 Is there any hope they will be cleaned up further? For example, why not
 explicitly include characters in all expected places instead of relying
 on fallback and default properties?

I'm confused now.  What do you expect in PropList.txt about
U+060D?  If you read UCD.html, it says that files like
PropList.txt just list those code points that hold a true value
for the binary property.  Why don't they list them all?  Why
should they?  There are more than a million of them, while
points of interest are usually fewer than a thousand...
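
To illustrate the point about binary properties, here is a minimal Python
sketch that reads a few lines written in the PropList.txt layout and treats
every code point that is not listed as false. The three sample lines are only
a tiny excerpt for illustration, and the helper names (parse_proplist,
has_property) are made up for this example.

```python
import re

# A tiny excerpt in the PropList.txt layout (not the whole file).
SAMPLE = """\
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
0085          ; White_Space # Cc       <control-0085>
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
LINE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def parse_proplist(text):
    """Map each property name to a list of (first, last) code point ranges."""
    props = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:  # skip blank lines and comment-only lines
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        props.setdefault(m.group(3), []).append((first, last))
    return props


def has_property(props, name, cp):
    """True only if cp falls in a listed range; anything unlisted is False."""
    return any(first <= cp <= last for first, last in props.get(name, []))


props = parse_proplist(SAMPLE)
print(has_property(props, "White_Space", 0x0020))  # True
print(has_property(props, "White_Space", 0x060D))  # False
```

The second lookup returns False simply because U+060D appears nowhere in the
listed ranges, which is the "default" behaviour described above.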

behdad

 - Hooman

 On Jun 24, 2004, at 12:17 AM, Behdad Esfahbod wrote:

  On Wed, 23 Jun 2004, Behdad Esfahbod wrote:
 
  On Tue, 22 Jun 2004, Hooman Mehr wrote:
 
  Excellent news. While talking about clarifications, I couldn't find
  the properties for U+060D. Do you have information in this regard?
 
  No idea.  What kind of information are you looking for?  If this
  is what you'd like to hear: yes, using that character instead of
  the slash solves your problem of entering short dates. :-)
 
  OK, here is more info from Chapter 8 of the Unicode Standard,
  available online at:
 
  http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf#G20596
 
  It says:
 
  Date Separator. U+060D ARABIC DATE SEPARATOR is used in Pakistan
  and India between the numeric date and the month name when
  writing out a date.  This sign is distinct from U+002F SOLIDUS,
  which is used, for example, as a separator in currency amounts.
 
 
  --behdad
behdad.org
 




--behdad
  behdad.org


Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-25 Thread Hooman Mehr
Hi Behdad,

On Jun 26, 2004, at 1:50 AM, Behdad Esfahbod wrote:

 I'm confused now.  What do you expect in PropList.txt about
 U+060D?  If you read UCD.html, it says that files like
 PropList.txt just list those code points that hold a true value
 for the binary property.  Why don't they list them all?  Why
 should they?  There are more than a million of them, while
 points of interest are usually fewer than a thousand...

 behdad

You are right, that was my mistake. I had some wrong perceptions about
U+060D that made me believe it would belong there. I am starting to
feel I need to import all those data files into a database for quick
reference. I am getting tired of having to find information scattered
across so many different places (book, charts and various data files). I
still feel there should be a better way for organizing all the
information in Unicode.
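
As a rough idea of what such a database import could look like, the Python
sketch below loads a few records written in the UnicodeData.txt layout into an
in-memory SQLite table and queries it. The three sample records, the table
name and the column names are assumptions made for this example; a real import
would read the full file and should be checked against it.

```python
import sqlite3

# Sample records in the semicolon-separated UnicodeData.txt layout.
# Only fields 0 (code point), 1 (name), 2 (general category) and
# 4 (bidi class) are used below.
SAMPLE = """\
002F;SOLIDUS;Po;0;CS;;;;;N;SLASH;;;;
060C;ARABIC COMMA;Po;0;CS;;;;;N;;;;;
060D;ARABIC DATE SEPARATOR;Po;0;AL;;;;;N;;;;;
"""


def load(conn, text):
    """Create a small 'chars' table (hypothetical schema) and fill it."""
    conn.execute(
        "CREATE TABLE chars (cp INTEGER PRIMARY KEY, name TEXT, gc TEXT, bidi TEXT)"
    )
    for line in text.splitlines():
        fields = line.split(";")
        conn.execute(
            "INSERT INTO chars VALUES (?, ?, ?, ?)",
            (int(fields[0], 16), fields[1], fields[2], fields[4]),
        )


conn = sqlite3.connect(":memory:")
load(conn, SAMPLE)

# Example query: which of the loaded characters have bidi class CS?
cs_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM chars WHERE bidi = ? ORDER BY cp", ("CS",)
    )
]
print(cs_names)  # ['SOLIDUS', 'ARABIC COMMA']
```

Once the data sits in one table, questions that otherwise need several files
open side by side become single queries.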

- Hooman


Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-23 Thread Behdad Esfahbod
On Wed, 23 Jun 2004, Behdad Esfahbod wrote:

 On Tue, 22 Jun 2004, Hooman Mehr wrote:

  Excellent news. While talking about clarifications, I couldn't find the
  properties for U+060D. Do you have information in this regard?

 No idea.  What kind of information are you looking for?  If this
 is what you'd like to hear: yes, using that character instead of
 the slash solves your problem of entering short dates. :-)

OK, here is more info from Chapter 8 of the Unicode Standard,
available online at:

http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf#G20596

It says:

Date Separator. U+060D ARABIC DATE SEPARATOR is used in Pakistan
and India between the numeric date and the month name when
writing out a date.  This sign is distinct from U+002F SOLIDUS,
which is used, for example, as a separator in currency amounts.
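
For anyone who wants to look up the properties of U+060D without digging
through the data files, Python's standard unicodedata module is one option.
This is only a sketch: the values it reports come from the Unicode version
bundled with the Python interpreter in use, which is not necessarily the
version discussed in this thread.

```python
import unicodedata

ch = "\u060D"

# Collect a few basic properties of the character.
info = {
    "name": unicodedata.name(ch, "<unnamed>"),
    "category": unicodedata.category(ch),
    "bidi": unicodedata.bidirectional(ch),
}

# Print the bundled Unicode version next to the result, since
# property values can change between versions.
print(unicodedata.unidata_version, info)
```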


--behdad
  behdad.org


Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-23 Thread Behdad Esfahbod
On Tue, 22 Jun 2004, Hooman Mehr wrote:

  BTW, Behdad is attending the Unicode Consortium's Technical Committee
  meeting right now, and later the ISO JTC1/SC2 ones. I'm sure the UTC
  meeting (which will be the first with a FarsiWeb member present) will
  have good news for us (which may include more changes and
  clarifications to the Bidirectional algorithm).

Yeah, I'm exceptionally happy with the outcome.  Since the
changes are highly technical, I won't go over them on this list.

 Excellent news. While talking about clarifications, I couldn't find the
 properties for U+060D. Do you have information in this regard?

No idea.  What kind of information are you looking for?  If this
is what you'd like to hear: yes, using that character instead of
the slash solves your problem of entering short dates. :-)

--behdad
  behdad.org


[Persian Locale d6 Feedback] Short Format Dates

2004-06-19 Thread Hooman Mehr
Dear Roozbeh,
On page 2 (physical page 3) of the Locale draft, the short date
format is specified in a table with some examples and explanations. The
missing information is this:

We know that the correct way to read (pronounce) a short format date
that looks like 1358/1/12 is 12-e Farvardin-e 1358, just like the
long format (please don't get into the kasre-ye ezafe debate; that is
not what I mean).

Since Unicode assumes that text is entered in the same order in which it
is intended to be read (pronounced or processed), which is called the
logical order, one expects to be able to do the following data entry:
12 followed by / followed by 1 followed by / followed by 1358.

I suspect that you didn't type it like that, because normal
software would then display 12/1/1358. The reason is that /
(slash, U+002F) is a neutral character and when surrounded by digits it
gets left-to-right directionality according to the Unicode bidi algorithm.
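
The bidi classes involved can be inspected directly. The Python sketch below
lists the class of every character in a date typed in day, month, year order,
once with ASCII digits and once with Persian digits. It relies on the Unicode
data bundled with the Python interpreter, so the classes it reports may differ
from those in force when this was written; the helper name bidi_classes is
made up for this example.

```python
import unicodedata


def bidi_classes(text):
    """Return (code point label, bidi class) for each character of text."""
    return [(f"U+{ord(c):04X}", unicodedata.bidirectional(c)) for c in text]


# Day/month/year typed in logical order, with ASCII digits...
ascii_date = "12/1/1358"
# ...and with Persian (Extended Arabic-Indic) digits.
persian_date = "\u06F1\u06F2/\u06F1/\u06F1\u06F3\u06F5\u06F8"

for label, date in (("ASCII", ascii_date), ("Persian", persian_date)):
    print(label)
    for cp, cls in bidi_classes(date):
        print("  ", cp, cls)
```

In the data shipped with current Python versions both kinds of digits report
EN (European Number), so the slash sits between two runs of numbers in either
case.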

In short, there is no mention of how you get the display results that 
you are showing in the tables. There are many ways that you can enter 
data and embed or assume different directionality and get the same 
visual results. I think you should be specific about directionality 
assumptions. The logical short format in Persian is day, month, year, 
but with normal delimiters and digits this is not how you get the 
visually correct result of year/month/day.

The best solution in my opinion is to provide exact format strings (as 
arrays of Unicode characters with specific placeholders for date 
elements). This will avoid any possible ambiguity in the specification.
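
One possible shape for such a format string is sketched below in Python: a
sequence whose items are either literal Unicode characters or named
placeholders, kept in logical (typing) order. The pattern shown is purely
hypothetical and is not taken from the locale draft; which order and which
separator the draft should use is exactly the open question. The names Field,
SHORT_DATE, render and code_points are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Placeholder for a date element: "year", "month" or "day"."""
    name: str


# Hypothetical pattern, stored in logical order with explicit separators.
SHORT_DATE = [Field("year"), "\u002F", Field("month"), "\u002F", Field("day")]


def render(pattern, **values):
    """Fill the placeholders and return the string in logical order."""
    return "".join(
        str(values[item.name]) if isinstance(item, Field) else item
        for item in pattern
    )


def code_points(text):
    """Spell out a string as code points, leaving no doubt about encoding."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


s = render(SHORT_DATE, year=1358, month=1, day=12)
print(s)               # 1358/1/12
print(code_points(s))  # U+0031 U+0033 U+0035 U+0038 U+002F U+0031 U+002F U+0031 U+0032
```

Listing the result as code points removes any dependence on how a particular
display engine happens to order the characters on screen.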

I sincerely hope that you won't tell me that you expect the users to
type 1383 then / then 1 then / then 12 to enter a date in short format,
because it would be unnatural and non-obvious (although currently it
may be the only way to get a correct result with the available software
applications).  The debate here is whether we should turn workarounds
that are logically questionable into standards that are assumed to have
a sound logical foundation.

As I have seen, you have defended going back to using the correct yeh 
and correcting the faulty software/fonts, so I hope you choose the 
right thing to do this time as well.

Alright I know, you may say: It is impossible any other way!  What is 
the solution? Answer: Nothing is impossible, but the answer is gonna 
cost you!

- Hooman Mehr


Re: [Persian Locale d6 Feedback] Short Format Dates

2004-06-19 Thread Roozbeh Pournader
On Sat, 2004-06-19 at 18:41, Hooman Mehr wrote:
 [...] The best solution in my opinion is to provide exact format strings (as 
 arrays of Unicode characters with specific placeholders for date 
 elements). This will avoid any possible ambiguity in the specification.

That will be specified in a coming appendix, which will have the locale
data for ICU and the GNU C library.

Anyway, the situation is worse than what you may guess. The Unicode
Consortium has changed the bidirectional category of a few characters,
including Slash, in Unicode 4.0.1. For Slash and its brethren, it's not
just Neutral or things like that. We have categories like European
Terminators and Common Separators in the Unicode Bidi algorithm.
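
A change of this kind can be observed from Python, whose unicodedata module
ships a frozen copy of the Unicode 3.2.0 database (unicodedata.ucd_3_2_0)
next to the current one. The sketch below prints the bidi class of a few
ASCII separators in both. Note the assumption: it compares 3.2.0 with whatever
version the interpreter bundles, not 4.0 with 4.0.1 specifically, and the
helper name compare is made up for this example.

```python
import unicodedata

# Frozen Unicode 3.2.0 database provided by the standard library.
old = unicodedata.ucd_3_2_0


def compare(ch):
    """Return (bidi class in Unicode 3.2.0, bidi class in bundled version)."""
    return old.bidirectional(ch), unicodedata.bidirectional(ch)


print("comparing", old.unidata_version, "with", unicodedata.unidata_version)
for ch in "/+-":
    before, after = compare(ch)
    print(f"U+{ord(ch):04X}", before, "->", after)
```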

 I sincerely hope that you won't tell me that you expect the users to
 type 1383 then / then 1 then / then 12 to enter a date in short format,
 because it would be unnatural and non-obvious (although currently it
 may be the only way to get a correct result with the available software
 applications).

I'm not implying anything about users here. We are specifying how the
final text should be displayed. We have not specified how to encode it
(of course that doesn't mean one is allowed to encode it however he
likes). If we do that, we may not remain conforming to Unicode if
Unicode changes yet another bidirectional category in a later version.

 As I have seen, you have defended going back to using the correct yeh 
 and correcting the faulty software/fonts, so I hope you choose the 
 right thing to do this time as well.

I always do the right thing, don't I? ;-)

 Alright I know, you may say: It is impossible any other way!  What is 
 the solution?

I say: the answer is too technical to be included in the locale
specification. There will be different answers for different situations,
in different contexts, or in different Unicode versions.

BTW, Behdad is attending the Unicode Consortium's Technical Committee
meeting right now, and later the ISO JTC1/SC2 ones. I'm sure the UTC
meeting (which will be the first with a FarsiWeb member present) will
have good news for us (which may include more changes and clarifications
to the Bidirectional algorithm).

roozbeh

