On 02.06.2008 at 07:36, Stanislav Malyshev wrote:
Hi!
What if there is a Number extension down the road. Or a Collator
extension. Or what if people already have classes called NumberFormatter
Well, what if they have classes named IntlNumberFormatter? There can be
all kinds of classes, and until we have a standard namespace for internal
classes we have to live with names being global and make them
sufficiently distinct to avoid collisions. I think NumberFormatter is
sufficiently distinct.
Obviously, it is less likely that someone who wrote a Collator called
their class "IntlCollator" instead of "Collator".
Consistency is important. The lack of it is also often what PHP users
complain about. This should really be changed to be consistent across
the board, Stas.
But why are those internal differences exposed through the API? I think
Because there are cases when either one of them can be used, IMO.
I don't think that's a valid argument. Why does the API have to expose
the internal implementation differences?
At least, there should be Collator::asortWithKeys(). But I really
The current implementation doesn't make it easy to do asort with keys,
but this can be improved. Contributions welcome, btw ;)
Also, asort seems to be a less frequent use case for data that requires
collation. That doesn't mean it shouldn't be done; it is just a matter
of priority.
Again, I think it should be consistent :) Why not just toss out
sort(), rename sortWithKeys() to sort(), and then optimize asort() later?
idea. I'm failing to get the extension compiled here on OS X, but will
What's the problem on OS X? I'd like it to build on every OS supported
by PHP, so could you provide more info on this?
Can't quite remember, but will let you know ASAP. Probably a problem
on my side.
[EMAIL PROTECTED];collation=traditional;calendar=thai-buddhist
is what I could come up with right now... 77 characters.
That's really not a frequent case, especially taking into account that
there's no function that needs currency, collation and calendar at the
same time. But for the main reason see below.
Waitwaitwait. The entire point of doing internationalization properly,
using ICU and the CLDR, is that even seemingly obscure cases are
possible.
I mean who are you/we to decide that someone from the Republic of
Serbia, who speaks Serbian, may not view sales numbers for last week's
month in the thai-buddhist calendar in USD and sorted "traditionally"?
You think that is unrealistic? Maybe. Then what about this:
[EMAIL PROTECTED];collation=traditional;calendar=gregorian
China, simplified Han, List of quarterly sales, Gregorian calendar,
normal collation for sorting person names.
65 characters. And this is not unrealistic.
Stas, internationalization is not about neglecting edge cases. It has
to be done properly. That's the whole point of it.
The other question is what happens if the string is longer than
that? Does it get cut off or something?
No, a function that gets an overlong locale name would fail.
paintings. Or whatever. So locale identifier strings can be of any length.
Please tell that to the ICU library developers. A 98-byte-long locale
provably crashes the ICU libraries. I didn't want to take chances, so I
chose the smallest "round" number for the limit that works reliably. I'd
be happy to raise it if I could be sure ICU would work OK with it.
97? ;)
Maybe ext/intl should do this:
- Accept locale strings of arbitrary length
- Parse them and throw out any keywords ICU cannot handle (i.e.
everything except "collation", "currency" and "calendar", AFAIK)
- Hand the resulting string over to ICU
Well, maybe, but not in 1.0 :) Note that this will also significantly
slow down the functions and introduce a dependency in PHP code on
locale formats.
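
The three steps proposed above (accept any length, drop the keywords ICU
cannot handle, hand the rest over) can be sketched roughly as follows.
This is Python purely for illustration and is not ext/intl code: the
function name, the constant and the "@key=value;key=value" parsing are
assumptions, and the allowed list is just the "AFAIK" list from the
proposal, not a verified statement about ICU.

```python
# Hypothetical sketch, not ext/intl code: every name here is made up.
# The allowed keywords are the "AFAIK" list from the proposal above.
ALLOWED_KEYWORDS = ("calendar", "collation", "currency")


def filter_locale_keywords(locale_id: str, allowed=ALLOWED_KEYWORDS) -> str:
    """Return locale_id with every keyword not in `allowed` removed."""
    base, sep, keyword_part = locale_id.partition("@")
    if not sep:
        return locale_id  # no keyword section, nothing to filter

    kept = []
    for item in keyword_part.split(";"):
        key, eq, value = item.partition("=")
        key = key.strip().lower()
        value = value.strip()
        # Keep only well-formed key=value pairs whose key is allowed.
        if eq and value and key in allowed:
            kept.append(f"{key}={value}")

    # Drop the "@" entirely when no keyword survives the filter.
    return f"{base}@{';'.join(kept)}" if kept else base
```

With a made-up input such as "de_DE@currency=EUR;x=1;collation=phonebook"
this returns "de_DE@currency=EUR;collation=phonebook". It also makes the
objection visible: the filter runs on every call and ties the caller to
the keyword syntax.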
What confuses me, in general, is why locales are not implemented as
objects. Why do I have to pass a locale string to every locale-aware
function?
Because a locale is essentially a string. There's nothing in the
locale that isn't in the string, so you don't need any specific
object for that - it wouldn't give you any value.
But you have to parse the locale string each time, right? That is
overhead. It would be much more logical to pass it around as a
resource/object. ICU does it the same way.
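
To make the "parse once, pass the object around" argument concrete, here
is a minimal sketch in Python. It is not how ext/intl or ICU represent
locales; the class name ParsedLocale, its fields and the simple
"@key=value;key=value" syntax are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedLocale:
    """Hypothetical value object: a locale ID that is parsed exactly once."""

    base: str
    keywords: tuple = ()  # (key, value) pairs, kept in input order

    @classmethod
    def parse(cls, locale_id: str) -> "ParsedLocale":
        # The string is split here, once; callers then reuse the object.
        base, _, keyword_part = locale_id.partition("@")
        pairs = []
        for item in filter(None, keyword_part.split(";")):
            key, _, value = item.partition("=")
            pairs.append((key, value))
        return cls(base, tuple(pairs))

    def keyword(self, name: str, default=None):
        """Look up one keyword without re-parsing the original string."""
        return dict(self.keywords).get(name, default)

    def __str__(self) -> str:
        # Serialize back to the string form for APIs that expect one.
        if not self.keywords:
            return self.base
        return self.base + "@" + ";".join(f"{k}={v}" for k, v in self.keywords)
```

A caller would build it once, e.g. loc = ParsedLocale.parse("de_DE@currency=EUR"),
and hand loc to each locale-aware function instead of the raw string.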
Also... having uloc_acceptLanguageFromHTTP exposed in the API would be
pretty neat ;) Since apparently, that does a mapping of e.g. "en-GB" to
"en_UK" etc.
Feature request on pecl.php.net? ;) It'd be much easier to keep
track of it that way.
That was more of a joke ;) But I will add all those to the issue
tracker, yes.
And... is there going to be Resources support in the future? AFAIK, the
Yes, it's planned.
Awesome.
David
--
PHP Unicode & I18N Mailing List (http://www.php.net/)