On 02.06.2008 at 07:36, Stanislav Malyshev wrote:

Hi!

What if there is a Number extension down the road. Or a Collator extension. Or what if people already have classes called NumberFormatter

Well, what if they have classes named IntlNumberFormatter? There can be all kinds of classes, and until we have a standard namespace for internal
classes we have to live with names being global and make them
sufficiently distinct to avoid collisions. I think NumberFormatter is
sufficiently distinct.

Obviously, it is less likely that someone who wrote a Collator called their class "IntlCollator" instead of "Collator".

Consistency is important. The lack of it is also often what PHP users complain about. This should really be changed to be consistent across the board, Stas.


But why are those internal differences exposed through the API. I think

Because there are cases when either one of them can be used, IMO.

I don't think that's a valid argument. Why does the API have to expose the internal implementation differences?


At least, there should be Collator::asortWithKeys().  But I really

The current implementation doesn't make it easy to do asort with keys, but
this can be improved. Contributions welcome btw ;)
Also, asort seems to be a less frequent use case for data that requires collation. That doesn't mean it shouldn't be done; it's just a question of priority.

Again, I think it should be consistent :) Why not just toss out sort(), rename sortWithKeys() to sort(), and then optimize asort() later?


idea. I'm failing to get the extension compiled here on OS X, but will

What's the problem on OS X? I'd like it to build on every OS
supported by PHP, so could you provide more info on this?

Can't quite remember, but will let you know ASAP. Probably a problem on my side.


[EMAIL PROTECTED];collation=traditional;calendar=thai-buddhist is what I could come up with right now... 77 characters.

That's really not a frequent case - especially taking into account that
there's no function that needs currency, collation and calendar at the
same time. But for the main reason see below.

Waitwaitwait. The entire point of doing internationalization properly, using ICU and the CLDR, is that even seemingly obscure cases are possible.

I mean who are you/we to decide that someone from the Republic of Serbia, who speaks Serbian, may not view sales numbers for last week's month in the thai-buddhist calendar in USD and sorted "traditionally"?

You think that is unrealistic? Maybe. Then what about this:

[EMAIL PROTECTED];collation=traditional;calendar=gregorian

China, simplified Han, List of quarterly sales, Gregorian calendar, normal collation for sorting person names.

65 characters. And this is not unrealistic.

Stas, internationalization is not about neglecting edge cases. It has to be done properly. That's the whole point of it.


The other question is what happens if the string is longer than that? Does it get cut off or something?

No, a function that gets an overlong locale name would fail.

paintings. Or whatever. So locale identifier strings can be of any length.

Please tell that to the ICU library developers. A 98-byte-long locale
provably crashes the ICU libraries. I didn't want to take chances, so I chose the smallest "round" number for the limit that works reliably. I'd be happy to raise it if I could be sure ICU would work OK with it.

97? ;)


Maybe ext/intl should do this:
- Accept locale strings of arbitrary length
- Parse them and throw out any keywords ICU cannot handle (i.e. everything except "collation", "currency" and "calendar", AFAIK)
- Hand the resulting string over to ICU
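
The three steps above could look roughly like this. It is only a sketch, written in Python to keep it short; filter_locale_keywords and ALLOWED_KEYWORDS are made-up names, not part of ext/intl or ICU, and the whitelist is just the "AFAIK" list from the second step.

```python
# Hypothetical helper for illustration; not an ext/intl or ICU function.
# The whitelist is an assumption taken from the proposal above.
ALLOWED_KEYWORDS = ("collation", "currency", "calendar")


def filter_locale_keywords(locale_id, allowed=ALLOWED_KEYWORDS):
    """Return locale_id with every keyword outside `allowed` removed.

    Accepts input of any length; the result is what would be handed on
    to the underlying library.
    """
    base, has_keywords, keyword_part = locale_id.partition("@")
    if not has_keywords:
        return base

    kept = []
    for pair in keyword_part.split(";"):
        key, has_value, value = pair.partition("=")
        key = key.strip().lower()
        value = value.strip()
        # Keep only well-formed key=value pairs with a whitelisted key.
        if has_value and value and key in allowed:
            kept.append(key + "=" + value)

    if not kept:
        return base
    return base + "@" + ";".join(kept)


cleaned = filter_locale_keywords(
    "de_DE@currency=USD;collation=traditional;x-paintings=yes;calendar=gregorian"
)
print(cleaned)
```

The order of the surviving keywords is left as given; whether they should also be sorted or otherwise canonicalized before the hand-over is a separate question.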

Well, maybe, but not in 1.0 :) Note that this will also significantly
slow down the functions and introduce a dependency in PHP code on locale
formats.

What confuses me, in general, is why locales are not implemented as objects. Why do I have to pass a locale string to every locale-aware function?

Because a locale is essentially a string. There's nothing in the locale that isn't in the string, so you don't need a specific object for it; it wouldn't give you any value.

But you have to parse the locale string each time, right? That is overhead. It would be much more logical to pass it around as a resource/object. ICU does it the same way.
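
To make the parse-once idea concrete, here is a rough sketch in Python. ParsedLocale is a made-up class, not an ext/intl or ICU type, and the parsing is deliberately naive (underscore-separated subtags plus key=value keywords after "@"). The string is split once in from_string(); after that, callers pass the object around and read fields from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedLocale:
    """Hypothetical value object holding an already-parsed locale ID."""
    language: str
    subtags: tuple    # everything after the language, e.g. ("DE",)
    keywords: tuple   # (key, value) pairs, in their original order

    @classmethod
    def from_string(cls, locale_id):
        # The only place where the string is parsed.
        base, _, keyword_part = locale_id.partition("@")
        parts = tuple(base.split("_"))
        keywords = tuple(
            tuple(pair.split("=", 1))
            for pair in keyword_part.split(";")
            if "=" in pair
        )
        return cls(parts[0], parts[1:], keywords)

    def keyword(self, name, default=None):
        # Lookup on the parsed data; the original string is not touched again.
        return dict(self.keywords).get(name, default)


loc = ParsedLocale.from_string("de_DE@currency=EUR;calendar=gregorian")
print(loc.language, loc.subtags, loc.keyword("currency"))
```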


Also... having uloc_acceptLanguageFromHTTP exposed in the API would be pretty neat ;) Since apparently, that does a mapping of e.g. "en-GB" to "en_GB" etc.
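
For illustration, this is roughly the kind of mapping meant here, as a Python sketch. It is not how ICU implements uloc_acceptLanguageFromHTTP and makes no claim to match its results; accept_language_to_locale is a made-up name, and the matching rule (exact match first, then the bare language) is an assumption.

```python
def accept_language_to_locale(header, available):
    """Pick an entry of `available` for an HTTP Accept-Language header.

    Hypothetical helper; returns None when nothing matches.
    """
    candidates = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Highest quality first; header order breaks ties.
        candidates.append((-quality, position, tag))

    for _, _, tag in sorted(candidates):
        parts = tag.split("-")
        language = parts[0].lower()
        # 4-letter subtags are treated as scripts (Title case), others as regions.
        rest = [p.title() if len(p) == 4 else p.upper() for p in parts[1:]]
        locale_id = "_".join([language] + rest)
        if locale_id in available:
            return locale_id
        if language in available:
            return language
    return None


best = accept_language_to_locale("en-GB,en;q=0.8,de;q=0.5", {"en_GB", "de"})
print(best)
```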

Feature request on pecl.php.net? ;) It'd really be easier to keep track of it that way.

That was more of a joke ;) But I will add all those to the issue tracker, yes.

And.. is there going to be Resources support in the future? AFAIK, the

Yes, it's planned.

Awesome.


David

--
PHP Unicode & I18N Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
