On 22-dec-2005, at 20:36, Thijs Van Der Vossen wrote:

On 21 Dec 2005, at 15:53 , Julian 'Julik' Tarkhanov wrote:
Well, I see that my last email hasn't generated any reaction from the Rails core team. [...]

Julian, maybe I've missed it, but do you have a patch for the String fix you proposed in your previous email? I'd really like to test our current apps against your proposed solution.

I sent it to you off-list yesterday, I believe. I am working on this:
http://julik.textdriven.com/svn/tools/rails_plugins/unicode_hacks/

If someone wants to help out hacking I will gladly accept it.

Just grab the Unicode gem, export the plugin, and run rake. It has some other code (some of which is addressed in the core already - like the DB connection charset - but funny as it may seem, this was protecting me from the effects of the infamous database timeout problem).

But I need more solid test coverage, and not all methods are shadowed yet. Unfortunately there is no test for the core Ruby string functionality, so I can't check if I break it for anyone else. If such a test exists I would like to know where (is Rubicon still viable? it hasn't been updated for quite some time). Right now I just filter all calls to String methods which have UTF-8 semantics, and only when $KCODE is UTF8. And you need to have the gem, which means that this won't work for Windows people - they will need to find out how to build the gem themselves, as I am C-illiterate.
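To illustrate the "filter only when $KCODE is UTF8" idea, here is a minimal sketch. It is not the plugin's code: the module name Utf8Gate and its utf8_mode flag are hypothetical, and the flag merely stands in for the $KCODE check. It also avoids reopening String, so it can be tried without touching core methods.

```ruby
# Hypothetical sketch: Utf8Gate and utf8_mode are illustrative names,
# not part of the unicode_hacks plugin or of Ruby itself.
module Utf8Gate
  class << self
    # Stand-in for the "$KCODE is UTF8" check described above.
    attr_accessor :utf8_mode
  end
  self.utf8_mode = true

  # Count code points in UTF-8 mode, raw bytes otherwise.
  def self.length(str)
    utf8_mode ? str.unpack("U*").length : str.unpack("C*").length
  end

  # Reverse by code point in UTF-8 mode, by byte otherwise.
  def self.reverse(str)
    if utf8_mode
      str.unpack("U*").reverse.pack("U*")
    else
      str.unpack("C*").reverse.pack("C*")
    end
  end
end

Utf8Gate.length("ßШ")   # => 2 (code points)
Utf8Gate.utf8_mode = false
Utf8Gate.length("ßШ")   # => 4 (bytes)
```

The byte-wise branch of reverse shows why the gate matters: reversing the bytes of a multibyte string yields invalid UTF-8.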

But it overrides the core Ruby class and core Ruby methods. It is, in general, a very nasty hack - a very deep one. I stand by it (and I use it daily), but I don't know if it will work for others. I just felt very, uhm... upset when I found out that Rails basically does nothing to compensate for what is (IMO) Matz's hesitation. There is similar ambiguity in PHP, but every moderately large application (or framework) at least tries to tackle it through use of mb_string. I might hack on this further, but I would like to know the position of the core on this.

Because if you want Rails apps to be Unicode-enabled you basically have 3 options:

1) Hack the String - Matz will not produce something working in the near future. Or maybe the Pragmatic guys can convince him, because the purism of "not doing anything not to hurt nobody" is noble but long-lasting, with bad side-effects. I could find talks about Unicode in Ruby going as far back as 2002, and still absolutely niente has been done to address it at the language level.

2) Fork, fork, fork. Every single string truncation or length calculation or stripping within Rails has to be forked (like the truncate() helper).

3) Make an extension of String which will accommodate hacks like mine under their own prefix, as if we were in PHP-land calling mb_functions. Again, an enormous code review process would have to ensue, and it gives us no guarantee of covering other outside libraries (or, for that matter, no guarantee that a Rails core developer from the USA won't forget that you need a prefix to count these darn letters right).
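As a rough sketch of what the prefixed-helper approach could look like: the module MbSketch and its methods mb_length and mb_truncate are hypothetical names chosen for illustration, not an existing Rails or Ruby API. The code counts and cuts by code point via unpack("U*") instead of by byte, and assumes the input is valid UTF-8.

```ruby
# Hypothetical sketch: MbSketch, mb_length and mb_truncate are
# illustrative names, not part of Rails or of the unicode_hacks plugin.
module MbSketch
  # Number of code points rather than number of bytes.
  def self.mb_length(str)
    str.unpack("U*").length
  end

  # Cut to at most `limit` code points, counting the omission marker,
  # so a multibyte character is never split in half.
  def self.mb_truncate(str, limit, omission = "...")
    points = str.unpack("U*")
    return str if points.length <= limit
    keep = [limit - omission.unpack("U*").length, 0].max
    points[0, keep].pack("U*") + omission
  end
end

MbSketch.mb_length("ßШ")            # => 2
MbSketch.mb_truncate("ßШßШßШ", 5)   # => "ßШ..."
```

The drawback named above still applies: every caller has to remember to use the prefixed method, and outside libraries will not.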

I am just upset because it's so broken and I seem to be the only one whining and asking questions. Maybe I am asking them wrong, I don't know. Or I seem to be the only Rails user needing to use both an ß and a Ш in a single string, while everyone else is happily building this new Web 2.0 (which as it turns out has problems accepting my first and last name).

Enjoy the holidays everyone!

--
Julian 'Julik' Tarkhanov
me at julik.nl



_______________________________________________
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core
