Re: [Rails-core] Investigating Unicode. Take 2, with nastities and allegations.

Julian 'Julik' Tarkhanov Thu, 22 Dec 2005 14:48:21 -0800


On 22-dec-2005, at 20:36, Thijs Van Der Vossen wrote:

On 21 Dec 2005, at 15:53 , Julian 'Julik' Tarkhanov wrote:
Well, I see that my last email hasn't generated any reaction from the Rails core team. [...]
Julian, maybe I've missed it, but do you have a patch for the String fix you proposed in your previous email? I'd really like to test our current apps against your proposed solution.

I believe I sent it to you off-list yesterday. I am working on this:
http://julik.textdriven.com/svn/tools/rails_plugins/unicode_hacks/

If someone wants to help out hacking I will gladly accept it.

Just grab the Unicode gem, export the plugin, rake. It has some other code (some of which is addressed in the core already - like DB connection charset - but funny as it may seem this was protecting me from the effects of the infamous database timeout problem).

But I need more solid test coverage and not all methods are shadowed yet. Unfortunately there is no test for the core Ruby string functionality so I can't check if I break it for anyone else. If such a test exists I would like to know where (is Rubicon still viable? it hasn't been updated for quite some time). Right now I just filter all calls to strings which have UTF-8 semantics and only when $KCODE is UTF8. And you need to have the gem, which means that this won't work for Windows people - they will need to find out how to build the gem themselves, I am C-illiterate.
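To make the "filter" idea concrete, here is a minimal sketch of shadowing a length method so that it counts UTF-8 characters only when a switch is on, and falls back to counting bytes otherwise. This is not the plugin's code: the class name GuardedString and the utf8_mode flag are hypothetical, the flag merely stands in for the $KCODE check, and a subclass is used so that the core String class is left alone.

```ruby
# Hypothetical illustration only; not taken from the unicode_hacks plugin.
class GuardedString < String
  class << self
    # Stand-in for the "$KCODE is UTF8" check described above (assumption).
    attr_accessor :utf8_mode
  end

  # Shadow String#length: count bytes unless UTF-8 mode is switched on,
  # in which case decode the UTF-8 code points and count those instead.
  def length
    return bytesize unless self.class.utf8_mode
    unpack("U*").length
  end
end

name = GuardedString.new("ßШ")

GuardedString.utf8_mode = false
puts name.length   # byte count

GuardedString.utf8_mode = true
puts name.length   # character count
```

The guard keeps the byte-oriented behaviour for anyone who has not opted in, which is the point of filtering on a global switch. The cost is that every shadowed method needs the same guard.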

But it overrides the core Ruby class and core Ruby methods. It is, in general, a very nasty hack - a very deep one. I stand by it (and I use it daily), but I don't know if it will work for others. I just felt very, uhm... upset when I found out that Rails basically does nothing about what is (IMO) Matz's hesitation. There is similar ambiguity with this in PHP but every moderately large application (or framework) at least tries to tackle this through use of mb_string. I might hack on this further but I would like to know the position of the core on this.

Because if you want Rails apps to be Unicode-enabled you basically have three options:

1) Hack the String - Matz will not produce something working in the near future. Or maybe the Pragmatic guys can convince him, because the purism of "not doing anything not to hurt nobody" is noble but long-lasting with bad side-effects. I could find talks about Unicode in Ruby going as far back as 2002, and still absolutely niente has been done to address it at the language level.

2) Fork, fork, fork. Every single string truncation or length calculation or stripping within Rails has to be forked (like the truncate() helper).

3) Make an extension of String which will accommodate hacks like mine under their own prefix, as if we were in PHP-land calling mb_functions. Again, an enormous code review process should ensue, and it gives us no guarantee of covering other outside libraries (or, for that matter, it gives no guarantee that a Rails core developer from the USA won't forget that you need a prefix to count these darn letters right).
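For illustration, the prefixed-helper approach from the third option could look roughly like the sketch below. The module name MbHelpers and the methods mb_length and mb_truncate are hypothetical names chosen to echo PHP's mb_ functions; they are not part of Rails or of the plugin. The sketch assumes the input is valid UTF-8 and works on decoded code points, so a multibyte character is never cut in half.

```ruby
# Hypothetical prefixed helpers; not part of Rails or of any existing gem.
module MbHelpers
  module_function

  # Number of UTF-8 code points in the string, rather than bytes.
  def mb_length(str)
    str.unpack("U*").length
  end

  # Truncate to at most max_chars characters, counting the omission marker,
  # without splitting a multibyte sequence.
  def mb_truncate(str, max_chars, omission = "...")
    codepoints = str.unpack("U*")
    return str if codepoints.length <= max_chars

    keep = [max_chars - mb_length(omission), 0].max
    codepoints.first(keep).pack("U*") + omission
  end
end

puts MbHelpers.mb_length("ßШ")
puts MbHelpers.mb_truncate("ßШßШßШ", 5)
```

The weakness is the one named above: every caller has to remember to use the prefixed version, and nothing stops library code from calling the plain byte-oriented methods.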

I am just upset because it's so broken and I seem to be the only one whining and asking questions. Maybe I am asking them wrong, I don't know. Or I seem to be the only Rails user needing to use both an ß and a Ш in a single string, while everyone else is happily building this new Web 2.0 (which as it turns out has problems accepting my first and last name).


Enjoy the holidays everyone!

--
Julian 'Julik' Tarkhanov
me at julik.nl



_______________________________________________
Rails-core mailing list
Rails-core@lists.rubyonrails.org
http://lists.rubyonrails.org/mailman/listinfo/rails-core
