Thanks for pointing out the obvious, Frederick (seriously, thank you). The problem was completely on the JavaScript/browser side: the function which prepared the query string was using escape() rather than encodeURIComponent(). escape() was writing "é" as the single byte %E9, where encodeURIComponent() writes the UTF-8 bytes %C3%A9 that Rails expects. I replaced all the calls to escape() and things started to magically work, how about that?
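For anyone who finds this thread later, here is a minimal sketch of the difference, using the example string from my original post. The buildQuery helper is hypothetical (my actual function isn't shown in this thread); it only illustrates building the query string with encodeURIComponent() throughout.

```javascript
// "Adélaïde de Hongrie", written with \u escapes so the file's own encoding doesn't matter.
const title = "Ad\u00e9la\u00efde de Hongrie";

// Legacy escape(): code points below 256 become a single %XX byte (Latin-1 style).
const legacy = escape(title);

// encodeURIComponent(): percent-encodes the UTF-8 bytes of each character.
const utf8 = encodeURIComponent(title);

console.log(legacy); // Ad%E9la%EFde%20de%20Hongrie  (what the server was receiving)
console.log(utf8);   // Ad%C3%A9la%C3%AFde%20de%20Hongrie  (valid UTF-8)

// Hypothetical helper: encodes every key and value with encodeURIComponent().
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
}

const query = buildQuery({ "filter[title][]": title, search: "", limit: 4 });
console.log(query);
// filter%5Btitle%5D%5B%5D=Ad%C3%A9la%C3%AFde%20de%20Hongrie&search=&limit=4
```

Note that the brackets in the key get percent-encoded as well, which should be fine since the server decodes the key before parsing it.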
Thank you, I really appreciate the help! I can't believe how much time I spent looking in the wrong places... at least I learned a fair amount about Rails internals as well as encoding issues, though. Haha.

Cheers,
Dave

On May 17, 1:00 AM, Frederick Cheung <[email protected]> wrote:
> On 16 May 2011, at 14:47, ddellacosta <[email protected]> wrote:
>
> > Hi folks,
> >
> > Here's my basic issue, hopefully this is clear. I'm trying to submit
> > some UTF-8 values in my query string, but they are coming out mangled
> > on the other end. It *seems* like the problem is that what
> > Rack::Utils.unescape() pushes out gets converted to UTF-8 somewhere in
> > the chain (using 3.0.7, and Ruby 1.9.2, by the way), and it's mangling
> > characters which are two bytes (for example, "%20", which is space and
> > a one-byte character, gets converted fine). I feel like I've almost
> > figured this out, but I'm still stumped. Here's my "evidence":
> >
> > # Example UTF-8 string:
> >
> > "Adélaïde de Hongrie"
> >
> > # GET string (obviously URI encoded):
> >
> > Started GET "/registers/results?filter[title][]=Ad%E9la%EFde%20de%20Hongrie&search=&limit=4" for 127.0.0.1 at 2011-05-16 14:17:33 +0700
>
> Who is producing this query string? They should be generating %c3%a9 if they
> are UTF8 friendly, since %e9 is just URL speak for \xe9, which smells like
> iso-Latin-something
>
> Fred
>
> > # What Rack produces/Rails sees (in Controller):
> >
> > Parameters: {"filter"=>{"title"=>["Ad\xE9la\xEFde de Hongrie"]},
> > "search"=>"", "limit"=>"4"}
> >
> > # Error I'm getting, when I try to "do stuff" with the above string:
> >
> > ArgumentError (invalid byte sequence in UTF-8):
> >
> > # What would actually be a valid string with hex UTF code points in
> > the format above:
> >
> > "Ad\xC3\xA9la\xC3\xAFde de Hongrie"
> >
> > Or, in the "\u ..." format (see anything interesting here? Something
> > obvious is eluding me...):
> >
> > "Ad\u{E9}la\u{EF}de de Hongrie"
> >
> > To be clear, this is not a form, but an ajax query. I've tried adding
> > the 'utf8' snowman thing manually too, but that doesn't seem to do
> > anything... of course, maybe I'm doing that wrong.
> >
> > Any thoughts/questions/pointing out of obvious errors or confused ways
> > of thinking? I'd also appreciate any pointers to Rails documentation
> > which describes in more detail how this stuff happens; I've just been
> > digging through the code and it's slow going for me.
> >
> > Help much appreciated!
> >
> > Cheers,
> > Dave

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.

