Could you post an example of a string/char that's causing the problem,
as it appears in your app's log?
Here's an example of a problem string/char that I was seeing in data
posted to my app:
$ ./script/rails console
...
ruby-1.9.2-p136 :001 > s = "foo\xAE bar"
=> "foo\xAE bar"
ruby-1.9.2-p136 :002 > s.is_utf8?
=> false
ruby-1.9.2-p136 :003 > s.valid_encoding?
=> false
ruby-1.9.2-p136 :004 > s.sub(/bar/, 'biz')
ArgumentError: invalid byte sequence in UTF-8
from (irb):4:in `sub'
...
ruby-1.9.2-p136 :005 > s2 = Iconv.new('UTF-8//IGNORE',
'UTF-8').iconv("#{s} ")[0..-2]
=> "foo bar"
ruby-1.9.2-p136 :006 > s2.gsub(/bar/, 'biz')
=> "foo biz"
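If Iconv isn't available on your Ruby (it was later dropped from the standard library), roughly the same stripping can be sketched with String#encode alone. This is only a sketch under that assumption: the helper name strip_invalid_utf8 is made up for illustration, and it simply drops the bad bytes the same way the //IGNORE call above does.

```ruby
# Hypothetical helper; the name is illustrative and not from the thread.
# Drops byte sequences that are not valid UTF-8, without using Iconv.
def strip_invalid_utf8(str)
  # Work on a copy so the caller's string is left untouched.
  utf8 = str.dup.force_encoding('UTF-8')
  return utf8 if utf8.valid_encoding?

  # Transcoding UTF-8 -> UTF-16BE -> UTF-8 makes Ruby validate every byte;
  # invalid sequences are replaced with the empty string along the way.
  utf8.encode('UTF-16BE', invalid: :replace, replace: '').encode('UTF-8')
end

s2 = strip_invalid_utf8("foo\xAE bar")
puts s2.gsub(/bar/, 'biz')
```

With the problem string above this should print "foo biz", since the stray \xAE byte is removed before the regexp runs.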
If the Iconv call alone doesn't do the trick, try forcing the string
to UTF-8 first:
ruby-1.9.2-p136 :007 > s3 = Iconv.new('UTF-8//IGNORE',
'UTF-8').iconv("#{s.force_encoding('UTF-8')} ")[0..-2]
=> "foo bar"
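Since the trouble seems to be Microsoft Word characters, another option is to convert the bytes instead of dropping them. The sketch below assumes the bad bytes are Windows-1252 (the encoding Word-style smart quotes usually arrive in), which I can't confirm for your data; the helper name to_utf8_assuming_cp1252 is made up for illustration.

```ruby
# Hypothetical helper; the name is illustrative and not from the thread.
# Returns a valid UTF-8 string. If the input is not valid UTF-8, its bytes
# are ASSUMED to be Windows-1252; that guess may be wrong for your data.
def to_utf8_assuming_cp1252(str)
  utf8 = str.dup.force_encoding('UTF-8')
  return utf8 if utf8.valid_encoding?

  # Reinterpret the raw bytes as Windows-1252 and transcode to UTF-8.
  # The few byte values Windows-1252 leaves undefined are dropped.
  str.dup.force_encoding('Windows-1252')
     .encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
end

puts to_utf8_assuming_cp1252("foo\xAE bar")        # \xAE is the (R) sign in Windows-1252
puts to_utf8_assuming_cp1252("\x93quoted\x94")     # \x93 and \x94 are curly double quotes
```

Note the whole string gets reinterpreted once any invalid byte is found, so input that mixes real multibyte UTF-8 with Windows-1252 bytes would come out garbled. In that case stripping is the safer choice.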
Jeff
On Jun 20, 4:33 pm, Erica <[email protected]> wrote:
> Thanks for your response. I tried this on a string that was causing
> the error and it didn't work. The problem is with Microsoft Word
> special characters. I can't find a way to replace these characters.
> Here is one website I found that describes the special characters,
> although it's not about Rails:
> http://www.toao.net/48-replacing-smart-quotes-and-em-dashes-in-mysql
>
> Can anyone help me out?
>
> Thanks,
>
> Erica
>
> On Jun 17, 7:38 pm, Jeff Lewis <[email protected]> wrote:
>
> > Hi Erica,
>
> > I ran into a similar situation a while ago on a webservice app I was
> > working on, where I had to handle a lot of bad / untrusted non-UTF-8
> > data. I found a fix that met the needs of the app using Iconv
> > (http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/index.html),
> > following a strategy outlined by Paul Battley
> > (http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/):
>
> > ...
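> > def AppUtil.force_utf8(str)
> >   # //IGNORE drops byte sequences that are not valid UTF-8.
> >   ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
> >   # Append a space so an invalid byte at the very end is also caught,
> >   # then strip that space off again with [0..-2].
> >   return ic.iconv("#{str} ")[0..-2]
> > end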
> > ...
>
> > Jeff
>
> > On Jun 16, 5:27 pm, Erica <[email protected]> wrote:
>
> > > What's a good solution for fixing character encoding problems for
> > > compatibility between ascii and utf-8? The database is postgres and
> > > is encoded in utf-8.
>
> > > Once in a while there will be a compatibility error from strings
> > > submitted through a web form.
>
> > > Is there a command to fix this besides using
> > > a_string.force_encoding('utf-8')? Even this doesn't seem to always
> > > work either.
>
> > > Thanks,
>
> > > > Erica