I did resolve this, so here is the fix in case anyone else runs into the
same issue. Placing this at the top of my Ruby app file overrides the core
library method without editing the library file itself:
# Override core library method to match MSSQL ActiveRecord encoding
class REXML::Output
  def encode_iconv(content)
    # Original: Iconv.conv(@encoding, UTF_8, content)
    Iconv.conv(@encoding, 'LATIN1', content)
  end
end
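
For anyone curious why reopening REXML::Output worked when reopening
REXML::Encoding did not: as far as I can tell (this is my assumption, not
something I have confirmed in the REXML source), ICONV.rb is loaded lazily,
so a definition placed in the module early gets redefined when that file
loads, while a method defined in the class that includes the module is
always found first in method lookup. Below is a minimal sketch of that
lookup behaviour in plain Ruby. DemoEncoding and DemoOutput are hypothetical
stand-ins, not REXML code, and String#encode stands in for Iconv, which is
no longer part of the standard library on Ruby 2.0 and later.

```ruby
# Hypothetical stand-ins for REXML::Encoding and REXML::Output.
# This is NOT the real library code, only the same shape.
module DemoEncoding
end

class DemoOutput
  include DemoEncoding

  def initialize(encoding)
    @encoding = encoding
  end
end

# App-level override, defined in the class that includes the module.
class DemoOutput
  def encode_iconv(content)
    # Treat the incoming bytes as Latin-1, in the spirit of
    # Iconv.conv(@encoding, 'LATIN1', content).
    content.dup.force_encoding('ISO-8859-1').encode(@encoding)
  end
end

# Simulates the library file being loaded later and defining the
# module-level method after the app-level override already exists.
module DemoEncoding
  def encode_iconv(content)
    # Library-style default: assume the incoming bytes are UTF-8.
    content.dup.force_encoding('UTF-8').encode(@encoding)
  end
end

out = DemoOutput.new('UTF-8')

# Same aliasing trick as the register block: the alias is created on the
# object's singleton class and resolves encode_iconv through normal lookup
# (singleton class, then DemoOutput, then DemoEncoding).
class << out
  alias encode encode_iconv
end

latin1_bytes = "caf\xE9".b          # "café" as raw Latin-1 bytes
result = out.encode(latin1_bytes)   # uses the DemoOutput version
puts result
```

The class-level method wins even though the module method was defined
afterwards, because the class sits in front of the included module in the
ancestor chain.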
Matt Aimonetti wrote:
>
> Thanks for the suggestion - if you know how to explicitly set the
> coding for ActiveRecord that would save me some digging.
>
>
> You do that when you create your connection using database.yml or
> manually. I believe the key in the database.yml file is called encoding.
>
> - Matt
>
>
> On Wed, Sep 23, 2009 at 2:46 PM, Carl Graff wrote:
>
>
>
> To be honest, I didn't explicitly set the coding for ActiveRecord, and
> the data returned from AR methods appears to be:
> LATIN1 - when no accented chars are present
> ISO-8859-2 - when accented chars are present
> as detected by rchardet
>
> Also standard string data is detected as LATIN1 by rchardet
>
> Anyway, I will see if I can coerce ActiveRecord to use UTF-8, and then
> maybe everything returned from ActiveRecord will consistently be UTF-8
> as detected by rchardet. I might have to see if the 1.8 string methods
> all work OK with UTF-8 though.
>
> I guess this is kind of a mess from what I can gather on the forums -
> especially in 1.8 (better in 1.9).
>
> Thanks for the suggestion - if you know how to explicitly set the
> coding for ActiveRecord that would save me some digging.
>
> -- Carl
>
>
>
> Matt Aimonetti wrote:
> > Did you correctly set the encoding in ActiveRecord? Back in 1.x
> > something I believe I changed it to be UTF-8 by default, so that
> > might be your problem.
> >
> > - Matt
> >
> > On Wed, Sep 23, 2009 at 2:00 PM, Carl Graff wrote:
> >
> >
> > Hi Matt,
> >
> > The database is legacy and is in LATIN1, so I can't change the
> > encoding. Everything works fine until I get to the @doc.write call,
> > which in turn ends up calling the method that I hacked to work.
> >
> > I see that they aliased the method here in the library:
> >   alias encode encode_iconv
> > in a register block that is a little mysterious to me - but maybe
> > I can alias to redirect to what I want in some way.
> >
> > I can use brute force, but hoped someone might have experience in
> > overriding module methods, as that seems to be the least invasive
> > way to fix this.
> >
> > Maybe I will post this on Nabble somewhere as well just in case.
> >
> > Thanks,
> > Carl
> >
> >
> >
> > Matt Aimonetti wrote:
> > > Make sure the charset and collation are set correctly in your DB.
> > > The REXML version you are using is the one that comes with 1.8 and
> > > therefore doesn't have anything to do with 1.9 encoding.
> > >
> > > My guess is that your db table isn't set as utf8.
> > >
> > > - Matt
> > >
> > >
> > > On Wed, Sep 23, 2009 at 1:39 PM, nblinux wrote:
> > >
> > >
> > > Hi,
> > >
> > > I have a Ruby program that takes ActiveRecord data and outputs it to
> > > XML using the REXML library.
> > >
> > > It has been working fine until some accented characters were used in
> > > one of the fields.
> > >
> > > I have isolated the error message to this file,
> > > C:\ruby\lib\ruby\1.8\rexml\encodings\ICONV.rb, and to the line after
> > > the commented-out line below:
> > > =====================
> > > module REXML
> > >   module Encoding
> > >     def decode_iconv(str)
> > >       Iconv.conv(UTF_8, @encoding, str)
> > >     end
> > >
> > >     def encode_iconv(content)
> > >       # cag changed to test encoding hack
> > >       #Iconv.conv(@encoding, UTF_8, content)
> > >       Iconv.conv(@encoding, 'LATIN1', content)
> > >     end
> > >
> > >     register("ICONV") do |obj|
> > >       Iconv.conv(UTF_8, obj.encoding, nil)
> > >       class << obj
> > >         alias decode decode_iconv
> > >         alias encode encode_iconv
> > >       end
> > >     end
> > >   end
> > > end
> > > =======================
> > >
> > > So I think this has to do with REXML expecting UTF-8, which is fine in
> > > Ruby 1.9, but Ruby 1.8 uses LATIN1 I think. I hacked the file as shown
> > > above and this works, but what I really want to do is override the
> > > "encode_iconv" method in this module in my own code so I am not
> > > changing the core libraries.
> > >
> > > I have attempted a couple of things but can't quite get it to work,
> > > such as putting this at the top of my main file before the call that
> > > uses it:
> > > ===========================
> > > include REXML
> > >
> > > module REXML
> > >   module Encoding
> > >     def encode_iconv(content)
> > >       Iconv.conv('ISO-8859-1', 'LATIN1', content)
> > >     end
> > >   end
> > > end
> > > ===========================
> > >
> > > To test, I just created a simple XML document:
> > >   @doc = Document.new("<?xml version='1.0' encoding='iso-8859' ?><some text here/>")
> > >   ... add some elements etc ...
> > > and then this is the call where it is invoked:
> > >   @doc.write(xml_string,0)
> > >
> > > I also tried the eval technique from a David Black post and saw some
> > > posts indicating use of self and object reference.
> > >
> > > Anyone got an idea? Is it because it is mixed in somewhere when REXML
> > > is loaded and thus my override attempts are not recognized?
> > >
> > > Thanks,
> > > Carl
--~--~---------~--~----~------------~-------~--~----~
SD Ruby mailing list
[email protected]
http://groups.google.com/group/sdruby
-~----------~----~----~----~------~----~------~--~---