I resolved this, so here is the fix in case anyone else runs into the same issue.
Placing this at the top of my Ruby app file safely overrides the core
library method without editing the library itself:

# Override core library method to match MSSQL ActiveRecord encoding
class REXML::Output
  def encode_iconv(content)
    # Original library line: Iconv.conv(@encoding, UTF_8, content)
    Iconv.conv(@encoding, 'LATIN1', content)
  end
end
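
A likely reason this works where reopening the module did not: a method defined on a class is found before the same method in an included module, and an alias made on an object's singleton class is resolved through that lookup order. The sketch below illustrates this with hypothetical stand-ins (DemoEncoding, DemoOutput, encode_impl); it is not REXML itself, and it assumes the library file is loaded after the application code, which would redefine the module method.

```ruby
# Hypothetical stand-ins for REXML::Encoding and REXML::Output (not the real library).
module DemoEncoding
  def encode_impl(content)
    "module:#{content}"
  end

  # Mimics the library's register block: alias on the object's singleton class.
  def self.apply(obj)
    class << obj
      alias encode encode_impl
    end
  end
end

class DemoOutput
  include DemoEncoding

  # Defined on the class, so it is found before the module's version.
  def encode_impl(content)
    "class:#{content}"
  end
end

# Simulate the library file being loaded later and redefining the module method.
module DemoEncoding
  def encode_impl(content)
    "module-reloaded:#{content}"
  end
end

out = DemoOutput.new
DemoEncoding.apply(out)
puts out.encode("x")   # the class-level override is the one that runs
```

An override placed in the module can be replaced when the library redefines it, while the class-level definition is left alone.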

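As a rough illustration of why the original line failed only once accented characters appeared: a Latin-1 byte such as 0xE9 is not valid UTF-8 on its own, while pure ASCII data is valid under both encodings. This sketch assumes Ruby 1.9 or later, where String#encode and String#force_encoding exist; it does not use Iconv, and the sample string is made up for the example.

```ruby
# "caf\xE9" is "cafe" with an accented e, as Latin-1 bytes: 0xE9 is one byte.
latin1_bytes = "caf\xE9".b

# Interpreted as UTF-8, the lone 0xE9 byte is an invalid sequence.
as_utf8 = latin1_bytes.dup.force_encoding("UTF-8")
puts as_utf8.valid_encoding?

# Labelled with its real encoding, the same bytes transcode cleanly.
as_latin1 = latin1_bytes.dup.force_encoding("ISO-8859-1")
utf8_text = as_latin1.encode("UTF-8")
puts utf8_text.valid_encoding?

# Plain ASCII is valid either way, which would explain why the problem
# stayed hidden until accented data showed up.
ascii = "cafe".b.force_encoding("UTF-8")
puts ascii.valid_encoding?
```

Telling the converter that the source is LATIN1, as in the override above, corresponds to the second case here.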

Matt Aimonetti wrote:
>
>     Thanks for the suggestion - if you know how to explicitly set the
>     encoding for ActiveRecord, that would save me some digging.
>
>
> You do that when you create your connection using database.yml or 
> manually. I believe the key in the database.yml file is called encoding.
>
> - Matt
>
>
> On Wed, Sep 23, 2009 at 2:46 PM, Carl Graff <[email protected]> wrote:
>
>
>
>     To be honest, I didn't explicitly set the encoding for ActiveRecord,
>     and the data returned from AR methods appears to be:
>      LATIN1 - when no accented chars are present
>      ISO-8859-2 - when accented chars are present
>     as detected by rchardet.
>
>     Also, standard string data is detected as LATIN1 by rchardet.
>
>     Anyway, I will see if I can coerce ActiveRecord to use UTF-8, and then
>     maybe everything returned from ActiveRecord will consistently be UTF-8
>     as detected by rchardet. I might have to check whether the 1.8 string
>     methods all work OK with UTF-8, though.
>
>     From what I can gather on the forums, this is kind of a mess,
>     especially in 1.8 (better in 1.9).
>
>     Thanks for the suggestion - if you know how to explicitly set the
>     encoding for ActiveRecord, that would save me some digging.
>
>     -- Carl
>
>
>
>     Matt Aimonetti wrote:
>     > Did you correctly set the encoding in ActiveRecord? Back in 1.X
>     > something I believe I changed it to be utf-8 by default; that
>     > might be your problem.
>     >
>     > - Matt
>     >
>     > On Wed, Sep 23, 2009 at 2:00 PM, Carl Graff <[email protected]> wrote:
>     >
>     >
>     >     Hi Matt,
>     >
>     >     The database is legacy and is in LATIN1, so I can't change the
>     >     encoding. Everything works fine until I get to the @doc.write
>     >     call, which in turn ends up calling the method that I hacked to
>     >     work.
>     >
>     >     I see that they aliased the method here in the library:
>     >           alias encode encode_iconv
>     >     in a register block that is a little mysterious to me - but maybe
>     >     I can alias to redirect to what I want in some way.
>     >
>     >     I can use brute force, but I hoped someone might have experience
>     >     overriding module methods, as that seems to be the least invasive
>     >     way to fix this.
>     >
>     >     Maybe I will post this on Nabble somewhere as well just in case.
>     >
>     >     Thanks,
>     >      Carl
>     >
>     >
>     >
>     >     Matt Aimonetti wrote:
>     >     > Make sure the charset and collation are set correctly in your DB.
>     >     > The rexml version you are using is the one that comes with 1.8
>     >     > and therefore doesn't have anything to do with 1.9 encoding.
>     >     >
>     >     > My guess is that your db table isn't set as utf8.
>     >     >
>     >     > - Matt
>     >     >
>     >     >
>     >     > On Wed, Sep 23, 2009 at 1:39 PM, nblinux <[email protected]> wrote:
>     >     >
>     >     >
>     >     >     Hi,
>     >     >
>     >     >     I have a Ruby program that takes ActiveRecord data and outputs
>     >     >     it to XML using the REXML library.
>     >     >
>     >     >     It had been working fine until some accented characters were
>     >     >     used in one of the fields.
>     >     >
>     >     >     I have isolated the error to this file,
>     >     >     C:\ruby\lib\ruby\1.8\rexml\encodings\ICONV.rb, at the line
>     >     >     after the commented-out line below:
>     >     >     =====================
>     >     >     module REXML
>     >     >      module Encoding
>     >     >        def decode_iconv(str)
>     >     >          Iconv.conv(UTF_8, @encoding, str)
>     >     >        end
>     >     >
>     >     >        def encode_iconv(content)
>     >     >          # cag changed to test encoding hack
>     >     >          #Iconv.conv(@encoding, UTF_8, content)
>     >     >          Iconv.conv(@encoding, 'LATIN1', content)
>     >     >        end
>     >     >
>     >     >        register("ICONV") do |obj|
>     >     >          Iconv.conv(UTF_8, obj.encoding, nil)
>     >     >          class << obj
>     >     >            alias decode decode_iconv
>     >     >            alias encode encode_iconv
>     >     >          end
>     >     >        end
>     >     >      end
>     >     >     end
>     >     >     =======================
>     >     >
>     >     >     So I think this has to do with REXML expecting UTF-8, which is
>     >     >     fine in Ruby 1.9, but Ruby 1.8 uses LATIN1, I think. I hacked
>     >     >     the file as shown above and this works, but what I really want
>     >     >     to do is override the "encode_iconv" method in this module in
>     >     >     my own code so I am not changing the core libraries.
>     >     >
>     >     >     I have attempted a couple of things but can't quite get it
>     >     >     working, such as putting this at the top of my main file
>     >     >     before the call that uses it:
>     >     >     ===========================
>     >     >     include REXML
>     >     >
>     >     >     module REXML
>     >     >       module Encoding
>     >     >         def encode_iconv(content)
>     >     >           Iconv.conv('ISO-8859-1', 'LATIN1', content)
>     >     >         end
>     >     >       end
>     >     >     end
>     >     >     ===========================
>     >     >
>     >     >     To test, I just created a simple XML document:
>     >     >       @doc = Document.new("<?xml version='1.0' encoding='iso-8859' ?><some text here/>")
>     >     >       ... add some elements etc ...
>     >     >     and then this is the call where it is invoked:
>     >     >       @doc.write(xml_string,0)
>     >     >
>     >     >     I also tried the eval technique from a David Black post and saw
>     >     >     some posts indicating use of self and object references.
>     >     >
>     >     >     Anyone got an idea? Is it because it is mixed in somewhere
>     >     >     when REXML is loaded, and thus my override attempts are not
>     >     >     recognized?
>     >     >
>     >     >     Thanks,
>     >     >      Carl


--~--~---------~--~----~------------~-------~--~----~
SD Ruby mailing list
[email protected]
http://groups.google.com/group/sdruby
-~----------~----~----~----~------~----~------~--~---
