To be honest, I didn't explicitly set the encoding for ActiveRecord, and the
data returned from AR methods appears to be:
  LATIN1 - when no accented chars are present
  ISO-8859-2 - when accented chars are present
as detected by rchardet

Also, standard string data is detected as LATIN1 by rchardet.

Anyway, I will see if I can coerce ActiveRecord to use UTF-8, and then
maybe everything returned from ActiveRecord will consistently be UTF-8
as detected by rchardet. I might have to check whether the 1.8 string
methods all work OK with UTF-8, though.
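
One stopgap that might work in the meantime, sketched here under assumptions: convert each Latin-1 string to UTF-8 before it reaches REXML. The helper name latin1_to_utf8 is hypothetical (it is not part of ActiveRecord or REXML), and the sketch assumes the data really is ISO-8859-1; bytes from another charset would convert to the wrong characters.

```ruby
# Hypothetical helper (not part of ActiveRecord or REXML): re-encode a
# string holding Latin-1 (ISO-8859-1) bytes as UTF-8.
def latin1_to_utf8(str)
  if str.respond_to?(:encode)
    # Ruby 1.9 and later: label the bytes as ISO-8859-1, then transcode.
    # dup keeps the caller's string untouched.
    str.dup.force_encoding('ISO-8859-1').encode('UTF-8')
  else
    # Ruby 1.8: strings are plain byte arrays, so Iconv does the work.
    require 'iconv'
    Iconv.conv('UTF-8', 'ISO-8859-1', str)
  end
end

latin1 = "caf\xE9"                 # "cafe" with e-acute as the Latin-1 byte 0xE9
utf8   = latin1_to_utf8(latin1)    # e-acute becomes the two bytes 0xC3 0xA9
```

If this works, the hack in ICONV.rb should no longer be needed, since REXML would be getting the UTF-8 input it expects.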

I guess this is kind of a mess from what I can gather on the forums - 
especially in 1.8 (better in 1.9).

Thanks for the suggestion - if you know how to explicitly set the encoding
for ActiveRecord, that would save me some digging.

-- Carl



Matt Aimonetti wrote:
> Did you correctly set the encoding in ActiveRecord? Back in 1.X
> something I believe I changed it to be UTF-8 by default; that might be
> your problem.
>
> - Matt
>
> On Wed, Sep 23, 2009 at 2:00 PM, Carl Graff <[email protected]> wrote:
>
>
>     Hi Matt,
>
>     The database is legacy and is in LATIN1, and therefore I can't change
>     the encoding. Everything works fine until I get to the @doc.write call,
>     which in turn ends up calling the method that I hacked to work.
>
>     I see that they aliased the method here in the library:
>           alias encode encode_iconv
>     in a register block that is a little mysterious to me - but maybe I can
>     alias to redirect to what I want in some way.
>
>     I can use brute force but hoped someone might have experience in
>     overriding module methods, as that seems to be the least invasive way
>     to fix this.
>
>     Maybe I will post this on Nabble somewhere as well just in case.
>
>     Thanks,
>      Carl
>
>
>
>     Matt Aimonetti wrote:
>     > Make sure the charset and collation are set correctly in your DB.
>     > The REXML version you are using is the one that comes with 1.8 and
>     > therefore doesn't have anything to do with 1.9 encoding.
>     >
>     > My guess is that your db table isn't set as utf8.
>     >
>     > - Matt
>     >
>     >
>     > On Wed, Sep 23, 2009 at 1:39 PM, nblinux <[email protected]> wrote:
>     >
>     >
>     >     Hi,
>     >
>     >     I have a Ruby program that takes ActiveRecord data and outputs it to
>     >     XML using the REXML library.
>     >
>     >     It has been working fine until some accented characters were used in
>     >     one of the fields.
>     >
>     >     I have isolated the error message to this file:
>     >     C:\ruby\lib\ruby\1.8\rexml\encodings\ICONV.rb
>     >     and to the line after the commented-out line below:
>     >     =====================
>     >     module REXML
>     >      module Encoding
>     >        def decode_iconv(str)
>     >          Iconv.conv(UTF_8, @encoding, str)
>     >        end
>     >
>     >        def encode_iconv(content)
>     >          # cag changed to test encoding hack
>     >          #Iconv.conv(@encoding, UTF_8, content)
>     >          Iconv.conv(@encoding, 'LATIN1', content)
>     >        end
>     >
>     >        register("ICONV") do |obj|
>     >          Iconv.conv(UTF_8, obj.encoding, nil)
>     >          class << obj
>     >            alias decode decode_iconv
>     >            alias encode encode_iconv
>     >          end
>     >        end
>     >      end
>     >     end
>     >     =======================
>     >
>     >     So I think this has to do with REXML expecting UTF-8, which is fine in
>     >     Ruby 1.9, but Ruby 1.8 uses LATIN1 I think. I hacked the file as shown
>     >     above and this works, but what I really want to do is override the
>     >     "encode_iconv" method in this module in my own code so I am not
>     >     changing the core libraries.
>     >
>     >     I have attempted a couple of things but can't quite get it, such as
>     >     putting this at the top of my main file before the call that uses it:
>     >     ===========================
>     >     include REXML
>     >
>     >     module REXML
>     >       module Encoding
>     >         def encode_iconv(content)
>     >           Iconv.conv('ISO-8859-1', 'LATIN1', content)
>     >         end
>     >       end
>     >     end
>     >     ===========================
>     >
>     >     To test I just created a simple XML document
>     >       @doc = Document.new("<?xml version='1.0' encoding='iso-8859' ?><some text here/>")
>     >       ... add some elements etc ...
>     >     and then this is the call where it is invoked:
>     >       @doc.write(xml_string,0)
>     >
>     >     I also tried the eval technique from a David Black post and saw some
>     >     posts indicating use of self and an object reference.
>     >
>     >     Anyone got an idea? Is it because it is mixed in somewhere when REXML
>     >     is loaded and thus my override attempts are not recognized?
>     >
>     >     Thanks,
>     >      Carl

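On the override question quoted above, here is a small sketch of how alias behaves in plain Ruby. It uses hypothetical stand-in names (Codec, Writer, encode_impl), not REXML's real code, and I have not confirmed that this is what happens inside REXML. The point it shows: alias copies the method body that exists at the moment it runs, so redefining the original method name afterwards does not change an alias that was already made. Likewise, if the library file is loaded after an override, it would simply define the method again on top of it.

```ruby
# Stand-in for a library module. Codec, Writer and encode_impl are
# hypothetical names for illustration only.
module Codec
  def encode_impl(text)
    "original:#{text}"
  end
end

class Writer
  include Codec
end

writer = Writer.new

# Alias on the object's singleton class, in the style of the register block.
class << writer
  alias encode encode_impl
end

# Redefine the method in the module AFTER the alias was created.
module Codec
  def encode_impl(text)
    "override:#{text}"
  end
end

direct  = writer.encode_impl("x")  # calls the new body
aliased = writer.encode("x")       # still calls the body captured by alias

# Running alias again picks up the current body.
class << writer
  alias encode encode_impl
end
realiased = writer.encode("x")
```

So the order matters: an override only takes effect if it is in place before the alias is made, or if the alias is made again afterwards.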

--~--~---------~--~----~------------~-------~--~----~
SD Ruby mailing list
[email protected]
http://groups.google.com/group/sdruby
-~----------~----~----~----~------~----~------~--~---