[MacRuby-devel] encoding mismatch ?

2010-03-11 Thread Yvon Thoraval
I have to read a UTF-8 encoded file containing accented characters.

If I do:

#! /usr/local/bin/macruby
# encoding: utf-8

SIGNATURES_FILE = "/Users/yt/dev/Signature/signatures.txt"
open(SIGNATURES_FILE) do |file|
  file.each do |line|
    puts line
  end
end

I get the correct characters in the Terminal, i.e.:

...
-- 
« Pour ceux qui vont chercher midi à quatorze heures,
la minute de vérité risque de se faire longtemps attendre. »
(Pierre Dac)


However, when I do the following (with or without the .force_encoding("UTF-8")
calls and the "r:UTF-8" mode):

t = "".force_encoding("UTF-8")
open(SIGNATURES_FILE, "r:UTF-8") do |file|
  file.each do |line|
    t += line.force_encoding("UTF-8")
  end
end
puts t

I get the wrong characters:

...
-- 
« Pour ceux qui vont chercher midi à quatorze heures,
la minute de vérité risque de se faire longtemps attendre. »
(Pierre Dac)

The only difference is that I accumulate the lines in a String, t = "".

What did I misunderstand?
___
MacRuby-devel mailing list
MacRuby-devel@lists.macosforge.org
http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel


[MacRuby-devel] RE2: a principled approach to regular expression matching - Google Open Source Blog

2010-03-11 Thread Ernest N. Prabhakar, Ph.D.
Fascinating.  Wonder if Ruby might benefit from this...

http://google-opensource.blogspot.com/2010/03/re2-principled-approach-to-regular.html

Today, we released RE2 as an open source project. It's a mostly drop-in
replacement for PCRE's C++ bindings and is available under a BSD-style
license. See the RE2 project page for details.


Re: [MacRuby-devel] encoding mismatch ?

2010-03-11 Thread Laurent Sansonetti
Hi Yvon,

1.9 encodings in trunk have very little support for now, but we significantly 
improved them in a branch that might get merged into trunk in a few days (maybe 
today). I will post an update here once it's done.

Laurent

On Mar 11, 2010, at 8:46 AM, Yvon Thoraval wrote:

> I have to read a UTF-8 encoded file containing accented characters.
>
> If I do:
>
> #! /usr/local/bin/macruby
> # encoding: utf-8
>
> SIGNATURES_FILE = "/Users/yt/dev/Signature/signatures.txt"
> open(SIGNATURES_FILE) do |file|
>   file.each do |line|
>     puts line
>   end
> end
>
> I get the correct characters in the Terminal, i.e.:
>
> ...
> -- 
> « Pour ceux qui vont chercher midi à quatorze heures,
> la minute de vérité risque de se faire longtemps attendre. »
> (Pierre Dac)
>
>
> However, when I do the following (with or without the .force_encoding("UTF-8")
> calls and the "r:UTF-8" mode):
>
> t = "".force_encoding("UTF-8")
> open(SIGNATURES_FILE, "r:UTF-8") do |file|
>   file.each do |line|
>     t += line.force_encoding("UTF-8")
>   end
> end
> puts t
>
> I get the wrong characters:
>
> ...
> -- 
> « Pour ceux qui vont chercher midi à quatorze heures,
> la minute de vérité risque de se faire longtemps attendre. »
> (Pierre Dac)
>
> The only difference is that I accumulate the lines in a String, t = "".
>
> What did I misunderstand?


Re: [MacRuby-devel] RE2: a principled approach to regular expression matching - Google Open Source Blog

2010-03-11 Thread Laurent Sansonetti
Sounds interesting, but the project page says it's not good for
back-references :)
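
To illustrate what a back-reference is, here is a minimal Ruby sketch. RE2 documents that it does not support back-references, since it trades them for guaranteed linear-time matching. The constant DOUBLED_WORD and the helper doubled_word are hypothetical names made up for this example; they are not part of MacRuby or RE2.

```ruby
# Hypothetical example: a pattern that relies on a back-reference.
# \1 must match exactly the text captured by the first group, which is
# the kind of construct RE2 leaves out.
DOUBLED_WORD = /\b(\w+) \1\b/

# Returns the first word that is immediately repeated, or nil if none is.
def doubled_word(text)
  match = DOUBLED_WORD.match(text)
  match && match[1]
end

puts doubled_word("this is is a test").inspect
puts doubled_word("no repeats here").inspect
```

Ruby code that uses patterns like this one would still need an engine with back-reference support.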

As some of you may know, we are rewriting String, Symbol and Regexp on top of
the ICU framework in a branch (called "icu"). The work is almost done at this
point and will get merged into trunk soon. These changes bring many advantages,
mostly related to performance, thread safety and Ruby compatibility.

Laurent

On Mar 11, 2010, at 12:21 PM, Ernest N. Prabhakar, Ph.D. wrote:

> Fascinating.  Wonder if Ruby might benefit from this...
> 
> http://google-opensource.blogspot.com/2010/03/re2-principled-approach-to-regular.html
> 
> Today, we released RE2 as an open source project. It's a mostly drop-in
> replacement for PCRE's C++ bindings and is available under a BSD-style
> license. See the RE2 project page for details.


Re: [MacRuby-devel] encoding mismatch ?

2010-03-11 Thread Yvon Thoraval
2010/3/11 Laurent Sansonetti 

> Hi Yvon,
>
> 1.9 encodings in trunk have very little support for now, but we
> significantly improved them in a branch that might get merged into trunk in
> a few days (maybe today). I will post an update here once it's done.
>
> Laurent
>
>
Fine, thanks!
My script works very well with the MacPorts version of Ruby 1.9.
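
For reference, here is a minimal sketch of the accumulating version as it could be written for a Ruby with 1.9-style encodings. The helper name read_utf8 and the temporary file standing in for signatures.txt are assumptions made for this example only. The file is opened with an explicit external encoding and the lines are appended in place with <<, so no per-line force_encoding call is needed.

```ruby
require "tempfile"

# Hypothetical helper (not from the thread): read a UTF-8 file line by line
# and accumulate the lines into a single String.
def read_utf8(path)
  text = String.new.force_encoding("UTF-8")  # empty buffer tagged as UTF-8
  File.open(path, "r:UTF-8") do |file|       # declare the external encoding once
    file.each_line { |line| text << line }   # << appends in place
  end
  text
end

# Demo: a temporary file stands in for signatures.txt (an assumption).
Tempfile.create("signatures") do |tmp|
  tmp.write("la minute de v\u00e9rit\u00e9\n")  # \u00e9 is "é"
  tmp.flush
  text = read_utf8(tmp.path)
  puts text
  puts text.encoding
  puts text.valid_encoding?
end
```

Checking String#encoding and String#valid_encoding? on the accumulated string is a quick way to see whether the bytes were tagged as expected.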