Hi,

> Indeed, String#[] will now perform slower on UTF-8 non-ASCII strings,
> because computing the character index cannot be done in constant time
> anymore. I don't believe this can be improved using the optimization
> we implemented for #gsub and #scan. Maybe 1.9.2 has a better
> optimization, I will let Vincent comment :)
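To illustrate what "not constant time" means here (a minimal sketch,
nothing MacRuby-specific; the string sizes are arbitrary):

    require 'benchmark'

    ascii = "a" * 1_000_000          # 1 byte per character
    multi = "\u00e9" * 1_000_000     # 2 bytes per character in UTF-8

    Benchmark.bm(8) do |x|
      # ASCII-only: the byte offset equals the character offset,
      # so indexing is O(1).
      x.report("ascii") { 1_000.times { ascii[500_000, 30] } }
      # Multibyte: the byte offset of character 500,000 has to be
      # computed by scanning the bytes (or from a cached offset).
      x.report("utf-8") { 1_000.times { multi[500_000, 30] } }
    end

Note that every call allocates a new substring, so object allocation
cost is on the hot path in this kind of benchmark too.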
> text = File.read("test.txt")
> 1000.times do |i|
>   a = text[i,i+30]
> end

In fact I already use the cache to get the offset for the end index. I
just had a look at 1.9.2, and what they do is pretty similar to what we
do. I would not be surprised if the difference was mainly due to the
object allocator being much slower in MacRuby. I would need to profile
it with Shark to be sure, but I would not expect much improvement on
String#[] soon.

By the way, to try UTF-16 you should not use force_encoding but encode,
and UTF-16LE rather than UTF-16BE:

    text = text.encode(Encoding::UTF_16LE)

UTF-16LE is the fastest encoding, not BE, because the native byte order
on x86 is little-endian. Also, on a UTF-8 string, forcing the encoding
to ASCII or BINARY (ASCII-8BIT) would make sense, since all ASCII
characters are encoded identically in UTF-8 and ASCII, but forcing it
to UTF-16 would give you a meaningless string full of strange
characters.
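To make the encode vs. force_encoding distinction concrete, here is a
small example (just a sketch; any UTF-8 string containing non-ASCII
characters will do):

    utf8 = "h\u00e9llo"               # UTF-8 source, "é" is 2 bytes

    # encode converts the bytes to the new encoding: still "héllo".
    utf16 = utf8.encode(Encoding::UTF_16LE)

    # force_encoding only relabels the existing UTF-8 bytes, so the
    # result is a meaningless UTF-16LE string (dup because
    # force_encoding mutates the receiver).
    garbage = utf8.dup.force_encoding(Encoding::UTF_16LE)

    # Relabeling as BINARY (ASCII-8BIT) is safe for byte-oriented
    # work, because it does not reinterpret anything.
    bytes = utf8.dup.force_encoding(Encoding::BINARY)

In short, force_encoding is only correct when the bytes are already
valid in the target encoding, which is the ASCII/BINARY case above.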