[ https://issues.apache.org/jira/browse/SOLR-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468490 ]
Yonik Seeley commented on SOLR-122:
-----------------------------------
OK, check this out... my second Ruby coding attempt ever. The first was the
6-line program here: http://wiki.apache.org/solr/SolRuby
At first I thought the speed difference might be due to gsub scanning the
string 3 times. Then I started fooling around with it and realized the
slowdown must be because a string pattern is being "compiled" on every
evaluation (just a guess); the "2" versions use regex literals instead. I also
wrote a single-pass version that's a little faster yet.
I didn't test the XML versions since I don't have libxml (and I'm not even sure
how to get/install it... I'm obviously not a Ruby person). *But* since these
versions are roughly 10 times faster than the original string concat versions,
I assume they will be perhaps 5 times faster than libxml, assuming the code is
actually doing what it's supposed to and I didn't make some horrible mistake.
                                         user     system      total        real
string concatenation:                6.812000   0.171000   6.983000 (  7.172000)
string substitution:                 6.922000   0.141000   7.063000 (  7.250000)
string concatenation2:               1.047000   0.000000   1.047000 (  1.078000)
string substitution2:                0.953000   0.000000   0.953000 (  0.969000)
catenation w/ single pass escape:    0.734000   0.000000   0.734000 (  0.750000)
substitution w/ single pass escape:  0.657000   0.000000   0.657000 (  0.656000)
require "benchmark"

#TESTS = 1_000_000
TESTS = 100_000

# Single-pass escape: one scan of the string, the block picks the entity.
def escape(text)
  text.gsub(/([&<>])/) { |ch|
    case ch
    when '&' then '&amp;'
    when '<' then '&lt;'
    when '>' then '&gt;'
    end
  }
end

Benchmark.bmbm do |results|
  # String patterns, three chained gsub calls.
  results.report("string concatenation:") do
    TESTS.times do
      x = "<blah>"
      x << "woot".gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
      x << "</blah>"
    end
  end
  results.report("string substitution:") do
    TESTS.times do
      x = "<blah>#{"woot".gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")}</blah>"
    end
  end
  # Same as above, but with regex literals instead of string patterns.
  results.report("string concatenation2:") do
    TESTS.times do
      x = "<blah>"
      x << "woot".gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')
      x << "</blah>"
    end
  end
  results.report("string substitution2:") do
    TESTS.times do
      x = "<blah>#{"woot".gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')}</blah>"
    end
  end
  results.report("catenation w/ single pass escape:") do
    TESTS.times do
      x = "<blah>"
      x << escape("woot")
      x << "</blah>"
    end
  end
  results.report("substitution w/ single pass escape:") do
    TESTS.times do
      x = "<blah>#{escape('woot')}</blah>"
    end
  end
end
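
Since the benchmark only escapes "woot", which contains none of the special
characters, it does not by itself show that the variants produce the same
output. A minimal sanity check might look like the sketch below. The method
names escape_chained and escape_single_pass and the SAMPLE string are
hypothetical, introduced only for this illustration:

```ruby
# Chained escape: three passes with regex literals, as in the "2" variants.
# '&' must be replaced first so the entities added later are not re-escaped.
def escape_chained(text)
  text.gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')
end

# Single-pass escape: one scan, the block picks the entity for each match.
def escape_single_pass(text)
  text.gsub(/[&<>]/) do |ch|
    case ch
    when '&' then '&amp;'
    when '<' then '&lt;'
    when '>' then '&gt;'
    end
  end
end

# Hypothetical input that exercises all three special characters.
SAMPLE = 'a < b && c > d'

chained = escape_chained(SAMPLE)
single  = escape_single_pass(SAMPLE)

puts chained
puts(chained == single ? "variants match" : "variants DIFFER")
```

If the two variants ever disagree, the timing comparison above would be
comparing different work, so a check like this is worth running before trusting
the numbers.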
> Add optional support for Ruby-libxml2 (vs. REXML)
> -------------------------------------------------
>
> Key: SOLR-122
> URL: https://issues.apache.org/jira/browse/SOLR-122
> Project: Solr
> Issue Type: Improvement
> Components: clients - ruby - flare
> Reporter: Coda Hale
> Attachments: libxml.rb, libxml.rb
>
>
> This file adds drop-in support for the ruby-libxml2, which is a wrapper for
> the libxml2 library, which is an order of magnitude or so faster than REXML.
> This depends on my SOLR-121 patch for multi-document adds, since the behavior
> of Solr::Request::AddDocument#to_s is different.
> Requiring this makes some tests fail, but for trivial reasons: some tests are
> directly tied to REXML, others fail due to interelement whitespace added by
> libxml2 (which you can't disable via the Ruby interface). Functionally, it's
> identical, and passes all functional tests.