At Thu, 8 Sep 2016 07:09:51 +0000, "Tsunakawa, Takayuki" <tsunakawa.ta...@jp.fujitsu.com> wrote in <0A3221C70F24FB45833433255569204D1F5E7D4A@G01JPEXMBYT05>
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro
> HORIGUCHI
> > <Using radix tree>
> > $ time psql postgres -c 'select t.a from t, generate_series(0, 9999)' > /dev/null
> >
> > real 0m22.696s
> > user 0m16.991s
> > sys  0m0.182s
> >
> > Using binsearch the result for the same operation was
> > real 0m35.296s
> > user 0m17.166s
> > sys  0m0.216s
> >
> > Returning in UTF-8 bloats the result string by about 1.5 times so it doesn't
> > seem to make sense comparing with it. But it takes real = 47.35s.
>
> Cool, 36% speedup! Does this difference vary depending on the actual
> characters used, e.g. the speedup would be greater if most of the characters
> are ASCII?

Binary search on JIS X 0208 always needs about 10 rounds of comparison
and bisection, while the radix tree needs three hops across arrays for
most characters and two hops for some. In short, the effect does not
differ between 2-byte and 3-byte characters in UTF-8.

The translation speed of ASCII characters (U+20 - U+7f) is not affected
by the character conversion mechanism. They are just copied without
conversion.

As a result, there is no speedup if the output consists only of ASCII
characters, and the speedup is at its maximum when the output consists
only of 2-byte UTF-8 characters.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
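
To make the hop counts above concrete, here is a minimal Python sketch, not the PostgreSQL patch itself. It assumes a UTF-8 to EUC-JP mapping built from Python's own euc_jp codec (which also covers characters outside JIS X 0208, so the table is larger than the one discussed above), and every function name in it is hypothetical. It contrasts a bisecting lookup over a sorted table with a lookup that indexes one 256-entry array per input byte.

```python
def build_mapping():
    """Map 3-byte UTF-8 sequences (as integers) to EUC-JP codes (as integers).

    Assumption: Python's euc_jp codec stands in for the real conversion map.
    """
    mapping = {}
    for cp in range(0x0800, 0x10000):        # code points that need 3 bytes in UTF-8
        try:
            ch = chr(cp)
            src = int.from_bytes(ch.encode("utf-8"), "big")
            dst = int.from_bytes(ch.encode("euc_jp"), "big")
        except UnicodeEncodeError:            # surrogates, or no EUC-JP equivalent
            continue
        mapping[src] = dst
    return mapping


def build_sorted_table(mapping):
    """Sorted (source, target) pairs, the input for binary search."""
    return sorted(mapping.items())


def binsearch_lookup(table, code):
    """Return (target, steps); steps counts compare-and-bisect rounds."""
    lo, hi, steps = 0, len(table) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        key, value = table[mid]
        if key == code:
            return value, steps
        if key < code:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps


def build_radix_tree(mapping):
    """Three levels of 256-entry arrays, one level per input byte."""
    root = [None] * 256
    for code, target in mapping.items():
        b1, b2, b3 = (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF
        if root[b1] is None:
            root[b1] = [None] * 256
        if root[b1][b2] is None:
            root[b1][b2] = [None] * 256
        root[b1][b2][b3] = target
    return root


def radix_lookup(root, code):
    """Return (target, hops); each hop is a single array index operation."""
    node, hops = root, 0
    for shift in (16, 8, 0):
        hops += 1
        node = node[(code >> shift) & 0xFF]
        if node is None:                      # unmapped byte sequence
            return None, hops
    return node, hops


if __name__ == "__main__":
    mapping = build_mapping()
    table = build_sorted_table(mapping)
    tree = build_radix_tree(mapping)
    key = int.from_bytes("\u3042".encode("utf-8"), "big")   # HIRAGANA LETTER A
    print(binsearch_lookup(table, key))
    print(radix_lookup(tree, key))
```

In this sketch the radix lookup always takes three hops for a mapped 3-byte sequence, while the number of bisecting rounds grows with the logarithm of the table size. ASCII input would bypass both lookups entirely, which matches the point that ASCII-only output sees no speedup.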