Hi!

----

While testing the new AST "tr" builtin I hit a performance issue with
the -C option on Solaris 11/B145/AMD64. Running the demo below takes
"forever" in AST "tr" (aborted after 10 mins) in locales like
en_US.UTF-8, uk_UA.UTF-8 and de_DE.UTF-8, but not in ja_JP.UTF-8 or in
the libast-internal C.UTF-8 locale.

For comparison, the native Solaris 11 /usr/xpg6/bin/tr takes around
1.75 seconds to execute these lines:
-- snip --
$ timex ~/bin/ksh -c 'builtin tr ; /usr/xpg6/bin/tr -Cs "[:alpha:]" "[\n*]" <<<"hello chicken world" ; true'
hello
chicken
world

real           1.75
user           1.69
sys            0.03
-- snip --

Trying the same with the libast-internal C.UTF-8 locale works, too:
-- snip --
$ timex ~/bin/ksh -c 'builtin tr ; LC_ALL=C.UTF-8 tr -Cs "[:alpha:]" "[\n*]" <<<"hello chicken world" ; true'
hello
chicken
world

real           0.16
user           0.09
sys            0.06
-- snip --

... but with the en_US.UTF-8 locale it takes "forever". I let it run
for 10 minutes and it never returned in that time (the sample run
below was interrupted after about 1.5 minutes):
-- snip --
$ timex ~/bin/ksh -c 'builtin tr ; LC_ALL=en_US.UTF-8 tr -Cs "[:alpha:]" "[\n*]" <<<"hello chicken world" ; true'
^CCommand terminated abnormally.

real        1:32.07
user        1:31.93
sys            0.12
-- snip --

Interestingly the same works OK in the ja_JP.UTF-8 locale:
-- snip --
$ timex ~/bin/ksh -c 'builtin tr ; LC_ALL=ja_JP.UTF-8 tr -Cs "[:alpha:]" "[\n*]" <<<"hello chicken world" ; true'
hello
chicken
world

real           0.56
user           0.42
sys            0.08
-- snip --

... but in the de_DE.UTF-8 and uk_UA.UTF-8 locales (and likely many
others) it runs for a long time again. A stack trace sampled during
such a run looks like this:
-- snip --
0xfffffd7fff2b0879: coll_cookie_init+0x0021:    testl $0x0000000000000018,0x0000000000000040(%r12)
(dbx) where
=>[1] coll_cookie_init(0xfffffd7fffdfe958, 0xfffffd7fffdfe7c0, 0x30, 0x10, 0xfffffd7fffdfe980, 0xfffffd7ffe633d88), at 0xfffffd7fff2b0879
  [2] __wcscoll(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff2971f4
  [3] __wcscoll_bc(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff297156
  [4] _wcscoll(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff2af59c
  [5] collate(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x4e4467
  [6] qsort(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff1e4bc7
  [7] tropen(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x4e4eaf
  [8] b_tr(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x4e531d
  [9] sh_exec(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x49163b
  [10] sh_exec(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x48fb66
  [11] exfile(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x438a51
  [12] sh_main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x438170
  [13] main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x436faa
-- snip --
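... it looks like the |qsort()| called from |tropen()| runs forever,
with the time being spent in |wcscoll()| via the |collate()|
comparison function.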
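
In case someone wants to reproduce this over several locales in one
go, the individual runs above could be wrapped roughly like this. It is
only a sketch: the function name run_tr_case and the locale list are my
own illustration, it assumes that "tr" resolves to the implementation
under test (for the AST builtin, run it from ksh after "builtin tr"),
and it assumes that date(1) understands +%s, which only gives
whole-second resolution:

```shell
#!/bin/sh
# Hypothetical helper: run the squeeze test under one locale and print
# the locale, a coarse wall-clock time and the tr output.
# Assumption: "tr" is the implementation under test.
run_tr_case() {
    loc=$1
    start=$(date +%s)    # assumption: date supports %s
    out=$(printf 'hello chicken world\n' |
        LC_ALL=$loc tr -Cs '[:alpha:]' '[\n*]')
    end=$(date +%s)
    printf '%s: %ss\n%s\n' "$loc" "$((end - start))" "$out"
}

# Illustrative locale list; add de_DE.UTF-8, uk_UA.UTF-8 etc. as needed.
for loc in C en_US.UTF-8 ja_JP.UTF-8; do
    run_tr_case "$loc"
done
```

In an affected locale the call will simply not return, so be prepared
to interrupt it with ^C.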

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
