Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i686' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i686-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DDEBUG -DMALLOC_DEBUG -I. -I. -I./include -I./lib -g -O2 -Wno-parentheses -Wno-format-security
uname output: Linux Xaox 4.4.0-tm3 #2 Mon Feb 22 13:26:44 CET 2016 i686 GNU/Linux
Machine Type: i686-pc-linux-gnu

Bash Version: 4.4
Patch Level: 0
Release Status: rc2 / release

Description:
The tests below were performed with 4.4.0-rc2. The problem is still
present in 4.4.0-release, where execution times are about 20% higher.

Repeated pattern substitution (here: removal) using an extended pattern
on variables of considerable size is extremely time- and CPU-consuming.
The command that revealed the problem was:

    D=${C//\[+([0-9])\]=}

The variable C contains the output of 'declare -p A', where A is an
array of 510 file names, so C contains 510 matches. As can be seen
below, commands like

    D=${C//u+([a-z])}    or    D=${C//@(usr)}

also trigger the problem, but _not_ commands like

    D=${C//usr}          or    D=${C//u[a-z][a-z]}

See the test case and statistics below.

Of course, the problem is easily solved by a mini sed(1) script, but
every now and then I try commands like the above, because I think that
simple tasks should be doable without the aid of external programs.
In many such cases, however, I must sadly accept that using external
programs, especially sed(1), is the quicker method. Additionally, I will
have to revise my script (a ~100kb font editor) and possibly replace
other constructs that use extended pattern matching.
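
A possible pure-bash workaround, given only as a sketch: since plain bracket expressions such as u[a-z][a-z] are reported above as unaffected, the subscripts can be removed with one non-extended pass per digit count instead of +([0-9]). The sample file names, the helper name strip_subscripts, and the use of ${C#*=} in place of the sed call are assumptions made for illustration. The sketch also assumes a dense array, so that the highest subscript is the element count minus one. It has not been timed on 4.4.

```shell
#!/usr/bin/env bash
shopt -s extglob   # only needed for the comparison at the end

# Hypothetical sample data standing in for the 510 font file names.
B=( /usr/share/consolefonts/font{0..10}.psf.gz )
C=$(declare -p B)
C=${C#*=}          # drop 'declare -<attr> <name>=' (stand-in for the sed call)

# Remove '[<subscr>]=' using one plain bracket-expression pass per digit
# count: [0-9], then [0-9][0-9], and so on. No extended pattern is used.
strip_subscripts() {
    local s=$1 max_digits=$2 digits='' i
    for (( i = 1; i <= max_digits; i++ )); do
        digits+='[0-9]'
        s=${s//\[$digits\]=}
    done
    printf '%s\n' "$s"
}

last_index=$(( ${#B[@]} - 1 ))   # assumes a dense array
D_plain=$(strip_subscripts "$C" "${#last_index}")

# Reference result using the extended pattern from the report.
D_ext=${C//\[+([0-9])\]=}

printf '%s\n' "$D_plain"
```

For 510 elements this amounts to three passes over C. Like the original command, it would also strip text of the form '[<digits>]=' that happens to occur inside a file name.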

Repeat-By:
-----------------------------------------------------------------------
declare -a B
A=( /usr/share/consolefonts/* )   # column 2: here 510 files
# A=( "${A[@]##*/}" )             # column 3: pure filenames
# A=( "${A[@]/*/a}" )             # column 4: "a"
# A=( "${A[@]/*}" )               # column 5: "" (empty)

for matches in {10..500..10}; do
    B=( "${A[@]:0:matches}" )                   # reduce array
    C=`declare -p B | sed -r "s/^[^=]+=?//"`    # rm 'declare -<attr> <name>='
    time D="${C//\[+([0-9])\]=}"                # rm '[<subscr>]='
done
------------------------------------------------------------------------

results (all with >99% cpu):

number of |            contents of array elements
matches   | size=${#C}  path/file |  file   |  "a"   |  empty
--------------------------------------------------------------
  10:     |   369 bytes    0.099s |  0.014s | 0.007s |  0.005s
  20:     |   900          1.261s |  0.315s | 0.048s |  0.036s
  30:     |  1453          5.274s |  1.538s | 0.168s |  0.134s
  40:     |  2070         15.030s |  4.868s | 0.406s |  0.324s
  50:     |  2655         31.830s | 10.694s | 0.814s |  0.644s
  60:     |  3240         56.831s | 19.203s | 1.423s |  1.130s
  70:     |  3837         94.022s | 32.356s | 2.299s |  1.829s
  80:     |  4384        139.000s | 47.079s | 3.473s |  2.751s
  90:     |  4998        204.683s |         | 4.955s |  3.932s
 100:     |  5567        283.118s |         | 6.871s |  5.452s
 110:     |  6135                 |         | 9.495s |  7.547s
 120:     |  6664                 |         |        | 10.164s
 200:     | 15554                 |         |        | 55.529s

I was too impatient to wait for the run with the complete array of 510
elements to finish.

The following test results all belong to columns 1 + 2.

the command: time D=`sed -r "s/\[[0-9]+\]=//g"<<<"$C"`

 510:     | 27137 bytes, R:0.020 U:0.007 S:0.007 67.66%    ok!

other commands:

       size=${#C}    D=${C//usr}    D=${C//u[a-z][a-z]}
--------------------------------------------------------
 100:   5567 bytes     0.004s           0.004s            ok!
 200:  11167           0.012s           0.012s
 300:  16712           0.024s           0.024s
 400:  21818           0.038s           0.040s
 500:  26647           0.056s           0.057s

but:

       D=${C//u+([a-z])}    D=${C//@(usr)}
  10:       0.136s             0.112s         >99% cpu
  20:       1.647s             1.078s
  30:       6.467s             4.014s
  40:      17.912s            10.886s
  50:      38.178s            22.391s

which seems to indicate that extended pattern matching causes the problem.
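
For anyone without /usr/share/consolefonts, the plain-versus-extended comparison can probably be reproduced with synthetic input. The following is a minimal sketch under that assumption: the helper make_input and the generated file names are hypothetical, and the sizes are kept small so that the loop finishes quickly even on an affected version. Timings are printed to stderr and will differ from the figures above.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Synthetic input (an assumption, not the data used in the report):
# n entries shaped like the elements in 'declare -p' output.
make_input() {
    local n=$1 s='' i
    for (( i = 0; i < n; i++ )); do
        s+="[$i]=\"/usr/share/consolefonts/font$i.psf.gz\" "
    done
    printf '%s' "$s"
}

TIMEFORMAT='%Rs'   # print wall-clock seconds only
for n in 5 10 20; do
    C=$(make_input "$n")
    printf 'matches=%d size=%d bytes\n' "$n" "${#C}" >&2

    printf '  plain    ${C//usr}     : ' >&2
    time D_plain=${C//usr}

    printf '  extended ${C//@(usr)}  : ' >&2
    time D_ext=${C//@(usr)}
done
```

Both substitutions remove the same literal text, so D_plain and D_ext should be identical after each pass; only the elapsed time is expected to differ.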

Please CC answers to me as I am not subscribed to the list.