Hi!

----

AFAIK we found a bug in the libast regex engine which manifests itself
when it should match&&capture text with '[' characters.

The following example (derived from Olga's previous work on a
quick&&dirty XML document scanner) shows the issue (note the "[TEXT]" in
variable "xmltext"):
-- snip --
xmltext='<h1><div> a text </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'

# parse
dummy="${xmltext//~(Ex)(?:
    (<!--.+-->)+?|    # xml comments
    (<.+>)+?|         # xml tags
    ([^[><]]+)+?      # xml text
    )/dummy}"

# debug output
printf 'dummy=%q\n' "${dummy}"
print -v .sh.match

# rebuild the original text, based on our matches
nameref nodes_all=.sh.match[0]        # contains all matches
nameref nodes_comments=.sh.match[1]   # contains only XML comment matches
nameref nodes_tags=.sh.match[2]       # contains only XML tag matches
nameref nodes_text=.sh.match[3]       # contains only XML text matches
integer i
for (( i = 0 ; i <= ${#nodes_all[@]} ; i++ )) ; do
    [[ -v nodes_comments[i] ]] && printf '%s' "${nodes_comments[i]}"
    [[ -v nodes_tags[i] ]] && printf '%s' "${nodes_tags[i]}"
    [[ -v nodes_text[i] ]] && printf '%s' "${nodes_text[i]}"
done

printf '\n'
-- snip --

If I run the example I get the following output. The first sign of
trouble is the '[' character in the "...dummydummy[dummy..." output. It
looks like the '[' simply wasn't matched by any of the patterns, so it
is the only character that was not replaced and it has no entry in
.sh.match:
-- snip --
$ ./arch/sol11.i386-64/bin/ksh xmlparse.sh
dummy='dummydummydummydummydummydummydummydummydummydummydummydummydummydummydummydummy[dummydummydummydummydummydummydummydummy'
(
    (
        [0]='<h1>'
        [1]='<div>'
        [2]=' '
        [3]=a
        [4]=' '
        [5]=t
        [6]=e
        [7]=x
        [8]=t
        [9]=' '
        [10]='</div>'
        [11]=M
        [12]=o
        [13]=r
        [14]=e
        [15]=' '
        [16]=T
        [17]=E
        [18]=X
        [19]=T
        [20]=']'
        [21]=.
        [22]='<!-- a comment (<disabled>) -->'
        [23]='</h1>'
    )
    (
        [22]='<!-- a comment (<disabled>) -->'
    )
    (
        [0]='<h1>'
        [1]='<div>'
        [10]='</div>'
        [23]='</h1>'
    )
    (
        [2]=' '
        [3]=a
        [4]=' '
        [5]=t
        [6]=e
        [7]=x
        [8]=t
        [9]=' '
        [11]=M
        [12]=o
        [13]=r
        [14]=e
        [15]=' '
        [16]=T
        [17]=E
        [18]=X
        [19]=T
        [20]=']'
        [21]=.
    )
)
<h1><div> a text </div>More TEXT].<!-- a comment (<disabled>) --></h1>
-- snip --

Glenn: What do you think? It looks like ([^[><]]+)+? does not generate
matches for '[', right?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
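
P.S.: A quick sanity check that the missing '[' is the only difference
between the input and the rebuilt text. This is just a minimal sketch:
both strings are pasted by hand from the run above, the variable names
(original, rebuilt, missing, patched) are made up for this check, and it
only uses plain POSIX shell expansions, so it does not exercise the
libast regex code at all.

```shell
# Both strings are copied from this mail, not produced by .sh.match.
original='<h1><div> a text </div>More [TEXT].<!-- a comment (<disabled>) --></h1>'
rebuilt='<h1><div> a text </div>More TEXT].<!-- a comment (<disabled>) --></h1>'

# Number of characters lost between the input and the rebuilt output.
missing=$(( ${#original} - ${#rebuilt} ))

# Put the '[' back by hand: everything before "TEXT", then '[',
# then everything after "More ".
patched="${rebuilt%%TEXT*}[${rebuilt#*More }"

printf 'missing=%d\n' "$missing"
if [ "$patched" = "$original" ]; then
    printf 'only the [ is missing\n'
else
    printf 'more than the [ differs\n'
fi
```

If the second branch ever fires, something besides the '[' got lost and
the match list above would need a closer look.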