On Tue, May 22, 2007, Jörn Zaefferer wrote:

> Dan G. Switzer, II wrote:
> >> This is a little off-topic, but when doing a regex search and replace
> >> within a text editor, how can I replace one character within a
> >> specific pattern?
> >>
> >> I want to get rid of newlines within <td> tags. This finds them:
> >> <td>[^<]+(\r\n).+</td>
> >>
> >> How do I specify that I only want to replace the matched set?
> >>
> >
> > You group all the contents and then the replacement string are all the
> > matched sets pieced back together:
> >
> > sHtml.replace(/(<td>[^<]+)(\r\n)(.+</td>)/gi, "$1$3")
> >
> If I got that right, you could even mark the second group to be skipped by
> adding a colon:
>
> sHtml.replace(/(<td>[^<]+)(:\r\n)(.+</td>)/gi, "$1$2")

The syntax requires a question mark: (?:...)

> Or just skip the parentheses?
>
> sHtml.replace(/(<td>[^<]+)\r\n(.+</td>)/gi, "$1$2")

Yes, but this IMHO is still too weak because...

1. The ".+" in this regex is greedy and matches too much, so you
   would only remove newlines from every _second_ <td>...</td>
   construct. One has to use at least ".+?" to fix this.

2. Additionally, I recommend using \r?\n to support both Windows
   CR-LF and Unix LF-only line endings.

3. I do not understand the [^<]+, as it does NOT allow removing the
   newlines when there is additional markup in the <td> container, as
   in "<td>...\n...<span>...</span>...</td>". I recommend replacing
   it with just ".*?".

4. The "+" quantifier should actually be "*", as it might be fully
   valid to have a "<td>\r\n</td>" container ;-)

5. The </td> has to be written escaped, as in <\/td>, within the
   regex literal.

6. As the "." regex character in JavaScript does NOT match newline
   characters, one has to use "(.|\r?\n)*".

So, I recommend the following stronger version:

    sHtml.replace(/(<td>.*?)\r?\n((?:.|\r?\n)*?<\/td>)/gi, "$1$2")

But even this still has the problem that it is unable to remove
MULTIPLE occurrences of newlines in the SAME <td> container. If this
should also be allowed, one has to trick a little bit more:

    sHtml = sHtml.replace(
        /(<td>)(.*\r?\n(?:.|\r?\n)*?)(<\/td>)/gi,
        function ($0, $1, $2, $3) {
            return $1 + $2.replace(/\r?\n/g, "") + $3;
        }
    );

This now should be a strong enough version and finally do what was
requested...

Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com
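
For anyone who wants to try the points above in a console, here is a minimal sketch, not a definitive implementation. The helper name stripNewlinesInTd and the sample strings are made up for illustration and do not come from the thread. It assumes plain <td> tags without attributes, as in the original question, and it swaps in [\s\S]*? (any character, including newlines, matched lazily) in place of (?:.|\r?\n)*, so that each match ends at the nearest closing </td>.

```javascript
// Hypothetical helper (name is an assumption, not from the thread):
// removes every CR-LF or LF that sits between <td> and the nearest </td>.
function stripNewlinesInTd(html) {
  // [\s\S]*? is lazy, so one match never runs past the first </td>.
  return html.replace(/<td>[\s\S]*?<\/td>/gi, function (cell) {
    // Inner replace handles MULTIPLE newlines within the same cell.
    return cell.replace(/\r?\n/g, "");
  });
}

// Made-up sample: two cells, mixed line endings, newlines outside the cells.
var sample = "<tr>\r\n<td>a\r\nb\nc</td>\n<td>\n</td>\n</tr>";
var result = stripNewlinesInTd(sample);
console.log(JSON.stringify(result));
// prints "<tr>\r\n<td>abc</td>\n<td></td>\n</tr>"

// Point 6: "." does not match a newline, so this pattern finds nothing.
var dotCrossesNewline = /<td>.*<\/td>/.test("<td>a\nb</td>");
console.log(dotCrossesNewline); // prints false

// The single-replace version removes only the FIRST newline in a cell.
var once = "<td>a\nb\nc</td>".replace(
  /(<td>.*?)\r?\n((?:.|\r?\n)*?<\/td>)/gi,
  "$1$2"
);
console.log(JSON.stringify(once)); // prints "<td>ab\nc</td>"
```

Matching the whole cell and cleaning it inside the callback keeps the newlines between cells and rows intact, which is usually what one wants when the goal is only to tidy cell contents.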