> I am not fully satisfied with this command, because it prints the whole line.

If you want *only* the words between them to be printed, then back references are probably the way out. But the regexp is greedy as usual, so when there is no clear delimiter the bracketed group swallows part of the "stop" string as well. For example:

    bash-3.2$ cat ~/.test
    111 hi 223
    333 bye 456
    555 nothig more to say 555
    nvkjnfv;esn
    bash-3.2$
    bash-3.2$ sed -rn 's/[0-9 ]+(.+)[0-9 ]+/\1/p' ~/.test
    hi 22
    bye 45
    nothig more to say 55

Here (.+) grabs as much as it can and leaves only the last digit for the trailing [0-9 ]+, which is why "22", "45" and "55" leak into the output.

Most regex engines have a way of specifying a non-greedy match, usually like (.*?), but that doesn't work in gawk or sed :(

But in the case you mentioned, the characters " and > *would not* be present in the address, so the regex might be like this:

    sed -rn 's/.*<a href="([^">]+)">.*/\1/p' ~/.test

Note the .* on either side of the tag: without it, sed substitutes only the matched tag and still prints the rest of the line.

The best way out is to use the non-greedy match of Perl/Python, I think. Either that, or you have to make sure that:

1. the "start" string "ends" with a proper delimiter
2. the "stop" string "starts" with a proper delimiter
3. the delimiters used above *do not* occur in the actual string (like <, > and " do not appear in the address).

But this is too restrictive :(

Suresh: any tips/tricks around here?

--
Lots o' Luv,
Phani Bhushan

Let not your sense of morals prevent you from doing what is right
- Isaac Asimov (Salvor Hardin in Foundation and Empire)

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
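
For comparison, here is a minimal sketch of the non-greedy approach in Python, using the standard `re` module. The helper names `between_numbers` and `href_target` and the example.org URL are made up for illustration. The sample lines are the ones from ~/.test above. Note that the lazy group alone is not enough for the numeric case: the pattern also has to be anchored to the whole line (done here with `fullmatch`), otherwise `(.+?)` would stop at the first space.

```python
import re

# Lazy group between two runs of digits/spaces. fullmatch() anchors the
# pattern to the whole line, so (.+?) grows only until the remainder of
# the line is nothing but digits and spaces.
NUMBERS = re.compile(r"[0-9 ]+(.+?)[0-9 ]+")

# Lazy group for the link target: stops at the first '">' it meets.
HREF = re.compile(r'<a href="(.*?)">')


def between_numbers(line):
    """Return the text between the leading and trailing numbers, or None."""
    m = NUMBERS.fullmatch(line)
    return m.group(1) if m else None


def href_target(line):
    """Return the first href value on the line, or None."""
    m = HREF.search(line)
    return m.group(1) if m else None


if __name__ == "__main__":
    sample = [
        "111 hi 223",
        "333 bye 456",
        "555 nothig more to say 555",
        "nvkjnfv;esn",
    ]
    for line in sample:
        word = between_numbers(line)
        if word is not None:
            print(word)  # hi / bye / nothig more to say

    # Hypothetical input line, not taken from the original thread.
    print(href_target('see <a href="http://example.org/page">this</a> now'))
```

Unlike the sed version, the captured text no longer carries the stray "22", "45" or "55", and no delimiter assumptions are needed for the link case.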
