Re: [LUG@IITD:10534] Sed expression required.

Phani Bhushan Tholeti Tue, 23 Nov 2010 23:15:00 -0800

> I am not fully satisfied with this command, because, it print whole line.
>


If you want *only* the words between them to be printed, then back
references are probably the way to go. The catch is that the regexp is
greedy as usual: when there is no clear delimiter, the bracketed group
swallows part of the "stop" string as well. See the example below.

Eg:

bash-3.2$ cat ~/.test
111 hi 223
333 bye 456
555 nothig more to say 555
nvkjnfv;esn
bash-3.2$
bash-3.2$ sed -rn 's/[0-9 ]+(.+)[0-9 ]+/\1/p' ~/.test
hi 22
bye 45
nothig more to say 55
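
To spell out what happens above: (.+) grabs as much as it can, and the
trailing [0-9 ]+ is satisfied with a single character, so only the last
digit is left for the "stop" part. A minimal sketch of one workaround,
assuming the wanted text never contains a digit (my assumption, nothing
in the question guarantees it): anchor the pattern and force the group
to end on a character that is neither a digit nor a space. Here -E is
the portable spelling of -r, and the sample data sits in a shell
variable instead of ~/.test.

```shell
# Same sample data as ~/.test above, kept in a variable for convenience.
input='111 hi 223
333 bye 456
555 nothig more to say 555
nvkjnfv;esn'

# ^[0-9 ]+            leading digits/spaces (the "start" part)
# ([^0-9]*[^0-9 ])    text without digits, ending on a non-space character
# [0-9 ]+$            trailing digits/spaces (the "stop" part)
fixed_out=$(printf '%s\n' "$input" |
  sed -En 's/^[0-9 ]+([^0-9]*[^0-9 ])[0-9 ]+$/\1/p')
printf '%s\n' "$fixed_out"
# prints:
# hi
# bye
# nothig more to say
```

The last line of the sample has no digits at all, so it does not match
and is not printed.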


Most regex engines have a way of specifying a non-greedy match, usually
written like (.*?), but that doesn't work in gawk or sed :(

But in the case you mentioned, ***">*** would *not* be present in
the address, so the regex might look like this:

sed -rn 's/<a href="([^">]+)">/\1/p' ~/.test
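
As written, that command replaces only the matched tag, so any text
around it on the same line is still printed. A sketch of a variant that
prints just the address, assuming one link per line; the sample line
and its URL are made up for illustration, and -E is used as the
portable spelling of -r.

```shell
# Hypothetical input line; the URL is only an illustration.
line='see <a href="http://example.org/page">the page</a> for details'

# The .* on both sides swallow everything around the tag, so the whole
# line is replaced by the captured address.
url=$(printf '%s\n' "$line" | sed -En 's/.*<a href="([^">]+)">.*/\1/p')
printf '%s\n' "$url"
# prints: http://example.org/page
```

With several links on one line the leading .* is greedy too, so only
the last address on the line would come out.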

The best way out is to use the non-greedy match of Perl/Python, I
think. Either that, or you have to make sure that:
1. the "start" string "ends" with a proper delimiter
2. the "stop" string "starts" with a proper delimiter
3. the delimiters used above *do not* occur in the actual string (like
<, > and " do not appear in the address).
But this is too restrictive :(
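
For comparison, a rough sketch of the Perl route, assuming perl is
installed. The data is the same sample as in ~/.test, held in a shell
variable; the anchors ^ and $ are my addition so that the lazy group
has something definite to stop against.

```shell
# Same sample data as ~/.test, kept in a variable for convenience.
input='111 hi 223
333 bye 456
555 nothig more to say 555
nvkjnfv;esn'

# (.+?) is lazy: it grows only as far as needed for the trailing
# [0-9 ]+$ to match, so the "stop" digits stay out of the capture.
lazy_out=$(printf '%s\n' "$input" |
  perl -ne 'print "$1\n" if /^[0-9 ]+(.+?)[0-9 ]+$/')
printf '%s\n' "$lazy_out"
```

Unlike a negated character class, this puts no restriction on which
characters may appear inside the captured text.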

Suresh: any tips/tricks around here?


-- 
Lots o' Luv,
Phani Bhushan

Let not your sense of morals prevent you from doing what is right -
Isaac Asimov (Salvor Hardin in Foundation and Empire)

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
