Perhaps this is the time for me to ask: does anyone know a way for grep or awk to extract from a text file any sequence of up to, say, six words that begins and ends with an initially-capitalised word -- whether or not it is part of a larger matching sequence?

So if the input text was:

"Sally Lee Jones worked for the United Nations Support Team"

the output would be

Sally Lee
Lee Jones
Sally Lee Jones
Jones worked for the United
Jones worked for the United Nations
United Nations
Nations Support
Support Team
United Nations Support Team

I don't particularly care if it takes one pass or several, and I can clean up duplicates afterwards.

This is not a serious problem for me -- it falls into the 'would be nice to have' category -- but I've been puzzling over it off and on for a while, and the mention of the word 'greedy' reminded me of it.

Jon.

On 14/07/10 18:06, Nick Andrew wrote:
(aaa...)&

Where the stuff inside () is what's being matched. The matched part stops at the first & or the end of the string. It's greedy so it matches as long a string as possible.

Nick.
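
On the question of capitalised-word sequences: a single greedy regex pass is unlikely to be enough, because tools such as grep -o report only non-overlapping matches, so "Lee Jones" is never seen once "Sally Lee Jones" has been consumed. One possible approach is to skip the regex matching of whole sequences and let awk walk every start/end pair of words inside a sliding window. The following is only a minimal sketch under some assumptions that are not in the original post: the function name cap_spans is made up for illustration, "initially-capitalised" is taken to mean "starts with an ASCII A-Z", a sequence must have at least two words, and sequences never cross a line break. Words with leading punctuation (an opening quote, say) are not recognised as capitalised.

```sh
# cap_spans [MAX]: read text on stdin and print every run of 2..MAX words
# (default 6) that starts and ends with an initially-capitalised word.
# Hypothetical helper name; ASCII capitals only; one input line at a time.
cap_spans() {
  awk -v max="${1:-6}" '
  {
    # Try every start word i, then every end word j within the window.
    for (i = 1; i <= NF; i++) {
      if ($i !~ /^[A-Z]/) continue          # must start on a capitalised word
      span = $i
      for (j = i + 1; j <= NF && j - i < max; j++) {
        span = span " " $j
        if ($j ~ /^[A-Z]/) print span       # must end on a capitalised word
      }
    }
  }'
}
```

It could be used as `cap_spans 6 < input.txt | sort -u`, where input.txt is a placeholder file name and sort -u takes care of the duplicates. Because it prints every qualifying window, it yields a few more lines than the sample output in the question, for instance "United Nations Support".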
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
