Yes, I'm happy because I found a non-regex way to solve the problem (see below).
No, I'm not a student or worn out, but I wish I were back at college and partying!
Yes, this is an interesting problem and here is the requirement:
- A text document contains special words that start and end with a period
("."). The word between the two periods contains no punctuation or spaces,
except for a hyphen in some special words.
- Examples of special words include ".thrfore.", ".because.", ".music-sharp.",
".music-flat.", ".dbd.", ".vertline.", ".uparw.", ".hoarfrost." etc.
- In most cases, the special words have a space (" ") before and after.
- In some cases, a special word will be followed by one or two other special
words, e.g. ".dbd..vertline." or ".music-flat..dbd..vertline."
- In some cases, a special word will be followed by an ordinary word (with or
without punctuation), e.g. ".music-flat.mozart" or ".vertline.isn't"
- A special word followed by an ordinary word (with or without punctuation)
could be at the end of a sentence and hence end with a full stop ("."), e.g.
".music-flat.mozart." or ".vertline.isn't."
- The number of characters in a special word, excluding the two periods, is
greater than 1
- Find and remove all special words from the text document (by processing one
line at a time)
How did I solve it? I found a list of all the special words, created a set of
special words and then checked if each word in the text belonged to the set of
special words. If we assume that the list of special words doesn't exist then
the problem is interesting in itself to solve.
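
A minimal sketch of the set-based approach described above, assuming Python
and assuming that special words only ever appear at the start of a
whitespace-delimited token (which is what the examples in the requirement
show). The contents of SPECIAL_WORDS and the helper names
strip_leading_special and remove_special_words are made up for illustration;
the real word list is not reproduced in this message.

```python
# Hypothetical sample of the special-word list (taken from the examples above).
SPECIAL_WORDS = {
    ".thrfore.", ".because.", ".music-sharp.", ".music-flat.",
    ".dbd.", ".vertline.", ".uparw.", ".hoarfrost.",
}


def strip_leading_special(token, special_words):
    """Peel known special words off the front of one token."""
    while token.startswith("."):
        end = token.find(".", 1)  # closing period of the candidate word
        if end == -1 or token[:end + 1] not in special_words:
            break  # not a known special word, leave the rest alone
        token = token[end + 1:]
    return token


def remove_special_words(line, special_words):
    """Process one line: clean each token and drop tokens that become empty."""
    cleaned = (strip_leading_special(t, special_words) for t in line.split())
    return " ".join(t for t in cleaned if t)


if __name__ == "__main__":
    print(remove_special_words("play .music-flat.mozart. .dbd..vertline. now",
                               SPECIAL_WORDS))
```

Because each candidate is looked up in the set, an unknown ".word." or an
ordinary "..." is left untouched. Note that line.split() collapses runs of
whitespace, so the original spacing of the line is not preserved.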
Cheers!
Dinesh
--------------------------------------------------------------------------------
Date: Sun, 1 Jun 2008 21:56:26 -0400
From: "Kent Johnson" <[EMAIL PROTECTED]>
Subject: Re: [Tutor] finding special character string
To: "Marilyn Davis" <[EMAIL PROTECTED]>
Cc: [email protected]
On Sun, Jun 1, 2008 at 9:41 PM, Marilyn Davis <[EMAIL PROTECTED]> wrote:
> Yeh, we need a better spec. I was wondering if the stuff between the text
> ought not include white space, or even a word boundary. A character class
> might be better, if we knew.
Hmm, yes, my regex will find many ordinary sentences in plain text.
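
To illustrate the point, here is a small sketch. The regex from the earlier
reply is not quoted in this message, so NAIVE below is only a hypothetical
stand-in for a loose pattern, and STRICT is one possible character-class
version based on the stated requirement (letters and hyphens only, at least
two characters). Both names and the sample text are invented for this
example.

```python
import re

# Hypothetical loose pattern: anything (non-greedy) between two periods.
NAIVE = re.compile(r"\..+?\.")

# Possible tighter pattern: letters and hyphens only, two or more characters.
STRICT = re.compile(r"\.[A-Za-z-]{2,}\.")

text = "It rained. We stayed in. The .dbd..vertline. marks remain."

# The loose pattern also captures an ordinary sentence sitting between
# two full stops.
print(NAIVE.findall(text))

# The character class rejects anything containing a space.
print(STRICT.findall(text))

# Removing matches leaves the surrounding spaces in place.
print(STRICT.sub("", text))
```

Even the tighter pattern can misfire on dotted names: on "www.python.org" it
would match ".python.", so without a word list some false positives seem
hard to avoid.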
> Anyhow, I think we wore out the student. :^)
He went away happy after my first reply.
Kent
_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor