Yes, I'm happy because I found a non-regex way to solve the problem (see below).

No, I'm not a student or worn out but wish I was back at college and partying!

Yes, this is an interesting problem and here is the requirement:

- A text document contains special words that start and end with a period 
("."); the text between the two periods contains no punctuation or 
spaces, except for a hyphen in some special words.
- Examples of special words include ".thrfore.", ".because.", ".music-sharp.", 
".music-flat.", ".dbd.", ".vertline.", ".uparw.", ".hoarfrost." etc.
- In most cases, the special words have a space (" ") before and after.
- In some cases, a special word will be followed by one or two other special 
words, e.g. ".dbd..vertline." or ".music-flat..dbd..vertline."
- In some cases, a special word will be followed by an ordinary word (with or 
without punctuation), e.g. ".music-flat.mozart" or ".vertline.isn't"
- A special word followed by an ordinary word (with or without punctuation) 
could be the end of a sentence and hence end with a full stop ("."), e.g. 
".music-flat.mozart." or ".vertline.isn't."
- The number of characters in a special word excluding the two periods is > 1
- Find and remove all special words from the text document (by processing one 
line at a time)
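
For the case where no list of special words is available, the requirement 
above can be approximated with a regular expression. What follows is only a 
rough sketch under my own assumptions: special words consist of ASCII letters 
with optional single hyphens between letter runs, and a special word starts 
either at the beginning of a token or directly after another period. The names 
SPECIAL_RE and remove_by_pattern are made up for illustration. False positives 
in ordinary text (for example after an ellipsis) are still possible.

```python
import re

# Assumed shape of a special word: a period, two or more letters/hyphens
# (letter runs joined by single hyphens), then a closing period.
# The lookbehind only allows a match at the start of the line, after
# whitespace, or after another period (so ".dbd..vertline." works but
# the ".python." inside "www.python.org." is left alone).
SPECIAL_RE = re.compile(
    r"(?<![^\s.])"            # not preceded by a letter, digit, etc.
    r"\.(?=[A-Za-z-]{2,}\.)"  # opening period, at least 2 characters inside
    r"[A-Za-z]+(?:-[A-Za-z]+)*"
    r"\."                     # closing period
)


def remove_by_pattern(line):
    """Remove everything in the line that looks like a special word."""
    return SPECIAL_RE.sub("", line)
```

Note that this leaves the surrounding spaces in place, so a removed word that 
stood alone leaves two adjacent spaces behind.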

How did I solve it?  I found a list of all the special words, built a set from 
it, and then checked whether each word in the text belonged to that set.  If 
we assume that the list of special words doesn't exist, the problem is still 
an interesting one to solve.
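
The set-based approach could look roughly like the sketch below. It is not the 
exact code that was used: the set contents are just the examples from the 
requirement, and the function names strip_special_prefixes and 
remove_special_words are made up for illustration. It assumes special words 
only occur at the front of a whitespace-separated token, which covers the 
cases listed in the requirement.

```python
# Assumed, incomplete list: only the examples given in the requirement.
SPECIAL_WORDS = {
    ".thrfore.", ".because.", ".music-sharp.", ".music-flat.",
    ".dbd.", ".vertline.", ".uparw.", ".hoarfrost.",
}


def strip_special_prefixes(token, special_words=SPECIAL_WORDS):
    """Repeatedly remove known special words from the front of a token."""
    while token.startswith("."):
        end = token.find(".", 1)  # position of the closing period
        if end == -1 or token[:end + 1] not in special_words:
            break  # not a known special word, leave the rest alone
        token = token[end + 1:]
    return token


def remove_special_words(line, special_words=SPECIAL_WORDS):
    """Remove special words from one line of text."""
    cleaned = (strip_special_prefixes(t, special_words) for t in line.split())
    # Tokens that consisted only of special words become empty; drop them.
    return " ".join(t for t in cleaned if t)
```

Because the line is split and re-joined, runs of whitespace collapse to a 
single space. Anything that merely looks like a special word but is not in the 
set is left untouched, which is what avoids the false matches a pattern would 
produce.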

Cheers!

Dinesh


--------------------------------------------------------------------------------

Date: Sun, 1 Jun 2008 21:56:26 -0400
From: "Kent Johnson" <[EMAIL PROTECTED]>
Subject: Re: [Tutor] finding special character string
To: "Marilyn Davis" <[EMAIL PROTECTED]>
Cc: [email protected]
Message-ID:
<[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Jun 1, 2008 at 9:41 PM, Marilyn Davis <[EMAIL PROTECTED]> wrote:

> Yeh, we need a better spec. I was wondering if the stuff between the text
> ought not include white space, or even a word boundary.  A character class
> might be better, if we knew.

Hmm, yes, my regex will find many ordinary sentences in plain text.

> Anyhow, I think we wore out the student. :^)

He went away happy after my first reply.

Kent


_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
