Joe Emenaker wrote:

Florin Andrei wrote:

Does anyone have a generic rule to match "stuck keyboard" spam like this one?

"Baaaargaiiiiin baaaasemeeent priiiiiiicing for viaaaaagraaaa"


This regexp...

       (\w)\1{2}

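should catch any word where the same character is repeated three or more times in a row. The "(\w)" matches the first character and the "\1{2}" asks for two more copies of it, so the total is always the number in braces PLUS one. Change the "2" to whatever you want: "(\w)\1{4}" would match 5 of the same char in a row.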
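To make the "number PLUS one" arithmetic concrete, here is a small Python sketch. Python's re module accepts the same backreference syntax; the helper name repeat_pattern is made up for illustration and is not part of any ruleset.

```python
import re

def repeat_pattern(min_run):
    """Build a regex matching min_run or more of the same word character in a row.

    Hypothetical helper: the number in braces is min_run - 1, because the
    captured character itself already counts as the first one.
    """
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    return re.compile(r"(\w)\1{%d}" % (min_run - 1))

three = repeat_pattern(3)  # compiles (\w)\1{2}
five = repeat_pattern(5)   # compiles (\w)\1{4}

word = "b" + "a" * 4 + "rgain"  # a run of four a's
print(three.pattern, bool(three.search(word)))
print(five.pattern, bool(five.search(word)))
```

Because the pattern is unanchored, a longer run still matches: the regex just finds the required number of copies somewhere inside it.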

You need the parens to capture the matched character, which is then reused with the "\1". You can't just say "\w{5}" because that would match ANY 5 word chars, whether or not they are the same.
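A quick way to see the difference between the two patterns, sketched in Python (the test words are my own examples, not from the spam sample):

```python
import re

any_five = re.compile(r"\w{5}")       # any five word characters in a row
same_five = re.compile(r"(\w)\1{4}")  # one captured character, then four copies of it

normal = "basement"
stuck = "pr" + "i" * 5 + "cing"  # "pricing" with the i repeated five times

for word in (normal, stuck):
    # \w{5} hits both words; the backreference version hits only the stuck one.
    print(word, bool(any_five.search(word)), bool(same_five.search(word)))
```

With the backreference, group 1 of the match holds whichever character was repeated, which can be handy when testing a rule.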

You probably want to give some points to any word with three or more (since a scan through my spell dict didn't find ANY English words with more than two of the same char in a row), and then some more points for one with four or more... and then some for five, etc.
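The tiered scoring could look roughly like this in Python. It is only a sketch: the point values and the name stuck_key_score are invented for illustration, and a real setup would express each tier as its own rule with its own score.

```python
import re

# (minimum run length, points) -- hypothetical values, tune to taste.
TIERS = [(3, 1.0), (4, 0.5), (5, 0.5)]

# One pattern per tier: a captured character followed by n - 1 copies of it.
RULES = [(re.compile(r"(\w)\1{%d}" % (n - 1)), pts) for n, pts in TIERS]

def stuck_key_score(text):
    """Sum the points of every tier that each word in the text reaches."""
    score = 0.0
    for word in re.findall(r"\w+", text):
        for rule, pts in RULES:
            if rule.search(word):
                score += pts
    return score

sample = "b" + "a" * 4 + "rgain pr" + "i" * 3 + "cing"
print(stuck_key_score(sample))
print(stuck_key_score("bargain basement pricing"))
```

A word with a run of four trips both the three-tier and the four-tier pattern, so the points stack up the way described above.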

- Joe

I'm running a few tests as I'm typing this. I'll let you know how they did when I've finished.


Jesse
SARE Ninja





