On Thu, Feb 19, 2009 at 10:14 AM, Dinesh B Vadhia <dineshbvad...@hotmail.com> wrote:
> I want a regex to remove control characters (< chr(32) and > chr(126)) from
> strings ie.
>
> line = re.sub(r"[^a-z0-9-';.]", " ", line) # replace all chars NOT A-Z,
> a-z, 0-9, [-';.] with " "
>
> 1. What is the best way to include all the required chars rather than list
> them all within the r"" ?
You have to list either the chars you want, as you have done, or the ones you don't want. You could use
r'[\x00-\x1f\x7f-\xff]' or r'[^\x20-\x7e]'

> 2. How do you handle the inclusion of the quotation mark " ?

Use \", which works even in a raw string: the backslash stays in the pattern, and the regex engine reads \" as a literal quotation mark.

By the way, string.translate() is likely to be faster for this purpose than re.sub(). This recipe might help:
http://code.activestate.com/recipes/303342/

Kent
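
For what it's worth, here is a rough sketch of the three ideas above. It assumes Python 3, where the translate step is the str.translate() method with a dict table rather than the string-module function of the time. The names clean_with_regex, clean_with_translate and keep_listed are made up for illustration and are not from the recipe linked above.

```python
import re

# Matches any character outside printable ASCII (space through tilde).
NON_PRINTABLE = re.compile(r'[^\x20-\x7e]')

def clean_with_regex(line, replacement=" "):
    """Replace every character outside \\x20-\\x7e with `replacement`."""
    return NON_PRINTABLE.sub(replacement, line)

# Translation table for str.translate(): maps the unwanted code points
# in the range 0-255 to a space. Code points above 255 are NOT in the
# table, so they pass through unchanged (unlike the regex version).
TRANSLATE_TABLE = {cp: " " for cp in range(256) if cp < 0x20 or cp > 0x7e}

def clean_with_translate(line):
    """Same idea as clean_with_regex, but for code points 0-255 only."""
    return line.translate(TRANSLATE_TABLE)

# Whitelist variant of the original pattern with the double quote added.
# In a raw string, \" does not end the string; the backslash remains in
# the pattern and the regex engine treats \" as a literal quote.
KEEP = re.compile(r"[^a-z0-9\-';.\"]")

def keep_listed(line):
    """Replace every character not listed in KEEP with a space."""
    return KEEP.sub(" ", line)
```

Whether translate() really beats re.sub() depends on your data and Python version, so it is worth timing both with timeit on a sample of your own lines before committing to one.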