On Jun 2, 2011, at 10:24 AM, Massimo Di Pierro wrote:
>
> The reason for stripping underscores is that they are not visible when
> links are underlined.

They could be treated like spaces (replaced with hyphens).

>
> On Jun 2, 11:34 am, Jonathan Lundell <[email protected]> wrote:
>> On Jun 2, 2011, at 8:53 AM, Massimo Di Pierro wrote:
>>
>>> Use IS_SLUG.urlify(...)
>>
>>> I will remove the latter. Was never intended to be there.
>>
>> There are a couple of things that could be fixed in urlify.
>>
>>> def urlify(value, maxlen=80):
>>>     s = value.decode('utf-8').lower() # to lowercase utf-8
>>>     s = unicodedata.normalize('NFKD', s) # normalize eg è => e, ñ => n
>>>     s = s.encode('ASCII', 'ignore') # encode as ASCII
>>>     s = re.sub('&\w+?;', '', s) # strip html entities
>>
>> the '?' is redundant here
>>
>>>     s = re.sub('[^a-z0-9\-\s]', '', s) # strip all but alphanumeric/hyphen/space
>>
>> the comment says 'space' but the pattern retains other whitespace. See next
>> line.
>>
>>>     s = s.replace(' ', '-') # spaces to hyphens
>>
>> if the previous line stays \s, then this should be \s as well
>>
>>>     s = re.sub('--+', '-', s) # collapse strings of hyphens
>>>     s = s.strip('-') # remove leading and traling hyphens
>>
>> 'trailing'
>>
>>>     return s[:maxlen].strip('-') # enforce maximum length
>>
>> should strip first, then enforce maxlen
>>
>> I wonder whether it's such a great idea to strip underscores.
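
For reference, here is a rough sketch of how urlify might look with the comments above folded in. It is not the version in web2py, only an illustration under a few assumptions: it targets Python 3 and takes a str (the quoted code is Python 2 and decodes a byte string), it converts all whitespace and underscores to hyphens before stripping other characters, and it strips hyphens a second time after truncation so a cut at maxlen cannot leave a trailing hyphen.

```python
import re
import unicodedata


def urlify(value, maxlen=80):
    """Sketch of a slug function with the review comments applied.

    Assumes Python 3 and a str argument; not the web2py implementation.
    """
    s = value.lower()                                # to lowercase
    s = unicodedata.normalize('NFKD', s)             # decompose, e.g. è => e + accent
    s = s.encode('ascii', 'ignore').decode('ascii')  # drop whatever is not ASCII
    s = re.sub(r'&\w+;', '', s)                      # strip html entities (no '?')
    s = re.sub(r'[\s_]+', '-', s)                    # whitespace/underscores to hyphens
    s = re.sub(r'[^a-z0-9-]', '', s)                 # strip all but alphanumeric/hyphen
    s = re.sub(r'-{2,}', '-', s)                     # collapse runs of hyphens
    s = s.strip('-')                                 # remove leading and trailing hyphens
    return s[:maxlen].strip('-')                     # enforce maxlen, then re-strip


if __name__ == '__main__':
    print(urlify('Hello  World_foo'))
    print(urlify('Café &amp; Crème'))
```

Treating underscores like spaces keeps the word boundary in the slug, which addresses the worry about dropping them entirely while still avoiding characters hidden by link underlining.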

