On Thu, May 28, 2020 at 5:54 PM Peter Otten <__pete...@web.de> wrote:
>
> Chris Angelico wrote:
>
> > Situation: A terminal application. Requirement: Display nicely-wrapped
> > text. With colour codes in it. And that text might be indented to any
> > depth.
> >
> > label = f"{indent}\U0010cc32{code}\U0010cc00 @{tweet['user']['screen_name']}: "
> > wrapper = textwrap.TextWrapper(
> >     initial_indent=label,
> >     subsequent_indent=indent + " " * 12,
> >     width=shutil.get_terminal_size().columns,
> >     break_long_words=False, break_on_hyphens=False,  # Stop URLs from breaking
> > )
> > for line in tweet["full_text"].splitlines():
> >     print(wrapper.fill(line)
> >         .replace("\U0010cc32", "\x1b[32m\u2026")
> >         .replace("\U0010cc00", "\u2026\x1b[0m")
> >     )
> >     wrapper.initial_indent = wrapper.subsequent_indent  # For subsequent lines, just indent them
> >
> >
> > The parameter "indent" is always some number of spaces (possibly
> > zero). If I simply include the escape codes in the label, their
> > characters will be counted, and the first line will be shorter. Rather
> > than mess with how textwrap defines text, I just replace the escape
> > codes *and one other character* with a placeholder. In the final
> > display, \U0010cc32 means "colour code 32 and an ellipsis", and
> > \U0010cc00 means "colour code 0 and an ellipsis", so textwrap
> > correctly counts them as one character each.
> >
> > So what do you folks think? Is this a gloriously elegant way to
> > collapse nonprinting text, or is it a gross hacky mess
>
> Yes ;)
... I should have expected a "yes" response to an either-or question.
Silly of me. :)

> > that's going to cause problems?
>
> Probably not.
>
> I had a quick look at the TextWrapper class, and it doesn't really lend
> itself to clean and elegant customisation. However, my first idea to
> approach this problem was to patch the len() builtin:
>
> import re
> import textwrap
>
> text = """The parameter "indent" is always some number of spaces (possibly
> zero). If I simply include the escape codes in the label, their
> characters will be counted, and the first line will be shorter. Rather
> than mess with how textwrap defines text, I just replace the escape
> codes *and one other character* with a placeholder. In the final
> display, \U0010cc32 means "colour code 32 and an ellipsis", and
> \U0010cc00 means "colour code 0 and an ellipsis", so textwrap
> correctly counts them as one character each.
> """
>
> print(textwrap.fill(text, width=40))
>
> # add some color to the text sample
> GREEN = "\x1b[32m"
> NORMAL = "\x1b[0m"
> parts = text.split(" ")
> parts[::2] = [GREEN + p + NORMAL for p in parts[::2]]
> ctext = " ".join(parts)
>
> # wrong wrapping
> print(textwrap.fill(ctext, width=40))
>
> # fixed wrapping
> def color_len(s):
>     return len(re.compile(r"\x1b\[\d+m").sub("", s))
>
> textwrap.len = color_len
> print(textwrap.fill(ctext, width=40))
>
> The output of my ad-hoc test script looks OK. However, I did not try to
> understand the word-breaking regexes, so I don't know if the escape codes
> can be spread across words which would confuse color_len(). Likewise, I have
> no idea if textwrap can cope with zero-length chunks.
>
> But at least now you have two -- elegant or gross -- hacks to choose from ;)
>

Yeah, I thought of this originally as a challenge in redefining the
concept of "length".
But the trouble is that it might not always be the len() function that
figures out the length - there might be a regex with a size threshold
or any number of other things that effectively think about the length
of the string.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
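
For anyone who wants to try the placeholder idea discussed in this thread
without the tweet data, here is a minimal, self-contained sketch. It is not
the code from either post: the names wrap_with_label, visible_len, PH_GREEN
and PH_RESET are made up for illustration, the four-space continuation indent
is arbitrary, and it assumes the text being wrapped never contains the two
private-use placeholder characters itself.

```python
import re
import textwrap

# All names below are hypothetical, chosen for this sketch only.
GREEN = "\x1b[32m"
RESET = "\x1b[0m"
ELLIPSIS = "\u2026"
# Each placeholder stands for "colour code plus one visible character",
# so textwrap counts it as exactly one column.
PH_GREEN = "\U0010cc32"
PH_RESET = "\U0010cc00"
# Raw string, so the regex engine sees the backslash escapes itself.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def visible_len(s):
    """Length of s with SGR colour codes removed."""
    return len(ANSI_SGR.sub("", s))


def wrap_with_label(code, text, width, indent=""):
    """Wrap text behind a coloured label using one-column placeholders."""
    label = f"{indent}{PH_GREEN}{code}{PH_RESET} "
    wrapper = textwrap.TextWrapper(
        initial_indent=label,
        subsequent_indent=indent + " " * 4,
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )
    filled = wrapper.fill(text)
    # Swap each placeholder for its colour code and one visible character.
    return (filled
            .replace(PH_GREEN, GREEN + ELLIPSIS)
            .replace(PH_RESET, ELLIPSIS + RESET))


if __name__ == "__main__":
    print(wrap_with_label("abc123", "word " * 30, width=40))
```

Because the substitution happens only after textwrap has finished, this does
not depend on how textwrap measures length internally, which is the concern
raised above about patching len(). The cost is that each colour code has to
be paired with exactly one visible character.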