There have been a number of interesting suggestions as to whether string.format should support pipelined conversion specifiers, nested conversion specifiers, and so forth.
I'm going to follow Guido's lead at this point and say that these kinds of decisions should perhaps be made after looking at a sample implementation. At the same time, I want to make that as easy as possible, so I'm posting a sample implementation here to use as a starting point.

Now, I'm not actually going to post a patch that adds a "format" method to the built-in string type. Instead, I am going to post a function that has the behavior that I am looking for. It's not the greatest Python code in the world, but that's not its purpose. I hacked this up over the course of about an hour, so it's probably got a bug or two.

In a real implementation, both the string.format and the MyFormatter.format functions would call this underlying 'engine' to do the work of parsing the field names and specifiers.

Note: I decided to scan the string character by character rather than using regular expressions because (a) braces can nest recursively, and (b) something like this may go into the interpreter, and we don't want to add a dependency on re.

Anyway, if you have an idea as to how things should behave differently, feel free to hack this, play with it, test out your idea, and then describe what you did.

--- Talin

----------------------------------------------------------------------------

# Python string formatting

# Exception for errors in the format string.
class FormatError(StandardError):
    pass

def format(template, format_hook, *args, **kwargs):
    # Using array types since we're going to be growing
    # a lot.
    from array import array
    array_type = 'c'

    # Use unicode array if the original string is unicode.
    if isinstance(template, unicode):
        array_type = 'u'
    buffer = array(array_type)

    # Track which arguments actually got used
    unused_args = set(kwargs.keys())
    unused_args.update(range(0, len(args)))

    # Inner function to format a field from a value and
    # conversion spec. Most details missing.
    def format_field(value, cspec, buffer):
        # See if there's a hook
        if format_hook and format_hook(value, cspec, buffer):
            return
        # See if there's a __format__ method
        elif hasattr(value, '__format__'):
            buffer.extend(value.__format__(cspec))
        # Example built-in for ints. Probably should be
        # table driven by type, but oh well.
        elif isinstance(value, int):
            if cspec == 'x':
                buffer.extend(hex(value))
            else:
                buffer.extend(str(value))
        # Default to just 'str'
        else:
            buffer.extend(str(value))

    # Parse a field specification.
    def parse_field(iterator, buffer):
        # A separate array for the field name.
        name = array(array_type)

        # Consume from the same iterator.
        for ch in iterator:
            # A sub-field. We just interpret it
            # like a normal field, and append to
            # the name.
            if ch == '{':
                parse_field(iterator, name)
            # End of field. Time to process
            elif ch == '}':
                # Convert the array to string or uni
                if array_type == 'u':
                    name = name.tounicode()
                else:
                    name = name.tostring()

                # Check for conversion spec
                parts = name.split(':', 1)
                conversion = 's'
                if len(parts) > 1:
                    name, conversion = parts

                # Try to retrieve the field value
                try:
                    key = int(name)
                    value = args[key]
                except ValueError:
                    # Keyword args are strings, not uni (so far)
                    key = str(name)
                    value = kwargs[key]

                # If we got no exception, then mark the argument
                # as used. discard() rather than remove(), so a
                # field that appears twice doesn't raise KeyError.
                unused_args.discard(key)

                # Format it
                format_field(value, conversion, buffer)
                return
            elif ch == '\\':
                # Escape: take the next character literally
                try:
                    name.append(iterator.next())
                except StopIteration:
                    # Backslash at end of string is bad
                    raise FormatError("unmatched open brace")
            else:
                name.append(ch)

        # Ran out of input before the closing brace
        raise FormatError("unmatched open brace")

    # Construct an iterator from the template
    template_iter = iter(template)
    for ch in template_iter:
        # It's a field! Yay!
        if ch == '{':
            parse_field(template_iter, buffer)
        elif ch == '}':
            # Unmatched brace
            raise FormatError("unmatched close brace")
        elif ch == '\\':
            # More escapism
            try:
                buffer.append(template_iter.next())
            except StopIteration:
                # Backslash at end of string is OK here
                buffer.append(ch)
                break
        else:
            buffer.append(ch)

    # Complain about unused args
    if unused_args:
        raise FormatError(
            "Unused arguments: "
            + ",".join(str(x) for x in unused_args))

    # Convert the array to its proper type
    if isinstance(template, unicode):
        return buffer.tounicode()
    else:
        return buffer.tostring()

print format("This is a test of {0:x} {x} {1}\{",
             None, 1000, 20, x='hex')
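For anyone who wants to experiment on a current interpreter: the listing in this post targets Python 2 (unicode, StandardError, array('c'), the print statement). What follows is only a minimal sketch of the same character-by-character scanning idea in Python 3, not a port of the engine. The names render and TemplateError are hypothetical, there is no format hook and no unused-argument check, and 'x' on ints is the only specifier handled. It is meant to show how one shared iterator lets a nested field be expanded recursively and spliced into the enclosing field's text.

```python
class TemplateError(ValueError):
    """Raised for malformed templates (hypothetical name, not from the post)."""


def render(template, *args, **kwargs):
    """Expand {name} and {name:spec} fields, scanning one character at a time."""
    chars = iter(template)  # single iterator shared by every nesting level

    def parse_field():
        # Collect the text of one field up to its closing brace.
        name = []
        for ch in chars:
            if ch == '{':
                # Nested field: expand it and splice the result into the name.
                name.append(parse_field())
            elif ch == '}':
                field, _, spec = ''.join(name).partition(':')
                try:
                    value = args[int(field)]   # numeric name: positional arg
                except ValueError:
                    value = kwargs[field]      # anything else: keyword arg
                if spec == 'x' and isinstance(value, int):
                    return hex(value)
                return str(value)
            elif ch == '\\':
                # Inside a field, a backslash escapes the next character.
                try:
                    name.append(next(chars))
                except StopIteration:
                    raise TemplateError("backslash at end of field") from None
            else:
                name.append(ch)
        # Input ran out before the field was closed.
        raise TemplateError("unmatched open brace")

    out = []
    for ch in chars:
        if ch == '{':
            out.append(parse_field())
        elif ch == '}':
            raise TemplateError("unmatched close brace")
        elif ch == '\\':
            # Outside a field, a trailing backslash is kept literally.
            out.append(next(chars, ch))
        else:
            out.append(ch)
    return ''.join(out)


print(render("This is a test of {0:x} {x} {1}\\{", 1000, 20, x='hex'))
print(render("{0:{fmt}}", 255, fmt='x'))
```

The first call prints "This is a test of 0x3e8 hex 20{". The second shows the nested case: {fmt} is expanded to 'x' first, so the outer field becomes 0:x and the result is 0xff.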