Liam wrote:
------------------------
Alan - the data is of the form -

a = {
    b = 1
    c = 2
    d = {
        e = {
            f = 4
            g = "Ultimate Showdown of Ultimate Destiny"
        }
        h = { i j k }
    }
}

Everything is whitespace delimited. I'd like to turn it into

["a", "=", "{", "b", "=", "1", "c", "=", "2", "d", "=", "{", "e", "=", "{",
 "f", "=", "4", "g", "=", "\"Ultimate Showdown of Ultimate Destiny\"", "}",
 "h", "=", "{", "i", "j", "k", "}", "}", "}"]
-----------------------

Liam,

I'd probably tackle this on a character-by-character basis using traditional tokenising code, i.e. build a state machine that tracks what kind of token I'm in and keeps reading chars until the token completes. Most tokens are simple whitespace-delimited tokens (single chars in your sample); the others are quoted tokens that run to the closing quote. Set the token type appropriately and read till end-of-token. I might even use some classes to define the token types, but I'd keep them as simple as possible.

Rough code:

    DELIMITERS = ' \t\n'    # add any other non-token chars here

    tokens = []             # completed tokens
    token = []              # chars of the token being built
    state = None            # None (between tokens), 'quoted' or 'simple'

    for char in tokenString:
        if state == 'quoted':
            token.append(char)
            if char == '"':             # closing quote completes the token
                tokens.append(''.join(token))
                token, state = [], None
        elif state == 'simple':
            if char in DELIMITERS:      # whitespace completes the token
                tokens.append(''.join(token))
                token, state = [], None
            else:
                token.append(char)
        else:                           # not in a token
            if char in DELIMITERS:
                continue
            state = 'quoted' if char == '"' else 'simple'
            token.append(char)

    if token:                           # flush a token that ends the input
        tokens.append(''.join(token))

You can tidy that up with functions and a proper state jump table, but even as it stands this approach might be faster than trying to build complex pattern matches and doing lots of insertions into lists etc. It does rely on the data being as simple as your sample in the variety of token types.

HTH,

Alan G
Author of the learn to program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld
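
As a rough illustration of the "functions and a proper state jump table" tidy-up mentioned above, here is a minimal sketch. The names `tokenize`, `JUMP_TABLE`, the `handle_*` functions and the state labels are all made up for this example, and it assumes the data only ever contains whitespace-delimited tokens and double-quoted strings with no escaped quotes inside them.

```python
WHITESPACE = " \t\r\n"


# Each handler (hypothetical names) takes the current char, the list of
# chars in the token being built and the list of finished tokens, and
# returns the name of the next state.
def handle_between(char, token, tokens):
    """Not in a token: skip whitespace, otherwise start a new token."""
    if char in WHITESPACE:
        return "between"
    token.append(char)
    return "quoted" if char == '"' else "simple"


def handle_simple(char, token, tokens):
    """In a bare token: whitespace completes it."""
    if char in WHITESPACE:
        tokens.append("".join(token))
        token.clear()
        return "between"
    token.append(char)
    return "simple"


def handle_quoted(char, token, tokens):
    """In a quoted token: keep every char; the closing quote completes it."""
    token.append(char)
    if char == '"':
        tokens.append("".join(token))
        token.clear()
        return "between"
    return "quoted"


# The jump table: current state name -> handler returning the next state.
JUMP_TABLE = {
    "between": handle_between,
    "simple": handle_simple,
    "quoted": handle_quoted,
}


def tokenize(text):
    """Split text into bare tokens and double-quoted tokens (quotes kept)."""
    tokens, token, state = [], [], "between"
    for char in text:
        state = JUMP_TABLE[state](char, token, tokens)
    if token:  # flush a token that runs up to the end of the input
        tokens.append("".join(token))
    return tokens


sample = ('a = { b = 1 c = 2 d = { e = { f = 4 '
          'g = "Ultimate Showdown of Ultimate Destiny" } h = { i j k } } }')
print(tokenize(sample))
```

Adding a new kind of token then means writing one more handler and one more entry in the table, rather than growing the if/elif chain in the main loop.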