How to form a dict out of a string by doing regex ?
data = GEOMETRYCOLLECTION (POINT (-8.96484375 -4.130859375000), POINT (2.021484375000 -2.63671875), POINT (-1.40625000 -11.162109375000), POINT (-11.95312500,-10.89843750), POLYGON ((-21.62109375 1.845703125000,2.46093750 2.197265625000, -18.98437500 -3.69140625, -22.67578125 -3.33984375, -22.14843750 -2.63671875, -21.62109375 1.845703125000)),LINESTRING (-11.95312500 11.337890625000, 7.73437500 11.513671875000, 12.30468750 2.548828125000, 12.216796875000 1.669921875000, 14.501953125000 3.955078125000))

This is my string. How do I traverse it and form 3 dicts, one each for Point, Polygon and Linestring, containing the co-ordinates?

-- http://mail.python.org/mailman/listinfo/python-list
Re: How to form a dict out of a string by doing regex ?
Satyajit Sarangi wrote:

> data = GEOMETRYCOLLECTION (POINT (-8.96484375 -4.130859375000), POINT (2.021484375000 -2.63671875), POINT (-1.40625000 -11.162109375000), POINT (-11.95312500,-10.89843750), POLYGON ((-21.62109375 1.845703125000,2.46093750 2.197265625000, -18.98437500 -3.69140625, -22.67578125 -3.33984375, -22.14843750 -2.63671875, -21.62109375 1.845703125000)),LINESTRING (-11.95312500 11.337890625000, 7.73437500 11.513671875000, 12.30468750 2.548828125000, 12.216796875000 1.669921875000, 14.501953125000 3.955078125000))
>
> This is my string. How do I traverse through it and form 3 dicts of Point, Polygon and Linestring containing the co-ordinates?

Except for those space-separated number pairs, it could be a job for some well-crafted classes (e.g. `class GEOMETRYCOLLECTION ...`, `class POINT ...`) and eval.

My approach would be to use a loop with regexes to recognize the leading element and pick out its arguments, then use the string split and strip methods beyond that point. Like (untested):

    import re

    # data is assumed to hold the string quoted above.
    # The comma after an element is optional, so the last element matches too.
    recognizer = re.compile(r'\s*(POINT|POLYGON|LINESTRING)\s*\(+(.*?)\)+,?\s*(.*)')

    # regex is not good with nested brackets,
    # so kill off outer nested brackets..
    s1 = 'GEOMETRYCOLLECTION ('
    if data.startswith(s1):
        data = data[len(s1):-1]

    while data:
        match = recognizer.match(data)
        if not match:
            break   # nothing usable in data
        ## now the matched groups will be:
        ## 1: the keyword
        ## 2: the arguments inside the smallest bracketed sequence
        ## 3: the rest of data
        ## so use str.split and str.strip to pull out the individual arguments,
        ## and lastly
        data = match.group(3)

This is all from memory. I might have got some details wrong in recognizer.

Mel.
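For what it's worth, the loop-and-regex idea above could be finished off roughly as follows. This is only a sketch, checked against a shortened version of the posted string and nothing else. The names `ITEM`, `NUMBER`, `parse_geometries` and `sample` are made up for illustration, and it swaps the suggested str.split/str.strip step for a second regex that picks out the numbers, because the stray comma in one of the POINTs would trip up a plain split. The result is one dict keyed by geometry name rather than three separate dicts.

```python
import re

# One geometry: a keyword, opening bracket(s), then everything up to the
# first closing bracket. GEOMETRYCOLLECTION itself is simply skipped.
ITEM = re.compile(r'(POINT|POLYGON|LINESTRING)\s*\(+([^()]*)\)+')
# A signed decimal number such as -8.96484375
NUMBER = re.compile(r'-?\d+(?:\.\d+)?')


def parse_geometries(data):
    """Return {keyword: [list of (x, y) pairs, one list per geometry]}."""
    result = {'POINT': [], 'POLYGON': [], 'LINESTRING': []}
    for keyword, args in ITEM.findall(data):
        numbers = [float(n) for n in NUMBER.findall(args)]
        # Pair up consecutive numbers as (x, y); this also copes with
        # "POINT (-11.95312500,-10.89843750)", which uses a comma
        # where the other points use a space.
        pairs = list(zip(numbers[0::2], numbers[1::2]))
        result[keyword].append(pairs)
    return result


# Shortened version of the posted string, for illustration only.
sample = (
    "GEOMETRYCOLLECTION (POINT (-8.96484375 -4.130859375000), "
    "POINT (-11.95312500,-10.89843750), "
    "POLYGON ((-21.62109375 1.845703125000,2.46093750 2.197265625000, "
    "-21.62109375 1.845703125000)),"
    "LINESTRING (-11.95312500 11.337890625000, 7.73437500 11.513671875000))"
)
geometries = parse_geometries(sample)
```

Each geometry keeps its own list of pairs, so `geometries['POINT']` holds one single-pair list per point and `geometries['POLYGON'][0]` holds the vertices of the first polygon in order.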
Re: How to form a dict out of a string by doing regex ?
One solution: https://gist.github.com/1027445

Note that you have a stray comma in your last POINT: (-11.95312500,-10.89843750) separates the two numbers with a comma where every other pair uses a space.

I would recommend using some kind of parser framework instead, however (PLY?).
Re: How to form a dict out of a string by doing regex ?
On 6/15/2011 10:42 AM, Satyajit Sarangi wrote:

> data = GEOMETRYCOLLECTION (POINT (-8.96484375 -4.130859375000), POINT (2.021484375000 -2.63671875), POINT (-1.40625000 -11.162109375000), POINT (-11.95312500,-10.89843750), POLYGON ((-21.62109375 1.845703125000,2.46093750 2.197265625000, -18.98437500 -3.69140625, -22.67578125 -3.33984375, -22.14843750 -2.63671875, -21.62109375 1.845703125000)),LINESTRING (-11.95312500 11.337890625000, 7.73437500 11.513671875000, 12.30468750 2.548828125000, 12.216796875000 1.669921875000, 14.501953125000 3.955078125000))
>
> This is my string.

Is this what you are given by an unchangeable external source, or can you get something a bit better? One object per line would make the problem pretty simple, with no regex required.

> How do I traverse through it and form 3 dicts of Point, Polygon and Linestring containing the co-ordinates?

Dicts map keys to values. I do not see any keys above. It looks like you really want three sets.

--
Terry Jan Reedy
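To make the two suggestions above concrete, here is a rough sketch of what "one object per line, no regex, three sets" might look like. The input layout is purely hypothetical (the original poster never said whether the source could be changed), the stray comma in the fourth POINT is assumed to be fixed already, and the names `lines`, `parse_line` and `targets` are invented for the example.

```python
# Hypothetical input: a few of the posted geometries, one object per line,
# with every coordinate pair written as "x y".
lines = """\
POINT (-8.96484375 -4.130859375000)
POINT (-11.95312500 -10.89843750)
POLYGON ((-21.62109375 1.845703125000, 2.46093750 2.197265625000, -21.62109375 1.845703125000))
LINESTRING (-11.95312500 11.337890625000, 7.73437500 11.513671875000)
""".splitlines()


def parse_line(line):
    """Split 'KEYWORD (x y, x y, ...)' into (keyword, tuple of (x, y) pairs)."""
    keyword, _, rest = line.partition(' ')
    # Drop all leading and trailing brackets, however deeply nested.
    body = rest.strip().strip('()')
    coords = tuple(
        tuple(float(n) for n in pair.split())
        for pair in body.split(',')
    )
    return keyword, coords


# Three sets rather than three dicts, since there are no keys to map from.
points, polygons, linestrings = set(), set(), set()
targets = {'POINT': points, 'POLYGON': polygons, 'LINESTRING': linestrings}
for line in lines:
    keyword, coords = parse_line(line)
    targets[keyword].add(coords)
```

Coordinates are stored as tuples of tuples so that they are hashable and can go into a set; the order of vertices inside each polygon or linestring is preserved, while the order of the geometries themselves is not.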