How to form a dict out of a string by doing regex ?

2011-06-15 Thread Satyajit Sarangi


data = GEOMETRYCOLLECTION (POINT (-8.96484375
-4.130859375000), POINT (2.021484375000 -2.63671875),
POINT (-1.40625000 -11.162109375000), POINT
(-11.95312500,-10.89843750), POLYGON
((-21.62109375 1.845703125000,2.46093750
2.197265625000, -18.98437500 -3.69140625,
-22.67578125 -3.33984375, -22.14843750
-2.63671875, -21.62109375
1.845703125000)),LINESTRING (-11.95312500
11.337890625000, 7.73437500 11.513671875000,
12.30468750 2.548828125000, 12.216796875000
1.669921875000, 14.501953125000 3.955078125000))

This is my string.
How do I traverse it and form three dicts, for Point, Polygon and
Linestring, containing the coordinates?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to form a dict out of a string by doing regex ?

2011-06-15 Thread Mel
Satyajit Sarangi wrote:

 
 
> data = GEOMETRYCOLLECTION (POINT (-8.96484375
> -4.130859375000), POINT (2.021484375000 -2.63671875),
> POINT (-1.40625000 -11.162109375000), POINT
> (-11.95312500,-10.89843750), POLYGON
> ((-21.62109375 1.845703125000,2.46093750
> 2.197265625000, -18.98437500 -3.69140625,
> -22.67578125 -3.33984375, -22.14843750
> -2.63671875, -21.62109375
> 1.845703125000)),LINESTRING (-11.95312500
> 11.337890625000, 7.73437500 11.513671875000,
> 12.30468750 2.548828125000, 12.216796875000
> 1.669921875000, 14.501953125000 3.955078125000))
>
> This is my string.
> How do I traverse it and form three dicts, for Point, Polygon and
> Linestring, containing the coordinates?

Except for those space-separated number pairs, it could be a job for some 
well-crafted classes (e.g. `class GEOMETRYCOLLECTION ...`, `class POINT 
...`) and eval.

My approach would be to use a loop with regexes to recognize the leading 
element and pick out its arguments, then use the string split and strip 
methods beyond that point.  Like (untested):

import re

# keyword, bracketed arguments, optional comma, then the rest of the string
recognizer = re.compile(
    r'\s*(POINT|POLYGON|LINESTRING)\s*\(+(.*?)\)+\s*,?(.*)', re.DOTALL)
# regex is not good with nested brackets,
# so kill off the outer nested brackets
s1 = 'GEOMETRYCOLLECTION ('
if data.startswith(s1):
    data = data[len(s1):-1]

while data:
    match = recognizer.match(data)
    if not match:
        break   # nothing usable in data
    ## now the matched groups will be:
    ## 1: the keyword
    ## 2: the arguments inside the smallest bracketed sequence
    ## 3: the rest of data
    ## so use str.split and str.strip to pull out the individual arguments,
    ## and lastly
    data = match.group(3)

This is all from memory.  I might have got some details wrong in recognizer.
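
For comparison, here is a minimal sketch of the same idea that collects the results into one dict keyed by geometry type. It swaps the explicit loop for `findall`, and it pairs up numbers instead of splitting on commas, so it happens to tolerate a comma inside a coordinate pair. The sample string is a shortened, hypothetical stand-in for the original data, and the names `item`, `number` and `parse_geometries` are illustrative only. A polygon with several rings would not be handled by this pattern.

```python
import re
from collections import defaultdict

# Hypothetical, shortened sample in the same shape as the original data.
data = ("GEOMETRYCOLLECTION (POINT (-8.96484375 -4.130859375), "
        "POINT (-11.953125,-10.8984375), "
        "POLYGON ((-21.62109375 1.845703125, 2.4609375 2.197265625, "
        "-21.62109375 1.845703125)), "
        "LINESTRING (-11.953125 11.337890625, 7.734375 11.513671875))")

# A keyword followed by the text inside its innermost run of brackets.
item = re.compile(r'(POINT|POLYGON|LINESTRING)\s*\(+([^()]*)\)+')
# A plain decimal number with an optional sign.
number = re.compile(r'-?\d+(?:\.\d+)?')

def parse_geometries(text):
    """Group coordinate pairs by geometry keyword."""
    shapes = defaultdict(list)
    for keyword, args in item.findall(text):
        values = [float(n) for n in number.findall(args)]
        # Pair up consecutive numbers as (x, y).
        pairs = list(zip(values[0::2], values[1::2]))
        shapes[keyword].append(pairs)
    return dict(shapes)

shapes = parse_geometries(data)
```

Each value in `shapes` is a list with one entry per geometry, and each entry is a list of `(x, y)` tuples.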

Mel.



Re: How to form a dict out of a string by doing regex ?

2011-06-15 Thread Miki Tebeka
One solution is https://gist.github.com/1027445.
Note that you have a stray comma in your last POINT.

However, I recommend using some kind of parser framework (PLY?).
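
To show what the parser route buys, here is a rough hand-written sketch rather than PLY itself: a regex tokenizer feeding a small recursive-descent parser. It targets Python 3 (it uses `nonlocal`), and the grammar and every name in it (`TOKEN`, `tokenize`, `parse`) are assumptions made for illustration, not taken from the gist above. Because the grammar is strict, the stray comma is reported as an error instead of being silently accepted.

```python
import re

# One token: a keyword, a signed decimal number, or a single punctuation mark.
TOKEN = re.compile(r'\s*([A-Za-z]+|-?\d+(?:\.\d+)?|[(),])')

def tokenize(text):
    """Split the input into a flat list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError('unexpected character at offset %d' % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(text):
    """Return (keyword, items); a coordinate pair becomes an (x, y) tuple."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError('expected %r, got %r' % (expected, tok))
        pos += 1
        return tok

    def body():
        # '(' item (',' item)* ')'
        take('(')
        items = [item()]
        while peek() == ',':
            take(',')
            items.append(item())
        take(')')
        return items

    def item():
        tok = peek()
        if tok == '(':
            return body()              # nested ring, as in POLYGON ((...))
        if tok is not None and tok.isalpha():
            return (take(), body())    # KEYWORD ( ... )
        return (number(), number())    # a space-separated coordinate pair

    def number():
        tok = take()
        try:
            return float(tok)
        except ValueError:
            raise ValueError('expected a number, got %r' % tok)

    result = item()
    if peek() is not None:
        raise ValueError('trailing input: %r' % peek())
    return result

tree = parse('GEOMETRYCOLLECTION (POINT (1 2), LINESTRING (0 0, 1.5 -2))')
```

Calling `parse('POINT (1,2)')` raises `ValueError`, which is how a parser would surface the stray comma.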


Re: How to form a dict out of a string by doing regex ?

2011-06-15 Thread Terry Reedy

On 6/15/2011 10:42 AM, Satyajit Sarangi wrote:



> data = GEOMETRYCOLLECTION (POINT (-8.96484375
> -4.130859375000), POINT (2.021484375000 -2.63671875),
> POINT (-1.40625000 -11.162109375000), POINT
> (-11.95312500,-10.89843750), POLYGON
> ((-21.62109375 1.845703125000,2.46093750
> 2.197265625000, -18.98437500 -3.69140625,
> -22.67578125 -3.33984375, -22.14843750
> -2.63671875, -21.62109375
> 1.845703125000)),LINESTRING (-11.95312500
> 11.337890625000, 7.73437500 11.513671875000,
> 12.30468750 2.548828125000, 12.216796875000
> 1.669921875000, 14.501953125000 3.955078125000))
>
> This is my string.


Is this what you are given by an unchangeable external source, or can you
get something a bit better? One object per line would make the problem
pretty simple, with no regex required.
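
As a rough illustration of how simple that case becomes, assume a hypothetical layout with the keyword first and comma-separated coordinate pairs after it. This layout is invented for the example and is not something the original source is known to produce.

```python
# Hypothetical one-object-per-line input (not the format in the original post).
lines = """\
POINT -8.96484375 -4.130859375
POINT 2.021484375 -2.63671875
LINESTRING -11.953125 11.337890625, 7.734375 11.513671875
"""

shapes = {}
for line in lines.splitlines():
    if not line.strip():
        continue                      # skip blank lines
    # The first word is the keyword; everything after it is coordinates.
    kind, _, rest = line.partition(' ')
    pairs = [tuple(float(v) for v in pair.split())
             for pair in rest.split(',')]
    shapes.setdefault(kind, []).append(pairs)
```

Only `str.partition` and `str.split` are needed here, because the line boundaries already separate the objects.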



> How do I traverse it and form three dicts, for Point, Polygon and
> Linestring, containing the coordinates?


Dicts map keys to values. I do not see any key values above. It looks 
like you really want three sets.
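
A small sketch of the three-sets idea, assuming the string has already been parsed into `(kind, pairs)` tuples by some other means. The `parsed` list below is made-up sample input, and the variable names are illustrative.

```python
# Hypothetical, already-parsed input: (kind, coordinate pairs) tuples.
parsed = [
    ('POINT', [(-8.96484375, -4.130859375)]),
    ('POINT', [(2.021484375, -2.63671875)]),
    ('POINT', [(2.021484375, -2.63671875)]),   # duplicate on purpose
    ('LINESTRING', [(-11.953125, 11.337890625), (7.734375, 11.513671875)]),
]

points, polygons, linestrings = set(), set(), set()
targets = {'POINT': points, 'POLYGON': polygons, 'LINESTRING': linestrings}

for kind, pairs in parsed:
    # Set members must be hashable, so store each geometry as a tuple of pairs.
    targets[kind].add(tuple(pairs))
```

The duplicate point collapses into a single member, which is the main practical difference from keeping lists. If a dict is still wanted, the geometry type is the only obvious key, as `targets` shows.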



--
Terry Jan Reedy
