On Sat, 19 Jan 2008 13:50:08 -0800 Alex Mandel <[EMAIL PROTECTED]> wrote:
> I've got a big text file to parse(example below)
> Now I was about to figure out how to parse what I need from this file
> using python since it's what I'm used to but I realized it was going
> to be hard and some of you seem to love sed, awk etc. Which I have no
> idea how to use.
How about in Ruby? If your data is in the constant TEXT, you can parse
it into nice Directions structs:
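# One struct per route; field names match the output shown below.
Directions = Struct.new(:address, :distance, :time, :directions, :destination)
# Records are separated by a lone "." followed by the map-data credit line.
result = TEXT.split(/\s+\.\s+Map data ©2008 Tele Atlas\s+/m).collect do |record|
  lines = record.split("\n")
  # Second line looks like "2.0 mi (about 9 mins)": distance, then time in parentheses.
  distance, time = lines[1].match(/^(.+)\((.+)\)/).captures
  Directions.new(lines[0], distance, time, lines[2..-2], lines[-1])
end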
The results would be:
[#<struct Directions
address="618 Lessley Pl, Davis, CA 95616",
distance="2.0 mi ",
time="about 9 mins",
directions=
["1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
"2.\tTurn left at Lehigh Dr\t0.1 mi",
"3.\tTurn right at Colgate Dr\t0.2 mi",
"4.\tTurn left at L St\t0.2 mi",
"5.\tTurn right at 5th St\t0.6 mi",
"6.\tTurn left at B St\t427 ft",
"7.\tTurn right at 4th St\t367 ft",
"8.\tTurn left at University Ave\t499 ft",
"9.\tTurn right at 3rd St\t0.2 mi",
"10.\tTurn left at E Quad\t0.2 mi",
"11.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
destination="\t1 Shields Ave, Davis, CA 95616">,
#<struct Directions
address="618 Lessley Pl, Davis, CA 95616",
distance="2.0 mi ",
time="about 9 mins",
directions=
["1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
"2.\tTurn left at Lehigh Dr\t0.1 mi",
"3.\tTurn right at Colgate Dr\t0.2 mi",
"4.\tTurn left at L St\t0.2 mi",
"5.\tTurn right at 5th St\t0.6 mi",
"6.\tTurn left at B St\t427 ft",
"7.\tTurn right at 4th St\t367 ft",
"8.\tTurn left at University Ave\t499 ft",
"9.\tTurn right at 3rd St\t0.2 mi",
"10.\tTurn left at E Quad\t0.2 mi",
"11.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
destination="\t1 Shields Ave, Davis, CA 95616">,
#<struct Directions
address="1600 Amphitheatre Pkwy, Mountain View, CA 94043",
distance="105 mi ",
time="about 1 hour 54 mins",
directions=
["1.\tHead west on Amphitheatre Pkwy toward Garcia Ave\t0.5 mi",
"2.\tMerge onto US-101 N via the ramp to San Francisco\t4.7 mi",
"3.\tExit onto CA-114/Willow Rd toward Fremont/State Hwy 84 E\t1.0 mi",
"4.\tSlight right toward Bayfront Expy/CA-84\t495 ft",
"5.\tSlight right at Bayfront Expy/CA-84 Continue to follow CA-84\t8.4 mi",
"6.\tMerge onto I-880 N via the ramp to Oakland\t22 mi",
"7.\tSlight right at I-980 E (signs for I-980/State Hwy 24/Walnut Creek) 1.5 mi",
"8.\tTake the exit onto I-580 W toward San Francisco\t5.9 mi",
"9.\tContinue on I-80 E (signs for Vallejo/Sacramento) Partial toll road\t60 mi",
"10.\tTake exit 72 for Richards Blvd\t0.3 mi",
"11.\tTurn right at Richards Blvd (signs for Davis)\t0.3 mi",
"12.\tSlight right to stay on Richards Blvd\t0.2 mi",
"13.\tTurn left at 1st St\t0.3 mi",
"14.\tTurn right at A St\t0.2 mi",
"15.\tTurn left at 3rd St\t0.1 mi",
"16.\tTurn left at E Quad\t0.2 mi",
"17.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
destination="\t1 Shields Ave, Davis, CA 95616">]
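If you want to go one step further, the strings above still carry stray whitespace and tab-separated step fields. Here is a rough sketch of how you might tidy them up. The Step struct and the parse_step and clean helpers are names I made up for illustration, and it assumes every step line has the form number, tab, instruction, tab, length, as in most of the output above:

```ruby
# Hypothetical helper struct for a single driving step (not part of the code above).
Step = Struct.new(:number, :instruction, :length)

# Split "1.\tHead north ...\t335 ft" into its three tab-separated fields.
# Assumes the tab layout seen in the output; a line without a second tab
# ends up with a nil length.
def parse_step(line)
  number, instruction, length = line.split("\t", 3)
  Step.new(number.to_i, instruction, length)
end

# Trim the trailing space on distances like "2.0 mi " and the leading
# tab on destinations like "\t1 Shields Ave, Davis, CA 95616".
def clean(field)
  field.strip
end

steps = [
  "1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
  "2.\tTurn left at Lehigh Dr\t0.1 mi"
].map { |line| parse_step(line) }

steps.each { |s| puts "#{s.number}: #{s.instruction} (#{s.length})" }
puts clean("2.0 mi ")
puts clean("\t1 Shields Ave, Davis, CA 95616")
```

In the real thing you would map parse_step over each record's directions array instead of the two sample lines used here.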
--
Ken (Chanoch) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/
