> Hamish writes:
> > Hi, re. r.in.wms XML parsing code for layers with spaces in the
> > name given some text like this:
> >
> > DATA="<Name>Foo Bar Baz</Name>"
> > echo "$DATA" | sed -e "s/<Name>\s*\(\w*\)/~\1~/g" \
> >                    -e "s/<\/Name>//g"
> >
> > you get ~Foo~ Bar Baz
> > instead of ~Foo Bar Baz~
> >
> > how to fix that regex?

Ivan:
> First of all, we expand `\w' into ``any letter or digit or the
> underscore character'' [1]:
>
> echo "$DATA" \
>     | sed -e "s/<Name>\s*\([[:alpha:][:digit:]_]*\)/~\1~/g" \
>           -e "s/<\/Name>//g"
> ## => ~Foo~ Bar Baz
>
> Then, we add `[:space:]' to the []-set:
>
> echo "$DATA" \
>     | sed -e "s/<Name>\s*\([[:alpha:][:digit:][:space:]_]*\)/~\1~/g" \
>           -e "s/<\/Name>//g"
> ## => ~Foo Bar Baz~

thanks.

> Finally, I'd recommend to use single quotes for the Sed program,
> since it has no Shell substitutions contained within:

right, good idea.

> [1] GNU Sed manual (for GNU Sed 4.1.5.)

I spent a little time on this site yesterday sharpening up my regex:
  http://www.regular-expressions.info/quickstart.html
[time spent learning regex is time well spent!]

and committed a fix:
  http://trac.osgeo.org/grass/changeset/30522

In the end I replaced it with "continue until you find an opening
angle bracket":
  [^<]*

I guess another way to do [[:alpha:][:digit:][:space:]_]* would be:
  [\w\d\s]*
?

I don't see much in the OGC's WMS spec about allowed chars, although
I didn't study it that closely.
  http://www.opengeospatial.org/standards/wms
But it does say the <Name> field is for computer-to-computer
communication while <Title> is the human-readable version, and it gives
a multiword example <Title> with an approximately six-letter uppercase
alphabetic code for <Name>. And that is exactly what the S-57 data
standard provides:
  http://www.s-57.com/

So in this case I consider that NOAA's ArcIMS server is abusing the
<Name> field, using it more like a <Title> than it should.

example:
SERVER="http://ocs-spatial.ncd.noaa.gov/wmsconnector/com.esri.wms.Esrimap/encdirect?"
r.in.wms -l mapserv="$SERVER"

LAYER: ~SUBMARINE_ON LAND PIPELINE_point(PISOL)~   # ~<Name>~
  --SUBMARINE_ON LAND PIPELINE_point               # --<Title>

The S-57 acronym PISOL is right there to use, but ............
(no, it doesn't work to just use PISOL as the <Name>)

Otherwise I'd worry about a literal <comment> in a <Name>, and how to
match until "</Name>" not just "<". But as it is I hope no one would be
so silly as to use < in a <Name> field.

For the explicit [[:alpha:][:digit:][:space:]_]* case I worry about
possible i18n/Unicode issues?


Hamish
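
PS: for reference, a minimal sketch of the two patterns side by side, run
against the NOAA-style layer name from the example above. This is only an
illustration of the "[^<]*" idea, not necessarily the exact code in the
changeset; the function names wrap_name and wrap_name_explicit are
hypothetical, and [[:space:]] stands in for \s to stay within POSIX sed.

```sh
#!/bin/sh
# Hypothetical helper: capture everything up to the next "<".
wrap_name() {
    sed -e 's/<Name>[[:space:]]*\([^<]*\)/~\1~/g' \
        -e 's/<\/Name>//g'
}

# Hypothetical helper: the explicit character set from the quoted reply.
wrap_name_explicit() {
    sed -e 's/<Name>[[:space:]]*\([[:alpha:][:digit:][:space:]_]*\)/~\1~/g' \
        -e 's/<\/Name>//g'
}

DATA='<Name>SUBMARINE_ON LAND PIPELINE_point(PISOL)</Name>'

echo "$DATA" | wrap_name
## => ~SUBMARINE_ON LAND PIPELINE_point(PISOL)~

echo "$DATA" | wrap_name_explicit
## => ~SUBMARINE_ON LAND PIPELINE_point~(PISOL)
```

The explicit set stops at the "(" because parentheses are not in it, while
"[^<]*" takes anything short of the closing tag. As far as I know, POSIX
bracket expressions treat a backslash literally, so [\w\d\s]* is unlikely
to act as the Perl-style shorthand in sed.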
_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
