Re: [OSM-talk] State of the NameFinder
Preliminary results of the British Museum Test: http://povesham.wordpress.com/2007/11/23/the-british-museum-test-for-public-mapping-websites/

Thanks for all this - very interesting results. I'll work my way through them and track down the reasons for the various errors. The duplication is to be expected - at the moment there is no code to prevent it - but I'll get some written given how significant a problem it seems to be; it was less obvious with road searches.

Another hint for priorities would be if something is in the current map, it should possibly score a bit higher than something further away (although anything in the current map area should score the same - you don't want to put a positive bias on The Midlands when searching while viewing the whole UK). This would have solved the Natural History Museum, Hemel Hempstead problem.

This is partially working, but currently turned off because it made debugging harder and I forgot to turn it back on before posting to the list. Ho, hum.

Postcode searching is weird. If I search for NW1 3AN, it seems to give me the result for NW1 3AR, which isn't the same place. If it doesn't know the correct postcode, it should fall back to the area - NW1 3, which is a better result because it's clear that it's not accurate and it's pointing to the whole area.

The system performs a weighted sum over all nearby postcodes, which seems to generate a fairly accurate result. In this case it is about 80 meters out, far better than a simple sector code search. However, it then uses its standard address generation code to present the result, which is just confusing and odd. I'll improve the output to try and present the error range better.

Many thanks,
--
Brian

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
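The weighted sum over nearby postcodes mentioned in this message could look roughly like the sketch below. It is only an illustration in Python, not the code behind the test system: the `KNOWN` table, its coordinates, and the choice to weight neighbours by the length of the shared prefix are all assumptions made up for the example. The `exact` flag is one possible way to carry the "this is an estimate" information through to the output.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sample data: postcode -> (lat, lon). The coordinates are invented.
KNOWN: Dict[str, Tuple[float, float]] = {
    "NW1 3AR": (51.5260, -0.1400),
    "NW1 3AD": (51.5270, -0.1410),
    "NW1 3BT": (51.5250, -0.1390),
    "NW1 4NR": (51.5300, -0.1500),
}


def shared_prefix(a: str, b: str) -> int:
    """Count the leading characters two postcodes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def estimate(
    postcode: str, known: Dict[str, Tuple[float, float]] = KNOWN
) -> Optional[Tuple[float, float, bool]]:
    """Return (lat, lon, exact) for a postcode, or None if nothing is nearby.

    An exact match is returned unchanged. Otherwise the position is a weighted
    average of the known postcodes in the same sector (e.g. "NW1 3"), weighted
    by shared prefix length (an assumed weighting, chosen for simplicity).
    """
    if postcode in known:
        lat, lon = known[postcode]
        return lat, lon, True
    sector = postcode[:-2]  # "NW1 3AN" -> "NW1 3"
    neighbours = {pc: pos for pc, pos in known.items() if pc.startswith(sector)}
    if not neighbours:
        return None
    total = lat_sum = lon_sum = 0.0
    for pc, (lat, lon) in neighbours.items():
        weight = float(shared_prefix(postcode, pc))
        total += weight
        lat_sum += weight * lat
        lon_sum += weight * lon
    return lat_sum / total, lon_sum / total, False
```

With this shape, a caller can present an estimated position differently from an exact one, for example by labelling it with the sector ("NW1 3") instead of a full street address.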
Re: [OSM-talk] State of the NameFinder
On Thu, Aug 27, 2009 at 12:04 AM, Brian Quinion openstreet...@brian.quinion.co.uk wrote:

The one I've been working on (suggestions for a name for the project gratefully received off list BTW) is mostly functional. I was expecting to open it for testing on the geocoding list some time next week when it has finished indexing the most recent planet import; however, given the timing of this I've started an import of just a UK extract (which will take 3 to 4 hours to run assuming it all works first time) so people can have a quick preview. I'll post a URL to the list when it is complete.

As promised, you can try a UK test system here: http://katie.openstreetmap.org/~twain/

And a couple of sample queries:
http://katie.openstreetmap.org/~twain/?q=london
http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london

If you want to know how the address was created click the 'details' link at the end of the search result. Some of the values are my debug info but it will also provide links to the osm node/way/relation.

Please be aware that this extract is about 4 weeks old and there have been quite a few improvements to the UK county data since then. Please email me bug reports off list, but be aware that I'm going to be away from my email for a lot of the weekend and that there are still known issues - so don't be that surprised if you break it.

Wow, this looks and works much better than the current OSM search! Any info on when it will be available on the main OSM page for the whole world? The current search just sucks for me ;(

--
follow me on twitter - www.twitter.com/valentt
http://kernelreloaded.blog385.com/
linux, blog, anime, spirituality, windsurf, wireless
registered as user #367004 with the Linux Counter, http://counter.li.org.
ICQ: 2125241, Skype: valent.turkovic, msn: valent.turko...@hotmail.com
Re: [OSM-talk] State of the NameFinder
Brian Quinion wrote:

As promised, you can try a UK test system here: http://katie.openstreetmap.org/~twain/

And a couple of sample queries:
http://katie.openstreetmap.org/~twain/?q=london
http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london

If you want to know how the address was created click the 'details' link at the end of the search result. Some of the values are my debug info but it will also provide links to the osm node/way/relation.

Please be aware that this extract is about 4 weeks old and there have been quite a few improvements to the UK county data since then. Please email me bug reports off list, but be aware that I'm going to be away from my email for a lot of the weekend and that there are still known issues - so don't be that surprised if you break it.

Preliminary results of the British Museum Test: http://povesham.wordpress.com/2007/11/23/the-british-museum-test-for-public-mapping-websites/ (also on wiki http://wiki.openstreetmap.org/wiki/British_Museum_Test)

* Tate Modern - Correct, but listed 2 nearly identical results for Attraction and Arts centre.
* British Museum - First result is a disused station that's not even visible on the standard map. Second result correct.
* National Gallery - No results found
* Natural History Museum - First result is Hemel Hempstead, Hertfordshire. 2nd, 3rd and 4th results correct. Natural History Museum, London produced just the 3 correct results.
* British Airways London Eye - No results found
* London Eye - First result good. Second result nearby bus stop. Third result common. Not sure what the common means, but otherwise good
* Science Museum - Correct, but listed 2 nearly identical results for Museum and Public Building
* The Victoria & Albert Museum - Correct, but listed 2 nearly identical results for Museum;attraction and Public Building
* V&A Museum - Identical results to above (good)
* The Tower of London - Correct, but listed 3 nearly identical results for Castle, Attraction and Public Building
* St Paul’s Cathedral - 4 results. First one Attraction, which is acceptable. Second one bus stop, Fourth one Place of Worship. 3rd one = Place of Worship in Dundee.
* National Portrait Gallery - No results found

So overall, I'd say it's very good, but it could use some sort of clumping of multiple results of the same thing and needs improvements in prioritising results - Museums are more important than bus stops and disused stations. Also it needs typo correction.

Another hint for priorities would be if something is in the current map, it should possibly score a bit higher than something further away (although anything in the current map area should score the same - you don't want to put a positive bias on The Midlands when searching while viewing the whole UK). This would have solved the Natural History Museum, Hemel Hempstead problem.

Postcode searching is weird. If I search for NW1 3AN, it seems to give me the result for NW1 3AR, which isn't the same place. If it doesn't know the correct postcode, it should fall back to the area - NW1 3, which is a better result because it's clear that it's not accurate and it's pointing to the whole area.

Robert (Jamie) Munro
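The clumping and prioritising suggested in this message can be pictured with the sketch below. It is a hypothetical Python illustration, not how the test system works: the `Result` fields, the `TYPE_RANK` table, the 100 m clumping radius and the size of the viewport bonus are all assumptions chosen for the example. The viewport bonus is deliberately flat, so that everything inside the current map scores the same and no part of the view is favoured.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Result:
    name: str
    kind: str
    lat: float
    lon: float


# Assumed importance per feature type (higher is more important).
TYPE_RANK = {
    "museum": 3,
    "attraction": 3,
    "public_building": 2,
    "bus_stop": 1,
    "disused_station": 0,
}

BBox = Tuple[float, float, float, float]  # (min_lat, min_lon, max_lat, max_lon)


def distance_m(a: Result, b: Result) -> float:
    """Approximate distance in metres (equirectangular approximation)."""
    mean_lat = math.radians((a.lat + b.lat) / 2)
    dx = math.radians(b.lon - a.lon) * math.cos(mean_lat)
    dy = math.radians(b.lat - a.lat)
    return 6371000 * math.hypot(dx, dy)


def in_view(r: Result, view: Optional[BBox]) -> bool:
    """True if the result lies inside the current map view."""
    if view is None:
        return False
    return view[0] <= r.lat <= view[2] and view[1] <= r.lon <= view[3]


def score(r: Result, view: Optional[BBox]) -> int:
    """Type rank plus a flat bonus for anything inside the viewport."""
    return TYPE_RANK.get(r.kind, 1) + (10 if in_view(r, view) else 0)


def clump_and_rank(
    results: List[Result], view: Optional[BBox] = None, radius_m: float = 100.0
) -> List[Result]:
    """Sort by score, then drop same-named results close to a kept one."""
    ranked = sorted(results, key=lambda r: score(r, view), reverse=True)
    kept: List[Result] = []
    for r in ranked:
        duplicate = any(
            k.name.lower() == r.name.lower() and distance_m(k, r) <= radius_m
            for k in kept
        )
        if not duplicate:
            kept.append(r)
    return kept
```

Under these assumptions, a search for "Natural History Museum" while viewing London would list the London museum once, ahead of the Hemel Hempstead entry.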
Re: [OSM-talk] State of the NameFinder
On Wed, Aug 26, 2009 at 11:58 AM, David Earl da...@frankieandshadow.com wrote:

Beyond getting the index updated using the existing technology, my next step is to try using postgres instead of mysql to (hopefully) increase the search speed (I use self-joins a lot and these should be faster in postgres; there may also be scope beyond that for replacing my low level word search algorithm with postgres' flexible free text searching but still retaining the multiple variations the system copes with at present).

This might be a silly question but has someone looked into using dedicated search engines like Lucene for OSM data? That's what Wikimedia moved to after hacking search with relational databases proved too slow, but I'm not familiar with how well it could handle searching through geodata.
Re: [OSM-talk] State of the NameFinder
Robert (Jamie) Munro rjmunro at arjam.net writes:

http://katie.openstreetmap.org/~twain/

Preliminary results of the British Museum Test:

* National Gallery - No results found

This is because it's tagged as 'National Gallery / National Portrait Gallery'. If they share the same physical building, there isn't an obvious way to tag it with two separate names, so putting a slash may be the least bad option. Semicolon is also used, as in 'Palace of Westminster; Houses of Parliament; House of Commons; House of Lords'. And a semicolon is used by some mapping tools to join together names when two objects are merged. So I suggest the namefinder should split names at slash and at semicolon, and search each part separately.

* British Airways London Eye - No results found

I've added alt_name tags to this object, so after a database update it should work.

* London Eye - First result good. Second result nearby bus stop. Third result common. Not sure what the common means, but otherwise good

Might be caused by the map having both an area and a node for the attraction. I have tidied this to just the area.

* The Victoria & Albert Museum - Correct, but listed 2 nearly identical results for Museum;attraction and Public Building

This is semicolons getting into the tagging again; I've changed it to just 'museum' since that is more specific than 'attraction'. The two separate results look like a namefinder bug (you had other examples).

* St Paul’s Cathedral - 4 results. First one Attraction, which is acceptable. Second one bus stop, Fourth one Place of Worship. 3rd one = Place of Worship in Dundee.

There used to be separate tagging for the building and a node in the centre of the building, which could explain one of the duplicate results. I've tidied that up.

* National Portrait Gallery - No results found

See discussion of slashes above.

--
Ed Avis e...@waniasset.com
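The suggestion to split names at slash and semicolon could be sketched as follows. This is a minimal Python illustration, not part of the namefinder; the function name `name_variants` is made up for the example. It keeps the full name as the first variant so that an exact search for the combined string still matches.

```python
import re
from typing import List

# A slash or semicolon, with any surrounding whitespace, separates name parts.
_SEPARATORS = re.compile(r"\s*[/;]\s*")


def name_variants(name: str) -> List[str]:
    """Return the full name followed by each distinct part of it."""
    full = name.strip()
    variants = [full]
    for part in _SEPARATORS.split(full):
        if part and part not in variants:
            variants.append(part)
    return variants


print(name_variants("National Gallery / National Portrait Gallery"))
```

One caveat with this approach: a name that legitimately contains a slash would also be split, so the full name is worth indexing alongside the parts, as done here.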
Re: [OSM-talk] State of the NameFinder
On 27/08/09 14:26, Ævar Arnfjörð Bjarmason wrote:

On Wed, Aug 26, 2009 at 11:58 AM, David Earl da...@frankieandshadow.com wrote:

Beyond getting the index updated using the existing technology, my next step is to try using postgres instead of mysql to (hopefully) increase the search speed (I use self-joins a lot and these should be faster in postgres; there may also be scope beyond that for replacing my low level word search algorithm with postgres' flexible free text searching but still retaining the multiple variations the system copes with at present).

This might be a silly question but has someone looked into using dedicated search engines like Lucene for OSM data?

Brian's stuff is using the full text search support in postgres, which is effectively a dedicated full text search engine. The advantage of using that over something like Lucene is that you can combine geographic restrictions (using postgis) with text searches in the same query.

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
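To make the point about combining both kinds of restriction in one query concrete, here is a small Python sketch that only builds a parameterised SQL string; it does not connect to a database. The table `places` and its columns `osm_id`, `name` and `geom` are hypothetical, and the `%s` placeholders assume a driver that uses that style. The SQL itself relies on standard PostgreSQL full text search (`to_tsvector`, `plainto_tsquery`, `@@`) and on PostGIS (`ST_MakeEnvelope` and the `&&` bounding-box operator). It is not taken from Brian's code.

```python
from typing import Tuple


def build_search_query(
    text: str, bbox: Tuple[float, float, float, float]
) -> Tuple[str, tuple]:
    """Build one query that filters by text and by area at the same time.

    bbox is (min_lon, min_lat, max_lon, max_lat) in WGS84 (SRID 4326).
    """
    sql = (
        "SELECT osm_id, name "
        "FROM places "  # hypothetical table name
        "WHERE to_tsvector('simple', name) @@ plainto_tsquery('simple', %s) "
        "AND geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326) "
        "LIMIT 10"
    )
    return sql, (text, *bbox)


sql, params = build_search_query("pub", (-0.2, 51.4, 0.0, 51.6))
print(sql)
print(params)
```

With a separate search engine, the text match and the bounding-box filter would typically run in two systems and be joined afterwards; here the database planner sees both conditions together.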
[OSM-talk] State of the NameFinder
Hi!

Congrats, everyone, on the amount of input OSM has received this year. Volunteers join us every day and changesets grow in detail and quality. However, there are two setbacks: the mapnik render and NameFinder. While mapnik is slowly gaining additional rendering for POIs and so on, people can't find anything newer than January in NameFinder. I also think it is time to redesign it and make the default version better looking. Yes, I know, commercial vendors are here to sell services, but still, as a geek, I contribute to OSM and want to use the OSM slippy map.

Anyway, so far I have heard about two efforts to get NameFinder running again. The first is just a performance improvement for the old one (done by David Earl) and the second is a completely new effort (by Twain). How far along are both projects? Is there anything someone with moderate Linux/sysadmin/Python/Postgresql/whatever knowledge can help with?

Cheers,
Peter.
Re: [OSM-talk] State of the NameFinder
On 26/08/2009 12:44, Peteris Krisjanis wrote:

Anyway, so far I have heard about two efforts to get NameFinder running again. The first is just a performance improvement for the old one (done by David Earl)

I have been trying to rebuild the index, but the first two attempts failed (well, the second worked but the changes meant it took an impracticably long time(*)). I was going to start a third attempt today (it's best to start on a Wednesday with a fresh planet file).

Beyond getting the index updated using the existing technology, my next step is to try using postgres instead of mysql to (hopefully) increase the search speed (I use self-joins a lot and these should be faster in postgres; there may also be scope beyond that for replacing my low level word search algorithm with postgres' flexible free text searching but still retaining the multiple variations the system copes with at present).

While the current index is from January, that's from daily updates. The last time the index was completely reloaded was February 2008, so the amount of data has increased enormously since then, and that means everything takes a lot longer.

David

(*) I was trying InnoDB tables, but the overhead on these was tremendous, and the load process was taking an order of magnitude longer.
Re: [OSM-talk] State of the NameFinder
2009/8/26 Jonas Krückel o...@jonas-krueckel.de:

Nestoria (Ed Freyfogle) also offered help for a new search/namefinder at SOTM. And Geocommons made their geocoding service open source. So maybe we should start a kind of working group that looks at all the offers and possibilities and then gets one running. Maybe this could also be one topic for a hacking session on wherecampeu. I would join a working group, who else is interested?

Me :) I have some good experience with server systems and it would be nice to have such a challenge to deal with. And oh, I would like to push OSM to the next level, where all our contributed data is easily found with a simple search string.

Cheers,
Peter.
Re: [OSM-talk] State of the NameFinder
On 26/08/2009 13:19, Peteris Krisjanis wrote:

2009/8/26 Jonas Krückel o...@jonas-krueckel.de:

Nestoria (Ed Freyfogle) also offered help for a new search/namefinder at SOTM. And Geocommons made their geocoding service open source. So maybe we should start a kind of working group that looks at all the offers and possibilities and then gets one running. Maybe this could also be one topic for a hacking session on wherecampeu. I would join a working group, who else is interested?

Me :) I have some good experience with server systems and it would be nice to have such a challenge to deal with. And oh, I would like to push OSM to the next level, where all our contributed data is easily found with a simple search string.

We set up a geocoding mailing list just after SOTM. We haven't had much traffic on it so far. Please feel free to join: http://lists.openstreetmap.org/listinfo/geocoding

David
Re: [OSM-talk] State of the NameFinder
running again. The first is just a performance improvement for the old one (done by David Earl) and the second is a completely new effort (by Twain).

The one I've been working on (suggestions for a name for the project gratefully received off list BTW) is mostly functional. I was expecting to open it for testing on the geocoding list some time next week when it has finished indexing the most recent planet import; however, given the timing of this I've started an import of just a UK extract (which will take 3 to 4 hours to run assuming it all works first time) so people can have a quick preview. I'll post a URL to the list when it is complete.

The main problem with the project, and the reason it has been so slow, is the sheer size of the data. A complete test cycle for the whole planet data takes around a week assuming that nothing goes wrong, and working on less than a country-sized area is pointless because you don't get a true indication of performance.

There is still considerable work to be done - the system doesn't yet support diff updates, the code is very messy and in need of considerable cleaning up, and there are a few known bugs with long strings running out of memory. I also want to move a lot more of the core search code into the database to make it less dependent on PHP. If people are keen it is possible that some of this work could be shared with other people.

Cheers,
--
Brian
Re: [OSM-talk] State of the NameFinder
On 26/08/09 13:12, Jonas Krückel wrote:

Nestoria (Ed Freyfogle) also offered help for a new search/namefinder at SOTM. And Geocommons made their geocoding service open source. So maybe we should start a kind of working group that looks at all the offers and possibilities and then gets one running.

Umm... Like the geocoding mailing list you mean? The one that was created after discussion with the interested parties at SOTM...

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
Re: [OSM-talk] State of the NameFinder
On 26.08.2009 at 14:49, Tom Hughes t...@compton.nu wrote:

On 26/08/09 13:12, Jonas Krückel wrote:

Nestoria (Ed Freyfogle) also offered help for a new search/namefinder at SOTM. And Geocommons made their geocoding service open source. So maybe we should start a kind of working group that looks at all the offers and possibilities and then gets one running.

Umm... Like the geocoding mailing list you mean? The one that was created after discussion with the interested parties at SOTM...

Tom

Yep, sorry, I don't know why I missed that list; David already gave me a hint just now. I will subscribe asap.

Jonas
Re: [OSM-talk] State of the NameFinder
Hi,

2009/8/26 Peteris Krisjanis pec...@gmail.com:

Anyway, so far I have heard about two efforts to get NameFinder running again. The first is just a performance improvement for the old one (done by David Earl) and the second is a completely new effort (by Twain).

I started working on a different search engine for OSM but haven't had time to make it really work yet. I'll try to do it in the next week or two. The idea differs from the NameFinder in various aspects, and it can search all data in the database, not only names. In some aspects it's dumber than NameFinder, but it stresses being usable for non-Latin alphabet searches, mixed alphabets, and nice presentation of the results.

It indexes the planet file, which is a rather heavy task for my desktop, but I think it would still be able to cope if the size of data in OSM grew to about 10 times what it is now. The idea was also for it to be very cheap computationally, so that a search with N search terms takes exactly N reads of 1 sector from my hard disk, and so N moves of the disk's head; I've not fully achieved this. (This is all how it was supposed to work, not how it will end up working.) :)

Cheers
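The "N search terms cost N single-sector reads" idea can be illustrated with a hashed, fixed-size bucket layout. The Python sketch below is an assumption about one way to get that property, not a description of the poster's engine: the 512-byte sector size, the CRC32 fingerprint, the entry layout, the use of object id 0 as an empty marker and the choice to drop overflowing entries are all made up for the example. An in-memory `io.BytesIO` stands in for the index file on disk.

```python
import io
import struct
import zlib
from typing import Dict, Iterable, List, Set

SECTOR = 512                        # bytes fetched per lookup (assumed sector size)
ENTRY = struct.Struct("<IQ")        # (term fingerprint, object id) = 12 bytes
PER_SECTOR = SECTOR // ENTRY.size   # 42 entries fit in one sector


def _fingerprint(term: str) -> int:
    """Deterministic 32-bit fingerprint; works for any alphabet via UTF-8."""
    return zlib.crc32(term.lower().encode("utf-8"))


def build_index(postings: Dict[str, List[int]], n_sectors: int = 64) -> io.BytesIO:
    """Write one fixed-size bucket per sector. Object ids must be > 0."""
    buckets: List[list] = [[] for _ in range(n_sectors)]
    for term, ids in postings.items():
        fp = _fingerprint(term)
        bucket = buckets[fp % n_sectors]
        for oid in ids:
            if len(bucket) < PER_SECTOR:  # overflow is simply dropped in this sketch
                bucket.append((fp, oid))
    index = io.BytesIO()
    for bucket in buckets:
        data = b"".join(ENTRY.pack(fp, oid) for fp, oid in bucket)
        index.write(data.ljust(SECTOR, b"\0"))  # pad every bucket to a full sector
    return index


def lookup(index: io.BytesIO, term: str, n_sectors: int = 64) -> Set[int]:
    """Exactly one seek and one SECTOR-sized read per term."""
    fp = _fingerprint(term)
    index.seek((fp % n_sectors) * SECTOR)
    data = index.read(SECTOR)
    ids: Set[int] = set()
    for offset in range(0, PER_SECTOR * ENTRY.size, ENTRY.size):
        entry_fp, oid = ENTRY.unpack_from(data, offset)
        if entry_fp == fp and oid != 0:  # oid 0 marks zero padding
            ids.add(oid)
    return ids


def search(index: io.BytesIO, terms: Iterable[str], n_sectors: int = 64) -> Set[int]:
    """Intersect the per-term results: N terms cost N sector reads."""
    sets = [lookup(index, term, n_sectors) for term in terms]
    return set.intersection(*sets) if sets else set()
```

The cost of this layout is that a bucket can fill up, which is one plausible reason a strict one-read-per-term goal is hard to meet in full on real data.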
Re: [OSM-talk] State of the NameFinder
Hi,

Jonas Krückel wrote:

Yep, sorry, I don't know why I missed that list; David already gave me a hint just now. I will subscribe asap.

I'm not on that list either (yet) so let me continue abusing talk: There's also an OpenSearch Suggestion Service by Wolfram Schneider wsc...@googlemail.com (done with OSM data for the bbbike project) here: http://bbbike.elsif.de/streets.html

He says that it can easily be run for whole countries/continents. It is still somewhat beta (if I understood correctly, the largest problem is that it can currently only handle one same-named street per region but he's working on that). He says his software is open source in principle and he'll share the status quo if anyone is interested, otherwise he'll bring it into a better shape before releasing it.

Bye
Frederik
Re: [OSM-talk] State of the NameFinder
The one I've been working on (suggestions for a name for the project gratefully received off list BTW) is mostly functional. I was expecting to open it for testing on the geocoding list some time next week when it has finished indexing the most recent planet import; however, given the timing of this I've started an import of just a UK extract (which will take 3 to 4 hours to run assuming it all works first time) so people can have a quick preview. I'll post a URL to the list when it is complete.

As promised, you can try a UK test system here: http://katie.openstreetmap.org/~twain/

And a couple of sample queries:
http://katie.openstreetmap.org/~twain/?q=london
http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london

If you want to know how the address was created click the 'details' link at the end of the search result. Some of the values are my debug info but it will also provide links to the osm node/way/relation.

Please be aware that this extract is about 4 weeks old and there have been quite a few improvements to the UK county data since then. Please email me bug reports off list, but be aware that I'm going to be away from my email for a lot of the weekend and that there are still known issues - so don't be that surprised if you break it.

--
Brian