Re: [OSM-talk] State of the NameFinder

2009-08-31 Thread Brian Quinion
 Preliminary results of the British Museum Test:
 http://povesham.wordpress.com/2007/11/23/the-british-museum-test-for-public-mapping-websites/

Thanks for all this - very interesting results.  I'll work my way
through them and track down the reasons for the various errors.  The
duplication is to be expected - at the moment there is no code to
prevent it - but I'll get some written given how significant a problem
it seems to be. It was less obvious with road searches.

 Another hint for priorities would be if something is in the current map,
 it should possibly score a bit higher than something further away
 (although anything in the current map area should score the same - you
 don't want to put a positive bias on The Midlands when searching while
 viewing the whole UK). This would have solved the Natural History
 Museum, Hemel Hempstead problem.

This is partially working, but currently turned off because it made
debugging harder and I forgot to turn it back on before posting to the
list. Ho, hum.

 Postcode searching is weird. If I search for NW1 3AN, it seems to give
 me the result for NW1 3AR, which isn't the same place. If it doesn't
 know the correct postcode, it should fall back to the area - NW1 3 -
 which is a better result because it's clear that it's not accurate and
 it's pointing to the whole area.

The system performs a weighted sum over all nearby postcodes which
seems to generate a fairly accurate result. In this case it is about 80
meters out, far better than a simple sector code search.  However it
then uses its standard address generation code to present the result
which is just confusing and odd.  I'll improve the output to try and
present the error range better.
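
A minimal sketch of the weighted-sum idea, for illustration only. The thread
does not say how the weights are chosen, so the weighting below (2**k, where k
is the number of leading characters shared with the query) is an assumption,
as are the function names and the made-up coordinates in the usage note. It is
not the code running on the test server.

```python
def shared_prefix_len(a, b):
    """Number of leading characters two strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def estimate_postcode(query, known):
    """Weighted centroid of the known postcodes in the same sector as `query`.

    `known` maps postcode -> (lat, lon). The weight 2**k, with k the shared
    prefix length, is an illustrative assumption, not the real weighting.
    """
    q = query.replace(" ", "").upper()
    sector_len = len(q) - 2              # "NW13AN" -> sector "NW13"
    total = lat_sum = lon_sum = 0.0
    for code, (lat, lon) in known.items():
        k = shared_prefix_len(q, code.replace(" ", "").upper())
        if k < sector_len:
            continue                     # different sector: not "nearby"
        weight = 2.0 ** k
        total += weight
        lat_sum += weight * lat
        lon_sum += weight * lon
    if total == 0:
        return None                      # nothing known in this sector
    return lat_sum / total, lon_sum / total
```

Called with a table of known postcodes, estimate_postcode("NW1 3AN", known)
returns a point pulled towards the closest-matching codes, or None when no
postcode in the same sector is known.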

Many thanks,
--
 Brian

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] State of the NameFinder

2009-08-27 Thread Valent Turkovic
On Thu, Aug 27, 2009 at 12:04 AM, Brian Quinion
<openstreet...@brian.quinion.co.uk> wrote:
 The one I've been working on (suggestions for a name for the project
 gratefully received off list BTW) is mostly functional.  I was
 expecting to open it for testing on the geocoding list some time next
 week when it has finished indexing the most recent planet import
 however given the timing of this I've started an import of just a uk
 extract (which will take 3 to 4 hours to run assuming it all works
 first time) so people can have a quick preview.  I'll post a URL to
 the list when it is complete.

 As promised, you can try a uk test system here:

 http://katie.openstreetmap.org/~twain/

 And a couple of sample queries:

 http://katie.openstreetmap.org/~twain/?q=london
 http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
 http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london

 If you want to know how the address was created click the 'details'
 link at the end of the search result.  Some of the values are my debug
 info but it will also provide links to the osm node/way/relation.
 Please be aware that this extract is about 4 weeks old and there have
 been quite a bit of improvements to the UK county data since then.

 Please email me bug reports off list, but be aware that I'm going to
 be away from my email for a lot of the weekend and that there are
 still known issues - so don't be that surprised if you break it.

Wow, this looks and works much better than the current OSM search! Any
info on when it will be available on the main OSM page for the whole world?
The current search just sucks for me ;(

-- 
follow me on Twitter - www.twitter.com/valentt
http://kernelreloaded.blog385.com/
linux, blog, anime, spirituality, windsurf, wireless
registered as user #367004 with the Linux Counter, http://counter.li.org.
ICQ: 2125241, Skype: valent.turkovic, msn: valent.turko...@hotmail.com



Re: [OSM-talk] State of the NameFinder

2009-08-27 Thread Robert (Jamie) Munro

Brian Quinion wrote:
 As promised, you can try a uk test system here:
 
 http://katie.openstreetmap.org/~twain/
 
 And a couple of sample queries:
 
 http://katie.openstreetmap.org/~twain/?q=london
 http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
 http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london
 
 If you want to know how the address was created click the 'details'
 link at the end of the search result.  Some of the values are my debug
 info but it will also provide links to the osm node/way/relation.
 Please be aware that this extract is about 4 weeks old and there have
 been quite a bit of improvements to the UK county data since then.
 
 Please email me bug reports off list, but be aware that I'm going to
 be away from my email for a lot of the weekend and that there are
 still known issues - so don't be that surprised if you break it.

Preliminary results of the British Museum Test:
http://povesham.wordpress.com/2007/11/23/the-british-museum-test-for-public-mapping-websites/

(also on wiki http://wiki.openstreetmap.org/wiki/British_Museum_Test)

* Tate Modern
  - Correct, but listed 2 nearly identical results for Attraction and
Arts centre.
* British Museum
  - First result is disused station that's not even visible on the
standard map. Second result correct.
* National Gallery
  - No results found
* Natural History Museum
  - First result is Hemel Hempstead, Hertfordshire. 2nd, 3rd and 4th
results correct. Natural History Museum, London produced just the 3
correct results.
* British Airways London Eye
  - No results found
* London Eye
  - First result good. Second result nearby bus stop. Third result
common. Not sure what the common means, but otherwise good
* Science Museum
  - Correct, but listed 2 nearly identical results for Museum and
Public Building
* The Victoria & Albert Museum
  - Correct, but listed 2 nearly identical results for
Museum;attraction and Public Building
* V&A Museum
  - Identical results to above (good)
* The Tower of London
  - Correct, but listed 3 nearly identical results for Castle,
Attraction and Public Building
* St Paul’s Cathedral
  - 4 results. First one Attraction, which is acceptable. Second one
bus stop, Fourth one Place of Worship. 3rd one = Place of Worship
in Dundee.
* National Portrait Gallery
  - No results found

So overall, I'd say it's very good, but it could use some sort of clumping
of multiple results for the same thing and needs improvements in
prioritising results - Museums are more important than bus stops and
disused stations. Also it needs typo correction.
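
One way the clumping could look, as a rough sketch: merge results that share a
name and lie within a small distance of each other, and keep the list of types
on the merged entry. The result fields (name, type, lat, lon), the function
name and the 0.002 degree threshold are all assumptions made for illustration.

```python
def clump(results, max_deg=0.002):
    """Merge results with the same name that lie close together.

    Each result is a dict with "name", "type", "lat" and "lon".
    `max_deg` is a crude closeness threshold in degrees (an assumption).
    """
    clusters = []
    for r in results:
        for c in clusters:
            if (c["name"].lower() == r["name"].lower()
                    and abs(c["lat"] - r["lat"]) <= max_deg
                    and abs(c["lon"] - r["lon"]) <= max_deg):
                c["types"].append(r["type"])   # same place, another tag
                break
        else:
            # No nearby cluster with this name: start a new one.
            clusters.append({"name": r["name"], "lat": r["lat"],
                             "lon": r["lon"], "types": [r["type"]]})
    return clusters
```

With this, the Attraction and Arts centre entries for Tate Modern would
collapse into one result listing both types, while a same-named place in
another city stays separate.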

Another hint for priorities would be if something is in the current map,
it should possibly score a bit higher than something further away
(although anything in the current map area should score the same - you
don't want to put a positive bias on The Midlands when searching while
viewing the whole UK). This would have solved the Natural History
Museum, Hemel Hempstead problem.
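
A small sketch of that scoring rule: every result inside the current map view
gets the same flat bonus, so nothing is favoured merely for being near the
centre of a large view. The field names, the bonus value and the bounding-box
order (south, west, north, east) are assumptions for illustration.

```python
def in_view(lat, lon, bbox):
    """True if the point lies inside bbox = (south, west, north, east)."""
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east


def rank(results, bbox, bonus=0.5):
    """Sort results by base score plus a flat bonus for being in view."""
    def score(r):
        boost = bonus if in_view(r["lat"], r["lon"], bbox) else 0.0
        return r["score"] + boost
    return sorted(results, key=score, reverse=True)
```

When the view covers only London, a slightly lower-scoring London result
overtakes one outside the view; when the view covers the whole UK, both get
the bonus and the base order is unchanged.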

Postcode searching is weird. If I search for NW1 3AN, it seems to give
me the result for NW1 3AR, which isn't the same place. If it doesn't
know the correct postcode, it should fall back to the area - NW1 3 -
which is a better result because it's clear that it's not accurate and
it's pointing to the whole area.
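
The suggested fallback could be sketched like this. It relies on the fact that
the inward part of a UK postcode is always three characters, so the sector is
the full postcode minus its last two letters. The lookup tables and the
returned fields are hypothetical.

```python
def lookup_postcode(query, units, sectors):
    """Look up a full postcode, falling back to its sector if unknown.

    `units` maps full postcodes ("NW1 3AR") and `sectors` maps sectors
    ("NW1 3") to a location; both tables are hypothetical.
    """
    compact = "".join(query.upper().split())
    if len(compact) < 5:
        return None                       # too short to be a full postcode
    unit = compact[:-3] + " " + compact[-3:]
    if unit in units:
        return {"match": "unit", "label": unit, "location": units[unit]}
    sector = unit[:-2]                    # "NW1 3AN" -> "NW1 3"
    if sector in sectors:
        # Label the result as the sector so the lower accuracy is visible.
        return {"match": "sector", "label": sector, "location": sectors[sector]}
    return None
```

Labelling the fallback result as "NW1 3" rather than as a neighbouring full
postcode makes it clear to the user that the match is approximate.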

Robert (Jamie) Munro



Re: [OSM-talk] State of the NameFinder

2009-08-27 Thread Ævar Arnfjörð Bjarmason
On Wed, Aug 26, 2009 at 11:58 AM, David Earl <da...@frankieandshadow.com> wrote:
 Beyond getting the index updated using the existing technology, my next
 step is to try using postgres instead of mysql to (hopefully) increase
 the search speed (I use self-joins a lot and these should be faster in
 postgres; there may also be scope beyond that for replacing my low level
 word search algorithm with postgres' flexible free text searching but
 still retaining the multiple variations the system copes with at present).

This might be a silly question but has someone looked into using
dedicated search engines like lucene for OSM data?

That's what Wikimedia moved to after hacking search with relational
databases proved too slow, but I'm not familiar with how well it could
handle searching through geodata.



Re: [OSM-talk] State of the NameFinder

2009-08-27 Thread Ed Avis
Robert (Jamie) Munro rjmunro at arjam.net writes:

http://katie.openstreetmap.org/~twain/

Preliminary results of the British Museum Test:

* National Gallery
  - No results found

This is because it's tagged as 'National Gallery / National Portrait Gallery'.
If they share the same physical building, there isn't an obvious way to tag it
with two separate names, so putting a slash may be the least bad option.

Semicolon is also used, as in 'Palace of Westminster; Houses of Parliament;
House of Commons; House of Lords'.  And a semicolon is used by some mapping
tools to join together names when two objects are merged.

So I suggest the namefinder should split names at slash and at semicolon, and
search each part separately.
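
A short sketch of that splitting step, assuming the name tag is available as a
plain string; the function name is made up for illustration.

```python
import re


def name_variants(name):
    """Split an OSM name tag at slashes and semicolons into separate names."""
    parts = [p.strip() for p in re.split(r"[;/]", name)]
    return [p for p in parts if p]       # drop empty pieces
```

For 'National Gallery / National Portrait Gallery' this gives two names that
can each be indexed separately. A name that legitimately contains a slash
would also be split, so a real implementation would probably want to index
the unsplit name as well.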

* British Airways London Eye
  - No results found

I've added alt_name tags to this object, so after a database update it should
work.

* London Eye
  - First result good. Second result nearby bus stop. Third result
common. Not sure what the common means, but otherwise good

Might be caused by the map having both an area and a node for the attraction.
I have tidied this to just the area.

* The Victoria & Albert Museum
  - Correct, but listed 2 nearly identical results for
Museum;attraction and Public Building

This is semicolons getting into the tagging again; I've changed it to just
'museum' since that is more specific than 'attraction'.  The two separate
results look like a namefinder bug (you had other examples).

* St Paul’s Cathedral
  - 4 results. First one Attraction, which is acceptable. Second one
bus stop, Fourth one Place of Worship. 3rd one = Place of Worship
in Dundee.

There used to be separate tagging for the building and a node in the centre
of the building, which could explain one of the duplicate results.  I've
tidied that up.

* National Portrait Gallery
  - No results found

See discussion of slashes above.

-- 
Ed Avis e...@waniasset.com




Re: [OSM-talk] State of the NameFinder

2009-08-27 Thread Tom Hughes
On 27/08/09 14:26, Ævar Arnfjörð Bjarmason wrote:
 On Wed, Aug 26, 2009 at 11:58 AM, David Earl <da...@frankieandshadow.com> wrote:
 Beyond getting the index updated using the existing technology, my next
 step is to try using postgres instead of mysql to (hopefully) increase
 the search speed (I use self-joins a lot and these should be faster in
 postgres; there may also be scope beyond that for replacing my low level
 word search algorithm with postgres' flexible free text searching but
 still retaining the multiple variations the system copes with at present).

 This might be a silly question but has someone looked into using
 dedicated search engines like lucene for OSM data?

Brian's stuff is using the full text search support in postgres which is 
effectively a dedicated full text search engine. The advantage of using 
that over something like lucene is that you can combine geographic 
restrictions (using postgis) with text searches in the same query.
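
To illustrate what "in the same query" can look like, here is a sketch that
builds one SQL statement using PostgreSQL's to_tsvector/plainto_tsquery text
matching together with a PostGIS bounding-box test (the && operator and
ST_MakeEnvelope). The table "places" and its columns "name" and "geom" are
hypothetical and are not Brian's schema; the %s placeholders follow the usual
Python DB-API style.

```python
def build_search(text, bbox):
    """Return (sql, params) for a text search limited to a bounding box.

    bbox = (west, south, east, north) in WGS84 degrees. The table and
    column names are hypothetical.
    """
    west, south, east, north = bbox
    sql = (
        "SELECT name FROM places "
        "WHERE to_tsvector('simple', name) @@ plainto_tsquery('simple', %s) "
        "AND geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326)"
    )
    return sql, (text, west, south, east, north)
```

The pair returned can be handed to a database cursor's execute() call. With a
separate search engine, the text match and the geographic filter would
typically have to be done in two steps and then joined.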

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



[OSM-talk] State of the NameFinder

2009-08-26 Thread Peteris Krisjanis
Hi!

Congrats everyone on the amount of input OSM has received this year.
Volunteers join us every day and changesets grow in detail and
quality. However, there are two setbacks: the Mapnik render and
NameFinder. While Mapnik slowly gains additional rendering for POIs
and so on, people can't find anything newer than
January in NameFinder. I also think it is time to redesign it and make
the default version better looking. Yes, I know, commercial vendors are
here for selling services, but still, as a geek, I contribute to OSM
and want to use the OSM slippy map.

Anyway, so far I have heard about two efforts of getting NameFinder
running again. First one is just performance improvement for old one
(done by David Earl) and second one is completely new effort (by
Twain).

How far along are both projects? Is there anything someone with moderate
Linux/sysadmin/Python/PostgreSQL/whatever knowledge can help with?

Cheers,
Peter.



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread David Earl
On 26/08/2009 12:44, Peteris Krisjanis wrote:
 Anyway, so far I have heard about two efforts of getting NameFinder
 running again. First one is just performance improvement for old one
 (done by David Earl)

I have been trying to rebuild the index, but the first two attempts 
failed (well, the second worked but the changes meant it took an 
impracticably long time(*)).

I was going to start a third attempt today (it's best to start on a 
Wednesday with a fresh planet file).

Beyond getting the index updated using the existing technology, my next 
step is to try using postgres instead of mysql to (hopefully) increase 
the search speed (I use self-joins a lot and these should be faster in 
postgres; there may also be scope beyond that for replacing my low level 
word search algorithm with postgres' flexible free text searching but 
still retaining the multiple variations the system copes with at present).

While the current index is January, that's from daily updates. The last 
time the index was completely reloaded was February 2008, so the amount 
of data has increased enormously since then, and that means everything 
takes a lot longer.

David



(*) I was trying InnoDB tables, but the overhead on these was 
tremendous, and the load process was taking an order of magnitude longer.



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Peteris Krisjanis
2009/8/26 Jonas Krückel o...@jonas-krueckel.de:

 Nestoria (Ed Freyfogle) also offered help for a new search/namefinder on
 SOTM. And Geocommons made their geocoding service open source.
 So maybe we should start a kind of working group who looks at all the offers
 and possibilities and then get one running.

 Maybe this could also be one topic for a hacking session on wherecampeu.

 I would join a working group, who else is interested?


Me :) I have some good experience with server systems and it would be
nice to have such a challenge to deal with.

And oh, I would like to push OSM to the next level, where all our
contributed data is easily found with a simple search string.

Cheers,
Peter.



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread David Earl
On 26/08/2009 13:19, Peteris Krisjanis wrote:
 2009/8/26 Jonas Krückel o...@jonas-krueckel.de:
 Nestoria (Ed Freyfogle) also offered help for a new search/namefinder on
 SOTM. And Geocommons made their geocoding service open source.
 So maybe we should start a kind of working group who looks at all the offers
 and possibilities and then get one running.

 Maybe this could also be one topic for a hacking session on wherecampeu.

 I would join a working group, who else is interested?

 
 Me :) I have some good experience with server systems and it would be
 nice to have such a challenge to deal with.
 
 And ohhh, I would like to push OSM to next level, when all our
 contributed data are easily found by simple search string.


We set up a geocoding mailing list just after SOTM. We haven't had much 
traffic on it so far. Please feel free to join:

http://lists.openstreetmap.org/listinfo/geocoding


David



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Brian Quinion
 running again. First one is just performance improvement for old one
 (done by David Earl) and second one is completely new effort (by
 Twain).

The one I've been working on (suggestions for a name for the project
gratefully received off list BTW) is mostly functional.  I was
expecting to open it for testing on the geocoding list some time next
week when it has finished indexing the most recent planet import
however given the timing of this I've started an import of just a uk
extract (which will take 3 to 4 hours to run assuming it all works
first time) so people can have a quick preview.  I'll post a URL to
the list when it is complete.

The main problem with the project, and reason it has been so slow, is
the sheer size of the data.  A complete test cycle for the whole
planet data takes around a week assuming that nothing goes wrong and
working on less than a country sized area is pointless because you
don't get a true indication of performance.

There is still considerable work to be done - the system doesn't yet
support diff updates, the code is very messy and in need of
considerable cleaning up, and there are a few known bugs with long
strings running out of memory.  I also want to move a lot more of the
core search code into the database to make it less dependent on PHP.
If people are keen it is possible that some of this work could be
shared with other people.

Cheers,
--
 Brian



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Tom Hughes
On 26/08/09 13:12, Jonas Krückel wrote:

 Nestoria (Ed Freyfogle) also offered help for a new search/namefinder
 on SOTM. And Geocommons made their geocoding service open source.
 So maybe we should start a kind of working group who looks at all the
 offers and possibilities and then get one running.

Umm... Like the geocoding mailing list you mean? The one that was 
created after discussion with the interested parties at SOTM...

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Jonas Krückel


On 26.08.2009 at 14:49, Tom Hughes t...@compton.nu wrote:

 On 26/08/09 13:12, Jonas Krückel wrote:

 Nestoria (Ed Freyfogle) also offered help for a new search/namefinder
 on SOTM. And Geocommons made their geocoding service open source.
 So maybe we should start a kind of working group who looks at all the
 offers and possibilities and then get one running.

 Umm... Like the geocoding mailing list you mean? The one that was  
 created after discussion with the interested parties at SOTM...

 Tom

Yep, sorry, I don't know why I missed that list; David already gave me  
a hint now. I will subscribe asap.

Jonas




Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread andrzej zaborowski
Hi,

2009/8/26 Peteris Krisjanis pec...@gmail.com:
 Anyway, so far I have heard about two efforts of getting NameFinder
 running again. First one is just performance improvement for old one
 (done by David Earl) and second one is completely new effort (by
 Twain).

I started working on a different search engine for OSM but haven't had
time to make it really work yet.  I'll try to do it in the next week
or two.  The idea is a little different from the NameFinder in various
aspects and it can search all data in the database, not only names.
In some aspects it's dumber than NameFinder but it stresses being
usable for non-Latin alphabet searches, mixed alphabets, and nice
presentation of the results.  It indexes the planet file, which is a
rather heavy task for my desktop, but I think it would still be able to
cope if the size of data in OSM grew up to about 10 times.  The idea
was also for it to be very cheap computationally, so that a search
with N search terms takes exactly N reads of 1 sector from my
hard disk, i.e. N moves of the disk's head, but I've not fully achieved
this.

(this is all how it was supposed to work, not how it will end up working) :)

Cheers



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Frederik Ramm
Hi,

Jonas Krückel wrote:
 Yep, sorry, I don't know why I missed that list; David already gave me  
 a hint now. I will subscribe asap.

I'm not on that list either (yet) so let me continue abusing talk:

There's also an OpenSearch Suggestion Service by Wolfram Schneider 
wsc...@googlemail.com (done with OSM data for the bbbike project) here:

http://bbbike.elsif.de/streets.html

He says that it can easily be run for whole countries/continents. It is 
still somewhat beta (if I understood correctly, the largest problem is 
that it can currently only handle one same-named street per region but 
he's working on that). He says his software is open source in 
principle and he'll share the status quo if anyone is interested, 
otherwise he'll bring it into a better shape before releasing it.

Bye
Frederik



Re: [OSM-talk] State of the NameFinder

2009-08-26 Thread Brian Quinion
 The one I've been working on (suggestions for a name for the project
 gratefully received off list BTW) is mostly functional.  I was
 expecting to open it for testing on the geocoding list some time next
 week when it has finished indexing the most recent planet import
 however given the timing of this I've started an import of just a uk
 extract (which will take 3 to 4 hours to run assuming it all works
 first time) so people can have a quick preview.  I'll post a URL to
 the list when it is complete.

As promised, you can try a uk test system here:

http://katie.openstreetmap.org/~twain/

And a couple of sample queries:

http://katie.openstreetmap.org/~twain/?q=london
http://katie.openstreetmap.org/~twain/?q=91+upper+ground%2C+london
http://katie.openstreetmap.org/~twain/?q=pub+near+upper+ground%2C+london
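
For anyone scripting test queries, the URLs above can be produced with
standard form encoding. This sketch uses Python's urllib; the only thing taken
from the test system is the base URL and the q parameter shown above, and the
function name is made up.

```python
from urllib.parse import urlencode

BASE = "http://katie.openstreetmap.org/~twain/"


def search_url(query):
    """Build a search URL with the query form-encoded into the q parameter."""
    return BASE + "?" + urlencode({"q": query})
```

search_url("91 upper ground, london") yields the second sample URL: spaces
become "+" and the comma becomes "%2C".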

If you want to know how the address was created click the 'details'
link at the end of the search result.  Some of the values are my debug
info but it will also provide links to the osm node/way/relation.
Please be aware that this extract is about 4 weeks old and there have
been quite a few improvements to the UK county data since then.

Please email me bug reports off list, but be aware that I'm going to
be away from my email for a lot of the weekend and that there are
still known issues - so don't be that surprised if you break it.

--
 Brian
