Re: [OSM-dev] Nominatim and/or a fault-tolerant geocoder

2016-11-28 Thread Дмитрий Киселев
Hi Tom.

There is a list of OSM search engines:
http://wiki.openstreetmap.org/wiki/Search_engines
Most of them should have fuzzy search; some do it better, some worse.

If you are looking for something that will work for Russia, try:

1. openstreetmap.ru - they have their own geocoder (osm2pgsql + some custom
queries + Sphinx). It's not open source, but it shows what you can achieve
going the osm2pgsql + Sphinx route.

2. OSM-Gazetteer - that's my geocoder. Since I'm from Russia and added some
tricks for Russian addresses, it should work fine for your case. You can also
use it as an OSM data preprocessor. But you may want more stability and
enterprise-grade tooling.

3. Photon, Pelias - they both have fuzzy search capabilities.

Other options from the list may also have fuzzy search; I just haven't
tried them, so I can't say for sure.

Regards, Dmitry.

2016-11-28 19:03 GMT-04:00 Tom :

> Hi everybody,
>
> I’m on a quest for an OSM geocoder that is fault-tolerant with regard to
> misspelled search terms.
>
> The company I work for does various projects for customers in the
> logistics field. From every customer we receive several hundred thousand
> address records, which we have to geocode in order to run various
> calculations. I started to use Nominatim for that (on our own installation),
> but it seems that Nominatim has little tolerance for misspelled street and
> city names. Especially on our last project in Russia it turned out that
> street and city names often include abbreviations written in different ways
> (like „street“, „str.“, „s“, …). Since we receive the address information
> from our customers, we have little influence on the quality of the data. So
> there are not just these valid abbreviations, but also real spelling errors.
> Nevertheless, we have to geocode as many of these addresses as possible.
>
> Right now, Nominatim rejects around 40% of the addresses, finding nothing,
> although the address is in OSM and could be found (it is just spelled
> slightly differently). What I would expect is that a geocoder gives me back
> some kind of answer for *every* question I ask, be it an exact match on the
> city or on the street, or only a „similar“ match. If there was no 100%
> match, it should tell me that several records were found matching my street
> or my city with, e.g., 80% down to 50% similarity. Then I can decide later
> which records I consider a match and which not. In any case, the first row
> returned should be the best match available.
>
> So I have a couple of questions here:
>
>
>    1. Does anybody know of a geocoder for OSM data that already does this?
>    I found that besides Nominatim there are several other geocoders, but I
>    cannot test them all. Maybe some of them already work this way.
>
>    2. There is a PostgreSQL module that seems to do just what I want:
>    pg_trgm. It does not look like Nominatim uses it right now.
>    Is anybody already working on implementing this (or anything similar)?
>
>    3. If not, I would be willing to invest further time and effort into
>    this, but I need some help with the internals of Nominatim, which I’m
>    not familiar with.
>       1. Where would be the right place to integrate this into Nominatim?
>       2. Does it make sense to try to put this into Nominatim?
>       3. Or would it be easier to use just osm2pgsql and build a new
>       query interface on top of that?
>
>
>
> Thanks a lot to anybody who can help me move forward with this issue!
>
> Best regards,
>
> Tom
>
>
> PS: I posted this to both mailing lists, 'dev‘ and 'geocoding', since I’m
> not sure which fits better. Please excuse me if this is wrong!
>
> ___
> dev mailing list
> dev@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/dev
>
>


-- 
Thank you for your time. Best regards.
Dmitry.
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


[OSM-dev] Nominatim and/or a fault-tolerant geocoder

2016-11-28 Thread Tom
Hi everybody,

I’m on a quest for an OSM geocoder that is fault-tolerant with regard to
misspelled search terms.

The company I work for does various projects for customers in the logistics
field. From every customer we receive several hundred thousand address
records, which we have to geocode in order to run various calculations. I
started to use Nominatim for that (on our own installation), but it seems that
Nominatim has little tolerance for misspelled street and city names.
Especially on our last project in Russia it turned out that street and city
names often include abbreviations written in different ways (like „street“,
„str.“, „s“, …). Since we receive the address information from our customers,
we have little influence on the quality of the data. So there are not just
these valid abbreviations, but also real spelling errors. Nevertheless, we
have to geocode as many of these addresses as possible.

Right now, Nominatim rejects around 40% of the addresses, finding nothing,
although the address is in OSM and could be found (it is just spelled slightly
differently). What I would expect is that a geocoder gives me back some kind
of answer for every question I ask, be it an exact match on the city or on the
street, or only a „similar“ match. If there was no 100% match, it should tell
me that several records were found matching my street or my city with, e.g.,
80% down to 50% similarity. Then I can decide later which records I consider a
match and which not. In any case, the first row returned should be the best
match available.

So I have a couple of questions here: 

1. Does anybody know of a geocoder for OSM data that already does this?
   I found that besides Nominatim there are several other geocoders, but I
   cannot test them all. Maybe some of them already work this way.

2. There is a PostgreSQL module that seems to do just what I want: pg_trgm.
   It does not look like Nominatim uses it right now.
   Is anybody already working on implementing this (or anything similar)?
   (See the sketch after this list.)

3. If not, I would be willing to invest further time and effort into this,
   but I need some help with the internals of Nominatim, which I’m not
   familiar with.
   1. Where would be the right place to integrate this into Nominatim?
   2. Does it make sense to try to put this into Nominatim?
   3. Or would it be easier to use just osm2pgsql and build a new
      query interface on top of that?
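
For illustration, a minimal sketch of the pg_trgm idea from question 2,
assuming an osm2pgsql-style import and the psycopg2 driver (the table
planet_osm_line, its name/highway columns and the connection string are
assumptions about the local setup; this is not Nominatim code):

import psycopg2

# Trigram-similarity lookup of street names. The % operator only returns
# rows above pg_trgm's similarity threshold; a GIN index with gin_trgm_ops
# on "name" is assumed so the query stays fast. The literal % is doubled
# because psycopg2 uses %-style parameter placeholders.
QUERY = """
    SELECT name, similarity(name, %(term)s) AS score
    FROM planet_osm_line
    WHERE highway IS NOT NULL
      AND name %% %(term)s
    ORDER BY score DESC
    LIMIT 10;
"""

def fuzzy_streets(conn, term):
    """Return candidate street names ranked by trigram similarity."""
    with conn.cursor() as cur:
        cur.execute(QUERY, {"term": term})
        return cur.fetchall()

if __name__ == "__main__":
    # CREATE EXTENSION pg_trgm; must have been run in the target database.
    conn = psycopg2.connect("dbname=osm")  # adjust to your installation
    for name, score in fuzzy_streets(conn, "Leninskaya str"):
        print(f"{score:.2f}  {name}")

The similarity() score (0.0 to 1.0) could drive exactly the 80%-to-50%
ranking described above, with the best candidate returned first.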


Thanks a lot to anybody who can help me move forward with this issue!

Best regards,

Tom


PS: I posted this to both mailing lists, 'dev‘ and 'geocoding', since I’m not
sure which fits better. Please excuse me if this is wrong!
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Yves
Martin is right: using daily and hourly diffs you'll save bandwidth and reduce
the number of transactions on your side to update the Imposm DB.

Yves

On 28 November 2016 at 15:28:36 GMT+01:00, Martin Koppenhoefer
 wrote:
>2016-11-28 14:33 GMT+01:00 Oliver Tonnhofer :
>
>> It would also not reduce the bandwidth by much, as it still needs to
>> download the same data.
>
>
>
>it will surely use a lot fewer connections, but the amount of data to
>download can also be significantly smaller, depending on how often the
>same objects get touched within the same day.
>
>Cheers,
>Martin

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


[OSM-dev] OpenStreetMap Carto release v2.45.0

2016-11-28 Thread Daniel Koć

Dear all,

Today, v2.45.0 of the openstreetmap-carto stylesheet (the default
stylesheet on openstreetmap.org) has been released.

Changes include:
- Rendering all shops without a specific icon as a dot, not just a whitelist
- Changing the scrub pattern to random
- Changing pitch and track color
- Rendering railway stations as major buildings
- Rendering the name of man_made=bridge inside the polygon
- Documentation updates (including cartography design goals and icon design
  guidelines)
- General code cleanup for icons
- Various bug fixes

Thanks to all the contributors for this release, including micahcochran,
a new contributor.

For a full list of commits, see
https://github.com/gravitystorm/openstreetmap-carto/compare/v2.44.1...v2.45.0

As always, we welcome any bug reports at
https://github.com/gravitystorm/openstreetmap-carto/issues.

You may also like to know that this release is the first with 3 new project
maintainers on board. Please be aware that we're going to drop some legacy
dependencies soon (like Mapnik 2), so we're approaching a big version change.


--
"A dragon lives forever but not so little boys" [L. Lipton]

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Martin Koppenhoefer
2016-11-28 14:33 GMT+01:00 Oliver Tonnhofer :

> It would also not reduce the bandwidth by much, as it still needs to
> download the same data.



it will surely use a lot fewer connections, but the amount of data to
download can also be significantly smaller, depending on how often the same
objects get touched within the same day.

Cheers,
Martin
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Oliver Tonnhofer
Hi,

> On 28.11.2016, at 13:44, Yves  wrote:
> 
> I think you could take daily and hourly diffs first to cope with the import 
> and last planet delay. 

That would make the implementation much more complex (what is the first hourly 
diff after a complete day?). It would also not reduce the bandwidth by much, as 
it still needs to download the same data. The code also uses keep-alive 
connections during the catch-up phase to reduce the load. 
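
For illustration only (an assumption about how this could look, not Imposm's
actual implementation): with Python's requests library, a single Session
reuses one kept-alive connection across sequential diff downloads.

import requests

session = requests.Session()  # connection pooling / HTTP keep-alive

def fetch_diff(url):
    """Download one .osc.gz over the shared, kept-alive connection."""
    resp = session.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content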

The code is already working and behaves similarly to --read-replication-interval
from osmosis (as far as I understand). So I'm just asking what is acceptable:
10 requests/s? 100 requests/s?

Regards,
Oliver

-- 
Oliver Tonnhofer  | Omniscale GmbH & Co KG  | https://omniscale.com
OpenStreetMap WMS and tile services | https://maps.omniscale.com
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Tom Hughes

On 28/11/16 12:24, Oliver Tonnhofer wrote:


> I'm the author of Imposm 3 (https://github.com/omniscale/imposm3/) and I'm
> working on a new command that will automatically download and import diff
> files from planet.openstreetmap.org as they appear.
>
> Normally, it should only make two requests per minute when using minutely
> replication. One for the state and one for the osc.gz file. But after the
> initial import it will download the diff files as fast as Imposm can import
> them till it catches up with the live updates.
>
> A fast server should be able to process 100 diffs per second and more,
> especially when only a smaller extract is imported. My question: Is this OK,
> or should I add a throttle for this?


As Yves said, the best plan would be to use daily diffs until you get to
the current day, then hourlies, and only switch to minutelies when you
get to the last hour.
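
A rough sketch of that catch-up strategy, for illustration only (not Osmosis
or Imposm code): pick the coarsest replication stream whose interval still
fits inside the gap between the local data and now.

from datetime import datetime, timedelta, timezone

def pick_stream(local_timestamp):
    """Return 'day', 'hour' or 'minute' depending on how far behind we are."""
    lag = datetime.now(timezone.utc) - local_timestamp
    if lag > timedelta(days=1):
        return "day"      # daily diffs until we reach the current day
    if lag > timedelta(hours=1):
        return "hour"     # then hourly diffs until the last hour
    return "minute"       # finally minutely diffs to stay live

# Example: data last updated 36 hours ago -> start with daily diffs.
print(pick_stream(datetime.now(timezone.utc) - timedelta(hours=36)))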


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Komяpa
For prior implementations, try looking at osmupdate:
https://wiki.openstreetmap.org/wiki/Osmupdate

Mon, 28 Nov 2016 at 15:47, Yves :

> I think you could take daily and hourly diffs first to cope with the
> import and last planet delay.
> Yves
>
>
> On 28 November 2016 at 13:24:23 GMT+01:00, Oliver Tonnhofer 
> wrote:
>
> Hi,
>
> I'm the author of Imposm 3 (https://github.com/omniscale/imposm3/) and I'm 
> working on a new command that will automatically download and import diff 
> files from planet.openstreetmap.org as they appear.
>
> Normally, it should only make two requests per minute when using minutely 
> replication. One for the state and one for the osc.gz file. But after the 
> initial import it will download the diff files as fast as Imposm can import 
> them till it catches up with the live updates.
>
> A fast server should be able to process 100 diffs per second and more, 
> especially when only a smaller extract is imported. My question: Is this OK, 
> or should I add a throttle for this?
>
> PS: The User-Agent is set to "Imposm 3 x.x.x".
>
>
> Regards,
> Oliver
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
> ___
> dev mailing list
> dev@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/dev
>
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Yves
I think you could take daily and hourly diffs first, to cope with the import
time and the delay since the last planet file.
Yves

On 28 November 2016 at 13:24:23 GMT+01:00, Oliver Tonnhofer  
wrote:
>Hi,
>
>I'm the author of Imposm 3 (https://github.com/omniscale/imposm3/) and
>I'm working on a new command that will automatically download and
>import diff files from planet.openstreetmap.org as they appear.
>
>Normally, it should only make two requests per minute when using
>minutely replication. One for the state and one for the osc.gz file.
>But after the initial import it will download the diff files as fast as
>Imposm can import them till it catches up with the live updates.
>
>A fast server should be able to process 100 diffs per second and more,
>especially when only a smaller extract is imported. My question: Is
>this OK, or should I add a throttle for this?
>
>PS: The User-Agent is set to "Imposm 3 x.x.x".
>
>
>Regards,
>Oliver
>
>-- 
>Oliver Tonnhofer  | Omniscale GmbH & Co KG  | https://omniscale.com
>OpenStreetMap WMS and tile services |
>https://maps.omniscale.com
>
>
>
>
>
>___
>dev mailing list
>dev@openstreetmap.org
>https://lists.openstreetmap.org/listinfo/dev

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


[OSM-dev] planet.openstreetmap.org/replication policy

2016-11-28 Thread Oliver Tonnhofer
Hi,

I'm the author of Imposm 3 (https://github.com/omniscale/imposm3/) and I'm 
working on a new command that will automatically download and import diff files 
from planet.openstreetmap.org as they appear.

Normally, it should only make two requests per minute when using minutely
replication: one for the state file and one for the .osc.gz file. But after
the initial import it will download the diff files as fast as Imposm can
import them until it catches up with the live updates.
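
For reference, a minimal sketch of that polling scheme (an illustration, not
Imposm's actual implementation; import_diff stands in for whatever hands the
downloaded .osc.gz to the importer):

import time
import urllib.request

BASE = "https://planet.openstreetmap.org/replication/minute"

def remote_sequence():
    """Fetch state.txt and return the latest sequence number."""
    with urllib.request.urlopen(f"{BASE}/state.txt") as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith("sequenceNumber="):
                return int(line.split("=", 1)[1])
    raise RuntimeError("sequenceNumber not found in state.txt")

def diff_url(seq):
    """Map a sequence number to its AAA/BBB/CCC.osc.gz path."""
    s = f"{seq:09d}"
    return f"{BASE}/{s[0:3]}/{s[3:6]}/{s[6:9]}.osc.gz"

def run(local_seq, import_diff):
    """Catch up on missed diffs, then poll roughly once a minute."""
    while True:
        latest = remote_sequence()
        while local_seq < latest:          # catch-up phase: fetch as fast as
            local_seq += 1                 # the importer can consume them
            with urllib.request.urlopen(diff_url(local_seq)) as resp:
                import_diff(resp.read())
        time.sleep(60)                     # steady state: two requests/minute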

A fast server should be able to process 100 diffs per second or more,
especially when only a smaller extract is imported. My question: is this OK,
or should I add a throttle for this?

PS: The User-Agent is set to "Imposm 3 x.x.x".


Regards,
Oliver

-- 
Oliver Tonnhofer  | Omniscale GmbH & Co KG  | https://omniscale.com
OpenStreetMap WMS and tile services | https://maps.omniscale.com





___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev