I was curious how complete OpenStreetMap shop data was, so I decided to
do an analysis for some Canadian chains.

The results were mixed. Starting with a Canada extract, I loaded the
data into PostGIS and ran queries against name, brand and franchise for
objects where amenity, office or shop was not null. Brands which have
recently changed names (e.g. Zellers/Target) were avoided.

OSM completeness varied from 7% to 81%, with no clear trend. The four
major fast food and restaurant chains considered ranged from 33% to 51%.
Shops opening and closing affect the accuracy of these results, and the
accuracy of the external sources for the number of shops may also vary.

Method
======

The Geofabrik extract for Canada was imported with osm2pgsql using a
custom .style file containing name, operator, brand, franchise, amenity,
building, office and shop columns. The last four columns caused an object
to be placed in the polygon table. After import, the tables were filtered
to remove rows with no amenity, office or shop tag. Appropriate indexes
were added. A view was created combining the two tables and giving
lower-case versions of the name, operator, brand and franchise tags.
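
The filter-and-view step could look roughly like the sketch below. It is
only an illustration: it uses Python's built-in sqlite3 module as a
stand-in for PostGIS, and the table names osm_point and osm_polygon are
hypothetical. The view name shops and the l-prefixed columns follow the
query used in this post; loperator is assumed by analogy.

```python
import sqlite3

# Tag columns named in the post; everything else here is illustrative.
TAG_COLUMNS = ("name", "operator", "brand", "franchise",
               "amenity", "building", "office", "shop")
TABLES = ("osm_point", "osm_polygon")  # hypothetical table names


def create_tables(conn):
    """Create empty stand-ins for the two imported tables."""
    columns = ", ".join(f'"{col}" TEXT' for col in TAG_COLUMNS)
    for table in TABLES:
        conn.execute(f"CREATE TABLE {table} ({columns})")


def build_shops_view(conn):
    """Drop untagged rows, then combine both tables into a 'shops' view."""
    for table in TABLES:
        # Keep only rows with at least one of amenity, office or shop.
        conn.execute(
            f"DELETE FROM {table} "
            "WHERE amenity IS NULL AND office IS NULL AND shop IS NULL"
        )
    select = (
        'SELECT name, amenity, office, shop, '
        'lower(name) AS lname, lower("operator") AS loperator, '
        'lower(brand) AS lbrand, lower(franchise) AS lfranchise FROM {}'
    )
    conn.execute(
        "CREATE VIEW shops AS "
        + " UNION ALL ".join(select.format(table) for table in TABLES)
    )
    conn.commit()
```

A row tagged only with building is dropped by the filter, since building
is not one of the three tags that mark an object as a shop here.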

Queries of the following form were run:
  SELECT COUNT(*) FROM shops
    WHERE lname LIKE :'name' OR lbrand LIKE :'name'
      OR lfranchise LIKE :'name';

psql substituted :'name' with the pattern being searched for, for
example 'mcdonald%' for McDonald's. The queries were intended to catch
all possible shops, even at the cost of false positives. Brands were not
selected in any systematic manner.
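
To show how a prefix pattern of this kind behaves, here is a small
self-contained sketch. It runs the same shape of query against an
in-memory SQLite table instead of the PostGIS view, and every sample row
is made up, including the unrelated "mcdonald hardware" shop that shows
how a false positive gets counted.

```python
import sqlite3

# Same shape as the psql query, with :name bound by sqlite3 instead of psql.
COUNT_SQL = """
    SELECT COUNT(*) FROM shops
     WHERE lname LIKE :name OR lbrand LIKE :name OR lfranchise LIKE :name
"""

# Hypothetical rows: (lname, lbrand, lfranchise), already lower-cased.
SAMPLE_ROWS = [
    ("mcdonald's", None, None),
    ("restaurant", "mcdonalds", None),
    ("mcdonald hardware", None, None),  # unrelated shop: a false positive
    ("subway", None, "subway"),
]


def make_demo_shops():
    """Build an in-memory stand-in for the shops view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shops (lname TEXT, lbrand TEXT, lfranchise TEXT)")
    conn.executemany("INSERT INTO shops VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def count_matches(conn, pattern):
    """Count rows whose name, brand or franchise matches the LIKE pattern."""
    return conn.execute(COUNT_SQL, {"name": pattern}).fetchone()[0]
```

Because the three conditions are combined with OR in a single WHERE
clause, a shop tagged in both name and franchise is still counted once.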

Public sources were used for the true number of shops of a particular
chain, generally Wikipedia or public data aggregators.

Results
=======
                   OSM   True   Completeness
Tim Horton's      1480   4304   34%
Subway             849   2563   33%
McDonalds          722   1417   51%
Starbucks          592   1363   43%
    Both OSM and Google use both Starbucks and Starbucks Coffee
A & W              292    800   37%
Domino's            67    383   17%
Wendy's            224    369   61%
Burger King        150    281   53%
East Side Mario's   46     85   54%
Milestones          32     44   73%
Chili's             10     16   63%

Sears              105   1570    7%
Rona               122    500   24%
The Bay             50    421   12%
Walmart            265    382   69%
Home Depot          95    180   53%

Canadian Tire      400    491   81%
    May be double-counting automotive centers.
Chapters            47    233   20%
Sleep Country       22    179   12%
London Drugs        54     78   69%
    May be double-counting some stores with a pharmacy inside
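
Completeness here is the OSM count divided by the true count, as a whole
percentage. A minimal sketch of that arithmetic follows; the function
name is made up, and the figures are copied from the table above. The
table values are consistent with rounding exact halves up (10/16 is
62.5%, shown as 63%), so the sketch avoids Python's round(), which rounds
halves to the nearest even number.

```python
def completeness_percent(osm_count, true_count):
    """Return the OSM count as a whole-number percentage of the true count.

    Integer arithmetic rounds exact halves up; round(62.5) would give 62.
    """
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    return (200 * osm_count + true_count) // (2 * true_count)


# (OSM count, true count) pairs taken from the results table.
FIGURES = {
    "Tim Horton's": (1480, 4304),
    "Chili's": (10, 16),
    "Sears": (105, 1570),
    "Canadian Tire": (400, 491),
}

PERCENTAGES = {
    chain: completeness_percent(osm, true)
    for chain, (osm, true) in FIGURES.items()
}
```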

Remarks
=======

It took significantly longer to find the true number of stores than
to get results from the OSM data. Part of this is my greater familiarity
with OSM tools, but a large part is that with OSM it is not necessary to
track down many different sources to get store counts.

Although no urban/rural analysis was performed, it is generally expected
that OSM is more complete in populated urban areas than in low-density
rural areas, and completeness in these urban areas is more important
for many uses.

No proprietary data sources were available for comparison, but it should
not be assumed that they are any more complete, nor that their name or
similar tagging is any more consistent. As an example, Google's data was
observed to use both "Starbucks" and "Starbucks Coffee" for the coffee
chain, sometimes having both for what was really the same location.

The tools used to generate the counts could just as easily be used to
extract the shop data itself for further work.

Improving the data
==================
Inconsistent tagging was observed with some shops, such as variability
between amenity=restaurant and amenity=fast_food. This variation should
reflect real differences between locations but may not. Inconsistent
names were also observed, such as "Walmart", "Wal-mart", and "Wal Mart".
These issues are not as significant as the large percentage of missing
shops.
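
One way to spot name variants like these is to compare names after
stripping case, spaces and punctuation. The sketch below is only an
illustration of that idea, not something that was run for this analysis,
and both helper names are made up.

```python
import re
from collections import defaultdict


def normalize_name(name):
    """Lower-case a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_variants(names):
    """Return spellings that collapse to the same normalized name."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_name(name)].add(name)
    # Only keys with more than one distinct spelling are interesting.
    return {key: sorted(spellings)
            for key, spellings in groups.items()
            if len(spellings) > 1}
```

Normalizing this aggressively will also merge names that are genuinely
different, so any groups it reports would still need checking by hand.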

_______________________________________________
Talk-ca mailing list
Talk-ca@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca
