I had the chance to visit with Abhishek in Boulder last year. How he came up with his findings is interesting. He looked at TIGER imports. People assumed that TIGER is TIGER, but in reality the quality of TIGER in 2006 depended very much on the county that produced the source material and how much cleanup Census did to the data. This difference allowed him to study the impact on community growth. Here is an excerpt from Abhishek's paper:

"Unbeknownst to the OpenStreetMap contributors, the US Census was itself in the process of updating and correcting a mostly outdated and incomplete TIGER map in preparation for the 2010 census. Consequently, the 2006 version of the TIGER map that was used by OpenStreetMap contained accurate and complete information for only about 60% of the approximately 3,100 counties in the United States. Information for the remaining 40%, provided largely out-of-date and incomplete information. Thus, communities in about 60% of the counties in the US were seeded with a higher level of information than the other 40% during OpenStreetMap’s formative years. The high-information and low-information groups of counties were broadly comparable along many other dimensions, such as their population and income growth. I exploit this natural experiment to estimate the impact of the level of information seeding on follow-on knowledge production in online communities. Specifically, by comparing Treatment counties (those that received the higher-quality TIGER map) with Control counties, combined with micro-data on more than 350 million contributions between 2005 and 2014, I can estimate the causal effects of information seeding on long-run outcomes within OpenStreetMap in a difference-in-difference framework."

This is a little off topic, but I find TIGER interesting. This year I've been looking at Washington State TIGER data, comparing it to county data where it is available. What I find is that TIGER data in some counties is pretty much the same as it was in 2006. Yet the counties produce monthly or more frequent updates to their road networks. They just don't get sent to Census. One county wasn't updating their friends at Esri, so their basemap didn't match their own data. As you might guess, these counties have a bare minimum of people assigned to GIS work.

I'm a strong believer that the right import can help. For instance, having buildings and addresses helps with tools like Maps.Me, GoMap!! and OsmAnd by helping people place nodes in the correct location. Without buildings for context, it's just an empty space and hard to visualize. Addresses help even more. With an address it's just a matter of adding the appropriate POI information.

Roads shouldn't be imported, at least in the US. I can't speak for other countries. While I stated above that the counties' road data is much better than Census, it's not perfect. I'd rather see people trace the roads in than import them. (That's why I created the Washington State Roads background for iD and JOSM.)

The import process seems to work. (Although I still don't understand the purpose of a separate id. I hear the reason, but it just doesn't make sense. Paul Norman has been trying to explain it to me for years. Then again, I'm probably a slow learner.) By making sure the data is properly licensed, there is a good workflow, the data is good, and it has buy-in from the local community, the imports process seems to be working.

I'm including Abhishek on this discussion since I doubt he follows the import list. If I in any way misinterpreted his findings, he can jump in.

Best,
Clifford

On Tue, Jul 3, 2018 at 12:45 PM Martijn van Exel <[email protected]> wrote:

> Thanks for this follow up. I had not read that paper yet but had seen it
> had come out. I am familiar with Abhishek's other research and will be
> looking forward to sharing my take on it.
> --
> Martijn van Exel
> [email protected]
>
> On Tue, Jul 3, 2018, at 11:46, Frederik Ramm wrote:
> > Hi,
> >
> > this (forwarded message below) is for Martijn who in another thread
> > asked if I knew of any research that would back up my claim that "large
> > imports are often detrimental to community building". I believe the
> > author had also presented at SotM-US last year.
> >
> > Of course in addition to this diligent scientific research, there's also
> > the theoretical models and discussions in
> > http://www.asklater.com/matt/blog/2009/09/06/imports-and-the-community/
> > and the follow-on post, though these are hardly news!
> >
> > I've posted this in a separate thread in order not to further upset
> > Christoph ;)
> >
> > Bye
> > Frederik
> >
> > -------- Forwarded Message --------
> > Subject: Scientific paper on "Information Seeding"
> > Date: Mon, 9 Oct 2017 23:10:13 +0200
> > From: Frederik Ramm <[email protected]>
> > To: Talk Openstreetmap <[email protected]>
> >
> > Hi,
> >
> > today I was pointed to a recent, open-access scientific paper called
> > "Information Seeding and Knowledge Production in Online Communities:
> > Evidence from OpenStreetMap". This open-access paper is available here
> >
> > https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3044581
> >
> > In the context of armchair mapping, but especially of data imports (and
> > recently, machine-generated OSM data) there's always been the discussion
> > between those who say "careful, too much importing will hurt the growth
> > of a local community", and others who say "this import is going to
> > kick-start a local community, let's do it!"
> >
> > Until now this has been a rather un-proven matter of belief, and the
> > general mood is usually in favour of a quick build-up of data (through
> > remote mapping, importing, or machine learning) instead of a
> > take-it-slow approach that would wait for a community to form and take
> > matters into their own hands.
> >
> > The paper quoted above uses OSM as a research object and finds that in
> > certain ways imports in OSM have indeed harmed community growth.
> > The paper attempts to provide insights helpful for all kinds of
> > user-generated knowledge projects (not necessarily OSM), and
> > draws the following conclusion:
> >
> > "While information seeding could be useful to encourage the production
> > of distant forms of follow-on knowledge, it might demotivate and
> > under-provide more mundane and incremental follow-on information.
> > Accordingly, if managers are interested in leveraging pre-existing
> > information to spur the development of online communities, they might be
> > better served by withholding some pre-existing information and provide
> > community members with some space to create knowledge from scratch—even
> > if such knowledge already exists in an external source. This policy allows
> > community members to become invested in the community and develop
> > ownership over the knowledge."
> >
> > Bye
> > Frederik
> >
> > --
> > Frederik Ramm ## eMail [email protected] ## N49°00'09" E008°23'33"
> >
> > _______________________________________________
> > Imports mailing list
> > [email protected]
> > https://lists.openstreetmap.org/listinfo/imports
>
> _______________________________________________
> Imports mailing list
> [email protected]
> https://lists.openstreetmap.org/listinfo/imports

--
@osm_seattle
osm_seattle.snowandsnow.us
OpenStreetMap: Maps with a human touch
