Dear Team,

I am willing to put some work into this, but I need a clear directive:
Do you want me to just revert my EPA import, or should I put the work into fixing it? Please give me a direction. I get abuse for importing "junk", but on the other hand people have been happy to get this data, so there is some conflict here.

It is technically possible to fix the data. Here is my algorithm:

1. Pull the 10 changesets off the server using the export routine. Convert them from osmChange to JOSM format.
2. Replace the all-caps names with nicer names. Devise some rules to expand abbreviations, e.g. WWTP to "waste water treatment plant".
3. Check whether the item has been decommissioned (a simple list lookup). If so, make it non-visible and mark it.
4. Check whether the state/area does not want this data (NJ has state-level data).
5. Check the map for any other overlapping nodes within a certain radius (this needs the world file). Ideally OSM would have an export routine that includes nearby nodes.
6. Check the map for any additional nodes with the same name. This will be very hard, but there should be a way to find possible fuzzy matches in an area.

In fact, this processing would be best done at the state or county level. First you would want to split the data into chunks and distribute the processing. I don't know about the chunking mechanisms for the USA data. Of course, not all areas even have EPA hazards. But we could take a set of shapefiles for the chunks, use them to split up the EPA data, create a weighted list of the areas with the most nodes, and then extract the world files for those areas.

There are other things to do:

A. Be able to pull out the EPA record for each node and augment the node with that data, then decide based on that data how to create better symbols. This will create a huge load on the servers and could be considered a form of DDoSing, which is why I have not started to do so. Ideally the EPA will update the KML file to include the basic information about each item: the type of hazard and the date of activity.
B. If the company is just listed as a regulated producer of waste and there is no hazard, we should still want to include the listing, because it is a POI.
C. The points that have already been deleted or modified by people should be fed back to the EPA as an update.
D. If you look at the EPA web page, they are using Bing Maps, and it contains many more local records, down to each doctor's house with an X-ray device. I have not found where to get this data from, but it could be used to complete the map with more POIs. There are about 10x more points in that dataset.

Well, these are my ideas for processing the data. Let me know if anyone supports further work in this area.

Thanks,
mike

On Sun, Dec 13, 2009 at 10:49 AM, Minh Nguyen <[email protected]> wrote:
> On 12/12/09 7:03 AM, [email protected] wrote:
>> The ref is for the node itself.
>> If you follow them, you will find a ton of information about the item
>> from the EPA.
>> It has been suggested to change this to website.
>
> Wouldn't "url" be a better tag for it? For your example, the "ref" would
> actually be more like "110010106081".
>
> --
> Minh Nguyen <[email protected]>
> [[en:User:Mxn]] [[vi:User:Mxn]] [[m:User:Mxn]]
> AIM: trycom2000; Jabber: [email protected]; Blog: http://notes.1ec5.org/
>
> _______________________________________________
> Talk-us mailing list
> [email protected]
> http://lists.openstreetmap.org/listinfo/talk-us
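
Steps 2, 3, 5 and 6 of the algorithm in the message above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the dictionary layout of a node, the abbreviation table, the "decommissioned" flag, the 50 m radius and the 0.8 similarity threshold are all made up for the example and are not taken from the actual import or from any OSM tool.

```python
import difflib
import math

# Hypothetical abbreviation table; the real rules would have to be worked
# out from the EPA names themselves.
ABBREVIATIONS = {"WWTP": "Waste Water Treatment Plant"}

# Words that usually stay lower-case inside a title-cased name.
SMALL_WORDS = {"of", "and", "the", "at", "for"}


def clean_name(raw):
    """Step 2: turn an all-caps EPA name into a friendlier one."""
    words = []
    for i, word in enumerate(raw.split()):
        if word.upper() in ABBREVIATIONS:
            words.append(ABBREVIATIONS[word.upper()])
        elif i > 0 and word.lower() in SMALL_WORDS:
            words.append(word.lower())
        else:
            words.append(word.capitalize())
    return " ".join(words)


def process(epa_node, decommissioned_refs):
    """Steps 2 and 3: clean the name and flag decommissioned items."""
    node = dict(epa_node, name=clean_name(epa_node["name"]))
    if node["ref"] in decommissioned_refs:
        # Assumed marker; how to actually tag or hide the node is undecided.
        node["decommissioned"] = True
    return node


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    earth_radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * earth_radius * math.asin(math.sqrt(a))


def find_candidates(epa_node, existing_nodes, radius_m=50.0, min_ratio=0.8):
    """Steps 5 and 6: existing nodes that are nearby or similarly named."""
    matches = []
    for node in existing_nodes:
        near = distance_m(epa_node["lat"], epa_node["lon"],
                          node["lat"], node["lon"]) <= radius_m
        ratio = difflib.SequenceMatcher(
            None, epa_node["name"].lower(), node.get("name", "").lower()
        ).ratio()
        if near or ratio >= min_ratio:
            matches.append((node, near, ratio))
    return matches
```

The candidates returned by find_candidates are only suggestions for a human to review. A brute-force loop like this is fine for one county-sized chunk, but a real run over a whole state would probably want a spatial index instead.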

