[Talk-es] Fwd: [Imports] Spanish Cadastre ELEMTEX

Javier Sánchez Thu, 07 Jun 2012 01:09:58 -0700

Hi,

Our replies crossed: Cruz and I both answered Paul Norman's response
about the ELEMTEX import.


Summary: We are on the right track, but there is still some work to do.

* It would be useful to add a step to Cat2Osm that filters out some
nodes and assigns tags to POIs.

* The data need quality control.

The full message follows, with my replies inline.

> - Some nodes have only name, source and source:date tags while others have
> those and place=locality

Yes. Most of the nodes correspond to unpopulated places (parajes) and
have the tag place=locality assigned provisionally. After a manual
review, some of them could be reassigned to populated places, usually
small ones such as isolated_dwelling and hamlet. If a node does not
have place=locality, it is most likely a POI.

> - Some have absurd names (e.g. name=-I+I and name=P-1, P-2, etc). These
> could be dealt with manually but it would be worth seeing how common they
> are and dealing with them in the conversion

Yes. They correspond to parcel numbers (for example, in industrial
estates). They are candidates for deletion. A step could be added to
the program to detect and filter them. I will raise this with the
programmers.
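
To make the filtering idea concrete, here is a minimal sketch in
Python. It is not part of Cat2Osm; the pattern and the function names
are my own assumptions, based only on the two examples mentioned above
(name=P-1 and name=-I+I). It separates parcel-like labels from the rest
and counts them, so that their frequency can be checked before deciding
whether to drop them in the conversion.

```python
import re
from collections import Counter

# Assumed heuristic, derived only from the examples "P-1", "P-2" and "-I+I":
#  * one or two letters, an optional hyphen, then digits (e.g. "P-1"), or
#  * only signs, digits, spaces and the letters I, V, X (e.g. "-I+I").
# It may over-match real names, so flagged labels still need a manual look.
PARCEL_LIKE = re.compile(r"^(?:[A-Z]{1,2}-?\d+|[-+IVX\d\s]+)$")


def is_parcel_like(name: str) -> bool:
    """Return True if the label looks like a parcel number, not a place name."""
    return bool(PARCEL_LIKE.match(name.strip().upper()))


def split_names(names):
    """Split labels into (keep, drop) lists, preserving the input order."""
    keep, drop = [], []
    for name in names:
        (drop if is_parcel_like(name) else keep).append(name)
    return keep, drop


def parcel_like_counts(names):
    """Count how often each parcel-like label occurs."""
    return Counter(name for name in names if is_parcel_like(name))
```

Running parcel_like_counts over all the name values of a municipality
would show how common these labels are, which is the check Paul asked
for before handling them automatically.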

> - Some of the names could be translated to tagging, e.g. name=GASOLINERA
> could be turned into amenity=fuel. Again, this would depend how common they
> are and how consistent they are.

Yes to this as well. Again, the program could make these decisions.
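
As a rough illustration of how such a decision could be encoded, the
sketch below uses a lookup table from label text to tags. Only the
GASOLINERA entry comes from Paul's message; the other two entries, the
table name and the function name are hypothetical and would need to be
agreed on the list. Existing tags are never overwritten.

```python
# Lookup from upper-cased label text to suggested OSM tags.
NAME_TO_TAGS = {
    "GASOLINERA": {"amenity": "fuel"},       # example from the thread
    "CEMENTERIO": {"landuse": "cemetery"},   # hypothetical entry
    "COLEGIO": {"amenity": "school"},        # hypothetical entry
}


def suggest_tags(tags: dict) -> dict:
    """Return a copy of `tags` with suggested tags added for known labels.

    Tags already present on the element always win over suggestions.
    """
    label = tags.get("name", "").strip().upper()
    suggested = NAME_TO_TAGS.get(label, {})
    return {**suggested, **tags}
```

The name tag is kept as it is, so a reviewer can still decide whether
the label is a real name or only a description.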

> - All I noticed with the processed file is that were some ways with only
> source, source:date and name tags.

"Mea culpa". I am embarrassed; this is my mistake. There is one way
that corresponds to a school and one node that corresponds to a
cemetery, both without physical tags.
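
A slip like this can be caught mechanically before upload. The sketch
below is only an assumption about how such a check might look, not an
existing tool: it reads a reviewed .osm file as XML and lists every
node, way or relation whose tags are limited to name, source and
source:date. The function name and the key set are my own.

```python
import xml.etree.ElementTree as ET

# Keys that describe the import itself rather than the mapped feature.
BOOKKEEPING_KEYS = {"name", "source", "source:date"}


def untagged_features(osm_xml: str):
    """List (element type, id) pairs that carry only bookkeeping tags.

    Elements without any tags (for example, the nodes of a way) are skipped.
    """
    root = ET.fromstring(osm_xml)
    flagged = []
    for elem in root:
        if elem.tag not in ("node", "way", "relation"):
            continue
        keys = {tag.get("k") for tag in elem.findall("tag")}
        if keys and keys <= BOOKKEEPING_KEYS:
            flagged.append((elem.tag, elem.get("id")))
    return flagged
```

On the raw Cat2Osm output every unreviewed node would be flagged, so
the check is only meaningful on a file that has already been reviewed.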

> CanVec has convinced me that going this method with imports requires some
> sort of QA plan or one importer will not check any of their work and cause
> duplicates, disconnected data, bad data, etc.

The mistakes I made demonstrate this.

> The problem is not that there aren't tools for checking your own work, the
> problem is that some people will not use them and cause significant damage.
> I can't suggest a solution for this, and I think it is not as significant
> for this particular data set, but it is definitely an issue when you start
> getting into others like roads and landuse.

Please make your suggestion. I propose publishing the processed files
and asking on the Spanish list (or here) for a review by a second
person before the data are uploaded.

> If you find a good solution for this problem, please document it and let
> everyone know as it is a big problem with imports done by multiple people
> who may have varying standards.

I will correct these points. Thank you very much for everything.


---------- Forwarded message ----------
From: Javier Sánchez <[email protected]>
Date: 2012/6/7
Subject: Re: [Imports] Spanish Cadastre ELEMTEX
To: Paul Norman <[email protected]>

2012/6/7 Paul Norman <[email protected]>:
>> From: Javier Sánchez
>> Subject: [Imports] Spanish Cadastre ELEMTEX
>>
>> The ELEMTEX layer contains text labels about unpopulated places, many
>> small populated places and points of interest (like police stations,
>> post offices, hospitals, schools, etc), but they are not categorized.
>> These data are extracted with an option of Cat2Osm. The result consist
>> of nodes with the tags name=*, source=cadastre and source:date=*. The
>> job basically consists in manually check nodes, assign tags most
>> suitable to describe the elements based on the name and local knowledge,
>> drop which of them that can not be classified, correct other errors like
>> spelling and conflate with existing OSM data.
>>
>> As example two files are attached. The first is the output generated by
>> the program Cat2Osm for a municipality [3]. The second contains the data
>> reviewed manually [4].
>
> I have a few comments, some on the raw data, some on the edited data. This
> is much improved and dealing with a smaller subset of data makes it much
> easier to review.
>
> - Some nodes have only name, source and source:date tags while others have
> those and place=locality

Yes. Most of the nodes correspond to unpopulated places and have the
tag place=locality assigned tentatively. After a manual review, some
of them could be assigned to populated places, usually small ones
like isolated_dwelling and hamlet. If a node doesn't have
place=locality, it is most probably a POI.

> - Some have absurd names (e.g. name=-I+I and name=P-1, P-2, etc). These
> could be dealt with manually but it would be worth seeing how common they
> are and dealing with them in the conversion

Yes. They correspond to parcel numbers (for example in an industrial
estate). They are candidates for deletion. Some kind of automatic
process could be added to the program to detect and filter them. I
will suggest this to the programmers.

> - Some of the names could be translated to tagging, e.g. name=GASOLINERA
> could be turned into amenity=fuel. Again, this would depend how common they
> are and how consistent they are.

Also yes. Again, the program could make some decisions on this.

> - All I noticed with the processed file is that were some ways with only
> source, source:date and name tags.

"Mea culpa". I'm embarrassed; this is a mistake of mine. There is one
way corresponding to a school and one node corresponding to a
cemetery without physical tags.

> CanVec has convinced me that going this method with imports requires some
> sort of QA plan or one importer will not check any of their work and cause
> duplicates, disconnected data, bad data, etc.

And my previous mistakes demonstrate it.

> The problem is not that there aren't tools for checking your own work, the
> problem is that some people will not use them and cause significant damage.
> I can't suggest a solution for this, and I think it is not as significant
> for this particular data set, but it is definitely an issue when you start
> getting into others like roads and landuse.

Please make your suggestion. I propose publishing the processed OSM
files and asking on the Spanish list (or here, for that matter) for a
review by a second person prior to upload.

> If you find a good solution for this problem, please document it and let
> everyone know as it is a big problem with imports done by multiple people
> who may have varying standards.

I will correct these points. Thank you very much for everything.

Javier

_______________________________________________
Talk-es mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/talk-es