[OSM-dev] OpenStreetMap Carto, lines and polygons

2018-06-29 Thread sav123

Dear All,

I'm working on OSM data transformations, focused only on rendering a map
with osm-carto.


Could you help me please? I have one question:

what exactly does an OSM-Carto stylesheet author expect to find in the tables
line and polygon?


---

The generator scripts do not help with understanding, since their outputs
differ. On the other hand, such algorithms necessarily follow discussions and
conventions.


Otherwise, if there is no exact definitions page, I would be interested in
pages about these kinds of questions:
- in a relation, is a reference to another relation a shortcut for that
relation's own content?
- does it make sense to build a way with nodes belonging to polygons?
- in exactly what cases does a way's data belong to both the line and polygon
tables?
- is there a mandatory way to split WKB data when it is too long?
- in what cases are tags which aren't explicitly requested useful?
- is there identified data belonging both to postgres tables and to other
databases (e.g. the shapefile simplified_land_polygons.shp)?


I think I understand what lines and polygons are in general, and I know the
db style and Lua transformations recommended by osm-carto.


Thank you
Igael




___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Generalisation

2018-05-03 Thread sav123

   Anyway, there is no point in talking about who was first, last, only,
etc. All approaches using closed commercial software are pointless for
OSM - they cannot be reused. Everything can be done with open source, so
that all code/algorithms are open and clear, and there is no need to
pay piles of money for nothing.



pointless for whom?

1) in charge of initiating a big installation, I spent N000 euros
- to save 4 hours in the init process per server
- to reduce the initial estimate of 44 masters to 12
- to get turn-key sub-products that run tile servers on very small slave
hardware


It was a great help for a few bucks, and in the end I saved a lot of money.

0) there would not be a single serious product in the OSM catalog without the
private donations of a very few enterprises and individuals. I respect what
they did and I very much hope for their revenues.





Re: [OSM-dev] Top Ten Tasks

2018-04-27 Thread sav123
-----Original Message-----
From: Tobias Knerr

Sent: Thursday, April 26, 2018 5:08 PM
To: dev list
Subject: [OSM-dev] Top Ten Tasks

Hi All,

about the area-type workload: with tag qualifiers, any node / way /
relation may be promoted to a new logical type without rewriting the
low-level utilities. Effective changes would affect only the CSS design and
db production teams.


about the area object: working on table partitions, with in mind a remark
read here about embedded areas, I found it easier to dispatch objects onto
different levels (natural, human artefacts, administrative boundaries,
spoken languages, etc.), each one with its own coverage set. It needs more
normalization to be efficient.


Adjacent data like an official geo-dictionary, a well-known tag set and
qualifier normalization would be useful to many applications.


:)
Igael 





Re: [OSM-dev] intermediate spatial grid for postgis joins

2018-03-30 Thread sav123

Hello,

It is a CartoCSS query. I don't try to understand it.

Analyzing the join, I deduced that it is difficult to optimize the
'ST_SetSRID' with the '&&'. It depends on the value of the proximity
argument, the units and the size of the required box.


Note that 1) the base tables already have geometry indexes, and 2) the query
is trying to relate nodes to ways, something very local, since a New York
amenity never intersects a Paris street (unless poetry matters).


Adding the join with the 'local' grid consists only of splitting the surface
into small squares and applying the join inside the squares. It's almost
equivalent to a set partition. The result of the trick is so remarkable (even
if 1/15 is far from the theoretical 1/256) that I wanted to share it ...
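The effect of the grid can be sketched outside PostGIS. Below is a toy model
(pure Python, hypothetical uniform data, not the thread's dataset): scatter
point and line features over the Web Mercator extent, bucket both sides into a
K x K grid, and compare the number of candidate pairs the join must consider
with and without the grid.

```python
import random
from collections import Counter

random.seed(42)

# Toy model (not the thread's data): N point features and M line
# features scattered uniformly over the Web Mercator extent.
N, M, K = 2000, 2000, 16            # 16 x 16 grid = 256 cells
EXTENT = 20037508.0                 # half-width used in the thread
STEP = 2 * EXTENT / K

def cell(x, y):
    """Map a coordinate to its (i, j) grid cell."""
    return (int((x + EXTENT) // STEP), int((y + EXTENT) // STEP))

def scatter(n):
    return [(random.uniform(-EXTENT, EXTENT), random.uniform(-EXTENT, EXTENT))
            for _ in range(n)]

points, lines = scatter(N), scatter(M)

# Without the grid, the join may have to consider every point/line pair;
# with the grid, only the pairs that fall in the same cell.
point_counts = Counter(cell(x, y) for x, y in points)
line_counts = Counter(cell(x, y) for x, y in lines)

naive_pairs = N * M
grid_pairs = sum(n * line_counts.get(c, 0) for c, n in point_counts.items())
print(naive_pairs, grid_pairs, round(naive_pairs / grid_pairs))
```

For uniform data the reduction approaches the number of cells; real OSM data
is heavily clustered, which is one reason an observed factor of 15 can fall
short of the theoretical one.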



:)
Igael


-----Original Message-----
From: Darafei "Komяpa" Praliaskouski

Sent: Friday, March 30, 2018 1:52 PM
To: sav123
Cc: dev@openstreetmap.org
Subject: Re: [OSM-dev] intermediate spatial grid for postgis joins


Hi,

can you please share your query plans?

also, `distinct on` is usually a sign of not-well-thought-out logic; a lateral
join with a subquery with limit 1 can be much more performant than producing
extra rows in a possibly wrong join order and then getting rid of them.



Fri, 30 Mar 2018 at 14:46, sav123:
Hello All,

context: postgres database, CartoCSS

I was trying to build the exact table for the CartoCSS layer
"turning_circle_casing".
It is a join between the line and point tables at high zoom.

SQL from the CartoCSS authors, embedded in a CREATE TABLE by a script:
CREATE TABLE  IF NOT EXISTS turning_circle_casing_raw AS select * from
(SELECT DISTINCT ON (p.way)
p.way AS way, l.highway AS int_tc_type,
CASE WHEN l.service IN ('parking_aisle', 'drive-through', 'driveway')
THEN 'INT-minor'::text
ELSE 'INT-normal'::text
END AS int_tc_service,v.prio
FROM planet_osm_point p
JOIN planet_osm_line l ON ST_DWithin(p.way, l.way, 0.1)
JOIN (VALUES
('tertiary', 1),
('unclassified', 2),
('residential', 3),
('living_street', 4),
('service', 5)
) AS v (highway, prio)
ON v.highway=l.highway
WHERE p.highway = 'turning_circle'
OR p.highway = 'turning_loop'
) as creator  WHERE ST_IsValid(way) ORDER BY
ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";

1,597,250 selected rows, 28,800 seconds to build, 50 seconds to index on the
geometry column.

Then I produced an intermediate table to divide one of the tables by 60:
CREATE TABLE  IF NOT EXISTS planet_osm_point_casing AS ( select way from
planet_osm_point where highway in ('turning_circle','turning_loop') )  ;
CREATE INDEX IF NOT EXISTS planet_osm_point_casing_i ON
planet_osm_point_casing USING GIST (way) ;
ANALYZE planet_osm_point_casing;

3 or 4 minutes to get the new table, with 1,631,000 records instead of
the original 103,755,000.
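A quick sanity check on the quoted row counts (plain arithmetic, nothing
more):

```python
# Row counts quoted above.
original_rows = 103_755_000    # planet_osm_point
filtered_rows = 1_631_000      # planet_osm_point_casing

factor = original_rows / filtered_rows
print(f"reduction factor: {factor:.1f}")  # close to the "divide by 60" above
```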


The query execution time was reduced to 25,000 seconds. Still too much.

Then I built a regular 256 x 256 grid table from -20037508 to 20037508 on x
and on y:
insert into local_join_grid values
('BOX3D(xmin ymin,xmax ymax)'::box3d), etc ...
looping over xmin, ymin, xmax, ymax to get 65,536 records.
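The looping can be written out as a short script that emits the whole INSERT
statement. A sketch, assuming the grid table has a single box3d column named
bbox (as the b.bbox in the modified query of this thread suggests):

```python
# Sketch: emit the 256 x 256 Web Mercator grid as one INSERT statement.
# Assumes a table local_join_grid with a single box3d column named bbox.
EXTENT = 20037508.0
K = 256
STEP = 2 * EXTENT / K

values = []
for i in range(K):
    for j in range(K):
        xmin = -EXTENT + i * STEP
        ymin = -EXTENT + j * STEP
        values.append(
            f"('BOX3D({xmin} {ymin},{xmin + STEP} {ymin + STEP})'::box3d)")

sql = "INSERT INTO local_join_grid (bbox) VALUES\n" + ",\n".join(values) + ";"
print(len(values))  # 65536 cells
```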

and I modified the above query like this:

CREATE TABLE  IF NOT EXISTS turning_circle_casing_raw AS select * from
(
SELECT DISTINCT ON (p.way)
p.way AS way, l.highway AS int_tc_type,
CASE WHEN l.service IN ('parking_aisle', 'drive-through', 'driveway')
THEN 'INT-minor'::text
ELSE 'INT-normal'::text
END AS int_tc_service,v.prio
FROM local_join_grid b
JOIN planet_osm_point_casing p on p.way && ST_SetSRID(b.bbox, 3857)
JOIN planet_osm_line l ON l.way && ST_SetSRID(b.bbox, 3857) and
l.highway in ('tertiary',
'unclassified','residential','living_street','service') and
ST_DWithin(p.way, l.way, 0.1)
JOIN (VALUES
('tertiary', 1),
('unclassified', 2),
('residential', 3),
('living_street', 4),
('service', 5)
) AS v (highway, prio)
ON v.highway=l.highway
) as creator   WHERE ST_IsValid(way) ORDER BY
ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";

and I got the result in 1802 s, with the index in 43 s. It is 15 times faster.
I suppose the optimal grid size depends on the hardware and the
postgresql configuration.
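The reported timings are consistent with that factor:

```python
# Timings quoted in this thread, in seconds.
baseline = 28_800      # original query
prefiltered = 25_000   # with planet_osm_point_casing only
gridded = 1_802        # with the local_join_grid join added

print(f"{baseline / gridded:.1f}x vs the original run")
print(f"{prefiltered / gridded:.1f}x vs the prefiltered run")
```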


The comparison between the 2 results shows some rows missing. It is because
the grid is not correct: it cuts streets.
I know 1 or 2 heavy heuristics to build a good grid if it doesn't exist.
I could also start from existing boundaries.

Does such a grid already exist? Any ideas?
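One standard remedy (an assumption on my part, not something proposed in the
thread): keep the regular grid, but expand each cell by the ST_DWithin radius
when testing the && overlap, so that a point and a line within the radius of
each other always share at least one expanded cell; the DISTINCT ON (p.way)
already in the query absorbs the duplicates the overlap creates. A toy
illustration with hypothetical coordinates:

```python
# Toy illustration: a candidate just across a cell border is missed by
# the strict cell test but caught once the cell is expanded by the
# ST_DWithin search radius (0.1 in the queries above).
RADIUS = 0.1

def contains(box, pt):
    xmin, ymin, xmax, ymax = box
    return xmin <= pt[0] <= xmax and ymin <= pt[1] <= ymax

def expand(box, margin=RADIUS):
    xmin, ymin, xmax, ymax = box
    return (xmin - margin, ymin - margin, xmax + margin, ymax + margin)

cell_box = (0.0, 0.0, 100.0, 100.0)    # one hypothetical grid cell
near_miss = (100.05, 50.0)             # within RADIUS of the cell's edge

print(contains(cell_box, near_miss))           # False: pair lost
print(contains(expand(cell_box), near_miss))   # True: pair recovered
```

In PostGIS terms this would correspond to something like
ST_Expand(ST_SetSRID(b.bbox, 3857), 0.1) in the two && conditions (again, a
sketch, not tested here).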

:)
Igael












Re: [OSM-dev] about Carto and SQL

2018-03-21 Thread sav123


1) whether all the SQL queries required to produce a map are in the style
file, or whether there are other SQL queries not in this file (and where they
are).


Part of the SQL is the wrapping and table-discovery logic in Mapnik's PostGIS
driver; look for all the text strings in
https://github.com/mapnik/mapnik/blob/333ef9fde145f88339eaccba810305707bae9b0e/plugins/input/postgis/postgis_datasource.cpp
starting from line 170, and in nearby files.


I found the exact role of each added variable.

While the implicit CartoCSS database is 20% bigger before the min/max scale
reductions, it is faster.
There are many levels of optimization for a production database (not for
dev) ...


More than half of the requests are made on subsets smaller than 100 MB
(compared to the 90 GB of the osm2pgsql polygon table):
under 10 MB: 25%, from 10 MB to 100 MB: 26%, 100 MB to 1 GB: 22%, 1 GB
to 10 GB: 20%, and over 10 GB: 7%.


Having 75 tables (and 1 view, 2 tables being identical) allows data to be
better dispatched across the resources, and indexes to be built faster.


For postgres table partitioning, who can claim that a 50 GB "building"
table is difficult to geo-split into 4 to 10 parts or more?
Or the 9 GB "addresses" table? There are also things to do with the 30 GB
roads_casing and the 20 GB landcover.

Note that planet_osm_roads is already a subset of planet_osm_line.

It is useless to sort the new tables as specified in the queries; better to
keep the initial geom order of the source tables.

After the geo cuts, sorting is fast. Final adjustments...

Also interesting are:
- the intersection and inclusion relations between the subsets
- the (14) constant query sequences sent by Mapnik
- the balance of the tags never queried for rendering but perhaps useful for
searches.


I'll come back in a few weeks with some scripts.
Thank you again ... 





Re: [OSM-dev] about Carto and SQL

2018-03-20 Thread sav123



-----Original Message-----
From: Paul Norman

Sent: Tuesday, March 20, 2018 9:51 PM
To: dev@openstreetmap.org
Subject: Re: [OSM-dev] about Carto and SQL








On 3/20/2018 1:40 PM, sav123 wrote:
Komяpa covered the variables you mentioned, but if you're looking at
benchmarking, I recommend you set up the full rendering stack rather
than trying to generate queries yourself. The latter can be tricky to
get right, and you have to handle parallelism the same as the full stack
to get meaningful results. It's just easier to use renderd and feed it a
list of tiles with render_list than to write code that will take that
list, generate SQL, and run it.

This doesn't answer the question.
If you read the carto file, you will see that there are natural criteria,
not only geometry.
Indeed, it is a bad idea to partition on coordinates, unless a request
applies to a small region.

I'm not speaking of partitioning, but of how to best benchmark 
performance, which is what you're planning to do to see if partitioning is 
worthwhile.


You don't know me, but I see that you have your own idea. It's pretty amazing
...


Please respect the protocol.
I ask and wait for answers.
If it is a bad question, give arguments about the question, not about me; we
have not been introduced.

Or else: I have a lot of work, don't waste my time.










Re: [OSM-dev] about Carto and SQL

2018-03-20 Thread sav123
-----Original Message-----
From: Paul Norman

Sent: Tuesday, March 20, 2018 8:36 PM
To: dev@openstreetmap.org
Subject: Re: [OSM-dev] about Carto and SQL

Hello


Komяpa covered the variables you mentioned, but if you're looking at
benchmarking, I recommend you set up the full rendering stack rather
than trying to generate queries yourself. The latter can be tricky to
get right, and you have to handle parallelism the same as the full stack
to get meaningful results. It's just easier to use renderd and feed it a
list of tiles with render_list than to write code that will take that
list, generate SQL, and run it.


This doesn't answer the question.
If you read the carto file, you will see that there are natural criteria,
not only geometry.
Indeed, it is a bad idea to partition on coordinates, unless a request
applies to a small region.



Komяpa,

Thank you very much.
My goal is to have optimized "replicas" while keeping the original tables in a
master db.

I'll report the test results if they are interesting.

Igael








[OSM-dev] about Carto and SQL

2018-03-20 Thread sav123

Hello,

Is there documentation of the Carto SQL mechanism?

In fact, I only want to know:

1) whether all the SQL queries required to produce a map are in the style
file, or whether there are other SQL queries not in this file (and where they
are).


2) how variables like pixel_width work and how they are passed to postgresql


I hope to be able to grab enough information to partition some or all of the
postgresql tables and to publish the resulting comparison tables.


I know that it is possible to set up logs and check them, but this method
may miss rare calls ...


thank you

Igael 


