Re: [GRASS-user] Slow import of GHSL

2017-03-24 Thread Markus Neteler
On Fri, Mar 24, 2017 at 10:25 AM, Nikos Alexandris
 wrote:
> * Markus Metz  [2017-03-22 22:11:01 +0100]:
>> On Wed, Mar 22, 2017 at 9:52 PM, Markus Neteler  wrote:

...
> Markus N, I am interested: did you use the "memory" option?

I left r.in.gdal's default value.

...
> My understanding is that it would make a difference, for GRASS, if I
> would redo the GHSL layers with a row-shaped "block".  Makes sense?

Why spend time on redoing the GHSL layers? Do you have to import them
frequently?

markusN
___
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Re: [GRASS-user] Slow import of GHSL

2017-03-24 Thread Nikos Alexandris

(Sorry for silence, was without my personal computer for a week.)


* Markus Metz  [2017-03-22 22:11:01 +0100]:


On Wed, Mar 22, 2017 at 9:52 PM, Markus Neteler  wrote:


On Wed, Mar 22, 2017 at 9:28 PM, Markus Metz
 wrote:
> On Wed, Mar 22, 2017 at 8:12 PM, Markus Neteler wrote:

...
>> Nikos, for an even bigger map try
>>
>> Global Surface Water (2000-2012, 30 m, Data coverage is from 80° north
>> to 60° south):
>> http://landcover.usgs.gov/glc/WaterDescriptionAndDownloads.php
>> by USGS. 1.6GB in size.


Interesting this is. See also:
https://global-surface-water.appspot.com/, at 30m, Landsat-based as
well.



>> Using gdalbuildvrt I created a VRT from the 504 GeoTIFF files.
>>
>> After import into GRASS GIS, here the timings:
>>
>> # final map size:
>> g.region -p
>> ...
>> rows:   493200
>> cols:   1296001
>> cells:  639187693200
>>
>> (handling only works in GRASS GIS 7.3.svn since Markus Metz's recent
>> improvements on global data import are needed).
>
> (my changes were bug fixes, not improvements)
>
>>
>> Benchmarks:
>> - Import took 2h while reading the data from a CIFS mounted storage
>> box (slow) and writing on SSD.


Markus N, I am interested: did you use the "memory" option?


>> - Displaying the entire map (639 giga-pixel) in GRASS GIS' display
>> (d.mon) took ~15 sec over a ssh tunnel from my laptop to the server,
>> since I am at a conference.
>>
>> Fair deal I would say :-)
>
> A bit more information would help to compare:
>  - what is your GDAL version?

GDAL 2.1.2

>  - are 504 GeoTIFF files compressed? If yes, which method?

Yes, COMPRESSION=LZW

>  - what are the block dimensions of the input GeoTIFFs?

Size is 36001, 36001  - Block=36001x1


Now that's important too.  What about GHSL's block size of 4K^2?
My understanding is that it would make a difference, for GRASS, if I
would redo the GHSL layers with a row-shaped "block".  Makes sense?


This is row by row compression as in GRASS. That could help import with
r.in.gdal which also reads and writes row by row.
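
For the archive, a sketch of how such a row-shaped rewrite could be done with gdal_translate, using the GTiff creation options TILED, BLOCKYSIZE, COMPRESS and BIGTIFF. The output name is made up, BIGTIFF=YES is an assumption about the resulting file size, and the script only prints the command. Whether the rewrite pays off for a one-time import is the open question here, not something this shows.

```shell
#!/bin/sh
# Sketch only: print a gdal_translate call that rewrites a tiled GeoTIFF
# as one-row strips. Nothing is executed; file names are placeholders.
src="GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif"
dst="p1_rowstrips.tif"   # hypothetical output name

# TILED=NO     -> strips instead of tiles
# BLOCKYSIZE=1 -> one raster row per strip
# COMPRESS=LZW -> same compression as the source files
# BIGTIFF=YES  -> assumption: the output may exceed the 4 GB classic TIFF limit
cmd="gdal_translate -co TILED=NO -co BLOCKYSIZE=1 -co COMPRESS=LZW -co BIGTIFF=YES $src $dst"
echo "$cmd"
```

Running gdalinfo on the result should then report the new block shape, in the same way the "Block=36001x1" above was obtained.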


Type=Byte

>  - what kind of GRASS compression did you use?

Default raster + NULL compression enabled. I.e.,

r.compress -p watermask2010
 is compressed (method 2: ZLIB). Data type: CELL


You might save disk space at the cost of longer reading times with BZIP2.


 has a compressed NULL file

Again, the fact that I had to read from an attached storage box likely
slowed down the import.
Just thought to post these numbers here.


Impressive that such a large raster can be imported at all, and relatively
fast!


Indeed, impressive.

Nikos


Reading about 1.6 GB (also from an attached storage box) should not take 2
hours, therefore I think the limit is software input decompression and
output compression.

Markus M


Re: [GRASS-user] Slow import of GHSL

2017-03-22 Thread Markus Neteler
On Wed, Mar 22, 2017 at 9:28 PM, Markus Metz
 wrote:
> On Wed, Mar 22, 2017 at 8:12 PM, Markus Neteler  wrote:
...
>> Nikos, for an even bigger map try
>>
>> Global Surface Water (2000-2012, 30 m, Data coverage is from 80° north
>> to 60° south):
>> http://landcover.usgs.gov/glc/WaterDescriptionAndDownloads.php
>> by USGS. 1.6GB in size.
>>
>> Using gdalbuildvrt I created a VRT from the 504 GeoTIFF files.
>>
>> After import into GRASS GIS, here the timings:
>>
>> # final map size:
>> g.region -p
>> ...
>> rows:   493200
>> cols:   1296001
>> cells:  639187693200
>>
>> (handling only works in GRASS GIS 7.3.svn since Markus Metz's recent
>> improvements on global data import are needed).
>
> (my changes were bug fixes, not improvements)
>
>>
>> Benchmarks:
>> - Import took 2h while reading the data from a CIFS mounted storage
>> box (slow) and writing on SSD.
>> - Displaying the entire map (639 giga-pixel) in GRASS GIS' display
>> (d.mon) took ~15 sec over a ssh tunnel from my laptop to the server,
>> since I am at a conference.
>>
>> Fair deal I would say :-)
>
> A bit more information would help to compare:
>  - what is your GDAL version?

GDAL 2.1.2

>  - are 504 GeoTIFF files compressed? If yes, which method?

Yes, COMPRESSION=LZW

>  - what are the block dimensions of the input GeoTIFFs?

Size is 36001, 36001  - Block=36001x1
Type=Byte

>  - what kind of GRASS compression did you use?

Default raster + NULL compression enabled. I.e.,

r.compress -p watermask2010
 is compressed (method 2: ZLIB). Data type: CELL
 has a compressed NULL file

Again, the fact that I had to read from an attached storage box likely
slowed down the import.
Just thought to post these numbers here.

markusN

Re: [GRASS-user] Slow import of GHSL

2017-03-22 Thread Markus Metz
On Wed, Mar 22, 2017 at 8:12 PM, Markus Neteler  wrote:
>
> On Sat, Mar 11, 2017 at 7:01 PM, Markus Metz
>  wrote:
> > On Sat, Mar 11, 2017 at 8:53 AM, Nikos Alexandris <n...@nikosalexandris.net>
> > wrote:
> >>
> >> Nikos Alexandris
> >>
> >>>> Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
> >>>> layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
> >>>> db progress slow?
> >
> > because it is a very large raster map: Size is 507904, 647168
>
> Nikos, for an even bigger map try
>
> Global Surface Water (2000-2012, 30 m, Data coverage is from 80° north
> to 60° south):
> http://landcover.usgs.gov/glc/WaterDescriptionAndDownloads.php
> by USGS. 1.6GB in size.
>
> Using gdalbuildvrt I created a VRT from the 504 GeoTIFF files.
>
> After import into GRASS GIS, here the timings:
>
> # final map size:
> g.region -p
> ...
> rows:   493200
> cols:   1296001
> cells:  639187693200
>
> (handling only works in GRASS GIS 7.3.svn since Markus Metz's recent
> improvements on global data import are needed).

(my changes were bug fixes, not improvements)

>
> Benchmarks:
> - Import took 2h while reading the data from a CIFS mounted storage
> box (slow) and writing on SSD.
> - Displaying the entire map (639 giga-pixel) in GRASS GIS' display
> (d.mon) took ~15 sec over a ssh tunnel from my laptop to the server,
> since I am at a conference.
>
> Fair deal I would say :-)

A bit more information would help to compare:
 - what is your GDAL version?
 - are 504 GeoTIFF files compressed? If yes, which method?
 - what are the block dimensions of the input GeoTIFFs?
 - what kind of GRASS compression did you use?

Markus M

Re: [GRASS-user] Slow import of GHSL

2017-03-22 Thread Markus Neteler
On Sat, Mar 11, 2017 at 7:01 PM, Markus Metz
 wrote:
> On Sat, Mar 11, 2017 at 8:53 AM, Nikos Alexandris 
> wrote:
>>
>> Nikos Alexandris
>>
>>>> Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
>>>> layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
>>>> db progress slow?
>
> because it is a very large raster map: Size is 507904, 647168

Nikos, for an even bigger map try

Global Surface Water (2000-2012, 30 m, Data coverage is from 80° north
to 60° south): http://landcover.usgs.gov/glc/WaterDescriptionAndDownloads.php
by USGS. 1.6GB in size.

Using gdalbuildvrt I created a VRT from the 504 GeoTIFF files.

After import into GRASS GIS, here the timings:

# final map size:
g.region -p
...
rows:   493200
cols:   1296001
cells:  639187693200

(handling only works in GRASS GIS 7.3.svn since Markus Metz's recent
improvements on global data import are needed).

Benchmarks:
- Import took 2h while reading the data from a CIFS mounted storage
box (slow) and writing on SSD.
- Displaying the entire map (639 giga-pixel) in GRASS GIS' display
(d.mon) took ~15 sec over a ssh tunnel from my laptop to the server,
since I am at a conference.
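
A quick sanity check of these numbers in plain shell arithmetic. This is only a sketch: it assumes a shell with 64-bit integer arithmetic and takes "2h" as exactly 7200 seconds, so the rate is a rough average, not a measurement.

```shell
#!/bin/sh
# Cell count and average import rate from the g.region output above.
rows=493200
cols=1296001
cells=$((rows * cols))                # matches "cells: 639187693200"

import_seconds=$((2 * 3600))          # "Import took 2h", taken as exactly 7200 s
cells_per_second=$((cells / import_seconds))

echo "cells:            $cells"             # 639187693200
echo "cells per second: $cells_per_second"  # 88776068
```

That is roughly 89 million cells per second, averaged over the whole import.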

Fair deal I would say :-)

cheers,
Markus

-- 
https://www.mundialis.de/

Re: [GRASS-user] Slow import of GHSL

2017-03-17 Thread Markus Neteler
On Fri, Mar 17, 2017 at 11:02 AM, Nikos Alexandris
 wrote:
> Markus N wrote:
>> Remember that you have to explicitly switch on the NULL compression:
>>
>> https://grass.osgeo.org/grass72/manuals/rasterintro.html#raster-compression
>
> Thanks for this one too.  I guess we can't opt for a "sane" default NULL
> compression. Can we? For future versions?

That I proposed here:

https://trac.osgeo.org/grass/ticket/2750#comment:61

and MarkusM suggested some tests to be implemented for that (any
volunteers? might be easy with some Python knowledge).

...
> There are "out-of-date" versions of GRASS, in software repositories.  For
> example,
> https://dl.fedoraproject.org/pub/epel/7/x86_64/g/, see for gdal and
> grass.  So, this complicates things (compatibility).

Try here:
https://copr.fedorainfracloud.org/coprs/neteler/GDAL/
and
https://copr.fedorainfracloud.org/coprs/neteler/grass72/

The latter is still pending a fix to get g.extension working, but due
to travelling I cannot work on this at the moment.

Markus

-- 
Markus Neteler
http://www.mundialis.de - free data with free software
http://grass.osgeo.org
http://courses.neteler.org/blog

Re: [GRASS-user] Slow import of GHSL

2017-03-17 Thread Nikos Alexandris

* Markus Metz  [2017-03-16 22:06:12 +0100]:


On Thu, Mar 16, 2017 at 11:26 AM, Nikos Alexandris 
wrote:



[...]



With the p1 tif and GRASS db on the same spinning HDD, and
6 other heavy processes constantly reading from and writing to that same
HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
GB as output is not that heavy on disk IO. Most of the time is spent
decompressing input and compressing output.



Is it an 1rpm disk?


I think you are on the wrong track, disk IO does not matter here. It was a
7200rpm disk, and the output of r.in.gdal was about 1.5 GB. It takes only
seconds, not hours to write 1.5 GB to a HDD.
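
To put rough numbers on that argument, a small sketch using the figures quoted in this thread (360 MB in, 1.5 GB out, 2h 13min). The 100 MB/s sequential rate for a 7200rpm disk is an assumed ballpark, not a value from this thread.

```shell
#!/bin/sh
# Average I/O rate of the 2h 13min import versus what the disk could do.
input_mb=360
output_mb=1500                            # 1.5 GB, taken as 1500 MB
elapsed_s=$((2 * 3600 + 13 * 60))         # 2h 13min = 7980 s

total_mb=$((input_mb + output_mb))        # 1860 MB moved in total
avg_kb_per_s=$((total_mb * 1000 / elapsed_s))

hdd_mb_per_s=100                          # ASSUMPTION: rough sequential HDD rate
pure_io_s=$((total_mb / hdd_mb_per_s))

echo "average rate during import: ${avg_kb_per_s} KB/s"   # 233 KB/s
echo "time for the I/O alone:     ${pure_io_s} s"         # 18 s
```

Under that assumption the disk would be busy for well under a minute of the two hours, which fits the point that the time goes into decompression and compression.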




p2 is a harder one!



export GDAL_CACHEMAX=1
gdal_translate -co "COMPRESS=LZW" \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif p2_test.tif



Also related?  GTIFF_DIRECT_IO, GTIFF_VIRTUAL_MEM_IO


Again, I think you are on the wrong track, disk IO does not matter here.
And according to the GDAL documentation, GTIFF_DIRECT_IO,
GTIFF_VIRTUAL_MEM_IO apply only to reading un-compressed TIFF files.




finishes in 28 minutes.


Impressive!


Hardware does not really matter here. To be precise, the difference between
GDAL 1.11.4 and 2.1.3 is impressive, thanks to the efforts of the GDAL
development team.

Regarding GDAL 2.1.3, profiling might tell why gdal_translate is so much
faster than GRASS r.in.gdal.


Thanks Markus.  Yes, on the wrong track.  Useful lessons learned.

Nikos

PS: Working in a restricted environment (as in: I cannot install
whatever I need) is not easy. Sure, I can possibly use a VM or
similar...

Re: [GRASS-user] Slow import of GHSL

2017-03-17 Thread Nikos Alexandris

[gdal-dev removed from Cc]


Nikos A:


unfortunately, because I don't have a lot of free space :-/


Markus M:


maybe because you forgot to enable compression ;-)



I should!


Markus N:


Remember that you have to explicitly switch on the NULL compression:
https://grass.osgeo.org/grass72/manuals/rasterintro.html#raster-compression


Thanks for this one too.  I guess we can't opt for a "sane" default NULL
compression. Can we? For future versions?

There are "out-of-date" versions of GRASS, in software repositories.  For 
example,
https://dl.fedoraproject.org/pub/epel/7/x86_64/g/, see for gdal and
grass.  So, this complicates things (compatibility).

Nikos

Re: [GRASS-user] Slow import of GHSL

2017-03-16 Thread Markus Neteler
On Mar 16, 2017 11:26 AM, "Nikos Alexandris" 
wrote:
...

>>> unfortunately, because I don't have a lot of free space :-/
>>
>>
>> maybe because you forgot to enable compression ;-)
>
>
> I should!

Remember that you have to explicitly switch on the NULL compression:

https://grass.osgeo.org/grass72/manuals/rasterintro.html#raster-compression
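
For readers of the archive, a minimal sketch of what switching it on could look like. It assumes the environment variables GRASS_COMPRESSOR and GRASS_COMPRESS_NULLS as described on the manual page linked above; please check that page for your GRASS version before relying on it.

```shell
#!/bin/sh
# Sketch: set these before starting GRASS (or inside the session) so that
# newly written raster maps use them. Names as understood from the manual.
export GRASS_COMPRESSOR=ZLIB     # compression method for raster data
export GRASS_COMPRESS_NULLS=1    # NULL file compression is off unless set

echo "compressor:       $GRASS_COMPRESSOR"
echo "NULL compression: $GRASS_COMPRESS_NULLS"

# Inside a GRASS session, "r.compress -p <mapname>" then reports how an
# existing map is stored.
```

Maps written with compressed NULL files may not be readable by older GRASS versions, which matters given the out-of-date packages mentioned in this thread.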

Best
markusN

Re: [GRASS-user] Slow import of GHSL

2017-03-16 Thread Nikos Alexandris

[..]

Nikos:


Some messy rough timings:
1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
for "p2.tif", each stuck at 3% for almost 14h
2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
processes with -projwin, the VRT file as an input and GeoTIFF as output,
at 40% since yesterday afternoon
3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
1), stuck at 0% of progress for more than 16h.
SSD can be seen as a "necessity".


Markus M:


Hmm, not really.


Nikos:


In a laptop (i7-4600U CPU @ 2.10GHz with 8GB of RAM with SSD) it was
progressing, in a quite acceptable manner.


Markus M:


What is the gdal version you used? I use gdal 2.1.3.


Well, yes! 2.1.3 in the laptop, 1.11.4 for the rest.


 I had to break the process,
unfortunately, because I don't have a lot of free space :-/


maybe because you forgot to enable compression ;-)


I should!


With the p1 tif and GRASS db on the same spinning HDD, and
6 other heavy processes constantly reading from and writing to that same
HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
GB as output is not that heavy on disk IO. Most of the time is spent
decompressing input and compressing output.


Is it an 1rpm disk?


p2 is a harder one!


export GDAL_CACHEMAX=1
gdal_translate -co "COMPRESS=LZW" \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif p2_test.tif



I did not emphasize it enough, but cache size was among my questions
initially.  I wrongly assumed that it can't be more than 2047 due to the
reference in :

--%<---
memory=integer
 ..
 Options: 0-2047
 ..
--->%--

I admit I did not head over to
https://trac.osgeo.org/gdal/wiki/ConfigOptions from where it is implied
that it can be much higher than 2047MB.

Can't r.in.gdal deal with memory=4096 for example (will try)? If yes,
can we update the manual(s)?

Also related?  GTIFF_DIRECT_IO, GTIFF_VIRTUAL_MEM_IO



finishes in 28 minutes.


Impressive!


you could try gdal 2.1.3, maybe 2.1.3 has a more efficient cache regarding
block-wise reading than gdal 1.11.4


Yes, I have to.

Kudos, Nikos

Re: [GRASS-user] Slow import of GHSL

2017-03-15 Thread Markus Metz
On Wed, Mar 15, 2017 at 6:03 PM, Nikos Alexandris 
wrote:
>
[...]
>
> Nikos:
>
>>> Some messy rough timings:
>>>
>>> 1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
>>> for "p2.tif", each stuck at 3% for almost 14h
>>>
>>> 2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
>>> processes with -projwin, the VRT file as an input and GeoTIFF as output,
>>> at 40% since yesterday afternoon
>>>
>>> 3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
>>> 1), stuck at 0% of progress for more than 16h.
>>>
>>> SSD can be seen as a "necessity".
>>
>>
> Markus Metz:
>
>> Hmm, not really.
>
>
> In a laptop (i7-4600U CPU @ 2.10GHz with 8GB of RAM with SSD) it was
> progressing, in a quite acceptable manner.
 What is the gdal version you used? I use gdal 2.1.3.

>  I had to break the process,
> unfortunately, because I don't have a lot of free space :-/

maybe because you forgot to enable compression ;-)

>
>> With the p1 tif and GRASS db on the same spinning HDD, and
>> 6 other heavy processes constantly reading from and writing to that same
>> HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
>> GB as output is not that heavy on disk IO. Most of the time is spent
>> decompressing input and compressing output.
>
>
> p2 is a harder one!

export GDAL_CACHEMAX=1
gdal_translate -co "COMPRESS=LZW" \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif p2_test.tif

finishes in 28 minutes.

you could try gdal 2.1.3, maybe 2.1.3 has a more efficient cache regarding
block-wise reading than gdal 1.11.4
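
One way to see why the block cache could matter for a row-by-row reader is the back-of-the-envelope sketch below. It is not a statement about what GDAL 1.11.4 or r.in.gdal actually do internally. It assumes 1 byte per cell (the data type of the GHSL files is not given in this thread) and takes the first number of "Size is 507904, 647168" as the width.

```shell
#!/bin/sh
# How many 4096x4096 tiles does one raster row cross, and how much memory
# would it take to keep all of them decompressed at once?
width=507904        # columns, assumed to be the first number of "Size is ..."
tile=4096           # block size of the source tiffs
bytes_per_cell=1    # ASSUMPTION: Byte data

tiles_per_row=$(( (width + tile - 1) / tile ))
tile_mib=$(( tile * tile * bytes_per_cell / 1024 / 1024 ))
cache_mib=$(( tiles_per_row * tile_mib ))

echo "tiles crossed by one row: $tiles_per_row"   # 124
echo "size of one tile:         ${tile_mib} MiB"  # 16 MiB
echo "cache to hold them all:   ${cache_mib} MiB" # 1984 MiB
```

If a reader walks the raster one row at a time and the cache is smaller than that, each tile may have to be decompressed again for many of the 4096 rows it covers. gdal_translate can copy block by block, which might be part of why it copes with a tiny cache.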

Best,

Markus M

Re: [GRASS-user] Slow import of GHSL

2017-03-15 Thread Nikos Alexandris


Nikos Alexandris


Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
db progress slow?


Markus M:


because it is a very large raster map: Size is 507904, 647168


Markus Neteler:


Can you elaborate a bit more? I have downloaded and checked:
That is 9835059101  bytes in 19885 files or I downloaded the wrong one
(please post an URL).



For example ,
see
GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)



"3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
there is a VRT along with overviews for it.  No overviews for the TIFFs.



For example:
GHSL_data_access_v1.3.pdf
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif

Even trying to clip, with gdal_translate, might create file(s) of
hundreds of GBs. This might be due to missing compression.



then use compression. The source tiffs use LZW with blocks of 4096x4096
cells.



The import of p1 or p2 or of the VRT file in GRASS' data base, via
r.in.gdal/r.import, does not progress at all.



Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.

Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
reduced to a mere 143 MB.


Nikos:


Some messy rough timings:

1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
for "p2.tif", each stuck at 3% for almost 14h

2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
processes with -projwin, the VRT file as an input and GeoTIFF as output,
at 40% since yesterday afternoon

3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
1), stuck at 0% of progress for more than 16h.

SSD can be seen as a "necessity".



Markus Metz:


Hmm, not really.


In a laptop (i7-4600U CPU @ 2.10GHz with 8GB of RAM with SSD) it was
progressing, in a quite acceptable manner.  I had to break the process,
unfortunately, because I don't have a lot of free space :-/


With the p1 tif and GRASS db on the same spinning HDD, and
6 other heavy processes constantly reading from and writing to that same
HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
GB as output is not that heavy on disk IO. Most of the time is spent
decompressing input and compressing output.


p2 is a harder one!


Are your r.in.gdal and gdal_translate processes running at nearly 100% CPU?
Anything slowing down the HDD(s)?


Yes, all processes, in my attempts 2 or 3 in parallel, were constantly
at 100%. RAM was not an issue.

No other heavy process in parallel.  If it matters, working on i3wm and
firefox to browse (webmail, wikis, etc).

Nikos

Re: [GRASS-user] Slow import of GHSL

2017-03-14 Thread Nikos Alexandris

[all deleted]

Here's a non-elegant way, derived out of tests. Maybe a starter for the
Wiki. Elegant would be scripted, no need to manually enter any GRASS
session.


```
# Get Eurostat's NUTS_2013_01M_SH.zip vector map
unzip NUTS_2013_01M_SH.zip && cd NUTS_2013_01M_SH/data/
grass73 -c NUTS_2013_01M_SH.shp /geo/grassdb/europe/etrs89
v.import in=NUTS_RG_01M_2013.shp out=NUTS_RG_01M_2013

# draw, view, pick & set computational region of interest, create a vector map
v.in.region out=europe_less_box

# Clip original VRT to Europe’s extent, output as VRT,
# and then add overviews (adding overviews takes time!)

# 1990
gdal_translate -of VRT \
  -projwin -3480828.507849 11465936.382472 4989400.357796 3203413.703282 \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt
gdaladdo -ro --config COMPRESS_OVERVIEW DEFLATE \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt 2 4 8 16

# 2000
gdal_translate -of VRT \
  -projwin -3480828.507849 11465936.382472 4989400.357796 3203413.703282 \
  GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38_v1_0.vrt \
  GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt
gdaladdo -ro --config COMPRESS_OVERVIEW DEFLATE \
  GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt 2 4 8 16

# 2014
gdal_translate -of VRT \
  -projwin -3480828.507849 11465936.382472 4989400.357796 3203413.703282 \
  GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38_v1_0.vrt \
  GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt
gdaladdo -ro --config COMPRESS_OVERVIEW DEFLATE \
  GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt 2 4 8 16

# Create 'epsg:3857' location for the 1990 data, set region &
# resolution (v.proj-ing existing "box" map), import raster
# (work this in three different terminals, one per year)
grass72 -c "epsg:3857" /geo/grassdb/global/wgs84_3857_1990
v.proj dbase=/geo/grassdb/europe/ location=etrs89 mapset=PERMANENT \
  in=europe_less_box out=europe_less_box
g.region -p vect=europe_less_box \
  ewres=38.218470987084757 nsres=38.218446797782505
r.import input=GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt \
  out=GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_Europe_Less \
  memory=2047 extent=region

# Repeat for 2000
grass72 -c "epsg:3857" /geo/grassdb/global/wgs84_3857_2000
v.proj dbase=/geo/grassdb/europe/ location=etrs89 mapset=PERMANENT \
  in=europe_less_box out=europe_less_box
g.region -p vect=europe_less_box \
  ewres=38.218470987084757 nsres=38.218446797782505
r.import input=GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt \
  out=GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38_v1_0_Europe_Less \
  memory=2047 extent=region

# Repeat for 2014
grass72 -c "epsg:3857" /geo/grassdb/global/wgs84_3857_2014
v.proj dbase=/geo/grassdb/europe/ location=etrs89 mapset=PERMANENT \
  in=europe_less_box out=europe_less_box
g.region -p vect=europe_less_box \
  ewres=38.218470987084757 nsres=38.218446797782505
r.import input=GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38_v1_0_Europe_Less.vrt \
  out=GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38_v1_0_Europe_Less \
  memory=2047 extent=region
```
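
A first step towards the scripted variant could be a loop over the three epochs for the GDAL part. This is only a sketch: the helper names `run` and `clip_year` and the `DRY_RUN` switch are made up for illustration, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: clip each epoch's VRT to the Europe window and add overviews.
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"
PROJWIN="-3480828.507849 11465936.382472 4989400.357796 3203413.703282"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

clip_year() {
    base="GHS_BUILT_LDS${1}_GLOBE_R2016A_3857_38_v1_0"
    # $PROJWIN is left unquoted on purpose so it splits into four arguments
    run gdal_translate -of VRT -projwin $PROJWIN \
        "${base}.vrt" "${base}_Europe_Less.vrt"
    run gdaladdo -ro --config COMPRESS_OVERVIEW DEFLATE \
        "${base}_Europe_Less.vrt" 2 4 8 16
}

for y in 1990 2000 2014; do
    clip_year "$y"
done
```

The GRASS part (creating the locations and running r.import) is left out on purpose, since it still needs a session per location.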

Nikos

Re: [GRASS-user] Slow import of GHSL

2017-03-14 Thread Nikos Alexandris

* Markus Metz  [2017-03-14 15:02:30 +0100]:


On Tue, Mar 14, 2017 at 10:01 AM, Nikos Alexandris 
wrote:


Nikos Alexandris


Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
db progress slow?



Markus M



because it is a very large raster map: Size is 507904, 647168




(Apologies for cross-posting to gdal-dev)



Markus Neteler:


Can you elaborate a bit more? I have downloaded and checked:

That is 9835059101  bytes in 19885 files or I downloaded the wrong one
(please post an URL).



For example ,

see

GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)


GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)



"3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
there is a VRT along with overviews for it.  No overviews for the TIFFs.

For example:

GHSL_data_access_v1.3.pdf
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif


Even trying to clip, with gdal_translate, might create file(s) of
hundreds of GBs. This might be due to missing compression.




then use compression. The source tiffs use LZW with blocks of 4096x4096
cells.





The import of p1 or p2 or of the VRT file in GRASS' data base, via
r.in.gdal/r.import, does not progress at all.




Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.

Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
reduced to a mere 143 MB.



Some messy rough timings:

1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
for "p2.tif", each stuck at 3% for almost 14h

2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
processes with -projwin, the VRT file as an input and GeoTIFF as output,
at 40% since yesterday afternoon

3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
1), stuck at 0% of progress for more than 16h.

SSD can be seen as a "necessity".


Hmm, not really. With the p1 tif and GRASS db on the same spinning HDD, and
6 other heavy processes constantly reading from and writing to that same
HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
GB as output is not that heavy on disk IO. Most of the time is spent
decompressing input and compressing output.

Are your r.in.gdal and gdal_translate processes running at nearly 100% CPU?
Anything slowing down the HDD(s)?

Markus M


Ehm, maybe GDAL version 1.11.4? Just realised!
Working in restricted environment, time spent to configure things.
Will update...

Nikos

Re: [GRASS-user] Slow import of GHSL

2017-03-14 Thread Markus Metz
On Tue, Mar 14, 2017 at 10:01 AM, Nikos Alexandris 
wrote:
>
> Nikos Alexandris
>
>>> Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
>>> layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
>>> db progress slow?
>
>
> Markus M
>
>
>> because it is a very large raster map: Size is 507904, 647168
>
>
>>> (Apologies for cross-posting to gdal-dev)
>
>
> Markus Neteler:
>
>>>> Can you elaborate a bit more? I have downloaded and checked:
>>>>
>>>> That is 9835059101  bytes in 19885 files or I downloaded the wrong one
>>>> (please post an URL).
>>>
>>>
>>> For example ,
>>>
>>> see
>>>
>>> GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)
>>>
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
>>> GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
>>> GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)
>>>
>>>
>>> "3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
>>> there is a VRT along with overviews for it.  No overviews for the TIFFs.
>>>
>>> For example:
>>>
>>> GHSL_data_access_v1.3.pdf
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
>>> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif
>>>
>>>
>>> Even trying to clip, with gdal_translate, might create file(s) of
>>> hundreds of GBs. This might be due to missing compression.
>
>
>> then use compression. The source tiffs use LZW with blocks of 4096x4096
>> cells.
>
>
>
>>> The import of p1 or p2 or of the VRT file in GRASS' data base, via
>>> r.in.gdal/r.import, does not progress at all.
>
>
>> Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
>> took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.
>>
>> Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
>> reduced to a mere 143 MB.
>
>
> Some messy rough timings:
>
> 1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
> for "p2.tif", each stuck at 3% for almost 14h
>
> 2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
> processes with -projwin, the VRT file as an input and GeoTIFF as output,
> at 40% since yesterday afternoon
>
> 3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
> 1), stuck at 0% of progress for more than 16h.
>
> SSD can be seen as a "necessity".

Hmm, not really. With the p1 tif and GRASS db on the same spinning HDD, and
6 other heavy processes constantly reading from and writing to that same
HDD, r.in.gdal took 2h 13min to import the p1 tif. 360 MB as input and 1.5
GB as output is not that heavy on disk IO. Most of the time is spent
decompressing input and compressing output.

Are your r.in.gdal and gdal_translate processes running at nearly 100% CPU?
Anything slowing down the HDD(s)?

Markus M

Re: [GRASS-user] Slow import of GHSL

2017-03-14 Thread Nikos Alexandris

Nikos Alexandris


Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
db progress slow?


Markus M


because it is a very large raster map: Size is 507904, 647168



(Apologies for cross-posting to gdal-dev)


Markus Neteler:


Can you elaborate a bit more? I have downloaded and checked:

That is 9835059101  bytes in 19885 files or I downloaded the wrong one
(please post an URL).


For example ,

see

GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)

GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)


"3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
there is a VRT along with overviews for it.  No overviews for the TIFFs.

For example:

GHSL_data_access_v1.3.pdf
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif


Even trying to clip, with gdal_translate, might create file(s) of
hundreds of GBs. This might be due to missing compression.



then use compression. The source tiffs use LZW with blocks of 4096x4096
cells.




The import of p1 or p2 or of the VRT file in GRASS' data base, via
r.in.gdal/r.import, does not progress at all.



Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.

Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
reduced to a mere 143 MB.


Some messy rough timings:

1) i7, 8 cores, 32GB RAM, Base OS: CentOS -> Three r.in.gdal processes
for "p2.tif", each stuck at 3% for almost 14h

2) Xeon, 24 Cores, 32GB RAM, Base OS: Windows -> Three gdal_translate
processes with -projwin, the VRT file as an input and GeoTIFF as output,
at 40% since yesterday afternoon

3) Xeon, 12 Cores, ? RAM, Base OS: CentOS.jpg -> Same processes as in
1), stuck at 0% of progress for more than 16h.

SSD can be seen as a "necessity".

Nikos

[rest deleted]

Re: [GRASS-user] Slow import of GHSL

2017-03-11 Thread Markus Metz
On Sat, Mar 11, 2017 at 8:53 AM, Nikos Alexandris 
wrote:
>
> Nikos Alexandris
>
>>> Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
>>> layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
>>> db progress slow?

Because it is a very large raster map: Size is 507904, 647168
>
>
> (Apologies for cross-posting to gdal-dev)
>
> Markus Neteler:
>
>> Can you elaborate a bit more? I have downloaded and checked:
>>
>> That is 9835059101  bytes in 19885 files or I downloaded the wrong one
>> (please post an URL).
>
> For example ,
>
> see
>
> GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB)
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB)
> GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB)
> GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)
>
> "3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
> there is a VRT along with overviews for it.  No overviews for the TIFFs.
>
> For example:
>
> GHSL_data_access_v1.3.pdf
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
> GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif
>
>
> Even trying to clip, with gdal_translate, might create file(s) of
> hundreds of GBs. This might be due to missing compression.

then use compression. The source tiffs use LZW with blocks of 4096x4096
cells.
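
A rough sketch of what a compressed clip could look like, mirroring the
LZW compression and 4096x4096 blocks of the source TIFFs. The -projwin
values are the Europe extent quoted earlier in the thread; the output
file name is hypothetical, and BIGTIFF=YES is an assumption in case the
result exceeds 4 GB. The script only prints the command unless RUN=1 is
set, so it can be checked before committing to a long run.

```shell
#!/bin/sh
# Sketch only: prints the command unless RUN=1 is set in the environment.
src="GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt"
dst="ghsl_1990_europe_lzw.tif"   # hypothetical output name

# Build the command as positional parameters so it can be shown or executed.
# -co options are GTiff creation options: LZW compression, tiled layout,
# 4096x4096 tiles (as in the source files), BigTIFF container.
set -- gdal_translate \
    -projwin -6290123.623699 8170181.584331 8115874.019718 2788074.747995 \
    -co COMPRESS=LZW -co TILED=YES \
    -co BLOCKXSIZE=4096 -co BLOCKYSIZE=4096 \
    -co BIGTIFF=YES \
    "$src" "$dst"

clip_cmd="$*"
echo "$clip_cmd"
if [ "${RUN:-0}" = 1 ]; then "$@"; fi
```

Run it as "RUN=1 sh clip.sh" (script name hypothetical) to actually
execute gdal_translate.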

>
> The import of p1 or p2 or of the VRT file in GRASS' data base, via
> r.in.gdal/r.import, does not progress at all.

Importing GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif with r.in.gdal
took 1:31 hours on a laptop with SSD. The resultant cell file was 1.5 GB.

Recompressing with BZIP2 took 2:20 hours and the size of the cell file was
reduced to a mere 143 MB.
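
For reference, a minimal sketch of how such an import and recompression
might be scripted inside a GRASS session. The exact commands used for
the timings above are not stated here, so this is an assumption: it
relies on r.in.gdal's "memory" option, the GRASS_COMPRESSOR environment
variable and r.compress, and the map name is hypothetical. Commands are
only printed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless RUN=1 is set.
# Must be run inside a GRASS GIS session for the commands to exist.
map="ghsl_built_1990_p1"   # hypothetical map name
tif="GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif"

# Print a command, and execute it only when RUN=1.
run() {
    echo "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

# Import with the largest GDAL cache mentioned in this thread (in MB).
run r.in.gdal input="$tif" output="$map" memory=2047

# Select BZIP2 for raster compression, then rewrite the imported map.
GRASS_COMPRESSOR=BZIP2
export GRASS_COMPRESSOR
run r.compress map="$map"
```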

> (
> Side question: why is max 2047?  What if there is a lot more of RAM?
> )
To avoid integer overflow because 2047 is converted to bytes with 2047 *
1024 * 1024.
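
The arithmetic can be checked quickly in a shell. This assumes the byte
count ends up in a signed 32-bit integer (maximum 2^31 - 1), which is
what the overflow remark implies; the variable names are only for
illustration, and the shell itself is assumed to do 64-bit arithmetic.

```shell
#!/bin/sh
# Assumption: the cache size in bytes is held in a signed 32-bit integer.
int_max=2147483647                      # 2^31 - 1
bytes_2047=$(( 2047 * 1024 * 1024 ))    # largest accepted value, in bytes
bytes_2048=$(( 2048 * 1024 * 1024 ))    # one MB more

echo "2047 MB = $bytes_2047 bytes"
echo "2048 MB = $bytes_2048 bytes"

# 2047 MB still fits below the limit; 2048 MB is exactly 2^31 and does not.
if [ "$bytes_2047" -le "$int_max" ]; then echo "2047 MB fits"; fi
if [ "$bytes_2048" -gt "$int_max" ]; then
    echo "2048 MB exceeds the limit by $(( bytes_2048 - int_max ))"
fi
```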

Markus M

Re: [GRASS-user] Slow import of GHSL

2017-03-10 Thread Nikos Alexandris

Nikos Alexandris


Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
db progress slow?


(Apologies for cross-posting to gdal-dev)

Markus Neteler:


Can you elaborate a bit more? I have downloaded and checked:

That is 9835059101  bytes in 19885 files or I downloaded the wrong one
(please post an URL).


I have already suggested to them to provide, for each data set, a single
"pool" directory containing just the zipped data and the license.

For example ,


Similar GHSL data sets vary between 300 ~ 500 MB in size.


see

GHS_BUILT_LDS1975_GLOBE_R2016A_3857_38 (768MB) 
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38 (854MB) 
GHS_BUILT_LDS2000_GLOBE_R2016A_3857_38 (892MB) 
GHS_BUILT_LDS2014_GLOBE_R2016A_3857_38 (900MB)


"3857" is the EPSG code.  They are split in two GeoTIFFs (p1, p2) and
there is a VRT along with overviews for it.  No overviews for the TIFFs.

For example:

GHSL_data_access_v1.3.pdf
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.clr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt.ovr
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif
GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p2.tif


Even clipping with gdal_translate can create files of hundreds of GB.
This might be due to missing compression. Even so, the derived files,
which cover only a subset of the extent, are enormous compared to their
source, say p1 or p2.

Creating a new VRT is, of course, instantaneous. For example:

```
# some custom Europe's extent
ogrinfo -al europe_extent_epsg_3857/corine_2000.shp | grep Ext
# Extent: (-6290123.623699, 2788074.747995) - (8115874.019718, 8170181.584331)

# extract the above subset in a new VRT
gdal_translate -of VRT \
  -projwin -6290123.623699 8170181.584331 8115874.019718 2788074.747995 \
  GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0.vrt test.vrt

# build some overviews for it (or for the p1 or p2 GeoTIFFs) -- slow for all options
gdaladdo -ro --config COMPRESS_OVERVIEW LZW test.vrt 2 4 8 16
```

For anything other than a VRT file, the subset extraction is very slow.
The files are, in practice, hard to process: one needs to wait several
hours for a clip.

The import of p1 or p2 or of the VRT file in GRASS' data base, via
r.in.gdal/r.import, does not progress at all.


Yes - do you have a SSD disk? This quite helps along with a
sufficiently large GDAL cache ("memory" parameter of r.in.gdal).


Among tests, I had set that to 2047. No obvious improvement.


As well, trying to clip the GeoTIFFs (not the VRT files) with gdal
tools to a custom extent (say Europe), appears to be a heavy process.



With GDAL, be sure to have set something like
export GDAL_CACHEMAX=2000


(
Side question: why is max 2047?  What if there is a lot more of RAM?
)


HTH,
Markus


Thank you, Markus. I think there is more to it than the cache.

Nikos


[0] http://ghsl.jrc.ec.europa.eu/


Re: [GRASS-user] Slow import of GHSL

2017-03-10 Thread Markus Neteler
On Fri, Mar 10, 2017 at 5:47 PM, Nikos Alexandris
 wrote:
> Why does (attempting to) import a 38m pixel resolution GHSL [0] GeoTIFF
> layer, ie GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, in GRASS'
> db progress slow?

Can you elaborate a bit more? I have downloaded and checked:

That is 9835059101 bytes in 19885 files, or did I download the wrong one?
(Please post a URL.)

> Similar GHSL data sets vary between 300 ~ 500 MB in size.

Yes - do you have an SSD? That helps quite a bit, along with a
sufficiently large GDAL cache ("memory" parameter of r.in.gdal).

> As well, trying to clip the GeoTIFFs (not the VRT files) with gdal
> tools to a custom extent (say Europe), appears to be a heavy process.

With GDAL, be sure to have set something like
export GDAL_CACHEMAX=2000

HTH,
Markus

> Thanks for hints, Nikos
>
> [0] http://ghsl.jrc.ec.europa.eu/

[GRASS-user] Slow import of GHSL

2017-03-10 Thread Nikos Alexandris

Why is the import of a 38 m pixel resolution GHSL [0] GeoTIFF layer,
i.e. GHS_BUILT_LDS1990_GLOBE_R2016A_3857_38_v1_0_p1.tif, into GRASS'
database so slow?

Similar GHSL data sets vary between 300 and 500 MB in size.

Also, clipping the GeoTIFFs (not the VRT files) with GDAL tools to a
custom extent (say Europe) appears to be a heavy process.

Thanks for hints, Nikos

[0] http://ghsl.jrc.ec.europa.eu/