According to the docs, GRASS_INT_ZLIB can be set to use
zlib instead of RLE compression for integer rasters. But as
far as I can see, there is not yet a way to completely
prevent rasters from being compressed at creation time.

Ben

On 04/25/2012 09:24 AM, Maris Nartiss wrote:
There is already a GRASS_INT_ZLIB env variable. There should
be only one env variable to enable/disable raster compression.

Just my 0.002 Verizon cents.
Maris.

On 24 April 2012 at 18:02, Jim Regetz <[email protected]> wrote:
Chagrined by a performance hit apparently involving zlib compression, I
patched my local GRASS 7.0 to accept an environment variable that disables
raster compression. At least for the particular DCELL rasters I've been
using, this yields a ~5x improvement in run time during write operations, at
a cost of some extra disk usage that I'm often more than happy to incur. See
sample timing outputs below my sig.

Admittedly, the speedup factor drops to ~2.5x if the timing comparisons
include a forced sync to disk, because uncompressed output means more IO.
But that's still a nice speedup, and the disk IO cost may be of little
consequence in cases where the raster can fit comfortably in the OS page
cache and is an intermediate output that gets read back in during a
subsequent step of a particular processing workflow (and perhaps then
removed before ever being flushed to disk).

My demo-purposes patch is attached. It just adds a GRASS_NO_COMPRESSION
environment variable and then injects a new conditional dispatch into each
of the three Rast_open{_,_fp_,_c_}new functions. For cleaner semantics, it
might be better to keep the original functions but rename them as
*_compressed (paralleling the existing *_uncompressed versions) for callers
who really want/need to force compression (e.g., r.compress, which my patch
in some sense "breaks" when the environment variable is set), but I didn't
do this here. And I haven't looked hard to see if other modules/etc truly
depend on the existing compression behavior.
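
The conditional dispatch described above might look roughly like the following. This is only a sketch of the logic, written in Python for brevity; it is not the attached C patch. The two opener functions are hypothetical stand-ins for a compressing and a non-compressing Rast_open variant, and treating the mere presence of GRASS_NO_COMPRESSION as "disable" is an assumption based on the export/unset usage in the timings.

```python
import os


def open_compressed(name):
    # Hypothetical stand-in for the normal, compressing open call.
    return ("compressed", name)


def open_uncompressed(name):
    # Hypothetical stand-in for an *_uncompressed open call.
    return ("uncompressed", name)


def open_new(name, environ=None):
    """Choose an opener based on GRASS_NO_COMPRESSION.

    Assumed semantics: compression is skipped whenever the variable
    is present in the environment, whatever its value.
    """
    env = os.environ if environ is None else environ
    if "GRASS_NO_COMPRESSION" in env:
        return open_uncompressed(name)
    return open_compressed(name)
```

With this shape, a caller that must always compress (r.compress, for example) would need to call the compressing opener directly rather than go through the dispatching wrapper.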

Any chance something like this could make it into trunk?

As a real world example, I recently wrote a Python module that relies on
r.mapcalc, r.neighbors, and r.samp.stats. With GRASS_NO_COMPRESSION set,
total runtime dropped from 20 minutes to 10 minutes on a 12K by 12K input
raster, with a disk usage differential that peaked at ~4GB during
processing. Outputs were identical other than compression.
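
For a Python workflow like this one, the variable could be toggled per call instead of exported for the whole session. The helper below is a minimal sketch; its name is hypothetical, and GRASS_NO_COMPRESSION only has an effect on a GRASS build that carries the patch.

```python
import os


def grass_env(no_compression, base=None):
    """Return a copy of the environment with GRASS_NO_COMPRESSION
    set (no_compression=True) or removed (no_compression=False)."""
    env = dict(os.environ if base is None else base)
    if no_compression:
        env["GRASS_NO_COMPRESSION"] = "1"
    else:
        env.pop("GRASS_NO_COMPRESSION", None)
    return env


# Usage inside a running GRASS session (not executed here):
# import subprocess
# subprocess.run(["r.mapcalc", "foo = test", "--quiet"],
#                env=grass_env(True), check=True)
```

Passing a copied environment keeps the setting local to intermediate steps, so final outputs can still be written compressed.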

Cheers,
Jim

------------------------------
James Regetz, Ph.D.
Scientific Programmer/Analyst
National Center for Ecological Analysis & Synthesis
735 State St, Suite 300
Santa Barbara, CA 93101


# timings performed on Ubuntu 10.04 with ample RAM and a recent
# build of GRASS 7.0-svn with the applied patch

# describe the 'test' raster used below; based on some 90m SRTM
# data coerced to double precision
GRASS 7.0.svn (tmp):~> r.info -g test
...
rows=4801
cols=4801
cells=23049601
datatype=DCELL

GRASS 7.0.svn (tmp):~> r.univar test
total null and non-null cells: 23049601
total null cells: 0

Of the non-null cells:
----------------------
n: 23049601
minimum: 500
maximum: 3139
range: 2639
mean: 1445.04
mean of absolute values: 1445.04
standard deviation: 336.437
...


# using (default) zlib compression on write
GRASS 7.0.svn (tmp):~> g.gisenv set="OVERWRITE=1"
GRASS 7.0.svn (tmp):~> g.region rast=test
GRASS 7.0.svn (tmp):~> unset GRASS_NO_COMPRESSION
GRASS 7.0.svn (tmp):~> sync; echo 3 > /proc/sys/vm/drop_caches
GRASS 7.0.svn (tmp):~> time r.mapcalc "foo = test" --quiet

real    0m13.209s
user    0m12.660s
sys     0m0.400s


# after disabling compression on write
GRASS 7.0.svn (tmp):~> g.gisenv set="OVERWRITE=1"
GRASS 7.0.svn (tmp):~> g.region rast=test
GRASS 7.0.svn (tmp):~> export GRASS_NO_COMPRESSION=1
GRASS 7.0.svn (tmp):~> sync; echo 3 > /proc/sys/vm/drop_caches
GRASS 7.0.svn (tmp):~> time r.mapcalc "foo = test" --quiet

real    0m2.514s
user    0m2.320s
sys     0m0.170s
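
The speedup implied by the two "real" times above can be checked with a few lines of Python. The helper name is made up for this illustration, and it only handles the `NmS.SSSs` format that `time` prints here.

```python
import re


def real_seconds(stamp):
    """Convert a `time` value such as '0m13.209s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


compressed = real_seconds("0m13.209s")    # default zlib compression
uncompressed = real_seconds("0m2.514s")   # GRASS_NO_COMPRESSION=1
speedup = compressed / uncompressed
print(f"speedup: {speedup:.2f}x")         # prints "speedup: 5.25x"
```

That ratio matches the ~5x figure quoted for write operations without a forced sync to disk.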


_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev



--
Benjamin Ducke
{*} Geospatial Consultant
{*} GIS Developer

  [email protected]