Using a bigger file (8 GB) on a machine with 64 GB of RAM, the increase is even
larger, for both gdal.Warp and gdal.Translate:
```
 97   112.2 MiB     0.0 MiB   logging.debug(kwargs)
 98   691.5 MiB   579.3 MiB   gdal.Warp(temp.name, input_path, **kwargs)
 99   691.5 MiB     0.0 MiB   logging.debug('Compressing image...')
100  3943.1 MiB  3251.6 MiB   gdal.Translate(output_path, temp.name, creationOptions=copts, callback=progress_logging('Compressing image', one_is_max=True))

 97   112.2 MiB     0.0 MiB   logging.debug(kwargs)
 98   691.5 MiB   579.3 MiB   gdal.Warp(temp.name, input_path, **kwargs)
100  3943.1 MiB  3251.6 MiB   gdal.Translate(output_path, temp.name, creationOptions=copts)
```
On 26 Dec 2019, at 15:26, Evert Etienne (SITEMARK) <[email protected]> wrote:
Hi all,
I have a question about the memory usage of the Python GDAL bindings. For some
GDAL calls (Python or not), we try to optimise the GDAL cache. While doing this,
I noticed that free RAM decreases after GDAL operations. I have been able to
narrow it down to the Python bindings. Using `memory_profiler`
(https://pypi.org/project/memory-profiler/) I get the output below.
The first column represents the line number of the code that has been profiled,
the second column (Mem usage) the memory usage of the Python interpreter after
that line has been executed. The third column (Increment) represents the
difference in memory of the current line with respect to the last one. The last
column (Line Contents) prints the code that has been profiled.
```
101    65.4 MiB     0.0 MiB   logging.debug(kwargs)
102   203.9 MiB   138.4 MiB   gdal.Warp(temp.name, input_path, **kwargs)
```
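To make the column semantics concrete, here is a minimal sketch that recomputes the Increment column from the Mem usage column. The row pattern and the helper names `parse_rows` and `recompute_increments` are assumptions made for illustration; they are not part of `memory_profiler`. The two rows are copied from the output above, with the wrapped `gdal.Warp` line joined.

```python
import re

# Assumed row layout: line number, "<mem> MiB", "<increment> MiB", source code.
ROW = re.compile(r"^\s*(\d+)\s+([\d.]+) MiB\s+([\d.]+) MiB\s+(.*)$")


def parse_rows(report):
    """Return (line_number, mem_usage_mib, source) for each profiled row."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)), float(match.group(2)), match.group(4)))
    return rows


def recompute_increments(rows):
    """Increment of a row = its memory usage minus that of the previous row."""
    result = []
    for (_, previous_mem, _), (line_number, current_mem, _) in zip(rows, rows[1:]):
        result.append((line_number, round(current_mem - previous_mem, 1)))
    return result


report = """\
101 65.4 MiB 0.0 MiB logging.debug(kwargs)
102 203.9 MiB 138.4 MiB gdal.Warp(temp.name, input_path, **kwargs)
"""

rows = parse_rows(report)
increments = recompute_increments(rows)
print(increments)
```

The recomputed value can differ from the printed increment by 0.1 MiB, presumably because both printed columns are rounded to one decimal.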
The following test suggests it is related to the cache, but only partially.
Since every file is on disk, I would expect these calls to have no lasting
effect on memory usage.
```
 98    65.4 MiB     0.0 MiB   gdal.SetCacheMax(0)
 99    87.8 MiB    22.4 MiB   gdal.Warp(temp.name, input_path, **kwargs)
```
`temp.name` is the path of a `tempfile.NamedTemporaryFile('w+')`
(`/var/folders/3t/_j9hh3_907g646cgt8pkkjch0000gn/T/tmpumywovz7`). The passed
kwargs are `{'dstSRS': 'EPSG:3857', 'resampleAlg': 2, 'format': 'gtiff',
'multithread': True, 'warpOptions': ['NUM_THREADS=ALL_CPUS'],
'creationOptions': ['BIGTIFF=YES', 'NUM_THREADS=ALL_CPUS']}`. The input file is
84.5 MB.
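Assigning the result to a variable and deleting it does not change the picture.
Memory usage grows further while the dataset is held and drops again after
deletion, but not back to the baseline. I assume the part that is released is
the dataset size.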
```
96    65.4 MiB     0.0 MiB   logging.debug(kwargs)
97   249.8 MiB   184.4 MiB   ds = gdal.Warp(temp.name, input_path, **kwargs)
98   193.8 MiB     0.0 MiB   del ds
```
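The arithmetic behind that observation can be written out as a small sketch. It only uses the three Mem usage figures from the profile above; the variable names are illustrative.

```python
# Mem usage figures (MiB) copied from the profile above.
baseline = 65.4    # before gdal.Warp is called
with_ds = 249.8    # while `ds` is still referenced
after_del = 193.8  # after `del ds`

# Memory given back when the dataset object is deleted.
released = round(with_ds - after_del, 1)

# Memory still held compared to the starting point.
retained = round(after_del - baseline, 1)

print(f"released by del: {released} MiB, still retained: {retained} MiB")
```

Deleting the dataset gives back about 56 MiB, while roughly 128 MiB stays above the starting point.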
Am I overlooking a cause for this memory increase, or is there a way to release
this memory?
Am I correct to assume that using the GDAL Python bindings in this way (all
files are on disk) should have barely any effect on the script's memory usage?
Thanks in advance.
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev