I've played with VirtualAlloc(NULL, SINGLE_ALLOC_SIZE, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE), and it does avoid the performance issue. However, I see that VirtualAlloc() allocates in chunks of 64 kB, so depending on the size of a block, it might waste a significant amount of RAM, and so it can't be used as a direct replacement for malloc().
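
To put a number on that overhead, here is a minimal sketch of the rounding arithmetic. It assumes every allocation occupies a whole multiple of the 64 kB granularity mentioned above; the helper names are hypothetical, and the granularity is hard-coded rather than queried from the system.

```c
#include <assert.h>
#include <stddef.h>

/* Granularity reported in the message above (64 kB). Hard-coded here as an
   assumption; on Windows it can be queried with GetSystemInfo(). */
#define ALLOC_GRANULARITY ((size_t)64 * 1024)

/* Hypothetical helper: size of the range consumed by one block if every
   allocation is rounded up to a multiple of `granularity`. */
static size_t rounded_footprint(size_t block_size, size_t granularity)
{
    assert(granularity > 0);
    return (block_size + granularity - 1) / granularity * granularity;
}

/* Hypothetical helper: bytes left unused at the end of that range. */
static size_t wasted_bytes(size_t block_size, size_t granularity)
{
    return rounded_footprint(block_size, granularity) - block_size;
}
```

For a block of 4096 * 4 + 1 = 16385 bytes, this gives 65536 - 16385 = 49151 unused bytes, roughly three quarters of the range.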

My inclination would be to perhaps have an optional config option like GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP; when set, it would use HeapCreate(0, 0, GDAL_CACHEMAX) to create a heap only used by the block cache. Not ideal, since that would reserve the whole GDAL_CACHEMAX (but for a large enough processing job, you'll end up consuming it anyway), but it has the advantage of not being extremely intrusive either... and could be easily ditched/replaced by something better in the future.
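
A rough sketch of what such a private heap wrapper could look like follows. The BlockCacheHeap type and the block_cache_heap_* functions are hypothetical names, not existing GDAL code, and max_bytes stands in for GDAL_CACHEMAX. Only the Windows branch uses HeapCreate(); other platforms fall back to plain malloc()/free() so that the sketch compiles everywhere.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper around a heap reserved for the block cache.
   None of these names exist in GDAL. */
typedef struct {
    void *handle; /* HANDLE from HeapCreate() on Windows, unused elsewhere */
} BlockCacheHeap;

/* max_bytes plays the role of GDAL_CACHEMAX. Returns 1 on success. */
static int block_cache_heap_init(BlockCacheHeap *heap, size_t max_bytes)
{
#ifdef _WIN32
    heap->handle = HeapCreate(0, 0, max_bytes);
    return heap->handle != NULL;
#else
    (void)max_bytes; /* fallback: plain malloc()/free(), no private heap */
    heap->handle = NULL;
    return 1;
#endif
}

static void *block_cache_heap_alloc(BlockCacheHeap *heap, size_t size)
{
#ifdef _WIN32
    return HeapAlloc(heap->handle, 0, size);
#else
    (void)heap;
    return malloc(size);
#endif
}

static void block_cache_heap_free(BlockCacheHeap *heap, void *ptr)
{
#ifdef _WIN32
    if (ptr != NULL)
        HeapFree(heap->handle, 0, ptr);
#else
    (void)heap;
    free(ptr);
#endif
}

/* On Windows, releases the heap and every block still allocated from it. */
static void block_cache_heap_destroy(BlockCacheHeap *heap)
{
#ifdef _WIN32
    if (heap->handle != NULL)
        HeapDestroy(heap->handle);
#endif
    heap->handle = NULL;
}
```

One caveat worth checking before adopting this: according to the HeapCreate documentation, a heap created with a non-zero maximum size cannot grow, and individual blocks are limited to slightly less than 512 KB in a 32-bit process and slightly less than 1,024 KB in a 64-bit process, so larger cache blocks would need another allocation path.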

Regarding tcmalloc, I've had to use it on Linux too, but only in scenarios involving multithreading, where it helps reduce RAM fragmentation: cf https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading . I've just quickly tried to use it on Windows to test it on this scenario, but didn't really manage to make it work. Even building it was challenging. Actually I tried https://github.com/gperftools/gperftools and I had to build from master since the latest tagged version doesn't build with CMake on Windows. But then nothing happens when linking tcmalloc_minimal.lib against my toy app. I probably missed something.

Anyway, I don't really think we can force tcmalloc to be used in GDAL, as a library. Unless there is a way to have its allocator be optionally used at places that we control (i.e. explicitly call tc_malloc / tc_free), without replacing the default malloc / free etc., which might be undesirable when GDAL is just a component of a larger application.

Disabling the block cache entirely (or setting it to a minimum value) is only a workable option for uncompressed formats, or if you use per-band blocks (INTERLEAVE=BAND in GTiff language) and not one block for all bands (INTERLEAVE=PIXEL); otherwise you'll pay for the decompression multiple times.

On 21/03/2024 at 14:38, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] via gdal-dev wrote:

+1. We use a variety of hand-rolled VirtualAlloc-based allocators (a simple pointer bump for basic tasks, and a ‘buddy’ allocator for more elaborate needs), some of which try to be smart about memory usage via de-committing regions. In our work, we tend to disable the GDAL cache entirely and rely on the file system’s file cache instead, which is a simplification we can make but is surely untenable in general here.
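
For readers unfamiliar with the term, a pointer-bump allocator can be sketched as below. This is not the code referred to above: the BumpArena type and bump_* functions are hypothetical, and the backing region comes from malloc() to stay portable, whereas a Windows version could reserve it with VirtualAlloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pointer-bump arena backed by one malloc()'ed region. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} BumpArena;

/* Returns 1 on success, 0 if the backing region could not be allocated. */
static int bump_init(BumpArena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out `size` bytes at an offset aligned to `align` (a power of two,
   measured from the arena base), or NULL when the arena is full. */
static void *bump_alloc(BumpArena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Individual blocks are never freed; the whole arena is recycled at once. */
static void bump_reset(BumpArena *a)
{
    a->used = 0;
}

static void bump_destroy(BumpArena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

The point of the design is that freeing costs nothing per block: bump_reset() or bump_destroy() releases everything in one step, which sidesteps the per-allocation free() cost discussed in this thread.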

*From: *gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of Abel Pau via gdal-dev <gdal-dev@lists.osgeo.org>
*Reply-To: *Abel Pau <a....@creaf.uab.cat>
*Date: *Thursday, March 21, 2024 at 4:51 AM
*To: *"gdal-dev@lists.osgeo.org" <gdal-dev@lists.osgeo.org>
*Subject: *[EXTERNAL] [BULK] Re: [gdal-dev] Experience with slowness of free() on Windows with lots of allocations?

Hi Even,

You’re right. We also know that. When programming the driver, I took it into consideration. Our solution is not to rely on Windows to do a good job with memory; we try to reuse as much memory as possible instead of calling calloc/free freely.

For instance, in the driver, for each feature I have to read or write the coordinates. I could allocate memory for reading, put the coordinates on the feature, and then free the memory every single time, which means a great many times. What do I do instead? When opening the layer, I create some memory blocks of 250 MB (due to the format itself) and I use that memory to manage whatever I need. And when closing, I free it.
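
The pattern described above (allocate when the layer opens, reuse for every feature, free when it closes) can be sketched as follows. This is not the actual driver code: the LayerScratch type and layer_scratch_* functions are hypothetical, and the buffer is assumed to hold coordinates as doubles.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-layer scratch buffer; not the actual driver code. */
typedef struct {
    double *coords;  /* reused for every feature */
    size_t capacity; /* number of doubles currently allocated */
} LayerScratch;

/* Called once when the layer is opened. Returns 1 on success. */
static int layer_scratch_open(LayerScratch *s, size_t initial_count)
{
    if (initial_count == 0)
        initial_count = 1;
    if (initial_count > SIZE_MAX / sizeof(double)) {
        s->coords = NULL;
        s->capacity = 0;
        return 0;
    }
    s->coords = malloc(initial_count * sizeof(double));
    s->capacity = s->coords ? initial_count : 0;
    return s->coords != NULL;
}

/* Called per feature: returns a buffer able to hold `count` doubles,
   growing it only when the current capacity is too small. */
static double *layer_scratch_get(LayerScratch *s, size_t count)
{
    if (count > s->capacity) {
        if (count > SIZE_MAX / sizeof(double))
            return NULL;
        double *grown = realloc(s->coords, count * sizeof(double));
        if (grown == NULL)
            return NULL; /* the old buffer is still valid and owned by s */
        s->coords = grown;
        s->capacity = count;
    }
    return s->coords;
}

/* Called once when the layer is closed. */
static void layer_scratch_close(LayerScratch *s)
{
    free(s->coords);
    s->coords = NULL;
    s->capacity = 0;
}
```

With this shape, reading a million features costs one malloc(), a handful of realloc() calls, and one free(), instead of a million allocation/free pairs.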

While doing that, I observed that sometimes I have to use GDAL code that doesn’t take this into consideration (CPLRecode(), for instance). Perhaps it could be improved as well.

Thanks for noticing that.

*From:* gdal-dev <gdal-dev-boun...@lists.osgeo.org> *On behalf of* Javier Jimenez Shaw via gdal-dev
*Sent:* Thursday, 21 March 2024 8:27
*To:* Even Rouault <even.roua...@spatialys.com>
*CC:* gdal dev <gdal-dev@lists.osgeo.org>
*Subject:* Re: [gdal-dev] Experience with slowness of free() on Windows with lots of allocations?

In my company we confirmed that "Windows heap allocation mechanism sucks."

Closing the application after using the GTiff driver can take many seconds due to memory deallocations.

One workaround was to use tcmalloc. I will ask my colleagues for more details next week.

On Thu, 21 Mar 2024, 01:55 Even Rouault via gdal-dev, <gdal-dev@lists.osgeo.org> wrote:

    Hi,

    while investigating
    https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408,
    I've come to the conclusion that the Windows heap allocation mechanism
    sucks. Basically, if you allocate a lot of heap regions of modest size
    with malloc()/new[], freeing them all with the corresponding
    free()/delete[] is excruciatingly slow (like ~ 10 seconds for ~ 80,000
    allocations). The slowness is clearly quadratic in the number of
    allocations. You only start noticing it with ~ 30,000 allocations. And
    interestingly, another condition for that slowness is that each
    individual allocation must be strictly greater than 4096 * 4 bytes. At
    exactly that value, perf is acceptable, but add one extra byte, and it
    suddenly drops. I suspect that there must be a threshold from which
    malloc() starts using VirtualAlloc() instead of the heap, which must
    involve slow system calls, instead of a user-land allocation mechanism.
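
A minimal way to reproduce that measurement might look like the sketch below. The time_bulk_free() helper is hypothetical and is not taken from the linked issue; it simply allocates a given number of blocks of a given size, then times how long it takes to free them all.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical benchmark helper: allocates `count` blocks of `block_size`
   bytes, then frees them all. Returns the time spent in the free() loop as
   reported by clock(), in seconds, or -1.0 if an allocation failed. */
static double time_bulk_free(size_t count, size_t block_size)
{
    void **blocks = malloc((count ? count : 1) * sizeof(void *));
    if (blocks == NULL)
        return -1.0;

    size_t allocated = 0;
    for (; allocated < count; ++allocated) {
        blocks[allocated] = malloc(block_size);
        if (blocks[allocated] == NULL)
            break;
        memset(blocks[allocated], 0, block_size); /* touch the memory */
    }

    clock_t start = clock();
    for (size_t i = 0; i < allocated; ++i)
        free(blocks[i]);
    clock_t end = clock();

    free(blocks);
    if (allocated < count)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing time_bulk_free(80000, 4096 * 4) with time_bulk_free(80000, 4096 * 4 + 1) on Windows should show whether the threshold described above is hit on a given machine; note that the second call needs about 1.3 GB of memory.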

    Has anyone already hit that and found solutions? The only potential idea
    I have found so far would be to use a private heap with HeapCreate() with
    a fixed maximum size, which is a bit problematic to adopt by default:
    basically, that would mean that the size of GDAL_CACHEMAX would be
    consumed as soon as one uses the block cache.

    Even

-- http://www.spatialys.com
    My software is free, but my time generally not.

    _______________________________________________
    gdal-dev mailing list
    gdal-dev@lists.osgeo.org
    https://lists.osgeo.org/mailman/listinfo/gdal-dev


--
http://www.spatialys.com
My software is free, but my time generally not.