Re: [gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Kurt Schwehr via gdal-dev
> In short: multithreading is hard

So true! With the introduction of tsan things are a little less bad, but
tsan is still a tool with plenty of false positives and false negatives.
And that assumes that a particular issue is covered by the tests being run
under tsan.
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Sean Gillies via gdal-dev
Thanks for the announcement, Even! I wonder if we should track such issues
in a list? Or maybe give them a unique GitHub label?

We don't plan to release a 3.6.5, correct?

I'm going to make a Rasterio post release that patches 3.6.4 by tomorrow:
https://github.com/rasterio/rasterio/issues/2943.

On Mon, Oct 16, 2023 at 9:26 AM Even Rouault via gdal-dev <
gdal-dev@lists.osgeo.org> wrote:

>
> On 16/10/2023 at 17:14, Kurt Schwehr wrote:
>
> Thanks for the heads up!
>
> Can you share the SHAs of the fix and cause?
>
> master: d083af1ec9c38e79bfb0532885570de6e5b8a410
>
> 3.7 branch (should apply to 3.6 as well):
> b5858ed5bc5004c97f7cd6000674015bdc70b33b
>
> cause: a worker thread of the multithreaded decoding could, in a situation
> where the block cache is full, cause a "dirty" (i.e. modified but not yet
> serialized to disk) block to be written, resulting either in a deadlock
> between the lock of the multithreaded decoder and the lock of the job queue
> mechanism, or in other decoding threads having their file handle
> repositioned by another thread's seek().  In short: multithreading is hard.
>
>
> On Mon, Oct 16, 2023, 7:44 AM Even Rouault via gdal-dev <
> gdal-dev@lists.osgeo.org> wrote:
>
>> Hi,
>>
>> For GDAL 3.6.0 to 3.7.2, use of multi-threaded GTiff
>> compression+decompression, in particular within the same file, as for
>> example in a "gdalwarp -co COMPRESS=... -co NUM_THREADS=..." workflow
>> can lead to deadlocks (process stalled forever) and/or data corruption
>> (sometimes without errors at generation). Cf
>> https://github.com/OSGeo/gdal/issues/8470 for a reproducer. The fix is
>> in https://github.com/OSGeo/gdal/pull/8561
>>
>> The issue is particularly visible on Windows, or more generally any
>> operating system (or file system where the output file is located) which
>> has no VSIVirtualHandle::PRead() implementation, but it can also be
>> occasionally reproduced on Linux (at least as a deadlock).
>>
>> Even
>>
>>
-- 
Sean Gillies


Re: [gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Even Rouault via gdal-dev


On 16/10/2023 at 17:14, Kurt Schwehr wrote:

> Thanks for the heads up!
>
> Can you share the SHAs of the fix and cause?


master: d083af1ec9c38e79bfb0532885570de6e5b8a410

3.7 branch (should apply to 3.6 as well): 
b5858ed5bc5004c97f7cd6000674015bdc70b33b


cause: a worker thread of the multithreaded decoding could, in a 
situation where the block cache is full, cause a "dirty" (i.e. modified 
but not yet serialized to disk) block to be written, resulting either in 
a deadlock between the lock of the multithreaded decoder and the lock of 
the job queue mechanism, or in other decoding threads having their file 
handle repositioned by another thread's seek().  In short: multithreading is hard.
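
To illustrate the second failure mode described above, here is a minimal Python sketch (not GDAL code; the function names and the demo file are made up for illustration). When several threads share one file handle, a seek() followed by a read() is only safe if the pair is protected by a lock, because another thread's seek() can move the shared position in between. A positional read takes the offset as an argument and never touches the shared position. The sketch assumes a Unix-like system, since Python's os.pread is not available on Windows.

```python
import os
import tempfile
import threading


def read_block_seek(f, lock, offset, size):
    """Seek-then-read on a shared handle.

    Without the lock, another thread could call seek() between our seek()
    and read(), and we would silently read the wrong block.
    """
    with lock:
        f.seek(offset)
        return f.read(size)


def read_block_pread(fd, offset, size):
    """Positional read: the offset is passed explicitly, so the shared
    file position is neither used nor modified (Unix-like systems only)."""
    return os.pread(fd, size, offset)


def demo():
    """Read four 4-byte blocks from four threads, using both strategies."""
    data = b"AAAABBBBCCCCDDDD"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name

    results = {}
    try:
        # buffering=0 so that seek()/read() act directly on the descriptor.
        with open(path, "rb", buffering=0) as f:
            lock = threading.Lock()

            def worker(i):
                via_seek = read_block_seek(f, lock, i * 4, 4)
                via_pread = read_block_pread(f.fileno(), i * 4, 4)
                results[i] = (via_seek, via_pread)

            threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
    finally:
        os.remove(path)
    return results


if __name__ == "__main__":
    for block, pair in sorted(demo().items()):
        print(block, pair)
```

Both strategies return the same bytes here. The difference is that the seek-based variant depends on every user of the handle taking the same lock, while the positional variant does not depend on the shared position at all.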




> On Mon, Oct 16, 2023, 7:44 AM Even Rouault via gdal-dev <
> gdal-dev@lists.osgeo.org> wrote:
>
>> Hi,
>>
>> For GDAL 3.6.0 to 3.7.2, use of multi-threaded GTiff
>> compression+decompression, in particular within the same file, as for
>> example in a "gdalwarp -co COMPRESS=... -co NUM_THREADS=..." workflow
>> can lead to deadlocks (process stalled forever) and/or data corruption
>> (sometimes without errors at generation). Cf
>> https://github.com/OSGeo/gdal/issues/8470 for a reproducer. The fix is
>> in https://github.com/OSGeo/gdal/pull/8561
>>
>> The issue is particularly visible on Windows, or more generally any
>> operating system (or file system where the output file is located) which
>> has no VSIVirtualHandle::PRead() implementation, but it can also be
>> occasionally reproduced on Linux (at least as a deadlock).
>>
>> Even

--
http://www.spatialys.com
My software is free, but my time generally not.


Re: [gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Kurt Schwehr via gdal-dev
Thanks for the heads up!

Can you share the SHAs of the fix and cause?

On Mon, Oct 16, 2023, 7:44 AM Even Rouault via gdal-dev <
gdal-dev@lists.osgeo.org> wrote:

> Hi,
>
> For GDAL 3.6.0 to 3.7.2, use of multi-threaded GTiff
> compression+decompression, in particular within the same file, as for
> example in a "gdalwarp -co COMPRESS=... -co NUM_THREADS=..." workflow
> can lead to deadlocks (process stalled forever) and/or data corruption
> (sometimes without errors at generation). Cf
> https://github.com/OSGeo/gdal/issues/8470 for a reproducer. The fix is
> in https://github.com/OSGeo/gdal/pull/8561
>
> The issue is particularly visible on Windows, or more generally any
> operating system (or file system where the output file is located) which
> has no VSIVirtualHandle::PRead() implementation, but it can also be
> occasionally reproduced on Linux (at least as a deadlock).
>
> Even
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.


Re: [gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Javier Jimenez Shaw via gdal-dev
Thanks, Even, for the notification.
I see the fix will be included in the next releases, 3.7.3 and 3.8.0, both
planned for November 1st (in just a couple of weeks):
https://github.com/OSGeo/gdal/milestones

On Mon, 16 Oct 2023 at 16:42, Even Rouault via gdal-dev <
gdal-dev@lists.osgeo.org> wrote:

> Hi,
>
> For GDAL 3.6.0 to 3.7.2, use of multi-threaded GTiff
> compression+decompression, in particular within the same file, as for
> example in a "gdalwarp -co COMPRESS=... -co NUM_THREADS=..." workflow
> can lead to deadlocks (process stalled forever) and/or data corruption
> (sometimes without errors at generation). Cf
> https://github.com/OSGeo/gdal/issues/8470 for a reproducer. The fix is
> in https://github.com/OSGeo/gdal/pull/8561
>
> The issue is particularly visible on Windows, or more generally any
> operating system (or file system where the output file is located) which
> has no VSIVirtualHandle::PRead() implementation, but it can also be
> occasionally reproduced on Linux (at least as a deadlock).
>
> Even
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.


[gdal-dev] Notice: issue about multi-threaded GTiff compression+decompression

2023-10-16 Thread Even Rouault via gdal-dev

Hi,

For GDAL 3.6.0 to 3.7.2, use of multi-threaded GTiff 
compression+decompression, in particular within the same file, as for 
example in a "gdalwarp -co COMPRESS=... -co NUM_THREADS=..." workflow, 
can lead to deadlocks (process stalled forever) and/or data corruption 
(sometimes without any error reported when the file is generated). See 
https://github.com/OSGeo/gdal/issues/8470 for a reproducer. The fix is 
in https://github.com/OSGeo/gdal/pull/8561
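
For anyone who wants to check a deployment against the range given above, a small sketch in Python follows. It only compares version numbers (3.6.0 through 3.7.2 inclusive, as stated in this message); the helper names are made up for illustration and nothing here is part of GDAL.

```python
import re

# Affected range as stated in the announcement (inclusive on both ends).
FIRST_AFFECTED = (3, 6, 0)
LAST_AFFECTED = (3, 7, 2)


def parse_version(text):
    """Turn a string such as '3.6.4' into the tuple (3, 6, 4).

    Anything after the third number (for example 'dev' or a build tag)
    is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """Return True if the version number lies in the affected range."""
    return FIRST_AFFECTED <= parse_version(version_text) <= LAST_AFFECTED


if __name__ == "__main__":
    for candidate in ("3.5.3", "3.6.0", "3.6.4", "3.7.2", "3.7.3", "3.8.0"):
        print(candidate, is_affected(candidate))
```

With the GDAL Python bindings installed, the string to test could be obtained from gdal.VersionInfo("RELEASE_NAME"). Note that this is a check on the version number only: a build that carries the fix as a backported patch would still report a number inside the range.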


The issue is particularly visible on Windows, or more generally any 
operating system (or file system where the output file is located) which 
has no VSIVirtualHandle::PRead() implementation, but it can also be 
occasionally reproduced on Linux (at least as a deadlock).


Even

--
http://www.spatialys.com
My software is free, but my time generally not.
