Sure Andrew,

Here it is the call stack from Visual Studio for both threads (I just copied 
the top calls where GDAL is involved, just for easy reading. If you need the 
whole stack just let me know):

THREAD 1:

                ntdll.dll!_NtWaitForSingleObject@12‑() Unknown
               ntdll.dll!_RtlpWaitOnCriticalSection@8‑()             Unknown
               ntdll.dll!_RtlEnterCriticalSection@4‑()    Unknown
               gdal201.dll!CPLAcquireMutex(_CPLMutex * hMutexIn, double 
dfWaitInSeconds) Line 806               C++
               gdal201.dll!GDALDataset::EnterReadWrite(GDALRWFlag eRWFlag) Line 
6102        C++
               gdal201.dll!GDALRasterBand::EnterReadWrite(GDALRWFlag eRWFlag) 
Line 5290 C++
               gdal201.dll!GDALRasterBlock::Write() Line 742    C++
               gdal201.dll!GDALRasterBlock::Internalize() Line 917          C++
               gdal201.dll!GDALRasterBand::GetLockedBlockRef(int nXBlockOff, 
int nYBlockOff, int bJustInitialize) Line 1126                C++
>             Test.exe!RasterBandPixelAccess::SetValueAtPixel<short>(const int 
> & pX, const int & pY, const short & value) Line 180                C++


THREAD 2:

                ntdll.dll!_NtWaitForSingleObject@12‑() Unknown
               KernelBase.dll!_WaitForSingleObjectEx@12‑()   Unknown
               kernel32.dll!_WaitForSingleObjectExImplementation@12‑()        
Unknown
               kernel32.dll!_WaitForSingleObject@8‑() Unknown
               gdal201.dll!CPLCondWait(_CPLCond * hCond, _CPLMutex * 
hClientMutex) Line 937           C++
               gdal201.dll!GDALAbstractBandBlockCache::WaitKeepAliveCounter() 
Line 134       C++
               gdal201.dll!GDALArrayBandBlockCache::FlushCache() Line 312    C++
               gdal201.dll!GDALRasterBand::FlushCache() Line 865         C++
               gdal201.dll!GDALDataset::FlushCache() Line 386 C++
               gdal201.dll!GDALPamDataset::FlushCache() Line 159        C++
               gdal201.dll!GTiffDataset::Finalize() Line 6180       C++
               gdal201.dll!GTiffDataset::~GTiffDataset() Line 6135           C++
               gdal201.dll!GTiffDataset::`scalar deleting destructor'(unsigned 
int)           C++
               gdal201.dll!GDALClose(void * hDS) Line 2998       C++
>             
> Test.exe!main::__l2::<lambda>(std::basic_string<char,std::char_traits<char>,std::allocator<char>
>  > sourcefilePath, 
> std::basic_string<char,std::char_traits<char>,std::allocator<char> > 
> targetFilePath, int threadID) Line 66           C++


From: Andrew Bell [mailto:[email protected]]
Sent: 26 September, 2016 16:06
To: Francisco Javier Calzado <[email protected]>
Cc: [email protected]
Subject: Re: [gdal-dev] Multithread deadlock

Deadlocks are usually easy to debug if you can get a traceback when deadlocked. 
 If you can attach with gdb (or run in the debugger) and reproduce and post the 
stack at the time ('where' from gdb), it should be no problem to fix.  Trying 
to reproduce on different hardware can be difficult.

On Mon, Sep 26, 2016 at 9:33 AM, Francisco Javier Calzado 
<[email protected]<mailto:[email protected]>>
 wrote:
Hi guys,

I am experiencing a deadlock with just 2 threads in a single reader & multiple 
writer scenario. This is, threads read from the same input file (using 
different handlers) and then write different output files. Deadlock comes when 
the block cache gets filled. The situation is the following:


-          T1 and T2 read datasets D1 and D2, both pointing to the same input 
raster (GTiff).

-          Block cache gets filled.

-          T1 tries to lock one block in the cache to write data. But cache is 
full, so it tries to free dirty blocks from T2 (as seen in Internalize() 
method). For that purpose, it requires apparently a mutex from D2.

-          However T2 is in a state where must wait for thread T1 to finish 
working with T2’s blocks. In this state, T2 has a mutex acquired from D2.

At least, that is what it seems to be happening based on source code. Maybe I’m 
wrong, I don’t have a full picture overview about how GDAL is internally 
working. The thing is that I can reproduce this issue with the following test 
code and dataset:
https://drive.google.com/file/d/0B-OCl1FjBi0YSkU3RUozZjc5SnM/view?usp=sharing

Oddly enough, ticket with number #6163 is supposed to fix this, but its failing 
in my case. I am working with GDAL 2.1.0 version under VS2015 (x32, Debug) 
compilation.

Even, what do you think?

Thanks!
Javier C.


_______________________________________________
gdal-dev mailing list
[email protected]<mailto:[email protected]>
http://lists.osgeo.org/mailman/listinfo/gdal-dev



--
Andrew Bell
[email protected]<mailto:[email protected]>
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Reply via email to