Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-10 Thread Ivan Voras

On 08/01/2011 20:42, Lev Serebryakov wrote:

> Hello, Kostik.
> You wrote on 8 January 2011, 22:02:32:
>
>> If I am guessing right, this creature has a classic deadlock when
>> bio processing requires memory allocation. It seems that tid 100079
>> is sleeping not even due to the free page shortage, but due to address
>> space exhaustion. As a result, read/write requests are stalled.
>
>    I want to say that ZFS, for example, could allocate much more
> memory, and, yes, it had problems on i386 with this, but not on amd64,
> AFAIK...
>
>    So, I'm (geom_raid5) doing something wrong...


geom_raid5 (I'm assuming you're talking about the module that was
written some time ago by an external developer) does several things
wrong - that's why it wasn't included in FreeBSD. IIRC, one of those
things is that it aggressively caches writes below the file system
layer, which is a no-no.



___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-10 Thread Ivan Voras

On 08/01/2011 23:06, Lev Serebryakov wrote:



>    I need to look at how raid3 and vinum/raid5 live with that situation.


One other standard solution is to spawn a thread and offload the job to
that thread, instead of doing it within GEOM start(). This is what most
current complex GEOM classes do.
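
A minimal userland C sketch of that pattern, under stated assumptions:
`struct req`, `sketch_start()` and `sketch_worker_pass()` are
hypothetical stand-ins, not GEOM APIs, and the worker is modeled as a
plain function rather than a real kernel thread. It only illustrates
that start() links the request and returns, while the expensive work
happens later in the worker's context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio; not the kernel's struct bio. */
struct req {
	struct req *next;
	int         done;
};

/* Simple FIFO of pending requests (locking omitted in this sketch). */
struct req_queue {
	struct req *head;
	struct req *tail;
	size_t      len;
};

/* Analogue of a start() routine: link the request and return at once. */
static void
sketch_start(struct req_queue *q, struct req *r)
{
	r->next = NULL;
	r->done = 0;
	if (q->tail != NULL)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	q->len++;
	/* A real class would wake its worker thread here. */
}

/* One pass of the worker loop: drain the queue and do the heavy work. */
static size_t
sketch_worker_pass(struct req_queue *q)
{
	struct req *r;
	size_t n = 0;

	while ((r = q->head) != NULL) {
		q->head = r->next;
		if (q->head == NULL)
			q->tail = NULL;
		q->len--;
		r->next = NULL;
		r->done = 1;	/* placeholder for the real processing */
		n++;
	}
	return (n);
}
```

In a real class the queue would be protected by a mutex and start()
would wake the worker after queueing; both are left out here to keep
the sketch single-threaded.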






Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Lev Serebryakov
Hello, Kostik.
You wrote on 8 January 2011, 23:20:28:

>> >>   And, if it is "classic deadlock" is here any "classical" solution to
>> >> it?
>> > Do not allocate during bio processing.
>>  So, if GEOM need some cache, it needs pre-allocate it and implements
>> custom allocator over allocated chunk? :(
>> 
>>  And what is "bio processing" in this context? geom_raid5 puts all
> bio processing == whole time needed to finish pageout. Pageout is
> often performed to clean the page to lower the page shortage.
> If pageout requires more free pages to finish during the shortage,
> then we get the deadlock.
  OK, and transmission mmap()s files on geom_raid5, so when these pages
are paged out and geom_raid5 asks for other pages, there are no free
ones... I see. It seems that the M_NOWAIT flag should help, if
geom_raid5 could live with failed mallocs...
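
To illustrate the idea, here is a hedged userland sketch of a
non-sleeping allocation attempt. In the kernel, malloc(9) with M_NOWAIT
returns NULL instead of sleeping when the request cannot be satisfied
immediately; the page budget, `budget_try_take()` and
`write_try_start()` below are hypothetical names that only model that
behavior. A request that cannot get its buffers stays pending and is
retried later, rather than blocking the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page budget standing in for the system's free pages. */
struct page_budget {
	size_t free_pages;
};

/* Non-sleeping attempt: returns 1 on success, 0 on shortage. */
static int
budget_try_take(struct page_budget *b, size_t pages)
{
	if (b->free_pages < pages)
		return (0);
	b->free_pages -= pages;
	return (1);
}

/* Return pages to the budget (e.g. when other I/O completes). */
static void
budget_give(struct page_budget *b, size_t pages)
{
	b->free_pages += pages;
}

enum write_state { WR_PENDING, WR_STARTED };

/* Hypothetical write request that needs buffers before it can start. */
struct write_req {
	size_t           pages_needed;
	enum write_state state;
};

/*
 * Try to start the write.  On shortage the request is left pending so
 * the caller can retry later; this function never waits.
 */
static int
write_try_start(struct page_budget *b, struct write_req *w)
{
	if (w->state == WR_STARTED)
		return (1);
	if (!budget_try_take(b, w->pages_needed))
		return (0);
	w->state = WR_STARTED;
	return (1);
}
```

This avoids sleeping inside the I/O path, but the pending write still
needs memory eventually, so on its own it does not guarantee progress
during a page shortage.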

> Also, it seems that you allocate not only bios (small objects, not
> every request cause page allocation), but also the huge buffers, that
> require free pages each time.
  Yes, in the worst case RAID5 needs a lot of additional memory to
 perform a simple write. If it is a lone write (geom_raid5 waits some
 time for writes in adjacent areas, but not forever), geom_raid5 needs
 to read (number of disks - 1) x (size of write) bytes of data to
 recalculate the checksum, and it needs buffers for this data. The worst
 case for a 5-disk RAID5 and a 128KiB write is 4 x 128KiB = 512KiB of
 buffers, for one 128KiB write. And I don't understand how to avoid the
 deadlock here :(
 Maybe by preallocating some memory at start (these 512KiB) and trying
 to use it when malloc() fails...
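
The arithmetic above, and the preallocated-reserve idea, can be
sketched in userland C as follows. This is only an illustration under
assumptions: `raid5_worst_case_bytes()`, `struct reserve` and the
`buf_get()`/`buf_put()` helpers are hypothetical, the
`simulate_failure` flag stands in for a failed malloc(), and the
reserve holds a single buffer, so a second concurrent request would
still have to wait or be retried.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* (number of disks - 1) x (size of write), as stated in the message. */
static size_t
raid5_worst_case_bytes(size_t ndisks, size_t write_bytes)
{
	assert(ndisks >= 2);
	return ((ndisks - 1) * write_bytes);
}

/* Hypothetical emergency reserve: one buffer allocated at start. */
struct reserve {
	void  *buf;
	size_t size;
	int    in_use;
};

/* Allocate the reserve up front; returns 1 on success, 0 on failure. */
static int
reserve_init(struct reserve *r, size_t size)
{
	r->buf = malloc(size);
	r->size = size;
	r->in_use = 0;
	return (r->buf != NULL);
}

/*
 * Try the ordinary allocator first; if it fails (or failure is
 * simulated), fall back to the reserve when it is free and big enough.
 */
static void *
buf_get(struct reserve *r, size_t size, int simulate_failure)
{
	void *p = simulate_failure ? NULL : malloc(size);

	if (p != NULL)
		return (p);
	if (!r->in_use && size <= r->size) {
		r->in_use = 1;
		return (r->buf);
	}
	return (NULL);	/* caller must retry later */
}

/* Release a buffer: mark the reserve free, or free() a normal one. */
static void
buf_put(struct reserve *r, void *p)
{
	if (p == NULL)
		return;
	if (p == r->buf)
		r->in_use = 0;
	else
		free(p);
}
```

For the 5-disk, 128KiB case, `raid5_worst_case_bytes(5, 128 * 1024)`
gives 524288 bytes, i.e. the 512KiB quoted above, which is the size the
reserve would need to cover one such write.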

  I need to look at how raid3 and vinum/raid5 live with that situation.

-- 
// Black Lion AKA Lev Serebryakov 



Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Kostik Belousov
On Sat, Jan 08, 2011 at 11:10:21PM +0300, Lev Serebryakov wrote:
> Hello, Kostik.
> You wrote on 8 January 2011, 22:56:13:
> 
> 
> >>   And, if it is "classic deadlock" is here any "classical" solution to
> >> it?
> > Do not allocate during bio processing.
>  So, if GEOM need some cache, it needs pre-allocate it and implements
> custom allocator over allocated chunk? :(
> 
>  And what is "bio processing" in this context? geom_raid5 puts all
bio processing == whole time needed to finish pageout. Pageout is
often performed to clean the page to lower the page shortage.
If pageout requires more free pages to finish during the shortage,
then we get the deadlock.

Also, it seems that you allocate not only bios (small objects, not
every request causes a page allocation), but also huge buffers that
require free pages each time.

> bios into the (private, internal) queue and geom_start() exits
> immediately, and bio could spend rather long time in queue (if it is
> write request) before it will be sent to underlying provider. And,
> yes, it could be combined with other bios to form new one (why
> allocation of new bio is needed).
> 
>  So, is "bio processing" a whole time before bio is complete, or only
> geom_start() call or what?
> 
>  Also, RAID5 needs to read data (other stripes) and write data (new
> checksum) when "write" bio is processed. BTW, "system" geom_raid3 and
> geom_vinum (with raid5 volume) need to do the same to maintain
> checksums, so they could deadlock (in theory) too, if problem is
> "allocate memory during bio processing". And geom_mirror needs
> allocate bio for second (third, ...) component on every write...
> 
> -- 
> // Black Lion AKA Lev Serebryakov 
> 




Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Lev Serebryakov
Hello, Kostik.
You wrote on 8 January 2011, 22:56:13:


>>   And, if it is "classic deadlock" is here any "classical" solution to
>> it?
> Do not allocate during bio processing.
 So, if a GEOM class needs some cache, it needs to pre-allocate it and
implement a custom allocator over the allocated chunk? :(

 And what is "bio processing" in this context? geom_raid5 puts all
bios into a (private, internal) queue and geom_start() exits
immediately, and a bio could spend a rather long time in the queue (if
it is a write request) before it is sent to the underlying provider.
And, yes, it could be combined with other bios to form a new one (which
is why allocation of a new bio is needed).

 So, is "bio processing" the whole time before the bio is complete, or
only the geom_start() call, or what?

 Also, RAID5 needs to read data (other stripes) and write data (new
checksum) when a "write" bio is processed. BTW, the "system"
geom_raid3 and geom_vinum (with a raid5 volume) need to do the same to
maintain checksums, so they could deadlock (in theory) too, if the
problem is "allocate memory during bio processing". And geom_mirror
needs to allocate a bio for the second (third, ...) component on every
write...

-- 
// Black Lion AKA Lev Serebryakov 



Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Kostik Belousov
On Sat, Jan 08, 2011 at 10:29:09PM +0300, Lev Serebryakov wrote:
> Hello, Kostik.
> You wrote on 8 January 2011, 22:02:32:
> 
> 
> > There is some weird backtrace at the pid 20, what is g_raid5 ?
>   It is geom_raid5, with two threads -- working one and one for
>  processing finished bios.
> 
> > If I am guessing right, this creature has a classic deadlock when 
> > bio processing requires memory allocation. It seems that tid 100079
>   tid 100079 sleep in waiting for some data in queue.
> 
> > is sleeping not even due to the free page shortage, but due to address
> > space exhaustion. As result, read/write requests are stalled.
>   tid 100078 sleep in malloc(). But geom_raid5 never ever allocate
>  more than 128MiB of memory and it is 64bit system with huge amount of
>  kmem_size/kmem_size_max...
> 
>   How could I explore allocation (like vmstat -m) from kdb to be sure,
> it doesn't allocated more?
Use "show uma" and "show malloc" from ddb.

> 
>   And, if it is "classic deadlock" is here any "classical" solution to
> it?
Do not allocate during bio processing.

> 
>   Really, I'm maintainer of geom_raid5 now, so I need fix this
> deadlock, but I don't really understand, why does it occur? I've
> hit panic with "kernel memory exhausted" symptoms when module allocate
> too much, but not deadlock :(
Hm, I missed the kmem_back() in the stack. Yes, the thread is waiting for page
allocation.

> 
> > Then, syncer is blocked waiting for some physical buffer (look at tid
> > 100075), owning the vnode lock. Other processes also wait for the
> > locked buffers, etc.
> 
> > So my belief is that this is plain driver (g_raid5, whatever is it)
> > i/o loss. Try the same load without it.
>   I can not, because all data is on this GEOM :)
> 
> -- 
> // Black Lion AKA Lev Serebryakov 
> 




Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Lev Serebryakov
Hello, Kostik.
You wrote on 8 January 2011, 22:02:32:


> If I am guessing right, this creature has a classic deadlock when
> bio processing requires memory allocation. It seems that tid 100079
> is sleeping not even due to the free page shortage, but due to address
> space exhaustion. As result, read/write requests are stalled.
  I want to say that ZFS, for example, could allocate much more
memory, and, yes, it had problems on i386 with this, but not on amd64,
AFAIK...

  So, I'm (geom_raid5) doing something wrong...

-- 
// Black Lion AKA Lev Serebryakov 



Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Lev Serebryakov
Hello, Kostik.
You wrote on 8 January 2011, 22:02:32:


> There is some weird backtrace at the pid 20, what is g_raid5 ?
  It is geom_raid5, with two threads -- a working one and one for
 processing finished bios.

> If I am guessing right, this creature has a classic deadlock when 
> bio processing requires memory allocation. It seems that tid 100079
  tid 100079 sleeps waiting for some data in the queue.

> is sleeping not even due to the free page shortage, but due to address
> space exhaustion. As result, read/write requests are stalled.
  tid 100078 sleeps in malloc(). But geom_raid5 never allocates more
 than 128MiB of memory, and it is a 64-bit system with a huge
 kmem_size/kmem_size_max...

  How can I inspect allocations (like vmstat -m) from kdb, to be sure
it hasn't allocated more?

  And, if it is a "classic deadlock", is there any "classical" solution
to it?

  Really, I'm the maintainer of geom_raid5 now, so I need to fix this
deadlock, but I don't really understand why it occurs. I've hit a
panic with "kernel memory exhausted" symptoms when the module allocates
too much, but not a deadlock :(

> Then, syncer is blocked waiting for some physical buffer (look at tid
> 100075), owning the vnode lock. Other processes also wait for the
> locked buffers, etc.

> So my belief is that this is plain driver (g_raid5, whatever is it)
> i/o loss. Try the same load without it.
  I cannot, because all data is on this GEOM :)

-- 
// Black Lion AKA Lev Serebryakov 



Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state

2011-01-08 Thread Kostik Belousov
On Sat, Jan 08, 2011 at 09:44:57PM +0300, Lev Serebryakov wrote:
> Hello, Freebsd-stable.
> 
>  I've added the `transmission' BitTorrent client to my home server and
> now it deadlocks easily (after about 1 hour of intensive downloading
> and seeding). This server was upgraded from 7.x, and the last time I
> ran transmission on the 7.x system it worked without any problems.
> 
>  I have my home partition on a geom_raid5 device, so I cannot exclude
> this third-party module from experiments.
> 
>  My home filesystem has 32KiB blocks and all other filesystems (/, /var,
> /tmp, /usr) have the standard 16KiB block size. I know that the 7.x
> system had (has?) a deadlock when 16KiB and 64KiB filesystems are mixed
> on one system, but I never experienced deadlocks with a 16KiB and 32KiB
> mixture.
> 
>  All filesystems (except root) use SU, but no gjournal or SU+J patches
> are in use.
> 
>  The same BitTorrent client on the same filesystem, but accessed via NFS
> (from another host), doesn't cause a deadlock and works rock-stable for
> days.
> 
>  I've built a kernel with all debug options, waited for the deadlock and
> collected all the information mentioned in Developer's Handbook /
> Debugging Deadlocks.
> 
>  A capture from the debug session is attached, together with the kernel
> config and dmesg from rebooting.
> 
>  As I can easily reproduce this deadlock, I could provide any
> additional information from the kernel debugger, if needed.
> 
> System:   FreeBSD 8.2-PRERELEASE
> cvsup:2011-01-08 00:41:24 MSK (GMT+3) time
> Platform: amd64
There is some weird backtrace at the pid 20, what is g_raid5 ?

If I am guessing right, this creature has a classic deadlock when
bio processing requires memory allocation. It seems that tid 100079
is sleeping not even due to the free page shortage, but due to address
space exhaustion. As a result, read/write requests are stalled.

Then, syncer is blocked waiting for some physical buffer (look at tid
100075), owning the vnode lock. Other processes also wait for the
locked buffers, etc.

So my belief is that this is plain driver (g_raid5, whatever it is)
i/o loss. Try the same load without it.

