Re: scheduling oddity on 2.6.20.3 stock
David Schwartz wrote:
> > > >> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>
> So here are some actual results from a dual P3-1GHz machine (2.6.21.1,
> CFSv9). First let's time each operation individually:
>
> $ time bunzip2 -k linux-2.6.21.tar.bz2
> real    1m5.626s
> user    1m2.240s
> sys     0m3.144s
>
> $ time gzip -9 linux-2.6.21.tar
> real    1m17.652s
> user    1m15.609s
> sys     0m1.912s
>
> The compress was the most complex (no surprise there) but they are close
> enough that efficient overlap will definitely affect the total wall time.
> If we can both decompress and compress in 1:17, we are optimal.
>
> First, let's try the normal way:
>
> $ time (bunzip2 -c linux-2.6.21.tar.bz2 | gzip -9 > test1)
> real    1m45.051s
> user    2m16.945s
> sys     0m2.752s
>
> 1:45, or 1/3 over optimal.
>
> Now, with a 32MB non-blocking cache between the two processes ('accel'
> creates a 32MB cache and uses 'select' to fill from stdin and empty to
> stdout without blocking either direction):
>
> $ time (bunzip2 -c linux-2.6.21.tar.bz2 | ./accel | gzip -9 > test2)
> real    1m18.361s
> user    2m19.589s
> sys     0m6.356s
>
> Within testing accuracy of optimal.
>
> So it's not the scheduler. It's the fact that bunzip2/gzip have inadequate
> input/output buffering. I don't think it's unreasonable to consider this a
> defect in those programs.

They are hardly designed to optimize this operation...

For a tunable buffer program allowing the buffer size and buffers in the
pool to be set, see www.tmr.com/~public/source, program ptbuf. I wrote it as
a proof of concept for a pthreads presentation I was giving, and it happened
to be useful.

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
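
The kind of tunable, threaded buffer described in this message can be sketched roughly as follows. This is not ptbuf itself, whose source is not shown in the thread; it is a minimal Python illustration that assumes a single reader thread and a bounded queue standing in for the buffer pool. The names pooled_copy, buf_size and n_bufs are hypothetical.

```python
import os
import queue
import threading


def pooled_copy(src_fd, dst_fd, buf_size=1024 * 1024, n_bufs=32):
    """Copy src_fd to dst_fd through a bounded pool of chunks.

    At most n_bufs chunks of up to buf_size bytes wait in the queue,
    so memory use is capped at roughly buf_size * n_bufs.
    """
    filled = queue.Queue(maxsize=n_bufs)

    def reader():
        while True:
            data = os.read(src_fd, buf_size)
            filled.put(data)      # blocks while the pool is full
            if not data:          # an empty read means end of input
                return

    t = threading.Thread(target=reader, daemon=True)
    t.start()

    total = 0
    while True:
        data = filled.get()       # blocks while the pool is empty
        if not data:
            break
        offset = 0
        while offset < len(data):  # os.write may accept fewer bytes than given
            offset += os.write(dst_fd, data[offset:])
        total += len(data)
    t.join()
    return total
```

The producer only stalls once all n_bufs slots are occupied, and the consumer only stalls when none are, which is the decoupling the message is after.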
RE: scheduling oddity on 2.6.20.3 stock
> > >> bunzip2 -c $file.bz2 |gzip -9 >$file.gz

So here are some actual results from a dual P3-1GHz machine (2.6.21.1,
CFSv9). First let's time each operation individually:

$ time bunzip2 -k linux-2.6.21.tar.bz2
real    1m5.626s
user    1m2.240s
sys     0m3.144s

$ time gzip -9 linux-2.6.21.tar
real    1m17.652s
user    1m15.609s
sys     0m1.912s

The compress was the most complex (no surprise there) but they are close
enough that efficient overlap will definitely affect the total wall time.
If we can both decompress and compress in 1:17, we are optimal.

First, let's try the normal way:

$ time (bunzip2 -c linux-2.6.21.tar.bz2 | gzip -9 > test1)
real    1m45.051s
user    2m16.945s
sys     0m2.752s

1:45, or 1/3 over optimal.

Now, with a 32MB non-blocking cache between the two processes ('accel'
creates a 32MB cache and uses 'select' to fill from stdin and empty to
stdout without blocking either direction):

$ time (bunzip2 -c linux-2.6.21.tar.bz2 | ./accel | gzip -9 > test2)
real    1m18.361s
user    2m19.589s
sys     0m6.356s

Within testing accuracy of optimal.

So it's not the scheduler. It's the fact that bunzip2/gzip have inadequate
input/output buffering. I don't think it's unreasonable to consider this a
defect in those programs.

DS
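
The source of 'accel' is not part of this thread, so the following is only a sketch of the technique the message describes: a fixed-capacity in-memory buffer that uses select to fill from one descriptor and drain to another without blocking in either direction. The function name relay and the capacity and chunk parameters are assumptions made for illustration.

```python
import os
import select


def relay(src_fd, dst_fd, capacity=32 * 1024 * 1024, chunk=64 * 1024):
    """Copy src_fd to dst_fd through a buffer of at most `capacity` bytes."""
    os.set_blocking(src_fd, False)
    os.set_blocking(dst_fd, False)
    buf = bytearray()
    eof = False
    total = 0
    while not eof or buf:
        # Ask to read only while there is room, and to write only while
        # there is data, so neither side can stall the other.
        rlist = [src_fd] if not eof and len(buf) < capacity else []
        wlist = [dst_fd] if buf else []
        readable, writable, _ = select.select(rlist, wlist, [])
        if readable:
            try:
                data = os.read(src_fd, min(chunk, capacity - len(buf)))
            except BlockingIOError:
                data = None       # spurious wakeup, try again later
            if data is not None:
                if data:
                    buf += data
                else:
                    eof = True    # an empty read means end of input
        if writable:
            try:
                n = os.write(dst_fd, bytes(buf[:chunk]))
            except BlockingIOError:
                n = 0
            del buf[:n]           # drop only what was actually written
            total += n
    return total
```

A script that calls relay(sys.stdin.fileno(), sys.stdout.fileno()) could sit in the pipeline where './accel' does in the timing above. A Python version would likely add more overhead than a C one, so the numbers in the message should not be expected to carry over.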
RE: scheduling oddity on 2.6.20.3 stock
> On Thu, 3 May 2007, David Schwartz wrote:
>
> >> I needed to recompress some files from .bz2 to .gz so I set up a
> >> script to do
> >>
> >> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
> >>
> >> I expected that the two CPU heavy processes would end up on different
> >> CPUs and spend a little time shuffling data between the two CPUs on a
> >> system (dual core Opteron)
> >>
> >> however, instead what I find is that each process is getting 50% of one
> >> CPU while the other CPU is 97% idle.
> >
> > That would only be possible if the compression/decompression block size
> > is small compared to the maximum pipe buffer size. I suspect the reverse
> > is the case.
>
> I'm still running into this problem in various forms
>
> is there an easy way to change the maximum pipe buffer size? (including a
> simple change to the kernel source, I do compile my own kernels)

No. Changing the size will not do what you want it to do, since that only
tells the kernel what the size is; it does not determine what it is.

> > It would be interesting to write an intermediate process that basically
> > enlarged the pipe buffers and see if that changed anything. Basically,
> > the intermediate process would allocate a large buffer (16MB or so) and
> > fill it from 'bunzip2' while draining it to 'gzip' in a non-blocking way
> > (unless the buffer was full/empty, of course).

It is not particularly hard to write such a process. I have a proxy that I
can easily tweak to do this. I'm going to give it a shot and see if it helps.

DS
RE: scheduling oddity on 2.6.20.3 stock
On Thu, 3 May 2007, David Schwartz wrote:

>> I needed to recompress some files from .bz2 to .gz so I set up a
>> script to do
>>
>> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>>
>> I expected that the two CPU heavy processes would end up on different
>> CPUs and spend a little time shuffling data between the two CPUs on a
>> system (dual core Opteron)
>>
>> however, instead what I find is that each process is getting 50% of one
>> CPU while the other CPU is 97% idle.
>
> That would only be possible if the compression/decompression block size
> is small compared to the maximum pipe buffer size. I suspect the reverse
> is the case.

I'm still running into this problem in various forms

is there an easy way to change the maximum pipe buffer size? (including a
simple change to the kernel source, I do compile my own kernels)

> It would be interesting to write an intermediate process that basically
> enlarged the pipe buffers and see if that changed anything. Basically,
> the intermediate process would allocate a large buffer (16MB or so) and
> fill it from 'bunzip2' while draining it to 'gzip' in a non-blocking way
> (unless the buffer was full/empty, of course).
RE: scheduling oddity on 2.6.20.3 stock
On Thu, 3 May 2007, David Schwartz wrote:

>> I needed to recompress some files from .bz2 to .gz so I set up a
>> script to do
>>
>> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>>
>> I expected that the two CPU heavy processes would end up on different
>> CPUs and spend a little time shuffling data between the two CPUs on a
>> system (dual core Opteron)
>>
>> however, instead what I find is that each process is getting 50% of one
>> CPU while the other CPU is 97% idle.
>
> That would only be possible if the compression/decompression block size
> is small compared to the maximum pipe buffer size. I suspect the reverse
> is the case.
>
> It would be interesting to write an intermediate process that basically
> enlarged the pipe buffers and see if that changed anything. Basically,
> the intermediate process would allocate a large buffer (16MB or so) and
> fill it from 'bunzip2' while draining it to 'gzip' in a non-blocking way
> (unless the buffer was full/empty, of course).

hmm, how about

bunzip2 -c $file.bz2 |dd bs=8m |gzip -9 >$file.gz

should that work?

David Lang
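
One thing to keep in mind with the dd idea: a single read() on a pipe returns whatever is currently available rather than waiting for the full requested size, and with bs= set dd copies each such short block straight through instead of accumulating it. A large block size therefore does not by itself give a large buffer. The short-read behaviour can be seen with a small Python sketch; read_once is a hypothetical helper written only for this illustration.

```python
import os


def read_once(payload, block_size):
    """Write payload into a pipe, then issue one read() of block_size bytes.

    Returns how many bytes that single read actually delivered.
    """
    r, w = os.pipe()
    try:
        os.write(w, payload)
        return len(os.read(r, block_size))
    finally:
        os.close(r)
        os.close(w)


# Ask for 8 MB when only 4 KB is sitting in the pipe.
print(read_once(b"x" * 4096, 8 * 1024 * 1024))  # prints 4096
```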
RE: scheduling oddity on 2.6.20.3 stock
> I needed to recompress some files from .bz2 to .gz so I set up a
> script to do
>
> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>
> I expected that the two CPU heavy processes would end up on different
> CPUs and spend a little time shuffling data between the two CPUs on a
> system (dual core Opteron)
>
> however, instead what I find is that each process is getting 50% of one
> CPU while the other CPU is 97% idle.

That would only be possible if the compression/decompression block size is
small compared to the maximum pipe buffer size. I suspect the reverse is the
case.

It would be interesting to write an intermediate process that basically
enlarged the pipe buffers and see if that changed anything. Basically, the
intermediate process would allocate a large buffer (16MB or so) and fill it
from 'bunzip2' while draining it to 'gzip' in a non-blocking way (unless the
buffer was full/empty, of course).

DS
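
To put a number on "maximum pipe buffer size", the capacity of a pipe can be measured by writing to it in non-blocking mode until the kernel refuses more data. The sketch below is an illustration only; pipe_capacity is a hypothetical helper, and the result depends on the kernel and its configuration. For comparison, bzip2 works on blocks of 100 kB to 900 kB, and the pipe(7) man page gives 65,536 bytes as the pipe capacity on Linux since 2.6.11 with 4 kB pages.

```python
import os


def pipe_capacity():
    """Return how many bytes a fresh pipe accepts before a write would block."""
    r, w = os.pipe()
    os.set_blocking(w, False)
    total = 0
    try:
        while True:
            try:
                total += os.write(w, b"\0" * 4096)
            except BlockingIOError:
                # The pipe is full: nothing is reading from the other end.
                return total
    finally:
        os.close(r)
        os.close(w)


print(pipe_capacity())
```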