Re: A little coding style nugget of joy
On 9/20/07, Robert P. J. Day <[EMAIL PROTECTED]> wrote:
> On Thu, 20 Sep 2007, Pádraig Brady wrote:
>
> > Matt LaPlante wrote:
> > > Since everyone loves random statistics, here are a few gems to give you a
> > > break from your busy day:
> > >
> > > Number of lines in the 2.6.22 Linux kernel source that include one or
> > > more trailing whitespaces: 135209
> > > Bytes saved by removing said whitespace: 151809
> > > Lines in the (unified) diff: 455437
> > > Size of the diff: 15M
> > > People brave enough to submit the patch: ~0
> >
> > It's gradually getting better so:
> > http://lwn.net/2001/1129/a/whitespace.php3
>
> and you wouldn't *believe* how much space you can save by getting rid
> of all that annoying indentation.  and don't even get me *started* on
> those comments ...
>
> rday

---

I think you're on to something here. If we stored the files with all
the non-meaningful whitespace (including non-meaningful newlines)
removed, not only would we save disk space, but we would also
eliminate significant amounts of developer time and LKML bandwidth
currently expended on arguing about formatting. Everybody could just
run things through indent with whatever formatting they preferred.

Might make diffs ugly, though...

scott

--
scott preece
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: A little coding style nugget of joy
On Thu, 20 Sep 2007, Pádraig Brady wrote:

> Matt LaPlante wrote:
> > Since everyone loves random statistics, here are a few gems to give you a
> > break from your busy day:
> >
> > Number of lines in the 2.6.22 Linux kernel source that include one or more
> > trailing whitespaces: 135209
> > Bytes saved by removing said whitespace: 151809
> > Lines in the (unified) diff: 455437
> > Size of the diff: 15M
> > People brave enough to submit the patch: ~0
>
> It's gradually getting better so:
> http://lwn.net/2001/1129/a/whitespace.php3

and you wouldn't *believe* how much space you can save by getting rid
of all that annoying indentation.  and don't even get me *started* on
those comments ...

rday
--
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca
Re: A little coding style nugget of joy
Matt LaPlante wrote:
> Since everyone loves random statistics, here are a few gems to give you a
> break from your busy day:
>
> Number of lines in the 2.6.22 Linux kernel source that include one or more
> trailing whitespaces: 135209
> Bytes saved by removing said whitespace: 151809
> Lines in the (unified) diff: 455437
> Size of the diff: 15M
> People brave enough to submit the patch: ~0

It's gradually getting better so:
http://lwn.net/2001/1129/a/whitespace.php3
Re: A little coding style nugget of joy
Andi Kleen wrote:
> Matt LaPlante <[EMAIL PROTECTED]> writes:
>
> > Since everyone loves random statistics, here are a few gems to give you a
> > break from your busy day:
> >
> > Number of lines in the 2.6.22 Linux kernel source that include one or more
> > trailing whitespaces: 135209
> > Bytes saved by removing said whitespace: 151809
>
> You don't actually save anything on disk on most file systems
> (essentially everything except reiserfs on current Linux)
> because all files are rounded to block size (normally 4K)
> Same in page cache.

This is a terrible assumption in general (i.e. if filesize % blocksize
is close to uniformly distributed).  If you remove one byte and the data
is stored with blocksize B, then you either save zero bytes with
probability 1-1/B or you save B bytes with probability 1/B.  The
expected number of bytes saved is B*1/B=1.  Since expectation is linear,
if you remove x bytes, the expected number of bytes saved is x (even if
there is more than one byte removed per file).

In my tree, about half of the files have size >= 4k, so the assumption
is probably not _that_ far off the mark.  Alternatively, there are an
average of about 16 bytes removed per file, and there are 11 which are
<= 16 bytes short of a 4k boundary, so it's not at all unreasonable that
we'd save 40-50k.

> And in tar files bzip2/gzip is very good at compacting them.

That's true.

--Andy
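The expectation argument can be checked numerically. What follows is a minimal sketch, not anything posted in the thread: it assumes files occupy whole blocks of B = 4096 bytes and that filesize % blocksize is uniformly distributed, and the names `allocated` and `expected_saving` are made up for illustration.

```python
def allocated(size: int, block: int = 4096) -> int:
    """Bytes occupied on disk when a file is rounded up to whole blocks."""
    return -(-size // block) * block  # ceiling division, then scale


def expected_saving(removed: int, block: int = 4096) -> float:
    """Average on-disk saving from deleting `removed` bytes from a file.

    The average runs over one full period of file sizes, i.e. every
    possible value of filesize % block exactly once, which is the
    uniform-distribution assumption made in the message above.
    """
    base = removed + block  # any starting size >= removed would do
    total = sum(
        allocated(size, block) - allocated(size - removed, block)
        for size in range(base, base + block)
    )
    return total / block


if __name__ == "__main__":
    # Under the uniform assumption the average saving equals the
    # number of bytes removed, whatever the block size.
    for x in (1, 16, 5000):
        print(x, expected_saving(x))
```

Most individual files save nothing and a few save a whole block; it is only the average that matches the raw byte count, which is why the real saving on a given tree depends on how many files sit just past a block boundary.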
Re: A little coding style nugget of joy
On 9/19/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
> > This is a terrible assumption in general (i.e. if filesize % blocksize
> > is close to uniformly distributed).  If you remove one byte and the data
> > is stored with blocksize B, then you either save zero bytes with
> > probability 1-1/B or you save B bytes with probability 1/B.  The
> > expected number of bytes saved is B*1/B=1.  Since expectation is linear,
> > if you remove x bytes, the expected number of bytes saved is x (even if
> > there is more than one byte removed per file).
>
> You didn't calculate the probability of actually saving a full block
> or not (that's the only thing that matters). I assumed it's relatively
> small and can be ignored in practice since the amount of end white
> space is negligible compared to total file size.

Sure I did.  It's roughly 1/B per byte removed ( = 1/4096 ).

--Andy
Re: A little coding style nugget of joy
> This is a terrible assumption in general (i.e. if filesize % blocksize
> is close to uniformly distributed).  If you remove one byte and the data
> is stored with blocksize B, then you either save zero bytes with
> probability 1-1/B or you save B bytes with probability 1/B.  The
> expected number of bytes saved is B*1/B=1.  Since expectation is linear,
> if you remove x bytes, the expected number of bytes saved is x (even if
> there is more than one byte removed per file).

You didn't calculate the probability of actually saving a full block
or not (that's the only thing that matters). I assumed it's relatively
small and can be ignored in practice since the amount of end white
space is negligible compared to total file size.

-Andi
Re: A little coding style nugget of joy
Matt LaPlante <[EMAIL PROTECTED]> writes:

> Since everyone loves random statistics, here are a few gems to give you a
> break from your busy day:
>
> Number of lines in the 2.6.22 Linux kernel source that include one or more
> trailing whitespaces: 135209
> Bytes saved by removing said whitespace: 151809

You don't actually save anything on disk on most file systems
(essentially everything except reiserfs on current Linux)
because all files are rounded to block size (normally 4K)
Same in page cache.

And in tar files bzip2/gzip is very good at compacting them.

> Lines in the (unified) diff: 455437
> Size of the diff: 15M
> People brave enough to submit the patch: ~0

Many kernel maintainers automatically remove trailing white space
on any new lines these days. So as the kernel keeps changing
it should eventually all disappear; except on essentially dead code.

-Andi
A little coding style nugget of joy
Since everyone loves random statistics, here are a few gems to give you a
break from your busy day:

Number of lines in the 2.6.22 Linux kernel source that include one or more
trailing whitespaces: 135209
Bytes saved by removing said whitespace: 151809
Lines in the (unified) diff: 455437
Size of the diff: 15M
People brave enough to submit the patch: ~0

Take care. :)

-
Matt
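The post does not say how these counts were produced. One way to compute comparable figures is sketched below, under the assumption that "trailing whitespace" means spaces or tabs immediately before a newline (or at end of file). The helper names `trailing_whitespace_stats` and `tree_stats` are hypothetical.

```python
import os


def trailing_whitespace_stats(data: bytes) -> tuple[int, int]:
    """Return (lines with trailing whitespace, bytes freed by stripping it).

    Only spaces and tabs before "\n" are counted; CRLF line endings are
    not treated specially in this sketch.
    """
    lines = 0
    saved = 0
    for line in data.split(b"\n"):
        stripped = line.rstrip(b" \t")
        if len(stripped) != len(line):
            lines += 1
            saved += len(line) - len(stripped)
    return lines, saved


def tree_stats(root: str) -> tuple[int, int]:
    """Sum the per-file statistics over every regular file under `root`."""
    total_lines = 0
    total_saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip dangling symlinks and special files
            with open(path, "rb") as fh:
                lines, saved = trailing_whitespace_stats(fh.read())
            total_lines += lines
            total_saved += saved
    return total_lines, total_saved
```

Calling `tree_stats` on the top of an unpacked source tree would give the two totals; the exact numbers depend on which files are included and on how whitespace-only lines are counted.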