Re: [Haifux] some additions and eratta to today's lecture
On Mon, 2011-03-21 at 14:26 +0200, Nadav Har'El wrote:
> On Tue, Mar 15, 2011, guy keren wrote about "[Haifux] some additions and
> eratta to today's lecture":
> >
> > 1. etzion asked about controlling the age of dirty pages before pdflush
> >    flushes them - the default value is 30 seconds, and can be seen by:
> >
> >      cat /proc/sys/vm/dirty_expire_centisecs
> >
> >    (the time there is in milli-seconds). it can be changed by echoing
> >    the desired time into that file, e.g. to change it to 40 seconds:
> >
> >      echo 40 > /proc/sys/vm/dirty_expire_centisecs
>
> Would I be wrong to assume that since the file name is "centisecs", it is
> indeed centisecs (1/100th of a second), neither milli-seconds nor seconds
> as you wrote above?

err.. you're right. the default value is '3000', which amounts to 30
seconds, so to set it to 40 seconds - one should echo "4000" into that
file.

--guy
___
Haifux mailing list
Haifux@haifux.org
http://hamakor.org.il/cgi-bin/mailman/listinfo/haifux
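To make the corrected units concrete, here is a minimal shell sketch of the seconds-to-centiseconds conversion discussed above. Only the /proc path comes from the thread; the function names are hypothetical helpers made up for illustration, and writing to the file is assumed to require root.

```shell
#!/bin/sh
# Path taken from the thread; every helper name below is hypothetical.
DIRTY_EXPIRE=/proc/sys/vm/dirty_expire_centisecs

# Convert whole seconds to centiseconds (1 second = 100 centiseconds).
secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

# Convert centiseconds back to whole seconds (integer division).
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# Print the current expiry age in seconds, if the file is readable.
show_dirty_expire() {
    if [ -r "$DIRTY_EXPIRE" ]; then
        centisecs_to_secs "$(cat "$DIRTY_EXPIRE")"
    else
        echo "cannot read $DIRTY_EXPIRE" >&2
        return 1
    fi
}

# Set the expiry age, given in seconds (assumed to need root).
set_dirty_expire() {
    secs_to_centisecs "$1" > "$DIRTY_EXPIRE"
}
```

With these helpers, `set_dirty_expire 40` would write 4000 into the file, matching the value given in the reply, and `show_dirty_expire` would report 30 on a system that still has the default of 3000.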
Re: [Haifux] some additions and eratta to today's lecture
On Tue, Mar 15, 2011, guy keren wrote about "[Haifux] some additions and eratta to today's lecture":
>
> 1. etzion asked about controlling the age of dirty pages before pdflush
>    flushes them - the default value is 30 seconds, and can be seen by:
>
>      cat /proc/sys/vm/dirty_expire_centisecs
>
>    (the time there is in milli-seconds). it can be changed by echoing
>    the desired time into that file, e.g. to change it to 40 seconds:
>
>      echo 40 > /proc/sys/vm/dirty_expire_centisecs

Would I be wrong to assume that since the file name is "centisecs", it is
indeed centisecs (1/100th of a second), neither milli-seconds nor seconds
as you wrote above?

-- 
Nadav Har'El                        | Monday, Mar 21 2011, 15 Adar II 5771
n...@math.technion.ac.il            |-
Phone +972-523-790466, ICQ 13349191 |I considered atheism but there weren't
http://nadav.harel.org.il           |enough holidays.
Re: [Haifux] some additions and eratta to today's lecture
On Tue, Mar 15, 2011 at 11:00:02AM +0200, Shachar Raindel wrote:
> "Hijacking" the thread to a more general HD discussion.

And while we're at it, here's the article I mentioned about the "funny"
behaviour of writes to SSDs:

  http://lwn.net/Articles/428584/

Short summary: someone from the Linaro project (Linux on ARM) looked into
the matter, only to discover that those devices are heavily tuned for
using FAT32 as the file system.

-- 
Tzafrir Cohen         | tzaf...@jabber.org | VIM is
http://tzafrir.org.il |                    | a Mutt's
tzaf...@cohens.org.il |                    | best
tzaf...@debian.org    |                    | friend
Re: [Haifux] some additions and eratta to today's lecture
"Hijacking" the thread to a more general HD discussion.

Since there was an interest in SSD (flash) drives, here is a benchmark of
normal hard-drives and flash drives:

  http://techreport.com/articles.x/19330/3

2 points which are easy to see in the graph, and were raised in the
discussion yesterday:

* In normal (mechanical) hard drives, the first bytes are read much faster
  than the last bytes.
* In SSDs the read speed is *mostly* the same everywhere (though it depends
  on the controller's behavior and the workload history of the disk)

--Shachar
Re: [Haifux] some additions and eratta to today's lecture
someone reminded me about the "small trail through the linux kernel" link
i mentioned. it is:

  http://www.win.tue.nl/~aeb/linux/vfs/trail.html

note that it is from 2001 and relates to kernel 2.4 (or even older) - but
the general picture has not completely changed.

you can find more up-to-date info about this in the book "the linux
kernel, 3rd edition" - in the VFS chapter.

--guy
[Haifux] some additions and eratta to today's lecture
1. etzion asked about controlling the age of dirty pages before pdflush
   flushes them - the default value is 30 seconds, and can be seen by:

     cat /proc/sys/vm/dirty_expire_centisecs

   (the time there is in milli-seconds). it can be changed by echoing
   the desired time into that file, e.g. to change it to 40 seconds:

     echo 40 > /proc/sys/vm/dirty_expire_centisecs

   this parameter (and some other pdflush-related parameters) is described
   in the link i mentioned during the meeting today - that talks about
   configuring pdflush for write-intensive workloads:

     http://www.westnet.com/~gsmith/content/linux-pdflush.htm

2. if you read the above link - you'll see that in case there are too
   many dirty pages - the writing to disk is done directly by the processes
   calling the write() system call (if you check the stack trace -
   you'll see these processes waiting for the page transfer to complete).
   this is done to serve as a flow-control mechanism that slows down the
   processes that fill up the dirty cache.

3. regarding whether the write system call copies the data directly into
   the page-cache, or into a temporary buffer - it indeed copies the data
   directly into the page-cache. the generic write() system call passes
   control to the file-system's code - and this eventually allocates the
   pages, then maps the page with the user-data into the kernel's address
   space, and copies the data into the page-cache's page. note: i checked
   this for the ext3 file-system in kernel 2.6.18 - but the code it uses
   resides in generic kernel code - so i think other file-systems will
   behave the same.

4. regarding the output of iostat -x:

   the 'wrqm/s' (write requests merged per-second) field shows the
   number of write requests that were merged into existing requests by the
   elevator. i.e. if 3 requests were merged together, '2' will be added to
   this counter (the first of these requests was not merged. the other two
   were merged into the first). this, at least according to the code of
   kernel 2.6.18


if i forgot some question, or you have a question about what we covered
today - please shout!

--guy
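The counting rule in item 4 can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not the kernel's elevator code, it only considers a request that starts exactly where the previous one ended (a "back merge"), and the function name count_merged and the "start_sector length" input format are made up for the example.

```shell
#!/bin/sh
# Toy model of the 'wrqm/s' counting rule, NOT the kernel's elevator.
# Input on stdin: one write request per line, as "start_sector length".
# A request is counted as merged when it begins exactly where the
# previous request ended; the first request of a run is never counted.
count_merged() {
    awk '
        NR > 1 && $1 == prev_end { merged++ }   # contiguous with previous
        { prev_end = $1 + $2 }                  # remember where this one ends
        END { print merged + 0 }                # print 0 when nothing merged
    '
}
```

For example, three contiguous requests count as two merges, as described above:

```shell
printf '0 8\n8 8\n16 8\n' | count_merged    # prints 2
```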