Re: [Haifux] some additions and eratta to today's lecture

2011-03-21 Thread Nadav Har'El
On Tue, Mar 15, 2011, guy keren wrote about [Haifux] some additions and eratta 
to today's lecture:
 
> 1. etzion asked about controlling the age of dirty pages before pdflush
>    flushes them - the default value is 30 seconds, and can be seen by:
>
>     cat /proc/sys/vm/dirty_expire_centisecs
>
>    (the time there is in milli-seconds). it can be changed by echoing
>    the desired time into that file, e.g. to change it to 40 seconds:
>
>     echo 40 > /proc/sys/vm/dirty_expire_centisecs

Would I be wrong to assume that since the file name is centisecs, it is
indeed centisecs (1/100th of a second), neither milli-seconds nor seconds
as you wrote above?

-- 
Nadav Har'El| Monday, Mar 21 2011, 15 Adar II 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |I considered atheism but there weren't
http://nadav.harel.org.il   |enough holidays.
___
Haifux mailing list
Haifux@haifux.org
http://hamakor.org.il/cgi-bin/mailman/listinfo/haifux


Re: [Haifux] some additions and eratta to today's lecture

2011-03-21 Thread guy keren
On Mon, 2011-03-21 at 14:26 +0200, Nadav Har'El wrote:
> On Tue, Mar 15, 2011, guy keren wrote about [Haifux] some additions and
> eratta to today's lecture:
> >
> > 1. etzion asked about controlling the age of dirty pages before pdflush
> >    flushes them - the default value is 30 seconds, and can be seen by:
> >
> >     cat /proc/sys/vm/dirty_expire_centisecs
> >
> >    (the time there is in milli-seconds). it can be changed by echoing
> >    the desired time into that file, e.g. to change it to 40 seconds:
> >
> >     echo 40 > /proc/sys/vm/dirty_expire_centisecs
>
> Would I be wrong to assume that since the file name is centisecs, it is
> indeed centisecs (1/100th of a second), neither milli-seconds nor seconds
> as you wrote above?

err.. you're right.

the default value is '3000', which amounts to 30 seconds,
so to set it to 40 seconds - one should echo 4000 into that file.
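
as a minimal sketch of the arithmetic above (not a definitive recipe): the
shell functions below convert between seconds and centiseconds, and write
the result to the tunable. the function names are made up for illustration,
and the optional second argument (a file path) exists only so the helper
can be tried on a scratch file without root.

```shell
#!/bin/sh
# convert whole seconds to centiseconds (1/100th of a second),
# the unit used by /proc/sys/vm/dirty_expire_centisecs.
secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

# convert centiseconds back to whole seconds (integer division).
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# hypothetical helper: write a new expiry, given in seconds, to the tunable.
# the second argument defaults to the real /proc file (writing there needs root).
set_dirty_expire_secs() {
    target="${2:-/proc/sys/vm/dirty_expire_centisecs}"
    secs_to_centisecs "$1" > "$target"
}
```

e.g. running `set_dirty_expire_secs 40` as root should leave 4000 in the
file, matching the corrected value above.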

--guy



Re: [Haifux] some additions and eratta to today's lecture

2011-03-15 Thread Shachar Raindel
Hijacking the thread to a more general HD discussion.

Since there was an interest in SSD (flash) drives, here is a benchmark
of normal hard-drives and flash drives:

http://techreport.com/articles.x/19330/3

2 points which are easy to see in the graph, and were raised in the
discussion yesterday:

* In normal (mechanical) hard drives, the first bytes are read much
faster than the last bytes.

* In SSDs the read speed is *mostly* the same everywhere (though it
depends on the controller's behavior and the workload history of the
disk)

--Shachar



Re: [Haifux] some additions and eratta to today's lecture

2011-03-15 Thread Tzafrir Cohen
On Tue, Mar 15, 2011 at 11:00:02AM +0200, Shachar Raindel wrote:
> Hijacking the thread to a more general HD discussion.

And while we're at it, here's the article I mentioned about the funny
behaviour of writes to SSDs:
http://lwn.net/Articles/428584/

Short summary: someone from the Linaro project (Linux on ARM) looked into
the matter, only to discover that those devices are heavily tuned for using
FAT32 as the file system.

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
tzaf...@debian.org|| friend


[Haifux] some additions and eratta to today's lecture

2011-03-14 Thread guy keren

1. etzion asked about controlling the age of dirty pages before pdflush 
   flushes them - the default value is 30 seconds, and can be seen by:

cat /proc/sys/vm/dirty_expire_centisecs

(the time there is in milli-seconds). it can be changed by echoing
the desired time into that file, e.g. to change it to 40 seconds:

 echo 40 > /proc/sys/vm/dirty_expire_centisecs

this parameter (and some other pdflush-related parameters) is described
in the link i mentioned during the meeting today - that talks about
configuring pdflush for write-intensive workloads:

http://www.westnet.com/~gsmith/content/linux-pdflush.htm

2. if you read the above link - you'll see that when there are too
many dirty pages - the writing to disk is done directly by the processes
calling the write() system call (if you check the stack trace -
you'll see these processes waiting for the page transfer to complete).
this serves as a flow-control mechanism that slows down the
processes that fill up the dirty cache.

3. regarding whether the write() system call copies the data directly into
the page-cache, or into a temporary buffer - it indeed copies the data
directly into the page-cache. the generic write() system call passes
control to the file-system's code - and this eventually allocates the
pages, then maps the page with the user-data into the kernel's address
space, and copies the data into the page-cache's page. note: i checked
this for the ext3 file-system in kernel 2.6.18 - but the code it uses
resides in generic kernel code - so i think other file-systems will
behave the same.

4. regarding the output of iostat -x:

the 'wrqm/s' (write requests merged per-second) field shows the
number of write requests that were merged into existing requests by the
elevator. i.e. if 3 requests were merged together, '2' will be added to
this counter (the first of these requests was not merged; the other two
were merged into the first). this is at least according to the code of
kernel 2.6.18.
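
to make the counting concrete, here is a rough sketch. it assumes that
iostat derives wrqm/s from the cumulative "writes merged" counter in
/proc/diskstats (the 9th column of a device's line, as far as i know),
sampled twice and divided by the interval. the function names are
hypothetical, and the division is integer-only.

```shell
#!/bin/sh
# print the cumulative "writes merged" counter for one device.
# args: device name (e.g. sda), optional stats file
# (defaults to /proc/diskstats).
writes_merged() {
    awk -v dev="$1" '$3 == dev { print $9 }' "${2:-/proc/diskstats}"
}

# merges per second between two samples of that counter.
# args: first sample, second sample, interval in seconds.
# note: integer division, so fractions are dropped.
wrqm_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# possible usage on a live system (device name is only an example):
#   a=$(writes_merged sda); sleep 5; b=$(writes_merged sda)
#   wrqm_per_sec "$a" "$b" 5
```

with this counting, 3 requests merged into one would move the counter by
2 between the two samples.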


if i forgot some question, or you have a question about what we covered
today - please shout!
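
for points 1 and 2 above, a small sketch that lists the dirty-page
tunables side by side. it assumes the usual /proc/sys/vm/dirty_* files
(e.g. dirty_ratio and dirty_background_ratio next to
dirty_expire_centisecs); the function name and the optional directory
argument are made up for illustration.

```shell
#!/bin/sh
# print every dirty_* tunable found in a directory as "name = value".
# arg: directory to scan (defaults to /proc/sys/vm).
show_dirty_tunables() {
    dir="${1:-/proc/sys/vm}"
    for f in "$dir"/dirty_*; do
        # skip the unexpanded pattern if nothing matched
        [ -f "$f" ] || continue
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# possible usage:
#   show_dirty_tunables
```

on a default setup one would expect to see a line like
"dirty_expire_centisecs = 3000" among the output.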

--guy



Re: [Haifux] some additions and eratta to today's lecture

2011-03-14 Thread guy keren

someone reminded me about the "small trail through the linux kernel"
link i mentioned. it is:

http://www.win.tue.nl/~aeb/linux/vfs/trail.html

note that it is from 2001 and relates to kernel 2.4 (or even older) -
but the general picture has not completely changed.

you can find more up-to-date info about this in the book the linux
kernel, 3rd edition - in the VFS chapter.

--guy
