Re: [PATCH] Introduce a handy list_first_entry macro

2007-04-18 Thread Nikita Danilov
Pavel Emelianov writes: There are many places in the kernel where the construction like foo = list_entry(head->next, struct foo_struct, list); are used. The code might look more descriptive and neat if using the macro list_first_entry(head, type, member) \

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-24 Thread Nikita Danilov
Amit Gud writes: Hello, This is an initial implementation of ChunkFS technique, briefly discussed at: http://lwn.net/Articles/190222 and http://cis.ksu.edu/~gud/docs/chunkfs-hotdep-val-arjan-gud-zach.pdf I have a couple of questions about chunkfs repair process. First, as I

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-24 Thread Nikita Danilov
David Lang writes: On Tue, 24 Apr 2007, Nikita Danilov wrote: Amit Gud writes: Hello, This is an initial implementation of ChunkFS technique, briefly discussed at: http://lwn.net/Articles/190222 and http://cis.ksu.edu/~gud/docs/chunkfs-hotdep-val-arjan-gud-zach.pdf

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-25 Thread Nikita Danilov
David Lang writes: On Tue, 24 Apr 2007, Nikita Danilov wrote: David Lang writes: On Tue, 24 Apr 2007, Nikita Danilov wrote: Amit Gud writes: Hello, This is an initial implementation of ChunkFS technique, briefly discussed at: http

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: [ OK, I suck. I edited yesterday's email with the new info, but forgot to change the attachment to today's patch. Here is today's patch. ] Split the anonymous and file backed pages out onto their own pageout queues. This we do not unnecessarily churn through

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: Nikita Danilov wrote: Rik van Riel writes: [ OK, I suck. I edited yesterday's email with the new info, but forgot to change the attachment to today's patch. Here is today's patch. ] Split the anonymous and file backed pages out onto their own

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: Rik van Riel wrote: Nikita Danilov wrote: Probably I am missing something, but I don't see how that can help. For example, suppose (for simplicity) that we have swappiness of 100%, and that fraction of referenced anon pages gets slightly less than of file

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: Nikita Danilov wrote: Generally speaking, multi-queue replacement mechanisms were tried in the past, and they all suffer from the common drawback: once scanning rate is different for different queues, so is the notion of hotness, measured by scanner

Re: [rfc][patch] queued spinlocks (i386)

2007-03-24 Thread Nikita Danilov
Nick Piggin writes: On Fri, Mar 23, 2007 at 11:04:18AM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: Implement queued spinlocks for i386. [...] isnt this patented by MS? (which might not worry you SuSE/Novell guys, but it might be a worry for the rest

Re: [rfc][patch] queued spinlocks (i386)

2007-03-24 Thread Nikita Danilov
Ingo Molnar writes: * Nikita Danilov [EMAIL PROTECTED] wrote: Indeed, this technique is very well known. E.g., http://citeseer.ist.psu.edu/anderson01sharedmemory.html has a whole section (3. Local-spin Algorithms) on them, citing papers from the 1990 onward

Re: ZFS with Linux: An Open Plea

2007-04-15 Thread Nikita Danilov
Ignatich writes: You might want to look at this discussion: http://mail.opensolaris.org/pipermail/zfs-discuss/2007-April/027041.html Licenses involved cover file system _code_, rather than storage format that is openly specified. Just stand up and implement driver for zfs format from scratch

Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-02-24 Thread Nikita Danilov
Tomoki Sekiyama writes: Hi, Hello, [...] While Dirty+Writeback pages get more than 40% of memory, process-B is blocked in balance_dirty_pages() until writeback of some (`write_chunk', typically = 1536) dirty pages on disk-b is started. May be the simpler solution is to use

Re: [PATCH 12/13] maps: Add /proc/pid/pagemap interface

2007-04-04 Thread Nikita Danilov
Matt Mackall writes: Add /proc/pid/pagemap interface This interface provides a mapping for each page in an address space to its physical page frame number, allowing precise determination of what pages are mapped and what pages are shared between processes. [...] +#ifdef

Re: [PATCH 12/13] maps: Add /proc/pid/pagemap interface

2007-04-04 Thread Nikita Danilov
Matt Mackall writes: [...] Now I could adjust these to only export u64s in some preferred endianness. But given I already need details like the page size to make any sense of it, it seems unnecessary. Also, the PFNs are fairly opaque unless you're attempting to correlate them with

Re: [RFC 2/9] Use NOMEMALLOC reclaim to allow reclaim if PF_MEMALLOC is set

2007-08-23 Thread Nikita Danilov
Peter Zijlstra writes: [...] My idea is to extend kswapd, run cpus_per_node instances of kswapd per node for each of GFP_KERNEL, GFP_NOFS, GFP_NOIO. (basically 3 kswapds per cpu) whenever we would hit direct reclaim, add ourselves to a special waitqueue corresponding to the type of

Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-04 Thread Nikita Danilov
Andrew Morton writes: [...] It's pretty much unfixable given the ext3 journalling design, and the guarantees which data-ordered provides. ZFS has intent log to handle this (http://blogs.sun.com/realneel/entry/the_zfs_intent_log). Something like that can --theoretically-- be added to

Re: Info Regarding MCR tool

2005-04-06 Thread Nikita Danilov
karthik [EMAIL PROTECTED] writes: Hi, If anybody is having any idea of what is MCR and what is its use, just tell me. i think its some Monitor related software. But i want in more detail of what is it and how is it working. MCR stands for Monitor Console Routine. Press Ctrl-C to

Re: [patch] fix race in __block_prepare_write (again)

2005-04-22 Thread Nikita Danilov
Anton Altaparmakov writes: [...] mm/filemap.c::file_buffered_write(): - It calls fault_in_pages_readable() which is completely bogus if @nr_segs > 1. It needs to be replaced by a to be written fault_in_pages_readable_iovec(). Which will be only marginally less bogus, because

Re: [PATCH] mm counter operations through macros

2005-03-12 Thread Nikita Danilov
Christoph Lameter writes: On Fri, 11 Mar 2005, Dave Jones wrote: Splitting this last one into inc_mm_counter() and dec_mm_counter() means you can kill off the last argument, and get some of the readability back. As it stands, I think this patch adds a bunch of obfuscation for no

Re: [2.6.11-rc5-mm1 patch] reiser4 Kconfig help cleanup

2005-03-02 Thread Nikita Danilov
Andrew Morton [EMAIL PROTECTED] writes: Jes Sorensen [EMAIL PROTECTED] wrote: [...] [EMAIL PROTECTED] linux-2.6.11-rc5-mm1]$ grep PG_arch fs/reiser4/*.c fs/reiser4/page_cache.c: page_flag_name(page, PG_arch_1), fs/reiser4/txnmgr.c:assert(vs-1448,

patch to fs/proc/base.c

2001-07-20 Thread Nikita Danilov
Hello, following patch cures oopses in 2.4.7-pre9 when proc_pid_make_inode() is called on task with task->mm == NULL. Linus, please apply, if you haven't got a bunch of equivalent patches already, which is doubtful. Nikita. ---

Re: sched_yield() makes OpenLDAP slow

2005-08-19 Thread Nikita Danilov
Howard Chu [EMAIL PROTECTED] writes: [...] concurrency. It is the nature of such a system to encounter deadlocks over the normal course of operations. When a deadlock is detected, some thread must be chosen (by one of a variety of algorithms) to abort its transaction, in order to allow other

Re: sched_yield() makes OpenLDAP slow

2005-08-20 Thread Nikita Danilov
Howard Chu writes: Nikita Danilov wrote: [...] What prevents transaction monitor from using, say, condition variables to yield cpu? That would have an additional advantage of blocking thread precisely until specific event occurs, instead of blocking for some vague

Re: sched_yield() makes OpenLDAP slow

2005-08-20 Thread Nikita Danilov
Howard Chu writes: Nikita Danilov wrote: That returns us to the core of the problem: sched_yield() is used to implement a synchronization primitive and non-portable assumptions are made about its behavior: SUS defines that after sched_yield() thread ceases to run on the CPU until

Re: sched_yield() makes OpenLDAP slow

2005-08-21 Thread Nikita Danilov
Howard Chu writes: Lee Revell wrote: On Sat, 2005-08-20 at 11:38 -0700, Howard Chu wrote: But I also found that I needed to add a new yield(), to work around yet another unexpected issue on this system - we have a number of threads waiting on a condition variable, and the thread

Re: Finding hardlinks

2006-12-31 Thread Nikita Danilov
Mikulas Patocka writes: On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all (i.e. remove iget5_locked)? It's reasonable to have either no

Re: Finding hardlinks

2007-01-01 Thread Nikita Danilov
Mikulas Patocka writes: [...] BTW. How does ReiserFS find that a given inode number (or object ID in ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. reiser4 used 64 bit object identifiers

Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov
Mikulas Patocka writes: BTW. How does ReiserFS find that a given inode number (or object ID in ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. Inode free space can have at most

Re: [RFC 7/8] Exclude unreclaimable pages from dirty ration calculation

2007-01-18 Thread Nikita Danilov
by kernel slab allocator prevents write throttling from ever happening. Signed-off-by: Nikita Danilov [EMAIL PROTECTED] mm/page-writeback.c | 33 - 1 files changed, 24 insertions(+), 9 deletions(-) Index: git-linux/mm/page-writeback.c

Re: How innovative is Linux?

2007-06-24 Thread Nikita Danilov
Alan Cox writes: [...] A few innovations that afaik first appeared the Linux kernel - Making multiple hosts appear transparently as one IP address - Futex fast hybrid locking DEC Firefly workstation, before 1987. Nikita.

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-28 Thread Nikita Danilov
Neil Brown writes: [...] Thus the general sequence might be: a/ issue all preceding writes. b/ issue the commit write with BIO_RW_BARRIER c/ wait for the commit to complete. If it was successful - done. If it failed other than with EOPNOTSUPP, abort

Re: [rfc] lock bitops

2007-05-09 Thread Nikita Danilov
Nick Piggin writes: Hi, [...] /** + * clear_bit_unlock - Clears a bit in memory with release + * @nr: Bit to clear + * @addr: Address to start counting from + * + * clear_bit() is atomic and may not be reordered. It does s/clear_bit/clear_bit_unlock/ ? + * contain a

Re: Info Regarding MCR tool

2005-04-06 Thread Nikita Danilov
karthik <[EMAIL PROTECTED]> writes: > Hi, > > If anybody is having any idea of what is MCR and what is its use, > just tell me. i think its some Monitor related software. But i want in more > detail of what is it and how is it working. MCR stands for "Monitor Console Routine". Press

Re: [patch] fix race in __block_prepare_write (again)

2005-04-22 Thread Nikita Danilov
Anton Altaparmakov writes: [...] > > mm/filemap.c::file_buffered_write(): > > - It calls fault_in_pages_readable() which is completely bogus if > @nr_segs > 1. It needs to be replaced by a to be written > "fault_in_pages_readable_iovec()". Which will be only marginally less bogus,

Re: [PATCH] mm counter operations through macros

2005-03-12 Thread Nikita Danilov
Christoph Lameter writes: > On Fri, 11 Mar 2005, Dave Jones wrote: > > > Splitting this last one into inc_mm_counter() and dec_mm_counter() > > means you can kill off the last argument, and get some of the > > readability back. As it stands, I think this patch adds a bunch > > of

Re: [2.6.11-rc5-mm1 patch] reiser4 Kconfig help cleanup

2005-03-02 Thread Nikita Danilov
Andrew Morton <[EMAIL PROTECTED]> writes: > Jes Sorensen <[EMAIL PROTECTED]> wrote: >> [...] >> >> [EMAIL PROTECTED] linux-2.6.11-rc5-mm1]$ grep PG_arch fs/reiser4/*.c >> fs/reiser4/page_cache.c: page_flag_name(page, PG_arch_1), >> fs/reiser4/txnmgr.c:

Re: sched_yield() makes OpenLDAP slow

2005-08-19 Thread Nikita Danilov
Howard Chu <[EMAIL PROTECTED]> writes: [...] > concurrency. It is the nature of such a system to encounter deadlocks > over the normal course of operations. When a deadlock is detected, some > thread must be chosen (by one of a variety of algorithms) to abort its > transaction, in order to allow

Re: sched_yield() makes OpenLDAP slow

2005-08-20 Thread Nikita Danilov
Howard Chu writes: > Nikita Danilov wrote: [...] > > > What prevents transaction monitor from using, say, condition > > variables to "yield cpu"? That would have an additional advantage of > > blocking thread precisely until specific event occurs,
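
The suggestion quoted above (block on a condition variable until a specific event occurs, instead of calling sched_yield() and hoping) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from OpenLDAP or from the thread: struct event, event_init(), event_wait() and event_signal() are hypothetical names, and only the pthread_* calls are real POSIX API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot event: a flag protected by a mutex, plus a
 * condition variable that waiters sleep on. */
struct event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;
};

static void event_init(struct event *ev)
{
    int rc = pthread_mutex_init(&ev->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&ev->cond, NULL);
    assert(rc == 0);
    (void)rc;               /* silence unused warning under NDEBUG */
    ev->ready = false;
}

/* Block until the event has been signalled, then consume it.
 * The while loop re-checks the predicate because pthread_cond_wait()
 * is allowed to wake up spuriously. */
static void event_wait(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)
        pthread_cond_wait(&ev->cond, &ev->lock);
    ev->ready = false;
    pthread_mutex_unlock(&ev->lock);
}

/* Record that the event happened and wake one waiter. */
static void event_signal(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->ready = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}
```

A waiter sleeps in event_wait() until some other thread calls event_signal(), so the wake-up is tied to the event itself and does not depend on how a particular scheduler treats sched_yield().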

Re: sched_yield() makes OpenLDAP slow

2005-08-20 Thread Nikita Danilov
Howard Chu writes: > Nikita Danilov wrote: > > That returns us to the core of the problem: sched_yield() is used to > > implement a synchronization primitive and non-portable assumptions are > > made about its behavior: SUS defines that after sched_yield() thread

Re: sched_yield() makes OpenLDAP slow

2005-08-21 Thread Nikita Danilov
Howard Chu writes: > Lee Revell wrote: > > On Sat, 2005-08-20 at 11:38 -0700, Howard Chu wrote: > > > But I also found that I needed to add a new yield(), to work around > > > yet another unexpected issue on this system - we have a number of > > > threads waiting on a condition variable, and

Re: Finding hardlinks

2006-12-31 Thread Nikita Danilov
Mikulas Patocka writes: > > > On Fri, 29 Dec 2006, Trond Myklebust wrote: > > > On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: > >> Why don't you rip off the support for colliding inode number from the > >> kernel at all (i.e. remove iget5_locked)? > >> > >> It's reasonable

Re: Finding hardlinks

2007-01-01 Thread Nikita Danilov
Mikulas Patocka writes: [...] > > BTW. How does ReiserFS find that a given inode number (or object ID in > ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. reiser4 used 64 bit object

Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov
Mikulas Patocka writes: > > > BTW. How does ReiserFS find that a given inode number (or object ID in > > > ReiserFS terminology) is free before assigning it to new file/directory? > > > > reiserfs v3 has an extent map of free object identifiers in > > super-block. > > Inode free space can

patch to fs/proc/base.c

2001-07-20 Thread Nikita Danilov
Hello, following patch cures oopses in 2.4.7-pre9 when proc_pid_make_inode() is called on task with task->mm == NULL. Linus, please apply, if you haven't got a bunch of equivalent patches already, which is doubtful. Nikita. ---

Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-02-24 Thread Nikita Danilov
Tomoki Sekiyama writes: > Hi, Hello, > [...] > > While Dirty+Writeback pages get more than 40% of memory, process-B is > blocked in balance_dirty_pages() until writeback of some (`write_chunk', > typically = 1536) dirty pages on disk-b is started. May be the simpler solution is to use

Re: [PATCH 12/13] maps: Add /proc/pid/pagemap interface

2007-04-04 Thread Nikita Danilov
Matt Mackall writes: > Add /proc/pid/pagemap interface > > This interface provides a mapping for each page in an address space to > its physical page frame number, allowing precise determination of what > pages are mapped and what pages are shared between processes. [...] > > +#ifdef

Re: [PATCH 12/13] maps: Add /proc/pid/pagemap interface

2007-04-04 Thread Nikita Danilov
Matt Mackall writes: [...] > > Now I could adjust these to only export u64s in some preferred > endianness. But given I already need details like the page size to > make any sense of it, it seems unnecessary. Also, the PFNs are fairly > opaque unless you're attempting to correlate them

Re: [RFC 7/8] Exclude unreclaimable pages from dirty ration calculation

2007-01-18 Thread Nikita Danilov
ons (among other things) when memory consumed by kernel slab allocator prevents write throttling from ever happening. Signed-off-by: Nikita Danilov <[EMAIL PROTECTED]> mm/page-writeback.c | 33 - 1 files changed, 24 insertions(+), 9 deletion

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: > [ OK, I suck. I edited yesterday's email with the new info, but forgot >to change the attachment to today's patch. Here is today's patch. ] > > Split the anonymous and file backed pages out onto their own pageout > queues. This we do not unnecessarily churn

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: > Nikita Danilov wrote: > > Rik van Riel writes: > > > [ OK, I suck. I edited yesterday's email with the new info, but forgot > > >to change the attachment to today's patch. Here is today's patch. ] > > > > > >

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: > Rik van Riel wrote: > > Nikita Danilov wrote: > > > >> Probably I am missing something, but I don't see how that can help. For > >> example, suppose (for simplicity) that we have swappiness of 100%, and > >> that fraction o

Re: [RFC][PATCH] split file and anonymous page queues #3

2007-03-21 Thread Nikita Danilov
Rik van Riel writes: > Nikita Danilov wrote: > > > Generally speaking, multi-queue replacement mechanisms were tried in the > > past, and they all suffer from the common drawback: once scanning rate > > is different for different queues, so is the notion of

Re: [rfc][patch] queued spinlocks (i386)

2007-03-24 Thread Nikita Danilov
Nick Piggin writes: > On Fri, Mar 23, 2007 at 11:04:18AM +0100, Ingo Molnar wrote: > > > > * Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > Implement queued spinlocks for i386. [...] > > > > isnt this patented by MS? (which might not worry you SuSE/Novell guys, > > but it might be a

Re: [rfc][patch] queued spinlocks (i386)

2007-03-24 Thread Nikita Danilov
Ingo Molnar writes: > > * Nikita Danilov <[EMAIL PROTECTED]> wrote: > > > Indeed, this technique is very well known. E.g., > > http://citeseer.ist.psu.edu/anderson01sharedmemory.html has a whole > > section (3. Local-spin Algorithms) on them, citing

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-24 Thread Nikita Danilov
Amit Gud writes: Hello, > > This is an initial implementation of ChunkFS technique, briefly discussed > at: http://lwn.net/Articles/190222 and > http://cis.ksu.edu/~gud/docs/chunkfs-hotdep-val-arjan-gud-zach.pdf I have a couple of questions about chunkfs repair process. First, as I

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-24 Thread Nikita Danilov
David Lang writes: > On Tue, 24 Apr 2007, Nikita Danilov wrote: > > > Amit Gud writes: > > > > Hello, > > > > > > > > This is an initial implementation of ChunkFS technique, briefly discussed > > > at: http://lwn.net/Articles/1902

Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

2007-04-25 Thread Nikita Danilov
David Lang writes: > On Tue, 24 Apr 2007, Nikita Danilov wrote: > > > David Lang writes: > > > On Tue, 24 Apr 2007, Nikita Danilov wrote: > > > > > > > Amit Gud writes: > > > > > > > > Hello,

Re: ZFS with Linux: An Open Plea

2007-04-15 Thread Nikita Danilov
Ignatich writes: > You might want to look at this discussion: > http://mail.opensolaris.org/pipermail/zfs-discuss/2007-April/027041.html Licenses involved cover file system _code_, rather than storage format that is openly specified. Just stand up and implement driver for zfs format from

Re: [PATCH] Introduce a handy list_first_entry macro

2007-04-18 Thread Nikita Danilov
Pavel Emelianov writes: > There are many places in the kernel where the construction like > >foo = list_entry(head->next, struct foo_struct, list); > > are used. > The code might look more descriptive and neat if using the macro > >list_first_entry(head, type, member) \ >
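
For readers outside the kernel tree, the proposed macro can be tried in user space with a sketch like the one below. It assumes a stripped-down struct list_head and an offsetof()-based list_entry(); the helpers list_init(), list_add_tail() and first_value() are written here only for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel idea. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The construction discussed in the thread: the entry right after head. */
#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

struct foo_struct {
    int value;
    struct list_head list;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Value stored in the first element; the list must not be empty,
 * because on an empty list head->next points back at head itself. */
static int first_value(struct list_head *head)
{
    struct foo_struct *foo;

    assert(head->next != head);
    foo = list_first_entry(head, struct foo_struct, list);
    return foo->value;
}
```

The macro only names the intent; it still expands to list_entry(head->next, ...), so the caller remains responsible for checking that the list is not empty.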

Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

2007-05-28 Thread Nikita Danilov
Neil Brown writes: > [...] > Thus the general sequence might be: > > a/ issue all "preceding writes". > b/ issue the commit write with BIO_RW_BARRIER > c/ wait for the commit to complete. > If it was successful - done. > If it failed other than with EOPNOTSUPP,

Re: [rfc] lock bitops

2007-05-09 Thread Nikita Danilov
Nick Piggin writes: > Hi, [...] > > /** > + * clear_bit_unlock - Clears a bit in memory with release > + * @nr: Bit to clear > + * @addr: Address to start counting from > + * > + * clear_bit() is atomic and may not be reordered. It does s/clear_bit/clear_bit_unlock/ ? > + *
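
A user-space analogue of a "clear bit with release" operation can be sketched with C11 atomics. This is an assumption-laden illustration, not the kernel implementation from the patch: try_lock_bit() is a hypothetical counterpart added so the pair can be exercised, and clear_bit_unlock() here merely borrows the name from the quoted kernel-doc comment.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: try to take bit nr as a lock.
 * Acquire ordering keeps the critical section from moving above it.
 * Returns true if the bit was clear and is now owned by the caller. */
static bool try_lock_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask, old;

    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    mask = 1UL << nr;
    old = atomic_fetch_or_explicit(addr, mask, memory_order_acquire);
    return (old & mask) == 0;
}

/* Clear bit nr with release ordering, so that stores made inside the
 * critical section are visible before the bit is seen as clear. */
static void clear_bit_unlock(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask;

    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    mask = 1UL << nr;
    atomic_fetch_and_explicit(addr, ~mask, memory_order_release);
}
```

The point of the pairing is that the unlock side needs only release semantics rather than a full barrier.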

Re: How innovative is Linux?

2007-06-24 Thread Nikita Danilov
Alan Cox writes: [...] > > A few innovations that afaik first appeared the Linux kernel > - Making multiple hosts appear transparently as one IP address > - Futex fast hybrid locking DEC Firefly workstation, before 1987. Nikita.

Re: [RFC 2/9] Use NOMEMALLOC reclaim to allow reclaim if PF_MEMALLOC is set

2007-08-23 Thread Nikita Danilov
Peter Zijlstra writes: [...] > My idea is to extend kswapd, run cpus_per_node instances of kswapd per > node for each of GFP_KERNEL, GFP_NOFS, GFP_NOIO. (basically 3 kswapds > per cpu) > > whenever we would hit direct reclaim, add ourselves to a special > waitqueue corresponding to the

Re: [PATCH 00/23] per device dirty throttling -v8

2007-08-04 Thread Nikita Danilov
Andrew Morton writes: [...] > > It's pretty much unfixable given the ext3 journalling design, and the > guarantees which data-ordered provides. ZFS has intent log to handle this (http://blogs.sun.com/realneel/entry/the_zfs_intent_log). Something like that can --theoretically-- be added to