[PATCH 2/2] btrfs: better error handling in xattr.c

2010-09-02 Thread Mark Fasheh
This was quite trivial - there are only 3 places I counted where we weren't
handling errors. None of the sites looked like they needed any additional
unrolling either.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/xattr.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
index 0b4f7e6..109fdc3 100644
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -119,11 +119,10 @@ static int do_setxattr(struct btrfs_trans_handle *trans,
 		}
 
 		ret = btrfs_delete_one_dir_name(trans, root, path, di);
-		btrfs_fixable_bug_on(ret);
 		btrfs_release_path(root, path);
 
 		/* if we don't have a value then we are removing the xattr */
-		if (!value)
+		if (ret || !value)
 			goto out;
 	} else {
 		btrfs_release_path(root, path);
@@ -138,7 +137,6 @@ static int do_setxattr(struct btrfs_trans_handle *trans,
 	/* ok we have to create a completely new xattr */
 	ret = btrfs_insert_xattr_item(trans, root, path, inode->i_ino,
 				      name, name_len, value, size);
-	btrfs_fixable_bug_on(ret);
 out:
 	btrfs_free_path(path);
 	return ret;
@@ -166,7 +164,7 @@ int __btrfs_setxattr(struct btrfs_trans_handle *trans,
 
 	inode->i_ctime = CURRENT_TIME;
 	ret = btrfs_update_inode(trans, root, inode);
-	btrfs_fixable_bug_on(ret);
+
 out:
 	btrfs_end_transaction_throttle(trans, root);
 	return ret;
-- 
1.6.4.2
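
For readers interested in the pattern rather than the btrfs specifics, the
control flow that the patch above arrives at can be sketched in plain
userspace C. This is only an illustration under stated assumptions: struct
xattr_ops, fake_delete(), fake_insert() and set_xattr_sketch() are
hypothetical stand-ins, not btrfs functions, and the handling of a missing
value when no item exists yet is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the btrfs helpers; each reports 0 or -errno. */
struct xattr_ops {
	int delete_rc;	/* result the fake delete step should report */
	int insert_rc;	/* result the fake insert step should report */
	int inserted;	/* set to 1 once the insert step has actually run */
};

static int fake_delete(struct xattr_ops *ops)
{
	return ops->delete_rc;
}

static int fake_insert(struct xattr_ops *ops)
{
	ops->inserted = 1;
	return ops->insert_rc;
}

/*
 * Sketch of the patched flow: a failed delete, or a NULL value (meaning
 * "remove only"), skips the insert, and any error is handed back to the
 * caller instead of stopping the machine.
 */
static int set_xattr_sketch(struct xattr_ops *ops, int exists, const char *value)
{
	int ret = 0;

	assert(ops != NULL);

	if (exists) {
		ret = fake_delete(ops);
		if (ret || !value)
			goto out;
	} else if (!value) {
		/* assumption: nothing to remove and nothing to add */
		goto out;
	}

	ret = fake_insert(ops);
out:
	/* single exit point; resource cleanup would go here */
	return ret;
}
```

The point of the single "out" label is that every failure takes the same
cleanup path, which is why no additional unrolling is needed.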

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch v2 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc

2010-09-02 Thread Neil Brown
On Thu, 2 Sep 2010 16:51:41 +0200
Jan Kara  wrote:

> On Thu 02-09-10 09:59:13, Jiri Slaby wrote:
> > On 09/02/2010 03:02 AM, David Rientjes wrote:
> > > --- a/include/linux/slab.h
> > > +++ b/include/linux/slab.h
> > > @@ -334,6 +334,57 @@ static inline void *kzalloc_node(size_t size, gfp_t flags, int node)
> > >  	return kmalloc_node(size, flags | __GFP_ZERO, node);
> > >  }
> > >  
> > > +/**
> > > + * kmalloc_nofail - infinitely loop until kmalloc() succeeds.
> > > + * @size: how many bytes of memory are required.
> > > + * @flags: the type of memory to allocate (see kmalloc).
> > > + *
> > > + * NOTE: no new callers of this function should be implemented!
> > > + * All memory allocations should be failable whenever possible.
> > > + */
> > > +static inline void *kmalloc_nofail(size_t size, gfp_t flags)
> > > +{
> > > +	void *ret;
> > > +
> > > +	for (;;) {
> > > +		ret = kmalloc(size, flags);
> > > +		if (ret)
> > > +			return ret;
> > > +		WARN_ON_ONCE(get_order(size) > PAGE_ALLOC_COSTLY_ORDER);
> > 
> > This doesn't work as you expect. kmalloc will warn every time it fails.
> > __GFP_NOFAIL used to disable the warning. Actually what's wrong with
> > __GFP_NOFAIL? I cannot find a reason in the changelogs why the patches
> > are needed.
>   David should probably add the reasoning to the changelogs so that he
> doesn't have to explain again and again ;). But if I understood it
> correctly, the concern is that the looping checks slightly impact the fast path
> of the callers which do not need it. Also, looping for a long time inside
> the allocator generally isn't a nice thing, but some callers aren't able to do
> better for now, so the patch is imperfect in this sense...
>

I'm actually a bit confused about this too.
I thought David said he was removing a branch on the *slow* path - which makes
sense as you wouldn't even test NOFAIL until you had a failure.
Why are branches on the slow-path an issue??
This is an important question to me because I still hope to see the
swap-over-nfs patches merged eventually and they add a branch on the slow
path (if I remember correctly).
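
The reasoning above (the NOFAIL test is only reached after a failure) can be
illustrated with a small userspace sketch. Everything here is an assumption
made for illustration: SKETCH_NOFAIL, try_alloc(), alloc_slow() and
alloc_sketch() are hypothetical names, and this is not how the kernel page
allocator is actually structured.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NOFAIL 0x1u	/* hypothetical flag standing in for __GFP_NOFAIL */

static unsigned long slow_path_entries;	/* how often the slow path was taken */
static int fail_budget;			/* number of upcoming attempts that must fail */

/* Hypothetical backing allocator that can be told to fail a few times. */
static void *try_alloc(size_t size)
{
	if (fail_budget > 0) {
		fail_budget--;
		return NULL;
	}
	return malloc(size);
}

/* Slow path: only reached once an allocation attempt has already failed. */
static void *alloc_slow(size_t size, unsigned int flags)
{
	void *p;

	slow_path_entries++;

	/* The NOFAIL branch lives here, so successful allocations never see it. */
	if (!(flags & SKETCH_NOFAIL))
		return NULL;

	do {
		p = try_alloc(size);
	} while (!p);

	return p;
}

/* Fast path: no flag tests at all when the first attempt succeeds. */
static void *alloc_sketch(size_t size, unsigned int flags)
{
	void *p;

	assert(size > 0);

	p = try_alloc(size);
	if (p)
		return p;
	return alloc_slow(size, flags);
}
```

In this layout the cost of the extra branch is paid only by callers that have
already hit a failure, which is the situation the question is about.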

NeilBrown


Re: BTRFS: Unbelievably slow with kvm/qemu

2010-09-02 Thread K. Richard Pixley

 On 9/2/10 09:36 , K. Richard Pixley wrote:

 On 9/1/10 17:18 , Ted Ts'o wrote:

On Tue, Aug 31, 2010 at 02:58:44PM -0700, K. Richard Pixley wrote:

  On 20100831 14:46, Mike Fedyk wrote:

There is little reason not to use duplicate metadata.  Only small
files (less than 2kb) get stored in the tree, so there should be no
worries about images being duplicated without data duplication set at
mkfs time.

My benchmarks show that for my kinds of data, btrfs is somewhat
slower than ext4, (which is slightly slower than ext3 which is
somewhat slower than ext2), when using the defaults, (ie, duplicate
metadata).

It's a hair faster than ext2, (the fastest of the ext family), when
using singleton metadata.  And ext2 isn't even crash resistant while
btrfs has snapshots.

I'm really, really curious.  Can you describe your data and your
workload in detail?  You mentioned "continuous builders"; is this some
kind of tinderbox setup?
I'm not familiar with tinderbox.  Continuous builders tend to be a lot 
like shell scripts - it's usually easier to write a new one than to 
even bother to read someone else's.  :).


Basically, it's an automated system that started out life as a shell 
script loop around a build a few years ago.  The current rendition 
includes a number of extra features.  The basic idea here is to expose 
top-of-tree build errors as fast as possible which means that these 
machines can take some build shortcuts that would not be appropriate 
for official builds intended as release candidates.  We have a 
different set of builders which build release candidates.


When it starts, it removes as many snapshots as it needs to in order 
to make space for another build.  Initially it creates a snapshot from 
/home, checks out source, and does a full build of top of tree.  Then 
it starts over.  If it has a build and is not top of tree, it creates 
a snapshot from the last successful build, updates, and does an 
incremental build.  When it reaches top of tree, it starts taking 
requests.


We're using openembedded so the build is largely based on components 
with a global "BOM", (bill of materials), acting as a code based 
database of which versions of which components are in use for which 
images.  This acts as a funneling point.  Requests are a specification 
of a list of components to change, (different versions, etc).  A 
snapshot is taken from the last successful build, the BOM is changed 
locally and built incrementally.  If everything builds alright, then 
the new BOM may be committed and/or the resulting binary packages may 
be published for QA consumption.  But even in the case of failure, 
this snapshot is terminal and never marked as "successful" so never 
reused.


The system acts both as a continuous builder to check top of tree as 
well as an automated method for serializing changes, (which stands in 
for real, human integration).


We currently have about 20 of these servers, ranging from 2 - 24 
cores, 4 - 24G memory, etc.  A single device build takes about 22G so 
a 24G machine can do an entire build in memory.  The different 
machines run similar builds against different branches or against 
different targets and the staggering tends to create a lower response 
time in the case of top-of-tree build errors that affect all devices, 
(the most common type of error).  And most of the servers are cast 
offs, older servers that would be discarded otherwise.  Server speed 
tends to be an issue primarily for the full builds.  Once the full 
build has been created, the incrementals tend to be limited to single 
threading as the build spends most of its time doing dependency 
rechecking.


The snapshot based approach is recent, as is our btrfs usage, (which 
is currently problematic, polluted file systems, kernel crashes, 
etc).  Previously I was using rsync to backup a copy of a full build 
and rsync to replace it when a build failed.  The working directory 
was the same working directory and I went to some pains to make it 
reusable.  I've been looking for a snapshotting facility for a couple 
of years now but only discovered btrfs recently.  (I tried lvm based 
snapshots but they don't really have the characteristics that I want, 
nor do nilfs2 snapshots.)


Is that what you were looking for?

I should probably mention times and targets.

A typical 2-core, 4G developer workstation can build our entire system 
for 1 device in about 6 - 8hrs.  We typically build each device on a 
separate server and the highest end servers we're using today, (8 - 24 
core, 24G memory), can build a single device in a little under an hour.  
Those are full build times.  A complete cycle of an incremental based 
builder, (doing nothing but bookkeeping and checking dependencies), can 
take anywhere from about 2 - 4 minutes.  And a typical single component 
update usually takes 4 - 6 minutes.


From a developer's perspective, I'm churning out 8hr builds every 5 
minutes or so.  What snapshots provide primarily is the ability 

Re: BTRFS: Unbelievably slow with kvm/qemu

2010-09-02 Thread K. Richard Pixley

 On 9/1/10 17:18 , Ted Ts'o wrote:

On Tue, Aug 31, 2010 at 02:58:44PM -0700, K. Richard Pixley wrote:

  On 20100831 14:46, Mike Fedyk wrote:

There is little reason not to use duplicate metadata.  Only small
files (less than 2kb) get stored in the tree, so there should be no
worries about images being duplicated without data duplication set at
mkfs time.

My benchmarks show that for my kinds of data, btrfs is somewhat
slower than ext4, (which is slightly slower than ext3 which is
somewhat slower than ext2), when using the defaults, (ie, duplicate
metadata).

It's a hair faster than ext2, (the fastest of the ext family), when
using singleton metadata.  And ext2 isn't even crash resistant while
btrfs has snapshots.

I'm really, really curious.  Can you describe your data and your
workload in detail?  You mentioned "continuous builders"; is this some
kind of tinderbox setup?
I'm not familiar with tinderbox.  Continuous builders tend to be a lot 
like shell scripts - it's usually easier to write a new one than to even 
bother to read someone else's.  :).


Basically, it's an automated system that started out life as a shell 
script loop around a build a few years ago.  The current rendition 
includes a number of extra features.  The basic idea here is to expose 
top-of-tree build errors as fast as possible which means that these 
machines can take some build shortcuts that would not be appropriate for 
official builds intended as release candidates.  We have a different set 
of builders which build release candidates.


When it starts, it removes as many snapshots as it needs to in order to 
make space for another build.  Initially it creates a snapshot from 
/home, checks out source, and does a full build of top of tree.  Then it 
starts over.  If it has a build and is not top of tree, it creates a 
snapshot from the last successful build, updates, and does an 
incremental build.  When it reaches top of tree, it starts taking requests.
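
The cycle described above can be sketched as a small decision function plus a
pruning helper. This is a sketch under assumptions, not the actual builder:
the names next_action() and snapshots_to_remove() are hypothetical, and the
helper pretends each snapshot frees a known, independent amount of space,
which is not generally true for snapshots that share extents.

```c
#include <assert.h>
#include <stddef.h>

enum build_action {
	FULL_BUILD_FROM_HOME,		/* snapshot /home, check out, full build */
	INCREMENTAL_FROM_LAST_GOOD,	/* snapshot last successful build, update */
	TAKE_REQUESTS			/* at top of tree: start serving requests */
};

/* Decide what the builder does next, following the description above. */
static enum build_action next_action(int have_good_build, int at_top_of_tree)
{
	if (!have_good_build)
		return FULL_BUILD_FROM_HOME;
	if (!at_top_of_tree)
		return INCREMENTAL_FROM_LAST_GOOD;
	return TAKE_REQUESTS;
}

/*
 * Count how many of the oldest snapshots must go before a new build fits.
 * sizes_gb[] is ordered oldest first; sizes are assumed (for this sketch)
 * to be fully reclaimed on removal.
 */
static size_t snapshots_to_remove(const unsigned long *sizes_gb, size_t n,
				  unsigned long free_gb, unsigned long needed_gb)
{
	size_t removed = 0;

	assert(sizes_gb != NULL || n == 0);

	while (free_gb < needed_gb && removed < n) {
		free_gb += sizes_gb[removed];
		removed++;
	}
	return removed;
}
```

For example, with 10G free, a 22G build to fit, and old snapshots of 5G, 10G
and 22G, the helper reports that the two oldest snapshots would be removed.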


We're using openembedded so the build is largely based on components 
with a global "BOM", (bill of materials), acting as a code based 
database of which versions of which components are in use for which 
images.  This acts as a funneling point.  Requests are a specification 
of a list of components to change, (different versions, etc).  A 
snapshot is taken from the last successful build, the BOM is changed 
locally and built incrementally.  If everything builds alright, then the 
new BOM may be committed and/or the resulting binary packages may be 
published for QA consumption.  But even in the case of failure, this 
snapshot is terminal and never marked as "successful" so never reused.


The system acts both as a continuous builder to check top of tree as 
well as an automated method for serializing changes, (which stands in 
for real, human integration).


We currently have about 20 of these servers, ranging from 2 - 24 cores, 
4 - 24G memory, etc.  A single device build takes about 22G so a 24G 
machine can do an entire build in memory.  The different machines run 
similar builds against different branches or against different targets 
and the staggering tends to create a lower response time in the case of 
top-of-tree build errors that affect all devices, (the most common type 
of error).  And most of the servers are cast offs, older servers that 
would be discarded otherwise.  Server speed tends to be an issue 
primarily for the full builds.  Once the full build has been created, 
the incrementals tend to be limited to single threading as the build 
spends most of its time doing dependency rechecking.


The snapshot based approach is recent, as is our btrfs usage, (which is 
currently problematic, polluted file systems, kernel crashes, etc).  
Previously I was using rsync to backup a copy of a full build and rsync 
to replace it when a build failed.  The working directory was the same 
working directory and I went to some pains to make it reusable.  I've 
been looking for a snapshotting facility for a couple of years now but 
only discovered btrfs recently.  (I tried lvm based snapshots but they 
don't really have the characteristics that I want, nor do nilfs2 snapshots.)


Is that what you were looking for?

--rich


Re: [patch v2 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc

2010-09-02 Thread Jan Kara
On Thu 02-09-10 09:59:13, Jiri Slaby wrote:
> On 09/02/2010 03:02 AM, David Rientjes wrote:
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -334,6 +334,57 @@ static inline void *kzalloc_node(size_t size, gfp_t flags, int node)
> >  	return kmalloc_node(size, flags | __GFP_ZERO, node);
> >  }
> >  
> > +/**
> > + * kmalloc_nofail - infinitely loop until kmalloc() succeeds.
> > + * @size: how many bytes of memory are required.
> > + * @flags: the type of memory to allocate (see kmalloc).
> > + *
> > + * NOTE: no new callers of this function should be implemented!
> > + * All memory allocations should be failable whenever possible.
> > + */
> > +static inline void *kmalloc_nofail(size_t size, gfp_t flags)
> > +{
> > +	void *ret;
> > +
> > +	for (;;) {
> > +		ret = kmalloc(size, flags);
> > +		if (ret)
> > +			return ret;
> > +		WARN_ON_ONCE(get_order(size) > PAGE_ALLOC_COSTLY_ORDER);
> 
> This doesn't work as you expect. kmalloc will warn every time it fails.
> __GFP_NOFAIL used to disable the warning. Actually what's wrong with
> __GFP_NOFAIL? I cannot find a reason in the changelogs why the patches
> are needed.
  David should probably add the reasoning to the changelogs so that he
doesn't have to explain again and again ;). But if I understood it
correctly, the concern is that the looping checks slightly impact the fast path
of the callers which do not need it. Also, looping for a long time inside
the allocator generally isn't a nice thing, but some callers aren't able to do
better for now, so the patch is imperfect in this sense...

Honza
-- 
Jan Kara 
SUSE Labs, CR


mount a compressed btrfs without compress cause kernel oops

2010-09-02 Thread Grissiom
Hello devs,

I'm a Slackware user who tried btrfs recently. My kernel version is
2.6.35.4, and btrfs-progs is from 20100902.

I have sdb7 formatted as btrfs. In the past I mounted it with the compress
option and put some files (~8GB) on it. Today, however, I mounted it without
the compress option. When I ran rsync on the disk, the kernel oopsed as follows:

lost page write due to I/O error on sdb7
end_request: I/O error, dev sdb, sector 131072
lost page write due to I/O error on sdb7
btrfs: 1 errors while writing supers
------------[ cut here ]------------
kernel BUG at fs/btrfs/disk-io.c:2292!
invalid opcode:  [#1] SMP
last sysfs file:
/sys/devices/pci:00/:00:14.4/:08:00.0/ssb1:0/net/eth0/carrier
CPU 1
Modules linked in: btrfs ipv6 radeon ttm drm_kms_helper drm
i2c_algo_bit snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_oss snd_mixer_oss nls_cp936 vfat fat ext3 jbd
cpufreq_ondemand cpufreq_performance cpufreq_powersave powernow_k8
freq_table mperf agpgart lp ppdev parport_pc parport fuse
snd_hda_codec_idt b43 ohci_hcd snd_hda_intel joydev mac80211
usb_storage b44 snd_hda_codec cfg80211 video ssb thermal snd_hwdep
snd_pcm sdhci_pci sdhci processor snd_timer dell_laptop mmc_core
pcmcia rfkill thermal_sys snd soundcore psmouse rtc_cmos i2c_piix4
battery snd_page_alloc k8temp i2c_core dell_wmi rtc_core button ac mii
led_class wmi pcmcia_core rtc_lib hwmon serio_raw shpchp ehci_hcd
output dcdbas evdev sg ext4 mbcache jbd2

Pid: 17348, comm: btrfs-transacti Not tainted 2.6.35.4 #1 0PR523/Inspiron 1501
RIP: 0010:[]  []
write_all_supers+0x25c/0x260 [btrfs]
RSP: 0018:88003e2a1d40  EFLAGS: 00010282
RAX: 002b RBX: 8800498e0de8 RCX: 8196cf20
RDX:  RSI: 0046 RDI: 8196cdd4
RBP: 88003e2a1da0 R08:  R09: 
R10:  R11:  R12: 0001
R13: 8800498e0de8 R14: 880047a14000 R15: 0001
FS:  7f60561ba740() GS:880001d0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0085cd18 CR3: 75d22000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process btrfs-transacti (pid: 17348, threadinfo 88003e2a, task
8800396a)
Stack:
 8800396b8f4e 8800396b8f5e 8800498e0d80 
<0> 8800396b8e43 00010001 88003e2a1da0 880047a14000
<0> 88003908c0e8 88003e2a1df0 88003908c158 0001
Call Trace:
 [] write_ctree_super+0x13/0x20 [btrfs]
 [] btrfs_commit_transaction+0x4cb/0x6f0 [btrfs]
 [] ? autoremove_wake_function+0x0/0x40
 [] transaction_kthread+0x263/0x270 [btrfs]
 [] ? transaction_kthread+0x0/0x270 [btrfs]
 [] kthread+0x96/0xa0
 [] kernel_thread_helper+0x4/0x10
 [] ? kthread+0x0/0xa0
 [] ? kernel_thread_helper+0x0/0x10
Code: 5e 41 5f c9 c3 44 89 fe 48 c7 c7 f8 d6 77 a0 31 c0 e8 f8 14 d4
e0 0f 0b eb fe 44 89 e6 48 c7 c7 f8 d6 77 a0 31 c0 e8 e3 14 d4 e0 <0f>
0b eb fe 55 48 89 e5 0f 1f 44 00 00 48 89 f7 89 d6 e8 8d fd
RIP  [] write_all_supers+0x25c/0x260 [btrfs]
 RSP 
---[ end trace 854c315770163d00 ]---
------------[ cut here ]------------
kernel BUG at fs/btrfs/inode.c:247!
invalid opcode:  [#2] SMP
last sysfs file:
/sys/devices/pci:00/:00:14.4/:08:00.0/ssb1:0/net/eth0/carrier
CPU 1
Modules linked in: btrfs ipv6 radeon ttm drm_kms_helper drm
i2c_algo_bit snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_oss snd_mixer_oss nls_cp936 vfat fat ext3 jbd
cpufreq_ondemand cpufreq_performance cpufreq_powersave powernow_k8
freq_table mperf agpgart lp ppdev parport_pc parport fuse
snd_hda_codec_idt b43 ohci_hcd snd_hda_intel joydev mac80211
usb_storage b44 snd_hda_codec cfg80211 video ssb thermal snd_hwdep
snd_pcm sdhci_pci sdhci processor snd_timer dell_laptop mmc_core
pcmcia rfkill thermal_sys snd soundcore psmouse rtc_cmos i2c_piix4
battery snd_page_alloc k8temp i2c_core dell_wmi rtc_core button ac mii
led_class wmi pcmcia_core rtc_lib hwmon serio_raw shpchp ehci_hcd
output dcdbas evdev sg ext4 mbcache jbd2

Pid: 17414, comm: flush-btrfs-4 Tainted: G  D 2.6.35.4 #1
0PR523/Inspiron 1501
RIP: 0010:[]  []
cow_file_range_inline+0x16d/0x180 [btrfs]
RSP: 0018:880038f63790  EFLAGS: 00010282
RAX: fffb RBX:  RCX: 880022e7a510
RDX: 880001d1b800 RSI:  RDI: 880022e7ab40
RBP: 880038f63800 R08: 0001 R09: 
R10: 009e R11: 88004025a840 R12: 88002779f800
R13: 880024bc0938 R14: 009e R15: 1000
FS:  7f4dbb8a1700() GS:880001d0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7faabc0b0fc4 CR3: 276cc000 CR4: 06e0
DR0:  DR1:  DR

Re: [PATCH V2 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Andi Kleen
> So I improve the generic version of memcpy and memmove, and x86_64's memmove
> which are implemented by byte copy.

One should also add that most memmove()s and memcpy()s are actually
generated by gcc as inlines (especially if you don't use the
"make my code slow" option aka -Os) and don't use the fallback.
The fallback depends on the gcc version and if gcc thinks the 
data is aligned or not.

Sometimes one can get better code in the caller by making sure
gcc knows the correct alignment (e.g. with suitable 
types) and size. This might be worth looking at for btrfs
if it's really that memmove heavy.
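
As a rough illustration of the point about letting gcc know size and
alignment, compare a copy through a concrete type with a copy through
untyped pointers. This is a sketch only: struct key_sketch is a hypothetical
layout, not a btrfs structure, and whether gcc actually expands either call
inline depends on the compiler version and options.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record, used only for this illustration. */
struct key_sketch {
	uint64_t objectid;
	uint64_t offset;
	uint32_t type;
};

/*
 * Size and alignment are compile-time facts here, so the compiler is free
 * to expand the copy into a few plain loads and stores.
 */
static void copy_key_typed(struct key_sketch *dst, const struct key_sketch *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Neither alignment nor length is known in this function on its own, so the
 * compiler usually has to fall back to a call to the out-of-line memcpy
 * (unless it inlines the function into a caller with constant arguments).
 */
static void copy_bytes_untyped(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);
}
```

Both functions produce the same result; the difference is only in how much
the compiler knows when it generates the copy.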

> >
> >I have some systemtap scripts to measure size/alignment distributions of
> >copies on a kernel, if you have a particular workload you're interested
> >in those could be tried.
> 
> Good! Could you give me these script?

ftp://firstfloor.org/pub/ak/probes/csum.m4

You need to run them through .m4 first.
They don't measure memmove, but that should be easy to add.

-Andi


Re: BTRFS: Unbelievably slow with kvm/qemu

2010-09-02 Thread Ted Ts'o
On Tue, Aug 31, 2010 at 02:58:44PM -0700, K. Richard Pixley wrote:
>  On 20100831 14:46, Mike Fedyk wrote:
> >There is little reason not to use duplicate metadata.  Only small
> >files (less than 2kb) get stored in the tree, so there should be no
> >worries about images being duplicated without data duplication set at
> >mkfs time.
> My benchmarks show that for my kinds of data, btrfs is somewhat
> slower than ext4, (which is slightly slower than ext3 which is
> somewhat slower than ext2), when using the defaults, (ie, duplicate
> metadata).
> 
> It's a hair faster than ext2, (the fastest of the ext family), when
> using singleton metadata.  And ext2 isn't even crash resistant while
> btrfs has snapshots.

I'm really, really curious.  Can you describe your data and your
workload in detail?  You mentioned "continuous builders"; is this some
kind of tinderbox setup?

- Ted


Re: [PATCH V2 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Miao Xie

On Thu, 02 Sep 2010 10:55:58 +0200, Andi Kleen wrote:

Miao Xie  writes:


Changes from V1 to V2:
- change the version of GPL from version 2.1 to version 2

the kernel's memcpy and memmove is very inefficient. But the glibc version is
quite fast, in some cases it is 10 times faster than the kernel version. So I



Can you elaborate on which CPUs and with what workloads you measured that?


I did this test on an x86_64 box with 4 cores. The workload is quite low:
I just do a 500-byte copy 5,000,000 times.

The attached file is my test program.


For example, the kernel memcpy is optimized for copies smaller than a page
(the kernel very rarely copies anything larger than 4k); the glibc one
isn't. There are various other differences.

memcpy and memmove are very different. AFAIK no one has tried
to optimize memmove() before because traditionally it wasn't
used for anything performance critical in the kernel. Has that
changed? memcpy on the other hand, while not perfect,
is actually quite optimized for typical workloads.


Yes, the performance of memcpy on most architectures is good.

But some memmove implementations copy byte by byte, which is quite inefficient.
Unfortunately, those memmoves are used to modify the metadata of some
filesystems, such as btrfs. That is, memmove is important for the performance
of those filesystems.

So I improved the generic versions of memcpy and memmove, and x86_64's memmove,
which were implemented by byte copy.
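
To make the byte-copy issue concrete, here is a minimal userspace sketch of a
direction-aware memmove that moves 8 bytes at a time and falls back to bytes
only for the tail. It is an illustration under assumptions, not the code from
the patch set: memmove_sketch() is a hypothetical name, and the fixed-size
memcpy() calls are used only as a portable way to express unaligned 8-byte
loads and stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static void *memmove_sketch(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	uint64_t word;

	if (d == s || n == 0)
		return dst;

	assert(dst != NULL && src != NULL);

	if ((uintptr_t)d < (uintptr_t)s) {
		/* Destination is below source: copy forward, lowest bytes first. */
		while (n >= sizeof(word)) {
			memcpy(&word, s, sizeof(word));	/* load a whole word */
			memcpy(d, &word, sizeof(word));	/* then store it */
			s += sizeof(word);
			d += sizeof(word);
			n -= sizeof(word);
		}
		while (n--)
			*d++ = *s++;
	} else {
		/* Destination is above source: copy backward, highest bytes first. */
		d += n;
		s += n;
		while (n >= sizeof(word)) {
			d -= sizeof(word);
			s -= sizeof(word);
			memcpy(&word, s, sizeof(word));
			memcpy(d, &word, sizeof(word));
			n -= sizeof(word);
		}
		while (n--)
			*--d = *--s;
	}
	return dst;
}
```

Each word is fully loaded before it is stored, and the copy direction is
chosen so that a store never lands on bytes that still have to be read, which
is what keeps overlapping moves such as memmove(str + 400, str, 500) correct.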


One big difference between the kernel and glibc is that the kernel
is often cache cold, so e.g. the cost of a very large code footprint
memcpy/memset is harder to amortize.

Microbenchmarks often leave out that crucial variable.

I have some systemtap scripts to measure size/alignment distributions of
copies on a kernel, if you have a particular workload you're interested
in those could be tried.


Good! Could you give me these script?


Just copying the glibc bloat uncritically is very likely
the wrong move at least.


Agree!

Thanks!
Miao
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/time.h>

void get_start_time(struct timeval *tv)
{
	do_gettimeofday(tv);
}

void account_time(struct timeval *stv, struct timeval *etv, int loops)
{
	do_gettimeofday(etv);

	if (loops) {
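		/* borrow whole seconds until the microsecond subtraction is non-negative */
		while (etv->tv_usec < stv->tv_usec) {
			etv->tv_sec--;
			etv->tv_usec += 1000000;
		}

		etv->tv_usec -= stv->tv_usec;
		etv->tv_sec -= stv->tv_sec;

		/* normalize so that tv_usec stays below one second */
		while (etv->tv_usec >= 1000000) {
			etv->tv_usec -= 1000000;
			etv->tv_sec++;
		}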

		printk("\tTotal loops: %d\n", loops);
		printk("\tTotal time: %lds%ldus\n", etv->tv_sec, etv->tv_usec);
	} else
		printk("Didn't do any loop!\n");

}

char *str;

int init_module(void)
{
	struct timeval stv, etv;
	int loops, i;

	str = kmalloc(1000, GFP_KERNEL);
	if (!str)
		return -ENOMEM;
	/* 5,000,000 iterations, as stated in the message text */
	loops = i = 5000000;

	printk("memcpy:\n");
	get_start_time(&stv);
	while (i--)
		memcpy(str + 400, str, 500);
	account_time(&stv, &etv, loops);

	i = loops;
	printk("\nmemmove:\n");
	get_start_time(&stv);
	while (i--)
		memmove(str + 400, str, 500);
	account_time(&stv, &etv, loops);

	return 0;
}

void cleanup_module(void)
{
	kfree(str);
}

MODULE_LICENSE("GPL");


Re: [PATCH V2 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Andi Kleen
Miao Xie  writes:

> Changes from V1 to V2:
> - change the version of GPL from version 2.1 to version 2
>
> the kernel's memcpy and memmove is very inefficient. But the glibc version is
> quite fast, in some cases it is 10 times faster than the kernel version. So I


Can you elaborate on which CPUs and with what workloads you measured that?

For example, the kernel memcpy is optimized for copies smaller than a page
(the kernel very rarely copies anything larger than 4k); the glibc one
isn't. There are various other differences.

memcpy and memmove are very different. AFAIK no one has tried
to optimize memmove() before because traditionally it wasn't
used for anything performance critical in the kernel. Has that
changed? memcpy on the other hand, while not perfect,
is actually quite optimized for typical workloads.

One big difference between the kernel and glibc is that the kernel
is often cache cold, so e.g. the cost of a very large code footprint
memcpy/memset is harder to amortize.

Microbenchmarks often leave out that crucial variable.

I have some systemtap scripts to measure size/alignment distributions of
copies on a kernel, if you have a particular workload you're interested
in those could be tried.
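
For anyone who wants a feel for what a size/alignment distribution looks like
before trying the systemtap scripts, here is a minimal userspace sketch of
the bookkeeping. It is not the systemtap code: struct copy_stats and
record_copy() are hypothetical names, and the power-of-two bucketing is just
one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bucket i counts copies with 2^i <= len < 2^(i+1); the last bucket is
 * open-ended, and lengths 0 and 1 both land in bucket 0. */
#define SIZE_BUCKETS 16

struct copy_stats {
	unsigned long by_size[SIZE_BUCKETS];
	unsigned long aligned8;		/* both pointers 8-byte aligned */
	unsigned long unaligned;	/* at least one pointer is not */
};

/* Record one copy; the memory itself is never touched. */
static void record_copy(struct copy_stats *st, const void *dst,
			const void *src, size_t len)
{
	unsigned int bucket = 0;

	assert(st != NULL);

	while (bucket < SIZE_BUCKETS - 1 && (len >> (bucket + 1)) != 0)
		bucket++;
	st->by_size[bucket]++;

	if ((((uintptr_t)dst | (uintptr_t)src) & 7) == 0)
		st->aligned8++;
	else
		st->unaligned++;
}
```

Calling record_copy() next to every memcpy()/memmove() in a test program
gives a histogram that shows whether a workload is dominated by small,
page-sized or unaligned copies.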

Just copying the glibc bloat uncritically is very likely
the wrong move at least.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.


Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread chxand...@gmail.com
On 2 September 2010 15:24, Chris Samuel  wrote:
> On 02/09/10 18:16, chxand...@gmail.com wrote:
>
>> Umm, isn't the only one that can do that the copyright holder?
>
> The copyright holder can use whatever license they wish; the LGPL
> tells the licensee what rights *they* have, which includes distributing
> the software under the (more strict) GPLv2.
>
Wow, there is even a FAQ.
http://www.gnu.org/licenses/gpl-faq.html#compat-matrix-footnote-7
Notes 1 and 7 in the matrix apply, which means that anyone, not just
the license holder, may change from LGPL v2.1 to GPL v2.

Cheers

Chris


Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Chris Samuel
On 02/09/10 18:16, chxand...@gmail.com wrote:

> Umm, isn't the only one that can do that the copyright holder?

The copyright holder can use whatever license they wish; the LGPL
tells the licensee what rights *they* have, which includes distributing
the software under the (more strict) GPLv2.

cheers,
Chris (not a lawyer either, thankfully!)
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread chxand...@gmail.com
On 2 September 2010 14:07, Chris Samuel  wrote:
> On 02/09/10 16:50, Miao Xie wrote:
>
>> I just change the "2.1" to "2" in your patch, because the
> >> original code is LGPL v2.1, LGPL v2.1 permits us to apply
>> the terms of the ordinary GNU General Public License instead
>> of it.
> Ahhh excellent, I hadn't realised that was possible; well spotted!
Umm, isn't the only one that can do that the copyright holder?
Disclaimer: I am not a lawyer


Re: [patch v2 1/5] mm: add nofail variants of kmalloc kcalloc and kzalloc

2010-09-02 Thread Jiri Slaby
On 09/02/2010 03:02 AM, David Rientjes wrote:
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -334,6 +334,57 @@ static inline void *kzalloc_node(size_t size, gfp_t flags, int node)
>   return kmalloc_node(size, flags | __GFP_ZERO, node);
>  }
>  
> +/**
> + * kmalloc_nofail - infinitely loop until kmalloc() succeeds.
> + * @size: how many bytes of memory are required.
> + * @flags: the type of memory to allocate (see kmalloc).
> + *
> + * NOTE: no new callers of this function should be implemented!
> + * All memory allocations should be failable whenever possible.
> + */
> +static inline void *kmalloc_nofail(size_t size, gfp_t flags)
> +{
> +	void *ret;
> +
> +	for (;;) {
> +		ret = kmalloc(size, flags);
> +		if (ret)
> +			return ret;
> +		WARN_ON_ONCE(get_order(size) > PAGE_ALLOC_COSTLY_ORDER);

This doesn't work as you expect. kmalloc will warn every time it fails.
__GFP_NOFAIL used to disable the warning. Actually what's wrong with
__GFP_NOFAIL? I cannot find a reason in the changelogs why the patches
are needed.

> + }



-- 
js
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Peter Zijlstra
On Thu, 2010-09-02 at 09:55 +0200, Peter Zijlstra wrote:
> On Thu, 2010-09-02 at 13:44 +0800, Miao Xie wrote:
> > On Wed, 01 Sep 2010 17:25:36 +0200, Peter Zijlstra wrote:
> > > On Wed, 2010-09-01 at 18:36 +0800, Miao Xie wrote:
> > >> + * This program is free software; you can redistribute it and/or modify it
> > >> + * under the terms of the GNU General Public License as published by the Free
> > >> + * Software Foundation; either version 2.1 of the License, or (at your option)
> > >> + * any later version.
> > >
> > > The kernel is GPL v2, see COPYING. GPL v2.1 is not a suitable license.
> > 
> > Ok, I will change it.
> 
> Are you allowed to change it? From what I understand the FSF owns the
> copyright of that code; at least, that's what the preamble I cut away
> implied.

n/m, I should probably have read more inbox before replying.



Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Peter Zijlstra
On Thu, 2010-09-02 at 13:44 +0800, Miao Xie wrote:
> On Wed, 01 Sep 2010 17:25:36 +0200, Peter Zijlstra wrote:
> > On Wed, 2010-09-01 at 18:36 +0800, Miao Xie wrote:
> >> + * This program is free software; you can redistribute it and/or modify it
> >> + * under the terms of the GNU General Public License as published by the Free
> >> + * Software Foundation; either version 2.1 of the License, or (at your option)
> >> + * any later version.
> >
> > The kernel is GPL v2, see COPYING. GPL v2.1 is not a suitable license.
> 
> Ok, I will change it.

Are you allowed to change it? From what I understand the FSF owns the
copyright of that code; at least, that's what the preamble I cut away
implied.




Re: [PATCH V2 0/3] improve the performance of some memory copy functions

2010-09-02 Thread Miao Xie

On Thu, 2 Sep 2010 08:53:47 +0200, Ingo Molnar wrote:


* Miao Xie  wrote:


- change the version of GPL from version 2.1 to version 2


How were you able to do this? If the code derives from glibc (as your
comments in the patches suggest), and if glibc is under the GPL v2.1,
then you probably cannot simply change the license to v2.


I think we can do it, because the original code is under LGPL v2.1,
LGPL v2.1 permits us to apply the terms of the ordinary GNU General Public
License instead of it.

This is the clause of LGPL v2.1:
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in these
notices.

Once this change is made in a given copy, it is irreversible for that copy,
so the ordinary GNU General Public License applies to all subsequent copies
and derivative works made from that copy.

This option is useful when you wish to copy part of the code of the Library
into a program that is not a library.

The following is the URL of LGPL v2.1:
  http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html

Thanks
Miao


Re: [PATCH V2 0/3] improve the performance of some memory copy functions

2010-09-02 Thread Chris Samuel
On 02/09/10 16:53, Ingo Molnar wrote:

> and if glibc is under the GPL v2.1

It's LGPL v2.1 which can be converted to GPL v2 under section 3
of its license. See: http://www.gnu.org/licenses/lgpl-2.1.html

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [PATCH 1/3] lib: introduce some memory copy macros and functions

2010-09-02 Thread Chris Samuel
On 02/09/10 16:50, Miao Xie wrote:

> I just change the "2.1" to "2" in your patch, because the
> original code is LGPL v2.1, LGPL v2.1 permits us to apply
> the terms of the ordinary GNU General Public License instead
> of it.

Ahhh excellent, I hadn't realised that was possible; well spotted!

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC