date:20070407

pci card detection by kernel

2007-04-07 Thread S. Vishnu Priya



Dear All,

  I am new to this group. I have been asked to write a device driver for a 
new PCI card. My question is: will the kernel detect a new PCI card that is 
attached to the system? If not, how can we make the kernel detect the 
newly attached card? I am using kernel 2.6.11.12. Please advise.



Regards,
vpriya.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
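
On the detection question above: on 2.6 kernels the PCI core enumerates the buses at boot and publishes every function it finds under /sys/bus/pci/devices, whether or not a driver is loaded. A driver then binds to its card by registering an ID table with pci_register_driver(), and its probe routine is called for matching devices. The following userspace sketch only counts the entries the kernel has already published; the function name and the idea of passing the directory as a parameter are illustrative assumptions, not part of any kernel API.

```c
#include <dirent.h>
#include <stddef.h>

/*
 * Count the entries the kernel has published under a sysfs PCI
 * directory (normally /sys/bus/pci/devices).
 * Returns -1 if the directory cannot be opened, e.g. sysfs is not mounted.
 */
int count_pci_devices(const char *sysfs_dir)
{
	DIR *dir = opendir(sysfs_dir);
	struct dirent *ent;
	int count = 0;

	if (dir == NULL)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* Skip "." and ".."; every other entry is one PCI function,
		 * named domain:bus:slot.function. */
		if (ent->d_name[0] == '.')
			continue;
		count++;
	}
	closedir(dir);
	return count;
}
```

Calling count_pci_devices("/sys/bus/pci/devices") before and after fitting the card (with a reboot in between, unless the slot supports hotplug) should show whether the kernel sees it; lspci reports the same information.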

Re: [PATCH nf-2.6.22] [netfilter] early_drop improvement

2007-04-07 Thread Vasily Averin

Eric Dumazet wrote:
> Vasily Averin wrote:
>> When the number of conntracks is reached nf_conntrack_max limit,
>> early_drop() is
>> called and tries to free one of already used conntracks in one of the
>> hash
>> buckets. If it does not find any conntracks that may be freed, it
>> leads to transmission errors.
>> However it is not fair because of current hash bucket may be empty but
>> the
>> neighbour ones can have the number of conntracks that can be freed. On
>> the other
>> hand the number of checked conntracks is not limited and it can cause
>> a long delay.
>> The following patch limits the number of checked conntracks by average
>> number of
>> conntracks in one hash bucket and allows to search conntracks in other
>> hash buckets.
> 
> Hi Vasily
> 
>>
>>  atomic_inc(&ct->ct_general.use);
>>  break;
>>  }
>> +if (!--(*cnt)) {
>> +dropped = 1;
>> +break;
>> +}
> 
> 
>> +cnt = (nf_conntrack_max/nf_conntrack_htable_size) + 1;
> 
> I am sorry but this won't help in the case you mentioned in an earlier
> mail :
> 
> If nf_conntrack_max  < nf_conntrack_htable_size, cnt will be set to 1.
> 
> Then in __early_drop() you end up breaking the
> list_for_each_entry_reverse() loop after the first element was tested !
> Not what you intended I'm afraid, because you won't even scan the whole
> chain as before your patch :(

I would note that in my experiment I got errors when the first checked hash
bucket was empty. With this patch I have a guarantee that at least one conntrack
will be checked. I agree that 1 is not very high, but it is better than nothing.
I've checked, and my testcase works now.

> I believe you should not test --cnt in __early_drop() but in the caller.
> 
> (That is not counting the number of found cells, but the number of hash
> chains you tried)

I need to count conntracks, not hash buckets. It is also possible that all
the conntracks end up in a single hash bucket, and as you pointed out in
your previous letter, that may lead to long delays.

However, what do you think: would it be better to set the lower limit to the
default average number of conntracks per hash bucket?

cnt = max(8U, nf_conntrack_max/nf_conntrack_htable_size);

Thank you,
Vasily Averin
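
The two scan limits discussed above can be compared with a little arithmetic. This is a userspace sketch only: the function names are hypothetical, and the formulas are copied from the two "cnt = ..." lines quoted in this thread.

```c
/* Limit from the posted patch: average bucket load plus one. */
unsigned int budget_patch(unsigned int max, unsigned int htable_size)
{
	return max / htable_size + 1;
}

/* Limit proposed in the reply: average bucket load, but never below 8. */
unsigned int budget_floor(unsigned int max, unsigned int htable_size)
{
	unsigned int avg = max / htable_size;

	return avg > 8U ? avg : 8U;
}
```

With nf_conntrack_max = 1000 and nf_conntrack_htable_size = 8192 (the case Eric raises, max < htable_size), budget_patch() gives 1 and budget_floor() gives 8. With max = 65536 and the same table size they give 9 and 8 respectively.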


Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Christer Weinigel

[EMAIL PROTECTED] writes:

> Lennart. Tell me again that these results from 
> 
> http://linuxhelp.150m.com/resources/fs-benchmarks.htm and
> http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm
> 
> are not of interest to you. I still don't understand why you have your
> head in the sand.

Oh, for fucks sake, stop sounding like a broken record.  You have
repeated the same totally meaningless statistics more times than I
care to count.  Please shut the fuck up.

As you discovered yourself (even though you seem to fail to understand
the significance of your discovery), bonnie writes files that consist
of mostly zeroes.  If your normal use cases consist of creating a
bunch of files containing zeroes, reiser4 with compression will do
great.  Just lovely.  Except that nobody sane would store a lot of
files containing zeroes, except as an exercise in mental
masturbation.  So the two bonnie benchmarks with lzo and gzip are
totally meaningless for any real life usages.
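
Why zero-filled files flatter any compressing filesystem can be seen with even the crudest compressor. The sketch below uses a simple run-length encoding as a stand-in; it is not LZO or gzip and not what Reiser4 does, and the function name is made up for illustration.

```c
#include <stddef.h>

/*
 * Size in bytes of a naive run-length encoding of buf, where each run
 * of up to 255 identical bytes is stored as a (count, value) pair.
 */
size_t rle_encoded_size(const unsigned char *buf, size_t len)
{
	size_t out = 0;
	size_t i = 0;

	while (i < len) {
		size_t run = 1;

		while (i + run < len && buf[i + run] == buf[i] && run < 255)
			run++;
		out += 2;	/* one count byte plus one value byte */
		i += run;
	}
	return out;
}
```

A 128 kByte buffer of zeroes encodes to 1030 bytes, while data in which no two neighbouring bytes are equal doubles in size. Real compressors are far better than this, but the asymmetry is the same.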

As for the amount of disk needed to store three kernel trees, the
figures you quote show that Reiser4 does tail combining, where the tails
of multiple files are stored in one disk block.  A nice trick that
seems to save you about 15% disk space compared to ext3.  Now you have to
realise what that means, it means that if the disk block containing
those tails (or any metadata pointing at that block) gets corrupted,
instead of just losing one disk block for one file, you will have lost
the tail for all the files sharing that disk block.  Depending on your
personal priorities, saving 15% of the space may be worth the risk to
you, or maybe not.  Personally, for the only disk I'm short on space
on, I mostly store flac encoded images of my CD collection, and saving
2kByte out of every 300MByte disk simply doesn't make any difference,
and I much prefer a stable file system that I can trust not to lose my
data.  You might make different choices.
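
The space that tail combining can recover is bounded by the unused part of each file's last block. A rough sketch of that bound, assuming a fixed, non-zero block size and ignoring metadata (the function names and the 4096-byte block used in the note below are illustrative assumptions):

```c
#include <stddef.h>

/* Bytes left unused in the last block of a file of `size` bytes. */
unsigned long tail_slack(unsigned long size, unsigned long block)
{
	unsigned long tail = size % block;

	return tail ? block - tail : 0;
}

/* Total slack over n files: an upper bound on what tail packing saves. */
unsigned long total_slack(const unsigned long *sizes, size_t n,
			  unsigned long block)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += tail_slack(sizes[i], block);
	return sum;
}
```

With 4096-byte blocks a 100-byte file wastes 3996 bytes, which is why a source tree full of small files gains a lot, whereas a single 300 MByte file can never waste more than one block.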

The same goes for just about every feature that you tout, it has its
advantages, and it has its disadvantages.  Doing compression on data
is great if the data you store is compressible, and sucks if it isn't.
Doing compression on each disk block and then packing multiple
compressed blocks into each physical disk block will probably save
some space if the data is compressible, but at the same time it means
that you will spend a lot of CPU (and cache footprint) compressing and
uncompressing that data.  On a single user system where the CPU is
mostly idle it might not make much of a difference, on a heavily
loaded multiuser system it might do.

Logs can be compressed quite well using a block based compression
scheme, but the logs can be compressed even better by doing
compression on the whole file with gzip.  So what's the best choice,
to do transparent compression on the fly giving ok compression or
teaching the userspace tools to do compression of old logs and get
really good compression?  Or maybe disk space really isn't that
important anyway and the best thing is to just leave the logs
uncompressed.

Another example: one of the things Reiser3 is supposed to be really
good for is storing an INN news spool, doing tail merging of lots of
individual files containing articles gives a great space saving, and
since it's just a news spool, reliability in the face of system crashes
really doesn't matter all that much.  On the other hand, INN's Cyclic
News File System running on top of ext2 is probably an even better
choice in that case.  What do you want to use?

What I want to get at is that you can troll the mailing lists (and
crossposting stupid inflammatory material with an inane subject to a
bunch of mailing lists the way you have done definitely is trolling)
trying to say that whatever you're trying to sell is the best, but at
the end, if a file system is better or not is a lot more complex than
quoting just one benchmark (which, once again, is meaningless,
compressing a lot of zeroes is simple and really does not tell you
anything about real world usages).  And there are other considerations
too, even if Reiser4 would be the best thing since sliced bread, can
I trust Hans Reiser to support Reiser4 for the next five years?  Or
will he drop support for Reiser4 the same way he dropped support for
the old Reiser3 when Reiser4 came along?  Or will he drop Reiser4 when
the grant to do Reiser 4 development expires?

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Christer Weinigel <[EMAIL PROTECTED]>  http://www.weinigel.se

Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread johnrobertbanks

Teddy,

It is a pity you don't address the full set of results, when you make
your snide comments.

Now since you have them,... why don't you make reasoned comment about
them.

You can read more here:

http://linuxhelp.150m.com/resources/fs-benchmarks.htm and
http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

.-------------------------.
| FILESYSTEM | TIME |DISK |
| TYPE       |(secs)|USAGE|
.-------------------------.
|REISER4 lzo | 1938 | 278 |
|REISER4 gzip| 2295 | 213 |
|REISER4     | 3462 | 692 |
|EXT2        | 4092 | 816 |
|JFS         | 4225 | 806 |
|EXT4        | 4408 | 816 |
|EXT3        | 4421 | 816 |
|XFS         | 4625 | 779 |
|REISER3     | 6178 | 793 |
|FAT32       |12342 | 988 |
|NTFS-3g     |10414 | 772 |
.-------------------------.


Column one measures the time taken to complete the bonnie++ benchmarking
test (run with the parameters bonnie++ -n128:128k:0)

Column two, Disk Usage: measures the amount of disk used to store 655MB
of raw data (which was 3 different copies of the Linux kernel sources).

OR LOOK AT THE FULL RESULTS:

.-------------------------------------------------.
|File         |Disk |Copy |Copy |Tar  |Unzip| Del |
|System       |Usage|655MB|655MB|Gzip |UnTar| 2.5 |
|Type         | (MB)| (1) | (2) |655MB|655MB| Gig |
.-------------------------------------------------.
|REISER4 gzip | 213 | 148 |  68 |  83 |  48 |  70 |
|REISER4 lzo  | 278 | 138 |  56 |  80 |  34 |  84 |
|REISER4 tails| 673 | 148 |  63 |  78 |  33 |  65 |
|REISER4      | 692 | 148 |  55 |  67 |  25 |  56 |
|NTFS3g       | 772 |1333 |1426 | 585 | 767 | 194 |
|NTFS         | 779 | 781 | 173 |   X |   X |   X |
|REISER3      | 793 | 184 |  98 |  85 |  63 |  22 |
|XFS          | 799 | 220 | 173 | 119 |  90 | 106 |
|JFS          | 806 | 228 | 202 |  95 |  97 | 127 |
|EXT4 extents | 806 | 162 |  55 |  69 |  36 |  32 |
|EXT4 default | 816 | 174 |  70 |  74 |  42 |  50 |
|EXT3         | 816 | 182 |  74 |  73 |  43 |  51 |
|EXT2         | 816 | 201 |  82 |  73 |  39 |  67 |
|FAT32        | 988 | 253 | 158 | 118 |  81 |  95 |
.-------------------------------------------------.


Each test was performed 5 times and the average value recorded.
Disk Usage: The amount of disk used to store the data (which was 3
different copies of the Linux kernel sources).
The raw data (without filesystem meta-data, block alignment wastage,
etc) was 655MB.
Copy 655MB (1): Copy the data over a partition boundary.
Copy 655MB (2): Copy the data within a partition.
Tar Gzip 655MB: Tar and Gzip the data.
Unzip UnTar 655MB: UnGzip and UnTar the data.
Del 2.5 Gig: Delete everything just written (about 2.5 Gig).
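
To read the Disk Usage column against the 655MB of raw data, it can help to express each figure as a percentage of the raw size. A small sketch (the function name is hypothetical; the inputs in the note below are taken from the table above, and the result is truncated to a whole percent):

```c
/* Disk usage as a whole-number percentage of the raw data size. */
unsigned int percent_of_raw(unsigned int used_mb, unsigned int raw_mb)
{
	return used_mb * 100u / raw_mb;
}
```

For example, REISER4 gzip (213) comes to 32% of the raw size, plain REISER4 (692) to 105%, EXT3 (816) to 124% and FAT32 (988) to 150%.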


To get a feel for the performance increases that can be achieved by
using compression, we look at the total time (in seconds) to run the
test:

bonnie++ -n128:128k:0 (bonnie++ is Version 1.93c)

.-------------------.
| FILESYSTEM | TIME |
.-------------------.
|REISER4 lzo |  1938|
|REISER4 gzip|  2295|
|REISER4     |  3462|
|EXT4        |  4408|
|EXT2        |  4092|
|JFS         |  4225|
|EXT3        |  4421|
|XFS         |  4625|
|REISER3     |  6178|
|FAT32       | 12342|
|NTFS-3g     |>10414|
.-------------------.



On Sat, 7 Apr 2007 22:56:32 -0400, "Theodore Tso" <[EMAIL PROTECTED]> said:
> On Sat, Apr 07, 2007 at 05:44:57PM -0700, [EMAIL PROTECTED]
> wrote:
> > To get a feel for the performance increases that can be achieved by
> > using compression, we look at the total time (in seconds) to run the
> > test:
> 
> You mean the performance increases of writing a file which is mostly
> all zero's?  Yawn.
> 
>   - Ted
-- 
  
  [EMAIL PROTECTED]



Re: [PATCH 5/5] partitions: Rewrite check_partition to remove necessity of check_part

2007-04-07 Thread Randy Dunlap

On Sat, 7 Apr 2007 23:42:00 -0400 (EDT) John Anthony Kazos Jr. wrote:

> From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>
> 
> Removes the entire check_part array and uses the presence of new stub 
> functions in header files in fs/partitions to call them directly in a list 
> and let the compiler optimize away any that aren't compiled in. Also fixes 
> a bug where " unable to read partition table" would never be printed.
> 
> Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>
> 
> ---
> 
> I did this because it seems to be much easier to understand. And even 
> though the memory used by the array was only sizeof(void*)*n+1 for the 
> number of partitions configured to be included, that's still unnecessary 
> usage.
> 
> This code also has two warn_unused_result problems, but I don't feel up to 
> the task of fixing those yet.
> 
> Within the old loop, if res < 0, res = 0, and immediately after the loop, 
> the function returns if res > 0. Therefore, it must be that res == 0 after 
> the loop. if (!err) is true only if err == 0 so again, res == 0, so if 
> (!res) is true, and the else condition can never be reached. I 
> re-interpreted the intent to be "log a message if the partition format is 
> unrecognized, and log a message if the partition cannot be read but only 
> if the warning is requested". Judging by the "This is ugly" comment, the 
> warning is not intended to account for actual I/O errors when reading the 
> partition-table block, but rather to detect when the partitions are 
> checked for a device with a medium that has been removed.
> 
> --- linux-2.6.20.6-orig/fs/partitions/check.c 2007-04-06 16:02:48.0 
> -0400
> +++ linux-2.6.20.6-mod/fs/partitions/check.c  2007-04-07 21:50:22.0 
> -0400
> @@ -41,73 +41,6 @@ extern void md_autodetect_dev(dev_t dev)
>  
>  int warn_no_part = 1; /*This is ugly: should make genhd removable media 
> aware*/

[snip]

>  /*
>   * disk_name() is used by partition check code and the genhd driver.
>   * It formats the devicename of the indicated disk into
> @@ -149,11 +82,125 @@ const char *__bdevname(dev_t dev, char *
>  
>  EXPORT_SYMBOL(__bdevname);
>  
> +static int
> +check_succeeded(int res, struct parsed_partitions *state, int *err)
> +{
> + if (res < 0) {
> + /* We have hit an I/O error which we don't report now.
> +  * But record it, and let the others do their job.
> +  */
> + *err = res;
> + res = 0;
> + }
> +
> + if (!res) {
> + memset(state->parts, 0, sizeof(state->parts));
> + }

No braces on single-statement blocks.  (many of these below also)

> +
> + return res;
> +}
> +
> +static struct parsed_partitions *
> +do_check_list(struct parsed_partitions *state, struct block_device *bdev)
> +{
> + int err;
> +
> + err = 0;
> +
> +/*

Use tab to indent (above), not (only) spaces.

> +  * Probe partition formats with tables at disk address 0
> +  * that also have an ADFS boot block at 0xdc0.
> +  */
> +	if (check_succeeded(adfspart_check_ICS(state, bdev), state, &err)) {
> +		return state;
> +	}
> +	if (check_succeeded(adfspart_check_POWERTEC(state, bdev), state, &err)) {
> +		return state;
> +	}
> +	if (check_succeeded(adfspart_check_EESOX(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +/*

tab above

> +  * Now move on to formats that only have partition info at
> +  * disk address 0xdc0.  Since these may also have stale
> +  * PC/BIOS partition tables, they need to come before
> +  * the msdos entry.
> +  */
> +	if (check_succeeded(adfspart_check_CUMANA(state, bdev), state, &err)) {
> +		return state;
> +	}
> +	if (check_succeeded(adfspart_check_ADFS(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	/* this must come before msdos */
> +	if (check_succeeded(efi_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(sgi_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	/* this must come before msdos */
> +	if (check_succeeded(ldm_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(msdos_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(osf_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(sun_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(amiga_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(atari_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +	if (check_succeeded(mac_partition(state, bdev), state, &err)) {
> +		return state;
> +	}
> +
> +

Re: [PATCH 4/5] partitions: Add conditionals and static inline stubs to helpers in headers

2007-04-07 Thread Randy Dunlap

On Sat, 7 Apr 2007 23:40:50 -0400 (EDT) John Anthony Kazos Jr. wrote:

> From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>
> 
> Functions of the form adfspart_check_FOO and foo_partition defined in 
> fs/partitions/*.h are helper functions called in a deliberate order by 
> check_partition in check.c. Add conditional-compilation directives and 
> static inline no-op functions to allow code to indiscriminately call these 
> functions irrespective of whether they do anything, removing the necessity 
> of oodles of #ifdef/#endif in function bodies.
> 
> Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>
> 
> ---
> 
> The next patch changes the check.c code to use these function definitions 
> in a nicer way.
> 
> --- linux-2.6.20.6-orig/fs/partitions/acorn.h 2007-04-06 16:02:48.0 
> -0400
> +++ linux-2.6.20.6-mod/fs/partitions/acorn.h  2007-04-07 20:02:51.0 
> -0400

Send patches against the latest Linus-tree unless the patches are
specifically for the -stable branch (where 2.6.20.y is -stable
and 2.6.21-rc6 or -git is Linus).  (Maybe it won't matter for
these patches)

See Andrew's The Perfect Patch for more info:
  http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt


> @@ -7,8 +7,47 @@
>   *  format, and everyone stick to it?
>   */
>  
> -int adfspart_check_CUMANA(struct parsed_partitions *state, struct 
> block_device *bdev);
> -int adfspart_check_ADFS(struct parsed_partitions *state, struct block_device 
> *bdev);
> -int adfspart_check_ICS(struct parsed_partitions *state, struct block_device 
> *bdev);
> -int adfspart_check_POWERTEC(struct parsed_partitions *state, struct 
> block_device *bdev);
> -int adfspart_check_EESOX(struct parsed_partitions *state, struct 
> block_device *bdev);
> +#ifdef CONFIG_ACORN_PARTITION_ICS
> + int adfspart_check_ICS(struct parsed_partitions *state, struct 
> block_device *bdev);
> +#else
> + static inline int adfspart_check_ICS(struct parsed_partitions *state, 
> struct block_device *bdev)
> + {
> + return 0;
> + }
> +#endif
> +
> +#ifdef CONFIG_ACORN_PARTITION_POWERTEC
> + int adfspart_check_POWERTEC(struct parsed_partitions *state, struct 
> block_device *bdev);
> +#else
> + static inline int adfspart_check_POWERTEC(struct parsed_partitions 
> *state, struct block_device *bdev)
> + {
> + return 0;
> + }
> +#endif

We don't indent functions inside ifdef/else/endif blocks.
Just act as though the preprocessor lines are not there.
(That's our current common practice; I don't see that documented
anywhere.)


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

[PATCH 5/5] partitions: Rewrite check_partition to remove necessity of check_part

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Removes the entire check_part array and uses the presence of new stub 
functions in header files in fs/partitions to call them directly in a list 
and let the compiler optimize away any that aren't compiled in. Also fixes 
a bug where " unable to read partition table" would never be printed.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

I did this because it seems to be much easier to understand. And even 
though the memory used by the array was only sizeof(void*)*n+1 for the 
number of partitions configured to be included, that's still unnecessary 
usage.

This code also has two warn_unused_result problems, but I don't feel up to 
the task of fixing those yet.

Within the old loop, if res < 0, res = 0, and immediately after the loop, 
the function returns if res > 0. Therefore, it must be that res == 0 after 
the loop. if (!err) is true only if err == 0 so again, res == 0, so if 
(!res) is true, and the else condition can never be reached. I 
re-interpreted the intent to be "log a message if the partition format is 
unrecognized, and log a message if the partition cannot be read but only 
if the warning is requested". Judging by the "This is ugly" comment, the 
warning is not intended to account for actual I/O errors when reading the 
partition-table block, but rather to detect when the partitions are 
checked for a device with a medium that has been removed.
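
The unreachable branch described above can be modelled in userspace. This is a reconstruction from the description in this message, not a copy of fs/partitions/check.c: the enum, the function names and the assignment inside "if (!err)" are assumptions, and the second function is only one possible way to make the warning reachable, not necessarily what this patch does.

```c
enum report { REPORT_UNKNOWN, REPORT_UNREADABLE, REPORT_NONE };

/*
 * Tail of the old function as described: reached only when no probe
 * recognised the disk, so res is 0 on entry; err holds the last I/O
 * error seen by a probe, or 0.
 */
enum report old_report(int err, int warn_no_part)
{
	int res = 0;

	if (!err)
		res = err;	/* only runs when err == 0: res stays 0 */
	if (!res)
		return REPORT_UNKNOWN;
	else if (warn_no_part)
		return REPORT_UNREADABLE;	/* never reached */
	return REPORT_NONE;
}

/* Hypothetical variant: propagate a recorded error into res. */
enum report fixed_report(int err, int warn_no_part)
{
	int res = 0;

	if (err)
		res = err;
	if (!res)
		return REPORT_UNKNOWN;
	else if (warn_no_part)
		return REPORT_UNREADABLE;
	return REPORT_NONE;
}
```

With err = -5 and warn_no_part = 1, old_report() still answers REPORT_UNKNOWN, while fixed_report() answers REPORT_UNREADABLE.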

--- linux-2.6.20.6-orig/fs/partitions/check.c   2007-04-06 16:02:48.0 
-0400
+++ linux-2.6.20.6-mod/fs/partitions/check.c2007-04-07 21:50:22.0 
-0400
@@ -41,73 +41,6 @@ extern void md_autodetect_dev(dev_t dev)
 
 int warn_no_part = 1; /*This is ugly: should make genhd removable media aware*/
 
-static int (*check_part[])(struct parsed_partitions *, struct block_device *) 
= {
-   /*
-* Probe partition formats with tables at disk address 0
-* that also have an ADFS boot block at 0xdc0.
-*/
-#ifdef CONFIG_ACORN_PARTITION_ICS
-   adfspart_check_ICS,
-#endif
-#ifdef CONFIG_ACORN_PARTITION_POWERTEC
-   adfspart_check_POWERTEC,
-#endif
-#ifdef CONFIG_ACORN_PARTITION_EESOX
-   adfspart_check_EESOX,
-#endif
-
-   /*
-* Now move on to formats that only have partition info at
-* disk address 0xdc0.  Since these may also have stale
-* PC/BIOS partition tables, they need to come before
-* the msdos entry.
-*/
-#ifdef CONFIG_ACORN_PARTITION_CUMANA
-   adfspart_check_CUMANA,
-#endif
-#ifdef CONFIG_ACORN_PARTITION_ADFS
-   adfspart_check_ADFS,
-#endif
-
-#ifdef CONFIG_EFI_PARTITION
-   efi_partition,  /* this must come before msdos */
-#endif
-#ifdef CONFIG_SGI_PARTITION
-   sgi_partition,
-#endif
-#ifdef CONFIG_LDM_PARTITION
-   ldm_partition,  /* this must come before msdos */
-#endif
-#ifdef CONFIG_MSDOS_PARTITION
-   msdos_partition,
-#endif
-#ifdef CONFIG_OSF_PARTITION
-   osf_partition,
-#endif
-#ifdef CONFIG_SUN_PARTITION
-   sun_partition,
-#endif
-#ifdef CONFIG_AMIGA_PARTITION
-   amiga_partition,
-#endif
-#ifdef CONFIG_ATARI_PARTITION
-   atari_partition,
-#endif
-#ifdef CONFIG_MAC_PARTITION
-   mac_partition,
-#endif
-#ifdef CONFIG_ULTRIX_PARTITION
-   ultrix_partition,
-#endif
-#ifdef CONFIG_IBM_PARTITION
-   ibm_partition,
-#endif
-#ifdef CONFIG_KARMA_PARTITION
-   karma_partition,
-#endif
-   NULL
-};
- 
 /*
  * disk_name() is used by partition check code and the genhd driver.
  * It formats the devicename of the indicated disk into
@@ -149,11 +82,125 @@ const char *__bdevname(dev_t dev, char *
 
 EXPORT_SYMBOL(__bdevname);
 
+static int
+check_succeeded(int res, struct parsed_partitions *state, int *err)
+{
+   if (res < 0) {
+   /* We have hit an I/O error which we don't report now.
+* But record it, and let the others do their job.
+*/
+   *err = res;
+   res = 0;
+   }
+
+   if (!res) {
+   memset(state->parts, 0, sizeof(state->parts));
+   }
+
+   return res;
+}
+
+static struct parsed_partitions *
+do_check_list(struct parsed_partitions *state, struct block_device *bdev)
+{
+   int err;
+
+   err = 0;
+
+/*
+* Probe partition formats with tables at disk address 0
+* that also have an ADFS boot block at 0xdc0.
+*/
+	if (check_succeeded(adfspart_check_ICS(state, bdev), state, &err)) {
+		return state;
+	}
+	if (check_succeeded(adfspart_check_POWERTEC(state, bdev), state, &err)) {
+		return state;
+	}
+	if (check_succeeded(adfspart_check_EESOX(state, bdev), state, &err)) {
+		return state;
+	}
+
+/*
+* Now move on to formats that only have partition info at
+* disk address 0xdc0.  Since these may also have stale
+* PC/BIOS partition tables, they need to come before
+

Re: [PATCH] Re: kernel oops with badly formatted module option

2007-04-07 Thread Larry Finger


Randy Dunlap wrote:

On Sat, 07 Apr 2007 19:21:01 -0500 Larry Finger wrote:


With the following line in /etc/modprobe.conf.local:

options bcm43xx fwpostfix = ".fw3" locale=8

the kernel oops below is generated. I realize that the line should have no whitespace around the 
"=", but I do not feel that an oops is the best way to report the syntax error. Could there be a more gentle failure?



From: Randy Dunlap <[EMAIL PROTECTED]>

Catch malformed kernel parameter usage of "param = value".
Spaces are not supported, but don't cause a kernel fault on
such usage, just report an error.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

It works here.  ACKed by: Larry Finger <[EMAIL PROTECTED]>


---
 kernel/params.c |4 
 1 file changed, 4 insertions(+)

--- linux-2.6.21-rc6.orig/kernel/params.c
+++ linux-2.6.21-rc6/kernel/params.c
@@ -356,6 +356,10 @@ int param_set_copystring(const char *val
 {
 	struct kparam_string *kps = kp->arg;
 
+	if (!val) {
+		printk(KERN_ERR "%s: missing param set value\n", kp->name);
+		return -EINVAL;
+	}
 	if (strlen(val)+1 > kps->maxlen) {
 		printk(KERN_ERR "%s: string doesn't fit in %u chars.\n",
 		       kp->name, kps->maxlen-1);
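
The effect of the added check can be tried outside the kernel. The sketch below is a userspace model, assuming (as the patch implies) that a parameter written as "param = value" reaches the setter with a NULL value; the demo_ names are hypothetical stand-ins for the kernel's types, and fprintf() replaces printk().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct demo_kparam_string {	/* stand-in for struct kparam_string */
	unsigned int maxlen;
	char *string;
};

int demo_set_copystring(const char *name, const char *val,
			const struct demo_kparam_string *kps)
{
	if (!val) {
		/* Without this test, strlen(NULL) below would fault. */
		fprintf(stderr, "%s: missing param set value\n", name);
		return -EINVAL;
	}
	if (strlen(val) + 1 > kps->maxlen) {
		fprintf(stderr, "%s: string doesn't fit in %u chars.\n",
			name, kps->maxlen - 1);
		return -ENOSPC;
	}
	strcpy(kps->string, val);
	return 0;
}
```

Passing NULL as val now yields -EINVAL and a message instead of a fault; a value that fits is copied, and one that is too long is rejected as before.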




[PATCH 2/5] partitions: Add Kconfig dependency to clear benign compiler warning

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Adds a dependency to ACORN_PARTITION_RISCIX in fs/partitions/Kconfig to 
prevent compilation of the function riscix_partition which is used only 
within ACORN_PARTITION_CUMANA and ACORN_PARTITION_ADFS sections, thereby 
preventing an unused-function compiler warning if ACORN_PARTITION_RISCIX 
is defined and neither of the other two are.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

Nothing depending on RISCIX is defined except inside CUMANA and ADFS, so 
without either of those, nothing using the RISCIX stuff is compiled, hence 
the unused-function warning.

--- linux-2.6.20.6-orig/fs/partitions/Kconfig   2007-04-06 16:02:48.0 
-0400
+++ linux-2.6.20.6-mod/fs/partitions/Kconfig2007-04-07 22:10:15.0 
-0400
@@ -62,7 +62,7 @@ config ACORN_PARTITION_POWERTEC
 config ACORN_PARTITION_RISCIX
bool "RISCiX partition support" if PARTITION_ADVANCED
default y if ARCH_ACORN
-   depends on ACORN_PARTITION
+   depends on ACORN_PARTITION && (ACORN_PARTITION_CUMANA || ACORN_PARTITION_ADFS)
help
  Once upon a time, there was a native Unix port for the Acorn series
  of machines called RISCiX.  If you say 'Y' here, Linux will be able

[PATCH 3/5] partitions: Add conditionals to acorn.c to clear benign compiler warnings

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Adds conditional-compilation directives to fs/partitions/acorn.c to 
prevent compilation of the functions adfs_partition and linux_partition 
which are used only within ACORN_PARTITION_CUMANA and ACORN_PARTITION_ADFS 
sections, thereby preventing unused-function compiler warnings if 
ACORN_PARTITION is defined and neither of the other two are.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

Nothing using these two functions is defined except inside CUMANA and 
ADFS, so without either of those, nothing using these functions is 
compiled, hence the unused-function warning. Both CUMANA and ADFS code use 
both functions so they must be included if either symbol is defined.
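
The guard can be tried in isolation. In this sketch the DEMO_ macros and all function names are hypothetical; the point is only that a static helper shared by two optional users is compiled when at least one of them is, so gcc -Wunused-function stays quiet when neither is selected.

```c
/* Shared helper: only built if one of its users is built. */
#if defined(DEMO_CUMANA) || defined(DEMO_ADFS)
static int shared_helper(int x)
{
	return x + 1;
}
#endif

#ifdef DEMO_CUMANA
int check_cumana(int x)
{
	return shared_helper(x);
}
#endif

#ifdef DEMO_ADFS
int check_adfs(int x)
{
	return shared_helper(x) * 2;
}
#endif

/* Number of helper users that were compiled in. */
int users_compiled_in(void)
{
	int n = 0;

#ifdef DEMO_CUMANA
	n++;
#endif
#ifdef DEMO_ADFS
	n++;
#endif
	return n;
}
```

Building with neither macro defined leaves shared_helper() out entirely; adding -DDEMO_ADFS brings it back together with its caller.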

--- linux-2.6.20.6-orig/fs/partitions/acorn.c   2007-04-06 16:02:48.0 
-0400
+++ linux-2.6.20.6-mod/fs/partitions/acorn.c2007-04-07 21:40:39.0 
-0400
@@ -25,6 +25,8 @@
 #define PARTITION_RISCIX_SCSI  2
 #define PARTITION_LINUX		9
 
+#if defined(CONFIG_ACORN_PARTITION_CUMANA) || defined(CONFIG_ACORN_PARTITION_ADFS)
+
 static struct adfs_discrecord *
 adfs_partition(struct parsed_partitions *state, char *name, char *data,
   unsigned long first_sector, int slot)
@@ -49,6 +51,8 @@ adfs_partition(struct parsed_partitions 
return dr;
 }
 
+#endif
+
 #ifdef CONFIG_ACORN_PARTITION_RISCIX
 
 struct riscix_part {
@@ -106,6 +110,8 @@ riscix_partition(struct parsed_partition
 }
 #endif
 
+#if defined(CONFIG_ACORN_PARTITION_CUMANA) || defined(CONFIG_ACORN_PARTITION_ADFS)
+
 #define LINUX_NATIVE_MAGIC 0xdeafa1de
 #define LINUX_SWAP_MAGIC   0xdeafab1e
 
@@ -147,6 +153,8 @@ linux_partition(struct parsed_partitions
return slot;
 }
 
+#endif
+
 #ifdef CONFIG_ACORN_PARTITION_CUMANA
 int
 adfspart_check_CUMANA(struct parsed_partitions *state, struct block_device 
*bdev)

[PATCH 4/5] partitions: Add conditionals and static inline stubs to helpers in headers

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Functions of the form adfspart_check_FOO and foo_partition defined in 
fs/partitions/*.h are helper functions called in a deliberate order by 
check_partition in check.c. Add conditional-compilation directives and 
static inline no-op functions to allow code to indiscriminately call these 
functions irrespective of whether they do anything, removing the necessity 
of oodles of #ifdef/#endif in function bodies.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

The next patch changes the check.c code to use these function definitions 
in a nicer way.
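
The stub pattern described above, reduced to a self-contained sketch. CONFIG_DEMO_PARTITION, demo_partition() and probe_all() are made-up names for illustration; the two struct types are only forward-declared so the example compiles on its own.

```c
struct parsed_partitions;	/* opaque for this sketch */
struct block_device;

#ifdef CONFIG_DEMO_PARTITION
/* Real probe, defined in another file when the option is enabled. */
int demo_partition(struct parsed_partitions *state, struct block_device *bdev);
#else
/* No-op stub: "not recognised", which the compiler can optimise away. */
static inline int demo_partition(struct parsed_partitions *state,
				 struct block_device *bdev)
{
	(void)state;
	(void)bdev;
	return 0;
}
#endif

/* The caller needs no #ifdef: it calls the probe unconditionally. */
int probe_all(struct parsed_partitions *state, struct block_device *bdev)
{
	if (demo_partition(state, bdev) > 0)
		return 1;
	return 0;
}
```

With the option undefined, probe_all() returns 0 and an optimising compiler can drop the call altogether.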

--- linux-2.6.20.6-orig/fs/partitions/acorn.h   2007-04-06 16:02:48.0 
-0400
+++ linux-2.6.20.6-mod/fs/partitions/acorn.h2007-04-07 20:02:51.0 
-0400
@@ -7,8 +7,47 @@
  *  format, and everyone stick to it?
  */
 
-int adfspart_check_CUMANA(struct parsed_partitions *state, struct block_device 
*bdev);
-int adfspart_check_ADFS(struct parsed_partitions *state, struct block_device 
*bdev);
-int adfspart_check_ICS(struct parsed_partitions *state, struct block_device 
*bdev);
-int adfspart_check_POWERTEC(struct parsed_partitions *state, struct 
block_device *bdev);
-int adfspart_check_EESOX(struct parsed_partitions *state, struct block_device 
*bdev);
+#ifdef CONFIG_ACORN_PARTITION_ICS
+   int adfspart_check_ICS(struct parsed_partitions *state, struct 
block_device *bdev);
+#else
+   static inline int adfspart_check_ICS(struct parsed_partitions *state, 
struct block_device *bdev)
+   {
+   return 0;
+   }
+#endif
+
+#ifdef CONFIG_ACORN_PARTITION_POWERTEC
+   int adfspart_check_POWERTEC(struct parsed_partitions *state, struct 
block_device *bdev);
+#else
+   static inline int adfspart_check_POWERTEC(struct parsed_partitions 
*state, struct block_device *bdev)
+   {
+   return 0;
+   }
+#endif
+
+#ifdef CONFIG_ACORN_PARTITION_EESOX
+   int adfspart_check_EESOX(struct parsed_partitions *state, struct 
block_device *bdev);
+#else
+   static inline int adfspart_check_EESOX(struct parsed_partitions *state, 
struct block_device *bdev)
+   {
+   return 0;
+   }
+#endif
+
+#ifdef CONFIG_ACORN_PARTITION_CUMANA
+   int adfspart_check_CUMANA(struct parsed_partitions *state, struct 
block_device *bdev);
+#else
+   static inline int adfspart_check_CUMANA(struct parsed_partitions 
*state, struct block_device *bdev)
+   {
+   return 0;
+   }
+#endif
+
+#ifdef CONFIG_ACORN_PARTITION_ADFS
+   int adfspart_check_ADFS(struct parsed_partitions *state, struct 
block_device *bdev);
+#else
+   static inline int adfspart_check_ADFS(struct parsed_partitions *state, 
struct block_device *bdev)
+   {
+   return 0;
+   }
+#endif
--- linux-2.6.20.6-orig/fs/partitions/amiga.h   2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/amiga.h    2007-04-07 20:04:10.0 -0400
@@ -2,5 +2,11 @@
  *  fs/partitions/amiga.h
  */
 
-int amiga_partition(struct parsed_partitions *state, struct block_device *bdev);
-
+#ifdef CONFIG_AMIGA_PARTITION
+   int amiga_partition(struct parsed_partitions *state, struct block_device *bdev);
+#else
+   static inline int amiga_partition(struct parsed_partitions *state, struct block_device *bdev)
+   {
+       return 0;
+   }
+#endif
--- linux-2.6.20.6-orig/fs/partitions/atari.h   2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/atari.h    2007-04-07 20:05:34.0 -0400
@@ -31,4 +31,11 @@ struct rootsector
   u16 checksum;    /* checksum for bootable disks */
 } __attribute__((__packed__));
 
-int atari_partition(struct parsed_partitions *state, struct block_device *bdev);
+#ifdef CONFIG_ATARI_PARTITION
+   int atari_partition(struct parsed_partitions *state, struct block_device *bdev);
+#else
+   static inline int atari_partition(struct parsed_partitions *state, struct block_device *bdev)
+   {
+       return 0;
+   }
+#endif
--- linux-2.6.20.6-orig/fs/partitions/efi.h 2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/efi.h  2007-04-07 20:06:28.0 -0400
@@ -106,7 +106,14 @@ typedef struct _legacy_mbr {
 } __attribute__ ((packed)) legacy_mbr;
 
 /* Functions */
-extern int efi_partition(struct parsed_partitions *state, struct block_device *bdev);
+#ifdef CONFIG_EFI_PARTITION
+   int efi_partition(struct parsed_partitions *state, struct block_device *bdev);
+#else
+   static inline int efi_partition(struct parsed_partitions *state, struct block_device *bdev)
+   {
+       return 0;
+   }
+#endif
 
 #endif
 
--- linux-2.6.20.6-orig/fs/partitions/ibm.h 2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/ibm.h  2007-04-07 22:06:16.0 -0400
@@ -1 +1,8 @@
-int ibm_partition(struct parsed_partitions *, struct block_device *);

[PATCH 1/5] partitions: Touch up comments for check.h and ibm.h

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Adds top-of-file identifying comments to check.h and ibm.h in 
fs/partitions similar to the other files in the directory. Removes an 
obsolete comment from check.h left over from devfs.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

add_gd_partition used to be in check.c for devfs. That identifier no 
longer exists anywhere within the tree.

--- linux-2.6.20.6-orig/fs/partitions/check.h   2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/check.h    2007-04-07 21:26:10.0 -0400
@@ -1,10 +1,10 @@
+/*
+ *  fs/partitions/check.h
+ */
+
 #include 
 #include 
 
-/*
- * add_gd_partition adds a partitions details to the devices partition
- * description.
- */
 enum { MAX_PART = 256 };
 
 struct parsed_partitions {
--- linux-2.6.20.6-orig/fs/partitions/ibm.h 2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/ibm.h  2007-04-07 22:02:18.0 -0400
@@ -1 +1,5 @@
+/*
+ *  fs/partitions/ibm.h
+ */
+
 int ibm_partition(struct parsed_partitions *, struct block_device *);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/5] partitions: Changes to fs/partitions for readability and efficiency

2007-04-07 Thread John Anthony Kazos Jr.

In addition to the Kconfig help text patch I submitted earlier, this is a 
set of patches to touch up the partition handling files and also to change 
the "array of function pointers" algorithm of the main checking function 
to "list of calls to possible stub functions" to better fit in with the 
rest of the kernel code, to reduce memory usage by a few dozen bytes, and 
to generally be easier (in my opinion) to understand.
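
To make the two styles concrete, here is a minimal user-space sketch. It is
not the code from the patch set: every demo_* name, the struct layouts and
the 0xAA55 magic value are hypothetical stand-ins, chosen only to show the
shape of "array of function pointers" versus "list of calls to possible stub
functions".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the shape matters. */
struct parsed_partitions { int found; };
struct block_device { int magic; };

/* Pretend this format is configured in: a real probe. */
static int demo_msdos_partition(struct parsed_partitions *state,
				struct block_device *bdev)
{
	if (bdev->magic != 0xAA55)
		return 0;		/* not this format */
	state->found = 1;
	return 1;			/* recognised */
}

/* Pretend this format is configured out: an inline stub that the
 * compiler can fold away when it is called directly. */
static inline int demo_amiga_partition(struct parsed_partitions *state,
				       struct block_device *bdev)
{
	(void)state;
	(void)bdev;
	return 0;
}

/* Style 1: NULL-terminated array of function pointers. */
static int (*const demo_check_part[])(struct parsed_partitions *,
				      struct block_device *) = {
	demo_amiga_partition,
	demo_msdos_partition,
	NULL
};

int demo_check_table(struct parsed_partitions *state,
		     struct block_device *bdev)
{
	int res = 0;

	for (size_t i = 0; !res && demo_check_part[i]; i++)
		res = demo_check_part[i](state, bdev);
	return res;
}

/* Style 2: direct calls; a stub for a disabled format costs nothing. */
int demo_check_calls(struct parsed_partitions *state,
		     struct block_device *bdev)
{
	int res;

	if ((res = demo_amiga_partition(state, bdev)) != 0)
		return res;
	if ((res = demo_msdos_partition(state, bdev)) != 0)
		return res;
	return 0;
}
```

Both functions return the same result for any input; the difference is only
that the second needs no table in memory and lets the compiler drop the
stubbed-out probes.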

Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Theodore Tso

On Sat, Apr 07, 2007 at 05:44:57PM -0700, [EMAIL PROTECTED] wrote:
> To get a feel for the performance increases that can be achieved by
> using compression, we look at the total time (in seconds) to run the
> test:

You mean the performance increases of writing a file which is mostly
all zeros?  Yawn.

- Ted

Re: [RFD driver-core] Lifetime problems of the current driver model

2007-04-07 Thread Tejun Heo

Hello,

Alan Stern wrote:
>>   The problem here is that kobject_get() in sysfs_schedule_callback()
>>   doesn't grab the module backing the kobject it's grabbing.  By the
>>   time (ss->func)(ss->kobj) runs, scsi_mod is already gone.
> 
> As the author of this routine, I wish you had included my name in your
> CC: list.  :-(

Sorry.  I will from now on.  I posted a related patchset yesterday.
You can see it at...

  http://thread.gmane.org/gmane.linux.kernel/513334

> The problem here isn't exactly as you described.  scsi_mod needs to be
> pinned (1) because it is the owner of the kobject and hence will be
> called when the kobject is released, and (2) because it is the owner
> of the callback routine.  However this is just a detail; clearly the
> bug needs to be fixed.
> 
> One possibility would be to have scsi_mod's exit_scsi() routine call
> flush_scheduled_work().  Another would be to add such a call in
> sys_delete_module().  Neither of these is attractive.  They would add
> overhead when it's not needed, and they would deadlock if a workqueue
> routine tried to unload a module.
> 
> On balance, the patch below seems better.  Do you agree?

Agreed.  Grabbing module on function schedule should fix the problem.

> With regard to your analysis of lifetime issues, there is a whole
> aspect you did not mention.  A basic assumption of the refcounting
> approach is that once X has a reference to Y, X can freely access and
> use Y as much as it wants until it drops the reference.
> 
> However this is not true when X is a device driver and Y is a device
> structure.  Drivers can be unbound from devices.  If X has been
> unbound from Y then it must not access Y again, no matter how many
> references it possesses.  After all, some other driver may have bound
> to Y in the meantime; this other driver would not appreciate the
> interference.
> 
> Just as bad, if Y represents a hot-pluggable device then some other
> device may have been plugged in and may be using Y's old address.  We
> don't want X sending commands to a new device, thinking that it is Y!
> 
> The complications caused by this requirement affect both the subsystem
> code and device drivers.  Drivers must synchronize their release()
> methods with every action they take -- and refcounts cannot provide
> synchronization.
> 
> A similar problem afflicts the char-device subsystem, and here even
> less care has been taken to address the issues.  The race between
> open() and unregister() is resolved in many places by relying on the
> BKL!
> 
> We should be able to make things better and easier than they are.
> Orphaning open sysfs files was a move in this direction.  But I doubt
> they will ever become truly simple and clear.

Yeap, you're right.  My goal is to make the driver detach point the automatic
final synchronization point such that after the driver unregisters
itself from all the upper layers including sysfs, it's guaranteed that
there's no user left to the driver or the device it was driving.  Taking
sysfs out of the lifetime equation is a big step toward this goal.
Converting all subsystems and upper layers to immediately disconnect
from the device on driver detach would take some time but I don't think
it will be too difficult.
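
A minimal user-space sketch of the point being discussed, namely that a
reference count keeps the memory alive but does not tell a driver whether it
is still bound. All names here (demo_device, demo_unbind, and so on) are
hypothetical, and a pthread mutex stands in for whatever locking a real
subsystem would use; this is an illustration, not the driver-core design.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model of a device a driver may be unbound from. */
struct demo_device {
	pthread_mutex_t lock;	/* serialises access against unbind */
	bool bound;		/* true while our driver owns the device */
	int refcount;		/* keeps the memory alive, nothing more */
	int commands_sent;
};

void demo_device_init(struct demo_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->bound = true;
	dev->refcount = 1;
	dev->commands_sent = 0;
}

/* Returns 0 on success, -1 if the driver has already been unbound. */
int demo_send_command(struct demo_device *dev)
{
	int ret = -1;

	pthread_mutex_lock(&dev->lock);
	if (dev->bound) {
		dev->commands_sent++;
		ret = 0;
	}
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Once this returns, no later demo_send_command() touches the device,
 * however many references the caller still holds. */
void demo_unbind(struct demo_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->bound = false;
	pthread_mutex_unlock(&dev->lock);
}
```

The reference count is untouched by demo_unbind(); it is the lock plus the
bound flag that form the synchronization point.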

Thanks.

-- 
tejun

[PATCH] Re: kernel oops with badly formatted module option

2007-04-07 Thread Randy Dunlap

On Sat, 07 Apr 2007 19:21:01 -0500 Larry Finger wrote:

> With the following line in /etc/modprobe.conf.local:
> 
> options bcm43xx fwpostfix = ".fw3" locale=8
> 
> the kernel oops below is generated. I realize that the line should have no
> whitespace around the "=", but I do not feel that an oops is the best way to
> report the syntax error. Could there be a more gentle failure?


From: Randy Dunlap <[EMAIL PROTECTED]>

Catch malformed kernel parameter usage of "param = value".
Spaces are not supported, but don't cause a kernel fault on
such usage, just report an error.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 kernel/params.c |    4 ++++
 1 file changed, 4 insertions(+)

--- linux-2.6.21-rc6.orig/kernel/params.c
+++ linux-2.6.21-rc6/kernel/params.c
@@ -356,6 +356,10 @@ int param_set_copystring(const char *val
 {
 	struct kparam_string *kps = kp->arg;
 
+	if (!val) {
+		printk(KERN_ERR "%s: missing param set value\n", kp->name);
+		return -EINVAL;
+	}
 	if (strlen(val)+1 > kps->maxlen) {
 		printk(KERN_ERR "%s: string doesn't fit in %u chars.\n",
 		       kp->name, kps->maxlen-1);
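
For readers who want to see why the check matters, here is a small
user-space sketch. It assumes (this is my reading, not something stated in
the report) that the option string is split on whitespace first, so the
token "fwpostfix" arrives with no "=" and therefore with a NULL value. The
demo_* functions are hypothetical and are not the kernel's parser.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Split one "name=value" token in place. *val is NULL when there is no '='. */
void demo_split_param(char *token, char **name, char **val)
{
	char *eq = strchr(token, '=');

	*name = token;
	if (eq) {
		*eq = '\0';
		*val = eq + 1;
	} else {
		*val = NULL;
	}
}

/* Copy val into buf, rejecting a missing or over-long value. */
int demo_set_copystring(const char *val, char *buf, size_t maxlen)
{
	if (!val)
		return -EINVAL;		/* same idea as the check the patch adds */
	if (strlen(val) + 1 > maxlen)
		return -ENOSPC;
	strcpy(buf, val);
	return 0;
}
```

Without the NULL test, the strlen() call would dereference a null pointer,
which is the user-space analogue of the reported oops.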

Re: [PATCH 13/14] sysfs: kill attribute file orphaning

2007-04-07 Thread Tejun Heo

Tejun Heo wrote:
> Now that sysfs_dirent can be disconnected from kobject on deletion,
> there is no need to orphan each attribute file.  All [bin_]attribute
> nodes are automatically orphaned when the parent node is deleted.
> Kill attribute file orphaning.
> 
> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>

This isn't really true.  An attribute can belong to a different module
from the kobj's and we still need per-attribute orphaning as the backing
methods can go away.  Please ignore patches 13 and 14 of this patchset.

Thanks.

-- 
tejun

Re: compressing intermediate files with LZO on the fly

2007-04-07 Thread David Lang


On Sat, 7 Apr 2007, Willy Tarreau wrote:

> Hi Al,
>
> On Sat, Apr 07, 2007 at 02:32:34PM +0300, Al Boldi wrote:
>> Willy Tarreau wrote:
>>> ... for some usages (temporary space),
>>> light compression can increase speed. For instance, when processing logs,
>>> I get better speed by compressing intermediate files with LZO on the fly.
>>
>> How can you do that on ext3?
>>
>> Also, can you do that on a partition block-io level?
>
> No, sorry for the confusion. My scripts simply do :
>
> $ lzop -cd file1.lzo | process | lzop -c3 > file2.lzo
>
> With decent CPU, you can reach higher read/write data rates than what a
> single off-the-shelf disk can achieve. For this reason, I think that
> reiser4 would be worth trying for this particular usage. And in this case,
> I'm not interested at all in reliability. It's just temporary storage. If
> the disk fails, I throw it away and buy a new one.


I see the same thing with my nightly scripts that do syslog analysis. Last year 
I trimmed 2 hours from the nightly run by processing compressed files instead of 
uncompressed ones (after I did this I configured it to compress the files as 
they are rolled, but rolling every 5 min the compression takes <20 seconds, so 
the compression is < 30 min).

Now I just need to find a version of split that can compress its output files.

David Lang

Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 18:32:04 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> On Sat, 7 Apr 2007, Andrew Morton wrote:
> 
> > > I just tried the approach that we discussed earlier and it was not 
> > > nice either.
> > 
> > We've discussed at least three approaches, so we don't know to what you 
> > refer.
> 
> That's the approach of checking two flags at the same time. In that case 
> the compiler will generate an "and-immediate" and then a 
> "compare-immediate" one branch but  Yuck.

Right.

	movl	(%ebx), %eax	# .flags, tmp399
	andl	$48, %eax	#, tmp399
	cmpl	$48, %eax	#, tmp399
	je	.L265	#,

what's "yuck" about that?

With the single page flag:

	movl	(%ebx), %eax	#* page.521, D.21940
	testb	$32, %al	#, D.21940
	jne	.L265	#,

So you're talking about saving one sole single silly solitary instruction.


> > Because I don't expect there will be much efficiency difference between the
> > above and the use of another page flag.
> 
> Then we end up with all these small efficiency differences in all 
> the code paths. I'd rather go for optimal performance in a frequently used 
> construct like this.

You can save that worrisome single instruction in the common case by putting the
handling of the uncommon compound pages out of line, as I indicated.
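
In C terms, the two instruction sequences quoted above correspond roughly to
the following two tests. This is only a sketch: the bit values 16 and 32 are
read off the immediates in the assembly (48 == 16 | 32), while the DEMO_*
names and function names are hypothetical and say nothing about which real
page flags are involved.

```c
#include <stdbool.h>

/* Bit values taken from the quoted assembly: 48 == 16 | 32. */
#define DEMO_FLAG_A	16UL	/* hypothetical name for the bit worth 16 */
#define DEMO_FLAG_B	32UL	/* hypothetical name for the bit worth 32 */

/* Two shared flags: both must be set, so mask and then compare
 * (the andl/cmpl/je sequence). */
bool demo_is_tail_two_flags(unsigned long flags)
{
	const unsigned long mask = DEMO_FLAG_A | DEMO_FLAG_B;

	return (flags & mask) == mask;
}

/* One dedicated flag: a single bit test is enough (the testb/jne sequence). */
bool demo_is_tail_one_flag(unsigned long flags)
{
	return (flags & DEMO_FLAG_B) != 0;
}
```

The masked compare is needed in the first form because either bit alone
must not count as a match.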


Re: COMPILING AND CONFIGURING A NEW KERNEL.

2007-04-07 Thread Lennart Sorensen

On Sat, Apr 07, 2007 at 06:02:37PM -0700, [EMAIL PROTECTED] wrote:
> Thats interesting, I didn't know that. 
> 
> Do you know if deb-pkg and rpm-pkg take care of creating the initrd
> automatically.
> 
> I seriously doubt they do.
> 
> Actually, I guess I only need to compile another kernel to find out.

Certainly on Debian make-kpkg actually does have options for initrd
(--initrd) and will call mkinitrd/update-initramfs/yaird at package
install time to generate an initrd for the kernel to deal with raid,
lvm, etc.  It works quite well in general.

I would be surprised if Red Hat didn't do something similar.

--
Len Sorensen

Re: [RFC] partitions: CONFIG_BLK_DEV_MD and modular RAID support

2007-04-07 Thread Randy Dunlap

On Sat, 7 Apr 2007 18:47:20 -0400 (EDT) John Anthony Kazos Jr. wrote:

> (Linux v2.6.20.6.)
> 
> The function md_autodetect_dev is defined in drivers/md/md.c. Its 
> declaration is on line 1443, outside of conditionals. However, both its 
> use on line 1455 and its definition on line 5600 are inside "#ifndef 
> MODULE" conditionals. So it seems obvious that the declaration should be 
> inside conditionals as well.
> 
> However, this function is separately declared and used in 
> fs/partitions/check.c but not inside the same conditional, which means if 
> md.c is compiled as a module, check.c will be referencing an undefined 
> symbol.

I don't get a build error when BLK_DEV_MD=m.  Do you?

Do you want to determine why there is no build error?
Hint:  look at include/linux/autoconf.h.
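
A sketch of what the hint points at, for anyone following along. With a
tristate option set to "m", the generated header defines the _MODULE variant
of the symbol rather than the plain one, so code guarded by the plain symbol
is compiled out. The guard around the call below is an assumption about how
check.c is written, and the demo_* names are hypothetical.

```c
/* What the generated header is expected to contain for BLK_DEV_MD=m:
 * only the _MODULE variant is defined. */
#define CONFIG_BLK_DEV_MD_MODULE 1
/* #define CONFIG_BLK_DEV_MD 1   <- would only be present for BLK_DEV_MD=y */

#ifdef CONFIG_BLK_DEV_MD
void demo_md_autodetect_dev(int dev);	/* would need md.c built in */
#endif

/* Returns 1 if the built-in autodetect hook was called, 0 if compiled out. */
int demo_check_raid_partition(int dev)
{
#ifdef CONFIG_BLK_DEV_MD
	demo_md_autodetect_dev(dev);
	return 1;
#else
	(void)dev;
	return 0;
#endif
}
```

Because the reference to the function disappears at preprocessing time,
there is no undefined symbol left for the linker to complain about.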

> Should the conditionals around md_autodetect_dev be changed to make sure 
> CONFIG_BLK_DEV_MD=y, or does the function need to be extracted from md.c 
> so it can be used by check.c in any case?


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Christoph Lameter

On Sat, 7 Apr 2007, Andrew Morton wrote:

> > I just tried the approach that we discussed earlier and it was not 
> > nice either.
> 
> We've discussed at least three approaches, so we don't know to what you refer.

That's the approach of checking two flags at the same time. In that case 
the compiler will generate an "and-immediate" and then a 
"compare-immediate" one branch but  Yuck.

> Because I don't expect there will be much efficiency difference between the
> above and the use of another page flag.

Then we end up with all these small efficiency differences in all 
the code paths. I'd rather go for optimal performance in a frequently used 
construct like this.

This check is not rare. It is done on every SLAB free and on every 
get_page() and put_page(). Let's do the page flag.


Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Lennart Sorensen

On Sat, Apr 07, 2007 at 05:44:57PM -0700, [EMAIL PROTECTED] wrote:
> Lennart. Tell me again that these results from 
> 
> http://linuxhelp.150m.com/resources/fs-benchmarks.htm and
> http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

Hmm, copying kernel sources around.  Not that interesting.  How does it
handle mpeg2 files (the majority of my personal data files are on a
mythtv system)?  So a few large files, with mostly linear access, and
the occasional file deletion.  Compression would gain nothing.

> are not of interest to you. I still don't understand why you have your
> head in the sand.

Well I find it hard to get excited about new filesystems.  I had
sufficiently nasty data losses due to reiserfs 3 back in the early 2.4
kernel days that I no longer get excited about new filesystems.  Now I
want something I trust that hasn't destroyed any of my data.  I tried
XFS for a while, but the early 2.6 kernels had some nasty bugs in XFS too
that made that pretty much unusable.  Now I just stick with ext3.  Screw
performance, give me something that works all the time.

> .-------------------------.
> | FILESYSTEM | TIME |DISK |
> | TYPE       |(secs)|USAGE|
> .-------------------------.
> |REISER4 lzo | 1938 | 278 |
> |REISER4 gzip| 2295 | 213 |
> |REISER4     | 3462 | 692 |
> |EXT2        | 4092 | 816 |
> |JFS         | 4225 | 806 |
> |EXT4        | 4408 | 816 |
> |EXT3        | 4421 | 816 |
> |XFS         | 4625 | 779 |
> |REISER3     | 6178 | 793 |
> |FAT32       |12342 | 988 |
> |NTFS-3g     |10414 | 772 |
> .-------------------------.
> 
> 
> Column one measures the time taken to complete the bonnie++ benchmarking
> test (run with the parameters bonnie++ -n128:128k:0)

Time without cpu usage is not interesting.  If you can increase
filesystem speed by 10% by doubling cpu load, then I don't want the
increase.  It is all relative.  Wall clock time by itself just doesn't
contain enough data to be useful.

> Column two, Disk Usage: measures the amount of disk used to store 655MB
> of raw data (which was 3 different copies of the Linux kernel sources).

I remember disk compression from the DOS days.  Disk space is too cheap
to bother with that crap anymore.  I don't care if it can theoretically
turn idle cpu cycles into improved disk speed.  Sometimes I don't have
idle cpu cycles to waste on that.

> OR LOOK AT THE FULL RESULTS:
> 
> .-------------------------------------------------.
> |File         |Disk |Copy |Copy |Tar  |Unzip| Del |
> |System       |Usage|655MB|655MB|Gzip |UnTar| 2.5 |
> |Type         | (MB)| (1) | (2) |655MB|655MB| Gig |
> .-------------------------------------------------.
> |REISER4 gzip | 213 | 148 |  68 |  83 |  48 |  70 |
> |REISER4 lzo  | 278 | 138 |  56 |  80 |  34 |  84 |
> |REISER4 tails| 673 | 148 |  63 |  78 |  33 |  65 |
> |REISER4      | 692 | 148 |  55 |  67 |  25 |  56 |
> |NTFS3g       | 772 |1333 |1426 | 585 | 767 | 194 |
> |NTFS         | 779 | 781 | 173 |   X |   X |   X |
> |REISER3      | 793 | 184 |  98 |  85 |  63 |  22 |
> |XFS          | 799 | 220 | 173 | 119 |  90 | 106 |
> |JFS          | 806 | 228 | 202 |  95 |  97 | 127 |
> |EXT4 extents | 806 | 162 |  55 |  69 |  36 |  32 |
> |EXT4 default | 816 | 174 |  70 |  74 |  42 |  50 |
> |EXT3         | 816 | 182 |  74 |  73 |  43 |  51 |
> |EXT2         | 816 | 201 |  82 |  73 |  39 |  67 |
> |FAT32        | 988 | 253 | 158 | 118 |  81 |  95 |
> .-------------------------------------------------.
> 
> 
> Each test was performed 5 times and the average value recorded.
> Disk Usage: The amount of disk used to store the data (which was 3
> different copies of the Linux kernel sources).
> The raw data (without filesystem meta-data, block alignment wastage,
> etc) was 655MB.
> Copy 655MB (1): Copy the data over a partition boundary.
> Copy 655MB (2): Copy the data within a partition.
> Tar Gzip 655MB: Tar and Gzip the data.
> Unzip UnTar 655MB: UnGzip and UnTar the data.
> Del 2.5 Gig: Delete everything just written (about 2.5 Gig).
> 
> To get a feel for the performance increases that can be achieved by
> using compression, we look at the total time (in seconds) to run the
> test:

Kernel sources are some of the most compressible data files around.  Try
with some interesting data instead, like something with larger files,
mostly binary, which isn't likely to compress very much.

> bonnie++ -n128:128k:0 (bonnie++ is Version 1.93c)
> 
> .-------------------.
> | FILESYSTEM | TIME |
> .-------------------.
> |REISER4 lzo |  1938|
> |REISER4 gzip|  2295|
> |REISER4     |  3462|
> |EXT4        |  4408|
> |EXT2        |  4092|
> |JFS         |  4225|
> |EXT3        |  4421|
> |XFS         |  4625|
> |REISER3     |  6178|
> |FAT32       | 12342|
> |NTFS-3g     |>10414|
> .-------------------.
> -- 

Well Reiser4 certainly looks impressive, but I still want to know what the
cpu load is like, what the repair tools are like, how well it handles
power failures in the middle of a write (I didn't like the way reiser3
would

Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 17:21:38 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> On Sat, 7 Apr 2007, Andrew Morton wrote:
> 
> > Which is all a ton of fun, but this subversion of the architecture's
> > freedom to use volatile, memory barriers etc is a worry.  We do the same in
> > page_alloc.c, of course...  
> 
> I just tried the approach that we discussed earlier and it was not 
> nice either.

We've discussed at least three approaches, so we don't know to what you refer.

>  Lets just use a page flag please.

Nope, try harder.

PageCompound is an unlikely case.  Back in the old days we would have done

	if (PageCompound(page))
		goto out_of_line;
back:

	do_stuff_with(page);
	return;

out_of_line:
	if (PageTail(page)) {
		page = page_tail(page);
		goto back;
	}

and nowadays we hope that gcc does the above for us.  If it doesn't do it
for us, perhaps it needs open-coded help.

Because I don't expect there will be much efficiency difference between the
above and the use of another page flag.
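
One way the "open-coded help" could look with today's compilers, as a
user-space sketch only. The struct and all demo_* names are hypothetical
stand-ins for struct page and its helpers; __builtin_expect and the noinline
attribute are GCC/Clang extensions, and whether the compiler really lays the
code out this way would need checking in the generated assembly.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; field names are illustrative. */
struct demo_page {
	bool compound;			/* part of a compound page */
	bool tail;			/* tail page of a compound page */
	struct demo_page *head;		/* valid only when tail is true */
	int users;
};

#define demo_unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */

/* Rare case, kept out of line so the hot path stays small. */
static __attribute__((noinline)) struct demo_page *
demo_compound_slow(struct demo_page *page)
{
	return page->tail ? page->head : page;
}

/* Hot path: one branch, hinted as not taken, for ordinary pages. */
void demo_get_page(struct demo_page *page)
{
	if (demo_unlikely(page->compound))
		page = demo_compound_slow(page);
	page->users++;
}
```

For an ordinary page the only extra cost is the single test of the compound
bit; the tail-page lookup happens in the out-of-line helper.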


Re: COMPILING AND CONFIGURING A NEW KERNEL.

2007-04-07 Thread johnrobertbanks


> It is *highly* recommended that you change the kernel identifier at
> least slightly, so that you can install '2.6.20-1.local' without overlaying
> the vendor-supplied 2.6.20-1 kernel.  Among other things, this lets you
> boot back to the equivalent code level in the vendor kernel, so you can
> figure out if it's your .config file that's broken, or if you hit a bug
> upgrading from 2.6.19-10 to 2.6.20-1.

I agree. I think your advice is *highly* recommended.

I had this problem once after forcing an upgrade, which removed the
working kernel. I just booted the kernel and stuff, from another
partition.

John.
-- 
  
  [EMAIL PROTECTED]


Re: COMPILING AND CONFIGURING A NEW KERNEL.

2007-04-07 Thread johnrobertbanks


> It is quite possible to build a kernel that has all the drivers built-in,
> but still require an initrd file.  For instance, if you have a recent
> RedHat or Fedora system, '/' may very well be on an LVM partition, which
> means you need an initrd to do a 'lvm varyonvg' before mounting your real
> root filesystem will work

Thanks Valdis,

That's interesting, I didn't know that. 

Do you know if deb-pkg and rpm-pkg take care of creating the initrd
automatically?

I seriously doubt they do.

Actually, I guess I only need to compile another kernel to find out.

John.
-- 
  
  [EMAIL PROTECTED]


Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread johnrobertbanks

Lennart. Tell me again that these results from 

http://linuxhelp.150m.com/resources/fs-benchmarks.htm and
http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

are not of interest to you. I still don't understand why you have your
head in the sand.

.-------------------------.
| FILESYSTEM | TIME |DISK |
| TYPE       |(secs)|USAGE|
.-------------------------.
|REISER4 lzo | 1938 | 278 |
|REISER4 gzip| 2295 | 213 |
|REISER4     | 3462 | 692 |
|EXT2        | 4092 | 816 |
|JFS         | 4225 | 806 |
|EXT4        | 4408 | 816 |
|EXT3        | 4421 | 816 |
|XFS         | 4625 | 779 |
|REISER3     | 6178 | 793 |
|FAT32       |12342 | 988 |
|NTFS-3g     |10414 | 772 |
.-------------------------.


Column one measures the time taken to complete the bonnie++ benchmarking
test (run with the parameters bonnie++ -n128:128k:0)

Column two, Disk Usage: measures the amount of disk used to store 655MB
of raw data (which was 3 different copies of the Linux kernel sources).

OR LOOK AT THE FULL RESULTS:

.-------------------------------------------------.
|File         |Disk |Copy |Copy |Tar  |Unzip| Del |
|System       |Usage|655MB|655MB|Gzip |UnTar| 2.5 |
|Type         | (MB)| (1) | (2) |655MB|655MB| Gig |
.-------------------------------------------------.
|REISER4 gzip | 213 | 148 |  68 |  83 |  48 |  70 |
|REISER4 lzo  | 278 | 138 |  56 |  80 |  34 |  84 |
|REISER4 tails| 673 | 148 |  63 |  78 |  33 |  65 |
|REISER4      | 692 | 148 |  55 |  67 |  25 |  56 |
|NTFS3g       | 772 |1333 |1426 | 585 | 767 | 194 |
|NTFS         | 779 | 781 | 173 |   X |   X |   X |
|REISER3      | 793 | 184 |  98 |  85 |  63 |  22 |
|XFS          | 799 | 220 | 173 | 119 |  90 | 106 |
|JFS          | 806 | 228 | 202 |  95 |  97 | 127 |
|EXT4 extents | 806 | 162 |  55 |  69 |  36 |  32 |
|EXT4 default | 816 | 174 |  70 |  74 |  42 |  50 |
|EXT3         | 816 | 182 |  74 |  73 |  43 |  51 |
|EXT2         | 816 | 201 |  82 |  73 |  39 |  67 |
|FAT32        | 988 | 253 | 158 | 118 |  81 |  95 |
.-------------------------------------------------.


Each test was performed 5 times and the average value recorded.
Disk Usage: The amount of disk used to store the data (which was 3
different copies of the Linux kernel sources).
The raw data (without filesystem meta-data, block alignment wastage,
etc) was 655MB.
Copy 655MB (1): Copy the data over a partition boundary.
Copy 655MB (2): Copy the data within a partition.
Tar Gzip 655MB: Tar and Gzip the data.
Unzip UnTar 655MB: UnGzip and UnTar the data.
Del 2.5 Gig: Delete everything just written (about 2.5 Gig).

To get a feel for the performance increases that can be achieved by
using compression, we look at the total time (in seconds) to run the
test:

bonnie++ -n128:128k:0 (bonnie++ is Version 1.93c)

.-------------------.
| FILESYSTEM | TIME |
.-------------------.
|REISER4 lzo |  1938|
|REISER4 gzip|  2295|
|REISER4     |  3462|
|EXT4        |  4408|
|EXT2        |  4092|
|JFS         |  4225|
|EXT3        |  4421|
|XFS         |  4625|
|REISER3     |  6178|
|FAT32       | 12342|
|NTFS-3g     |>10414|
.-------------------.
-- 
  
  [EMAIL PROTECTED]


Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Krzysztof Halasa

[EMAIL PROTECTED] writes:

> I am quite sure that the kernel RPM file is *already* compressed, at least
> somewhat.

Sure - that's the point - it's better to have the tool compress
data when it makes sense.

OTOH I think Reiser4 fs is not about transparent compression, it's
rather about the plugins etc. There are other filesystems with
transparent compression, that's nothing new.
-- 
Krzysztof Halasa

Re: init's children list is long and slows reaping children.

2007-04-07 Thread Eric W. Biederman

Oleg Nesterov <[EMAIL PROTECTED]> writes:

> On 04/06, Oleg Nesterov wrote:
>> 
>> Perhaps,
>> 
>> --- t/kernel/exit.c~ 2007-04-06 23:31:31.0 +0400
>> +++ t/kernel/exit.c  2007-04-06 23:31:57.0 +0400
>> @@ -275,10 +275,7 @@ static void reparent_to_init(void)
>>  	remove_parent(current);
>>  	current->parent = child_reaper(current);
>>  	current->real_parent = child_reaper(current);
>> -	add_parent(current);
>> -
>> -	/* Set the exit signal to SIGCHLD so we signal init on exit */
>> -	current->exit_signal = SIGCHLD;
>> +	current->exit_signal = -1;
>>  
>>  	if (!has_rt_policy(current) && (task_nice(current) < 0))
>>  		set_user_nice(current, 0);
>> 
>> is enough. init is still our parent (make ps happy), but it can't see us,
>> we are not on ->children list.
>
> OK, this doesn't work. A multi-threaded init may do execve(). 

Good catch.  daemonize must die! 

> So, we can re-parent a kernel thread to swapper. In that case it doesn't matter
> if we put task on ->children list or not.

Yes.  We can.

> User-visible change. Acceptable?

We would have user visible changes when we changed to kthread_create anyway,
and it's none of user space's business, so certainly.

>> Off course, we also need to add preparent_to_init() to kthread() and
>> (say) stopmachine(). Or we can create kernel_thread_detached() and
>> modify callers to use it.
>
> It would be very nice to introduce CLONE_KERNEL_THREAD instead, then

If we are going to do something to copy_process and the like let's take
this one step farther.  Let's pass in the value of the task to copy.

Then we can add a wrapper around copy_process to build kernel_thread
something like:

struct task_struct *__kernel_thread(int (*fn)(void *), void * arg,
				    unsigned long flags)
{
	struct task_struct *task;
	struct pt_regs regs, *reg;

	reg = kernel_thread_regs(, fn, arg);
	task = copy_process(_task, flags, 0, reg, 0, NULL, NULL, NULL, 0);
	if (!IS_ERR(task))
		wake_up_new_task(task, flags);

	return task;
}

long kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
{
	struct task_struct *task;
	task = __kernel_thread(fn, arg, flags);
	if (IS_ERR(task))
		return PTR_ERR(task);
	return task->pid;
}

After that daemonize just becomes:

void daemonize(const char *name, ...)
{
	va_list args;

	va_start(args, name);
	vsnprintf(current->comm, sizeof(current->comm), name, args);
	va_end(args);
}

And kthread_create becomes:

struct task_struct *kthread_create(int (*threadfn)(void *data),
				   void *data,
				   const char namefmt[],
				   ...)
{
	struct kthread_create_info create;
	struct task_struct *task;

	create.threadfn = threadfn;
	create.data = data;

	/* We want our own signal handler (we take no signals by default). */
	task = __kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
	if (!IS_ERR(task)) {
		va_list args;
		va_start(args, namefmt);
		vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
		va_end(args);
	}
	return task;
}

If we are willing to go that far, I think it is worth touching the
architecture-specific code, as that removes an unnecessary wait
queue and ensures that kernel threads always start with a consistent
state.  So it is a performance, scalability and simplicity boost.

Otherwise we should just put the code in the callers of kernel_thread
like we do today.

I don't have the energy to rework all of the architectures or even a
noticeable fraction of them right now.  I have too much on my plate.
A non-architecture-specific solution looks like a fairly simple
patch and I might get around to it one of these times.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

JFFS2 BUG(), report

2007-04-07 Thread Denys

Hi

While trying to mount a filesystem image placed via block2mtd on a USB flash
drive (usb-storage), I got an OOPS/BUG. Reproduced on the latest kernel,
2.6.21-rc6.

Do you need the complete JFFS2 image?
Please CC me on replies, as I am not subscribed to the list.

oops text:
[   41.355000] scsi 0:0:0:0: Direct-Access USB 2.0  Flash Disk   0.00 PQ: 0 ANSI: 2
[   41.357000] SCSI device sda: 512000 512-byte hdwr sectors (262 MB)
[   41.357000] sda: Write Protect is off
[   41.357000] sda: Mode Sense: 00 00 00 00
[   41.357000] sda: assuming drive cache: write through
[   41.361000] SCSI device sda: 512000 512-byte hdwr sectors (262 MB)
[   41.361000] sda: Write Protect is off
[   41.361000] sda: Mode Sense: 00 00 00 00
[   41.361000] sda: assuming drive cache: write through
[   41.361000]  sda: sda1 sda2
[   41.414000] sd 0:0:0:0: Attached scsi removable disk sda
[   41.414000] usb-storage: device scan complete
[   44.681000] eth0: link down
[   79.68] block2mtd: mtd0: [d: /dev/sda2] erase_size = 8KiB [8192]
[   79.681000] block2mtd: version $Revision: 1.30 $
[   79.763000] JFFS2 version 2.2. (NAND) (SUMMARY)  (C) 2001-2006 Red Hat, Inc.
[   80.209000] Totlen for ref at d56c09b4 (0x0064-0x0064000c) miscalculated as 0x0 instead of c
[   80.209000] next d56c09c0 (0x0064-0x0064000c)
[   80.209000] jeb->wasted_size 0, dirty_size 0, used_size c, free_size 1ff4
[   80.209000] BUG: at fs/jffs2/nodelist.c:1219 __jffs2_ref_totlen()
[   80.209000]  [] __jffs2_ref_totlen+0x1e3/0x22d [jffs2]
[   80.209000]  [] jffs2_link_node_ref+0x70/0x121 [jffs2]
[   80.209000]  [] block2mtd_read+0xe4/0x113 [block2mtd]
[   80.209000]  [] jffs2_scan_make_ino_cache+0x10/0x63 [jffs2]
[   80.209000]  [] jffs2_sum_scan_sumnode+0x1ab/0x574 [jffs2]
[   80.209000]  [] jffs2_scan_medium+0x302/0x12da [jffs2]
[   80.209000]  [] __alloc_pages+0x59/0x28d
[   80.209000]  [] __vmalloc+0xf/0x11
[   80.209000]  [] jffs2_sum_init+0x50/0x9d [jffs2]
[   80.209000]  [] jffs2_do_mount_fs+0x16f/0x470 [jffs2]
[   80.209000]  [] jffs2_do_fill_super+0xe3/0x1e6 [jffs2]
[   80.209000]  [] jffs2_sb_set+0x0/0x1d [jffs2]
[   80.209000]  [] jffs2_sb_compare+0x0/0x11 [jffs2]
[   80.209000]  [] jffs2_get_sb_mtd+0xfd/0x147 [jffs2]
[   80.209000]  [] jffs2_get_sb+0x1a3/0x1bd [jffs2]
[   80.209000]  [] error_code+0x74/0x7c
[   80.209000]  [] alloc_vfsmnt+0x8d/0xb4
[   80.209000]  [] vfs_kern_mount+0x40/0x6f
[   80.209000]  [] do_kern_mount+0x2d/0x3e
[   80.209000]  [] do_mount+0x56e/0x5e1
[   80.209000]  [] mntput_no_expire+0x11/0x47
[   80.209000]  [] link_path_walk+0xa5/0xaf
[   80.209000]  [] __handle_mm_fault+0x271/0x675
[   80.209000]  [] __handle_mm_fault+0x3ae/0x675
[   80.209000]  [] getname+0x59/0x8f
[   80.209000]  [] error_code+0x74/0x7c
[   80.209000]  [] __get_free_pages+0x1a/0x33
[   80.209000]  [] copy_mount_options+0x26/0x109
[   80.209000]  [] sys_mount+0x72/0xa9
[   80.209000]  [] sysenter_past_esp+0x5d/0x81
[   80.209000]  ===
[   80.209000] JFFS2 error: (8300) jffs2_link_node_ref: Adding new ref d56c09c0 at (0x0064-0x00640dd4) not immediately after previous (0x0064-0x0064000c)
[   80.209000] [ cut here ]
[   80.209000] kernel BUG at fs/jffs2/nodelist.c:1098!
[   80.209000] invalid opcode:  [#1]
[   80.209000] Modules linked in: jffs2 zlib_deflate block2mtd mtdpart 
mtdcore snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_seq_oss 
snd_seq_midi_event snd_seq snd_seq_device ohci_hcd thermal processor battery 
ac usb_storage libusual wlan_scan_sta ath_rate_sample ath_pci rtc_cmos 
rtc_core wlan rtc_lib ath_hal(P) 8139too i2c_i801 ehci_hcd uhci_hcd usbcore 
intelfb i2c_algo_bit cfbcopyarea snd_hda_intel snd_hda_codec snd_pcm 
snd_timer i2c_core snd soundcore snd_page_alloc cfbimgblt cfbfillrect
[   80.209000] CPU:0
[   80.209000] EIP:0060:[]Tainted: P   VLI
[   80.209000] EFLAGS: 00010296   (2.6.21-rc5 #6)
[   80.209000] EIP is at jffs2_link_node_ref+0xbe/0x121 [jffs2]
[   80.209000] eax: 00a5   ebx: d65f10f0   ecx: 0046   edx: 7324
[   80.209000] esi: d56c09c0   edi: d7129400   ebp: 0dd4   esp: d6949bdc
[   80.209000] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[   80.209000] Process mount (pid: 8300, ti=d6948000 task=d65f10f0 task.ti=d6948000)
[   80.209000] Stack: e0357fed 206c e035782f d56c09c0 0064 00640dd4 
0064 0064000c
[   80.209000]d605bf44 e002  00dc e0356b01 0dd4 
d58b62dc 0008
[   80.209000]0006 c14db0b0 d605bf24 e03af280 d7129400 2000 
00dc 
[   80.209000] Call Trace:
[   80.209000]  [] jffs2_sum_scan_sumnode+0x1ab/0x574 [jffs2]
[   80.209000]  [] jffs2_scan_medium+0x302/0x12da [jffs2]
[   80.209000]  [] __alloc_pages+0x59/0x28d
[   80.209000]  [] __vmalloc+0xf/0x11
[   80.209000]  [] jffs2_sum_init+0x50/0x9d [jffs2]
[   80.209000]  [] jffs2_do_mount_fs+0x16f/0x470 [jffs2]
[   80.209000]  [] jffs2_do_fill_super+0xe3/0x1e6 [jffs2]
[   80.209000]  [] jffs2_sb_set+0x0/0x1d [jffs2]
[

Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Christoph Lameter

On Sat, 7 Apr 2007, Andrew Morton wrote:

> Which is all a ton of fun, but this subversion of the architecture's
> freedom to use volatile, memory barriers etc is a worry.  We do the same in
> page_alloc.c, of course...  

I just tried the approach that we discussed earlier and it was not 
nice either. Let's just use a page flag, please. This check will be in 
several hot code paths. And it may become more important because the file 
system folks want to support buffers > page size. Then we may want more 
transparent support for huge pages... For all of this page->private gets 
in the way.

And I think we currently have 5 or so page flags available?


kernel oops with badly formatted module option

2007-04-07 Thread Larry Finger


With the following line in /etc/modprobe.conf.local:

options bcm43xx fwpostfix = ".fw3" locale=8

the kernel oops below is generated. I realize that the line should have no whitespace around the
"=", but I do not feel that an oops is the best way to report the syntax error. Could there be a
gentler failure?
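
A gentler failure could look roughly like the sketch below. It assumes
(this is a guess from the trace, not something verified against the
parser) that a bare "fwpostfix" with no "=value" attached reaches the
string setter with a NULL value. split_param() and set_copystring() are
illustrative user-space names, not the kernel's functions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Split "name=value" in place; *val stays NULL when there is no '='. */
static void split_param(char *arg, char **name, char **val)
{
	char *eq = strchr(arg, '=');

	*name = arg;
	*val = NULL;
	if (eq) {
		*eq = '\0';
		*val = eq + 1;
	}
}

/* Copy a string parameter into buf, rejecting a missing or oversized value. */
static int set_copystring(const char *val, char *buf, size_t maxlen)
{
	if (!val)
		return -EINVAL;		/* bare "name" with no "=value" */
	if (strlen(val) + 1 > maxlen)
		return -ENOSPC;		/* would not fit, including the NUL */
	strcpy(buf, val);
	return 0;
}
```

With a check like the first one, the malformed line would make the module
load fail with an error code instead of dereferencing a NULL pointer.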


This is an x86_64 system running 2.6.21-rc5 from Linville's wireless-2.6 tree. The "dirty" refers to
patches for the bcm43xx driver that are under test.


Larry

kernel: Unable to handle kernel NULL pointer dereference at  
RIP:
kernel:  [] param_set_copystring+0x15/0x4b
kernel: PGD 3ce82067 PUD 34b0c067 PMD 0
kernel: Oops:  [1] SMP
kernel: CPU 0
kernel: Modules linked in: af_packet snd_pcm_oss snd_mixer_oss snd_seq
snd_seq_device cpufreq_conservative cpufreq_ondemand cpufreq_userspace
cpufreq_powersave powernow_k8 freq_table button battery ac loop
snd_hda_intel snd_hda_codec snd_pcm snd_timer snd ohci1394 soundcore
snd_page_alloc ieee1394 ide_cd cdrom ehci_hcd ohci_hcd sdhci mmc_core
usbcore forcedeth i2c_nforce2 firmware_class ieee80211softmac ieee80211
ieee80211_crypt ext3 mbcache jbd sg edd fan sata_nv libata amd74xx
thermal processor sd_mod scsi_mod ide_disk ide_core
kernel: Pid: 4325, comm: modprobe Not tainted 2.6.21-rc5-L2.6-gae6ff7a1-dirty #5
kernel: RIP: 0010:[]  [] 
param_set_copystring+0x15/0x4b
kernel: RSP: 0018:81003dee3dd8  EFLAGS: 00010286
kernel: RAX:  RBX: 810053736c7a RCX: 
kernel: RDX: 0040 RSI: 882fee68 RDI: 
kernel: RBP: 81003dee3dd8 R08: 88307ea0 R09: 
kernel: R10: 0006 R11: 88307e88 R12: 0028
kernel: R13: 810053736c70 R14:  R15: 88308098
kernel: FS:  2ae7eb6246f0() GS:80505000() 
knlGS:f70686d0
kernel: CS:  0010 DS:  ES:  CR0: 8005003b
kernel: CR2:  CR3: 36ff8000 CR4: 06e0
kernel: Process modprobe (pid: 4325, threadinfo 81003dee2000, task 
810035d31140)
kernel: Stack:  81003dee3e38 802430ff 88308080 

kernel:  0007003b74d8 882fed78 0202 c23b77d8
kernel:  0028 88308080 88304c78 c23b7498
kernel: Call Trace:
kernel:  [] parse_args+0x139/0x216
kernel:  [] sys_init_module+0x1376/0x1744
kernel:  [] autoremove_wake_function+0x0/0x38
kernel:  [] device_remove_file+0x0/0x33
kernel:  [] trace_hardirqs_on_thunk+0x35/0x37
kernel:  [] system_call+0x7e/0x83
kernel:
kernel: Code: f2 ae 89 d0 48 f7 d1 48 39 c1 76 1a 48 8b 36 ff ca 48 c7 c7
kernel: RIP  [] param_set_copystring+0x15/0x4b
kernel:  RSP 


Re: If not readdir() then what?

2007-04-07 Thread Jan Engelhardt


On Apr 7 2007 16:36, Theodore Tso wrote:
>
>So how do we solve this problem?  I can think of two solutions:
>
>1) Deprecate telldir/seekdir() altogether.  Relatively few programs use
>this functionality, and it is highly questionable how useful it is,
>anyway.  If you use telldir/seekdir and keep the cookie for a long
>time, even the POSIX-provided guarantees about files that are created
>and deleted between the telldir() and seekdir() points in time makes
>its utility highly dubious.
>
>2) If application programs must have telldir/seekdir, then expand the
>size of the cookie from 32-bits to a minimum of 128 bits, and
>preferably larger --- say 512 bits, to accommodate systems that might
>be using 512-bit variant of SHA-2.  [...]

Maybe a combination of both? That is, replace telldir by an
independent emulation layer.


Jan

Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Nigel Cunningham

Hi.

On Sun, 2007-04-08 at 01:13 +0200, Rafael J. Wysocki wrote:
> On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
> > Hi.
> > 
> > On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> > > On Sat, 7 Apr 2007 23:20:39 +0200 "Rafael J. Wysocki" <[EMAIL PROTECTED]> 
> > > wrote:
> > > 
> > > > This should allow us to reduce the memory usage, practically always, and
> > > > improve performance.
> > > 
> > > And does it?
> 
> Yes.  There are theoretical corner cases in which it may be less efficient
> than the current approach, but in the usual situation it is _much_ better.
> 
> > It will. I've been using extents for ages, for the same reasons. I don't
> > put them in an rb_tree because I view it as less than most efficient,
> 
> Actually, I don't agree with that.  In the normal situation (ie. one extent is
> needed) there is no difference as far as the memory usage or performance
> are concerned, but if there are more extents, the rbtree should be more
> efficient.

I don't think it's worth having a big discussion over, but let me give
you the details, which you can then feel free to ignore :)

The rb_node struct adds an unsigned long and two struct rb_node *
pointers. My extents use one struct extent * pointer. The difference is
thus 12/24 bytes per extent (32/64 bits) vs 20/40. In the normal
situation, not worth worrying about, but I'm also using these for
recording the sectors we write to, and thinking about swap files and
multiple swap devices. Nearly double the memory use bites more as you
get more extents.
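
The per-extent arithmetic above can be checked with a small sketch. The
layouts are assumptions made for illustration: each extent is taken to
carry a start and an end value, the list variant adds one next pointer,
and rb_node_like copies the shape described above (one unsigned long plus
two pointers). Field names are made up, and unsigned long is assumed to be
pointer-sized, as it is on Linux.

```c
#include <stddef.h>

/* List variant: payload plus a single pointer of overhead. */
struct list_extent {
	unsigned long start, end;
	struct list_extent *next;
};

/* Same shape as described for rb_node: one word plus two pointers. */
struct rb_node_like {
	unsigned long parent_color;
	struct rb_node_like *right, *left;
};

/* Tree variant: payload plus three words of overhead. */
struct tree_extent {
	struct rb_node_like node;
	unsigned long start, end;
};

/* Per-extent sizes expressed in machine words. */
static size_t list_extent_words(void)
{
	return sizeof(struct list_extent) / sizeof(void *);
}

static size_t tree_extent_words(void)
{
	return sizeof(struct tree_extent) / sizeof(void *);
}
```

Three words against five gives 12 vs 20 bytes with 4-byte words and 24 vs
40 bytes with 8-byte words, which matches the figures quoted above.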

Insertion cost for rb_node includes keeping the tree balanced. For
extents, I start with the location of the last insertion to minimise the
cost, so insertion time is usually virtually zero (inc max of last
extent or append a new one). If for some reason swap was allocated out
of order, I might need to traverse the whole chain from the start.
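
A rough sketch of that insertion scheme, as I read the description above:
a sorted, singly linked chain that remembers where the previous insert
landed. This is not the code from extent.c; the names (extent_chain,
last_touched, chain_add) and the inclusive [start, end] ranges are
assumptions for illustration.

```c
#include <stdlib.h>

struct extent {
	unsigned long start, end;	/* inclusive range */
	struct extent *next;
};

struct extent_chain {
	struct extent *first;
	struct extent *last_touched;	/* hint: where the previous insert landed */
	int nr_extents;
};

/* Add one value to the chain; returns 0 on success, -1 if allocation fails. */
static int chain_add(struct extent_chain *c, unsigned long v)
{
	struct extent *prev = NULL, *next, *e;

	/* Fast path: resume from the hint unless v sorts before it. */
	if (c->last_touched && c->last_touched->start <= v)
		prev = c->last_touched;
	next = prev ? prev->next : c->first;

	/* Walk forward to the last extent that starts at or before v. */
	while (next && next->start <= v) {
		prev = next;
		next = next->next;
	}

	if (prev && v <= prev->end)
		return 0;			/* already covered */

	if (prev && v == prev->end + 1) {
		prev->end = v;			/* grow the previous extent */
		if (next && next->start == v + 1) {
			prev->end = next->end;	/* gap closed: merge neighbours */
			prev->next = next->next;
			free(next);
			c->nr_extents--;
		}
		c->last_touched = prev;
		return 0;
	}

	if (next && next->start == v + 1) {
		next->start = v;		/* grow the next extent downwards */
		c->last_touched = next;
		return 0;
	}

	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->start = e->end = v;
	e->next = next;
	if (prev)
		prev->next = e;
	else
		c->first = e;
	c->nr_extents++;
	c->last_touched = e;
	return 0;
}
```

When values arrive in ascending order the loop body never runs, so each
insert either bumps the end of the last extent or appends a new one. Only
an out-of-order value falls back to walking from the head of the chain.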

Normal usage in both cases is simply iterating through the list, so I
guess the cost would be approximately the same.

Deletion cost would include rebalancing for the rb_nodes.

Code cost is a gain for you - you're leveraging existing code, I'm
adding a bit more. extent.c is 300 lines including code for serialising
the chains in an image header and iterating through a group of chains
(multiple swap devices support).

rb_nodes seem to be the wrong solution to me because we generally don't
care about searching. We care about minimising memory usage and
maximising the speed of iteration, insertion and deletion. I believe
I've managed to do that with a singly linked, sorted list.

That said, we've agreed that we're normally talking about a small number
of extents, so it's probably not worth the bandwidth I've already
spent :)

Regards,

Nigel


Re: If not readdir() then what?

2007-04-07 Thread Christoph Hellwig

On Sat, Apr 07, 2007 at 04:36:33PM -0400, Theodore Tso wrote:
> this functionality, and it is highly questionable how useful it is,
> anyway.  If you use telldir/seekdir and keep the cookie for a long
> time, even the POSIX-provided guarantees about files that are created
> and deleted between the telldir() and seekdir() points in time makes
> its utility highly dubious.

It's not going to solve anything at all.  We can't stop supporting
functionality that has been there forever. 


netconsole module unload broken between 2.6.19 and 2.6.20 (and still broken as of 2.6.21-rc6)

2007-04-07 Thread Robin H. Johnson

(Please CC me on emails, I'm not on LKML).

Somewhere between 2.6.19 and 2.6.20, unloading of the netconsole module
got broken. It's still broken as of 2.6.21-rc6.

If you try to unload the module, the rmmod/modprobe -r just sits there
forever. I can reproduce it on tg3, forcedeth and e1000 hardware (all in
various Opteron machines)

Looking at the differences in netconsole itself between .19 and .20,
they are extremely small, so I'd guess that the problem probably lies in
netpoll itself.

Originally, I was trying to unload the module to reconfigure the log
destination - maybe a sysfs interface for (re-)configuration would be a
good addition as well?

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Council Member
E-Mail : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85



Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Rafael J. Wysocki

On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
> Hi.
> 
> On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> > On Sat, 7 Apr 2007 23:20:39 +0200 "Rafael J. Wysocki" <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > > This should allow us to reduce the memory usage, practically always, and
> > > improve performance.
> > 
> > And does it?

Yes.  There are theoretical corner cases in which it may be less efficient
than the current approach, but in the usual situation it is _much_ better.

> It will. I've been using extents for ages, for the same reasons. I don't
> put them in an rb_tree because I view it as less than most efficient,

Actually, I don't agree with that.  In the normal situation (ie. one extent is
needed) there is no difference as far as the memory usage or performance
are concerned, but if there are more extents, the rbtree should be more
efficient.

> but it will still be a huge step forward from bitmaps in the normal
> case.
> 
> The worst case would be if every second page of swap was in use, so that
> you needed one extent per swap page. In that case, it would use more
> memory than the bitmap, but far, far more common will be the case where
> only one extent is needed for the whole swap partition, because the
> algorithm used by the swap allocator minimises fragmentation.

Exactly.

Greetings,
Rafael

Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 15:16:17 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> On Fri, 6 Apr 2007, Andrew Morton wrote:
> 
> > Did you investigate
> > 
> > static inline int page_tail(struct page *page)
> > {
> > return ((page->flags & (PG_compound|PG_tail)) == (PG_compound|PG_tail));
> > }
> 
> The usual test_bit that we are using there uses a volatile reference 
> so these wont be combined if I check them separately.
> 
> A working example of the above would be much uglier:
> 
> static inline int page_tail(struct page *page)
> {
>   return ((page->flags & ((1L << PG_compound)|(1L << PG_tail))) == 
> ((1L << PG_compound)|(1L << PG_tail)));
> }
> 
> May be this can be cleaned up somehow.

It might generate better code to do

	unsigned long compound;

	compound = page->flags & (1UL << PG_compound);
	if (PG_compound > PG_tail)
		return compound & (page->flags << (PG_compound - PG_tail));
	else
		return compound & (page->flags >> (PG_tail - PG_compound));

ie: get the PG_compound flag into `compound', then bitwise-and that with
the PG_tail flag, after shifting it into PG_compound's slot.  The return
value will be zero if either bit is clear, (1 << PG_compound) if both are
set.  The `if (PG_compound > PG_tail)' will be swallowed by the compiler.

The compiler should turn it all into

(page->flags & N) & (page->flags << M)

Which may or may not be better than ((page->flags & N) == N), dunno.
Probably not - if the compiler's any good it won't save a branch, I
suspect.
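
For anyone who wants to play with the shift-and-mask idea outside the
kernel, here is a stand-alone sketch. The bit numbers are placeholders
(PG_tail in particular is only a proposal at this point), and the
direction of the shift is picked with #if rather than a plain if so that
the unused arm never contains a negative shift count.

```c
/* Placeholder bit numbers, chosen only for the illustration. */
#define PG_compound	14
#define PG_tail		3

/* Move the PG_tail bit into the PG_compound slot. */
#if PG_compound > PG_tail
#define TAIL_INTO_COMPOUND_SLOT(flags)	((flags) << (PG_compound - PG_tail))
#else
#define TAIL_INTO_COMPOUND_SLOT(flags)	((flags) >> (PG_tail - PG_compound))
#endif

/*
 * Returns (1UL << PG_compound) when both bits are set and zero otherwise,
 * i.e. the (flags & N) & (flags << M) form.
 */
static inline unsigned long page_tail_shift(unsigned long flags)
{
	unsigned long compound = flags & (1UL << PG_compound);

	return compound & TAIL_INTO_COMPOUND_SLOT(flags);
}
```

Comparing the generated assembly of this against the mask-and-compare form
on the architectures of interest would settle which one is cheaper.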

Which is all a ton of fun, but this subversion of the architecture's
freedom to use volatile, memory barriers etc is a worry.  We do the same in
page_alloc.c, of course...  


[RFC] partitions: CONFIG_BLK_DEV_MD and modular RAID support

2007-04-07 Thread John Anthony Kazos Jr.

(Linux v2.6.20.6.)

The function md_autodetect_dev is defined in drivers/md/md.c. Its 
declaration is on line 1443, outside of conditionals. However, both its 
use on line 1455 and its definition on line 5600 are inside "#ifndef 
MODULE" conditionals. So it seems obvious that the declaration should be 
inside conditionals as well.

However, this function is separately declared and used in 
fs/partitions/check.c but not inside the same conditional, which means if 
md.c is compiled as a module, check.c will be referencing an undefined 
symbol.

Should the conditionals around md_autodetect_dev be changed to make sure 
CONFIG_BLK_DEV_MD=y, or does the function need to be extracted from md.c 
so it can be used by check.c in any case?
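
The first option could be sketched as below. It relies on the Kbuild
convention that a tristate symbol set to "m" defines
CONFIG_BLK_DEV_MD_MODULE rather than CONFIG_BLK_DEV_MD, so the #ifdef is
only true for a built-in md. The unsigned int parameter stands in for
dev_t, and scan_partition() is a made-up caller, not the code in check.c.

```c
/*
 * Only a built-in md.c provides the real function.  For md=m or md=n the
 * caller gets an empty inline stub, so no undefined symbol is referenced.
 */
#ifdef CONFIG_BLK_DEV_MD
void md_autodetect_dev(unsigned int dev);
#else
static inline void md_autodetect_dev(unsigned int dev)
{
	(void)dev;	/* nothing to register the device with */
}
#endif

/* Hypothetical caller standing in for the partition scan. */
static int scan_partition(unsigned int dev, int is_raid_autodetect)
{
	if (is_raid_autodetect) {
		md_autodetect_dev(dev);
		return 1;
	}
	return 0;
}
```

The same guard around the declaration, the use and the definition would
keep all three consistent.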

Re: CPU offline but power consumption increased?

2007-04-07 Thread Andi Kleen

"Andika Triwidada" <[EMAIL PROTECTED]> writes:

[cc linux-acpi]

> Question: is that normal? I thought power consumption will be
> automatically reduced if one core offlined.

The current cpu offline essentially just runs a special idle loop.
The standard idle loop is even a bit more aggressive on some systems
because it knows about the deeper ACPI sleep modes.

There are also dependencies between cores because current CPUs
have shared power planes between cores.

I suppose in the future when a whole socket goes off line one could
implement special code to turn off the CPU further. But it likely
won't work on older hardware.

-Andi

Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Nigel Cunningham

Hi.

On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> On Sat, 7 Apr 2007 23:20:39 +0200 "Rafael J. Wysocki" <[EMAIL PROTECTED]> 
> wrote:
> 
> > This should allow us to reduce the memory usage, practically always, and
> > improve performance.
> 
> And does it?

It will. I've been using extents for ages, for the same reasons. I don't
put them in an rb_tree because I view it as less than most efficient,
but it will still be a huge step forward from bitmaps in the normal
case.

The worst case would be if every second page of swap was in use, so that
you needed one extent per swap page. In that case, it would use more
memory than the bitmap, but far, far more common will be the case where
only one extent is needed for the whole swap partition, because the
algorithm used by the swap allocator minimises fragmentation.

Regards,

Nigel


Re: [PATCH 1/4] x86_64: (SPARSE_VIRTUAL doubles sparsemem speed)

2007-04-07 Thread Andi Kleen

On Sunday 08 April 2007 00:06:13 Christoph Lameter wrote:

> Results:
> 
> x86_64 boot with virtual memmap
> 
> Format:   #events totaltime (min/avg/max)
> 
> kfree_virt_to_page   598430 5.6ms(3ns/9ns/322ns)
> 
> x86_64 boot regular sparsemem
> 
> kfree_virt_to_page   596360 10.5ms(4ns/18ns/28.7us)
> 
> 
> On average sparsemem virtual takes half the time than of sparsemem.

Nice.  But on what workloads? 

Anyways it looks promising. I hope we can just
replace old style sparsemem support with this for x86-64.

> Time is measured using the cycle counter (TSC on IA32, ITC on IA64) which has
> a very low latency.

Sorry that triggered my usual RDTSC rant...

Not on NetBurst (hundreds of cycles). And on the others (C2, K8) it is a bit
dangerous to measure short code blocks because RDTSC is not guaranteed to be
ordered with the surrounding instructions.

-Andi


Re: [PATCH 2/2] Optimize compound_head() by avoiding a shared page flag

2007-04-07 Thread Christoph Lameter

On Fri, 6 Apr 2007, Andrew Morton wrote:

> Did you investigate
> 
> static inline int page_tail(struct page *page)
> {
>   return ((page->flags & (PG_compound|PG_tail)) == (PG_compound|PG_tail));
> }

The usual test_bit that we are using there uses a volatile reference 
so these won't be combined if I check them separately.

A working example of the above would be much uglier:

static inline int page_tail(struct page *page)
{
return ((page->flags & ((1L << PG_compound)|(1L << PG_tail))) == 
((1L << PG_compound)|(1L << PG_tail)));
}

Maybe this can be cleaned up somehow.
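
One possible cleanup is to give the combined mask a name. A stand-alone
sketch follows, with placeholder bit numbers, a cut-down struct page and a
made-up PG_TAIL_MASK macro, none of which are taken from the patch.

```c
/* Placeholder bit numbers, chosen only for the illustration. */
enum { PG_compound = 14, PG_tail = 3 };

/* Both bits must be set for a tail page. */
#define PG_TAIL_MASK	((1UL << PG_compound) | (1UL << PG_tail))

/* Cut-down stand-in for struct page. */
struct page {
	unsigned long flags;
};

static inline int page_tail(const struct page *page)
{
	/* One plain load, one AND, one compare; no volatile bitops involved. */
	return (page->flags & PG_TAIL_MASK) == PG_TAIL_MASK;
}
```

Since this reads page->flags directly instead of going through test_bit(),
the compiler is free to fold the two bit tests into a single comparison.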

> 
> and
> 
> static inline int page_tail(struct page *page)
> {
>   return unlikely(PageCompound(page)) && unlikely(PageTail(page));
> }

Two volatile references in the bitops that the compiler cannot combine. 
Won't work unless we clean up the bitops first. This means we still have 
two branches in the code. Maybe I can make the first one work.

> In the latter case we _should_ have a not-taken branch to not-inline code. 
> If the compiler doesn't do that, we can make the PageTail() test an
> out-of-line function.  Or make the whole thing an uninlined function.

Still two branches which cannot be optimized in the same way as the single 
one on IA64 as shown by the asm that I included.

> More work needed, please.  I don't expect that a not-taken branch to
> not-inline code is worth a new page flag.  Especially as it does not
> actually reduce the number of branch decisions in the common case.

A new page flag does reduce the number of branches. On several platforms
it eliminates the branch completely since a single instruction can be
conditionally skipped.

> (I'm assuming in all of this that !PageCompound() is the very common case
> with slub.  If that is not true, we need to talk).

Yes, it is common for slabs to only have a single page.

The most promising avenue seems to be the simultaneous check for two bits.


Re: [PATCH, take4] FUTEX : new PRIVATE futexes

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 10:43:39 +0200 Eric Dumazet <[EMAIL PROTECTED]> wrote:

> Hi all
> 
> Updates on this take4 :
> 
> - All remarks from Nick were addressed I hope
> 
> - Current mm code has a problem with 64bit futexes, as spotted by Nick :
> 
> get_futex_key() does a check against sizeof(u32) regardless of futex being 
> 64bits or not.
> So it is possible a 64bit futex spans two pages of memory...
> I had to change get_futex_key() prototype to be able to do a correct test.
> 

Could we please have that in a separate patch?  It's logically a part of the
64-bit-futex work, is it not?
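
To make the page-spanning problem concrete, here is a small user-space
sketch of a size-aware check. It is not the code from the patch: the
function names are made up and a 4096-byte page is assumed.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the illustration */

/* Return 0 if a futex of `size` bytes at `addr` is usable, else -EINVAL. */
static int futex_addr_check(uintptr_t addr, unsigned int size)
{
	if (size != 4 && size != 8)
		return -EINVAL;
	/* Natural alignment guarantees the word cannot cross a page boundary. */
	if (addr % size != 0)
		return -EINVAL;
	return 0;
}

/* True when [addr, addr + size) touches two different pages. */
static int spans_two_pages(uintptr_t addr, unsigned int size)
{
	return addr / SKETCH_PAGE_SIZE != (addr + size - 1) / SKETCH_PAGE_SIZE;
}
```

An address such as 4092 passes a 4-byte alignment test, yet an 8-byte
futex placed there would straddle two pages, which is why the check needs
to know the futex size.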

> +
> +/**
> + * get_futex_key - Get parameters which are the keys for a futex.
> + * @uaddr: virtual address of the futex
> + * @size: size of futex (4 or 8)
> + * @shared: NULL for a PROCESS_PRIVATE futex,
> + *   &current->mm->mmap_sem for a PROCESS_SHARED futex
> + * @key: address where result is stored.
> + *
> + * Returns an error code or 0
> + */
> +int get_futex_key(void __user *uaddr, int size, struct rw_semaphore *shared,
> +   union futex_key *key);

Thanks for documenting the interface, but please do it in the .c file at
the function's definition site.


Re: [PATCH 1/4] x86_64: (SPARSE_VIRTUAL doubles sparsemem speed)

2007-04-07 Thread Christoph Lameter

On Thu, 5 Apr 2007, Christoph Lameter wrote:

> On Thu, 5 Apr 2007, Andy Whitcroft wrote:
> > Christoph if you could let us know which benchmarks you are seeing gains
> > with that would be a help.
> 
> You saw the numbers that Ken got with the pipe test right?
> 
> Then there are some minor improvements if you run AIM7.
> 
> I could get some real performance numbers for this by sticking in a 
> performance counter before and after virt_to_page and page_address. But I 
> am pretty sure about the result just looking at the code.

Ok. Since I keep being asked, I stuck a performance counter in kfree on 
x86_64 to see the difference in performance:

Results:

x86_64 boot with virtual memmap

Format:   #events totaltime (min/avg/max)

kfree_virt_to_page   598430 5.6ms(3ns/9ns/322ns)

x86_64 boot regular sparsemem

kfree_virt_to_page   596360 10.5ms(4ns/18ns/28.7us)

On average, sparsemem virtual takes half the time of regular sparsemem.

Note that the maximum time for regular sparsemem is way higher than
sparse virtual. This reflects the possibility that regular sparsemem may
once in a while have to deal with a cache miss whereas sparsemem virtual 
has no memory reference. Thus the numbers stay consistently low. 

Patch that was used to get these results (this is not very clean sorry 
but it should be enough to verify the results):

Simple Performance Counters

This patch allows the use of simple performance counters to measure time
intervals in the kernel source code. This allows a detailed analysis of the
time spent and the amount of data processed in specific code sections of the
kernel.

Time is measured using the cycle counter (TSC on IA32, ITC on IA64) which has
a very low latency.

To use, add #include <linux/perf.h> to the header of the file where the
measurement needs to take place.

Then add the following to the code:

To declare a time stamp do

struct pc pc;

To mark the beginning of the time measurement do

pc_start(&pc, <nr-of-counter>)

(If measurement from the beginning of a function is desired one may use
INIT_PC(xx) instead).

To mark the end of the time frame do:

pc_stop(&pc);

or if the amount of data transferred needs to be measured as well:

pc_throughput(&pc, number-of-bytes);

The measurements will show up in /proc/perf/all. Processor specific statistics
may be obtained via /proc/perf/<nr-of-processor>. Writing to /proc/perf/reset
will reset all counters. F.e.

echo >/proc/perf/reset

The first counter is the number of times that the time measurement was
performed. (+ xx) is the number of samples that were thrown away since
the processor on which the process is running changed. Cycle counters
may not be consistent across different processors.

Then follows the sum of the time spent in the code segment, followed in
parentheses by the minimum / average / maximum time spent there.
The second block is the sizes of data processed.
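
For readers who want to try the idea outside the kernel, here is a
user-space sketch of the start/stop bookkeeping described above. It is not
the patch itself: clock() stands in for get_cycles(), PC_DEMO is a made-up
counter, and the per-processor handling (dropping samples after a
migration) is left out.

```c
#include <stdint.h>
#include <time.h>

enum pc_item { PC_DEMO, NR_PC_ITEMS };	/* PC_DEMO is a made-up counter */

/* Information about the start of one measurement. */
struct pc {
	uint64_t time;
	enum pc_item item;
};

/* Accumulated results for one counter. */
struct pc_stats {
	uint64_t events, total, min, max;
};

static struct pc_stats pc_table[NR_PC_ITEMS];

/* Stand-in for get_cycles(): processor time in clock() ticks. */
static uint64_t now_ticks(void)
{
	return (uint64_t)clock();
}

static void pc_start(struct pc *pc, enum pc_item nr)
{
	pc->item = nr;
	pc->time = now_ticks();
}

static void pc_stop(struct pc *pc)
{
	uint64_t delta = now_ticks() - pc->time;
	struct pc_stats *s = &pc_table[pc->item];

	if (s->events == 0 || delta < s->min)
		s->min = delta;
	if (delta > s->max)
		s->max = delta;
	s->total += delta;
	s->events++;
}

/* Average interval for a counter, or 0 if nothing was measured yet. */
static uint64_t pc_avg(enum pc_item nr)
{
	const struct pc_stats *s = &pc_table[nr];

	return s->events ? s->total / s->events : 0;
}
```

The events, total and min/avg/max values kept here correspond to the
columns in the results shown earlier; clock() is far too coarse for
nanosecond-scale sections, which is the reason the patch uses the cycle
counter.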

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5-mm4/kernel/Makefile
===
--- linux-2.6.21-rc5-mm4.orig/kernel/Makefile   2007-04-07 14:04:32.0 -0700
+++ linux-2.6.21-rc5-mm4/kernel/Makefile2007-04-07 14:05:12.0 -0700
@@ -55,6 +55,7 @@
 obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o
 obj-$(CONFIG_UTRACE) += utrace.o
 obj-$(CONFIG_PTRACE) += ptrace.o
+obj-y += perf.o

 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <[EMAIL PROTECTED]>, the -fno-omit-frame-pointer is
Index: linux-2.6.21-rc5-mm4/include/linux/perf.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5-mm4/include/linux/perf.h   2007-04-07 14:05:12.0 -0700
@@ -0,0 +1,49 @@
+/*
+ * Performance Counters and Measurement macros
+ * (C) 2005 Silicon Graphics Incorporated
+ * by Christoph Lameter <[EMAIL PROTECTED]>, April 2005
+ *
+ * Counters are calculated using the cycle counter. If a process
+ * is migrated to another cpu during the measurement then the measurement
+ * is invalid.
+ *
+ * We cannot disable preemption during measurement since that may interfere
+ * with other things in the kernel and limit the usefulness of the counters.
+ */
+
+enum pc_item {
+   PC_KFREE_VIRT_TO_PAGE,
+   PC_PTE_ALLOC,
+   PC_PTE_FREE,
+   PC_PMD_ALLOC,
+   PC_PMD_FREE,
+   PC_PUD_ALLOC,
+   PC_PUD_FREE,
+   PC_PGD_ALLOC,
+   PC_PGD_FREE,
+   NR_PC_ITEMS
+};
+
+/*
+ * Information about the start of the measurement
+ */
+struct pc {
+   unsigned long time;
+   int processor;
+   enum pc_item item;
+};
+
+static inline void pc_start(struct pc *pc, enum pc_item nr)
+{
+   pc->item = nr;
+   pc->processor = smp_processor_id();
+   pc->time = get_cycles();
+}
+
+#define INIT_PC(__var, __item) struct pc __var = \
+   { get_cycles(), smp_processor_id(), __item }
+
+void pc_throughput(struct pc *pc, unsigned

Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 23:20:39 +0200 "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> This should allow us to reduce the memory usage, practically always, and
> improve performance.

And does it?

Re: [PATCH] timekeeping: drop irq-context clocksource polling

2007-04-07 Thread Daniel Walker

On Sat, 2007-04-07 at 22:50 +0200, Thomas Gleixner wrote:
> On Sat, 2007-04-07 at 10:43 -0700, Daniel Walker wrote:
> > Looks like this path ,
> > 
> > arch/i386/kernel/tsc.c: time_cpufreq_notifier(); <-- takes xtime_lock
> >  mark_tsc_unstable();
> >   clocksource_change_rating(&clocksource_tsc, 0);
> >    timekeeping_change_clocksource(); <-- takes xtime_lock
> > 
> > 
> > I'm not sure why the time_cpufreq_notifier is taking the xtime_lock tho .
> 
> Simply because it fiddles with variables which are relevant for
> timekeeping.

loops_per_jiffy perhaps?




Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd

2007-04-07 Thread Nigel Cunningham

Hi.

On Sat, 2007-04-07 at 18:14 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> Make handle_initrd() call try_to_freeze() in a suitable place instead of
> setting PF_NOFREEZE for the current task.
> 
> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> ---
>  init/do_mounts_initrd.c |5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.21-rc6/init/do_mounts_initrd.c
> ===
> --- linux-2.6.21-rc6.orig/init/do_mounts_initrd.c
> +++ linux-2.6.21-rc6/init/do_mounts_initrd.c
> @@ -55,11 +55,12 @@ static void __init handle_initrd(void)
>   sys_mount(".", "/", NULL, MS_MOVE, NULL);
>   sys_chroot(".");
>  
> - current->flags |= PF_NOFREEZE;
>   pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
>   if (pid > 0) {
> - while (pid != sys_wait4(-1, NULL, 0, NULL))
> + while (pid != sys_wait4(-1, NULL, 0, NULL)) {
> + try_to_freeze();
>   yield();
> + }
>   }
>  
>   /* move initrd to rootfs' /old */

ACK.

Regards,

Nigel


Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd

2007-04-07 Thread Nigel Cunningham

Hi again.

By the way, I'm stopping using [EMAIL PROTECTED]; could you
please change your address book to nigel at nigel dot suspend2 dot net?

Thanks!

Nigel


Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Michal Piotrowski


On 07/04/07, Michal Piotrowski <[EMAIL PROTECTED]> wrote:

On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sat, 7 Apr 2007 20:48:43 +0200 "Michal Piotrowski" <[EMAIL PROTECTED]> wrote:
>
> > On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > On Sat, 07 Apr 2007 20:09:43 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> > >
> > > > BTW. I guess that this need a similar fix.
> > > >
> > > > kernel BUG at kernel/ptrace.c:494!
> > > > invalid opcode:  [#2]
> > > > PREEMPT SMP
> > > > last sysfs file: devices/platform/w83627hf.656/temp2_input
> > > > Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs
> > > > lockd nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns
> > > > ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink
> > > > iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter
> > > > ip6_tables x_tables ipv6 binfmt_misc thermal processor fan container
> > > > nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss
> > > > snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> > > > intel_agp snd_pcm agpgart evdev snd_timer snd soundcore i2c_i801
> > > > snd_page_alloc ide_cd cdrom rtc unix
> > > > CPU:1
> > > > EIP:0060:[]Not tainted VLI
> > > > EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
> > > > EIP is at ptrace_exit+0x29/0x21d
> > > >
> > >
> > > no, I don't see what would cause that.  Was there no call trace?
> >
> > Here is a call trace (it was in the first email).
> >
> > Call Trace:
> >  [] do_exit+0x16b/0x86c
> >  [] die+0x206/0x22c
> >  [] do_trap+0x8a/0xa4
> >  [] do_invalid_op+0x88/0x92
> >  [] error_code+0x79/0x80
> >  [] ptrace_do_wait+0x1eb/0x510
> >  [] do_wait+0x9d6/0xbad
> >  [] sys_wait4+0x30/0x32
> >  [] sys_waitpid+0x27/0x29
> >  [] syscall_call+0x7/0xb
> >  [] 0xb7f36410
>
> Was that with the earlier ptrace fix applied?

No, it wasn't. I'll retest with patch applied.


It seems that everything is ok now. Thanks.

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

[RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Rafael J. Wysocki

Hi,

Some time ago we discussed the possibility of simplifying the swsusp's approach
towards tracking the swap pages allocated by it for saving the image (so that
they can be freed if there's an error).

I think we can get back to it now, as it is a nice optimization that should
allow us to use less memory (almost always) and improve performance a bit.

Greetings,
Rafael

---
From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Make swsusp use extents instead of a bitmap to trace swap pages allocated for
saving the image (the tracking is only needed in case there's an error, so that
the allocated swap pages can be released).

This should allow us to reduce the memory usage, practically always, and
improve performance.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 kernel/power/power.h  |   27 +-
 kernel/power/swap.c   |   18 +-
 kernel/power/swsusp.c |  135 ++
 kernel/power/user.c   |   22 +---
 4 files changed, 85 insertions(+), 117 deletions(-)

Index: linux-2.6.21-rc6/kernel/power/swsusp.c
===
--- linux-2.6.21-rc6.orig/kernel/power/swsusp.c
+++ linux-2.6.21-rc6/kernel/power/swsusp.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -74,72 +75,69 @@ static inline unsigned int count_highmem
 /**
  * The following functions are used for tracing the allocated
  * swap pages, so that they can be freed in case of an error.
- *
- * The functions operate on a linked bitmap structure defined
- * in power.h
  */
 
-void free_bitmap(struct bitmap_page *bitmap)
-{
-   struct bitmap_page *bp;
+struct swsusp_extent {
+   struct rb_node node;
+   unsigned long start;
+   unsigned long end;
+};
 
-   while (bitmap) {
-   bp = bitmap->next;
-   free_page((unsigned long)bitmap);
-   bitmap = bp;
-   }
-}
+static struct rb_root swsusp_extents = RB_ROOT;
 
-struct bitmap_page *alloc_bitmap(unsigned int nr_bits)
+static int swsusp_extents_insert(unsigned long swap_offset)
 {
-   struct bitmap_page *bitmap, *bp;
-   unsigned int n;
-
-   if (!nr_bits)
-   return NULL;
-
-   bitmap = (struct bitmap_page *)get_zeroed_page(GFP_KERNEL);
-   bp = bitmap;
-   for (n = BITMAP_PAGE_BITS; n < nr_bits; n += BITMAP_PAGE_BITS) {
-   bp->next = (struct bitmap_page *)get_zeroed_page(GFP_KERNEL);
-   bp = bp->next;
-   if (!bp) {
-   free_bitmap(bitmap);
-   return NULL;
+   struct rb_node **new = &(swsusp_extents.rb_node);
+   struct rb_node *parent = NULL;
+   struct swsusp_extent *ext;
+
+   /* Figure out where to put the new node */
+   while (*new) {
+   ext = container_of(*new, struct swsusp_extent, node);
+   parent = *new;
+   if (swap_offset < ext->start) {
+   /* Try to merge */
+   if (swap_offset == ext->start - 1) {
+   ext->start--;
+   return 0;
+   }
+   new = &((*new)->rb_left);
+   } else if (swap_offset > ext->end) {
+   /* Try to merge */
+   if (swap_offset == ext->end + 1) {
+   ext->end++;
+   return 0;
+   }
+   new = &((*new)->rb_right);
+   } else {
+   /* It already is in the tree */
+   return -EINVAL;
}
}
-   return bitmap;
-}
-
-static int bitmap_set(struct bitmap_page *bitmap, unsigned long bit)
-{
-   unsigned int n;
-
-   n = BITMAP_PAGE_BITS;
-   while (bitmap && n <= bit) {
-   n += BITMAP_PAGE_BITS;
-   bitmap = bitmap->next;
-   }
-   if (!bitmap)
-   return -EINVAL;
-   n -= BITMAP_PAGE_BITS;
-   bit -= n;
-   n = 0;
-   while (bit >= BITS_PER_CHUNK) {
-   bit -= BITS_PER_CHUNK;
-   n++;
-   }
-   bitmap->chunks[n] |= (1UL << bit);
+   /* Add the new node and rebalance the tree. */
+   ext = kzalloc(sizeof(struct swsusp_extent), GFP_KERNEL);
+   if (!ext)
+   return -ENOMEM;
+
+   ext->start = swap_offset;
+   ext->end = swap_offset;
+   rb_link_node(&ext->node, parent, new);
+   rb_insert_color(&ext->node, &swsusp_extents);
return 0;
 }
 
-sector_t alloc_swapdev_block(int swap, struct bitmap_page *bitmap)
+/**
+ * alloc_swapdev_block - allocate a swap page and register that it has
+ * been allocated, so that it can be freed in case of an error.
+ */
+
+sector_t alloc_swapdev_block(int swap)
 {
unsigned long offset;
 
offset = swp_offset(get_swap_page_of_type(swap));
if
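
The extent-merging insert in the hunk above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: it uses a plain unbalanced binary tree instead of the kernel rbtree, and the names `struct extent`, `extent_insert()` and `extent_count()` are made up for the example and do not come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Illustrative stand-in for struct swsusp_extent (no rb_node here). */
struct extent {
	struct extent *left, *right;
	unsigned long start;	/* first swap offset covered */
	unsigned long end;	/* last swap offset covered */
};

/*
 * Record one swap offset.  Mirrors the control flow of the patch:
 * walk down the tree, grow an existing extent if the offset is
 * directly adjacent to it, otherwise add a one-element extent.
 * Returns 0 on success, -EINVAL if the offset is already tracked,
 * -ENOMEM if allocation fails.
 */
static int extent_insert(struct extent **root, unsigned long offset)
{
	struct extent **link = root;
	struct extent *ext;

	assert(root != NULL);

	while (*link) {
		ext = *link;
		if (offset < ext->start) {
			if (offset == ext->start - 1) {	/* merge on the left */
				ext->start--;
				return 0;
			}
			link = &ext->left;
		} else if (offset > ext->end) {
			if (offset == ext->end + 1) {	/* merge on the right */
				ext->end++;
				return 0;
			}
			link = &ext->right;
		} else {
			return -EINVAL;		/* already in the tree */
		}
	}

	ext = calloc(1, sizeof(*ext));
	if (!ext)
		return -ENOMEM;
	ext->start = offset;
	ext->end = offset;
	*link = ext;	/* the kernel version would also rebalance here */
	return 0;
}

/* Count tree nodes, i.e. how many separate extents are stored. */
static unsigned int extent_count(const struct extent *root)
{
	if (!root)
		return 0;
	return 1 + extent_count(root->left) + extent_count(root->right);
}
```

With mostly contiguous swap offsets, many insertions collapse into a handful of nodes, which is where the expected memory saving over a bitmap would come from. Note that neither this sketch nor the hunk shown coalesces two existing extents when a single offset fills the gap between them.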

Re: Linux 2.6.21-rc6

2007-04-07 Thread Gene Heskett

On Thursday 05 April 2007, Linus Torvalds wrote:
>Ok,
> I don't think there really is anything very interesting here, but we're
>hopefully whittling down the list of regressions, and fixing various
>random other small issues while at it.
>
>Some smallish MIPS updates, networking (and network driver) fixes,
> removal of a long obsolete framebuffer driver, etc etc. The shortlog
> really tells the story.
>
>We should be getting close to a 2.6.21 release, so please update any
>regression reports you've done,
>
>   Linus

[...]
>
>
>Andrew Morton (4):
>  proc: fix linkage with CONFIG_SYSCTL=y, CONFIG_PROC_SYSCTL=n
>  revert "retries in ext3_prepare_write() violate ordering
> requirements" revert "retries in ext4_prepare_write() violate ordering
> requirements" remove protection of LANANA-reserved majors
>
FWIW, this last reversion didn't do it quite right: the device-mapper was 
at 253 prior to this patch's parent patch, and now it's at 252, which is 
still a 'dump it all' change for both tar & dump.  Until things settle, 
I'm going to test and probably use the instructions that Dave Dillow just 
sent me, which should put it at 238 regardless.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"Looks clean and obviously correct to me, but then _everything_ I write
 always looks obviously correct yo me."

- Linus

Re: [PATCH] timekeeping: drop irq-context clocksource polling

2007-04-07 Thread Thomas Gleixner

On Sat, 2007-04-07 at 10:43 -0700, Daniel Walker wrote:
> Looks like this path ,
> 
> arch/i386/kernel/tsc.c: time_cpufreq_notifier(); <-- takes xtime_lock
>  mark_tsc_unstable();
>   clocksource_change_rating(&clocksource_tsc, 0);
>    timekeeping_change_clocksource(); <-- takes xtime_lock
> 
> 
> I'm not sure why the time_cpufreq_notifier is taking the xtime_lock tho .

Simply because it fiddles with variables which are relevant for
timekeeping.

tglx







[PATCH] CRIS: Remove code related to pre-2.2 kernel.

2007-04-07 Thread Robert P. J. Day


Remove conditionals and code related to checking for a pre-2.2
kernel.

Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>

---

i'm fairly certain there's no value in checking for pre-2.2 kernels
anymore.  (not compile tested as i have no such system.)


diff --git a/arch/cris/arch-v32/kernel/fasttimer.c b/arch/cris/arch-v32/kernel/fasttimer.c
index 5daeb6f..79e1e4c 100644
--- a/arch/cris/arch-v32/kernel/fasttimer.c
+++ b/arch/cris/arch-v32/kernel/fasttimer.c
@@ -603,23 +603,8 @@ void schedule_usleep(unsigned long us)

 #ifdef CONFIG_PROC_FS
 static int proc_fasttimer_read(char *buf, char **start, off_t offset, int len
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
-   ,int *eof, void *data_unused
-#else
-,int unused
-#endif
-   );
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
+   ,int *eof, void *data_unused);
 static struct proc_dir_entry *fasttimer_proc_entry;
-#else
-static struct proc_dir_entry fasttimer_proc_entry =
-{
-  0, 9, "fasttimer",
-  S_IFREG | S_IRUGO, 1, 0, 0,
-  0, NULL /* ops -- default to array */,
-  &proc_fasttimer_read /* get_info */,
-};
-#endif
 #endif /* CONFIG_PROC_FS */

 #ifdef CONFIG_PROC_FS
@@ -628,12 +613,7 @@ static struct proc_dir_entry fasttimer_proc_entry =
 #define BIG_BUF_SIZE (500 + NUM_TIMER_STATS * 300)

 static int proc_fasttimer_read(char *buf, char **start, off_t offset, int len
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
-   ,int *eof, void *data_unused
-#else
-,int unused
-#endif
-   )
+   ,int *eof, void *data_unused)
 {
   unsigned long flags;
   int i = 0;
@@ -808,9 +788,7 @@ static int proc_fasttimer_read(char *buf, char **start, off_t offset, int len

   memcpy(buf, bigbuf + offset, len);
   *start = buf;
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
   *eof = 1;
-#endif

   return len;
 }
@@ -974,12 +952,8 @@ void fast_timer_init(void)
 printk("fast_timer_init()\n");

 #ifdef CONFIG_PROC_FS
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
if ((fasttimer_proc_entry = create_proc_entry( "fasttimer", 0, 0 )))
  fasttimer_proc_entry->read_proc = proc_fasttimer_read;
-#else
-proc_register_dynamic(&proc_root, &fasttimer_proc_entry);
-#endif
 #endif /* PROC_FS */
 if(request_irq(TIMER_INTR_VECT, timer_trig_interrupt, IRQF_DISABLED,
"fast timer int", NULL))
-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page


Re: If not readdir() then what?

2007-04-07 Thread Theodore Tso

On Sat, Apr 07, 2007 at 09:57:32AM -0700, Ulrich Drepper wrote:
> In their closed chambers (well, workshops,
> http://lwn.net/Articles/226351/), the filesystem developers complain
> about readdir.  I fully appreciate the difficulties.  But what I fail
> to see so far is any proposal for an alternative interface.
> 
> The phase to get new functionality included in the next revision of
> POSIX is over.  But that does not mean we should not try to get some
> sensible new implementation in place.  There is, for example, the
> "High End Computing Extensions Working Group" (the guys who showed up
> here with their statlite and readdirplus proposals).  This is an
> official working group at the OpenGroup which can produce a document
> which can be the basis of inclusion in the next revision and become a
> OpenGroup specification earlier than that.
> 
> So, if anybody has a proposal for better interfaces let's hear them.
> "Now" is a very good time to start working on this.

The problem isn't as much readdir() as it is telldir()/seekdir(),
which fundamentally assumes that the directory is a linear file.  JFS
for example was forced to implement an entire separate btree whose
only purpose was to record telldir cookies.  

With ext3 and htree we return the directory entries in hash tree sort order.
This sacrifices some performance with readdir/stat workloads unless
the userspace program does a readdir/qsort on inode number/stat
replacement.  In addition, there is the risk of hash collisions
because of the pathetically small size of the telldir cookie (32
bits).  If this happens, the readdir() guarantees of only returning
each directory entry at most once are compromised.  We use a keyed
hash with a per-filesystem superblock secret to prevent someone from
deliberately finding a hash collision and proving that we're not POSIX
compliant in the edge cases.  Cheesy, yes, but only loser programs
should be using telldir/seekdir anyway.  :-)
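
For illustration, the "readdir/qsort on inode number/stat" workaround mentioned above might look something like the sketch below. It is not taken from any existing program; `struct dir_item` and `read_dir_sorted()` are hypothetical names, and the fixed 256-byte name buffer is an assumption that fits Linux's `NAME_MAX` of 255.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* One directory entry as remembered by the caller (hypothetical type). */
struct dir_item {
	ino_t ino;
	char name[256];
};

/* qsort() comparator: ascending inode number. */
static int cmp_ino(const void *a, const void *b)
{
	const struct dir_item *x = a, *y = b;

	if (x->ino < y->ino)
		return -1;
	return x->ino > y->ino;
}

/*
 * Read every entry of 'path', then sort the result by inode number so
 * that a following stat() pass touches the inode table roughly in
 * on-disk order instead of hash order.
 * Returns the number of entries, or -1 on error.  The caller frees *out.
 */
static long read_dir_sorted(const char *path, struct dir_item **out)
{
	struct dir_item *items = NULL, *tmp;
	size_t n = 0, cap = 0;
	struct dirent *de;
	DIR *dir;

	assert(path != NULL && out != NULL);

	dir = opendir(path);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (n == cap) {
			cap = cap ? cap * 2 : 64;
			tmp = realloc(items, cap * sizeof(*items));
			if (!tmp) {
				free(items);
				closedir(dir);
				return -1;
			}
			items = tmp;
		}
		items[n].ino = de->d_ino;
		strncpy(items[n].name, de->d_name, sizeof(items[n].name) - 1);
		items[n].name[sizeof(items[n].name) - 1] = '\0';
		n++;
	}
	closedir(dir);

	if (n > 1)
		qsort(items, n, sizeof(*items), cmp_ino);
	*out = items;
	return (long)n;
}
```

A caller would then loop over the sorted array and stat() each name in turn; whether that actually helps depends on how the filesystem lays out its inodes.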

So how do we solve this problem?  I can think of two solutions:

1) Deprecate telldir/seekdir() altogether.  Relatively few programs use
this functionality, and it is highly questionable how useful it is,
anyway.  If you use telldir/seekdir and keep the cookie for a long
time, even the POSIX-provided guarantees about files that are created
and deleted between the telldir() and seekdir() points in time makes
its utility highly dubious.

2) If application programs must have telldir/seekdir, then expand the
size of the cookie from 32 bits to a minimum of 128 bits, and
preferably larger --- say 512 bits, to accommodate systems that might
be using the 512-bit variant of SHA-2.   So something like this?

/* TELLDIR_COOKIE_SIZE must be >= 64 */
#define TELLDIR_COOKIE_SIZE 64
typedef unsigned char ltelldir_cookie_t[TELLDIR_COOKIE_SIZE];

int ltelldir(int fd, ltelldir_cookie_t *cookie);
int lseekdir(int fd, ltelldir_cookie_t *cookie);

My personal preference would be #1, though.  :-)

- Ted

Re: Ten percent test

2007-04-07 Thread Gene Heskett

On Saturday 07 April 2007, Mike Galbraith wrote:
>On Sat, 2007-04-07 at 20:08 +0200, Ingo Molnar wrote:
>> * Gene Heskett <[EMAIL PROTECTED]> wrote:
>> > (who the hell runs a 'make -j 200' or 50 while(1)'s in the real
>> > world?
>>
>> not many - and i dont think Mike tested any of these - Mike tested
>> pretty low make -j values (Mike, can you confirm?).
>
>Yes.  I don't test anything more than make -j5 when looking at
>interactivity, and make -j nr_cpus+1 is my must have yardstick.
>
>   -Mike

Somebody made that remark, maybe not you, and maybe they were being funny, 
but I didn't at the time, see any smileys.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Please remain calm, it's no use both of us being hysterical at the same 
time.

Re: init's children list is long and slows reaping children.

2007-04-07 Thread Oleg Nesterov

On 04/06, Oleg Nesterov wrote:
> 
> Perhaps,
> 
> --- t/kernel/exit.c~  2007-04-06 23:31:31.0 +0400
> +++ t/kernel/exit.c   2007-04-06 23:31:57.0 +0400
> @@ -275,10 +275,7 @@ static void reparent_to_init(void)
>   remove_parent(current);
>   current->parent = child_reaper(current);
>   current->real_parent = child_reaper(current);
> - add_parent(current);
> -
> - /* Set the exit signal to SIGCHLD so we signal init on exit */
> - current->exit_signal = SIGCHLD;
> + current->exit_signal = -1;
>  
>   if (!has_rt_policy(current) && (task_nice(current) < 0))
>   set_user_nice(current, 0);
> 
> is enough. init is still our parent (make ps happy), but it can't see us,
> we are not on ->children list.

OK, this doesn't work. A multi-threaded init may do execve(). Can we forbid
exec() from non-main thread? init is special anyway. We can simplify de_thread()
in this case, and kill tasklist_lock around zap_other_threads().

So, we can re-parent a kernel thread to swapper. In that case it doesn't matter
if we put task on ->children list or not.

User-visible change. Acceptable?

> Of course, we also need to add preparent_to_init() to kthread() and
> (say) stopmachine(). Or we can create kernel_thread_detached() and
> modify callers to use it.

It would be very nice to introduce CLONE_KERNEL_THREAD instead, then

--- kernel/fork.c~  2007-04-07 20:11:14.0 +0400
+++ kernel/fork.c   2007-04-07 23:40:35.0 +0400
@@ -1159,7 +1159,8 @@ static struct task_struct *copy_process(
p->parent_exec_id = p->self_exec_id;
 
/* ok, now we should be set up.. */
-   p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL);
+   p->exit_signal = (clone_flags & (CLONE_THREAD|CLONE_KERNEL_THREAD))
+   ? -1 : (clone_flags & CSIGNAL);
p->pdeath_signal = 0;
p->exit_state = 0;
 
@@ -1196,6 +1197,8 @@ static struct task_struct *copy_process(
/* CLONE_PARENT re-uses the old parent */
if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
p->parent = current->parent;
+   else if (unlikely(clone_flags & CLONE_KERNEL_THREAD))
+   p->parent = &init_task;
else
p->parent = current;

That is all, very simple. However, in that case we should introduce

#define SYS_CLONE_MASK  (~CLONE_KERNEL_THREAD)

and change every sys_clone() implementation to filter out non-SYS_CLONE_MASK
flags. This is trivial (and imho useful), but

arch/sparc/kernel/process.c
arch/sparc64/kernel/process.c
arch/m68knommu/kernel/process.c
arch/m68k/kernel/process.c
arch/alpha/kernel/entry.S
arch/h8300/kernel/process.c
arch/v850/kernel/process.c
arch/frv/kernel/kernel_thread.S
arch/frv/kernel/kernel_thread.S
arch/powerpc/kernel/misc_32.S
arch/powerpc/kernel/misc_64.S

implement kthread_create() via sys_clone(). This means that without additional
effort from maintainers these arches can't take advantage of CLONE_KERNEL_THREAD.
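
To make the flag handling above concrete, here is a small standalone sketch. The value chosen for `CLONE_KERNEL_THREAD` is purely hypothetical (no such flag exists in the tree), and `sys_clone_filter()` and `exit_signal_for()` are illustrative helper names; only `CSIGNAL` and `CLONE_THREAD` carry their real values from <linux/sched.h>.

```c
#include <assert.h>

#define CSIGNAL			0x000000ffUL	/* signal mask sent at exit */
#define CLONE_THREAD		0x00010000UL	/* same thread group */
#define CLONE_KERNEL_THREAD	0x80000000UL	/* HYPOTHETICAL bit, for illustration */

/* Flags userspace may pass: everything except the kernel-only bit. */
#define SYS_CLONE_MASK		(~CLONE_KERNEL_THREAD)

/* What each sys_clone() implementation would do before do_fork(). */
static unsigned long sys_clone_filter(unsigned long clone_flags)
{
	return clone_flags & SYS_CLONE_MASK;
}

/*
 * Mirrors the proposed copy_process() change: threads and kernel
 * threads get exit_signal == -1 and so never notify a parent.
 */
static int exit_signal_for(unsigned long clone_flags)
{
	/* The kernel-only bit must not overlap the signal number field. */
	assert((CSIGNAL & CLONE_KERNEL_THREAD) == 0);

	if (clone_flags & (CLONE_THREAD | CLONE_KERNEL_THREAD))
		return -1;
	return (int)(clone_flags & CSIGNAL);
}
```

The point of the mask is that a userspace caller setting the bit by hand ends up with an ordinary child, while in-kernel callers that bypass sys_clone() keep the detached behaviour.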

Dear CC list, any advice?

Oleg.


Re: Ten percent test

2007-04-07 Thread Gene Heskett

On Saturday 07 April 2007, Ingo Molnar wrote:
>* Gene Heskett <[EMAIL PROTECTED]> wrote:
>> Yes it would be Ingo, but so far, none of the recent -rt patches has
>> booted on this machine, the last one I tried a few days ago failing to
>> find /dev/root, whatever the heck that is.
>
>did you have a chance to try the yum kernel by any chance? The -testing
>one you can try on Fedora with little hassle, by doing this as root:
>
>cat > /etc/yum.repos.d/rt-testing.repo
>[rt-testing]
>name=Ingo's Real-Time (-rt) test-kernel for FC6
>baseurl=http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/
>enabled=1
>gpgcheck=0
>
>
>and "yum install kernel-rt" and a reboot should get you going.

No, I couldn't seem to get that to show up in a yumex display, and I'm 
partial to smart anyway.

>> [...]  I don't enjoy sitting through all these e2fsk's during the
>> reboot just to have things I normally run in the background die, like
>> tvtime, sitting there with some news channel muttering along in the
>> background.  I was even ignored when I suggested it might be a dma
>> problem, which I still think it could be.
>
>i did spend quite some time to debug your tv-tuner problem back then,
>and for that purpose alone i bought a tv tuner card to test this myself.
>(but it worked on my testbox)
>
>   Ingo
You didn't tell me this.

That said, I am booted to the patch you sent me now, and this also is a 
very obvious improvement, one I could easily live with on a long term 
basis.  I haven't tried a kernel build in the background yet, but I have 
sat here and played patience for about an hour, looking for the little 
stutters, but never saw them.  So I could just as easily recommend this 
one for desktop use, it seems to be working.  tvtime hasn't had any audio 
or video glitches that I've noted when I was on that screen to check on 
an interesting story, like the 102 year old lady who finally got her hole 
in one, on a very short hole, but after 90 years of golfing, she was 
beginning to wonder if she would ever get one.  Not sure who bought at 
the 19th hole, HNN didn't cover that traditional part.

So this patch also works.  And if it gets into mainline, at least Con's 
efforts at prodding the fixes needed will not have been in vain.

My question, then, is why did it take a very public cat-fight to get this 
looked at and the code adjusted?  It's been what, nearly 2 years since 
Linus himself made a comment that this thing needed fixing.  The fixes 
done then were of very little actual effectiveness, and the situation has 
gradually deteriorated since.

It's on the desktop that Linux will win or lose the public's market share.  
After all, there are only so many 'servers' on the planet, a market where 
Linux has pretty well demonstrated its superiority, if not in terms of speed, 
at least in security.

To qualify that, I currently have 2 of Yahoo's machines in 
my .procmailrc's /dev/null list as they are a source of a large number of 
little 1 to 3 line spams.  I assume they are IIS machines, but the email 
headers aren't that explicit to my relatively untrained eyeballs.

And I'd like to see Korea put on a permanent rbl black hole.  I'm less 
than amused at watching the log coming out of my router as first one 
shithead and then the next makes a 100,000 word dictionary attack against 
it.  One has even found a way to cause a tcp reset about every 10 words 
tried.  But nobody has gotten any farther than that.  That knocking 
sound?  Guess.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
You are magnetic in your bearing.

[PATCH] iucv: fix compilation on s390-up

2007-04-07 Thread Alexey Dobriyan

  CC [M]  net/iucv/iucv.o
net/iucv/iucv.c: In function 'iucv_init':
net/iucv/iucv.c:1556: error: 'iucv_cpu_notifier' undeclared (first use in this function)

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
---

 net/iucv/iucv.c |2 --
 1 file changed, 2 deletions(-)

--- a/net/iucv/iucv.c
+++ b/net/iucv/iucv.c
@@ -519,7 +519,6 @@ static void iucv_disable(void)
kfree(iucv_path_table);
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
 static int __cpuinit iucv_cpu_notify(struct notifier_block *self,
 unsigned long action, void *hcpu)
 {
@@ -565,7 +564,6 @@ static int __cpuinit iucv_cpu_notify(struct notifier_block *self,
 static struct notifier_block iucv_cpu_notifier = {
.notifier_call = iucv_cpu_notify,
 };
-#endif
 
 /**
  * iucv_sever_pathid


Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Michal Piotrowski


On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Sat, 7 Apr 2007 20:48:43 +0200 "Michal Piotrowski" <[EMAIL PROTECTED]> wrote:

> On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Sat, 07 Apr 2007 20:09:43 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> >
> > > BTW. I guess that this need a similar fix.
> > >
> > > kernel BUG at kernel/ptrace.c:494!
> > > invalid opcode:  [#2]
> > > PREEMPT SMP
> > > last sysfs file: devices/platform/w83627hf.656/temp2_input
> > > Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs lockd
> > > nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns ipt_REJECT
> > > nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter
> > > ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6
> > > binfmt_misc thermal processor fan container nvram snd_intel8x0
> > > snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss intel_agp snd_pcm
> > > agpgart evdev snd_timer snd soundcore i2c_i801 snd_page_alloc ide_cd
> > > cdrom rtc unix
> > > CPU:1
> > > EIP:0060:[]Not tainted VLI
> > > EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
> > > EIP is at ptrace_exit+0x29/0x21d
> > >
> >
> > no, I don't see what would cause that.  Was there no call trace?
>
> Here is a call trace (it was in the first email).
>
> Call Trace:
>  [] do_exit+0x16b/0x86c
>  [] die+0x206/0x22c
>  [] do_trap+0x8a/0xa4
>  [] do_invalid_op+0x88/0x92
>  [] error_code+0x79/0x80
>  [] ptrace_do_wait+0x1eb/0x510
>  [] do_wait+0x9d6/0xbad
>  [] sys_wait4+0x30/0x32
>  [] sys_waitpid+0x27/0x29
>  [] syscall_call+0x7/0xb
>  [] 0xb7f36410

Was that with the earlier ptrace fix applied?


No, it wasn't. I'll retest with patch applied.



Because what could happen is that ptrace_do_wait() (or anything else)
goes BUG, then the trap handler ends up calling do_exit(), which calls
ptrace_exit() which will then go BUG again over non-zero preempt_count.

Asserting that preempt_count==0 on the do_exit() path is a bad idea, because
do_exit() is called on the oops path - we're virtually assured that we'll get
recursive crashes.

I think I'll just disable the whole NO_LOCKS thing.



Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 20:48:43 +0200 "Michal Piotrowski" <[EMAIL PROTECTED]> wrote:

> On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Sat, 07 Apr 2007 20:09:43 +0200 Michal Piotrowski <[EMAIL PROTECTED]> 
> > wrote:
> >
> > > BTW. I guess that this need a similar fix.
> > >
> > > kernel BUG at kernel/ptrace.c:494!
> > > invalid opcode:  [#2]
> > > PREEMPT SMP
> > > last sysfs file: devices/platform/w83627hf.656/temp2_input
> > > Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs lockd 
> > > nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns ipt_REJECT 
> > > nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter 
> > > ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 
> > > binfmt_misc thermal processor fan container nvram snd_intel8x0 
> > > snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event 
> > > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss intel_agp snd_pcm 
> > > agpgart evdev snd_timer snd soundcore i2c_i801 snd_page_alloc ide_cd 
> > > cdrom rtc unix
> > > CPU:1
> > > EIP:0060:[]Not tainted VLI
> > > EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
> > > EIP is at ptrace_exit+0x29/0x21d
> > >
> >
> > no, I don't see what would cause that.  Was there no call trace?
> 
> Here is a call trace (it was in the first email).
> 
> Call Trace:
>  [] do_exit+0x16b/0x86c
>  [] die+0x206/0x22c
>  [] do_trap+0x8a/0xa4
>  [] do_invalid_op+0x88/0x92
>  [] error_code+0x79/0x80
>  [] ptrace_do_wait+0x1eb/0x510
>  [] do_wait+0x9d6/0xbad
>  [] sys_wait4+0x30/0x32
>  [] sys_waitpid+0x27/0x29
>  [] syscall_call+0x7/0xb
>  [] 0xb7f36410

Was that with the earlier ptrace fix applied?

Because what could happen is that ptrace_do_wait() (or anything else)
goes BUG, then the trap handler ends up calling do_exit(), which calls
ptrace_exit() which will then go BUG again over non-zero preempt_count.

Asserting that preempt_count==0 on the do_exit() path is a bad idea, because
do_exit() is called on the oops path - we're virtually assured that we'll get
recursive crashes.

I think I'll just disable the whole NO_LOCKS thing.
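
To make the recursion concrete, here is a small userland model in C. It is
only a sketch: model_oops(), model_do_exit(), model_crash() and the depth cap
are made-up names for illustration, not the real kernel code paths. It shows
that a BUG-style check on the exit path re-enters the oops handler until
something else stops it, while a WARN-style check reports once and lets the
exit continue.

```c
#include <stdio.h>

/* Hypothetical userland model; none of these are the real kernel functions. */
static int preempt_count;   /* stand-in for the task's preempt counter */
static int oops_depth;      /* how many times the "oops path" was entered */

enum check_mode { CHECK_BUG, CHECK_WARN };

static void model_do_exit(enum check_mode mode);

/* Model of the trap handler: every oops ends up calling do_exit(). */
static void model_oops(enum check_mode mode)
{
    if (++oops_depth >= 8)      /* cap the recursion so the model terminates */
        return;
    model_do_exit(mode);
}

/* Model of the exit path with a preempt_count check on it. */
static void model_do_exit(enum check_mode mode)
{
    if (preempt_count != 0) {
        if (mode == CHECK_BUG) {
            model_oops(mode);   /* asserting here re-enters the oops path */
            return;
        }
        fprintf(stderr, "warning: exiting with preempt_count=%d\n",
                preempt_count);
    }
    /* ... normal exit work would go here ... */
}

/* Number of oopses caused by one initial BUG hit with preempt_count != 0. */
static int model_crash(enum check_mode mode)
{
    preempt_count = 1;          /* the first BUG fired in a non-preemptible region */
    oops_depth = 0;
    model_oops(mode);
    return oops_depth;
}
```

In this model the WARN-style check produces a single oops, and the BUG-style
check keeps recursing until the artificial cap is reached.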

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Andrew Morton

On Sat, 07 Apr 2007 21:43:17 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
> > The mm snapshot broken-out-2007-04-07-03-27.tar.gz has been uploaded to
> > 
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-04-07-03-27.tar.gz
> > 
> > It contains the following patches against 2.6.21-rc6:
> 
> suspend-to-disk doesn't work when pktgen module is loaded.
> 
> Stopping kernel threads timed out after 20 seconds (2 tasks refusing to 
> freeze):
>  kpktgend_0
>  kpktgend_1
> Restarting tasks ... done.
> swsusp: Basic memory bitmaps freed

This?

--- a/net/core/pktgen.c~pktgen-add-try_to_freeze
+++ a/net/core/pktgen.c
@@ -128,6 +128,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -3325,6 +3326,8 @@ static int pktgen_thread_worker(void *ar
t->control &= ~(T_REMDEV);
}
 
+   try_to_freeze();
+
set_current_state(TASK_INTERRUPTIBLE);
}
 
_

> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-04-07-03-27/mm-serialconsole2.log
> 
> Anyway, this kernel is pretty stable :)

Is appreciated, thanks.


Re: [patch 2.6.21-rc5-git] make /proc/acpi/wakeup more useful

2007-04-07 Thread David Brownell

On Friday 06 April 2007 10:01 pm, Greg KH wrote:

> Are you _sure_ you have a 1-to-1 relationship here?  No multiple devices
> pointing to the same acpi node?  Or the other way around?  If so, you
> are going to have to change the name to be something more unique.

I've wondered that too.  The short answer:  ACPI only supports 1-1
here.  It will emit warnings if it tries to bind more than one ACPI
device to a given "real" device ... but errors the other way are
silently ignored.

By adding a warning over this create-links patch, I found that the
system in the $SUBJECT patch (and likely every ACPI system) has
two different nodes that correspond to one ACPI node:

/sys/devices/pci:00 ... pci root node
/sys/devices/pnp0/00:00 ... id PNP0a03
/sys/devices/acpi_system:00/device:00/PNP0A03:00 ... ditto

Arguably that's too many sysfs nodes for one device...

Plus, there's the issue of flakey ACPI tables; in the $SUBJECT patch
both MDM and AUD nodes exist in the ACPI namespace, but they could
only refer to one PCI device (with MDM as the wakeup source, not AUD
as listed in the table).  Or maybe that's another case where the ACPI
code isn't handling the tables as sensibly as it might...

> Or how about "firmware" instead of "acpi" to be able to have the
> userspace tools work on any type of firmware that provides this, like
> openfirmware?

Assuming they all adopt that same "parallel tree" model, that seems
like a good idea.  The tools will likely need to understand how ACPI
and OF differ, but there's no point in reserving more names than we
really need.  Though it may be that "parallel trees" should go away.

A small glitch in the patch:  lines bigger than 80 characters.  :)

- Dave

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Michal Piotrowski

[EMAIL PROTECTED] wrote:
> The mm snapshot broken-out-2007-04-07-03-27.tar.gz has been uploaded to
> 
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-04-07-03-27.tar.gz
> 
> It contains the following patches against 2.6.21-rc6:

suspend-to-disk doesn't work when pktgen module is loaded.

Stopping kernel threads timed out after 20 seconds (2 tasks refusing to freeze):
 kpktgend_0
 kpktgend_1
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/broken-out-2007-04-07-03-27/mm-serialconsole2.log

Anyway, this kernel is pretty stable :)

aio_dio_bugs - ok
aiostress - ok
bash_shared_mapping - ok
bonnie - ok
cpu_hotplug - ok
cyclictest - ok
dbench - ok
fsfuzzer - ok
fs_mark - ok
fsx - ok
interbench - ok
iozone - ok
isic - ok
linus_stress - ok
ltp - ptrace bug
pi_tests - ok
rmaptest - ok
rtlinuxtests - ok
rttester - ok
scrashme - ok
spew - ok
stress - ok

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: REISER4: fix for reiser4_write_extent

2007-04-07 Thread Edward Shishkin


Laurent Riffard wrote:

> On 06.04.2007 00:42, Ignatich wrote:
>
> > While trying to find the cause of problems with reiser4 in recent
> > kernels I came across this.
> >
> > Incomplete write handling seems to be missing from
> > reiser4_write_extent() thanks to reiser4-temp-fix.patch. Strangely,
> > there is a patch by Edward Shishkin that should address that issue,
> > but it is missing from -mm tree. Please check.
> >
> >    Max
>
> This patch was added to the -mm tree on 14 Dec 2006 (see
> http://www.mail-archive.com/mm-commits@vger.kernel.org/msg05338.html).
>
> It was then dropped from the -mm tree on 05 Mar 2007 (see
> http://www.mail-archive.com/mm-commits@vger.kernel.org/msg10818.html),
> with this comment:
>
> "This patch was dropped because it is obsolete"
>
> No idea why it was obsolete. Does somebody know?

This uses the not yet settled interface filemap_copy_from_user_atomic/nonatomic.
However, those things should be fixed. I'll prepare the patch a bit later.

Thanks,
Edward.

Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Lennart Sorensen

On Thu, Apr 05, 2007 at 09:32:11PM -0700, [EMAIL PROTECTED] wrote:
> Don't you agree, that "If they are accurate, THEN they are obviously
> very relevant."

Nope, if they are accurate and they have something to do with your
particular usage and applications, then they are relevant.  But it
requires both to make them relevant.  Although it may be possible for a
benchmark to be relevant even if not particularly accurate.

> I have set up a Reiser4 partition with gzip compression, here is the
> difference in disk usage of a typical Debian installation on two 10GB
> partitions, one with Reiser3 and the other with Reiser4.
> 
> debian:/# df
> Filesystem   1K-blocks  Used Available Use% Mounted on
> /dev/sda3 10490104   6379164   4110940  61% /3
> /dev/sda7  9967960   2632488   7335472  27% /7
> 
> Partitions 3 and 7 have exactly the same data on them (the typical
> Debian install).
> 
> The partitions are exactly the same size (although df records different
> sizes).
> 
> Partition 3 is Reiser3 -- uses 6.4 GB.
> Partition 7 is Reiser4 -- uses 2.6 GB.
> 
> So Reiser4 uses 2.6 GB to store the (typical) data that it takes Reiser3
> 6.4 GB to store (note it would take ext2/3/4 some 7 GB to store the same
> info).
> 
> Don't you think this result is significant in itself?

Only if you think disk space is so valuable that trading cpu time to
compress and decompress the data is a good trade off.  It is not one I
would want to make.  So you saved 3GB, what is that?  About $1 worth?
maybe $2 if you have raid.  How much extra time and cpu will it take to
access the data that way?  How much extra electricity will the cpu use?
What is your time worth?  There are so many variables.  Do you _trust_
reiserfs4 to not lose your data any more or less than some other
filesystem?
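
For reference, the saving can be worked out directly from the "Used" column
of the quoted df output. A minimal sketch in C follows; the helper names are
only illustrative, and the two constants are copied from the df listing above.

```c
#include <stdio.h>

/* "Used" 1K-blocks from the df output quoted in this message. */
#define REISER3_USED_KB 6379164L
#define REISER4_USED_KB 2632488L

/* Space saved, in 1K-blocks. */
static long saved_kb(long before_kb, long after_kb)
{
    return before_kb - after_kb;
}

/* Convert 1K-blocks to GiB (1 GiB = 1024 * 1024 KiB). */
static double kb_to_gib(long kb)
{
    return (double)kb / (1024.0 * 1024.0);
}

/* Saving as a percentage of the original usage. */
static double saved_percent(long before_kb, long after_kb)
{
    return 100.0 * (double)(before_kb - after_kb) / (double)before_kb;
}
```

With the quoted figures this gives 3746676 1K-blocks, roughly 3.6 GiB, or
about 59% of the Reiser3 usage. Whether that is worth the extra cpu time is
the open question.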

--
Len Sorensen

Re: Ten percent test

2007-04-07 Thread Mike Galbraith

On Sat, 2007-04-07 at 20:08 +0200, Ingo Molnar wrote:
> * Gene Heskett <[EMAIL PROTECTED]> wrote:

> > (who the hell runs a 'make -j 200' or 50 while(1)'s in the real world?
> 
> not many - and i dont think Mike tested any of these - Mike tested 
> pretty low make -j values (Mike, can you confirm?).

Yes.  I don't test anything more than make -j5 when looking at
interactivity, and make -j nr_cpus+1 is my must have yardstick.

-Mike


[PATCH] arm: fix section mismatch warning in board-sam9260

2007-04-07 Thread Sam Ravnborg

Andrew Morton found a section mismatch warning in x86_64 triggered
by a wrongly placed __initdata marker.
git grep "struct __initdata" revealed that board-sam9260.c
had the same problem.

This patch fixes this by placing the __initdata marker correctly.
It was checked with objdump that the variable was moved to
.init.data by this change.
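
The placement rule can be tried outside the kernel. The following is a
userland sketch only, assuming GCC or Clang with GNU ld (or a compatible
linker) on an ELF system: __demo_initdata, the "demo_initdata" section name
and struct demo_eth_data are invented for this example and stand in for
__initdata and at91_eth_data. It relies on the linker providing
__start_<section> and __stop_<section> symbols for sections whose names are
valid C identifiers, so the placement can be checked at run time instead of
with objdump.

```c
#include <stdint.h>

/* Userland stand-in for the kernel's __initdata; the section name is made up. */
#define __demo_initdata __attribute__((section("demo_initdata")))

struct demo_eth_data {          /* hypothetical, loosely modelled on at91_eth_data */
    int phy_irq_pin;
    int is_rmii;
};

/*
 * Misplaced (shown as a comment only):
 *     static struct __demo_initdata demo_eth_data wrong = { ... };
 * Here the marker sits between "struct" and the tag, so it is parsed as
 * belonging to the type rather than to the variable, and the variable is
 * not moved into the section.
 *
 * Correct: the marker goes after the type, next to the variable name.
 */
static struct demo_eth_data __demo_initdata ek_demo_data = {
    .phy_irq_pin = 7,
    .is_rmii     = 1,
};

/* Section bounds provided by the linker for identifier-named sections. */
extern char __start_demo_initdata[];
extern char __stop_demo_initdata[];

/* Returns 1 if p lies inside the demo_initdata section, 0 otherwise. */
static int in_demo_init_section(const void *p)
{
    uintptr_t a = (uintptr_t)p;

    return a >= (uintptr_t)__start_demo_initdata &&
           a <  (uintptr_t)__stop_demo_initdata;
}

/* Check that the correctly marked variable really landed in the section. */
static int ek_demo_data_is_init(void)
{
    return in_demo_init_section(&ek_demo_data);
}
```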

Fixed an unrelated section mismatch warning while touching the file.

Both changes are only compile tested but obviously correct.
[Used at91sam9260ek_defconfig to get compile coverage]

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
diff --git a/arch/arm/mach-at91/board-sam9260ek.c b/arch/arm/mach-at91/board-sam9260ek.c
index 57fb449..7a31db0 100644
--- a/arch/arm/mach-at91/board-sam9260ek.c
+++ b/arch/arm/mach-at91/board-sam9260ek.c
@@ -118,7 +118,7 @@ static struct spi_board_info ek_spi_devices[] = {
 /*
  * MACB Ethernet device
  */
-static struct __initdata at91_eth_data ek_macb_data = {
+static struct at91_eth_data __initdata ek_macb_data = {
.phy_irq_pin= AT91_PIN_PA7,
.is_rmii= 1,
 };
@@ -140,7 +140,7 @@ static struct mtd_partition __initdata ek_nand_partition[] = {
},
 };
 
-static struct mtd_partition *nand_partitions(int size, int *num_partitions)
+static struct mtd_partition * __init nand_partitions(int size, int *num_partitions)
 {
*num_partitions = ARRAY_SIZE(ek_nand_partition);
return ek_nand_partition;

Re: [patch 2/4] clean up identify_cpu

2007-04-07 Thread Andrew Morton

On Sat, 7 Apr 2007 20:39:16 +0200 Sam Ravnborg <[EMAIL PROTECTED]> wrote:

> On Sat, Apr 07, 2007 at 10:59:47AM -0700, Andrew Morton wrote:
> > On Sat, 07 Apr 2007 10:20:17 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > >  I don't have a x86-64 compile environment on
> > > hand, so the 64 bits are completely untested
> > 
> > http://userweb.kernel.org/~akpm/cross-compilers/
> 
> Does the alpha toolchain work for you?

Seems not.

> For defconfig I get:
>   CC  arch/alpha/kernel/core_cia.o
> {standard input}: Assembler messages:
> {standard input}:351: Error: macro requires $at register while noat in effect
> {standard input}:376: Error: macro requires $at register while noat in effect
> {standard input}:400: Error: macro requires $at register while noat in effect
> {standard input}:419: Error: macro requires $at register while noat in effect
> {standard input}:474: Error: macro requires $at register while noat in effect
> {standard input}:499: Error: macro requires $at register while noat in effect
> {standard input}:523: Error: macro requires $at register while noat in effect
> {standard input}:542: Error: macro requires $at register while noat in effect
> make[2]: *** [arch/alpha/kernel/core_cia.o] Error 1
> make[1]: *** [arch/alpha/kernel] Error 2
> make: *** [_all] Error 2
> 
> Same happens when I compile the same version direct from Dan's crosstool.
> 

Me too.  I always do allmodconfig with alpha, and allmodconfig doesn't
include that file.


Re: sched.c: Remove unused variable 'relative'

2007-04-07 Thread Ingo Molnar


* Linux Kernel Mailing List  wrote:

> Committer:  Linus Torvalds <[EMAIL PROTECTED]>
> CommitDate: Sat Apr 7 10:18:33 2007 -0700
> 
> sched.c: Remove unused variable 'relative'
> 
> Getting rid of the p->children printout in show_task() left behind an
> unused variable.

grumble. The warning drowned in the usual flood of gcc warnings that a 
kernel compile produces. Btw., whenever i meet a gcc false positive that 
annoys me particularly, i just fix it up and flame the gcc folks in a 
comment. That patch has grown quite a bit and can be found below. (it's 
not complete by any means, and it should rather be done as an annotation 
combined with some unflattering CONFIG_HACK_AROUND_BROKEN_GCC_WARNINGS 
option that a sane compiler could unset and avoid the runtime cost of 
needless initializations.)
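
A rough sketch of what such an annotation could look like, in plain C. The
macro name quiet_uninit() and the way the switch is defined here are
assumptions made for this example, not an existing kernel interface. With the
switch on, the annotation expands to a dummy initializer; with it off, it
expands to nothing and the variable stays uninitialized at no runtime cost.

```c
/* Hypothetical switch, named after the option suggested above. */
#define HACK_AROUND_BROKEN_GCC_WARNINGS 1

#if HACK_AROUND_BROKEN_GCC_WARNINGS
# define quiet_uninit(value) = (value)  /* pay for a dummy initialization */
#else
# define quiet_uninit(value)            /* sane compiler: no initializer */
#endif

/*
 * Find the first even element of v[0..n-1]. "pos" is only read when
 * "found" is set, but a compiler may still warn that it might be used
 * uninitialized, which is the kind of false positive discussed here.
 */
static int find_first_even(const int *v, int n, int *pos_out)
{
    int pos quiet_uninit(0);
    int found = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            pos = i;
            found = 1;
            break;
        }
    }
    if (found)
        *pos_out = pos;
    return found;
}
```

The point of the annotation form is that every worked-around site is marked
in one greppable way and can be switched off in one place.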

Ingo

--->
 arch/i386/kernel/cpu/mcheck/p4.c |2 +-
 arch/i386/kernel/efi.c   |2 +-
 fs/block_dev.c   |2 +-
 fs/isofs/namei.c |2 +-
 fs/jffs2/erase.c |2 +-
 fs/nfsd/nfsctl.c |2 +-
 ipc/msg.c|2 +-
 ipc/sem.c|2 +-
 kernel/audit.c   |2 +-
 kernel/auditfilter.c |2 +-
 net/core/flow.c  |2 +-
 net/sunrpc/svc.c |2 +-
 sound/core/control_compat.c  |2 +-
 sound/pci/pcxhr/pcxhr.c  |2 +-
 14 files changed, 14 insertions(+), 14 deletions(-)

Index: linux/arch/i386/kernel/cpu/mcheck/p4.c
===
--- linux.orig/arch/i386/kernel/cpu/mcheck/p4.c
+++ linux/arch/i386/kernel/cpu/mcheck/p4.c
@@ -155,7 +155,7 @@ static fastcall void intel_machine_check
u32 alow, ahigh, high, low;
u32 mcgstl, mcgsth;
int i;
-   struct intel_mce_extended_msrs dbg;
+   struct intel_mce_extended_msrs dbg = { 0, } /* shut up gcc! */;
 
rdmsr (MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
if (mcgstl & (1<<0))/* Recoverable ? */
Index: linux/arch/i386/kernel/efi.c
===
--- linux.orig/arch/i386/kernel/efi.c
+++ linux/arch/i386/kernel/efi.c
@@ -278,7 +278,7 @@ void efi_memmap_walk(efi_freemem_callbac
struct range {
unsigned long start;
unsigned long end;
-   } prev, curr;
+   } prev = { } /* shut up gcc */ , curr = { } /* shut up gcc */ ;
efi_memory_desc_t *md;
unsigned long start, end;
void *p;
Index: linux/fs/block_dev.c
===
--- linux.orig/fs/block_dev.c
+++ linux/fs/block_dev.c
@@ -950,7 +950,7 @@ static int bd_claim_by_kobject(struct bl
struct kobject *kobj)
 {
int res;
-   struct bd_holder *bo, *found;
+   struct bd_holder *bo, *found = NULL /* shut up GCC */;
 
if (!kobj)
return -EINVAL;
Index: linux/fs/isofs/namei.c
===
--- linux.orig/fs/isofs/namei.c
+++ linux/fs/isofs/namei.c
@@ -158,7 +158,7 @@ isofs_find_entry(struct inode *dir, stru
 struct dentry *isofs_lookup(struct inode * dir, struct dentry * dentry, struct nameidata *nd)
 {
int found;
-   unsigned long block, offset;
+   unsigned long block = 0, offset = 0 /* avoid stupid gcc warning */;
struct inode *inode;
struct page *page;
 
Index: linux/fs/jffs2/erase.c
===
--- linux.orig/fs/jffs2/erase.c
+++ linux/fs/jffs2/erase.c
@@ -364,7 +364,7 @@ static void jffs2_mark_erased_block(stru
 {
size_t retlen;
int ret;
-   uint32_t bad_offset;
+   uint32_t bad_offset = 0 /* shut up gcc */;
 
switch (jffs2_block_check_erase(c, jeb, &bad_offset)) {
case -EAGAIN:   goto refile;
Index: linux/fs/nfsd/nfsctl.c
===
--- linux.orig/fs/nfsd/nfsctl.c
+++ linux/fs/nfsd/nfsctl.c
@@ -299,7 +299,7 @@ static ssize_t write_filehandle(struct f
 * qword quoting is used, so filehandle will be \x
 */
char *dname, *path;
-   int maxsize;
+   int maxsize = 0;
char *mesg = buf;
int len;
struct auth_domain *dom;
Index: linux/ipc/msg.c
===
--- linux.orig/ipc/msg.c
+++ linux/ipc/msg.c
@@ -387,7 +387,7 @@ copy_msqid_from_user(struct msq_setbuf *
 asmlinkage long sys_msgctl(int msqid, int cmd, struct msqid_ds __user *buf)
 {
struct kern_ipc_perm *ipcp;
-   struct msq_setbuf setbuf;
+   struct msq_setbuf setbuf = { /* shut up gcc warning */ };
struct msg_queue *msq;
int err, version;
struct ipc_namespace *ns;
Index: linux/ipc/sem.c

Re: [PATCH] ip_tables.h

2007-04-07 Thread Patrick Ale


On 4/7/07, Patrick Ale <[EMAIL PROTECTED]> wrote:

And my "patch" is made obsolete :P
Jan Engelhardt from the netfilter list made a patch set within the
patch-o-matic tree.


Cheers Jan :D


Patrick

Re: Linux 2.6.21-rc6

2007-04-07 Thread Linus Torvalds

On Sat, 7 Apr 2007, Linus Torvalds wrote:

> 
> 
> On Sat, 7 Apr 2007, Randy Dunlap wrote:
> > 
> > Is it too late to get a v2.6.21-rc6 tag ?
> 
> It's definitely there, I can see it in gitweb..
> 
> Do you have some really ancient git that didn't fetch the tags 
> automatically?

Oh, my bad. I'd tagged it, but I didn't *sign* the tag, so it was just a 
tag-reference (and git fetch won't fetch them by default).

I replaced the v2.6.21-rc6 tag with a signed one. Do 

git fetch --tags

to get the thing.

Linus

Re: Ten percent test

2007-04-07 Thread Ingo Molnar


* Gene Heskett <[EMAIL PROTECTED]> wrote:

> Yes it would be Ingo, but so far, none of the recent -rt patches has 
> booted on this machine, the last one I tried a few days ago failing to 
> find /dev/root, whatever the heck that is.

did you have a chance to try the yum kernel by any chance? The -testing 
one you can try on Fedora with little hassle, by doing this as root:

cat > /etc/yum.repos.d/rt-testing.repo
[rt-testing]
name=Ingo's Real-Time (-rt) test-kernel for FC6
baseurl=http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/
enabled=1
gpgcheck=0


and "yum install kernel-rt" and a reboot should get you going.

> [...]  I don't enjoy sitting through all these e2fsck's during the 
> reboot just to have things I normally run in the background die, like 
> tvtime, sitting there with some news channel muttering along in the 
> background.  I was even ignored when I suggested it might be a dma 
> problem, which I still think it could be.

i did spend quite some time to debug your tv-tuner problem back then, 
and for that purpose alone i bought a tv tuner card to test this myself. 
(but it worked on my testbox)

Ingo

Re: Linux 2.6.21-rc6

2007-04-07 Thread Randy Dunlap

On Sat, 7 Apr 2007 11:46:13 -0700 (PDT) Linus Torvalds wrote:

> 
> 
> On Sat, 7 Apr 2007, Randy Dunlap wrote:
> > 
> > Is it too late to get a v2.6.21-rc6 tag ?
> 
> It's definitely there, I can see it in gitweb..
> 
> Do you have some really ancient git that didn't fetch the tags 
> automatically?

Could be.  I'll check that.

Thanks.
---
~Randy

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Michal Piotrowski


On 07/04/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sat, 07 Apr 2007 20:09:43 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > BTW. I guess that this need a similar fix.
> >
> > kernel BUG at kernel/ptrace.c:494!
> > invalid opcode:  [#2]
> > PREEMPT SMP
> > last sysfs file: devices/platform/w83627hf.656/temp2_input
> > Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs lockd
> > nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns ipt_REJECT
> > nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables
> > ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal
> > processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy
> > snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> > intel_agp snd_pcm agpgart evdev snd_timer snd soundcore i2c_i801 snd_page_alloc
> > ide_cd cdrom rtc unix
> > CPU:1
> > EIP:0060:[]Not tainted VLI
> > EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
> > EIP is at ptrace_exit+0x29/0x21d
> >
>
> no, I don't see what would cause that.  Was there no call trace?

Here is a call trace (it was in the first email).

Call Trace:
[] do_exit+0x16b/0x86c
[] die+0x206/0x22c
[] do_trap+0x8a/0xa4
[] do_invalid_op+0x88/0x92
[] error_code+0x79/0x80
[] ptrace_do_wait+0x1eb/0x510
[] do_wait+0x9d6/0xbad
[] sys_wait4+0x30/0x32
[] sys_waitpid+0x27/0x29
[] syscall_call+0x7/0xb
[] 0xb7f36410

> It's always possible that some random part of the kernel has
> gone and leaked a preempt_count.
>
> BUG_ON is an obnoxious thing - please prefer to use WARN_ON in non-fatal
> situations.   Particularly when the assertions aren't tested ;)


Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: Linux 2.6.21-rc6

2007-04-07 Thread Linus Torvalds



On Sat, 7 Apr 2007, Randy Dunlap wrote:
> 
> Is it too late to get a v2.6.21-rc6 tag ?

It's definitely there, I can see it in gitweb..

Do you have some really ancient git that didn't fetch the tags 
automatically?

Linus

Re: Two questions regarding Opening files within Kernel!

2007-04-07 Thread Jan Engelhardt


On Apr 7 2007 16:57, JanuGerman wrote:
>
>Thanks Jan for the response.
>
>>struct dentry *fbar = lookup_one_len("/foo/bar", current->fs->root);
>
>But that gives me a dentry, where as file object is still not reachable.

So use filp_open.

>Question: I am currently using a function called fs.h/dentry_open which takes
>a "dentry", "vfsmount" object and flag (usually RW i.e. 2), and gives me the
>file object. with your suggested method, vfsmount is still not available. In
>this regard, any idea about a function, which gives directly the file object
>instead of dentry will be highly appreciated.
>
>
>OR,  (Kindly see the code below), i need some thing for "missing vfsmount".
>
>
>struct dentry *fbar = lookup_one_len("/foo/bar", current->fs->root);
>struct file *file1 = dentry_open(fbar, "missing vfsmount here",2)
>
>
>
>Thanks,
>JG

Jan
-- 

Re: [patch 2/4] clean up identify_cpu

2007-04-07 Thread Sam Ravnborg

On Sat, Apr 07, 2007 at 10:59:47AM -0700, Andrew Morton wrote:
> On Sat, 07 Apr 2007 10:20:17 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> 
> wrote:
> 
> >  I don't have a x86-64 compile environment on
> > hand, so the 64 bits are completely untested
> 
> http://userweb.kernel.org/~akpm/cross-compilers/

Does the alpha toolchain work for you?

For defconfig I get:
  CC  arch/alpha/kernel/core_cia.o
{standard input}: Assembler messages:
{standard input}:351: Error: macro requires $at register while noat in effect
{standard input}:376: Error: macro requires $at register while noat in effect
{standard input}:400: Error: macro requires $at register while noat in effect
{standard input}:419: Error: macro requires $at register while noat in effect
{standard input}:474: Error: macro requires $at register while noat in effect
{standard input}:499: Error: macro requires $at register while noat in effect
{standard input}:523: Error: macro requires $at register while noat in effect
{standard input}:542: Error: macro requires $at register while noat in effect
make[2]: *** [arch/alpha/kernel/core_cia.o] Error 1
make[1]: *** [arch/alpha/kernel] Error 2
make: *** [_all] Error 2

Same happens when I compile the same version direct from Dan's crosstool.

Sam

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Andrew Morton

On Sat, 07 Apr 2007 20:09:43 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> BTW. I guess that this need a similar fix.
> 
> kernel BUG at kernel/ptrace.c:494!
> invalid opcode:  [#2]
> PREEMPT SMP 
> last sysfs file: devices/platform/w83627hf.656/temp2_input
> Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs lockd 
> nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns ipt_REJECT 
> nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables 
> ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc 
> thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus 
> snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
> snd_pcm_oss snd_mixer_oss intel_agp snd_pcm agpgart evdev snd_timer snd 
> soundcore i2c_i801 snd_page_alloc ide_cd cdrom rtc unix
> CPU:1
> EIP:0060:[]Not tainted VLI
> EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
> EIP is at ptrace_exit+0x29/0x21d
> 

no, I don't see what would cause that.  Was there no call trace?

It's always possible that some random part of the kernel has
gone and leaked a preempt_count.

BUG_ON is an obnoxious thing - please prefer to use WARN_ON in non-fatal
situations.   Particularly when the assertions aren't tested ;)

Re: Linux 2.6.21-rc6

2007-04-07 Thread Randy Dunlap

On Thu, 5 Apr 2007 19:50:11 -0700 (PDT) Linus Torvalds wrote:

> 
> Ok,
>  I don't think there really is anything very interesting here, but we're 
> hopefully whittling down the list of regressions, and fixing various 
> random other small issues while at it.
> 
> Some smallish MIPS updates, networking (and network driver) fixes, removal 
> of a long obsolete framebuffer driver, etc etc. The shortlog really tells 
> the story.
> 
> We should be getting close to a 2.6.21 release, so please update any 
> regression reports you've done,

Is it too late to get a v2.6.21-rc6 tag ?

---
~Randy

Re: [PATCH] kernel-doc: handle arrays with arithmetic expressions as initializers

2007-04-07 Thread Randy Dunlap

On Sat, 7 Apr 2007 17:04:44 +0200 Borislav Petkov wrote:

> 
> In a different approach here's a patch that handles the special case of
> composite arithmetic expressions in array size initializers. With it,
> prior to pushing the split strings on the @first_arg array, I split the
> keywords before the array name as before and then keep the array name
> along with the subscript expression as a single whole element which gets
> pushed last. In this manner, kernel-doc produces correct output without
> removing whitespaces which makes the array subscripts unreadable in the docs.

Nice job.

Andrew, please drop kernel-doc-handle-spaces-in-array-size.patch
and I'll (re)send this one to you.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

[PATCH v3] kernel-doc: handle arrays with arithmetic expressions as initializers

2007-04-07 Thread Randy Dunlap

From: Borislav Petkov <[EMAIL PROTECTED]>

In a different approach here's a patch that handles the special case of
composite arithmetic expressions in array size initializers. With it,
prior to pushing the split strings on the @first_arg array, I split the
keywords before the array name as before and then keep the array name
along with the subscript expression as a single whole element which gets
pushed last. In this manner, kernel-doc produces correct output without
removing whitespaces which makes the array subscripts unreadable in the docs.
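
The effect of the split can be illustrated with a simplified model in C. This
is not the Perl regexp from the patch and does not reproduce all of its
matching behaviour: the hypothetical split_array_decl() below just cuts at
the last whitespace before the first '[', which gives the same result for a
simple declaration such as "unsigned long bitmap[BITS / 8 + 1]".

```c
#include <ctype.h>
#include <string.h>

/*
 * Simplified model: split "type name[expr]" into the leading keywords and
 * the array name with its subscript kept as one piece, spaces included.
 * Returns 1 on success, 0 if decl has no array subscript, has nothing in
 * front of the name, or does not fit into the output buffers.
 */
static int split_array_decl(const char *decl,
                            char *type, size_t type_len,
                            char *name, size_t name_len)
{
    const char *open = strchr(decl, '[');
    const char *start;
    size_t tlen;

    if (!open || !strchr(open, ']'))
        return 0;

    /* Walk back from '[' to the first character of the array name. */
    for (start = open;
         start > decl && !isspace((unsigned char)start[-1]);
         start--)
        ;
    if (start == decl)
        return 0;               /* no keywords in front of the name */

    /* Drop the whitespace between the keywords and the name. */
    tlen = (size_t)(start - decl);
    while (tlen > 0 && isspace((unsigned char)decl[tlen - 1]))
        tlen--;
    if (tlen == 0 || tlen + 1 > type_len || strlen(start) + 1 > name_len)
        return 0;

    memcpy(type, decl, tlen);
    type[tlen] = '\0';
    strcpy(name, start);
    return 1;
}
```

For "unsigned long bitmap[BITS / 8 + 1]" this yields the type
"unsigned long" and the name "bitmap[BITS / 8 + 1]", with the spaces inside
the subscript preserved.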

Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 scripts/kernel-doc |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- linux-2.6.21-rc6.orig/scripts/kernel-doc
+++ linux-2.6.21-rc6/scripts/kernel-doc
@@ -1456,7 +1456,16 @@ sub create_parameterlist($$$) {
if ($args[0] =~ m/\*/) {
$args[0] =~ s/(\*+)\s*/ $1/;
}
-   my @first_arg = split('\s+', shift @args);
+
+   my @first_arg;
+   if ($args[0] =~ /^(.*\s+)(.*?\[.*\].*)$/) {
+   shift @args;
+   push(@first_arg, split('\s+', $1));
+   push(@first_arg, $2);
+   } else {
+   @first_arg = split('\s+', shift @args);
+   }
+
unshift(@args, pop @first_arg);
$type = join " ", @first_arg;
 

Re: Ten percent test

2007-04-07 Thread Gene Heskett

On Saturday 07 April 2007, Ingo Molnar wrote:
>* Gene Heskett <[EMAIL PROTECTED]> wrote:
>> To be expected, there are after all, only so many cpu cycles to go
>> around.  Here I sit, running 2.6.21-rc6 ATM, and since there is not an
>> SD patch that applies cleanly to rc6, I am back to typing half or more
>> of a sentence blind while I answer a posting such as this because of x
>> starvation while kmail is sorting incoming stuff.
>
>it would be really nice to analyze this. Does the latest -rt patch boot
>on your box so that we could trace this regression? (I can send you a
>standalone tracing patch if it doesnt.) IIRC you reported that one of
>the early patches from Mike made your system behave good (but still not
>as good as SD) - it would be nice to try a later patch too.

Yes it would be, Ingo, but so far none of the recent -rt patches has
booted on this machine; the last one I tried, a few days ago, failed to
find /dev/root, whatever the heck that is.

FWIW, I gave up on the -rt stuff 6 months or more ago when the regressions
I was reporting weren't ever acknowledged.  I don't enjoy sitting through
all these e2fsck's during the reboot just to have things I normally run in
the background die, like tvtime, sitting there with some news channel
muttering along in the background.  I was even ignored when I suggested
it might be a dma problem, which I still think it could be.

Nevertheless, the patch you sent is building as I type, intermittently
when the screen deigns to update so I can fix the spelling etc.

>basically, the current unfairness in the scheduler should be solved, one
>way or another. Good testcases were posted and there's progress.
>
>> (who the hell runs a 'make -j 200' or 50 while(1)'s in the real world?
>
>not many - and i dont think Mike tested any of these - Mike tested
>pretty low make -j values (Mike, can you confirm?).
>
>(I personally routinely run 'make -j 200' build jobs on my box [because
> it's the central server of a build cluster and high parallelism is
> needed to overcome network latencies], but i'm pretty special in that
> regard and i didnt use that workload as a test against any of these
> schedulers.)

And I'd wager a cool one that you don't gain more than a second or so in
compile time between a make -j8 and a make -j200 unless your network is a
pair of tomato juice cans & some string.  Again, to me, the network thing
is not something that's present in an everyday user's environment.  My
drives are all here and now, on pata-133 interfaces.

>   Ingo

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If you would keep a secret from an enemy, tell it not to a friend.

Re: Samba, inotify on a Windows share

2007-04-07 Thread Steve French (smfltc)

> Here's what I try to do. I want to monitor from a Linux Gentoo
> machine with inotify enabled on a directory for new files hosted by a
> Windows share (Windows server, not Samba).

Samba now uses inotify, if available, on the server side to support the
Directory Change Notification requested by certain CIFS clients (e.g.
Windows Explorer on most Windows clients), but the Linux CIFS client does
not have a complete implementation of the mapping between the fcntl
dnotify (the old way to do the same thing on Linux) and the CIFS
transact change notify request on the wire.  It would not be too hard to
finish up if anyone is looking for a small project.  Support for
inotify on the client (mapping to the CIFS transact change notify on the
wire) would be a little harder because Linux's inotify is a broader API
than the older fcntl one, but it could be done for some common cases.


kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6

2007-04-07 Thread Bartek


Hello,

I have a problem with a Linux kernel oops. It mostly appears when I
download large files, for example with BitTorrent. I have phone-modem
based Internet access using Home Internet Solution
(http://en.wikipedia.org/wiki/Home_internet_Solution). I use Debian
testing with a vanilla Linux 2.6.21-rc6 kernel (however, I have noticed
the problem since Linux 2.6.15, regardless of whether it was a vanilla
kernel or one from the Debian distro). Here is my system info:

Linux mars 2.6.21-rc6 #3 Sat Apr 7 16:11:12 CEST 2007 i686 GNU/Linux

Gnu C                  4.1.2
Gnu make               3.81
binutils               2.17
util-linux             2.12r
mount                  2.12r
module-init-tools      3.3-pre2
e2fsprogs              1.40-WIP
xfsprogs               2.8.11
PPP                    2.4.4
Linux C Library        2.3.6
Dynamic linker (ldd)   2.3.6
Procps                 3.2.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.97
udev                   105
Modules Loaded         nvidia vmnet vmmon nfs nfsd exportfs lockd
nfs_acl sunrpc button xt_TCPMSS xt_limit xt_tcpudp nf_nat_irc
nf_nat_ftp iptable_nat iptable_mangle ipt_LOG ipt_MASQUERADE nf_nat
ipt_TOS ipt_REJECT nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4
xt_state nf_conntrack nfnetlink iptable_filter ip_tables x_tables
ppp_async ipv6 ppp_generic slhc xfs eeprom w83781d w83627hf hwmon_vid
i2c_isa ide_generic parport_pc parport rtc floppy serio_raw pcspkr
psmouse i2c_viapro snd_via82xx snd_ac97_codec ac97_bus snd_pcm
snd_timer snd_page_alloc snd_mpu401_uart i2c_core via_ircc snd_rawmidi
snd_seq_device via_agp agpgart snd irda crc_ccitt soundcore evdev ext3
jbd mbcache usbhid ide_cd cdrom ide_disk generic uhci_hcd usbcore
via82cxxx ide_core e100 mii thermal processor fan

If more info is needed, don't hesitate to ask. If anyone can
help me and write a patch, I'd appreciate it.
Here is a log from syslog:

Apr  7 17:45:52 localhost kernel: skb_under_panic: text:f8e36c0e
len:107 put:1 head:df928000 data:df927fff tail:df92806a end:df928600
dev:
Apr  7 17:45:52 localhost kernel: [ cut here ]
Apr  7 17:45:52 localhost kernel: kernel BUG at net/core/skbuff.c:111!
Apr  7 17:45:52 localhost kernel: invalid opcode:  [#1]
Apr  7 17:45:52 localhost kernel: Modules linked in: w83781d w83627hf
i2c_dev nvidia(P) nfs nfsd exportfs lockd nfs_acl sunrpc button
xt_TCPMSS xt_limit xt_tcpudp nf_nat_irc nf_nat_ftp iptable_nat
iptable_mangle ipt_LOG ipt_MASQUERADE nf_nat ipt_TOS ipt_REJECT
nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4 xt_state
nf_conntrack nfnetlink iptable_filter ip_tables x_tables ppp_async
ipv6 ppp_generic slhc xfs eeprom hwmon_vid i2c_isa ide_generic rtc
snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart i2c_viapro i2c_core serio_raw snd_rawmidi
snd_seq_device via_ircc psmouse pcspkr floppy irda snd crc_ccitt
soundcore via_agp agpgart evdev ext3 jbd mbcache usbhid ide_cd cdrom
ide_disk generic uhci_hcd usbcore via82cxxx ide_core e100 mii thermal
processor fan
Apr  7 17:45:52 localhost kernel: CPU:0
Apr  7 17:45:52 localhost kernel: EIP:0060:[]
Tainted: P   VLI
Apr  7 17:45:52 localhost kernel: EFLAGS: 00010096   (2.6.21-rc6 #2)
Apr  7 17:45:52 localhost kernel: EIP is at skb_under_panic+0x59/0x5d
Apr  7 17:45:52 localhost kernel: eax: 0072   ebx: df928000   ecx:
   edx: 
Apr  7 17:45:52 localhost kernel: esi:    edi: df92806c   ebp:
df92806b   esp: c17e5ed8
Apr  7 17:45:52 localhost kernel: ds: 007b   es: 007b   fs: 00d8  gs:
  ss: 0068
Apr  7 17:45:52 localhost kernel: Process events/0 (pid: 3,
ti=c17e4000 task=dfd02030 task.ti=c17e4000)
Apr  7 17:45:52 localhost kernel: Stack: c02c4d51 f8e36c0e 006b
0001 df928000 df927fff df92806a df928600
Apr  7 17:45:52 localhost kernel:c02b83d5 e366f8e0 00ff
f8e36c13  c01044d7  c17e4000
Apr  7 17:45:52 localhost kernel:f7f9f576 f7f9f476 f6b9a800
0202 dfe60c00 0001 f7f9f400 f6b9a80c
Apr  7 17:45:52 localhost kernel: Call Trace:
Apr  7 17:45:52 localhost kernel:  []
ppp_asynctty_receive+0x3b0/0x584 [ppp_async]
Apr  7 17:45:52 localhost kernel:  []
ppp_asynctty_receive+0x3b5/0x584 [ppp_async]
Apr  7 17:45:52 localhost kernel:  [] common_interrupt+0x23/0x28
Apr  7 17:45:52 localhost kernel:  [] flush_to_ldisc+0xe6/0x124
Apr  7 17:45:52 localhost kernel:  [] flush_to_ldisc+0x0/0x124
Apr  7 17:45:52 localhost kernel:  [] run_workqueue+0x70/0x101
Apr  7 17:45:52 localhost kernel:  [] worker_thread+0x105/0x12e
Apr  7 17:45:52 localhost kernel:  [] default_wake_function+0x0/0xc
Apr  7 17:45:52 localhost kernel:  [] worker_thread+0x0/0x12e
Apr  7 17:45:52 localhost kernel:  [] kthread+0xa0/0xc8
Apr  7 17:45:52 localhost kernel:  [] kthread+0x0/0xc8
Apr  7 17:45:52 localhost kernel:  [] kernel_thread_helper+0x7/0x10
Apr  7 17:45:52 localhost kernel:  ===
Apr  7 17:45:52 localhost kernel: Code: 00 00 89 5c 24 14

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Michal Piotrowski

Andrew Morton wrote:
> On Sat, 07 Apr 2007 14:30:04 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> 
>> [EMAIL PROTECTED] wrote:
>>> The mm snapshot broken-out-2007-04-07-03-27.tar.gz has been uploaded to
>>>
>>>
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-04-07-03-27.tar.gz
>>>
>>> It contains the following patches against 2.6.21-rc6:
>>>
>> LTP triggered a ptrace problem.
>>
>> [ cut here ]
>> kernel BUG at kernel/ptrace.c:1281!
> 
> umm, Roland, if you're going to add assertions which only function with
> CONFIG_PREEMPT enabled, you'd better test with CONFIG_PREEMPT enabled ;)
> 
> 
> From: Andrew Morton <[EMAIL PROTECTED]>
> 
> We hold read_lock(tasklist_lock) in here.
> 
> Cc: Roland McGrath <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
> 
>  kernel/ptrace.c |2 --
>  1 file changed, 2 deletions(-)
> 
> diff -puN kernel/ptrace.c~utrace-fix-bug kernel/ptrace.c
> --- a/kernel/ptrace.c~utrace-fix-bug
> +++ a/kernel/ptrace.c
> @@ -1278,8 +1278,6 @@ found:
>current->pid, tsk->pid, p->pid, exit_code,
>p->exit_state, p->exit_signal);
>  
> - NO_LOCKS;
> -
>   /*
>* If there was a group exit in progress, all threads report that
>* status.  Most will have SIGKILL in their own exit_code.
> _
> 
> 
> 

Thanks.

BTW, I guess this one needs a similar fix.

kernel BUG at kernel/ptrace.c:494!
invalid opcode:  [#2]
PREEMPT SMP 
last sysfs file: devices/platform/w83627hf.656/temp2_input
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nfsd exportfs lockd 
nfs_acl autofs4 sunrpc af_packet nf_conntrack_netbios_ns ipt_REJECT 
nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables 
ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc 
thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus 
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss 
snd_mixer_oss intel_agp snd_pcm agpgart evdev snd_timer snd soundcore i2c_i801 
snd_page_alloc ide_cd cdrom rtc unix
CPU:1
EIP:0060:[]Not tainted VLI
EFLAGS: 00010202   (2.6.21-rc6-mm1 #1)
EIP is at ptrace_exit+0x29/0x21d

l *0xc0163f22
0xc0163f22 is in ptrace_exit (kernel/ptrace.c:494).
489 ptrace_exit(struct task_struct *tsk)
490 {
491 struct list_head *pos, *n;
492 int restart;
493
494 NO_LOCKS;
495
496 /*
497  * Taking the task_lock after PF_EXITING is set ensures that a
498  * child in ptrace_traceme will not put itself on our list when

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: Ten percent test

2007-04-07 Thread Ingo Molnar

* Gene Heskett <[EMAIL PROTECTED]> wrote:

> To be expected, there are after all, only so many cpu cycles to go 
> around.  Here I sit, running 2.6.21-rc6 ATM, and since there is not an 
> SD patch that applies cleanly to rc6, I am back to typing half or more 
> of a sentence blind while I answer a posting such as this because of x 
> starvation while kmail is sorting incoming stuff.

it would be really nice to analyze this. Does the latest -rt patch boot 
on your box so that we could trace this regression? (I can send you a 
standalone tracing patch if it doesnt.) IIRC you reported that one of 
the early patches from Mike made your system behave good (but still not 
as good as SD) - it would be nice to try a later patch too.

basically, the current unfairness in the scheduler should be solved, one 
way or another. Good testcases were posted and there's progress.

> (who the hell runs a 'make -j 200' or 50 while(1)'s in the real world?

not many - and i dont think Mike tested any of these - Mike tested 
pretty low make -j values (Mike, can you confirm?).

(I personally routinely run 'make -j 200' build jobs on my box [because
 it's the central server of a build cluster and high parallelism is
 needed to overcome network latencies], but i'm pretty special in that
 regard and i didnt use that workload as a test against any of these
 schedulers.)

Ingo

Re: [PATCH] console UTF-8 fixes

2007-04-07 Thread H. Peter Anvin


Egmont Koblinger wrote:
> On Sat, Apr 07, 2007 at 01:00:48PM +0200, Jan Engelhardt wrote:
>
> Hi,
>
>> Please, no dot, and no inverse color.
>> Imagine someone had the following bitmap for :
>
> No dot, I'm already convinced. To clarify the inverse thingy:
>
> This is what the current kernel does:
>   1) tries to display the desired symbol
>   2) if it fails, tries to display U+FFFD (which usually looks similar to an
>      inverted question mark)
>   3) if this fails again then displays a normal '?'
>      (or a different symbol due to a bug discussed below)
>
> Here's my proposal. This only alters the 3rd step, not the first two:
>   1) tries to display the desired symbol
>   2) if it fails, tries to display U+FFFD, still with _normal_ attributes
>   3) if this fails then display an ascii '?' with inverted attributes
>
> So you won't get "double" inversion. If you do have U+FFFD in your font then
> this will introduce no change. If you don't have U+FFFD, you'll see inverse
> question marks instead of normal ones.

This seems fine.

>> I blame your latin2 unicode map. (See above about 'Û'.)
>
> There's nothing wrong with my latin2 unicode map, and I've located and
> changed the part _in the kernel_ that displays a false glyph using the
> algorithm I've outlined. It just uses "the glyph at that code position
> within the glyph table" as a fallback, which might be okay in 8-bit mode
> (and I haven't modified the behavior in that case), but I got rid of this
> behavior in UTF-8 mode since it's definitely a fault in the world of
> Unicode.
>
>> It should perhaps display a regular 'u' if it cannot display 'û',
>
> I rather think it should display U+FFFD but YMMV.

That's a policy decision for the maker of the Unicode map.  The kernel
cannot by default know that a pre-composed ű is a modified u; obviously,
if the ű is sent in decomposed form the kernel probably will display it
as u? or some such.

>> but definitely not 'ü' (which is not called a double accent, btw).
>
> This is not the character I've been talking about, I actually _did_ talk
> about u with double acute accent (ű - you might not have seen this character
> so far, AFAIK it's only used in Hungarian, no other languages). But we agree
> that the kernel definitely shouldn't display a character with a different
> accent on it. This is one of the bugs my patch addresses.

As far as width handling goes -- in order to make all the text line up
under all circumstances you need more than width handling.  The wcwidth()
stuff is specific to CJK -- a character set which is totally implausible
to display on the builtin console.  You also need bidir support (in case
you encounter Hebrew or Arabic), you need Indic shape handling (Indic
languages have some *very* odd composing rules), etc, and this is just
to know how much space to take up on the screen.

This is ridiculous.  It's much better to draw a line in the sand and say
that this is beyond the scope of the in-kernel Linux console.

	-hpa



Re: [patch 2/4] clean up identify_cpu

2007-04-07 Thread Andrew Morton

On Sat, 07 Apr 2007 10:20:17 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> 
wrote:

>  I don't have a x86-64 compile environment on
> hand, so the 64 bits are completely untested

http://userweb.kernel.org/~akpm/cross-compilers/

Re: [PATCH] ip_tables.h

2007-04-07 Thread Patrick Ale


On 4/7/07, Patrick Ale <[EMAIL PROTECTED]> wrote:
> Hi lads,

Oh! And! I don't want to take credit for things I didn't write, which
is exactly the case here.
I merely took the ip_tables.h header from 2.6.20 and filtered out what
I needed to get things to work.
So, the actual credit for the code in the patch goes to the person who
put it in ip_tables.h under 2.6.20.

So whoever you are, thanks :)

Re: mm snapshot broken-out-2007-04-07-03-27.tar.gz uploaded

2007-04-07 Thread Andrew Morton

On Sat, 07 Apr 2007 14:30:04 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
> > The mm snapshot broken-out-2007-04-07-03-27.tar.gz has been uploaded to
> > 
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-04-07-03-27.tar.gz
> > 
> > It contains the following patches against 2.6.21-rc6:
> > 
> 
> LTP triggered a ptrace problem.
> 
> [ cut here ]
> kernel BUG at kernel/ptrace.c:1281!

umm, Roland, if you're going to add assertions which only function with
CONFIG_PREEMPT enabled, you'd better test with CONFIG_PREEMPT enabled ;)


From: Andrew Morton <[EMAIL PROTECTED]>

We hold read_lock(tasklist_lock) in here.

Cc: Roland McGrath <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/ptrace.c |2 --
 1 file changed, 2 deletions(-)

diff -puN kernel/ptrace.c~utrace-fix-bug kernel/ptrace.c
--- a/kernel/ptrace.c~utrace-fix-bug
+++ a/kernel/ptrace.c
@@ -1278,8 +1278,6 @@ found:
 current->pid, tsk->pid, p->pid, exit_code,
 p->exit_state, p->exit_signal);
 
-   NO_LOCKS;
-
/*
 * If there was a group exit in progress, all threads report that
 * status.  Most will have SIGKILL in their own exit_code.
_



Re: sky2 PHY setup

2007-04-07 Thread Rob Sims

On Wed, Apr 04, 2007 at 11:19:30AM -0700, Stephen Hemminger wrote:
> On Mon, 26 Mar 2007 21:24:06 -0600
> Rob Sims <[EMAIL PROTECTED]> wrote:
> 
> > On Fri, Mar 16, 2007 at 02:16:48PM -0700, Stephen Hemminger wrote:

> > > Use ethtool -S to see if there are any pause frames, etc. See if frames are
> > > still making it into PHY statistics but not being received.
> > > 
> > > Use ethtool -d to dump registers. Need current version of ethtool with 
> > > decode logic.
> > > 
> > > Then look for things like is Ram buffer read/write pointer changing?
> > > 
> > > Is GMAC stuck in pause:
> > > 
> > > Normal is:
> > >   GMAC 1
> > >   Status   0x5010  (see GM_GPSR_XXX in sky2.h)
> > >   Control  0x1800
> > > 
> > > Stuck is
> > >   GMAC 1
> > >   Status   0x5810 (or 0x5A10)
> > 
> > First, here's the described hang in action, on the Core2 Duo on a 1Gb
> > hub:
> > GMAC 1 Status/Control remains at 0x5010/0x1800 until module is removed.
> > Read/write buffer pointers are changing.  Full ethtool output in
> > http://www.robsims.com/sky2.netmon.log.gz
> > 
> > This machine was also having major throughput problems - 17 kB/s.
> > Rebooting brought it to ~ 20 MB/s.  Booting into a kernel with the
> > proprietary sk98lin kernel module showed ~ 80MB/s.  Finally, returning
> > to sky2 gave 117 MB/s.  Tests run using netcat, dd, /dev/zero, and
> > /dev/null, transmitting from the problem box to an e1000 via a Netgear
> > GS108.  No hangs were observed during the "load test."
> 
> The vendor driver does not do hardware flow control correctly. It ignores
> Tx pause frames.
 
> Were you using Jumbo MTU?

No - 1500.

> You might have over run the hub and it wedged.  Try doing:
>   ethtool -r eth0
> that forces a down/up

I'm still seeing throughput drop to under 1 Mb/s (50-100kB/s)
periodically.  I did an ethtool -S and ethtool -d to capture state.
(sky2-fail.log).  I ran ethtool -r and retested; no change
(sky2-ethtool-r.log).  Finally, ifdown - rmmod - modprobe - ifup, which
restored to Gigabit speeds (96+ MB/s), (sky2-modcycle.log).

Please let me know if there's anything else I can poke into.  As a
reminder, this is an 88E8053 r20.  Next time I see a degradation, I'll
try cycling the switch.
-- 
Rob
NIC statistics:
 tx_bytes: 84771583
 rx_bytes: 94680579
 tx_broadcast: 3
 rx_broadcast: 1356
 tx_multicast: 655
 rx_multicast: 22
 tx_unicast: 173780
 rx_unicast: 181842
 tx_mac_pause: 0
 rx_mac_pause: 0
 collisions: 0
 late_collision: 0
 aborted: 0
 single_collisions: 0
 multi_collisions: 0
 rx_short: 0
 rx_runt: 0
 rx_64_byte_packets: 2360
 rx_65_to_127_byte_packets: 82349
 rx_128_to_255_byte_packets: 35420
 rx_256_to_511_byte_packets: 8584
 rx_512_to_1023_byte_packets: 3734
 rx_1024_to_1518_byte_packets: 50773
 rx_1518_to_max_byte_packets: 0
 rx_too_long: 0
 rx_fifo_overflow: 0
 rx_jabber: 0
 rx_fcs_error: 0
 tx_64_byte_packets: 1744
 tx_65_to_127_byte_packets: 102366
 tx_128_to_255_byte_packets: 19607
 tx_256_to_511_byte_packets: 8735
 tx_512_to_1023_byte_packets: 4438
 tx_1024_to_1518_byte_packets: 31033
 tx_1519_to_max_byte_packets: 0
 tx_fifo_underrun: 0

PCI config
--
00: ab 11 62 43 07 00 10 00 20 00 00 02 04 00 00 00
10: 04 c0 9f fa 00 00 00 00 01 c8 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 42 81
30: 00 00 9c fa 48 00 00 00 00 00 00 00 05 01 00 00
40: 00 00 f0 01 00 80 a0 01 01 50 02 fe 00 20 00 13
50: 03 5c fc 80 00 00 00 01 00 00 00 01 05 e0 82 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Control Registers
-
Register Access Port             0x00
LED Control/Status               0xA603164A
Interrupt Source                 0x
Interrupt Mask                   0xC01D
Interrupt Hardware Error Source  0x
Interrupt Hardware Error Mask    0x2E003F3F

Bus Management Unit
---
CSR Receive Queue 1              0x0001
CSR Sync Queue 1                 0x
CSR Async Queue 1                0x

MAC Addresses
---
Addr 1                           00 1A 92 23 52 4D
Addr 2                           00 1A 92 23 52 4D
Addr 3                           00 00 00 00 00 00

Connector type                   0x4A (J)
PMD type                         0x31 (1)
PHY type                         0x80
Chip Id                          0xB6 Yukon-2 EC (rev 0)
Ram Buffer                       0x0C

Status BMU:
---
Control                          0x0002220A
Last Index                       0x07FF
Put Index                        0x025F
List Address                     0x1276
Transmit 1 done index            0x0118
Transmit index threshold         0x000A

Status FIFO
Write Pointer                    0x16
Read Pointer                     0x16
Level                            0x00

Re: Kernel NULL pointer when loading bcm43xx-mac80211 with fwpostfix = ".fw4"

2007-04-07 Thread Michael Buesch

On Saturday 07 April 2007 19:44, Larry Finger wrote:
> Johannes Berg wrote:
> > On Sat, 2007-04-07 at 15:51 +0200, Michael Buesch wrote:
> >> On Saturday 07 April 2007 02:01, Larry Finger wrote:
> >>> The current mb and wireless-dev git trees both get a kernel NULL pointer
> >>> in "param_set_copystring" when modprobe'ing bcm43xx-mac80211 with a line
> >>> of 'options bcm43xx-mac80211 fwpostfix = ".fw4"' in
> >>> /etc/modprobe.conf.local. This construction used to work and still does
> >>> for bcm43xx-softmac. I compared the code between the two versions and
> >>> cannot see any real differences. Any suggestions?
> >> Uhm, fwpostfix=.fw4 works fine for me when I pass it to modprobe.
> > 
> > I use
> > $ cat /etc/modprobe.d/bcm43xx 
> > options bcm43xx-mac80211 fwpostfix=-v4
> > options bcm43xx_mac80211 fwpostfix=-v4
> > options bcm43xx fwpostfix=-v3
> > 
> > and it works great.
> 
> My bad. As shown above, I had white space around the equals - a no-no.
> Removing it fixed the problem. The examples both of you showed gave me
> the clue.

Hm, probably a bug in the modparam subsystem, though.
I'd say it shouldn't crash, at least.
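
To make the whitespace problem concrete, here is a small sketch of how such
an options line falls apart when it is cut at whitespace. It is a simplified
model, not modprobe's actual parser: the helper parse_options is made up for
this example, and the assumption is only that parameters are separated by
whitespace and each one is expected to look like name=value.

```python
def parse_options(line):
    """Split 'options <module> ...' into (module, {param: value or None}).

    Hypothetical helper: tokens are cut at whitespace, and a token
    without '=' is recorded as a parameter that has no value at all.
    """
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        raise ValueError("not an options line")
    params = {}
    for token in fields[2:]:
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return fields[1], params
```

Under that model, 'fwpostfix = ".fw4"' yields three separate tokens, and
fwpostfix itself arrives with no value, whereas 'fwpostfix=.fw4' yields the
single pair that was intended.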

-- 
Greetings Michael.

Re: [PATCH] timekeeping: drop irq-context clocksource polling

2007-04-07 Thread Daniel Walker

On Sat, 2007-04-07 at 03:19 -0700, Andrew Morton wrote:
> On Thu, 05 Apr 2007 14:03:16 -0700 Daniel Walker <[EMAIL PROTECTED]> wrote:
> 
> > Before this change the timekeeping code would poll the clocksource
> > list every interrupt. This changes that so the clocksource list is
> > only checked when there has been an update, and no longer checks
> > in interrupt context.
> 
> I get a complete lockup on i386 SMP - before the kernel has printed anything.
> 
> I'm suspecting a recursive taking of xtime_lock.

Looks like this path:

arch/i386/kernel/tsc.c: time_cpufreq_notifier(); <-- takes xtime_lock
 mark_tsc_unstable();
  clocksource_change_rating(_tsc, 0);
   timekeeping_change_clocksource(); <-- takes xtime_lock


I'm not sure why time_cpufreq_notifier() is taking the xtime_lock, though.
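
As a user-space illustration of why that path locks up, the sketch below
re-enacts it with a non-recursive lock. It is only an analogy: the Python
functions merely borrow the names from the call chain above, the real
xtime_lock is a kernel lock rather than a threading.Lock, and the inner
function uses a non-blocking attempt so that the example reports the
problem instead of hanging.

```python
import threading

# Stand-in for xtime_lock: like the kernel lock, it is not recursive.
xtime_lock = threading.Lock()

def timekeeping_change_clocksource():
    # The inner function wants the lock as well. A blocking acquire would
    # wait forever if the caller already holds it, so try without blocking.
    got_it = xtime_lock.acquire(blocking=False)
    if got_it:
        xtime_lock.release()
    return got_it

def time_cpufreq_notifier():
    with xtime_lock:  # the outer function already holds the lock
        return timekeeping_change_clocksource()
```

Called on its own, timekeeping_change_clocksource() gets the lock; called
from time_cpufreq_notifier(), it cannot, which in the kernel shows up as a
hang rather than a return value.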

Daniel


[PATCH] ip_tables.h

2007-04-07 Thread Patrick Ale


Hi lads,

I had some problems compiling the external netfilter modules due to
missing definitions.
I googled a lot and saw a lot of people having the same problems, but no
real answer on how to fix it.

So.. I made a little patch which makes things work for me, at least.

Modules that work after applying the patch are the geoip module and the
connlimit module, and probably more, but I didn't test them.

Please note, I am not a coder, not a maintainer, and I am happy that I
didn't break anything, so please don't consider this a proposal for
inclusion in the kernel or something. I am just in favor of sharing what
helped me get things to work; if I can help others with it, or if
it is interesting material for inclusion, even better :)

So, the patch is attached to this email and can also be found at
http://www.patrickale.eu/documents/archives/patches/ip_tables.h.diff


I hope this helps a person or two.


Patrick
--- include/linux/netfilter_ipv4/ip_tables.h.orig   2007-04-07 20:30:25.344365707 +0200
+++ include/linux/netfilter_ipv4/ip_tables.h        2007-04-07 20:34:05.076887550 +0200
@@ -34,6 +34,12 @@
 #define ipt_table xt_table
 #define ipt_get_revision xt_get_revision

+#define ipt_register_match(mtch)\
+({  (mtch)->family = AF_INET;   \
+xt_register_match(mtch); })
+#define ipt_unregister_match(mtch) xt_unregister_match(mtch)
+
+
 /* Yes, Virginia, you have to zero the padding. */
 struct ipt_ip {
/* Source and destination IP addr */

Re: Reiser4. BEST FILESYSTEM EVER.

2007-04-07 Thread Valdis . Kletnieks

On Fri, 06 Apr 2007 19:47:36 PDT, [EMAIL PROTECTED] said:
> On Fri, 6 Apr 2007 11:21:19 -0400, "Jan Harkes" <[EMAIL PROTECTED]>

> > With compression there is a pretty high probability that one corrupted
> > byte or disk block will result in loss of a considerably larger amount
> > of data. 
> 
> Bad blocks are NOT dealt with by the filesystem,... so your comment is
> irrelevant, or just plain wrong.
> 
> If your filesystem is writing to bad blocks, then throw away your
> operating system.

You know... occasionally, blocks go bad *after* you write to them.  If
you have an uncompressed filesystem, it's often possible to recover most
of the file, and just have a few 512-byte blocks of zeros, simply by
doing something like 'dd if=bad.file of=recovered.file bs=512 conv=noerror,sync'
or careful applications of 'skip=N'.  If it's compressed, you usually
can't recover the rest of a compression group if a previous block is lost.
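
The same block-by-block salvage idea can be sketched in a few lines. This is
a minimal illustration, not a recovery tool: the function name salvage and
the read_block callback are invented for the example, and the callback is
assumed to raise OSError for an unreadable block.

```python
def salvage(read_block, n_blocks, block_size=512):
    """Copy n_blocks blocks, writing zeros for every block that cannot be read.

    read_block(i) is a caller-supplied function (hypothetical) returning the
    bytes of block i or raising OSError on a read error.
    Returns (recovered_bytes, list_of_bad_block_numbers).
    """
    out = bytearray()
    bad = []
    for i in range(n_blocks):
        try:
            data = read_block(i)
        except OSError:
            data = b""
            bad.append(i)
        # Pad failed or short reads so later blocks keep their offsets.
        out += data.ljust(block_size, b"\0")
    return bytes(out), bad
```

On an uncompressed file only the bad blocks come back as zeros; everything
after them is still at the right offset and readable, which is the property
a compression group does not give you.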

(And for those who talk about backups - yes, taking backups is good.
However, it's the rare laptop or desktop machine that can afford the
luxury of RAID disks, and backups usually happen once a night, if that
often.  This means that if you've been working hard on something important
all day, and the disk blows chunks at 4:30PM, you *will* be suddenly very
concerned over exactly how much you can recover off the failing drive.)

And yes, I'd *love* to have all my users connected to nice SAN systems that do
snapshotting and remote replication to DR sites and all that - but have you
ever *priced* a petabyte of SAN storage, the NAS gateways to serve it to users,
and upgrading several tens of thousands of network ports to Gig-E? Hint -
US$1M would get us through a pilot, and probably $5M and up to *start*
deployment. Anybody wanna buy us an EMC DMX-3? :)

http://www.emc.com/products/systems/symmetrix/DMX_series/DMX3.jsp




Re: [PATCH 3/7] Containers (V8): Add generic multi-subsystem API to containers

2007-04-07 Thread Paul Menage


On 4/6/07, Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:
> On Fri, Apr 06, 2007 at 04:32:24PM -0700, [EMAIL PROTECTED] wrote:
> > +static int attach_task(struct container *cont, struct task_struct *tsk)
> >  {
>
> [snip]
>
> > + task_lock(tsk);
>
> You need to check here if task state is PF_EXITING and fail with
> -ESRCH if so? Otherwise we risk breaking refcount on
> init_container_group.

Yes, I think you're right; I've now changed it to this in my tree:

	task_lock(tsk);
	if (tsk->flags & PF_EXITING) {
		task_unlock(tsk);
		put_container_group(newcg);
		return -ESRCH;
	}
	rcu_assign_pointer(tsk->containers, newcg);
	task_unlock(tsk);

Thanks,

Paul

[PATCH] partitions: Enhance Kconfig help text for EESOX and MSDOS formats

2007-04-07 Thread John Anthony Kazos Jr.

From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

Adds help text for ACORN_PARTITION_EESOX and improves help text for 
MSDOS_PARTITION in fs/partitions/Kconfig.

Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

Applied against Linux v2.6.20.6.

--- linux-2.6.20.6-orig/fs/partitions/Kconfig   2007-04-06 16:02:48.0 -0400
+++ linux-2.6.20.6-mod/fs/partitions/Kconfig    2007-04-07 13:22:17.0 -0400
@@ -32,6 +32,10 @@ config ACORN_PARTITION_EESOX
 	bool "EESOX partition support" if PARTITION_ADVANCED
 	default y if ARCH_ACORN
 	depends on ACORN_PARTITION
+	help
+	  EESOX SCSI card on-disk partition format support for Acorn
+	  systems. If you have one of these cards, or want to use a disk
+	  written by one, say Y.
 
 config ACORN_PARTITION_ICS
 	bool "ICS partition support" if PARTITION_ADVANCED
@@ -108,7 +112,11 @@ config MSDOS_PARTITION
 	bool "PC BIOS (MSDOS partition tables) support" if PARTITION_ADVANCED
 	default y
 	help
-	  Say Y here.
+	  Standard PC-compatible partition table support for Linux. Used by
+	  i386 systems, Linux/Windows dual-boot systems, and many others.
+	  Unless you are certain your system does not use this partition
+	  table format, and you're not using any disks from a system that
+	  does, say Y.
 
 config BSD_DISKLABEL
 	bool "BSD disklabel (FreeBSD partition tables) support"

Re: [PATCH] console UTF-8 fixes

2007-04-07 Thread Egmont Koblinger

On Sat, Apr 07, 2007 at 01:00:48PM +0200, Jan Engelhardt wrote:

Hi,

> Please, no dot, and no inverse color.
> Imagine someone had the following bitmap for :

No dot, I'm already convinced. To clarify the inverse thingy:

This is what the current kernel does:
  1) tries to display the desired symbol
  2) if it fails, tries to display U+FFFD (which usually looks similar to an
 inverted question mark)
  3) if this fails again then displays a normal '?'
 (or a different symbol due to a bug discussed below)

Here's my proposal. This only alters the 3rd step, not the first two:
  1) tries to display the desired symbol
  2) if it fails, tries to display U+FFFD, still with _normal_ attributes
  3) if this fails too, displays an ASCII '?' with inverted attributes

So you won't get "double" inversion. If you do have U+FFFD in your font then
this will introduce no change. If you don't have U+FFFD, you'll see inverse
question marks instead of normal ones.
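
As a minimal userspace sketch of the three proposed steps (not the kernel code, and not the patch itself): struct font, struct cell, font_has_glyph() and pick_glyph() are hypothetical names used only for illustration, and the font is modelled as a plain list of the code points it has glyphs for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UNI_REPLACEMENT 0xFFFDu  /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical font model: the code points that have a glyph. */
struct font {
    const uint32_t *codepoints;
    size_t count;
};

/* What gets drawn: which code point, and whether attributes are inverted. */
struct cell {
    uint32_t codepoint;
    bool inverse;
};

static bool font_has_glyph(const struct font *font, uint32_t cp)
{
    assert(font != NULL);
    for (size_t i = 0; i < font->count; i++)
        if (font->codepoints[i] == cp)
            return true;
    return false;
}

static struct cell pick_glyph(const struct font *font, uint32_t wanted)
{
    /* 1) the desired symbol, normal attributes */
    if (font_has_glyph(font, wanted))
        return (struct cell){ wanted, false };

    /* 2) U+FFFD, still with normal attributes */
    if (font_has_glyph(font, UNI_REPLACEMENT))
        return (struct cell){ UNI_REPLACEMENT, false };

    /* 3) last resort: ASCII '?' with inverted attributes */
    return (struct cell){ '?', true };
}
```

With a font that contains U+FFFD, step 3 is never reached, so nothing is ever drawn inverted; only fonts without U+FFFD produce the inverse '?'.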

> I blame your latin2 unicode map. (See above about 'Û'.)

There's nothing wrong with my latin2 unicode map, and I've located and
changed the part _in the kernel_ that displays a false glyph using the
algorithm I've outlined. It just uses "the glyph at that code position
within the glyph table" as a fallback, which might be okay in 8-bit mode
(and I haven't modified the behavior in that case), but I got rid of this
behavior in UTF-8 mode since it's definitely a fault in the world of
Unicode.

> It should perhaps display a regular 'u' if it cannot display 'û',

I rather think it should display U+FFFD but YMMV.

> but definitely not 'ü' (which is not called a double accent, btw).

This is not the character I've been talking about; I actually _did_ talk
about u with double acute accent (ű - you might not have seen this character
so far; AFAIK it's only used in Hungarian, no other languages). But we agree
that the kernel definitely shouldn't display a character with a different
accent on it. This is one of the bugs my patch addresses.

bye,

Egmont
