Re: epoll and shared fd's

2008-02-26 Thread Bodo Eggert
Michael Kerrisk [EMAIL PROTECTED] wrote:

 a) I did a
 
 s/internal kernel handle/open file description/
 
 since that is the POSIX term for the internal handle.
 
 b) It seems to me that your text doesn't quite make the point explicit
 enough.  I've tried to rewrite it; could you please check:
 
A6 Yes, but be aware of the following point.  A  file
   descriptor is a reference to an open file descrip-
   tion (see  open(2)).   Whenever  a  descriptor  is
   duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
   or fork(2), a new file descriptor referring to the
   same  open  file  description is created.  An open
   file description continues to exist until all file
   descriptors referring to it have been closed.  The
   epoll  interface  automatically  removes  a   file
   descriptor  from  an  epoll set only after all the
   file descriptors referring to the underlying  open
   file description have been closed.  This means that
   even after a file descriptor that is  part  of  an
   epoll  set has been closed, events may be reported
   for that file descriptor if other file descriptors
   referring  to the same underlying file description
   remain open.
 
 Does that seem okay?  I plan to include the text in man-pages-2.79.

It's hard to read for me, and probably very hard to read for others.
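
Maybe an example says it better than the paragraph. The following is only a sketch, assuming Linux and using a pipe as the monitored file; the function name is made up for illustration. It registers an fd, closes it while a dup() stays open, and then asks epoll for events:

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns the number of events epoll_wait() reports for a pipe read end
 * that was registered and then closed while a dup() of it stays open.
 * Returns -1 on setup failure. All names here are illustrative.
 */
static int events_after_close(void)
{
	int p[2], dupfd, epfd, n;
	struct epoll_event ev = { .events = EPOLLIN }, out;

	if (pipe(p) < 0)
		return -1;
	dupfd = dup(p[0]);		/* second fd, same open file description */
	epfd = epoll_create(1);
	if (dupfd < 0 || epfd < 0)
		return -1;

	ev.data.fd = p[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) < 0)
		return -1;

	close(p[0]);			/* the registered fd is gone ... */
	if (write(p[1], "x", 1) != 1)	/* ... but the description is not */
		return -1;

	n = epoll_wait(epfd, &out, 1, 0);

	close(dupfd);
	close(p[1]);
	close(epfd);
	return n;
}
```

As long as dupfd is open, the event is still delivered, and out.data.fd carries the number of the fd that was already closed.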


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Plans for mISDN? Was: [PATCH 00/14] [ISDN] ...

2008-02-25 Thread Bodo Eggert
Andi Kleen [EMAIL PROTECTED] wrote:

 we were talking about the load order. This will solve the load order,
 but if we have races like the kind you described, then the whole mISDN
 design is broken.
 
 It's more a generic problem of the module code.

It's a problem of not enough synchronisation before a module load completes.

If a module provides an interface, but needs some time after being loaded to
initialize, it obviously MUST provide a way to wait for it. Since you'll
need some i-need-module-foo functions anyway, why not: (bar needs foo)

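--- foo.c ---
#include <linux/completion.h>
#include <linux/module.h>

static DECLARE_COMPLETION(init_complete);

static void module_foo_init_async(void)
{
	/* ... slow initialisation ... */
	complete_all(&init_complete);	/* release current and future waiters */
}

/* Called by users of foo; blocks until foo is fully initialised. */
void usemod_foo(void)
{
	wait_for_completion(&init_complete);
}
EXPORT_SYMBOL(usemod_foo);

--- bar.c ---
#include <linux/completion.h>
#include <linux/module.h>

static DECLARE_COMPLETION(init_complete);

static void module_bar_init_async(void)
{
	usemod_foo();			/* bar needs foo */
	/* ... */
	complete_all(&init_complete);
}

void usemod_bar(void)
{
	wait_for_completion(&init_complete);
}
EXPORT_SYMBOL(usemod_bar);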




Re: Handshaking on USB serial devices

2008-02-15 Thread Bodo Eggert
Gene Heskett [EMAIL PROTECTED] wrote:

['cat /dev/hideaw0 | hexdump -v']

 Or some way to ship the 
 $00's to /dev/null so hexdump ignores them?

.. | perl -pe 's/\0//g' | ...
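
If perl is not at hand, the same filter is a few lines of C. A minimal sketch; strip_nuls is a made-up helper name, and the caller is assumed to feed it whatever buffer it read from the device:

```c
#include <stddef.h>

/* Remove every NUL byte from buf[0..len) in place; returns the new length. */
static size_t strip_nuls(unsigned char *buf, size_t len)
{
	size_t in, out = 0;

	for (in = 0; in < len; in++)
		if (buf[in] != 0)
			buf[out++] = buf[in];
	return out;
}
```

Call it on each chunk returned by read(2) and write out only the first strip_nuls() bytes.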




Re: Is there a blackhole /dev/null directory?

2008-02-14 Thread Bodo Eggert
rzryyvzy [EMAIL PROTECTED] wrote:

 /dev/null is often very useful, especially if programs insist on saving data
 in some file. But some programs create several temporary files with different
 names, so /dev/null no longer works.
 
 What about a /dev/null directory?
 I mean a blackhole pseudo directory which swallows every write.
 
 Here is how it could work:
 mount -t nulldir nulldir /dev/nulldir
 
 Now if a program does a creat(2),
 the file and its fd are created in memory.
 Then, if the program does a write(2) to the fd, the write is discarded and
 the call falsely reports the number of bytes as written. When the program
 does a close(2) of the fd, the complete inode is deleted from memory.
 
 The directory should be permanently empty except for the inodes with open
 file descriptors. So only inode information would be temporarily saved in
 this nulldir tmpfs directory.
 
 Is there already a way to create a null directory?

Please try the patch below. It will add an autounlink option to
tmpfs, which should automatically get rid of non-referenced files.

diff -X dontdiff -dpruN linux-2.6.24.pure/include/linux/shmem_fs.h linux-2.6.24.autounlink/include/linux/shmem_fs.h
--- linux-2.6.24.pure/include/linux/shmem_fs.h	2006-11-29 22:57:37.0 +0100
+++ linux-2.6.24.autounlink/include/linux/shmem_fs.h	2008-02-14 15:35:01.0 +0100
@@ -30,11 +30,14 @@ struct shmem_sb_info {
 	unsigned long free_blocks;  /* How many are left for allocation */
 	unsigned long max_inodes;   /* How many inodes are allowed */
 	unsigned long free_inodes;  /* How many are left for allocation */
-	int policy;		    /* Default NUMA memory alloc policy */
-	nodemask_t policy_nodes;    /* nodemask for preferred and bind */
+	unsigned int  flags;
+	int           policy;	    /* Default NUMA memory alloc policy */
+	nodemask_t    policy_nodes; /* nodemask for preferred and bind */
 	spinlock_t    stat_lock;
 };
 
+#define TMPFS_FL_AUTOREMOVE 1
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
 	return container_of(inode, struct shmem_inode_info, vfs_inode);
diff -X dontdiff -dpruN linux-2.6.24.pure/mm/shmem.c linux-2.6.24.autounlink/mm/shmem.c
--- linux-2.6.24.pure/mm/shmem.c	2008-01-25 15:09:39.0 +0100
+++ linux-2.6.24.autounlink/mm/shmem.c	2008-02-14 18:00:54.0 +0100
@@ -1747,31 +1747,41 @@ static int
 shmem_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
 {
 	struct inode *inode = shmem_get_inode(dir->i_sb, mode, dev);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(dir->i_sb);
 	int error = -ENOSPC;
 
-	if (inode) {
-		error = security_inode_init_security(inode, dir, NULL, NULL,
-						     NULL);
-		if (error) {
-			if (error != -EOPNOTSUPP) {
-				iput(inode);
-				return error;
-			}
-		}
-		error = shmem_acl_init(inode, dir);
-		if (error) {
+	if (!inode)
+		return error;
+
+	error = security_inode_init_security(inode, dir, NULL, NULL,
+					     NULL);
+	if (error) {
+		if (error != -EOPNOTSUPP) {
 			iput(inode);
 			return error;
 		}
-		if (dir->i_mode & S_ISGID) {
-			inode->i_gid = dir->i_gid;
-			if (S_ISDIR(mode))
-				inode->i_mode |= S_ISGID;
-		}
-		dir->i_size += BOGO_DIRENT_SIZE;
-		dir->i_ctime = dir->i_mtime = CURRENT_TIME;
-		d_instantiate(dentry, inode);
+	}
+	error = shmem_acl_init(inode, dir);
+	if (error) {
+		iput(inode);
+		return error;
+	}
+	if (dir->i_mode & S_ISGID) {
+		inode->i_gid = dir->i_gid;
+		if (S_ISDIR(mode))
+			inode->i_mode |= S_ISGID;
+	}
+
+	dir->i_size += BOGO_DIRENT_SIZE;
+	dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	d_instantiate(dentry, inode);
+	if ( S_ISDIR(mode)
+	 || !(sbinfo->flags & TMPFS_FL_AUTOREMOVE))
+	{
 		dget(dentry); /* Extra count - pin the dentry in core */
+	} else {
+		dir->i_size -= BOGO_DIRENT_SIZE;
+		drop_nlink(inode);
 	}
 	return error;
 }
@@ -1800,6 +1810,11 @@ static int shmem_link(struct dentry *old
 	struct inode *inode = old_dentry->d_inode;
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 
+	/* In auto-unlink mode, the newly created link would be unlinked
+	   immediately. We don't need to do anything here. */
+	if (sbinfo->flags & TMPFS_FL_AUTOREMOVE)
+		return 0;
+
 	/*
   

Re: Is there a blackhole /dev/null directory?

2008-02-14 Thread Bodo Eggert
Hans-Jürgen Koch [EMAIL PROTECTED] wrote:
 Jan Engelhardt [EMAIL PROTECTED] wrote:

 There is a much more interesting 'problem' with a /dev/null
 directory.
 
 Q: Why would you need such a directory?
 A: To temporarily fool a program into believing it wrote something.
 
 Q: Should all files disappear? (e.g. unlink after open)
 A: Maybe not, programs may stat() the file right afterwards and
get confused by the inexistence.
 
 Q: What if a program attempts to mkdir /dev/nullmnt/foo to just
create a file /dev/nullmnt/foo/barfile?
 A: /dev/nullmnt/foo must continue to exist or be accepted for a while,
or perhaps for eternity.
 
 Well, the problem seems to be that a directory is not just data but
 also contains metadata. While it's easy to write data to /dev/null, you
 cannot simply discard metadata associated with a directory. So, such a
 /dev/null-directory would have to remember metadata (at least all
 created filenames including subdirectories) in the same way as other
 filesystems do. Only file _content_ can be discarded.
 To be honest, I still cannot see many sensible use cases for that...

Since both of you seem to know about the (possible) problems, maybe you can
take a look at my patch.

http://7eggert.dyndns.org:8080/tmp/autounlink.patch
(Not inline, because it would be duplicate in this thread.)
(Yes, this patch needs some cleanup. I just did checkpatch.)

At first I thought I'd modify tmpfs to delete the file on O_CREAT, but it
turned out that tmpfs increases the dentry count in order to prevent the
delete-on-close effect. Skipping this step was almost enough; I had to
prevent link() from pinning these files, too.

Some loops creating files or linking them did not show any decrease in
available memory, and to the best of my knowledge, I did not introduce new
memory leaks. And the best thing is: It's only 150 bytes of code. Not bad
for an additional mount flag, is it?
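
The delete-on-close effect itself can be watched from userspace on any filesystem by unlinking by hand right after creating the file. This is not the patch, just a sketch of the behaviour the mount flag is meant to make automatic; the function name and the /tmp path are assumptions for illustration:

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file, unlink it at once and keep writing through the fd.
 * Returns the link count seen by fstat() afterwards, or -1 on error.
 */
static long nlink_after_unlink(void)
{
	char path[] = "/tmp/autounlink-demo-XXXXXX";	/* illustrative name */
	struct stat st;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (unlink(path) < 0 ||			/* the name is gone ... */
	    write(fd, "data", 4) != 4 ||	/* ... writes still succeed */
	    fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	close(fd);				/* last reference: storage is freed */
	return (long)st.st_nlink;
}
```

The file has no name and no links, yet the program that holds the fd does not notice until it tries to stat() the path.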



Re: [PATCH] Avoid buffer overflows in get_user_pages()

2008-02-12 Thread Bodo Eggert
Andrew Morton [EMAIL PROTECTED] wrote:
 On Mon, 11 Feb 2008 16:17:33 -0700 Jonathan Corbet [EMAIL PROTECTED] wrote:

 Avoid buffer overflows in get_user_pages()
 
 So I spent a while pounding my head against my monitor trying to figure
 out the vmsplice() vulnerability - how could a failure to check for
 *read* access turn into a root exploit?  It turns out that it's a buffer
 overflow problem which is made easy by the way get_user_pages() is
 coded.
 
 In particular, len is a signed int, and it is only checked at the
 *end* of a do {} while() loop.  So, if it is passed in as zero, the loop
 will execute once and decrement len to -1.  At that point, the loop will
 proceed until the next invalid address is found; in the process, it will
 likely overflow the pages array passed in to get_user_pages().

[...]

 Can we just convert
 
 do {
 ...
 } while (len);
 
 into
 
 while (len) {

while (len > 0), if I understand this patch correctly.
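
The difference between the loop shapes is easy to see in isolation. The following is a toy model, not get_user_pages(); the function names and the cap parameter (which only exists so the demo terminates) are made up:

```c
#include <assert.h>

/*
 * Loop shape described above: the body runs before len is tested, so
 * len == 0 becomes -1 and the test never sees zero again.
 * 'cap' bounds the run; the real loop has no such bound.
 */
static int iterations_do_while(int len, int cap)
{
	int n = 0;

	assert(cap > 0);
	do {
		n++;
		len--;
	} while (len && n < cap);
	return n;
}

/* Suggested shape: test first, and treat negative counts as "done". */
static int iterations_while_positive(int len)
{
	int n = 0;

	while (len > 0) {
		n++;
		len--;
	}
	return n;
}
```

With len == 0 the first variant runs until the cap stops it, while the second never enters the body. A plain while (len) would still run away for a negative len.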



Re: [PATCH] ide-core: remove conditional compiling with MODULE in ide-core.c

2008-02-02 Thread Bodo Eggert
Denis Cheng [EMAIL PROTECTED] wrote:

 use module_init/module_exit to replace the original conditional compiling;
 these macros were designed to handle both modular and built-in builds.
 
 the original __setup with an empty string was invalid and not executed,
 __setup("", ide_setup);
 
 however, with the current module_param mechanics, module parameters also can
 be input on the kernel command line, with this style:
 
 ide-core.options=ide=nodma hdd=cdrom idebus=...
 
 so Documentation/kernel-parameters.txt also updated.

 --- a/Documentation/kernel-parameters.txt

 - ide=[HW] (E)IDE subsystem
 - Format: ide=nodma or ide=doubler or ide=reverse
 - See Documentation/ide.txt.
 -
 - ide?=   [HW] (E)IDE subsystem
 - Format: ide?=noprobe or chipset specific parameters.
 - See Documentation/ide.txt.
 -
 - idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed
 + ide-core.options= [HW] (E)IDE subsystem
 + Format: ide-core.options=ide=nodma hdd=cdrom 
 idebus=...

IMO you should use separate options for things like nodma while you're at it.



[PATCH] Introduce softpanic V.2

2008-01-28 Thread Bodo Eggert
Enabling this option changes a hard panic on boot errors to a
soft panic, which does not stop the system completely.
You can still scroll the screen and read the messages.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

---

Fixed: s/SOFTPANIC/CONFIG_SOFTPANIC/

I did not implement shutting down the network on panic, which was requested 
to let the watchdog reboot the machine. For this purpose, you should use 
panic=$n.


diff -pruN -X dontdiff linux-2.6.24.pure/include/linux/kernel.h linux-2.6.24.softpanic/include/linux/kernel.h
--- linux-2.6.24.pure/include/linux/kernel.h	2008-01-25 15:09:36.0 +0100
+++ linux-2.6.24.softpanic/include/linux/kernel.h	2008-01-25 15:31:26.0 +0100
@@ -130,6 +130,12 @@ extern struct atomic_notifier_head panic
 extern long (*panic_blink)(long time);
 NORET_TYPE void panic(const char * fmt, ...)
__attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#ifdef CONFIG_SOFTPANIC
+NORET_TYPE void softpanic(const char *fmt, ...)
+   __attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#else
+# define softpanic(...) do { panic(__VA_ARGS__); } while (0)
+#endif
 extern void oops_enter(void);
 extern void oops_exit(void);
 extern int oops_may_print(void);
diff -pruN -X dontdiff linux-2.6.24.pure/init/Kconfig linux-2.6.24.softpanic/init/Kconfig
--- linux-2.6.24.pure/init/Kconfig  2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/init/Kconfig 2008-01-25 15:15:08.0 +0100
@@ -526,6 +526,14 @@ config BUG
   option for embedded systems with no facilities for reporting errors.
   Just say Y.
 
+config SOFTPANIC
+	bool "Enable softpanic for boot errors" if EMBEDDED
+	default y
+	help
+	Enabling this option changes a hard panic on boot errors to a
+	soft panic, which does not stop the system completely.
+	You can still scroll the screen and read the messages.
+
 config ELF_CORE
 	default y
 	bool "Enable ELF core dumps" if EMBEDDED
diff -pruN -X dontdiff linux-2.6.24.pure/init/do_mounts.c linux-2.6.24.softpanic/init/do_mounts.c
--- linux-2.6.24.pure/init/do_mounts.c	2008-01-25 15:08:31.0 +0100
+++ linux-2.6.24.softpanic/init/do_mounts.c	2008-01-25 15:15:08.0 +0100
@@ -330,7 +330,7 @@ retry:
 		printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
 
 		printk_all_partitions();
-		panic("VFS: Unable to mount root fs on %s", b);
+		softpanic("VFS: Unable to mount root fs on %s", b);
 	}
 
 	printk("List of all partitions:\n");
@@ -342,7 +342,7 @@ retry:
 #ifdef CONFIG_BLOCK
 	__bdevname(ROOT_DEV, b);
 #endif
-	panic("VFS: Unable to mount root fs on %s", b);
+	softpanic("VFS: Unable to mount root fs on %s", b);
 out:
 	putname(fs_names);
 }
diff -pruN -X dontdiff linux-2.6.24.pure/init/main.c linux-2.6.24.softpanic/init/main.c
--- linux-2.6.24.pure/init/main.c	2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/init/main.c	2008-01-25 15:15:08.0 +0100
@@ -585,7 +585,7 @@ asmlinkage void __init start_kernel(void
 	 */
 	console_init();
 	if (panic_later)
-		panic(panic_later, panic_param);
+		softpanic(panic_later, panic_param);
 
 	lockdep_info();
 
@@ -800,7 +800,7 @@ static int noinline init_post(void)
 	run_init_process("/bin/init");
 	run_init_process("/bin/sh");
 
-	panic("No init found.  Try passing init= option to kernel.");
+	softpanic("No init found.  Try passing init= option to kernel.");
 }
 
 static int __init kernel_init(void * unused)
diff -pruN -X dontdiff linux-2.6.24.pure/kernel/panic.c linux-2.6.24.softpanic/kernel/panic.c
--- linux-2.6.24.pure/kernel/panic.c	2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/kernel/panic.c	2008-01-25 18:37:59.0 +0100
@@ -142,6 +142,66 @@ NORET_TYPE void panic(const char * fmt, 
 
 EXPORT_SYMBOL(panic);
 
+#ifdef CONFIG_SOFTPANIC
+NORET_TYPE void softpanic(const char *fmt, ...)
+{
+	long i;
+	static char buf[1024];
+	va_list args;
+#if defined(CONFIG_S390)
+	unsigned long caller = (unsigned long) __builtin_return_address(0);
+#endif
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+	printk(KERN_EMERG "Kernel panic - not syncing: %s\n", buf);
+
+	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
+
+	if (!panic_blink)
+		panic_blink = no_blink;
+
+	if (panic_timeout > 0) {
+		/*
+		 * Delay timeout seconds before rebooting the machine.
+		 * We can't use the normal timers since we just panicked..
+		 */
+		printk(KERN_EMERG "Rebooting in %d seconds..", panic_timeout);
+		for (i = 0; i < panic_timeout*1000; ) {
+			touch_nmi_watchdog

Re: [PATCH] [8/18] BKL-removal: Remove BKL from remote_llseek

2008-01-28 Thread Bodo Eggert
Trond Myklebust [EMAIL PROTECTED] wrote:
 On Mon, 2008-01-28 at 05:38 +0100, Andi Kleen wrote:
 On Monday 28 January 2008 05:13:09 Trond Myklebust wrote:
  On Mon, 2008-01-28 at 03:58 +0100, Andi Kleen wrote:

   The problem is that it's not a race in who gets to do its thing first,
   but a parallel reader can actually see a corrupted value from the two
   independent words on 32bit (e.g. during a 4GB). And this could actually
   completely corrupt f_pos when it happens with two racing relative seeks
   or read/write()s
   
   I would consider that a bug.
  
  I disagree. The corruption occurs because this isn't a situation that is
  allowed by either POSIX or SUSv2/v3. Exactly what spec are you referring
  to here?
 
 No specific spec, just general quality of implementation. We normally don't
 have non thread safe system calls even if it was in theory allowed by some
 specification.
 
 We've had the existing implementation for quite some time. The arguments
 against changing it have been the same all along: if your application
 wants to share files between threads, the portability argument implies
 that you should either use pread/pwrite or use a mutex or some other
 form of synchronisation primitive in order to ensure that
 lseek()/read()/write() do not overlap.

Does anything in the kernel depend on f_pos being valid?
E.g. is it possible to read beyond the EOF using this race, or to have files
larger than the ulimit?

If not, update the manpage and be done.
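
For reference, the corruption Andi describes is a torn read: a 64-bit f_pos on a 32-bit machine is stored as two words, and a reader can pick up one word from before an update and one from after it. A small model, with a made-up function name and no real concurrency involved:

```c
#include <stdint.h>

/*
 * Model of a 64-bit offset stored as two 32-bit words. A reader that
 * loads the high word before a concurrent update and the low word after
 * it sees a value that was never stored.
 */
static uint64_t torn_read(uint64_t old_val, uint64_t new_val)
{
	uint32_t hi = (uint32_t)(old_val >> 32);	/* read before the update */
	uint32_t lo = (uint32_t)new_val;		/* read after the update */

	return ((uint64_t)hi << 32) | lo;
}
```

Moving from 0xFFFFFFFF to 0x100000000 (crossing 4GB) can thus be observed as offset 0, which is neither the old nor the new position.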



Re: [PATCH] [14/18] BKL-removal: Add unlocked_fasync

2008-01-27 Thread Bodo Eggert
 +++ linux/fs/fcntl.c
 @@ -240,11 +240,15 @@ static int setfl(int fd, struct file * f
  
  	lock_kernel();
  	if ((arg ^ filp->f_flags) & FASYNC) {
 -		if (filp->f_op && filp->f_op->fasync) {
 +		if (filp->f_op && filp->f_op->unlocked_fasync)
 +			error = filp->f_op->unlocked_fasync(fd, filp,
 +							!!(arg & FASYNC));
 +		else if (filp->f_op && filp->f_op->fasync) {
  			error = filp->f_op->fasync(fd, filp, (arg & FASYNC) != 0);
  			if (error < 0)
  				goto out;

No goto if you use unlocked_fasync?

  		}
 +		/* AK: no else error = -EINVAL here? */
  	}
  
  	filp->f_flags = (arg & SETFL_MASK) | (filp->f_flags & ~SETFL_MASK);


Re: [PATCH] Introduce softpanic

2008-01-25 Thread Bodo Eggert
On Fri, 25 Jan 2008, Jan Engelhardt wrote:
 On Jan 25 2008 15:54, Bodo Eggert wrote:

 +#ifdef SOFTPANIC
 
 #ifdef CONFIG_SOFTPANIC?

Thanks. I remember having fixed it ...
-- 
Professionals are predictable, it's the amateurs that are dangerous.


[PATCH] Introduce softpanic

2008-01-25 Thread Bodo Eggert
Enabling this option changes a hard panic on boot errors to a
soft panic, which does not stop the system completely.
You can still scroll the screen and read the messages.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

diff -pruN -X dontdiff linux-2.6.24.pure/include/linux/kernel.h linux-2.6.24.softpanic/include/linux/kernel.h
--- linux-2.6.24.pure/include/linux/kernel.h	2008-01-25 15:09:36.0 +0100
+++ linux-2.6.24.softpanic/include/linux/kernel.h	2008-01-25 15:31:26.0 +0100
@@ -130,6 +130,12 @@ extern struct atomic_notifier_head panic
 extern long (*panic_blink)(long time);
 NORET_TYPE void panic(const char * fmt, ...)
__attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#ifdef SOFTPANIC
+NORET_TYPE void softpanic(const char *fmt, ...)
+   __attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#else
+# define softpanic(...) do { panic(__VA_ARGS__); } while (0)
+#endif
 extern void oops_enter(void);
 extern void oops_exit(void);
 extern int oops_may_print(void);
diff -pruN -X dontdiff linux-2.6.24.pure/init/Kconfig linux-2.6.24.softpanic/init/Kconfig
--- linux-2.6.24.pure/init/Kconfig  2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/init/Kconfig 2008-01-25 15:15:08.0 +0100
@@ -526,6 +526,14 @@ config BUG
   option for embedded systems with no facilities for reporting errors.
   Just say Y.
 
+config SOFTPANIC
+	bool "Enable softpanic for boot errors" if EMBEDDED
+	default y
+	help
+	Enabling this option changes a hard panic on boot errors to a
+	soft panic, which does not stop the system completely.
+	You can still scroll the screen and read the messages.
+
 config ELF_CORE
 	default y
 	bool "Enable ELF core dumps" if EMBEDDED
diff -pruN -X dontdiff linux-2.6.24.pure/init/do_mounts.c linux-2.6.24.softpanic/init/do_mounts.c
--- linux-2.6.24.pure/init/do_mounts.c	2008-01-25 15:08:31.0 +0100
+++ linux-2.6.24.softpanic/init/do_mounts.c	2008-01-25 15:15:08.0 +0100
@@ -330,7 +330,7 @@ retry:
 		printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
 
 		printk_all_partitions();
-		panic("VFS: Unable to mount root fs on %s", b);
+		softpanic("VFS: Unable to mount root fs on %s", b);
 	}
 
 	printk("List of all partitions:\n");
@@ -342,7 +342,7 @@ retry:
 #ifdef CONFIG_BLOCK
 	__bdevname(ROOT_DEV, b);
 #endif
-	panic("VFS: Unable to mount root fs on %s", b);
+	softpanic("VFS: Unable to mount root fs on %s", b);
 out:
 	putname(fs_names);
 }
diff -pruN -X dontdiff linux-2.6.24.pure/init/main.c linux-2.6.24.softpanic/init/main.c
--- linux-2.6.24.pure/init/main.c	2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/init/main.c	2008-01-25 15:15:08.0 +0100
@@ -585,7 +585,7 @@ asmlinkage void __init start_kernel(void
 	 */
 	console_init();
 	if (panic_later)
-		panic(panic_later, panic_param);
+		softpanic(panic_later, panic_param);
 
 	lockdep_info();
 
@@ -800,7 +800,7 @@ static int noinline init_post(void)
 	run_init_process("/bin/init");
 	run_init_process("/bin/sh");
 
-	panic("No init found.  Try passing init= option to kernel.");
+	softpanic("No init found.  Try passing init= option to kernel.");
 }
 
 static int __init kernel_init(void * unused)
diff -pruN -X dontdiff linux-2.6.24.pure/kernel/panic.c linux-2.6.24.softpanic/kernel/panic.c
--- linux-2.6.24.pure/kernel/panic.c	2008-01-25 15:09:38.0 +0100
+++ linux-2.6.24.softpanic/kernel/panic.c	2008-01-25 15:38:52.0 +0100
@@ -142,6 +142,66 @@ NORET_TYPE void panic(const char * fmt, 
 
 EXPORT_SYMBOL(panic);
 
+#ifdef SOFTPANIC
+NORET_TYPE void softpanic(const char *fmt, ...)
+{
+	long i;
+	static char buf[1024];
+	va_list args;
+#if defined(CONFIG_S390)
+	unsigned long caller = (unsigned long) __builtin_return_address(0);
+#endif
+
+	va_start(args, fmt);
+	vsnprintf(buf, sizeof(buf), fmt, args);
+	va_end(args);
+	printk(KERN_EMERG "Kernel panic - not syncing: %s\n", buf);
+
+	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
+
+	if (!panic_blink)
+		panic_blink = no_blink;
+
+	if (panic_timeout > 0) {
+		/*
+		 * Delay timeout seconds before rebooting the machine.
+		 * We can't use the normal timers since we just panicked..
+		 */
+		printk(KERN_EMERG "Rebooting in %d seconds..", panic_timeout);
+		for (i = 0; i < panic_timeout*1000; ) {
+			touch_nmi_watchdog();
+			i += panic_blink(i);
+			mdelay(1);
+			i++;
+		}
+		/*  This will not be a clean reboot, with everything

Re: [PATCH] Introduce softpanic

2008-01-25 Thread Bodo Eggert
On Fri, 25 Jan 2008, Andi Kleen wrote:
 Bodo Eggert [EMAIL PROTECTED] writes:

  Enabling this option changes a hard panic on boot errors to a
  soft panic, which does not stop the system completely.
  You can still scroll the screen and read the messages.
 
 I don't think it's a good idea to keep the network running in the
 soft panic. A lot of people have setups that use ping as a watchdog
 and with nfsroot/ip=dhcp ping does work quite well before
 mounting root and then the watchdog might not pick up the 
 soft panic.

 Using a polled keyboard driver after panic seems to be the better
 option to me, but if you want softpanic you should probably
 at least add a suitable panic notifier to the network stack 
 to shut it all down.

I have no idea how to do it. If somebody has a big red arrow pointing to 
a HOWTO, I can give it a try.

OTOH, I think the panic timeout should do the job nicely.
-- 
If you talk about race, it does not make you a racist. If you see distinctions
between the genders, it does not make you a sexist. If you think critically
about a denomination, it does not make you anti-religion. If you accept but
don't celebrate homosexuality, it does not make you a homophobe. -- Charlton Heston


Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Bodo Eggert
Alan Cox [EMAIL PROTECTED] wrote:

 I'd tried to advocate SIGDANGER some years ago as well, but none of
 the kernel maintainers were interested.  It definitely makes sense
 to have some sort of mechanism like this.  At the time I first brought
 it up it was in conjunction with Netscape using too much cache on some
 system, but it would be just as useful for all kinds of other memory-
 hungry applications.
 
 There is an early thread for a /proc file which you can add to your
 poll() set and it will wake people when memory is low. Very elegant and
 if async support is added it will also give you the signal variant for
 free.

IMO you'll need a userspace daemon. The kernel does only know about the
amount of memory available / recommended for a system (or container),
while the user knows which program's cache is most precious today.

(Of course the userspace daemon will in turn need the /proc file.)

I think a single, system-wide signal is the second-worst solution: All
applications (or the wrong one, if you select one) would free their caches
and start to crawl, and either stay in this state or slowly increase their
caches again until they get signaled again. And the signal would either
come too early or too late. The userspace daemon could collect the weighted
demand of memory from all applications and tell them how much to use.
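
What such a daemon would compute is not much. A sketch of one possible policy, a plain proportional split; the function name, the weights and the unit are all hypothetical, and how the daemon learns the weights or talks to the applications is left open:

```c
#include <stddef.h>

/*
 * Split 'budget' (e.g. kilobytes of cache) among n applications in
 * proportion to their weights. share[i] receives application i's part.
 * Rounding leftovers stay unassigned. Entirely hypothetical policy.
 */
static void split_budget(unsigned long budget, const unsigned int *weight,
			 unsigned long *share, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += weight[i];
	for (i = 0; i < n; i++)
		share[i] = total ? budget / total * weight[i] : 0;
}
```

Dividing before multiplying wastes a little budget to rounding but cannot overflow, which seemed the better trade for a sketch.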



Re: [patch 2/2] 8250_pnp: register x86 COM ports at the conventional ttyS names

2008-01-16 Thread Bodo Eggert
Bjorn Helgaas [EMAIL PROTECTED] wrote:
 On Wednesday 16 January 2008 11:44:37 am H. Peter Anvin wrote:
 Bjorn Helgaas wrote:

  +#ifdef CONFIG_X86
  +	switch (port->iobase) {
  +	case 0x3f8: return 0;	/* COM1 -> ttyS0 */
  +	case 0x2f8: return 1;	/* COM2 -> ttyS1 */
  +	case 0x3e8: return 2;	/* COM3 -> ttyS2 */
  +	case 0x2e8: return 3;	/* COM4 -> ttyS3 */
  +  }
  +#endif
  +
 
 Arguably, the right thing is to use the addresses present in the array
 at address 0x400.  In particular, COM3 and COM4 aren't always at those
 addresses.
 
 Wow.  I bow before your storehouse of x86 arcana :-)
 
 I guess you're referring to the BIOS data area, which I'd never
 heard of before (but fortunately, Google knows).

You'll want to google for Ralf Brown ...
if your back allows bowing that much.

 What would you think about doing this only for COM1 and COM2?  The
 only real value for doing this in the first place is so console=ttyS0
 always goes to COM1, even if we don't have SERIAL_PORT_DFNS.  User-
 space ought to use some sort of udev magic if it cares about persistent
 naming.

Since the first four COM ports are magic, and since using the BIOS port
numbers will move the non-legacy ports anyway, you should use all up to
four stored port numbers.

BTW1: These addresses may be used to detect ports on non-standard addresses,
but unfortunately they don't tell the IRQ.

BTW2: When I submitted a patch using the BIOS data area, I was told that it
might not exist on systems booting from non-PC firmware. This claim was not
yet backed with any knowledge, nor did anybody suggest a way to detect this
situation.
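
For illustration, this is all it takes to read the table once you have the bytes. A sketch only: bda_com_base is a made-up name, and the caller is assumed to have copied the BIOS data area (physical 0x400) into a buffer by whatever means is appropriate; whether the area exists at all is exactly the open question above.

```c
#include <stdint.h>

/*
 * bda points at a copy of the BIOS data area (physical 0x400).
 * The first four little-endian 16-bit words hold the I/O bases of
 * COM1..COM4; 0 means "no port". idx is 0..3.
 */
static uint16_t bda_com_base(const uint8_t *bda, unsigned int idx)
{
	if (idx > 3)
		return 0;
	return (uint16_t)(bda[2 * idx] | (bda[2 * idx + 1] << 8));
}
```

Index i would then become ttyS<i> whenever the returned base is non-zero. The IRQ still has to come from somewhere else.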



Re: [RFD] Incremental fsck

2008-01-11 Thread Bodo Eggert
Al Boldi [EMAIL PROTECTED] wrote:

 Even after a black-out shutdown, the corruption is pretty minimal, using
 ext3fs at least.  So let's take advantage of this fact and do an optimistic
 fsck, to assure integrity per-dir, and assume no external corruption.  Then
 we release this checked dir to the wild (optionally ro), and check the next.
 Once we find external inconsistencies we either fix it unconditionally,
 based on some preconfigured actions, or present the user with options.

Maybe we can know the changes that need to be done in order to fix the
filesystem. Let's record this information in - eh - let's call it a journal!



Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Bodo Eggert
Abhishek Rai [EMAIL PROTECTED] wrote:

 Putting metacluster at the end of the block group gives slightly
 inferior sequential read throughput compared to putting it in the
 beginning or the middle, but the difference is very tiny and exists
 only for large files that span multiple block groups.

Just an idea:

What about putting it at the end of the previous block group (except for
the first group, of course) and starting to read the block group a little
earlier (readahead/~before)? I imagine it might be about as good as placing
it at the beginning while avoiding the fragmentation.



Re: The ext3 way of journalling

2008-01-11 Thread Bodo Eggert
Matthias Schniedermeyer [EMAIL PROTECTED] wrote:

  Don't use udev then. Good old static dev works fine if you have a fixed
  set of devices.
 
 It doesn't, with the unpredictable SCSI mapping insanity.
 
 That's what LABEL and UUID support in mount is for.
 
 You label the filesystems (e2label for ext2 and ext3) and use that label to
 mount them
 
 - fstab -
 LABEL=root  /      xfs   defaults,noatime 0 1
 LABEL=boot  /boot  ext2  defaults,noatime 0 2

What can happen if someone does tune2fs -Lroot /dev/usbstick
and puts that stick into this system?



Re: The ext3 way of journalling

2008-01-11 Thread Bodo Eggert
On Fri, 11 Jan 2008, Lennart Sorensen wrote:
 On Fri, Jan 11, 2008 at 05:22:45PM +0100, Bodo Eggert wrote:

  What can happen if someone does tune2fs -Lroot /dev/usbstick
  and puts that stick into this system?
 
 Don't know.  I use UUIDs rather than LABELs.  Having duplicated labels
 just means being careless.  Having duplicate UUIDs should require being
 malicious.

That's exactly what you have to assume for your users. Otherwise, you could 
remove any security feature from the system.
-- 
Fun things to slip into your budget
Not in a budget, but in an annual report:
An employee stole 500,000+. They accounted for it on the annual report as
'involuntary employee relations expense'


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread Bodo Eggert
On Tue, 8 Jan 2008, Rene Herman wrote:
 On 08-01-08 00:24, H. Peter Anvin wrote:
  Rene Herman wrote:

   Is this only about the ones then left for things like legacy PIC and PIT?
   Does anyone care about just sticking in a udelay(2) (or 1) there as a
   replacement and call it a day?
   
  
  PIT is problematic because the PIT may be necessary for udelay setup.
 
 Yes, can initialise loops_per_jiffy conservatively. Just didn't quite get why
 you guys are talking about an ISA bus speed parameter.

If the ISA bus is below 8 MHz, we might need a longer delay. If we default
to the longer delay, the delay will be too long for more than 99.99 % of 
all systems, not counting i586+. Especially if the driver is fine-tuned to 
give maximum throughput, this may be bad.

OTOH, the DOS drivers I heard about use delays and would break on 
underclocked ISA busses if the n * ISA_HZ delay was needed. Maybe
somebody having a configurable ISA bus speed and some problematic
chips can test it ...

-- 
Fun things to slip into your budget
I [Meow Cat] slipped in 'Legal fees for firing Jim (Jim's my [his] boss).'
Jim approved the budget and was fired when upper management saw the budget.


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread Bodo Eggert
On Mon, 7 Jan 2008, Alan Cox wrote:

  But overclocking is not the problem for udelay, it would err to the safe 
  side. The problem would be a bus having < 8 MHz, and since the days of 
  80286, they are hard to find. IMO having an option to set the bus speed
  for those systems should be enough.
 
 If you get it wrong you risk data corruption. Not good, not clever, not
 appropriate. Basically the use of port 0x80 is the right thing to do for
 ISA devices and as 15 odd years of use has shown works reliably and
 solidly for ISA systems.

As long as there is no port 80 card or a similar device using it. If 
there is a port 80 card, ISA access needing the delay breaks, causes
the data corruption you fear, and causes this thread to be started.
Pest, Cholera ...

OTOH, maybe the 6-MHz-delay is the same as the 8-MHz-delay, and the kernel 
parameter is not needed.
-- 
Fun things to slip into your budget
A Romulan Cloaking device:
The PHB won't know what it is but will be too chicken to ask


Re: sleep before boot panic

2008-01-08 Thread Bodo Eggert
On Mon, 7 Jan 2008, Pavel Machek wrote:

  Introduce config CONFIG_SOFTPANIC
  Enabling this option changes a hard panic on boot errors to a
  soft panic, which does not stop the system completely.
  You can still scroll the screen and read the messages.
  
  Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]
 
 Looks good to me... but should this be configurable? IMO we should
 just do the right thing.

Having it configurable doesn't cost much, and the embedded folks without a 
screen certainly don't need this code.
-- 
Funny quotes:
23. If at first you don't succeed, destroy all evidence that you tried.


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread Bodo Eggert
On Tue, 8 Jan 2008, Ondrej Zary wrote:
 On Tuesday 08 January 2008 18:24:02 David P. Reed wrote:

  Windows these days does delays with timing loops or the scheduler.  It
  doesn't use a port.  Also, Windows XP only supports machines that tend
  not to have timing problems that use delays.  Instead, if a device takes
  a while to respond, it has a busy bit in some port or memory slot that
  can be tested.
 
 Windows XP can run on a machine with ISA slot(s) and has built-in drivers for 
 some plugplay ISA cards - e.g. the famous 3Com EtherLink III. I think that 
 there's a driver for NE2000-compatible cards too and it probably works.

The NE2K-driver went missing in W2K. BTDT.
-- 
Anyone can speak Troll. All you have to do is point and grunt.
-- Fred Weasley


Re: The ext3 way of journalling

2008-01-08 Thread Bodo Eggert
Tuomo Valkonen [EMAIL PROTECTED] wrote:
 On 2008-01-08, Jan Engelhardt [EMAIL PROTECTED] wrote:

 Power users may still
 use the index= option of sound card modules and wire it up in
 /etc/modprobe.d if they prefer.
 
 Another very cryptic directory whose contents say nothing to me.
 Configuration files should be self-documenting and editable,
 instead having to be created based on long documentation.
 The simple /etc/modules -- which at least Debian's stock kernels
 do not use -- qualifies, but few other files these days do.
 
 You can guess my answer: udev will fix it.
 
 And break everything else, such as my symlinks, permissions, etc.
 I'm not going to learn its cryptic special-case config files for
 such trivial tasks as creating a fucking symlink or change the
 permissions of a file, for which exist general purpose methods:
 chmod, chown, ln -s.

Edit the start script, append your commands to create the links.
Or edit the correct file in /etc/udev/rules.d to include the links.

 Well what do you expect of it? The kernel does not keep USB port -
 SCSI device mappings. Neither USB device - SCSI device mapping,
 because not all USB ports or USB devices are mass-storage devices.
 It just is not the kernel's job.
 
 Mapping everything to scsi nodes is brain damaged. The old hda, hdb,
 etc. mappings had somewhat clear correspondence to physical
 device addresses, and were easy to use without such complicated crap
 as udev. Of course, I'd prefer just device unique IDs being used,
 where possible... but I'm not going to suffer udev for that.

If you wire your devices to a named/numbered port, you can use one device
node for each port. Therefore it's sane to create hda to hdd, fd0 to fd3,
etc. But if you don't have a fixed mapping (USB), or if you'd have too
many possible (and mostly unused) ports for the available amount of devices
(SCSI), you can not create fixed numbers. Even for hde ..., you have no
native ordering, you can only tell the first controller from the rest.

When the SCSI naming was created, there were at most 256 SCSI devices of
each kind. Since each partition is a SCSI device, too, you had at most
16 disks of up to 15 partitions! This was later extended to 128 disks.
If you'd hardcode the controller, linux would have been unable to support
more than 8 SCSI controllers.
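
The arithmetic behind those limits can be sketched as follows. This is only
an illustration: the macro and function names are made up for this example,
and it assumes the layout described above (256 minor numbers, 16 per disk,
of which one is the whole disk and 15 are partitions).

```c
#include <assert.h>

#define MINORS_TOTAL    256	/* minor numbers available per major */
#define MINORS_PER_DISK  16	/* whole disk + up to 15 partitions */

/* Number of disks that fit into one major number: 256 / 16 = 16. */
int max_disks(void)
{
	return MINORS_TOTAL / MINORS_PER_DISK;
}

/*
 * Minor number for partition `part` (0 = whole disk) on disk `disk`.
 * Returns -1 if the disk or partition does not fit into the scheme.
 */
int sd_minor(int disk, int part)
{
	if (disk < 0 || disk >= max_disks())
		return -1;
	if (part < 0 || part >= MINORS_PER_DISK)
		return -1;
	return disk * MINORS_PER_DISK + part;
}
```

With these numbers the last usable node is disk 15, partition 15; a 17th
disk simply has no minor left, which is why the scheme had to be extended.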

The result was the semi-random enumeration of the SCSI devices, and all
the hacks trying to work around that. (google for Joerg Schilling lkml)
Udev provides a clean way of naming the devices, using the static
information from the devices or the path.

I agree that the current udev documentation makes it hard to even find the
settings you'd like to change. That's because the documentation is meant
for developers, while the configuration is themed by the distribution.

The udev files themselves are as simple as possible. Imagine them to be
<bloat>XML</bloat>, like the pile of HAL, DBUS and KDE automatically
mounting my disks using the wrong settings into the wrong directory
and forcing me to su in order to umount them! I temporarily beat that
beast, but the kill is still on the TODO list. Maybe as soon as I find
more^W usable^W documentation ...

 May I remind you that the kernel also loses all your network interface
 configuration, routes, firewalling rules and all sysctl settings at
 boot (sic: reboot  powerdown).
 
 But traditional /dev does not lose permissions and symlinks. udev
 tmpfs shadow brain damage does. You have to illogically and
 inconveniently edit udev's cryptic config files instead, and yet
 it in no way stops /dev from being modified.

You aren't stopped from directly poking the memory and crashing the systems
either - if you are root. Or from deleting all nodes on a classic /dev.
Don't do that then.

 Nonsense. The kernel notices udev about all available hardware and udev
 will load modules. It has nothing to do with initrd, in fact, this very
 step of loading a gazillion of modules is done after initrd has passed
 control on to /sbin/init. At least, in opensuse.
 
 I've never seen a system that would do so. And I won't use udev.

No system can load modules that are not on initrd, and only needed-for-boot
modules are usually put on the initrd. The bulk of modules must be loaded
from the real root.  This seems to work quite well (except not here because
I prefer to include all necessary drivers in the kernel.)

I've debugged hotplug, too, and I found it was a wrapper around a wrapper
around ... a script that was supposed to load the required firmware,
except it neither loaded it nor provided a way to find out why it didn't.
Each HOWTO mentioned a different directory to use, and a different filename.
I'd have settled for manually loading the firmware, but hacking the scripts
to pieces was finally easier than finding out how I was supposed to do that.

I did not need to install a firmware after this, therefore I don't know if
udev is better, but it CAN'T be worse.


Re: sleep before boot panic

2008-01-07 Thread Bodo Eggert
Ingo Oeser [EMAIL PROTECTED] wrote:

 CC'ed hpa, since I'm sure he can give useful advice on that :-)
 
 On Sunday 06 January 2008, Bernd Schubert wrote:
 On Sunday 06 January 2008, Ingo Oeser wrote:
  On Sunday 06 January 2008, you wrote:

   Index: zd1211rw.git.beno/init/do_mounts.c
   ===
   --- zd1211rw.git.beno.orig/init/do_mounts.c  2008-01-06 18:44:23.0 +0100
   +++ zd1211rw.git.beno/init/do_mounts.c   2008-01-06 18:45:44.0 +0100
   @@ -330,6 +330,7 @@
    printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
  
    printk_all_partitions();
   +msleep(60 * 1000);
 
  ssleep(60);
 
 feel free to replace it :)
 
 Not that urgent, but if you resubmit please do it :-)

You don't need to, because ...

 There is no dump_stack() here, but disc detection happens relatively early in
 the boot process, and all this information has already scrolled off the screen
 when the panic occurs. For this and any other panic it would be optimal if
 scrolling still worked, but scrolling also requires kernel code, so I see
 there's a reason not to do this for all panics. However, for this boot problem
 I tend to say there's no need to panic at all...
 
 But the kernel cannot continue from that position. You would need a soft
 panic, which allows behavior of panic=X, but let the kernel continue.

Ingo is right, and I've done the work. The latest version is included below.
If it does not apply cleanly and there is a chance of including it, I'll
port the patch to any version you name.

 Even better is to continue with the init in the builtin ramfs. That should
 always be available and can implement any behavior desired (like dropping into
 a dash).

ACK, but that's your part.


snip

Introduce config CONFIG_SOFTPANIC
Enabling this option changes a hard panic on boot errors to a
soft panic, which does not stop the system completely.
You can still scroll the screen and read the messages.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

diff -pruN linux-2.6.23.base/include/linux/kernel.h linux-2.6.23.softpanic/include/linux/kernel.h
--- linux-2.6.23.base/include/linux/kernel.h    2007-10-11 14:15:39.0 +0200
+++ linux-2.6.23.softpanic/include/linux/kernel.h   2007-10-11 14:45:15.0 +0200
@@ -108,6 +108,12 @@ extern struct atomic_notifier_head panic
 extern long (*panic_blink)(long time);
 NORET_TYPE void panic(const char * fmt, ...)
__attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#ifdef CONFIG_SOFTPANIC
+NORET_TYPE void softpanic(const char * fmt, ...)
+   __attribute__ ((NORET_AND format (printf, 1, 2))) __cold;
+#else
+# define softpanic(...) do { panic(__VA_ARGS__); } while (0)
+#endif
 extern void oops_enter(void);
 extern void oops_exit(void);
 extern int oops_may_print(void);
diff -pruN linux-2.6.23.base/init/Kconfig linux-2.6.23.softpanic/init/Kconfig
--- linux-2.6.23.base/init/Kconfig  2007-10-11 14:15:42.0 +0200
+++ linux-2.6.23.softpanic/init/Kconfig 2007-10-11 15:19:07.0 +0200
@@ -441,6 +441,14 @@ config BUG
   option for embedded systems with no facilities for reporting errors.
   Just say Y.
 
+config SOFTPANIC
+   bool "Enable softpanic for boot errors" if EMBEDDED
+   default y
+   help
+   Enabling this option changes a hard panic on boot errors to a
+   soft panic, which does not stop the system completely.
+   You can still scroll the screen and read the messages.
+
 config ELF_CORE
default y
bool "Enable ELF core dumps" if EMBEDDED
diff -pruN linux-2.6.23.base/init/do_mounts.c linux-2.6.23.softpanic/init/do_mounts.c
--- linux-2.6.23.base/init/do_mounts.c  2007-10-11 14:15:42.0 +0200
+++ linux-2.6.23.softpanic/init/do_mounts.c 2007-10-11 14:48:51.0 +0200
@@ -330,7 +330,7 @@ retry:
printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
 
printk_all_partitions();
-   panic("VFS: Unable to mount root fs on %s", b);
+   softpanic("VFS: Unable to mount root fs on %s", b);
}
 
printk("List of all partitions:\n");
@@ -342,7 +342,7 @@ retry:
 #ifdef CONFIG_BLOCK
__bdevname(ROOT_DEV, b);
 #endif
-   panic("VFS: Unable to mount root fs on %s", b);
+   softpanic("VFS: Unable to mount root fs on %s", b);
 out:
putname(fs_names);
 }
diff -pruN linux-2.6.23.base/init/main.c linux-2.6.23.softpanic/init/main.c
--- linux-2.6.23.base/init/main.c   2007-10-11 14:15:42.0 +0200
+++ linux-2.6.23.softpanic/init/main.c  2007-10-11 14:40:06.0 +0200
@@ -590,7 +590,7 @@ asmlinkage void __init start_kernel(void
 */
console_init();
if (panic_later)
-   panic(panic_later, panic_param);
+   softpanic(panic_later, panic_param);
 
lockdep_info

Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread Bodo Eggert
Christer Weinigel [EMAIL PROTECTED] wrote:

 How do you find out the speed of the ISA bus?  AFAIK there is no
 standardized way to do that.  On the Geode SC2200 the ISA bus speed is
 usually the PCI clock divided by 4 giving 33MHz/4=8.3MHz or
 30/4=7.5MHz, but with no external ISA devices it's possible to
 overclock the ISA bus to /3 to run it at 11MHz or so.  But without
 poking at some CPU and southbridge specific registers to find out the
 PCI bus speed and the ISA bus divisor you can't really tell.

If you overclock, you are on your own. IIRC I've used 13.3 MHz for some time
and used a lower PIO mode to compensate.

 So if you do udelay based on a 6MHz clock (I think you can safely
 assume that any 386 based system runs the ISA bus at least that fast)
 you'll waste at least 30% and maybe even 100% more time for the delay
 after every _p call.

Defaulting to 8 MHz and offering an option to set another clock speed
(like idebus=) should be OK.
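
To put rough numbers on that, here is a small sketch of how a delay given in
ISA bus clocks translates into time at different bus speeds. The function
name is made up, and the cycle count used in the note below (8 clocks per
delay) is only an assumed figure for illustration, not something taken from
the kernel.

```c
#include <assert.h>

/*
 * Time in nanoseconds that `cycles` ISA bus clocks take at `bus_hz`,
 * rounded up so the result errs on the safe (longer) side.
 */
unsigned long isa_delay_ns(unsigned long cycles, unsigned long bus_hz)
{
	unsigned long long ns = (unsigned long long)cycles * 1000000000ULL;

	return (unsigned long)((ns + bus_hz - 1) / bus_hz);
}
```

With the assumed 8 clocks, an 8 MHz bus gives 1000 ns and a 6 MHz bus gives
1334 ns, so a 6 MHz default would cost about a third more per delay on a
machine whose bus really runs at 8 MHz.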



Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread Bodo Eggert
On Mon, 7 Jan 2008, H. Peter Anvin wrote:
 Bodo Eggert wrote:
  Christer Weinigel [EMAIL PROTECTED] wrote:

   How do you find out the speed of the ISA bus?  AFAIK there is no
   standardized way to do that.  On the Geode SC2200 the ISA bus speed is
   usually the PCI clock divided by 4 giving 33MHz/4=8.3MHz or
   30/4=7.5MHz, but with no external ISA devices it's possible to
   overclock the ISA bus to /3 to run it at 11MHz or so.  But without
   poking at some CPU and southbridge specific registers to find out the
   PCI bus speed and the ISA bus divisor you can't really tell.
  
  If you overclock, you are on your own. IIRC I've used 13.3 MHz for some time
  and used a lower PIO mode to compensate.
  
   So if you do udelay based on a 6MHz clock (I think you can safely
   assume that any 386 based system runs the ISA bus at least that fast)
   you'll waste at least 30% and maybe even 100% more time for the delay
   after every _p call.
  
  Defaulting to 8 MHz and offering an option to set another clock speed
  (like idebus=) should be OK.
  
 
 The formalization of the ISA bus which was part of the EISA specification
 settled on 8.33 MHz maximum nominal frequency.  There were, however, some
 earlier designs which used up to 12 MHz nominal; I'm not sure if that applied
 to 386s though.

I've used up to 13.3 MHz on my 386DX40, but it was way out of spec and
I had to use a lower PIO mode to compensate. IIRC, one of my cards forced
me to settle for 10 MHz. Wikipedia claims there were systems having
16 MHz ISA bus, and systems underclocking themselves when accessing ISA.
I remember having optional and mandatory waitstates, too, but I'm not
100 % sure it was on ISA. I think they were ...

But overclocking is not the problem for udelay, it would err to the safe 
side. The problem would be a bus having < 8 MHz, and since the days of 
80286, they are hard to find. IMO having an option to set the bus speed
for those systems should be enough.

-- 
knghtbrd:JHM AIX - the Unix from the universe where Spock has a beard.


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread Bodo Eggert
On Mon, 7 Jan 2008, H. Peter Anvin wrote:
 Bodo Eggert wrote:

  But overclocking is not the problem for udelay, it would err to the safe
  side. The problem would be a bus having < 8 MHz, and since the days of
  80286, they are hard to find. IMO having an option to set the bus speed
  for those systems should be enough.
  
 
 There might have been a few 386/20's clocking the ISA bus at ÷3
 (6.67 MHz) rather than ÷2 (10 MHz) or ÷2.5 (8 MHz).

Yes, and the remaining users should set the kernel option. Both of them.
The question is: How will they be told about the new kernel option?
-- 
A man inserted an advertisement in the classified: Wife Wanted.
The next day he received a hundred letters. They all said the
same thing: You can have mine.

Re: macro _set_base - do - while(0) question

2008-01-02 Thread Bodo Eggert
Abdel [EMAIL PROTECTED] wrote:

 In file include/asm-i386/system.h,  _set_base and _set_limit use a
 useless do ... while(0)
 
 Why is this needed ?

http://kernelnewbies.org/FAQ/DoWhile0
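
In short, the wrapper makes a multi-statement macro behave like a single
statement. The sketch below shows the general effect; it is not the actual
_set_base code, and all names in it (the stub functions and the SETUP_*
macros) are invented for this example.

```c
#include <assert.h>

/* Hypothetical call counters, only for this illustration. */
static int base_calls;
static int limit_calls;

static void set_base_stub(void)  { base_calls++; }
static void set_limit_stub(void) { limit_calls++; }

/* Unsafe: expands to two statements; an `if` governs only the first. */
#define SETUP_UNSAFE()	set_base_stub(); set_limit_stub()

/* Safe: do { ... } while (0) turns the body into one statement. */
#define SETUP_SAFE()	do { set_base_stub(); set_limit_stub(); } while (0)

/* Returns how many stub calls were made for the given condition. */
int calls_with_unsafe_macro(int cond)
{
	base_calls = limit_calls = 0;
	if (cond)
		SETUP_UNSAFE();	/* set_limit_stub() runs even if cond is 0 */
	return base_calls + limit_calls;
}

int calls_with_safe_macro(int cond)
{
	base_calls = limit_calls = 0;
	if (cond)
		SETUP_SAFE();	/* the trailing ';' ends the statement */
	else
		return 0;	/* this else would not compile after SETUP_UNSAFE() */
	return base_calls + limit_calls;
}
```

With a false condition the unsafe variant still makes one call, while the
safe variant makes none and also works in front of an else.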



Re: [PATCH] Force UNIX domain sockets to be built in

2008-01-02 Thread Bodo Eggert
On Wed, 2 Jan 2008, Herbert Xu wrote:
 Theodore Tso [EMAIL PROTECTED] wrote:

  The question is whether the size of the Unix domain sockets support is
  worth the complexity of yet another config option that we expose to
  the user.  For the embedded world, OK, maybe they want to save 14k of
  non-swappable memory.  But for the non-embedded world, given the 117k
  mandatory memory usage of sysfs, or the 124k memory usage of the core
  networking stack, never mind the 3 megabytes of memory used by objects
  in the kernel subdirectory, it's not clear that it's worth worrying
  over 14k of memory, especially when many Unix programs assume
  that Unix Domain Sockets are present.
 
 That would make sense if we were proposing to get rid of the CONFIG_UNIX
 question altogether for !CONFIG_EMBEDDED.

Exactly this is what my patch does: The question is not to be displayed 
unless EMBEDDED, and the default is changed to y.

  However, the proposal here is
 merely to eliminate the modular option but the CONFIG_UNIX prompt itself
 will remain even without CONFIG_EMBEDDED.
 
 This I think is quite pointless.

That's what another patch would do. I decided that s/tristate/bool/ is 
something completely different from adding the default and hiding the 
option, and that I'd avoid this discussion by not eliminating UNIX=m.

-- 
Top 100 things you don't want the sysadmin to say:
96. That's SO bizarre.


Re: RAID timeout parameter accessibility request

2008-01-02 Thread Bodo Eggert
Thanasis [EMAIL PROTECTED] wrote:
 on 12/31/2007 11:54 AM Jose de la Mancha wrote the following:

 -- All RAID edition drives are more expensive than their equivalent
 desktop edition drives (same model on desktop edition). Just take a look
 at newegg for instance.
 
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136055Tpk=WD%2b2500YS

Not available here, and my local store offers a 250 GB drive for 7 % less.



Re: Get physical MAC address

2007-12-31 Thread Bodo Eggert
Theewara Vorakosit [EMAIL PROTECTED] wrote:

 I get MAC address from ioctl. However, ifconfig can change this  MAC
 address. Can I get a real physical MAC address of the NIC?

First, get a network card having a physical MAC. Most cards have only a
(currently configured) default MAC address, maybe you'll be lucky
with some old ISA cards ...

Then, don't worry about somebody changing it, since these cards don't support that.



[PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
As suggested by Adrian Bunk, UNIX domain sockets should always be built in 
on normal systems. This is especially true since udev needs these sockets
and fails to run if UNIX=m.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

---
Last minute change: I decided against making it a bool because embedded 
folks might depend on a small kernel image. Edited in the patch below.

diff -X dontdiff -pruN linux-2.6.23.base/net/unix/Kconfig linux-2.6.23.socket-y/net/unix/Kconfig
--- linux-2.6.23.base/net/unix/Kconfig  2006-11-29 22:57:37.0 +0100
+++ linux-2.6.23.socket-y/net/unix/Kconfig  2007-12-31 12:57:44.0 +0100
@@ -3,7 +3,8 @@
 #
 
 config UNIX
-   tristate "Unix domain sockets"
+   tristate "Unix domain sockets" if EMBEDDED
+   default y
---help---
  If you say Y here, you will include support for Unix domain sockets;
  sockets are the standard Unix mechanism for establishing and

-- 
A. Top posters
Q. What's the most annoying thing on Usenet?


Bad escriptions in Kconfig

2007-12-31 Thread Bodo Eggert
In some of the Kconfig files, the options are not adequately described. I 
collected a few of the bad descriptions I found:


---
Lowlevel video output switch controls (VIDEO_OUTPUT_CONTROL) [M/n/y/?] (NEW) ?

This framework adds support for low-level control of the video
output switch.
---

- What is THE video output switch and why would I need low level control?

- Frameworks should be auto-selected like libraries, shouldn't they?

- WTF is this a module?


-
---
Auxiliary Display support (AUXDISPLAY) [N/y/?] (NEW) ?

Say Y here to get to see options for auxiliary display drivers.
This option alone does not add any kernel code.
---

- If I knew what an auxiliary display was, I would not read this help text!

- After digging some time, I discovered that all Auxdisplays are parallel 
  port devices.
  Rename to Parallel port display device support?


-
---
Transformation user configuration interface (XFRM_USER) [N/m/y/?] (NEW)
  
Support for Transformation(XFRM) user configuration interface
like IPsec used by native Linux tools.

If unsure, say Y.
---

- I'm not sure if these words combine to a sentence.
- I can't tell if IPSEC is the only user or if I'd break other parts by not 
  saying 'Y'. OTOH, I don't want to bloat my kernel ...
- What's a native linux tool?

  
-
---
SCSI target support (SCSI_TGT) [N/m/y/?] (NEW) ?

If you want to use SCSI target mode drivers enable this option.
If you choose M, the module will be called scsi_tgt.
---

What TF is a SCSI target mode, what is a target mode driver?

-- 
I'm a member of DNA (National Assocciation of Dyslexics).
-- Storm in [EMAIL PROTECTED]


Re: [PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, Adrian Bunk wrote:
 On Mon, Dec 31, 2007 at 01:09:43PM +0100, Bodo Eggert wrote:

  As suggested by Adrian Bunk, UNIX domain sockets should always be built in 
  on normal systems. This is especially true since udev needs these sockets
  and fails to run if UNIX=m.
  
  Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]
  
  ---
  Last minute change: I decided against making it a bool because embedded 
  folks might depend on a small kernel image. Edited in the patch below.
 ...
 
 Is this just a purely theoretical thought or is this a reasonable use 
 case people actually use in practice?
 
For now, it's a theoretical thought, but having an embedded device, I can 
see the reason for $EVERYTHING=m there.

 After all, changing it to a bool will allow us to make the kernel image 
 for nearly everyone smaller by a few hundred bytes...

I can't see why optionally building it as a module would force us to make 
the kernel bigger. It may be a little more ugly to support =m, but that's it,
isn't it?

-- 
Logic: The art of being wrong with confidence...


Re: [PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, David Miller wrote:
 From: Bodo Eggert [EMAIL PROTECTED]

  As suggested by Adrian Bunk, UNIX domain sockets should always be built in 
  on normal systems. This is especially true since udev needs these sockets
  and fails to run if UNIX=m.
  
  Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]
 
 People who use udev can make sure they have it built into their kernel
 if they have such a dependency.
 
 Not everyone uses udev, and therefore needs AF_UNIX non-modular.

That's why I kept this option for embedded folks.

Is there any benefit for non-embedded systems from having UNIX=m?
-- 
Top 100 things you don't want the sysadmin to say:
89. I got a better job at Lockheed...


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-31 Thread Bodo Eggert
On Sun, 30 Dec 2007, Alan Cox wrote:
 On Sun, 30 Dec 2007 12:53:02 -0800
 H. Peter Anvin [EMAIL PROTECTED] wrote:
  Bodo Eggert wrote:

   I've never seen code which would do that, and it was not suggested by any
   tutorial I ever saw. I'd expect any machine to break on all kinds of 
   software
   if it required this. The only thing I remember being warned about is 
   writing
   the index and the data register at the same time using outw, because that
   would write both registers at the same time on 16-bit-cards.
   
  
  And we use that, and have been for 15 years.  I haven't seen any screams 
  of pain about it.
 
 Actually there were, and I sent numerous people patches for that back in
 ISA days. 

Are you talking about VGA cards requiring a delay between outb index/outb 
data, VGA cards barfing on outw or systems barfing on outb(0x80,42)?
-- 
Programming is an art form that fights back.


Re: [PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, Adrian Bunk wrote:
 On Mon, Dec 31, 2007 at 02:26:42PM +0100, Bodo Eggert wrote:
  On Mon, 31 Dec 2007, Adrian Bunk wrote:
   On Mon, Dec 31, 2007 at 01:09:43PM +0100, Bodo Eggert wrote:

As suggested by Adrian Bunk, UNIX domain sockets should always be built 
in 
on normal systems. This is especially true since udev needs these 
sockets
and fails to run if UNIX=m.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

---
Last minute change: I decided against making it a bool because embedded 
folks might depend on a small kernel image. Edited in the patch below.
   ...
   
   Is this just a purely theoretical thought or is this a reasonable use 
   case people actually use in practice?
   
  For now, it's a theoretical thought, but having an embedded device, I can 
  see the reason for $EVERYTHING=m there.
 
 The only advantage I see is that the kernel image you have to flash 
 can be made smaller - with the disadvantage that the running kernel
 is bigger by more than 10%.
 
 If you don't believe me, try it yourself:
 Build all drivers statically into your kernel, and then compare the 
 vmlinux sizes with CONFIG_MODULES=n and CONFIG_MODULES=y.

If you were aiming for a small kernel image, you would build everything that
is not required for booting as a module.

   After all, changing it to a bool will allow us to make the kernel image 
   for nearly everyone smaller by a few hundred bytes...
  
  I can't see why optionally building it as a module would force us to make 
  the kernel bigger. It may be a little uglier to support =m, but that's it,
  isn't it?
 
 On architectures like x86 where __exit code is freed at runtime 
 af_unix_exit() makes your kernel image (but not the running kernel) 
 bigger.
 
 With CONFIG_MODULES=y the 13 EXPORT_SYMBOL's that only exist for the 
 theoretical possibility of CONFIG_UNIX=m waste a few hundred bytes 
 of memory.

/* Sketch: assumes CONFIG_UNIX expands to a bare m for a modular build */
#define m 'm'
#if CONFIG_UNIX == m
#define EXPORT_SYMBOL_AF_UNIX(sym) EXPORT_SYMBOL(sym)
#else
#define EXPORT_SYMBOL_AF_UNIX(sym)
#endif
#undef m

You could also use #if defined(C_U) && (C_U == m).
-- 
Funny quotes:
36. You never really learn to swear until you learn to drive.


Re: Bad escriptions in Kconfig

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, Douglas Gilbert wrote:
 Matthew Wilcox wrote:
  On Mon, Dec 31, 2007 at 10:16:43AM -0500, Douglas Gilbert wrote:
  Bodo Eggert wrote:

(Kicking netdev from CC)

  ---
  SCSI target support (SCSI_TGT) [N/m/y/?] (NEW) ?
 
  If you want to use SCSI target mode drivers enable this option.
  If you choose M, the module will be called scsi_tgt.
  ---
 
  What TF is a SCSI target mode, what is a target mode driver?
  Heard of google :-)
 
  For explanations of SCSI (and other storage) terminology
  reference could be made to SAM-3 or SAM-4 drafts (because
  the real standards cost money) at www.t10.org .
 
  Perhaps many other subsections in the kernel could have
  similar references.
  
  I think that's an appalling idea.  Someone's trying to configure their
  kernel, not research hundreds of new ideas on the internet.  Here's a
  better description:
  
  help
The SCSI target code allows your computer to appear as a SCSI
device.  This is useful in a SAN or NAS environment where you
want other computers to be able to treat this computer as a disc.
  
To compile this driver as a module, choose M here: the module
will be called scsi_tgt.
 
 Appalling or not, it is more accurate to define a SCSI target
 properly than equate it to a direct access logical unit (i.e.
 a disk).

Yes, but calling the current text a help text would be even less accurate.
Can you create a helpful text without being incorrect?
-- 
Field experience is something you don't get until just after you need it.


Re: [PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, Al Viro wrote:
 On Mon, Dec 31, 2007 at 03:03:20PM +0100, Bodo Eggert wrote:
  On Mon, 31 Dec 2007, David Miller wrote:
   From: Bodo Eggert [EMAIL PROTECTED]

As suggested by Adrian Bunk, UNIX domain sockets should always be built in
on normal systems. This is especially true since udev needs these sockets
and fails to run if UNIX=m.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]
   
   People who use udev can make sure they have it built into their kernel
   if they have such a dependency.
   
   Not everyone uses udev, and therefore needs AF_UNIX non-modular.
  
  That's why I kept this option for embedded folks.
  
  Is there any benefit for non-embedded systems from having UNIX=m?
 
 udev-free != embedded.

But UNIX=m == waste RAM and have an effectively b0rken system until the 
module is loaded. It would be silly to do this unless you have a very small 
space for the kernel image and some free space for storing the needed 
modules. The big question is: Is there any non-embedded system where you 
have to aim for a small kernel image?

-- 
Fun things to slip into your budget
Half a million dollars for consultants to design a web site that was being
done by an intern in his spare time.


Re: [PATCH] Force UNIX domain sockets to be built in

2007-12-31 Thread Bodo Eggert
On Mon, 31 Dec 2007, David Miller wrote:
 From: Bodo Eggert [EMAIL PROTECTED]

  The big question is: Is there any non-embedded system where you have
  to aim for a small kernel image?
 
 On some platforms, due to bootloader restrictions or whatever,
 there are hard limits on how large the main kernel image can be.
 
 On sparc64 for example the limit is around 6.5MB

That would be about the size of a complete rescue system. I don't think we 
need to worry about unix sockets there, do we?

 But this big question isn't the important issue, in fact it's
 tangential and has no bearing on the final decision we make
 here.
 
 Rather, choice is, and taking choice away is bad.  I may have a reason
 to make AF_UNIX modular, I might not, but either way taking that
 option away from me is not the right thing to do.

_You_'ll still have the option, because you have selected EMBEDDED=y 
(otherwise, you'd miss all other valuable options to cripple your kernel), 
while the folks who just care about a working system will have what they'd 
select anyway without being bothered with useless questions.
-- 
Top 100 things you don't want the sysadmin to say:
5. Just add yourself to the password file and make a directory...


[PATCH] Get NUMLOCK from PC BIOS

2007-12-30 Thread Bodo Eggert
This patch enables reading the default NUMLOCK status from the BIOS data
area as defined for IBM PCs (1981). Using this area has been a reliable way
to detect (or set) the NUMLOCK status for about 27 years, and it will
continue to work on any IBM-compatible system that might run DOS. 
Especially, the NUMLOCK status on Linus' famous laptop should be usable.

---

I'd like some information about how this patch works on non-IBM-compatible
x86 PCs. For now, I've documented the worst possible outcome I can imagine.

Signed-Off-By: Bodo Eggert [EMAIL PROTECTED]

diff -pruN -X dontdiff linux-2.6.23.pure/drivers/char/keyboard.c linux-2.6.23.base/drivers/char/keyboard.c
--- linux-2.6.23.pure/drivers/char/keyboard.c   2007-10-11 14:15:18.0 +0200
+++ linux-2.6.23.base/drivers/char/keyboard.c   2007-12-30 12:12:11.0 +0100
@@ -24,6 +24,7 @@
  * 21-08-02: Converted to input API, major cleanup. (Vojtech Pavlik)
  */
 
+#include <asm/io.h>
 #include <linux/consolemap.h>
 #include <linux/module.h>
 #include <linux/sched.h>
@@ -57,7 +58,10 @@ extern void ctrl_alt_del(void);
  * to be used for numbers.
  */
 
-#if defined(CONFIG_PARISC) && (defined(CONFIG_KEYBOARD_HIL) || defined(CONFIG_KEYBOARD_HIL_OLD))
+#ifdef CONFIG_KBD_DEFLEDS_PCBIOS
+/* KBD_DEFLEDS is a variable */
+#undef KBD_DEFLEDS
+#elif defined(CONFIG_PARISC) && (defined(CONFIG_KEYBOARD_HIL) || defined(CONFIG_KEYBOARD_HIL_OLD))
 #define KBD_DEFLEDS (1 << VC_NUMLOCK)
 #else
 #define KBD_DEFLEDS 0
@@ -1358,8 +1362,17 @@ int __init kbd_init(void)
 {
 	int i;
 	int error;
+#ifdef CONFIG_KBD_DEFLEDS_PCBIOS
+	int KBD_DEFLEDS = 0;
+	/* address 0x40:0x17 */
+	char * bios_kbd_status=xlate_dev_mem_ptr(0x417);
+
+	/* Numlock status bit set? */
+	if (*bios_kbd_status & 0x20)
+		KBD_DEFLEDS = 1 << VC_NUMLOCK;
+#endif
 
-        for (i = 0; i < MAX_NR_CONSOLES; i++) {
+	for (i = 0; i < MAX_NR_CONSOLES; i++) {
 		kbd_table[i].ledflagstate = KBD_DEFLEDS;
 		kbd_table[i].default_ledflagstate = KBD_DEFLEDS;
 		kbd_table[i].ledmode = LED_SHOW_FLAGS;
diff -pruN -X dontdiff linux-2.6.23.pure/drivers/input/keyboard/Kconfig linux-2.6.23.base/drivers/input/keyboard/Kconfig
--- linux-2.6.23.pure/drivers/input/keyboard/Kconfig    2007-10-11 14:14:41.0 +0200
+++ linux-2.6.23.base/drivers/input/keyboard/Kconfig    2007-12-30 12:11:45.0 +0100
@@ -12,6 +12,17 @@ menuconfig INPUT_KEYBOARD
 
 if INPUT_KEYBOARD
 
+config KBD_DEFLEDS_PCBIOS
+	bool "Enable Num-Lock based on BIOS settings"
+	depends on X86_PC && EXPERIMENTAL
+	help
+	  Turns on Numlock depending on the BIOS settings.
+	  This works by reading the BIOS data area as defined for IBM PCs (1981).
+
+	  If you have an alternative firmware like OpenFirmware or LinuxBios,
+	  this flag might not be set correctly, which results in a random state
+	  of the Numlock key.
+
 config KEYBOARD_ATKBD
 	tristate "AT keyboard" if EMBEDDED || !X86_PC
 	default y
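
For readers unfamiliar with the BIOS data area, the bit test in the patch can be sketched in plain user-space C. This is only an illustration, not part of the patch: bios_flags_to_leds() and the BIOS_KBD_* names are made up for this sketch, the VC_* values are assumed to match linux/kbd_kern.h, and the byte is passed in as a value instead of being read from 0x40:0x17.

```c
#include <stdint.h>

/* LED bit positions, assumed to match linux/kbd_kern.h. */
enum { VC_SCROLLOCK = 0, VC_NUMLOCK = 1, VC_CAPSLOCK = 2 };

/* Lock-state bits in the BIOS keyboard flag byte at 0x40:0x17. */
#define BIOS_KBD_SCROLLLOCK 0x10
#define BIOS_KBD_NUMLOCK    0x20
#define BIOS_KBD_CAPSLOCK   0x40

/*
 * Hypothetical helper: translate a copy of the BIOS flag byte into an
 * LED flag word. The patch itself only looks at the NumLock bit.
 */
static unsigned int bios_flags_to_leds(uint8_t bios_flags)
{
	unsigned int leds = 0;

	if (bios_flags & BIOS_KBD_SCROLLLOCK)
		leds |= 1u << VC_SCROLLOCK;
	if (bios_flags & BIOS_KBD_NUMLOCK)
		leds |= 1u << VC_NUMLOCK;
	if (bios_flags & BIOS_KBD_CAPSLOCK)
		leds |= 1u << VC_CAPSLOCK;
	return leds;
}
```

The shift and modifier bits in the low nibble of the flag byte are ignored on purpose, since they say nothing about the LED state.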

-- 
Top 100 things you don't want the sysadmin to say:
39. It is only a minor upgrade, the system should be back up in
a few hours.  ( This is said on a monday afternoon.)


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-30 Thread Bodo Eggert
Ingo Molnar [EMAIL PROTECTED] wrote:

 do you have any memories about the outb_p() use of misc_32.c:
 
 pos = (x + cols * y) * 2;   /* Update cursor position */
 outb_p(14, vidport);
 outb_p(0xff & (pos >> 9), vidport+1);
 outb_p(15, vidport);
 outb_p(0xff & (pos >> 1), vidport+1);
 
 was this ever needed? This is so early in the bootup that we cannot
 do any sensible delay. Perhaps we could try a natural delay sequence via
 inb from 0x3cc:
 
 outb(14, vidport);
  inb(0x3cc); /* delay */
 outb(0xff & (pos >> 9), vidport+1);

I've never seen code which would do that, and it was not suggested by any
tutorial I ever saw. I'd expect any machine to break on all kinds of software
if it required this. The only thing I remember being warned about is writing
the index and the data register at the same time using outw, because that
would write both registers at the same time on 16-bit-cards.
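
To make the index/data discussion concrete, here is a small user-space sketch of the values involved. It performs no port I/O; cursor_cell() and crtc_outw_word() are hypothetical helpers invented for this illustration, and the register indices 14 and 15 are the ones used in the quoted code.

```c
#include <stdint.h>

/* CRTC register indices for the cursor location, as in the quoted code. */
#define CRTC_CURSOR_HI 14
#define CRTC_CURSOR_LO 15

/* Cursor cell index derived from the byte offset (2 bytes per text cell). */
static uint16_t cursor_cell(unsigned x, unsigned y, unsigned cols)
{
	unsigned pos = (x + cols * y) * 2;	/* byte offset into text memory */

	return (uint16_t)(pos >> 1);
}

/*
 * Word for a single 16-bit write to the index port: the low byte lands
 * in the index register, the high byte in the data register one port up.
 */
static uint16_t crtc_outw_word(uint8_t index, uint8_t data)
{
	return (uint16_t)(((unsigned)data << 8) | index);
}
```

The high byte of cursor_cell() is what the quoted code computes as 0xff & (pos >> 9), and the low byte is 0xff & (pos >> 1). The outw variant hands both registers their values in one bus cycle, which is exactly the case the warning was about.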


BTW: The error function in linux-2.6.23/arch/i386/boot/compressed/misc.c
uses while(1) without cpu_relax() in order to halt the machine. Is this fixed?
Should it be fixed?



Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-30 Thread Bodo Eggert
On Sun, 30 Dec 2007, Ingo Molnar wrote:
 * H. Peter Anvin [EMAIL PROTECTED] wrote:
  Ingo Molnar wrote:
  * Bodo Eggert [EMAIL PROTECTED] wrote:

  BTW: The error function in linux-2.6.23/arch/i386/boot/compressed/misc.c 
  uses while(1) without cpu_relax() in order to halt the machine. Is this 
  fixed? Should it be fixed?
 
  this is early bootup so there's no need to be nice to other cores or 
  sockets - none of them are really running.
 
 
  It probably should actually HLT, to avoid sucking power, and stressing 
  the thermal system.  We're dead at this point, and the early 486's 
  which had problems with HLT will lock up - we don't care.
 
 ok. Like the patch below?

  
 - while(1);   /* Halt */
 + asm("cli; hlt");/* Halt */

The other users would loop around the hlt. Cargo Cult?
-- 
Top 100 things you don't want the sysadmin to say:
97. Go get your backup tape. (You _do_ have a backup tape?)


Re: RFC: permit link(2) to work across --bind mounts ?

2007-12-20 Thread Bodo Eggert
On Wed, 19 Dec 2007, Al Viro wrote:
 On Wed, Dec 19, 2007 at 02:43:26PM +0100, Bodo Eggert wrote:

  Since nobody knows about this security boundary and everybody knows about
  the annoying "can't link across bind-mountpoints" bug,
 
 ... how about teaching people to RTFM?  Starting, perhaps, with man 2 link?

What about reading POSIX which says 

1264 [EXDEV]
1265 Improper link. A link to a file on another file system was attempted.

So if the link creates a file on NOT another filesystem (which is the point 
of bind mounts), it should NOT return EXDEV.

Having an artificial boundary between different views to a fs may happen to 
be a security feature if used with care, but most users do expect the 
opposite and wonder why mv is needlessly slow. I'm not even sure if 
defaulting to having a barrier is sane at all, but if people confuse 
filesystems and mountpoints^W^W^W^Wuse this feature, they will depend on 
this feature not changing:-)
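
The "needlessly slow mv" follows from the usual fallback: try rename(), and if it fails with EXDEV, copy the data and unlink the source. A minimal user-space sketch of that pattern, with move_file() and copy_file() as made-up helper names and without the mode, ownership and timestamp handling a real mv does:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst; returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
	char buf[4096];
	ssize_t n;
	int ret = -1;
	int in, out;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out < 0) {
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		/* A short write is treated as an error in this sketch. */
		if (write(out, buf, (size_t)n) != n)
			goto done;
	}
	if (n == 0)
		ret = 0;
done:
	close(in);
	if (close(out) != 0)
		ret = -1;
	return ret;
}

/* Returns 0 if renamed in place, 1 if copied after EXDEV, -1 on error. */
static int move_file(const char *src, const char *dst)
{
	if (rename(src, dst) == 0)
		return 0;
	if (errno != EXDEV)
		return -1;
	if (copy_file(src, dst) != 0)
		return -1;
	return unlink(src) == 0 ? 1 : -1;
}
```

Across a bind mount of the same filesystem, rename() and link() return EXDEV, so the slow path is taken even though no data would have to move.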

-- 
It is generally inadvisable to eject directly over the area you just
bombed.
-U.S. Air Force Manual


Re: Trying to convert old modules to newer kernels

2007-12-20 Thread Bodo Eggert
linux-os (Dick Johnson) [EMAIL PROTECTED] wrote:
 On Thu, 20 Dec 2007, Sam Ravnborg wrote:

 It never gets to the printk(). You were right about the
 compilation. Somebody changed the kernel to compile with
 parameter passing in REGISTERS! This means that EVERYTHING
 needs to be compiled the same way, 'C' calling conventions
 were not good enough!

 How did you build the module. It reads like you failed to use
 kbuild to build your module which is why you did not pass
 correct options to gcc - correct?

 If you did not use kbuild - why not?
 Is there anything missing you need?

 I need to get rid of -mregparm=3 on gcc's command line. It
 is completely incompatible with the standard calling conventions
 used in all our assembly-language files in our drivers. We make
 very high-speed number-crunching drivers that munge high-speed
 data into images. We need to do that in assembly as we have
 always done.

According to my quick googling, __attribute__((regparm(0))) is what you need.
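
A minimal sketch of how such an annotation might look on a single function, so that the rest of the module can keep the kernel's default -mregparm=3. STACK_ARGS and sum3() are made-up names for this illustration; regparm is an i386-only GCC attribute, so the macro expands to nothing elsewhere and the snippet still compiles.

```c
/*
 * regparm(0) asks GCC to pass all arguments on the stack for this one
 * function, matching the classic i386 C calling convention.
 */
#if defined(__i386__)
#define STACK_ARGS __attribute__((regparm(0)))
#else
#define STACK_ARGS
#endif

/* Hypothetical routine; in a real driver the body would live in assembly. */
STACK_ARGS int sum3(int a, int b, int c);

STACK_ARGS int sum3(int a, int b, int c)
{
	return a + b + c;
}
```

On i386 the kernel's own asmlinkage macro expands to the same attribute, so declaring the assembly entry points asmlinkage is the in-tree way to spell this.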



Re: RFC: permit link(2) to work across --bind mounts ?

2007-12-19 Thread Bodo Eggert
Al Viro [EMAIL PROTECTED] wrote:
 On Tue, Dec 18, 2007 at 11:00:16PM +, Al Viro wrote:
 On Tue, Dec 18, 2007 at 05:46:21PM -0500, Mark Lord wrote:

  Why does link(2) not support hard-linking across bind mount points
  of the same underlying filesystem ?
 
 Because it gives you a security boundary around a subtree.
 
 PS: that had been discussed quite a few times, but to avoid searches:
 consider e.g. mount --bind /tmp /tmp; now you've got a situation when
 users can't create links to elsewhere on the root fs, even though they
 have /tmp writable to them.  Similar techniques work for other isolation
 needs - basically, you can confine rename/link to given subtree.  IOW,
 it's a deliberate feature.  Note that you can bind a bunch of trees
 into chroot and get predictable restrictions regardless of how the
 stuff might get rearranged a year later in the main tree, etc.

Since nobody knows about this security boundary and everybody knows about
the annoying "can't link across bind-mountpoints" bug, what about introducing
a mount option to allow link()ing?



Re: 1st version of azfs

2007-12-17 Thread Bodo Eggert
Maxim Shchetynin [EMAIL PROTECTED] wrote:

 +config AZ_FS
 +tristate "AZFS filesystem support"
 +default m
   ^
STRONG NACK, I hate digging in the menu tree and hunting for things I
don't need.

 +help
 +  Non-buffered filesystem for block devices with a gendisk and
 +  with direct_access() method in gendisk->fops.
 +  AZFS does not buffer outgoing traffic and is doing no read ahead.
 +  AZFS uses block-size and sector-size provided by block device
 +  and gendisk's queue. Though mmap() method is available only if
 +  block-size equals to or is greater than system page size.

What is the benefit or intended use of this filesystem? Will your intended
user say "gendisk->fops->direct_access? I wanted to use it all my life"?

AZFS seems to be an acronym. AirZound File System?
http://globetrotter.de/de/shop/detail.php?mod_nr=ex_35001GTID=7c553060901a873c5bd29a1846ff39a3a32




Re: [PATCH] xen: relax signature check

2007-12-11 Thread Bodo Eggert
Jeremy Fitzhardinge [EMAIL PROTECTED] wrote:

 Some versions of Xen 3.x set their magic number to xen-3.[12], so
 relax the test to match them.


 - BUG_ON(memcmp(xen_start_info->magic, "xen-3.0", 7) != 0);
 + BUG_ON(memcmp(xen_start_info->magic, "xen-3", 5) != 0);

Not BUG_ON(memcmp(xen_start_info->magic, "xen-3.", 6) != 0); ?
I don't think Xen version 32 will be compatible ...
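
The difference between the five-byte and the six-byte prefix can be shown with a small user-space sketch. magic_has_prefix() is a made-up helper and the magic strings in the usage note are illustrative only; the sketch uses strncmp() instead of memcmp() so that short test strings are never read past their terminator.

```c
#include <string.h>

/* Returns 1 if magic starts with prefix, 0 otherwise. */
static int magic_has_prefix(const char *magic, const char *prefix)
{
	return strncmp(magic, prefix, strlen(prefix)) == 0;
}
```

With the prefix "xen-3", a hypothetical magic of "xen-32.0" is accepted along with "xen-3.1"; with "xen-3." only the 3.x strings match.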



Re: Out of tree module using LSM

2007-12-03 Thread Bodo Eggert
Jon Masters [EMAIL PROTECTED] wrote:
 On Thu, 2007-11-29 at 11:11 -0800, Ray Lee wrote:
 On Nov 29, 2007 10:56 AM, Jon Masters [EMAIL PROTECTED] wrote:
  On Thu, 2007-11-29 at 10:40 -0800, Ray Lee wrote:
   On Nov 29, 2007 9:36 AM, Alan Cox [EMAIL PROTECTED] wrote:

 closed. But more importantly further access to it can be blocked
 until appropriate actions are taken which also applies with your
 example, no? Is
   
That bit is hard- very hard.

 To lift Alan's example, a naive first implementation
 would be to create a suffix tree of all of ESR's works, then scan each
 page on fault to see if there are any partial matches in the tree.
 
 Ah, but I could write a sequence of pages that on their own looked
 garbage, but in reality, when executed would print out a copy of the
 Jargon File in all its glory. And if you still think you could look for
 patterns, how about executable code that self-modifies in random ways
 but when executed as a whole actually has the functionality of fetchmail
 embedded within it? How would you guard against that?

You can't scan all possible code for malware:
Take a random piece of code, possibly halting. Replace all halting conditions
using a piece of malware. Scan it. If it were possible to detect the malware
without false positives, you'd have solved the halting problem.

In practice, this does not hinder virus scanners from preventing most damage.
Therefore I think it's OK to have one.


If I had to design a virus scanner interface, I'd e.g. create a library*
providing an {open|mmap}_and_scan() function that would give me a clean
copy/really-private mapping of a scanned file, and a scan_{blob,file}()
function that would scan a block of memory/a file. Then, it's up to the
application to ensure that it uses that library. As a result, you could
e.g. run "less eicar.sh", but you could not run "bash eicar.sh"**, and an
application receiving a strangely encoded piece of malware into its
memory has a chance of avoiding an infection without writing it to a file.
Maybe "gpg < eicar.gpg.sh | sh" will unintentionally work, but I don't think
scanning pipes would be easy anyway. OTOH, maybe the library would make
it feasible at all, provided the malicious code is not located way before
the signature.

Of course I'd need to do something about binaries. At first glance, this
does not seem too bad, since there is a way to run ld*.so. I'd just use it
to enforce a preloader for static binaries, too. (I'm glad I can leave the
implementation details to somebody else.-)


*  Without having a virus scanner installed, this library will just NOOP
   by default.

** Bonus: I can unzip open_office_file; rm macros; zip open_office_file.
   OTOH, the scanner should provide a cleaner for those simple cases.



Re: [PATCH][retry-2] init: Introduce rootdir bootparm to select which dir to sys_chroot

2007-11-18 Thread Bodo Eggert
Al Boldi [EMAIL PROTECTED] wrote:

 Second try; this time with a doc-update, and the ability to remount normally.
 
 Tested against 2.6.23.
 
 ---
 
 This patch introduces a rootdir kernel boot parameter, which specifies the
 path to the kernel sys_chroot boot dir.
 
 This is useful for systems that have more than one distribution installed on
 the same fs/partition.

1) This is useful for booting a rescue or test system, too. In those cases,
   you might want to have the old root moved somewhere.
   (Always $rootdir/oldroot? Additional parameter? I'm not sure ...)

2) You use a static buffer, but you don't check for bad return values of
   strlcat().
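
What checking the return value could look like, as a user-space sketch rather than the patch's actual code: sketch_strlcat() is a local stand-in with BSD strlcat() semantics (it returns the length the result would have had without truncation), and append_checked() is a made-up wrapper that turns a truncated result into an error.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for strlcat(): returns the untruncated result length. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = 0;
	size_t slen = strlen(src);

	while (dlen < size && dst[dlen] != '\0')
		dlen++;
	if (dlen == size)
		return size + slen;	/* dst is not terminated within size */
	if (slen < size - dlen) {
		memcpy(dst + dlen, src, slen + 1);
	} else {
		memcpy(dst + dlen, src, size - dlen - 1);
		dst[size - 1] = '\0';
	}
	return dlen + slen;
}

/* Append src to a fixed buffer; 0 on success, -1 if it did not fit. */
static int append_checked(char *buf, size_t size, const char *src)
{
	return sketch_strlcat(buf, src, size) >= size ? -1 : 0;
}
```

A return value greater than or equal to the buffer size means the path was cut off, and a boot parameter that silently points at a truncated directory is worse than a refused one.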



Re: [PATCH][retry-2] init: Introduce rootdir bootparm to select which dir to sys_chroot

2007-11-18 Thread Bodo Eggert
On Sun, 18 Nov 2007, H. Peter Anvin wrote:
 Bodo Eggert wrote:

 1) This is useful for booting a rescue or test system, too. In those cases,
you might want to have the old root moved somewhere.
(Always $rootdir/oldroot? Additional parameter? I'm not sure ...)
 

 Again, this is a good example of why this really shouldn't be additional 
 hacks in kernel space.

ACK, but until kinit is default (and Godot arrives), this little hack does 
seem to be useful.

-- 
Top 100 things you don't want the sysadmin to say:
63. Oracle will be down until 8pm, but you can come back in and finish your
work when it comes up tonight.


Re: AppArmor Security Goal

2007-11-12 Thread Bodo Eggert
Rogelio M. Serrano Jr. [EMAIL PROTECTED] wrote:
 Dr. David Alan Gilbert wrote:

 Allowing a user to tweak (under constraints) their settings might allow
 them to do something like create two mozilla profiles which are isolated
 from each other, so that the profile they use for general web surfing
 is isolated from the one they use for online banking.

   
 Doesn't this allow the user to shoot their own foot? The exact thing
 mandatory access control is supposed to prevent?

cat `which mozilla` > ~/bin/mymozilla; chmod +x ~/bin/mymozilla; mymozilla

Unless you lock down the system to a state where it's barely usable, MAC
isn't going to protect you from shooting your own feet. But given more
restricted roles and a safe way of activating them (as in: damn obvious
whether or not this role is active), you can have e.g. one mozilla for
banking and one for pr0n.



Re: [PATCH 0/11 v3] enable make ARCH=x86

2007-11-11 Thread Bodo Eggert
Theodore Tso [EMAIL PROTECTED] wrote:
 On Sat, Nov 10, 2007 at 12:35:01PM -0800, H. Peter Anvin wrote:

 In fact, we should be able to get rid of ARCH entirely;  CONFIG_ options
 have the huge advantage that they're saved in a file, and you don't have to
 type them on every make run. The only option that I can't see us getting
 rid of easily is HOSTCC, since it is used before config is run, but
 probably something clever can be done there, too.

 Yes, please!  One of the more annoying things is forgetting the
 ARCH=um when rebuilding UML.  It would be awfully nice if ARCH was set
 via a CONFIG_ option and was persistent.

This should have been fixed, or it's about to be fixed. My patch is here:
http://groups.google.com/group/linux.kernel/browse_thread/thread/93e5c33fc6e8cff6/39aff558a636ad02
(This patch was superseded by another patch, which may be delayed or mm-only.)

OTOH, if you can implement ARCH= using CONFIG_ARCH, why not? Just don't forget
to keep the scripts running, and make randconfig only select buildable archs.




Re: [AppArmor 19/45] Add struct vfsmount parameters to vfs_rename()

2007-11-02 Thread Bodo Eggert
Al Viro [EMAIL PROTECTED] wrote:
 On Fri, Oct 26, 2007 at 11:23:53AM -0700, John Johansen wrote:

 In the current code, both vfsmounts are always identical, and so one of
 the two should go, agreed.
 
 The thought behind passing both vfsmounts was that they could differ but
 point to the same super_block, in which case renames would still be
 possible at least from a filesystem point of view. The essential
 restriction here is that both files must be on the same device; the vfs
 restriction of not allowing cross-mount renames is arbitrary.
 
 It's called access control.  Pathname-based one, BTW.  And yes, it's
 100% deliberate.

I doubt anybody uses bind mounts & co instead of symlinks in order to
prevent rename() while still allowing files to be moved by either copying
or by using the source file in the bound directory. At least I expected
bind mounted directories to behave like symlinked ones, minus the problems
of symlinks. 

Since this feature only protects you from rename(src/foo,dst/foo) if
1) There is no way to access src and dst in the same mount space
2) src and dst are writable by the attacker
3) Unlinking src/foo is OK
4) Renaming src/foo is OK as long as it's within the same mount as foo
5) Symlinking src/foo to dst/foo is OK
6) Creating dst/foo having a different owner is OK
7) Having dst/foo with the original content and owner from src/foo is _not_ OK
8) Moon crashes on earth
, I'd rather like to have a fast mv.

 Cross-mount renames are not allowed currently, and granted, they may not
 be very useful, either.
 
 <raised brows>
 Excuse me, but IIRC LSM was supposed to _add_ restrictions, not to remove
 existing security checks.

Security checks as in "we built a steel door into that Chinese paper wall"?

As far as I understand, the restriction would not be removed by the LSM
explicitly allowing it, but by the fixed vfs then being able to handle
cross-mountpoint renames. Maybe you'll want to keep the ability for the users
who use bind mounts in order to not allow rename() ... both of them.-)

/me prepares for the impact of a large round object on earth.



Re: [PATCH 0/6] kill i386 and x86_64 directories

2007-10-25 Thread Bodo Eggert
Thomas Gleixner [EMAIL PROTECTED] wrote:

 I think the last remaining bit to cleanup is the symlink from
 arch/x86/boot/bzImage.

BTW: Is it useful to have (b)zimage under $ARCH while vmlinux is in the root
dir? (Besides being compatible to external scripts)
-- 
I always tell customers/clients the same thing:
   Good, Fast, Cheap.  You can pick two.
-- randem in [EMAIL PROTECTED]
Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]


Re: Linux machines dieing in swap storms

2007-10-25 Thread Bodo Eggert
Rik van Riel [EMAIL PROTECTED] wrote:
 On Thu, 25 Oct 2007 16:20:41 +0100
 Richard Purdie [EMAIL PROTECTED] wrote:

 Advice on solving this welcome preferably in mainline but I'll happily
 hack my kernels with a workaround if need be.
 
 I can't see any easy hacks or workarounds to fix the issue in the
 current MM, except maybe activate the OOM killer if the amount of
 page cache and buffer cache is really low and swap is full...
 
 In the longer run, I'm working on:
 
 http://linux-mm.org/PageReplacementDesign

What about only reclaiming cache if the cache has grown beyond a watermark
and only reclaiming non-cache if it's below another watermark? I can imagine
it will solve my diskcache-pushes-out-mousehandler problem, and I'm pretty
sure having very low file cache is bad for performance, too.

Another thing I can imagine is to detect thrashing conditions and to
change scheduling in order to increase the likelihood of cache hits and
thereby progress: If an application just got a page, keep it running for
a while (accumulating negative credits).
-- 
Of course, as admin, I can read all your email. But I am not THAT bored!
-- unknown author in comp.unix.aix

Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]


Re: [PATCH] fix passing argument # of '__memcpy' discards qualifiers from pointer target type warnings

2007-10-24 Thread Bodo Eggert
Miguel Botón [EMAIL PROTECTED] wrote:

 This patch fixes the warnings "passing argument 1 of '__memcpy' discards
 qualifiers from pointer target type" and "passing argument 2 of '__memcpy'
 discards qualifiers from pointer target type" when compiling some files.
 
 I don't really know if this is the best way but at least I don't get more
 warnings.

 +++ linux-2.6.24-rc1/fs/cifs/dir.c2007-10-24 15:49:44.0 +0200
 @@ -585,6 +585,7 @@

 + unsigned char *dstname = (unsigned char *)a->name;

 @@ -593,7 +594,7 @@

 - memcpy((unsigned char *)a->name, b->name, a->len);
 + memcpy(dstname, b->name, a->len);

This looks like a compiler bug. Get the gcc people to fix it.
-- 
Top 100 things you don't want the sysadmin to say:
20. ...and if we just swap these two disc controllers like _this_...

Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]


Re: Power button policy and mechanism

2007-10-16 Thread Bodo Eggert
Dmitry Torokhov [EMAIL PROTECTED] wrote:
 On 10/16/07, Kristoffer Ericson [EMAIL PROTECTED] wrote:

 Is the suggested approach on handling powerbutton (in keyboard driver) to
 simply push out the event and let userland handle it?
 
 Yes.
 
 The reason I'm asking this is because, as you might know, I'm maintainer for two
 mini-laptop style PDAs (HP7xx & HP6xx) and it would simplify my life a lot if
 I didn't need to depend on userland applications to be able to
 suspend/resume.

 For instance HP6XX receives an interrupt call whenever the powerbutton is
 pressed. Now I could just push out the event and let another program handle
 it but considering it would take a minimum amount of lines to let it simply
 suspend/resume I feel its a waste.

 Previously the hp6xx has been allowed to do this policy way but that was
 when LinuxSH stood as a side branch to main tree. Now when everything gets
 merged into mainline I need to decide how to do this.

 This is mainly an embedded issue, but I feel it's quite important. It should
 apply to other devices also like for example Zaurus branches (those with
 keyboard and designated power button).

 So in short:
 1. Does mainline policy allow static power button events inside kernel (power
 button == suspend/resume)?
Why/Why Not?
 
 Could it be that you may want to prevent suspend from happening? Or
 delay it until system completes some important operation?

If I want to prevent the suspend/shutdown from happening, I don't press
the button. If my system insists on not shutting down and instead runs out
of power while doing something important, I'm not happy either.

OTOH, maybe you could allocate the storage for the suspend image here instead
of creating HUGE swap space and letting the kernel prevent suspend by using
it or watching the kernel thrash for hours before each OOM.

 Do something
 else, like cleanly disconnect your network connections?

Not needed, possibly not desired.

 With actual
 handling done in userspace it's all possible. With suspend done
 directly in kernel it is much harder and couples input subsystem with
 power management too tightly.

You can do something like control_alt_delete() - in fact I'm dedicating
400 KB of RAM for kill -INT 1 on my desktop, not using suspend. Having
an ACPI userspace event is not bad if you intend to use it, but for people
who just want shutdown or suspend to happen, it's overkill.
-- 
Top 100 things you don't want the sysadmin to say:
79. What's this any key I'm supposed to press?



Re: Killing a network connection

2007-10-15 Thread Bodo Eggert
Andi Kleen [EMAIL PROTECTED] wrote:
 Stefan Monnier [EMAIL PROTECTED] writes:

 The main use for me is to deal with dangling connections due to taking
 network interfaces up/down with different IP addresses (typically the wlan0
 interface where the IP is different because I've moved from an AP to
 another).  Of course, maybe there's another way to solve this particular
 problem, in case I'd like to hear about it as well.
 
 Long ago I did a 2.4 patch that solved exactly this problem. It introduced
 a new ifconfig flag dynamic and when a dynamic address went down
 all TCP connections originating from it were killed. It's still available
 in older SUSE releases. I might post a forward port later.

There is a /proc/sys/net/ipv4/ip_dynaddr sysctl in 2.6.21.
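
A minimal sketch for checking whether that sysctl is present and what it is set to. The helper name read_sysctl is made up for illustration; only the path comes from the message above. Whether the sysctl actually helps with the dangling-connection case is not established here.

```python
from pathlib import Path

# Path named in the message above; it only exists on Linux with IPv4 enabled.
IP_DYNADDR = Path("/proc/sys/net/ipv4/ip_dynaddr")


def read_sysctl(path=IP_DYNADDR):
    """Return the sysctl's integer value, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_sysctl()
    if value is None:
        print("ip_dynaddr is not available on this system")
    else:
        print("ip_dynaddr =", value)
```

Returning None instead of raising keeps the check usable on kernels or platforms where the file does not exist.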
-- 
If at first you don't succeed, call it version 1.0 



Re: [PATCH] FAT: Fix printk format strings.

2007-10-11 Thread Bodo Eggert
Vegard Nossum [EMAIL PROTECTED] wrote:

 This makes sure printk format strings contain no more than a single
 line.

  printk(KERN_WARNING
 -FAT: Did not find valid FSINFO signature.\n
 +FAT: Did not find valid FSINFO signature.\n);
 + printk(KERN_WARNING
   Found signature1 0x%08x signature2 0x%08x
   (sector = %lu)\n,

What about something like
Fat32: Invalid FSINFO signatures 0x%08x, 0x%08x; expected 0x%08x, 0x%08x\n ?
or
Fat32: Invalid FSINFO signatures 0x%08x, 0x%08x\n ?
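
To see what the longer variant would print, here is a small sketch that applies the same printf-style format in Python. The function name fsinfo_warning is hypothetical. The two expected values are the FSINFO signatures as I know them from the FAT32 on-disk format; please check them against the kernel headers before relying on them.

```python
# FSINFO lead and structure signatures (assumed from the FAT32 format).
FAT_FSINFO_SIG1 = 0x41615252
FAT_FSINFO_SIG2 = 0x61417272


def fsinfo_warning(sig1, sig2):
    """Return a one-line warning, or None when both signatures are valid."""
    if sig1 == FAT_FSINFO_SIG1 and sig2 == FAT_FSINFO_SIG2:
        return None
    return (
        "Fat32: Invalid FSINFO signatures 0x%08x, 0x%08x; "
        "expected 0x%08x, 0x%08x"
        % (sig1, sig2, FAT_FSINFO_SIG1, FAT_FSINFO_SIG2)
    )


if __name__ == "__main__":
    print(fsinfo_warning(0, FAT_FSINFO_SIG2))
```

Everything fits in one message, so it cannot be torn apart by other output the way two separate printk calls can.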
-- 
bus error. passengers dumped.



Re: gigabit ethernet power consumption

2007-10-11 Thread Bodo Eggert
Kok, Auke [EMAIL PROTECTED] wrote:
 K.Prasad wrote:

 Without the side-effect of experiencing a link-flap when switching to a
 lower-speed (with its toll in terms of down-time for auto-negotiation,
 STP, etc), the Interrupt Moderation Algorithm dynamically adjusts the
 number of interrupts based on traffic - and presumably consume less
 power. For an Optimise for Power kind of profile - the driver can be
 loaded with a higher throttle rate during boot-time.
 
 We're changing this to be run-time adjustable in newer drivers.
 
 However, the power consumed by your nic staying in gigabit mode is much
 greater in the long run than what you can save by trying to scrounge for
 milliwatts reducing interrupts generated by the nic. By default it already
 moderates them somewhat. Practically this feature is really not useful for
 powersaving, it just won't add up to actual benefits in a real life situation
 I think.

Just a thought:
How much power does a non-connected NIC consume, and can you save power
by forcing 10 MBit until a link is detected (doubling negotiation time)?
-- 
Top 100 things you don't want the sysadmin to say:
22. hey, what does mkfs do?



Re: [patch 2/2] getattr - fill the size of FIFOs

2007-10-03 Thread Bodo Eggert
Jan Engelhardt [EMAIL PROTECTED] wrote:

 [PATCH]: Fill the size of FIFOs
 
 Instead of reporting 0 in size when stating() a 

FIFO
-- 
Whenever you have plenty of ammo, you never miss. Whenever you are low on
ammo, you can't hit the broad side of a barn.



Re: Out of memory management in embedded systems

2007-09-29 Thread Bodo Eggert
linux-os (Dick Johnson) [EMAIL PROTECTED] wrote:
 On Fri, 28 Sep 2007, [iso-8859-1] Daniel Spång wrote:
 On 9/28/07, linux-os (Dick Johnson) [EMAIL PROTECTED] wrote:
 On Fri, 28 Sep 2007, [iso-8859-1] Daniel Spång wrote:
 On 9/28/07, linux-os (Dick Johnson) [EMAIL PROTECTED] wrote:

 But an embedded system contains all the software that will
 ever be executed on that system! If it is properly designed,
 it can never run out of memory because everything it will
 ever do is known at design time.

 Not if its input is not known beforehand. Take a browser in a mobile
 phone as an example, it does not know at design time how big the web
 pages are. On the other hand we want to use as much memory as
 possible, for cache etc., a method that involves the kernel would
 simplify this and avoids setting manual limits.

 Any networked appliance can (will) throw data away if there are
 no resources available.

 The length of a web-page is not relevant, nor is the length
 of any external data. Your example will buffer whatever it
 can and not read anything more from the external source until
 it has resources available unless it is broken.

And reload the web page whenever the user scrolls up.

 And how do you determine when no resources are available? We are using
 overcommit here so malloc() will always return non null.

 A networked appliance using embedded software is not your daddy's
 Chevrolet. Any task that is permanent needs to allocate all its
 resources when it starts. That's how it knows how much there are,
 and incidentally, it doesn't do it blindly. The system designer
 must know how much memory is available in the system and how much
 is allocated to the kernel.

So if I design a mobile phone and want the user to be able to listen to
music and to browse the web and use the camera, I must

- allocate the web cache on start
- allocate the mp3 cache on start
- allocate the video/photo buffer on start
- leave enough RAM for other applications
- waste much of the memory most of the time
- reduce the capabilities of the camera in order to reduce used memory
  (e.g. reduce the video resolution to match the slow flash speed instead
   of allowing short high-res clips or reduce the number of pictures for
   automatic sequences)
- costly reload parts of web pages that could have been stored in 
  the currently unused RAM
- increase the realtime constraints of the mp3 player in order to
  conserve memory, making the music stutter while performing
  other operations.

instead of just telling the system "Hey, I'm the web browser, I need
2 MB of RAM to work correctly, 4 MB to show a nice performance and
if you ask me nicely, I'll release memory", because nobody could possibly
want this feature?

IMO, even desktop systems would perform better having this feature.
Imagine no more GIMP images pushing out X's mouse driver ...
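
To make the idea concrete, here is a rough userspace sketch of an application-side cache that states a minimum and a preferred size and gives memory back when asked. Everything in it (the class ShrinkableCache, the release() call, the byte counts) is hypothetical; no such kernel interface is implied, and how the request would reach the application is left open.

```python
from collections import OrderedDict


class ShrinkableCache:
    """Cache that declares its memory needs and gives memory back on request."""

    def __init__(self, min_bytes, preferred_bytes):
        self.min_bytes = min_bytes              # needed to work correctly
        self.preferred_bytes = preferred_bytes  # needed for good performance
        self.used_bytes = 0
        self._entries = OrderedDict()           # key -> bytes, oldest first

    def put(self, key, data):
        """Store data, evicting oldest entries to stay within preferred_bytes."""
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = data
        self.used_bytes += len(data)
        self._evict_down_to(self.preferred_bytes)

    def get(self, key):
        """Return the cached bytes, or None if the entry was evicted."""
        return self._entries.get(key)

    def release(self, requested_bytes):
        """Handle a (hypothetical) low-memory request; return bytes freed."""
        before = self.used_bytes
        target = max(self.min_bytes, before - requested_bytes)
        self._evict_down_to(target)
        return before - self.used_bytes

    def _evict_down_to(self, limit):
        # Drop oldest entries until usage is at or below the limit.
        while self.used_bytes > limit and self._entries:
            _, data = self._entries.popitem(last=False)
            self.used_bytes -= len(data)
```

A browser could keep pages in such a cache at the preferred size while memory is plentiful, and a release() call under pressure would shrink it towards the minimum instead of pushing other programs out of RAM.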

 The fact that you can give a fictitious value to malloc() is not
 relevant. If you don't provide resources for malloc(), like
 (ultimately) a swap file, then you can't assume that it can do
 any design work for you.
 
 An embedded system is NOT an ordinary system that happens to
 boot from flash. An embedded system requires intelligent design.

Yeah, you should know all applications your users will put on their phones!

Maybe you can do this for industry embedded single-task systems, but
multi-purpose systems will benefit from 

 It is important to understand how a virtual memory system
 operates. The basics are that the kernel only knows that
 a new page needs to be allocated when it encounters a trap
 called a page fault. If you don't have any memory resources
 to free up (read no swap file to write a seldom-used task's
 working set), then you are screwed --pure and simple. So,
 if you don't provide any resources to actually use virtual
 memory, then you need to make certain that virtual memory
 and physical memory are, for all practical purposes, the same.

The system should notify the applications before reaching that point,
e.g. when the cache is reduced below a threshold.

 With embedded servers, it's usually very easy to limit the
 number of connections allowed, therefore the amount of
 dynamic resources that must be provided.

And it's easy to limit the size and number of images on web pages
that need to be cached.

 With clients
 it should be equally easy, but generic software won't
 work because,

... it will ignore the signal.

 for instance, Mozilla doesn't keep track
 of the number of windows you have up and the number
 of connections you have.

about:config network.http.max-connections

 HOWEVER, remember that malloc()
 is a library call. You can substitute your own using
 LD_PRELOAD, that keeps track of everything if you must
 use generic software.

Knowing one's memory usage - possibly not counting stack, code and data,
is only half of the trick. You need to know the current memory pressure,
too.
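
As a crude stand-in for "current memory pressure", an application can at least look at /proc/meminfo. The sketch below parses that file; the function names and the "free plus page cache" estimate are my own simplifications, not an established measure of pressure.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_kb(fields):
    """Rough estimate of easily available memory: free pages plus page cache."""
    return fields.get("MemFree", 0) + fields.get("Cached", 0)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        print("/proc/meminfo is not available on this system")
    else:
        print("roughly reclaimable:", reclaimable_kb(info), "kB")
```

Polling like this only shows a snapshot; it does not replace a notification from the kernel when the cache falls below a limit.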


Imagine mozilla asks if it may use the whole lotta 4 GB RAM for its memory
cache, and 

Re: Chroot bug (was: sys_chroot+sys_fchdir Fix)

2007-09-26 Thread Bodo Eggert
On Wed, 26 Sep 2007, David Newall wrote:

 Miloslav Semler pointed out that a root process can chdir(..) out of 
 its chroot.  Although this is documented in the man page, it conflicts 
 with the essential function, which is to change the root directory of 
 the process.

The root directory, '/' is changed, and if the process is capable of using
chroot, it may change the root directory again. Works as defined.

  In addition to any creative uses, for example Philipp 
 Marek's loading dynamic libraries, it seems clear that the prime purpose 
 of chroot is to aid security.

As long as root has more than a safe subset of capabilities, root can escape 
a chroot.

Besides that, fchdir on open-at-chroot fds does not decrease the security, 
since the attacker needs help from the outside root, who is not restricted 
by chroot.

I'm more concerned about abstract unix sockets, they could be used to 
send a file descriptor to compromised daemons and extend exploits to
the outside of a chroot and across namespaces - at least I suspect it.
The whole f* family of syscalls would be affected. This can be cured by
e.g. not allowing to receive fds if the root+namespace do not match.

  Being able to cd your way out is handy 
 for the bad guys, but the good guys don't need it; there are a thousand 
 better, safer solutions.

The good guys don't cd out, they open the installer archive, chroot to the 
new system root and extract it there. Then they chroot back using the saved 
cwd.
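
A sketch of that pattern, assuming a process with CAP_SYS_CHROOT (normally root). The helper name run_in_chroot and the injectable osmod argument are made up for illustration; the sketch saves a descriptor for the old root in addition to the old cwd, which is one way to get back.

```python
import os


def run_in_chroot(new_root, work, osmod=os):
    """Run work() with new_root as the root directory, then switch back.

    osmod is injectable so the call sequence can be exercised without
    privileges; by default the real os module is used.
    """
    old_root = osmod.open("/", os.O_RDONLY | os.O_DIRECTORY)  # outside root
    old_cwd = osmod.open(".", os.O_RDONLY | os.O_DIRECTORY)   # outside cwd
    try:
        osmod.chroot(new_root)
        try:
            osmod.chdir("/")
            return work()
        finally:
            osmod.fchdir(old_root)  # step outside via the saved descriptor
            osmod.chroot(".")       # make the old root the root again
            osmod.fchdir(old_cwd)   # restore the original working directory
    finally:
        osmod.close(old_cwd)
        osmod.close(old_root)
```

An installer would open its archive before calling this and extract from the already open file inside work(). The same saved descriptors are exactly what makes the chroot escapable, which is the point of the discussion.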

 If there truly is a need to be able to pop in and out of a chroot, then 
 the solution should be obvious, such as with real versus effective user 
 and group ids.  An important quality of a solution would be a way to fix 
 that essential function: to set the root in such a way that you can no 
 longer pop out.  But that is a separate question.

As in jail()?

As far as I know, the new virtualisation features sneaking into the kernel  
will allow implementing a jail, too, in a more secure way than any hacking 
on chroot can give.

 The question: is chroot buggy?  I'm pleased to turn to SCO for an 
 independent definition for chroot, from which I get the following:
 
 http://osr600doc.sco.com/en/man/html.S/chroot.S.html:
 
  The *..* entry in the root directory is interpreted to mean the root 
  directory itself. Thus, *..* cannot be used to access files outside 
  the subtree rooted at the root directory.
 
 
 I argue chroot is buggy.  Miloslav's patch might not be the right 
 solution, but he has the right idea (i.e. fix it.)

There are implementations of chroot which imply chdir(), and not having f* 
functions, they can not _directly_ access files outside the chroot. But as 
long as they can e.g. mknod /dev/mem or strace, they can do anything.

So let's not put a fingerprint sensor on that chinese paper door.
-- 
You know you're in trouble when packet floods are competing to flood you.
-- grc.com


Re: Unfortunate infinite make recursion

2007-09-22 Thread Bodo Eggert
Jan Engelhardt [EMAIL PROTECTED] wrote:

 You can cause a recursion in kbuild/make with the following:
 
 make O=$PWD kernel/time.o
 make mrproper
 
 Of course no one would use O=$PWD (that's just the testcase),
 but this happened too often:
 
 /ws/linux/linux-2.6.23$ make O=/ws/linux/linux-2.6.23 kernel/time.o
 (Oops - should have been O=/ws/linux/obj-2.6.23!)
 
 The make O=$PWD truncates the Makefile, making it necessary to run `git
 checkout Makefile` - should you have git; or reextract the tarball
 (should you /still/ have it). Well, can we catch this case somehow?

You can test for the existence of MAINTAINERS in the build dir and abort
if it's there.
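
The check itself is tiny. Here it is as a Python sketch rather than Makefile syntax, just to show the logic; the function names are hypothetical, and a real fix would live in the top-level Makefile.

```python
import os


def looks_like_source_tree(build_dir):
    """Heuristic from above: a kernel source tree contains a MAINTAINERS file."""
    return os.path.isfile(os.path.join(build_dir, "MAINTAINERS"))


def check_output_dir(build_dir):
    """Refuse to use a source tree as the O= output directory."""
    if looks_like_source_tree(build_dir):
        raise ValueError(
            "%s looks like a kernel source tree; refusing to use it as O="
            % build_dir
        )
    return build_dir
```

The test runs before anything is written, so the Makefile in the source tree is never truncated.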
-- 
My computer isn't that nervous...it's just a bit ANSI. 



Re: [PATCH] kernel/printk.c: Concerns about the console handover

2007-09-21 Thread Bodo Eggert
Maciej W. Rozycki [EMAIL PROTECTED] wrote:

  Move the handover message to after the boot console has been released to
 avoid bad interactions between it and the real console.

This message is useful if the handover fails, therefore it should be printed
on the boot console, while a successful switchover is implied by having any
output on the real console, isn't it?
-- 
Our last fight was my fault: My wife asked me What's on the TV?
I said, Dust!



Re: [BUG] Possible cache memory leak.

2007-09-21 Thread Bodo Eggert
Micha? Kazior [EMAIL PROTECTED] wrote:

 I've discovered a strange thing lately. My memory is being sucked out
 when doing (I suppose) _a lot_ of stat() on the file system. I got left
 once with ~30MB of ram (of 512 in total) which made my system thrash
 like hell. You might try doing the following to reproduce the effect:
 
  $ echo 1 > /proc/sys/vm/drop_caches
  $ vmstat                      # write down the result
  $ du -s -x /lots/of/files     # keep it running for a minute
  $ echo 1 > /proc/sys/vm/drop_caches
  $ vmstat                      # write it down too

RTFM: Documentation/filesystems/proc.txt
-- 
To define recursion, we must first define recursion. 



Re: sys_chroot+sys_fchdir Fix

2007-09-20 Thread Bodo Eggert
David Newall [EMAIL PROTECTED] wrote:

 Normal users cannot use chroot() themselves so they can't use chroot to
 get back out
 
 I think Bill is right, that this is to fix a method that non-root
 processes can use to escape their chroot. The exploit, which is
 documented in chroot(2)*, is to chdir(..) your way out. Who'd have
 thought it? Only root can do that, but even that seems wrong. Chroot
 should be chroot and that should be the end of it.

chroot with having open directories outside the chroot is a convenience
feature, allowing e.g. to install programs into a different root while
opening the archives from another root tree. Only if there is a working
capability system preventing root from accessing the hardware*, a chroot
may become a security feature.

Of course, having the new fchdir, you might run chroot /var/foo 3< / in
order to pass a dir filehandle and compromise your own security, but this
is nothing a system should protect against.

The only problem I'm concerned about is passing a file descriptor to a
privileged, compromised process using an abstract unix socket. This combines
two different privileges, possibly increasing the impact of the attack.
I think it may be enough to not allow passing directory fds if the two
processes have different device/inode/namespace, but I'm not sure about
device fds.
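
For readers who have not seen descriptor passing, a small sketch of the mechanism itself. It uses a socketpair instead of an abstract unix socket and stays inside one process, so it shows only how a directory fd travels over AF_UNIX, not the attack. It assumes Python 3.9 or later for socket.send_fds() and socket.recv_fds(); the function name is made up.

```python
import os
import socket


def pass_directory_fd(path):
    """Send a descriptor for path over a Unix socket pair; return the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data; at least one
        # byte of ordinary payload has to accompany it.
        socket.send_fds(sender, [b"d"], [dir_fd])
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        return fds[0]
    finally:
        os.close(dir_fd)
        sender.close()
        receiver.close()
```

The receiver ends up with a descriptor for the same directory and can fchdir() to it or use it with the *at() calls, wherever its own root happens to be. That is why the root and namespace of both ends matter.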


*) chmod u+s binary; su nobody; exec binary; mount tmpfs /; mknod dev_mem
   should be enough to void most root-in-chroot setups. Very untested.
-- 
Funny quotes:
26. If you take an Oriental person and spin him around several times, does he
become disoriented?


Re: patch/option to wipe memory at boot?

2007-09-20 Thread Bodo Eggert
Chris Snook [EMAIL PROTECTED] wrote:
 David Madore wrote:
 On Mon, Sep 17, 2007 at 11:11:52AM -0700, Jeremy Fitzhardinge wrote:

 Boot memtest86 for a little while before booting the kernel?  And if you
 haven't already run it for a while, then that would be your first step
 anyway.
 
 Indeed, that does the trick, thanks for the suggestion.  So I can be
 quite confident, now, that my RAM is sane and it's just that the BIOS
 doesn't initialize it properly.
 
 But I'd still like some way of filling the RAM when Linux starts (or
 perhaps in the bootloader), because letting memtest86 run after every
 cold reboot isn't a very satisfactory solution.
 
 Bootloaders like to do things like run in 16-bit or 32-bit mode on boxes where
 higher bitness is necessary to access all the memory.  It may be possible to
 do this in the bootloader, but the BIOS is clearly the correct place to fix
 this problem.

Just an idea: Does this BIOS have an option to (not) skip the full memory
test on bootup?
-- 
Have you ever noticed that the Klingons are all speaking unix?
Grep ls awk chmod.   Mknod ksh tar imap.
Wall fsck yacc! (that last is obviously a curse of some sort)
-- Gandalf  Parker


Re: Wasting our Freedom

2007-09-18 Thread Bodo Eggert
Paul de Weerd [EMAIL PROTECTED] wrote:
 On Mon, Sep 17, 2007 at 03:38:45PM +0200, Adrian Bunk wrote:

 | It's not about lazyness of BSD developers, many people who consider the
 | BSD licence more free than the GPL argue that the advantage of the BSD
 | licence is that it does not require you to give back.
 | 
 | Something is wrong if your licence text clearly states that you do not
 | require getting anything back but you then argue on moral grounds that
 | something has to be given back.
 
 Something is wrong if your licence text clearly states that you MUST
 give back, but then you don't return the favour on grounds that hey,
 they don't require it, so we don't have to.

If you may demand me to give back, why should I(*) not demand the same thing
for my contributions?

You're not only asking to contribute to your project, but you're asking me
to throw my code to the feet of Apple and Microsoft, who will use it, make
big bucks and lock out alternatives as far as possible, especially free ones.
This happens to not be my idea of sharing code.

You may say it's morally correct for them because they never claimed they'd
not suck your blood, but did the GPL people claim not to demand others to
give back? Au contraire, and if it's OK for companies to do what they say
they would, it will be OK for them and for me, too.


*) I did not yet work on a BSD licensed project, but let's assume I did

 It may be perfectly legal, but it's interesting to say the least.
 No, you do not have to give back. But weren't you open source / free
 software developers ? Why did you pick the GPL ? Because you didn't
 want someone to run off with your code ? You wanted code to be given
 back ? Why not do it yourself ?

 By not giving back you're giving a strange signal.

Gee, since you're demanding back anyway, you can use the GPL for your
project and use my contribution. Problem solved.

Oh, you want people to be free not to share? Freedom includes having the
moral right to do something. Am I free not to share? Or is this freedom
just an empty shell?

-- 
Funny quotes:
36. You never really learn to swear until you learn to drive.



Re: O_NOLINK for open()

2007-09-14 Thread Bodo Eggert
On Thu, 13 Sep 2007, Jan Kara wrote:

   However, it occurs to me that this problem goes away if there were
   a method create a file in an unlinked state to begin with.  However
   there does not appear to be any such mechanism in Linux's open()
   interface.
  
  Having no window for creating stale temp files is nice to have. We only
  need a clever fool to implement it.-) But since it's hard to get killed
  just in the right moment for having a stale temp file, there is very low
  interest for this feature.

   I don't think this is a problem. The file is simply created with link
 count 0. As soon as the process closes the file, it gets deleted. So
 there would be no stale files... Or did you mean anything else?

This feature does, AFAIK, not yet exist. Therefore we'd need a code monkey.
-- 
Top 100 things you don't want the sysadmin to say:
42. Hey Fred, did you save that posting about restoring filesystems
with vi and a toothpick?  More importantly, did you print it out?


Re: O_NOLINK for open()

2007-09-14 Thread Bodo Eggert
Brent Casavant [EMAIL PROTECTED] wrote:

[...]
 Hmm.  This will work as long as the peer process is running setuid
 to its own unique user.  Excellent idea!  Since I need to make the
 program setuid to avoid non-privileged ptrace attacks, this is a
 terrific solution.

Tried that:

~ > cd tmp
~/tmp > cp /bin/sleep .
~/tmp > chmod u+s sleep
~/tmp > ./sleep 2147483647 &
[1] 2823
~/tmp > strace -p 2823
Process 2823 attached - interrupt to quit
setup(

-- 
Top 100 things you don't want the sysadmin to say:
27. You can do this patch with the system up...



Re: O_NOLINK for open()

2007-09-14 Thread Bodo Eggert
On Fri, 14 Sep 2007, Andreas Schwab wrote:
 Bodo Eggert [EMAIL PROTECTED] writes:

  ~/tmp > cp /bin/sleep .
  ~/tmp > chmod u+s sleep
  ~/tmp > ./sleep 2147483647 &
  [1] 2823
  ~/tmp > strace -p 2823
  Process 2823 attached - interrupt to quit
  setup(
 
 You didn't change the owner, so this is not a setuid execution.

I expected that, but I wanted to be sure before telling bull.
Besides that, if the suid program was owned by the suid-to user,
that user could modify the binary in order to prepare a future attack.
-- 
Top 100 things you don't want the sysadmin to say:
16. find /usr2 -name nethack -exec rm -f {};


Re: O_NOLINK for open()

2007-09-12 Thread Bodo Eggert
Brent Casavant [EMAIL PROTECTED] wrote:

[...]
 I could mmap a temporary tmpfs file (tmpfs so that if there is a
 machine crash no sensitive data persists) which is created with
 permissions of 0, immediately unlink it, and pass the file
 descriptor through an AF_UNIX socket.  This does open up a very
 small window of vulnerability if another process is able to chmod
 the file and open it before the unlink.

If the process can chmod the file, it can ptrace the daemon, too.
Or, using CAP_DAC_OVERRIDE, it can patch the daemon.

Both will void any security.

 However, it occurs to me that this problem goes away if there were
 a method create a file in an unlinked state to begin with.  However
 there does not appear to be any such mechanism in Linux's open()
 interface.

Having no window for creating stale temp files is nice to have. We only
need a clever fool to implement it.-) But since it's hard to get killed
just in the right moment for having a stale temp file, there is very low
interest for this feature.
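
For reference, the create-then-unlink approach from the quoted text can be sketched as follows. The function name create_unlinked is hypothetical, and the path would be on tmpfs in the scenario discussed. The window mentioned above is the gap between the two system calls.

```python
import os


def create_unlinked(path):
    """Create path with mode 0, unlink it at once, return the open descriptor."""
    # O_EXCL refuses to reuse an existing file. Mode 0 denies later open()
    # calls by unprivileged processes but does not restrict this descriptor.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0)
    try:
        os.unlink(path)  # the window lies between open() and this call
    except OSError:
        os.close(fd)
        raise
    return fd
```

The descriptor stays usable for read, write and mmap after the name is gone, and it can be handed to the peer over an AF_UNIX socket. As far as I know, later kernels added an O_TMPFILE flag to open() that creates the file without a name in the first place, which removes the window.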
-- 
You know you're in trouble when packet floods are competing to flood you.
-- grc.com



Re: sata scsi suggestion for make menuconfig

2007-09-08 Thread Bodo Eggert
Al Boldi [EMAIL PROTECTED] wrote:
 Alan Cox wrote:

  I once sent a patch to make libata a submenu of scsi.

 Which is wrong

 Nakked-by: Alan Cox [EMAIL PROTECTED]

 The general comments about moving this stuff around and making it clearer
 what sd/sr etc are nowdays are good but hiding libata under SCSI will
 cause even more confusion than it cures
 
 That's easy to fix:  just change the SCSI heading to include a libata hint.

I think you're fixing the wrong problem.

The real problem is hiding devices attached to some controllers among
one kind of the controllers. This was correct when they were
bus-specific, but since they are now shared by three buses, they should get
their own menu called (S)ATA/USB/SCSI attached devices - or whatever a
native speaker would suggest.

Besides that, if I imagine being a semi-novice and searching for IDE
support, I would have a hard time finding the IDE menu, and assuming
PATA to be non-experimental one day, I'd have a hard time deciding
which of the drivers to use. Maybe the SATA-drivers should be put
above the old PATA menu, and maybe both of the titles should include
(E)IDE?

BTW: For CONFIG_ATA, you can replace
(!M32R && !M68K || BROKEN) && (!SUN4 || BROKEN)
with (!M32R && !M68K && !SUN4 || BROKEN)
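
The two expressions can be checked against each other by brute force over all 16 combinations. This sketch assumes the operators lost in the archive were &&, with the usual precedence of && over ||; the function names are made up.

```python
from itertools import product


def original(m32r, m68k, sun4, broken):
    """(!M32R && !M68K || BROKEN) && (!SUN4 || BROKEN)"""
    return ((not m32r and not m68k) or broken) and ((not sun4) or broken)


def simplified(m32r, m68k, sun4, broken):
    """(!M32R && !M68K && !SUN4 || BROKEN)"""
    return (not m32r and not m68k and not sun4) or broken


def equivalent():
    """Compare both expressions for every combination of the four symbols."""
    return all(
        original(*values) == simplified(*values)
        for values in product([False, True], repeat=4)
    )


if __name__ == "__main__":
    print("equivalent:", equivalent())
```

This is just the distributive law: (X || B) && (Y || B) equals (X && Y) || B.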

BTW2: I think that menu needs very much reordering. Block devices should
be renamed to Other block devices, AGP support should belong into graphics
support, and many other things I don't even know need to be pushed around.
Even ordering by name would be better than the current situation! But it
should be done by someone knowing these devices, I could only do a part.
-- 
Top 100 things you don't want the sysadmin to say:
14. Any more trouble from you and your account gets moved to the 750



Re: modinfo modulename question

2007-09-05 Thread Bodo Eggert
Justin Piszcz [EMAIL PROTECTED] wrote:

 Is there anyway to get/see what parameters were passed to a kernel module?
 Running modinfo -p module will show the defaults, but for example, st,
 the scsi tape driver, is there a way to see what it is currently using?

/sys/module/$NAME/parameters (if it's using the new API)
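
A small sketch that dumps those values. The sysfs directory is /sys/module (singular); the helper name module_parameters and its sysfs_root argument are made up for illustration. Modules that do not export their parameters through sysfs simply yield nothing.

```python
import os


def module_parameters(name, sysfs_root="/sys/module"):
    """Return {parameter: value} for a loaded module, or {} if none are exposed."""
    param_dir = os.path.join(sysfs_root, name, "parameters")
    params = {}
    try:
        entries = sorted(os.listdir(param_dir))
    except OSError:
        return params  # module not loaded, or no parameters in sysfs
    for entry in entries:
        try:
            with open(os.path.join(param_dir, entry)) as f:
                params[entry] = f.read().strip()
        except OSError:
            params[entry] = None  # e.g. not readable by this user
    return params


if __name__ == "__main__":
    for key, value in module_parameters("st").items():
        print(key, "=", value)
```

For the tape driver from the question, module_parameters("st") would list whatever st exports on the running kernel.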
-- 
It is still called paranoia when they really are out to get you.



Re: Fwd: That whole Linux stealing our code thing

2007-09-02 Thread Bodo Eggert
Igor Sobrado [EMAIL PROTECTED] wrote:

 When code is multi-licensed it must be distributed under *all* these
 licensing terms concurrently.

No. E.g.:

If I don't agree to the GPL (or if I have violated it and therefore lost
its privileges), I MUST NOT redistribute the code under the GPL because I
have no license to do that, but the BSD license would still allow me to
redistribute it.



Re: [linux-usb-devel] [RFC] USB: driver for iphone charging

2007-08-25 Thread Bodo Eggert
On Fri, 24 Aug 2007, Greg KH wrote:
 On Fri, Aug 24, 2007 at 12:51:19PM +0200, Bodo Eggert wrote:
  Greg KH [EMAIL PROTECTED] wrote:

   my berry_charge code that adds support for charging the iphone when it
   is plugged into a Linux machine.
  
  This should be a runtime option, because you may want to build a non-module
  kernel and not charge the phone while running your laptop on battery.
 
 Then just don't build this module if you are creating such a kernel :)
 
 Same thing goes for the existing blackberry charge driver too...

So I have to reboot if I want to charge my (blackberry|iphone) on my 
laptop?

Even if you say installing 850 KB of software (modutils) in order to 
toggle charging an iphone is sane, having to (un)load a module is not
user-friendly.

-- 
'Multiple exclamation marks,' he went on, shaking his head, 'are a 
sure sign of a diseased mind.'
-- Terry Pratchett in Eric


Re: [RFC] USB: driver for iphone charging

2007-08-24 Thread Bodo Eggert
Greg KH [EMAIL PROTECTED] wrote:

 my berry_charge code that adds support for charging the iphone when it
 is plugged into a Linux machine.

This should be a runtime option, because you may want to build a non-module
kernel and not charge the phone while running your laptop on battery.
-- 
Top 100 things you don't want the sysadmin to say:
72. My leave starts tomorrow.



Re: RFC: issues concerning the next NAPI interface

2007-08-24 Thread Bodo Eggert
Linas Vepstas [EMAIL PROTECTED] wrote:
 On Fri, Aug 24, 2007 at 03:59:16PM +0200, Jan-Bernd Themann wrote:

 3) On modern systems the incoming packets are processed very fast. Especially
 on SMP systems when we use multiple queues we process only a few packets
 per napi poll cycle. So NAPI does not work very well here and the interrupt
 rate is still high.
 
 I saw this too, on a system that is modern but not terribly fast, and
 only slightly (2-way) smp. (the spidernet)
 
 I experimented with various solutions, none were terribly exciting.  The
 thing that killed all of them was a crazy test case that someone sprung on
 me:  They had written a worst-case network ping-pong app: send one
 packet, wait for reply, send one packet, etc.
 
 If I waited (indefinitely) for a second packet to show up, the test case
 completely stalled (since no second packet would ever arrive).  And if I
 introduced a timer to wait for a second packet, then I just increased
 the latency in the response to the first packet, and this was noticed,
 and folks complained.

Possible solution / possible brainfart:

Introduce a timer, but don't start using it to combine packets unless you
receive at least n packets within one timeframe. If you then receive fewer
than m packets within one timeframe, stop using the timer. The system should
now have a decent response time when the network is idle, and when the
network is busy, nobody will complain about the latency.-)
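
The idea amounts to a switch with hysteresis. A minimal sketch, not driver code: the class and method names are invented, and how the packets per timeframe get counted is left out.

```python
class AdaptiveCoalescer:
    """Turn batching on at >= n packets per timeframe, off again below m."""

    def __init__(self, n, m):
        if m > n:
            raise ValueError("m should not exceed n, or the mode would flap")
        self.n = n
        self.m = m
        self.batching = False

    def end_of_window(self, packets_seen):
        """Call once per timeframe with the packet count; returns the new mode."""
        if not self.batching and packets_seen >= self.n:
            self.batching = True
        elif self.batching and packets_seen < self.m:
            self.batching = False
        return self.batching
```

With the ping-pong test case (one packet per timeframe) the switch never turns on, so the first packet is never delayed by the timer.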
-- 
Funny quotes:
22. When everything's going your way, you're in the wrong lane and going
the wrong way.


Re: [PATCH] Make checkpatch rant about trailing ; at the end of if expr

2007-08-22 Thread Bodo Eggert
On Mon, 20 Aug 2007, Jan Engelhardt wrote:
 On Aug 20 2007 13:52, Bodo Eggert wrote:

  But. The above regex does not seem to handle
  
  if ((a = b));
  oops;
  
  I have tried to come up with a superduper regex that handles multiple
  (), but my regex fu seems to stop above two pairs of ().
 
 This is because you can't do that using finite regular expressions.
 
 Regular expressions are Type-3 grammars, but you'd need a Type-2
 grammar to express the Dyck language (and you need to parse a Dyck
 Language, ignoring the non-dyck-parts).
 
 So what about this then...
 
 
 $s = shift @ARGV;
 $r = qr/a(??{ $r })?b/;

This is not a regular expression, because the language it matches can't be
recognized by a finite state machine (DFA/NFA); that needs a stack.
http://en.wikipedia.org/wiki/Deterministic_finite_state_machine

Obviously perl does allow non-regular expressions.

 if ($s =~ /^$r$/) {
   print "Yup, that's good\n";
 } else {
   print "fail\n";
 }
 
 
 $ perl foo.pl aa
 Not so much
 $ perl foo.pl 
 Yup, that's good
 $ perl foo.pl a
 Not so much

perl foo.pl aaababbb
fail

$r = qr/a(??{ $r })?b(??{ $r })?/; does seem to work.
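
To see why the first pattern rejects aaababbb and the second accepts it, here are the two grammars written out as hand-made recursive matchers. This is a sketch of the grammars S -> a S? b and S -> a S? b S?, not a translation of what Perl's engine does internally, and the function names are made up.

```python
def match_nested(s, i=0):
    """S -> 'a' S? 'b'. Return the index just past one S starting at i, or -1."""
    if i >= len(s) or s[i] != "a":
        return -1
    i += 1
    if i < len(s) and s[i] == "a":      # optional inner S
        i = match_nested(s, i)
        if i < 0:
            return -1
    if i >= len(s) or s[i] != "b":
        return -1
    return i + 1


def match_dyck(s, i=0):
    """S -> 'a' S? 'b' S?. As above, plus an optional S after the 'b'."""
    if i >= len(s) or s[i] != "a":
        return -1
    i += 1
    if i < len(s) and s[i] == "a":      # optional inner S
        i = match_dyck(s, i)
        if i < 0:
            return -1
    if i >= len(s) or s[i] != "b":
        return -1
    i += 1
    if i < len(s) and s[i] == "a":      # optional following S
        i = match_dyck(s, i)
    return i


def full_match(matcher, s):
    """True if the matcher consumes the whole string (like /^$r$/)."""
    return matcher(s) == len(s)
```

The first grammar only accepts n times a followed by n times b; the second also allows groups side by side, which is what nested and adjacent pairs of () need. The recursion is the stack that a finite state machine does not have.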
-- 
Those who would give up essential liberty, to purchase a little
temporary safety, deserve neither liberty nor safety.
-- Benjamin Franklin, Historical Review of Pennsylvania, 1759


Re: Software based ECC ?

2007-08-21 Thread Bodo Eggert
Folkert van Heusden [EMAIL PROTECTED] wrote:

  http://pdos.csail.mit.edu/papers/softecc:ddopson-meng/softecc_ddopson-meng.pdf
  SoftECC : A System for Software Memory Integrity Checking
 
 Personally, I'd recommend just shelling out the bucks for hardware ECC if
 the reliability matters.
 
 a question and an idea: Q: is ecc guaranteed to detect all bitflips?

It's guaranteed not to.

Having n extra bits, you can detect at most n-bit-flips and correct at most
n/2-bit-flips (and only if you use an optimal code).

These extra bits can flip, too, so if you have m >= 1 data bits and any
finite number n of extra bits, it's possible to have an undetectable
n+1-bit-flip.
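
The smallest example is a single even-parity bit (n = 1): every 1-bit-flip is detected, but some 2-bit-flips (that is, n+1) go unnoticed, and one of the flipped bits may be the parity bit itself. A sketch with made-up helper names:

```python
def add_parity(bits):
    """Append one even-parity bit to a list of 0/1 data bits."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """True if the word (data plus parity bit) still has even parity."""
    return sum(word) % 2 == 0

def flip(word, positions):
    """Return a copy of word with the bits at the given positions inverted."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(word)]
```

Flipping one data bit makes parity_ok() fail; flipping the same data bit together with the parity bit yields a word that checks out fine but carries wrong data.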
-- 
If you can't remember, then the claymore IS pointed at you. 



Re: group ownership of tun devices -- nonfunctional?

2007-08-20 Thread Bodo Eggert
On Mon, 20 Aug 2007, Rene Herman wrote:
 On 08/19/2007 11:42 PM, Bodo Eggert wrote:

  The intended [by me] semantics is If the user is not
   * the allowed user
  or
   * member of the allowed group
  or
   * capable of CAP_NET_ADMIN
  then error out. I'm assuming
 
 There is a short description of the desired semantics in the link that was 
 posted:
 
 http://lkml.org/lkml/2007/6/18/228
 
 ===
 The user now is allowed to send packages if either his euid or his egid
 matches the one specified via tunctl (via -u or -g respectively). If both
 gid and uid are set via tunctl, both have to match.
 ===
 
 Paraphrasing the original code above, it's saying:
 
 if ((owner_is_set && does_not_match) || (group_is_set && does_not_match))
   bugger_off_unless(CAP_NET_ADMIN);
 
 or reverting the logic:
 
 if ((owner_is_unset || does_match) && (group_is_unset || does_match))
   good_to_go();
 
 which probably matches the intention -- we're good to go only if the 
 credentials that are set also match.

Maybe there are valid reasons to do it this way, but I think having it the 
way I described would be less confusing.
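
To make the difference between the two readings concrete, here is a sketch of both policies side by side. The function names are invented, -1 stands for "not set" as in the quoted check, and the explicit "is set" test in the second function is my assumption about how the either-one-matches variant would treat unset values.

```python
UNSET = -1  # sentinel for "owner/group not configured", as in the quoted check

def strict_allows(euid, egid, owner, group, cap_net_admin=False):
    """Quoted patch: every credential that is set has to match."""
    owner_ok = owner == UNSET or euid == owner
    group_ok = group == UNSET or egid == group
    return (owner_ok and group_ok) or cap_net_admin

def lenient_allows(euid, egid, owner, group, cap_net_admin=False):
    """Alternative: a matching user OR a matching group is enough."""
    return ((owner != UNSET and euid == owner)
            or (group != UNSET and egid == group)
            or cap_net_admin)
```

With owner 1000 and group 50 set, a user with euid 1001 and egid 50 is rejected by the first and accepted by the second. Note the other difference: with nothing set at all, the first lets everybody in, the second only CAP_NET_ADMIN.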

-- 
   ¤ Bill of Spammer-Rights ¤
1. We have the right to assassinate you.
2. You have the right to be assassinated.
3. You have the right to resist, but it is futile.


Re: [PATCH] Make checkpatch rant about trailing ; at the end of if expr

2007-08-20 Thread Bodo Eggert
Jan Engelhardt [EMAIL PROTECTED] wrote:
 On Aug 16 2007 10:21, Andy Whitcroft wrote:

 +   if ($line =~ /\bif\s*\([^\)]*\)\s*\;/) {

Heh, you are the second person to suggest this check today, do I detect
some ripped out hair due to one of these!

I've taken this idea and expanded it to cover if, for and while which
can all suffer from this.  Using the relative indent to work out which
are valid combinations:
 
 But. The above regex does not seem to handle
 
 if ((a = b));
 oops;
 
 I have tried to come up with a superduper regex that handles multiple
 (), but my regex fu seems to stop above two pairs of ().

This is because you can't do that using finite regular expressions.

Regular expressions are Type-3 grammars, but you'd need a Type-2
grammar to express the Dyck language (and you need to parse a Dyck
Language, ignoring the non-dyck-parts).
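
What a finite regex lacks is a counter (or stack) for the nesting depth. A minimal sketch of a line check that counts parentheses instead; this is not checkpatch code, the function name is made up, and it ignores parentheses inside strings and comments.

```python
def empty_body_after_if(line):
    """Return True if `line` contains `if (...)` directly followed by ';'."""
    start = line.find("if")
    while start != -1:
        i = start + 2
        while i < len(line) and line[i].isspace():
            i += 1
        # Only treat "if" as a keyword if it does not end a longer identifier.
        is_word = start == 0 or not (line[start - 1].isalnum()
                                     or line[start - 1] == "_")
        if is_word and i < len(line) and line[i] == "(":
            depth = 0
            while i < len(line):
                if line[i] == "(":
                    depth += 1
                elif line[i] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                i += 1
            # i is now on the matching ')' (or past the end if unbalanced).
            rest = line[i + 1:].lstrip()
            if depth == 0 and i < len(line) and rest.startswith(";"):
                return True
        start = line.find("if", start + 2)
    return False
```

The depth counter is what handles any number of nested () pairs, including the if ((a = b)); case.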
-- 
Your e-mail has been returned due to insufficient voltage. 



Re: group ownership of tun devices -- nonfunctional?

2007-08-19 Thread Bodo Eggert
Mike Mohr [EMAIL PROTECTED] wrote:

(intentionally not snipping much)

 Per the post here:
 
 http://lkml.org/lkml/2007/6/18/228
 
 it appears that the group ownership patch has made it into .23.  I am
 using these patches, amongst which the kernel component appears to be
 identical:
 
 http://sigxcpu.org/unsorted-patches/0001-allow-tun-ownership-by-group.patch
 http://sigxcpu.org/unsorted-patches/tunctl_gid.diff
 
 I can create devices that are owned by my user account (tunctl -u
 `whoami` -t tap0) and it works fine.  However, if I use group
 permissions with -g it stops working.  In all cases, if I pass -g
 group, the interface is created correctly but it is unusable as a
 non-root user.
 
 So my question is: am I doing something wrong?  If I am, I don't see
 it.  Assuming then that I am not doing anything wrong on my end, I
 assume then that there is something missing from the kernel patch I
 applied.  I read over it and I can't see any issues, especially
 considering that tunctl comes back without error (even with -g) and
 creates an interface.
 
 Just wondering if this was an issue that should be looked into--


IMHO the check is broken:

+   if (((tun->owner != -1 &&
+         current->euid != tun->owner) ||
+        (tun->group != -1 &&
+         current->egid != tun->group)) &&
+        !capable(CAP_NET_ADMIN))
            return -EPERM;

It should be something like:

+   if (!((tun->owner == tun->owner) ||
+         (tun->group == tun->group) ||
+         capable(CAP_NET_ADMIN)))
            return -EPERM;

Please verify and forward to the maintainers if my guess appears to be correct.
-- 
Never stand when you can sit, never sit when you can lie down, never stay
awake when you can sleep.



Re: group ownership of tun devices -- nonfunctional?

2007-08-19 Thread Bodo Eggert
On Sun, 19 Aug 2007, Rene Herman wrote:

 On 08/19/2007 06:05 PM, Bodo Eggert wrote:
 
  IMHO the check is broken:
  
  +   if (((tun->owner != -1 &&
  +         current->euid != tun->owner) ||
  +        (tun->group != -1 &&
  +         current->egid != tun->group)) &&
  +        !capable(CAP_NET_ADMIN))
              return -EPERM;
  
  It should be something like:
  
  +   if (!((tun->owner == tun->owner) ||
  +         (tun->group == tun->group) ||
 
 ???

Argh, I edited it assuming the same order of variables. Substitute
current->e{uid,gid} for one of the sides.

  +         capable(CAP_NET_ADMIN)))
              return -EPERM;

The intended semantics is: If the user is not
 * the allowed user
or
 * member of the allowed group
or
 * capable of CAP_NET_ADMIN
then error out. I'm assuming

Thinking about it, maybe you should check each group, not just the 
effective group. In that case, my change would be still wrong. However, 
I'm not going to fix this anytime soon.
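
For illustration only, a userspace sketch of what checking each group would look like. The function name is made up, -1 again means "not set", and a kernel version would of course look different.

```python
def may_use_tun(euid, groups, owner, group, cap_net_admin=False):
    """Allow on matching user, membership in the allowed group, or capability.

    `groups` holds every group the caller belongs to (effective plus
    supplementary); owner/group of -1 mean "not configured".
    """
    if cap_net_admin:
        return True
    if owner != -1 and euid == owner:
        return True
    return group != -1 and group in groups
```

For the current process, the arguments could be os.geteuid() and set(os.getgroups()) | {os.getegid()}.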

-- 
Funny quotes:
15. I drive way too fast to worry about cholesterol.


Re: Bad CD disk disables IDE DMA

2007-08-16 Thread Bodo Eggert
Michal Piotrowski [EMAIL PROTECTED] wrote:
 On 15/08/07, Zoltan Boszormenyi [EMAIL PROTECTED] wrote:

 I noticed that a bad CD of mine makes DMA disabled:

[...]
 hda: cdrom_decode_status: error=0x40 { LastFailedSense=0x04 }
 ide: failed opcode was: unknown
 hda: DMA disabled
 hda: ide_intr: huh? expected NULL handler on exit
 hda: ATAPI reset complete

 Every time I put the said CD into the drive and DMA is on, I get the
 above messages.
 
 This might be intended.

Maybe, and maybe only a certain effect might be intended. And maybe I can
help by asking these questions:

1) Does disabling DMA fix the seek errors, or does it hide them because the
   PIO interface doesn't print them?

2) If it does hide them, would filtering for seek errors be a sane thing to
   do, or does the DMA engine (or HDD) behave badly on these errors, or
   do bad-DMA devices report seek errors, too?

3) If it does not hide them, would re-enabling DMA on disk change be a
   feasible workaround?
   (proposed interface: use hdparm -d $num where $num =
     0, 1: as before
     3: DMA on, if switched off automatically, will be set to 2
     2: DMA off, will be turned on (set to 3) on disk change)

4) Does libata work well enough that any effort put into that old
   IDE layer would be a waste of time?
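
The interface proposed in question 3 boils down to a small state machine. A sketch with invented names; the values 0..3 are the ones from the proposal, nothing of this exists in hdparm or the IDE layer.

```python
DMA_OFF, DMA_ON, DMA_OFF_UNTIL_CHANGE, DMA_ON_AUTO = 0, 1, 2, 3

def on_dma_error(mode):
    """The driver gives up on DMA after an error."""
    if mode == DMA_ON_AUTO:
        return DMA_OFF_UNTIL_CHANGE   # remember to retry with the next disk
    if mode == DMA_ON:
        return DMA_OFF                # "as before": DMA simply stays off
    return mode

def on_disk_change(mode):
    """A new medium was inserted."""
    if mode == DMA_OFF_UNTIL_CHANGE:
        return DMA_ON_AUTO            # give DMA another chance
    return mode
```

Starting in mode 3, a bad CD drops the drive to 2, and swapping the disk brings it back to 3; modes 0 and 1 behave as they do today.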
-- 
Fun things to slip into your budget
True: NASA couldn't get money approved for hangars for Space Shuttle so they
put in request for 'Space Vehicle Refurbishment and Storage Facilities'
   - and got the money to build hangars


Re: [PATCH V3] limit minixfs printks on corrupted dir i_size, CVE-2006-6058

2007-08-15 Thread Bodo Eggert
On Mon, 13 Aug 2007, Eric Sandeen wrote:
 Bodo Eggert wrote:

  Warning: I'm only looking at the patch.
 
  You are supposed to print an error message for a user, not to write in a
  chat window to a 1337 script kiddie. OK, you just matched the current style,
  and your patch is IMHO OK for a quick security fix, but:
 
  - Security fixes should be CCed to the security mailing list, shouldn't
    they?
    (It might be security@ or stable@, I'll remember tomorrow, but then I'd
    forget to comment)
  - Imagine you have three mounts containing a minix fs, how can you tell
    which one is the defective one?
  - The message says minix_bmap, while the patch suggests it's in
    block_to_path. Therefore I assume minix_bmap to have only random
    informational value.
  - Does block < 0 or block > $size make a difference?
  - the printk lacks the loglevel.
  - Assuming minix supports error handling, shouldn't it do something?
 
  I'd suggest a message saying something like "minix: Bad block address on
  device 08:15, needs fsck."

 Ok, do you like this slightly better?  It states the subsystem, the 
 function with the error, the block nr. in the case of a too-large block,
 and the block device on which the error occurred.

- How long is BDEVNAME_SIZE? Will it fit on the stack?
- Does it include the space for \0?

I assume you copied other users, and the other users will do it right (or
at least not terribly wrong:), but I can't dig through the code right now.

  Honestly minix.fsck
 doesn't handle the situation well either, so at this point I hesitate
 to recommend it in the print.  :)

*g*
-- 
Top 100 things you don't want the sysadmin to say:
79. What's this any key I'm supposed to press?


Re: Noatime vs relatime

2007-08-11 Thread Bodo Eggert
Rene Herman [EMAIL PROTECTED] wrote:

 I must say I've been wondering about relatime a bit as well. Are there
 actually users who do really want atime, but not badly enough to want real
 atime?

Anyone using /var/spool/mail.
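
The background, as a sketch: mail readers commonly decide "new mail" by comparing the mailbox's mtime with its atime, and relatime still updates atime in exactly the case they depend on. Both functions are simplified illustrations with made-up names, not code from any mail client or from the kernel.

```python
def has_new_mail(atime, mtime):
    """Classic mbox heuristic: the file was written since it was last read."""
    return mtime > atime

def relatime_updates_atime(atime, mtime, ctime):
    """relatime: update atime on read only if it is not newer than mtime/ctime."""
    return atime <= mtime or atime <= ctime
```

After a delivery (mtime newer than atime) the next read updates atime under relatime, so the "new mail" flag clears; with noatime it would stay set forever.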
-- 
Programming is an art form that fights back. 



Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Bodo Eggert
Jerry Jiang [EMAIL PROTECTED] wrote:
 On Wed, 8 Aug 2007 21:18:25 -0700 (PDT)
 On Wed, 8 Aug 2007, Chris Snook wrote:

  Some architectures currently do not declare the contents of an atomic_t to
  be volatile.  This causes confusion since atomic_read() might not actually
  read anything if an optimizing compiler re-uses a value stored in a
  register, which can break code that loops until something external changes
  the value of an atomic_t.
 
 I'd be *much* happier with atomic_read() doing the volatile instead.
 
 The fact is, volatile on data structures is a bug. It's a wart in the C
 language. It shouldn't be used.
 
 Why? It's a wart! Is it due to unclear C standard on volatile related point?
 
 Why the *volatile-accesses-in-code* is acceptable, does C standard make it
 clear?

http://lwn.net/Articles/233482/
-- 
Fun things to slip into your budget
Heisenberg Compensator upgrade kit



Re: Documentation files in html format?

2007-08-09 Thread Bodo Eggert
Jan Engelhardt [EMAIL PROTECTED] wrote:
 On Aug 9 2007 11:31, Stephen Hemminger wrote:

Since the network device documentation needs a rewrite, I was thinking
of using basic html format instead of just plain text. But since this would
be starting a new precedent for kernel documentation, it seemed
like a worthwhile topic for discussion.

Advantages of html:
  * basic formatting like lists, italics, etc
  * easier to integrate into other places and retain formatting
  * ability to link documents and to external sources easier

Downsides:
  * can become too formatted and unclear
 
 So only use <h1> to <h6>, <p>, <b>, <i>, <u>, <ul>, <ol> and <li>.

I don't think <b> and <i> should be used, instead you should use styles
(<span class="code"> etc). Things like <em> and <strong> should be OK,
if used consistently.

 Perhaps maybe <p class="block"> with .block{text-align:justify;}
 because that looks nice in general.

If you like that, you can say p {text-align:justify;} in the default
stylesheet. (And if people would override the alignment, class="block"
would be the wrong name).


BTW: You should not listen to those frames-are-evil guys. Doing so will only
force you to include all the navigation code in each page, and having to scroll
past a ton of navigation stuff was a major annoyance while trying to read
the selinux docs. Instead, you should e.g. provide an "up", "prev" and "next"
link on each page, and keep the rest in the navigation frame (if needed).
-- 
Top 100 things you don't want the sysadmin to say:
0. I just made an extra 2 meg of space in /,  I stripped /vmunix.
Oh, so that's why ps doesn't work.


Re: Documentation files in html format?

2007-08-09 Thread Bodo Eggert
On Thu, 9 Aug 2007, Jan Engelhardt wrote:
 On Aug 9 2007 14:34, Bodo Eggert wrote:

 I don't think <b> and <i> should be used, instead you should use styles
 (<span class="code"> etc).
 
 <b> does the same as <span style="font-weight: bold;">, and the latter is much
 more verbose for the same thing.

You should use neither. It's OK on homepages, but for documents, you should
be able to change the formatting using a stylesheet.

 Things like <em> and <strong> should be OK, if used consistently.
 
  Perhaps maybe <p class="block"> with .block{text-align:justify;}
  because that looks nice in general.
 
 If you like that, you can say p {text-align:justify;} in the default
 stylesheet. (And if people would override the alignment, class="block"
 would be the wrong name).
 
 Oh that comes because MS Office calls it Blocksatz in German ;-)

I missed that point. What I wanted to say was: If the user overrides
:justify, it would still be called justified, which would not be
justified.-)
-- 
A slipping gear could let your M203 grenade launcher fire when you
least expect it. That would make you quite unpopular in what's left of
your unit.
-Army's magazine of preventive maintenance.


Re: [PATCH V2] limit minixfs printks on corrupted dir i_size, CVE-2006-6058

2007-08-09 Thread Bodo Eggert
Eric Sandeen [EMAIL PROTECTED] wrote:

 This attempts to address CVE-2006-6058
 http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2006-6058
  
 first reported at http://projects.info-pull.com/mokb/MOKB-17-11-2006.html
 
 Essentially a corrupted minix dir inode reporting a very large
 i_size will loop for a very long time in minix_readdir, minix_find_entry,
 etc, because on EIO they just move on to try the next page.  This is
 under the BKL, printk-storming as well.  This can lock up the machine
 for a very long time.  Simply ratelimiting the printks gets things back
 under control.

 Index: linux-2.6.22-rc4/fs/minix/itree_v1.c
 ===
 --- linux-2.6.22-rc4.orig/fs/minix/itree_v1.c
 +++ linux-2.6.22-rc4/fs/minix/itree_v1.c
 @@ -27,7 +27,8 @@ static int block_to_path(struct inode *
  if (block < 0) {
  printk("minix_bmap: block<0\n");
  } else if (block >= (minix_sb(inode->i_sb)->s_max_size/BLOCK_SIZE)) {
 - printk("minix_bmap: block>big\n");
 + if (printk_ratelimit())
 + printk("minix_bmap: block>big\n");

Warning: I'm only looking at the patch.

You are supposed to print an error message for a user, not to write in a
chat window to a 1337 script kiddie. OK, you just matched the current style,
and your patch is IMHO OK for a quick security fix, but:

- Security fixes should be CCed to the security mailing list, shouldn't they?
  (It might be security@ or stable@, I'll remember tomorrow, but then I'd
   forget to comment)
- Imagine you have three mounts containing a minix fs, how can you tell which
  one is the defective one?
- The message says minix_bmap, while the patch suggests it's in
  block_to_path. Therefore I assume minix_bmap to have only random
  informational value.
- Does block < 0 or block > $size make a difference?
- The printk lacks the loglevel.
- Assuming minix supports error handling, shouldn't it do something?

I'd suggest a message saying something like "minix: Bad block address on
device 08:15, needs fsck."
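
Put together, rate limiting plus a message that names the device might look like this. It is a userspace sketch of the idea only: the class, the function and the default numbers are illustrative assumptions, not the kernel's printk_ratelimit() implementation.

```python
import time

class RateLimiter:
    """Allow at most `burst` events per `interval` seconds."""

    def __init__(self, interval=5.0, burst=10, clock=time.monotonic):
        self.interval = interval
        self.burst = burst
        self.clock = clock
        self.window_start = clock()
        self.printed = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.interval:
            self.window_start = now      # new window, counter starts over
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        return False


def report_bad_block(limiter, device, log=print):
    """Emit the suggested message unless the limiter says we are flooding."""
    if limiter.allow():
        log("minix: Bad block address on device %s, needs fsck." % device)
        return True
    return False
```

A corrupted directory that triggers the error thousands of times then produces a handful of lines per interval instead of a printk storm, and each line says which device is affected.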
-- 
Oops. My brain just hit a bad sector. 



Re: [RFC PATCH 1/4] pass open file to -setattr()

2007-08-09 Thread Bodo Eggert
Miklos Szeredi [EMAIL PROTECTED] wrote:

  This is needed to be able to correctly implement open-unlink-fsetattr
  semantics in some filesystem such as sshfs, without having to resort
  to silly-renaming.
 
 How do you plan to do that?
 
 Easy: the SFTP protocol has stateful opens and defines an FSTAT call.

Is it possible to reconnect without umounting? If yes, the unlinked files
would be lost in spite of being opened, wouldn't they?
-- 
Top 100 things you don't want the sysadmin to say:
11. Can you get VMS for this Sparc thingy?



Re: allow non root users to set io priority idle ?

2007-08-07 Thread Bodo Eggert
Andi Kleen [EMAIL PROTECTED] wrote:

 couldn't this be fixed by bumping idle tasks to middle while they hold a
 
 Usually to high.

Then use the lowest non-idle priority. The result will not be more b0rken
than nice -n 19.

 But it's all complicated and hasn't been done consistently
 (there are real time mutexes in the -rt kernel for example,
 but there are lots of other locks and they have higher overhead too)
 and it's unclear we really want to do all this complexity anyways.

If you have an rt application, it will block normal tasks like a normal
task will block idle tasks. You need to handle that situation anyway.

 Also as I said the problem could then still happen in user space
 which then would all need to be fixed to handle PI too.

Don't do that then. If I ask an idle priority task, I should not expect an
answer unless the system is idle. And besides that, I should not depend on
an answer at all, since the task might be stuck while reading an NFS file
from that smoking and burning server.

 In some cases the relationship is also not as simple as a single
 lock. And for IO handling it would be likely quite hard.

As long as IO works correctly while having realtime tasks, there is no
problem, and if it doesn't work, it needs to be fixed anyway.

 I personally always found idle priorities quite dubious because
 even if they worked reliable for the CPU they will clear your cache/
 load your memory controller and impact all other programs because
 of this. And for the disk they will cause additional seeks which are
 also very costly.

A very-low-normal-priority task would cause all these effects anyway, but
it would do so more frequently.
-- 
Top 100 things you don't want the sysadmin to say:
74. I remember the last time I saw it do that...



Re: Documentation for sysfs, hotplug, and firmware loading.

2007-08-05 Thread Bodo Eggert
On Mon, 23 Jul 2007, Rob Landley wrote:
 On Saturday 21 July 2007 8:14:41 am Bodo Eggert wrote:
  Greg KH [EMAIL PROTECTED] wrote:
   On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:
   I'm not trying to document /sys/devices.  I'm trying to document
   hotplug, populating /dev, and things like firmware loading that fall out
   of that. This requires use of sysfs, and I'm only trying to document as
   much of sysfs as you need to do that.
  
   Like I stated before, you do not need to even have sysfs mounted to have
   a dynamic /dev.
  
   And why do you need to document populating /dev dynamically?  udev
   already solves this problem for you, it's not like people are going off
   and reinventing udev for their own enjoyment would not at least look at
   how it solves this problem first.
 
  Turning your words around, you get: Whatever one of these programs does
  documents how dynamic devices should be handled. If this is true, any
  change that makes one of these programs break is a kernel bug.
 
  Besides that: How am I supposed to be able to correctly change udev if
  there is no document telling me what would work and what happens to
  work by accident?
 
 You aren't expected to.  Remember that udev and sysfs are written by the same 
 people, working together off-list.  They're free to break the exported data 
 format on a whim, because they write the code at both ends and fundamentally 
 they're talking to themselves.  They honestly say you can't expect a new 
 kernel to work with an old udev, and they say it with a straight face.  (To 
 me, this sounds like saying you can't expect a new kernel to work with an old 
 version of ps, because of /proc.)
 
 Documentation is a threat to this way of working, because it would impose 
 restrictions on them.  A spec is only of use if you introduce the radical 
 idea that the information exported by sysfs exists for some purpose _other_ 
 than simply to provide udev with information (and a specific version of udev 
 matched to that kernel version, at that).

And having no documentation is, as you can see in this thread, a threat to
open software. Having exactly one blob of software, and being left on your
own once you change a bit, is one step away from having a binary only module.

   To do otherwise would be foolish :)
 
  Some people like to fool around and create even smaller wheels.
  E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
  in order not to have an extra process running and free 0.2 % of my RAM.
 
 When I started looking at udev in 2005, it was a disaster.  My commentary at 
 the time is at http://lkml.org/lkml/2005/10/30/189 and the relevant bit is:

[...]

 And so I made mdev, a utility which populated /dev _with_ a config file in 
 7k.  
 Greg's upset I didn't just patch udev to remove libsysfs, remove the 
 duplicated klibc code, remove the gratuitous database, remove the 
 overcomplicated config file parser (with rules compiler), and so on.  They're 
 boggling that I could ever have been unhappy with the One True Project to 
 populate /dev.

I see, we agree on this point. Besides that, I like to see the steps to be
done, instead of having a letter sent to a voodoo doctor in Africa (called 
udev) and getting back a magic spell to be chanted on my system (unless 
he just pokes some voodoo doll).
-- 
Top 100 things you don't want the sysadmin to say:
87. Sorry, the new equipment didn't get budgetted.


Re: Documentation for sysfs, hotplug, and firmware loading.

2007-07-21 Thread Bodo Eggert
Greg KH [EMAIL PROTECTED] wrote:
 On Fri, Jul 20, 2007 at 08:21:39PM -0400, Rob Landley wrote:

 I'm not trying to document /sys/devices.  I'm trying to document hotplug,
 populating /dev, and things like firmware loading that fall out of that.
 This requires use of sysfs, and I'm only trying to document as much of sysfs
 as you need to do that.
 
 Like I stated before, you do not need to even have sysfs mounted to have
 a dynamic /dev.
 
 And why do you need to document populating /dev dynamically?  udev
 already solves this problem for you, it's not like people are going off
 and reinventing udev for their own enjoyment would not at least look at
 how it solves this problem first.

Turning your words around, you get: Whatever one of these programs does
documents how dynamic devices should be handled. If this is true, any
change that makes one of these programs break is a kernel bug.

Besides that: How am I supposed to be able to correctly change udev if
there is no document telling me what would work and what happens to
work by accident?

 To do otherwise would be foolish :)

Some people like to fool around and create even smaller wheels.
E.g. I'm changing the ACPI button driver to just call Ctrl_alt_del
in order not to have an extra process running and free 0.2 % of my RAM.

 Firmware loading is fine to document if you wish to do so.  But again,
 why?  We already have multiple userspace programs that provide this
 feature for them.  Perhaps you want to document how to add firmware to a
 system in order for these different programs to pick them up?

I once tried to install firmware for hotplug. Even finding the place where
I'm supposed to put it was harder than rewriting that *beep* from scratch,
but I could not rewrite it because I didn't have any documentation.
Even digging in that pile of wrapper scripts in order to debug that thing
was a nightmare. (Having a number of places where the firmware will be
expected in one of many versions and formats stored using one of many
filenames can drive you nuts.)
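
For illustration, a minimal sketch of what a firmware hotplug handler has to
do, assuming the sysfs loading/data handshake described in the kernel's
firmware class documentation. The function name load_firmware and the
SYSFS_ROOT and FW_DIR variables are hypothetical, and /lib/firmware is only
an assumed default; distributions have used other directories.

```shell
#!/bin/sh
# Hypothetical helper, not taken from any existing hotplug package.
# SYSFS_ROOT and FW_DIR are illustrative defaults that can be overridden.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}
FW_DIR=${FW_DIR:-/lib/firmware}

load_firmware() {
    # $1 = DEVPATH from the hotplug event
    # $2 = FIRMWARE, the file name the driver asked for
    dev="$SYSFS_ROOT/$1"
    img="$FW_DIR/$2"

    if [ ! -r "$img" ]; then
        # Tell the kernel to abort the request instead of waiting for a timeout.
        echo -1 > "$dev/loading"
        return 1
    fi

    echo 1 > "$dev/loading"     # announce the start of the transfer
    cat "$img" > "$dev/data"    # copy the image into the kernel
    echo 0 > "$dev/loading"     # signal that the image is complete
}
```

A handler would call this as load_firmware "$DEVPATH" "$FIRMWARE" when it
sees a firmware event with ACTION=add. The single FW_DIR is the point: the
search path is exactly the part that varied between systems.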

 Or perhaps you want to document how to add this kind of functionality to
 your kernel driver so that it can handle firmware loading by using the
 firmware interface that the kernel provides?

I suppose that's missing, too. Or scattered in a number of contradicting
and mostly outdated howtos across the internet.

 If you just want to document the hotplug/uevent api, then do just that.
 However I think you are overreaching with your scope here and getting
 mighty confused in the process.

In other words: Grasping sysfs is not a feasible task? If this is true,
how can anybody reliably use sysfs?
-- 
Top 100 things you don't want the sysadmin to say:
99. Shit!!

Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]


Re: [PATCH][RFC] 4K stacks default, not a debug thing any more...?

2007-07-19 Thread Bodo Eggert
Alan Cox [EMAIL PROTECTED] wrote:
 On Thu, 19 Jul 2007 03:33:58 +0200
 Andrea Arcangeli [EMAIL PROTECTED] wrote:

  8K stacks without IRQ stacks are not safer so I don't understand your
  comment ?
 
 Ouch, see the reports about 4k stack crashes. I agree they're not
 safe w/o irq stacks (like on x86-64), but they're generally safer.
 
 Still don't follow. How is exceeds stack space but less likely to be
 noticed safer.

If there is a tree in the forest, is it as likely to fall as the tree
that's being chopped in front of our eyes? It is, because each tree will
fall eventually, but you'd still not allow your kids to play on the
tree being chopped, but you'd probably allow them to climb that other tree
like all the other kids do.

The same applies to the stack: We don't know if or when we'll see all
possible interrupts fire and kill the 8K stack, but we know for sure the
8K stack has been climbed for years and there is an axe on that 4K stack.
So where do you send the users to play?
-- 
What's worse than a Male Chauvinist Pig?
A woman that won't do what she's told.

Friß, Spammer: [EMAIL PROTECTED] [EMAIL PROTECTED]


Re: [PATCH 0/3][try 1] init: enable system-on-initramfs

2007-07-19 Thread Bodo Eggert
On Wed, 18 Jul 2007, Rob Landley wrote:
 On Friday 13 July 2007 2:56:00 pm Bodo Eggert wrote:

  I toyed with setting up a diskless system in initramfs. In the process, I
  came across some things:
 
  1)  There is no way to have the kernel not mount a filesystem,
  unless you use /init or rdinit=.
 
 Er, yes.  By design.
 
 The kernel has to run an init program in order to hand off control to 
 userspace.  In initramfs, this is defined as /init.  It looks in exactly one 
 documented place.
 
 The older root= mechanism fell back to a half-dozen places (eventually trying 
 things like /bin/sh if it couldn't find anything else).  This is because 
 there wasn't a documented standard for what should be executed, like there is 
 with initramfs.

Ever since I started using linux in 1997, the first program to run from an
installed system was /sbin/init. (I might think about removing the other 
paths.)

The ramfs' /init was intended for system setup, which is a separate job.
It is not intended to be the program running the system. Mixing those two
up just does not feel right. Setting root= in order to change the root
directory is much more natural.

  1a) In the process of writing these patches, I found prepare_namespace not
  to be called if /init is present. prepare_namespace will call
  security_sb_post_mountroot after mounting the root fs. I did not yet
  see a way to call this from /init, and grepping kinit for security did
  not help, either.
 
 I don't use selinux.  I'm neither a Fortune 500 company nor the NSA.

That's no reason for keeping bugs in that part.

 However, if you don't trust your own initramfs, where everything starts out 
 running as root, you have bigger problems.

Using that argument, you can deduce that nobody would need selinux at all.
Obviously some people disagree, and maybe some of them are no fools,
therefore we should try to DTRT for them and do the callback when it's
supposed to happen.

BTW: The problems start as soon as you have to do reflections on trusting 
trust.

 This is probably a bug, but using the features of this patchset, you'll
 avoid hitting it. Therefore this patchset does nothing about that.
 
  2)  If you want to use tmpfs, you need a script which essentially
  duplicates the work the kernel just did: Mount the root fs, unpack or move
  the files.
 
 mkdir sub
 mount -t tmpfs sub sub
 cp -ax / sub
 switch_root sub /init-stage-2
 
 I haven't tried it but it really doesn't sound like brain surgery.  If your 
 switch_root is statically linked, you can use mv instead of cp.

Why should I have to do that if I can do the right thing in the first
place, at a cost that is negative in every respect?

  Using tmpfs instead for the first root mount is as cheap as 
  using ramfs, as long as tmpfs is used anyway (and most likely it is).
 
 *shrug*  I don't object to doing so, but I've heard nebulous technical 
 objections from people who know more about the implementation details of 
 tmpfs than I do.

Obviously these problems were solved.

  2a) I figured if you prepared the root fs to contain a running system, you
  would probably also set up a runnable system on it.
 
 *scratches head*
 
 That's a tautology, right?

You may also use the same kernel with the same initramfs in order to start
a classic system, just by changing root=. This may be nice for rescue
systems. Of course, after applying the third patch, you may also deselect
that feature.

  Therefore I changed 
  the default to boot from tmpfs if there was no /init nor a root=
  option. (If there is a /init, it will be executed as usual.)
 
 I have no idea what you just said.  If there's an /init, we boot from it.  If 
 there's no /init, they just violated the spec and we don't know what to boot 
 from.

These patches change the spec in order to support system-on-rootfs:

If there is an init, it will run. No change from the current situation 
ever, at least if not using rdev.

If you use root=rootfs, a system-on-rootfs will run.

If you use rootfs=tmpfs, root= will default to rootfs, and I did it in a
way that disables the obsolete rdev.
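
The three rules above can be sketched as a small decision function. This is
only a model of the behaviour as described in this mail, not the patch
itself: the name choose_boot and the output strings are hypothetical, and it
assumes that rootfs=tmpfs supplies a default only when no root= was given.

```shell
# Hypothetical model of the boot decision described above; not kernel code.
# $1: "yes" if the initramfs contains /init, anything else otherwise
# $2: value of root= (empty if not given)
# $3: value of rootfs= (empty if not given)
choose_boot() {
    has_init=$1
    root=$2
    rootfs=$3

    # An /init always wins, exactly as before the patches.
    if [ "$has_init" = yes ]; then
        echo "run /init"
        return 0
    fi

    # Assumption: rootfs=tmpfs only sets a default when root= is absent.
    if [ -z "$root" ] && [ "$rootfs" = tmpfs ]; then
        root=rootfs
    fi

    if [ "$root" = rootfs ]; then
        echo "run system on rootfs"
    else
        echo "mount ${root:-default root device} and run its init"
    fi
}
```

For example, choose_boot no "" tmpfs selects the system on rootfs, while
choose_boot no /dev/sda1 tmpfs still mounts /dev/sda1 as a classic root.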

 They made an initramfs.  They constructed it, explicitly, for the Linux 
 kernel, and /init is what the kernel will try to boot.  I documented this 
 years ago.  If they chose not to put an /init, then they didn't want us to 
 boot from initramfs.  (Maybe they're supplying firmware and an /sbin/hotplug 
 for a statically linked device.  I dunno.)

Not booting /using/ an initramfs differs from not booting /from/ an
initramfs. Mixing them works by accident.

  Unfortunately the way I do it, this will override the rdev setting, but
  that should be OK, since rdev is dead. Isn't it?
 
  3)  While I was at it, I figured I would not need most of the init/mount*
  code anymore. Therefore I made patch 3, which ifdefs it out as far as
  possible while still aiming for a small change.
 
  Patch 1 adds the capability to use root=rootfs.
 
 I've been using
