Re: [PATCH] kvm needs menu structure

2006-12-11 Thread Avi Kivity

Randy Dunlap wrote:

From: Randy Dunlap <[EMAIL PROTECTED]>

KVM config items need to be inside a menu structure instead of
dangling off of Device Drivers.

  


A similar patch (kvm-put-kvm-in-a-new-virtualization-menu.patch) is 
already queued in -mm.


Andrew, Randy's patch shouldn't be applied, unless there's strong 
feeling for a doubly-nested menu.


--
error compiling committee.c: too many arguments to function

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] rfkill - Add support for input key to control wireless radio

2006-12-11 Thread Ivo Van Doorn

Hi,


> > > > > 2 - Hardware key that does not control the hardware radio and does
> > > > > not report anything to userspace
> > > >
> > > > Kind of uninteresting button ;)
> > >
> > > And this is the button that rfkill was originally designed for.
> > > Laptops with integrated WiFi cards from Ralink have a hardware button
> > > that doesn't send anything to userspace (unless the ACPI event is read)
> > > and does not directly control the radio itself.
> > >
> >
> > So what does such a button do? I am confused here...
>
> Without a handler like rfkill, it does nothing besides toggling a bit in a
> register.
> The Ralink chipsets have a couple of registers that represent the state of
> that key.
> Besides that, there are no notifications to the userspace nor does it
> directly control the radio.
> That is where rfkill came in with the toggle handler that will listen to the
> register to check if the key has been pressed and properly process the key
> event.

In this case the driver can make the button state available to userspace so
this is really a type 2) driver as far as I can see. The fact that the button
is not reported to userspace yet should not get in the way of classifying
it.


I was indeed considering it as a type 2) device.
I am currently working on revising the rfkill driver to work as you suggested,
so I hope to have a new patch ready for you soon.

Ivo


[take27 1/8] kevent: Description.

2006-12-11 Thread Evgeniy Polyakov

Description.


diff --git a/Documentation/kevent.txt b/Documentation/kevent.txt
new file mode 100644
index 0000000..2e03a3f
--- /dev/null
+++ b/Documentation/kevent.txt
@@ -0,0 +1,240 @@
+Description.
+
+int kevent_init(struct kevent_ring *ring, unsigned int ring_size, 
+   unsigned int flags);
+
+ring_size - size of the ring buffer in events 
+ring - pointer to allocated ring buffer
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value: kevent control file descriptor or negative error value.
+
+ struct kevent_ring
+ {
+   unsigned int ring_kidx, ring_over;
+   struct ukevent event[0];
+ }
+
+ring_kidx - index in the ring buffer where the kernel will put new events 
+   when kevent_wait() or kevent_get_events() is called 
+ring_over - number of overflows of ring_uidx that have happened since the start.
+   The overflow counter prevents the situation where two threads 
+   are going to free the same events, but one of them was scheduled 
+   away for so long that the ring indexes wrapped, so when that 
+   thread is awakened, it frees events other than those 
+   it is supposed to free.
+
+Example userspace code (ring_buffer.c) can be found on project's homepage.
+
+Each kevent syscall can be a so-called cancellation point in glibc, i.e. when 
+a thread has been cancelled in a kevent syscall, it can be safely removed 
+and no events will be lost, since each syscall (kevent_wait() or 
+kevent_get_events()) will copy events into a special ring buffer, accessible 
+from other threads or even processes (if shared memory is used).
+
+When a kevent is removed (not dequeued when it is ready, but just removed), 
+even if it was ready, it is not copied into the ring buffer, since if it is 
+removed, no one cares about it (otherwise the user would wait until it becomes 
+ready and get it the usual way using kevent_get_events() or kevent_wait()) 
+and thus there is no need to copy it to the ring buffer.
+
+---
+
+
+int kevent_ctl(int fd, unsigned int cmd, unsigned int num,
+   struct ukevent *arg);
+
+fd - is the file descriptor referring to the kevent queue to manipulate. 
+It is created by opening "/dev/kevent" char device, which is created with 
+dynamic minor number and major number assigned for misc devices. 
+
+cmd - is the requested operation. It can be one of the following:
+KEVENT_CTL_ADD - add event notification 
+KEVENT_CTL_REMOVE - remove event notification 
+KEVENT_CTL_MODIFY - modify existing notification 
+KEVENT_CTL_READY - mark existing events as ready; if the number of events is
+   zero, it just wakes up a thread parked in the syscall
+
+num - number of struct ukevent in the array pointed to by arg 
+arg - array of struct ukevent
+
+Return value: 
+ number of events processed or negative error value.
+
+When called, kevent_ctl will carry out the operation specified in the 
+cmd parameter.
+---
+
+ int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr, 
+   struct timespec timeout, struct ukevent *buf, unsigned flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+min_nr - minimum number of completed events that kevent_get_events will block 
+waiting for 
+max_nr - number of struct ukevent in buf 
+timeout - time to wait before returning less than min_nr 
+ events. If this is -1, then wait forever. 
+buf - pointer to an array of struct ukevent. 
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied or negative error value.
+
+kevent_get_events will wait up to timeout for at least min_nr completed 
+events, copying completed struct ukevents to buf and deleting any 
+KEVENT_REQ_ONESHOT event requests. In nonblocking mode it returns as many 
+events as possible, but not more than max_nr. In blocking mode it waits until 
+the timeout expires or at least min_nr events are ready.
+
+This function copies events into the ring buffer if it was initialized; if the
+ring buffer is full, the KEVENT_RET_COPY_FAILED flag is set in the ret_flags field.
+---
+
+ int kevent_wait(int ctl_fd, unsigned int num, unsigned int old_uidx, 
+   struct timespec timeout, unsigned int flags);
+
+ctl_fd - file descriptor referring to the kevent queue 
+num - number of processed kevents 
+old_uidx - the last index user is aware of
+timeout - time to wait until there is free space in kevent queue
+flags - various flags, see KEVENT_FLAGS_* definitions.
+
+Return value:
+ number of events copied into ring buffer or negative error value.
+
+This syscall waits until either the timeout expires or at least one event becomes 
+ready. It also copies events into the special ring buffer. If the ring buffer is full,
+it waits until there are ready events and then returns.
+If kevent is one-shot kevent it is 

[take27 5/8] kevent: Timer notifications.

2006-12-11 Thread Evgeniy Polyakov

Timer notifications.

Timer notifications can be used for fine grained per-process time 
management, since interval timers are very inconvenient to use, 
and they are limited.

This subsystem uses high-resolution timers.
id.raw[0] is used as number of seconds
id.raw[1] is used as number of nanoseconds
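
As a rough illustration of the id.raw[] convention above, the sketch below
splits an interval into a seconds/nanoseconds pair and recombines it. It is
plain userspace C, not code from this patch: struct demo_kevent_id and the
demo_* helpers are hypothetical stand-ins (the real id layout comes from the
kevent headers, which are not part of this message), and the 32-bit width of
raw[] is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the id field of struct ukevent; the real
 * definition lives in the kevent headers, not in this message. */
struct demo_kevent_id {
	uint32_t raw[2];
};

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Split an interval given in nanoseconds into the pair described above:
 * raw[0] holds whole seconds, raw[1] holds the remaining nanoseconds. */
static struct demo_kevent_id demo_timer_id(uint64_t interval_ns)
{
	struct demo_kevent_id id;

	/* Assumption of this sketch: the seconds part fits in 32 bits. */
	assert(interval_ns / DEMO_NSEC_PER_SEC <= UINT32_MAX);

	id.raw[0] = (uint32_t)(interval_ns / DEMO_NSEC_PER_SEC);
	id.raw[1] = (uint32_t)(interval_ns % DEMO_NSEC_PER_SEC);
	return id;
}

/* Recombine the pair into a single nanosecond count. */
static uint64_t demo_timer_ns(struct demo_kevent_id id)
{
	return (uint64_t)id.raw[0] * DEMO_NSEC_PER_SEC + id.raw[1];
}
```

For example, demo_timer_id(1500000000ULL) yields raw[0] = 1 and
raw[1] = 500000000, i.e. a 1.5 second period.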

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>

diff --git a/kernel/kevent/kevent_timer.c b/kernel/kevent/kevent_timer.c
new file mode 100644
index 0000000..df93049
--- /dev/null
+++ b/kernel/kevent/kevent_timer.c
@@ -0,0 +1,112 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct kevent_timer
+{
+   struct hrtimer  ktimer;
+   struct kevent_storage   ktimer_storage;
+   struct kevent   *ktimer_event;
+};
+
+static int kevent_timer_func(struct hrtimer *timer)
+{
+   struct kevent_timer *t = container_of(timer, struct kevent_timer, ktimer);
+   struct kevent *k = t->ktimer_event;
+
+   kevent_storage_ready(&t->ktimer_storage, NULL, KEVENT_MASK_ALL);
+   hrtimer_forward(timer, timer->base->softirq_time,
+      ktime_set(k->event.id.raw[0], k->event.id.raw[1]));
+   return HRTIMER_RESTART;
+}
+
+static struct lock_class_key kevent_timer_key;
+
+static int kevent_timer_enqueue(struct kevent *k)
+{
+   int err;
+   struct kevent_timer *t;
+
+   t = kmalloc(sizeof(struct kevent_timer), GFP_KERNEL);
+   if (!t)
+      return -ENOMEM;
+
+   hrtimer_init(&t->ktimer, CLOCK_MONOTONIC, HRTIMER_REL);
+   t->ktimer.expires = ktime_set(k->event.id.raw[0], k->event.id.raw[1]);
+   t->ktimer.function = kevent_timer_func;
+   t->ktimer_event = k;
+
+   err = kevent_storage_init(&t->ktimer, &t->ktimer_storage);
+   if (err)
+      goto err_out_free;
+   lockdep_set_class(&t->ktimer_storage.lock, &kevent_timer_key);
+
+   err = kevent_storage_enqueue(&t->ktimer_storage, k);
+   if (err)
+      goto err_out_st_fini;
+
+   hrtimer_start(&t->ktimer, t->ktimer.expires, HRTIMER_REL);
+
+   return 0;
+
+err_out_st_fini:
+   kevent_storage_fini(&t->ktimer_storage);
+err_out_free:
+   kfree(t);
+
+   return err;
+}
+
+static int kevent_timer_dequeue(struct kevent *k)
+{
+   struct kevent_storage *st = k->st;
+   struct kevent_timer *t = container_of(st, struct kevent_timer, ktimer_storage);
+
+   hrtimer_cancel(&t->ktimer);
+   kevent_storage_dequeue(st, k);
+   kfree(t);
+
+   return 0;
+}
+
+static int kevent_timer_callback(struct kevent *k)
+{
+   k->event.ret_data[0] = jiffies_to_msecs(jiffies);
+   return 1;
+}
+
+static int __init kevent_init_timer(void)
+{
+   struct kevent_callbacks tc = {
+      .callback = &kevent_timer_callback,
+      .enqueue = &kevent_timer_enqueue,
+      .dequeue = &kevent_timer_dequeue};
+
+   return kevent_add_callbacks(&tc, KEVENT_TIMER);
+}
+module_init(kevent_init_timer);
+



[take27 3/8] kevent: poll/select() notifications.

2006-12-11 Thread Evgeniy Polyakov

poll/select() notifications.

This patch includes generic poll/select notifications.
kevent_poll works similarly to epoll and has the same issues (the callback
is invoked not from the internal state machine of the caller, but through
process wakeup, a lot of allocations and so on).

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>

diff --git a/fs/file_table.c b/fs/file_table.c
index bc35a40..0805547 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -119,6 +120,7 @@ struct file *get_empty_filp(void)
f->f_uid = tsk->fsuid;
f->f_gid = tsk->fsgid;
eventpoll_init_file(f);
+   kevent_init_file(f);
/* f->f_version: 0 */
return f;
 
@@ -164,6 +166,7 @@ void fastcall __fput(struct file *file)
 * in the file cleanup chain.
 */
eventpoll_release(file);
+   kevent_cleanup_file(file);
locks_remove_flock(file);
 
if (file->f_op && file->f_op->release)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5baf3a1..8bbf3a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -276,6 +276,7 @@ extern int dir_notify_enable;
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -586,6 +587,10 @@ struct inode {
struct mutex inotify_mutex;  /* protects the watches list */
 #endif
 
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+   struct kevent_storage   st;
+#endif
+
unsigned long   i_state;
unsigned long   dirtied_when;   /* jiffies of first dirtying */
 
@@ -739,6 +744,9 @@ struct file {
struct list_head f_ep_links;
spinlock_t  f_ep_lock;
 #endif /* #ifdef CONFIG_EPOLL */
+#ifdef CONFIG_KEVENT_POLL
+   struct kevent_storage   st;
+#endif
struct address_space *f_mapping;
 };
 extern spinlock_t files_lock;
diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c
new file mode 100644
index 0000000..11dbe25
--- /dev/null
+++ b/kernel/kevent/kevent_poll.c
@@ -0,0 +1,232 @@
+/*
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static kmem_cache_t *kevent_poll_container_cache;
+static kmem_cache_t *kevent_poll_priv_cache;
+
+struct kevent_poll_ctl
+{
+   struct poll_table_struct   pt;
+   struct kevent              *k;
+};
+
+struct kevent_poll_wait_container
+{
+   struct list_head           container_entry;
+   wait_queue_head_t          *whead;
+   wait_queue_t               wait;
+   struct kevent              *k;
+};
+
+struct kevent_poll_private
+{
+   struct list_head           container_list;
+   spinlock_t                 container_lock;
+};
+
+static int kevent_poll_enqueue(struct kevent *k);
+static int kevent_poll_dequeue(struct kevent *k);
+static int kevent_poll_callback(struct kevent *k);
+
+static int kevent_poll_wait_callback(wait_queue_t *wait,
+   unsigned mode, int sync, void *key)
+{
+   struct kevent_poll_wait_container *cont =
+   container_of(wait, struct kevent_poll_wait_container, wait);
+   struct kevent *k = cont->k;
+
+   kevent_storage_ready(k->st, NULL, KEVENT_MASK_ALL);
+   return 0;
+}
+
+static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead,
+   struct poll_table_struct *poll_table)
+{
+   struct kevent *k =
+   container_of(poll_table, struct kevent_poll_ctl, pt)->k;
+   struct kevent_poll_private *priv = k->priv;
+   struct kevent_poll_wait_container *cont;
+   unsigned long flags;
+
+   cont = kmem_cache_alloc(kevent_poll_container_cache, GFP_KERNEL);
+   if (!cont) {
+      kevent_break(k);
+      return;
+   }
+
+   cont->k = k;
+   init_waitqueue_func_entry(&cont->wait, kevent_poll_wait_callback);
+   cont->whead = whead;
+
+   spin_lock_irqsave(&priv->container_lock, flags);
+   list_add_tail(&cont->container_entry, &priv->container_list);
+   spin_unlock_irqrestore(&priv->container_lock, flags);
+
+   add_wait_queue(whead, &cont->wait);
+}
+
+static int kevent_poll_enqueue(struct kevent *k)
+{
+   struct file *file;
+   int err;
+   unsigned int revents;
+   unsigned long flags;
+   struct 

[take27 0/8] kevent: Generic event handling mechanism.

2006-12-11 Thread Evgeniy Polyakov

Generic event handling mechanism.

Kevent is a generic subsystem which allows handling of event notifications.
It supports both level and edge triggered events. It is similar to
poll/epoll in some cases, but it is more scalable, it is faster and
allows working with essentially any kind of events.

Events are provided into kernel through control syscall and can be read
back through ring buffer or using usual syscalls.
Kevent update (i.e. readiness switching) happens directly from internals
of the appropriate state machine of the underlying subsystem (like
network, filesystem, timer or any other).
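
To picture the ring-buffer read path mentioned above, here is a small
userspace sketch of a ring with a producer index and a wrap counter, loosely
modelled on the ring_kidx/ring_over fields described in
Documentation/kevent.txt (patch 1/8). Everything named demo_* is hypothetical,
and the staleness check is only one assumed way such a counter could be used;
it is not taken from the kevent implementation.

```c
#include <assert.h>

#define DEMO_RING_SIZE 4u

struct demo_ring {
	unsigned int kidx;                   /* next slot the producer fills */
	unsigned int over;                   /* how many times kidx wrapped */
	unsigned int event[DEMO_RING_SIZE];  /* payload stand-in */
};

/* Producer side: store one event, advance the index, count wraps. */
static void demo_ring_put(struct demo_ring *r, unsigned int data)
{
	assert(r->kidx < DEMO_RING_SIZE);
	r->event[r->kidx] = data;
	if (++r->kidx == DEMO_RING_SIZE) {
		r->kidx = 0;
		r->over++;
	}
}

/* Linear position of an (overflow count, index) pair. */
static unsigned long long demo_position(unsigned int over, unsigned int idx)
{
	return (unsigned long long)over * DEMO_RING_SIZE + idx;
}

/* Consumer side: a position remembered earlier is still usable only if
 * the producer has not advanced more than one full ring past it. A
 * thread that slept through a wrap sees its snapshot rejected instead
 * of touching slots that now hold different events. */
static int demo_snapshot_valid(const struct demo_ring *r,
			       unsigned int over, unsigned int idx)
{
	unsigned long long now = demo_position(r->over, r->kidx);
	unsigned long long then = demo_position(over, idx);

	return then <= now && now - then <= DEMO_RING_SIZE;
}
```

With a four-slot ring, a snapshot taken at index 0 is still valid after three
events but is rejected after six, because the slots it refers to have been
reused by then.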

Homepage:
http://tservice.net.ru/~s0mbre/old/?section=projects=kevent

Documentation page:
http://linux-net.osdl.org/index.php/Kevent

Consider for inclusion.

Changes from 'take26' patchset:
 * made kevent visible in config only in case of embedded setup.
 * added comment about KEVENT_MAX number.
 * spell fix.

Changes from 'take25' patchset:
 * use timespec as timeout parameter.
 * added high-resolution timer to handle absolute timeouts.
 * added flags to waiting and initialization syscalls.
 * kevent_commit() has new_uidx parameter.
 * kevent_wait() has old_uidx parameter, which, if not equal to u->uidx,
   results in immediate wakeup (useful for the case when entries
   are added asynchronously from kernel (not supported for now)).
 * added interface to mark any event as ready.
 * event POSIX timers support.
 * return -ENOSYS if there is no registered event type.
 * provided file descriptor must be checked for fifo type (spotted by Eric
   Dumazet).
 * signal notifications.
 * documentation update.
 * lighttpd patch updated (the latest benchmarks with lighttpd patch can be
   found in blog).

Changes from 'take24' patchset:
 * new (old (new)) ring buffer implementation with kernel and user indexes.
 * added initialization syscall instead of opening /dev/kevent
 * kevent_commit() syscall to commit ring buffer entries
 * changed KEVENT_REQ_WAKEUP_ONE flag to KEVENT_REQ_WAKEUP_ALL, kevent wakes
   only first thread always if that flag is not set
 * KEVENT_REQ_ALWAYS_QUEUE flag. If set, kevent will be queued into ready queue
   instead of copying back to userspace when kevent is ready immediately when
   it is added.
 * lighttpd patch (Hail! Although nothing really outstanding compared to epoll)

Changes from 'take23' patchset:
 * kevent PIPE notifications
 * KEVENT_REQ_LAST_CHECK flag, which allows to perform last check at
   dequeueing time
 * fixed poll/select notifications (were broken due to tree manipulations)
 * made Documentation/kevent.txt look nice in 80-col terminal
 * fix for copy_to_user() failure report for the first kevent (Andrew Morton)
 * minor function renames

Changes from 'take22' patchset:
 * new ring buffer implementation in process' memory
 * wakeup-one-thread flag
 * edge-triggered behaviour

Changes from 'take21' patchset:
 * minor cleanups (different return values, removed unneeded variables,
   whitespaces and so on)
 * fixed bug in kevent removal in case when kevent being removed
   is the same as overflow_kevent (spotted by Eric Dumazet)

Changes from 'take20' patchset:
 * new ring buffer implementation
 * removed artificial limit on possible number of kevents

Changes from 'take19' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take18' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take17' patchset:
 * Use RB tree instead of hash table.
   At least for a web server, frequency of addition/deletion of new kevent
   is comparable with number of search access, i.e. most of the time events
   are added, accessed only a couple of times and then removed, so it justifies
   RB tree usage over AVL tree, since the latter does have much slower deletion
   time (max O(log(N)) compared to 3 ops),
   although faster search time (1.44*O(log(N)) vs. 2*O(log(N))).
   So for kevents I use RB tree for now and later, when my AVL tree
   implementation is ready, it will be possible to compare them.
 * Changed readiness check for socket notifications.

With both above changes it is possible to achieve more than 3380 req/second
compared to 2200, sometimes 2500 req/second for epoll() for trivial web-server
and httperf client on the same hardware.
It is possible that above kevent limit is due to maximum allowed kevents in a
time limit, which is 4096 events.

Changes from 'take16' patchset:
 * misc cleanups (__read_mostly, const ...)
 * created special macro which is used for mmap size (number of pages)
   calculation
 * export kevent_socket_notify(), since it is used in network protocols which
   can be built as modules

[take27 7/8] kevent: Signal notifications.

2006-12-11 Thread Evgeniy Polyakov

Signal notifications.

This type of notification allows signals to be delivered through the kevent
queue. An example application, signal.c, can be found on the project homepage.

If the KEVENT_SIGNAL_NOMASK bit is set in the raw_u64 id, then the signal will
be delivered only through the queue; otherwise both delivery types are used:
the old one, through an update of the mask of pending signals, and the queue.

If a signal is delivered only through the kevent queue, the mask of pending
signals is not updated at all, which is equivalent to putting the signal into
the blocked mask, but with delivery of that signal through the kevent queue.
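
The two delivery modes can be sketched in userspace roughly as follows. This
is an illustration only: the demo_* names, the position of the NOMASK bit and
the value of the marker bit are all assumptions made for the example, and the
request is reduced to a bare 64-bit id whose low 32 bits carry the signal
number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SIGNAL_NOMASK (1ULL << 63)  /* hypothetical bit in the 64-bit id */
#define DEMO_QUEUE_ONLY    0x80000000u   /* hypothetical marker in the task word */

struct demo_task {
	uint32_t kevent_signals;  /* signal being delivered, plus marker bit */
	unsigned int queued;      /* events handed to the queue so far */
};

/* Run for each registered request when a signal arrives: returns 1 if
 * the request matches, and records whether it asked for queue-only
 * delivery. */
static int demo_signal_callback(struct demo_task *tsk, uint64_t request_id)
{
	uint32_t wanted = (uint32_t)request_id;  /* low 32 bits: signal number */
	uint32_t sig = tsk->kevent_signals & ~DEMO_QUEUE_ONLY;

	if (wanted != sig)
		return 0;
	if (request_id & DEMO_SIGNAL_NOMASK)
		tsk->kevent_signals |= DEMO_QUEUE_ONLY;
	tsk->queued++;
	return 1;
}

/* Returns nonzero when the usual pending-mask update should be skipped,
 * i.e. when some matching request asked for queue-only delivery. */
static int demo_signal_notify(struct demo_task *tsk, const uint64_t *requests,
			      size_t nr, uint32_t sig)
{
	size_t i;

	assert((sig & DEMO_QUEUE_ONLY) == 0);
	tsk->kevent_signals = sig;
	for (i = 0; i < nr; i++)
		demo_signal_callback(tsk, requests[i]);
	return (tsk->kevent_signals & DEMO_QUEUE_ONLY) != 0;
}
```

A request registered as 10 | DEMO_SIGNAL_NOMASK makes demo_signal_notify()
return nonzero for signal 10, so the caller would leave the pending mask
alone; a request registered as plain 10 is queued as well but lets the
ordinary delivery proceed.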

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>


diff --git a/include/linux/sched.h b/include/linux/sched.h
index fc4a987..ef38a3c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -80,6 +80,7 @@ struct sched_param {
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -1013,6 +1014,10 @@ struct task_struct {
 #ifdef CONFIG_TASK_DELAY_ACCT
struct task_delay_info *delays;
 #endif
+#ifdef CONFIG_KEVENT_SIGNAL
+   struct kevent_storage st;
+   u32 kevent_signals;
+#endif
 };
 
 static inline pid_t process_group(struct task_struct *tsk)
diff --git a/kernel/fork.c b/kernel/fork.c
index 1c999f3..e5b5b14 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -115,6 +116,9 @@ void __put_task_struct(struct task_struct *tsk)
WARN_ON(atomic_read(&tsk->usage));
WARN_ON(tsk == current);
 
+#ifdef CONFIG_KEVENT_SIGNAL
+   kevent_storage_fini(&tsk->st);
+#endif
security_task_free(tsk);
free_uid(tsk->user);
put_group_info(tsk->group_info);
@@ -1121,6 +1125,10 @@ static struct task_struct *copy_process(unsigned long clone_flags,
if (retval)
goto bad_fork_cleanup_namespace;
 
+#ifdef CONFIG_KEVENT_SIGNAL
+   kevent_storage_init(p, &p->st);
+#endif
+
p->set_child_tid = (clone_flags & CLONE_CHILD_SETTID) ? child_tidptr : NULL;
/*
 * Clear TID on mm_release()?
diff --git a/kernel/kevent/kevent_signal.c b/kernel/kevent/kevent_signal.c
new file mode 100644
index 0000000..0edd2e4
--- /dev/null
+++ b/kernel/kevent/kevent_signal.c
@@ -0,0 +1,92 @@
+/*
+ * kevent_signal.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int kevent_signal_callback(struct kevent *k)
+{
+   struct task_struct *tsk = k->st->origin;
+   int sig = k->event.id.raw[0];
+   int ret = 0;
+
+   if (sig == tsk->kevent_signals)
+   ret = 1;
+
+   if (ret && (k->event.id.raw_u64 & KEVENT_SIGNAL_NOMASK))
+   tsk->kevent_signals |= 0x8000;
+
+   return ret;
+}
+
+int kevent_signal_enqueue(struct kevent *k)
+{
+   int err;
+
+   err = kevent_storage_enqueue(&current->st, k);
+   if (err)
+   goto err_out_exit;
+
+   if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE) {
+   kevent_requeue(k);
+   err = 0;
+   } else {
+   err = k->callbacks.callback(k);
+   if (err)
+   goto err_out_dequeue;
+   }
+
+   return err;
+
+err_out_dequeue:
+   kevent_storage_dequeue(k->st, k);
+err_out_exit:
+   return err;
+}
+
+int kevent_signal_dequeue(struct kevent *k)
+{
+   kevent_storage_dequeue(k->st, k);
+   return 0;
+}
+
+int kevent_signal_notify(struct task_struct *tsk, int sig)
+{
+   tsk->kevent_signals = sig;
+   kevent_storage_ready(&tsk->st, NULL, KEVENT_SIGNAL_DELIVERY);
+   return (tsk->kevent_signals & 0x8000);
+}
+
+static int __init kevent_init_signal(void)
+{
+   struct kevent_callbacks sc = {
+      .callback = &kevent_signal_callback,
+      .enqueue = &kevent_signal_enqueue,
+      .dequeue = &kevent_signal_dequeue};
+
+   return kevent_add_callbacks(&sc, KEVENT_SIGNAL);
+}
+module_init(kevent_init_signal);
diff --git a/kernel/signal.c b/kernel/signal.c
index fb5da6d..d3d3594 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -703,6 +704,9 @@ static int 

[take27 4/8] kevent: Socket notifications.

2006-12-11 Thread Evgeniy Polyakov

Socket notifications.

This patch includes socket send/recv/accept notifications.
With a trivial web server based on kevent and these features
instead of epoll, its performance increased more than noticeably.
More details about various benchmarks and the server itself
(evserver_kevent.c) can be found on the project's homepage.

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>

diff --git a/fs/inode.c b/fs/inode.c
index ada7643..2740617 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -164,12 +165,18 @@ static struct inode *alloc_inode(struct super_block *sb)
}
inode->i_private = 0;
inode->i_mapping = mapping;
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+   kevent_storage_init(inode, &inode->st);
+#endif
}
return inode;
 }
 
 void destroy_inode(struct inode *inode) 
 {
+#if defined CONFIG_KEVENT_SOCKET || defined CONFIG_KEVENT_PIPE
+   kevent_storage_fini(&inode->st);
+#endif
BUG_ON(inode_has_buffers(inode));
security_inode_free(inode);
if (inode->i_sb->s_op->destroy_inode)
diff --git a/include/net/sock.h b/include/net/sock.h
index edd4d73..d48ded8 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -48,6 +48,7 @@
 #include 
 #include   /* struct sk_buff */
 #include 
+#include 
 
 #include 
 
@@ -450,6 +451,21 @@ static inline int sk_stream_memory_free(struct sock *sk)
 
 extern void sk_stream_rfree(struct sk_buff *skb);
 
+struct socket_alloc {
+   struct socket socket;
+   struct inode vfs_inode;
+};
+
+static inline struct socket *SOCKET_I(struct inode *inode)
+{
+   return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
+}
+
+static inline struct inode *SOCK_INODE(struct socket *socket)
+{
+   return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
+}
+
 static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk)
 {
skb->sk = sk;
@@ -477,6 +493,7 @@ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb)
sk->sk_backlog.tail = skb;
}
skb->next = NULL;
+   kevent_socket_notify(sk, KEVENT_SOCKET_RECV);
 }
 
 #define sk_wait_event(__sk, __timeo, __condition)  \
@@ -679,21 +696,6 @@ static inline struct kiocb *siocb_to_kiocb(struct sock_iocb *si)
return si->kiocb;
 }
 
-struct socket_alloc {
-   struct socket socket;
-   struct inode vfs_inode;
-};
-
-static inline struct socket *SOCKET_I(struct inode *inode)
-{
-   return &container_of(inode, struct socket_alloc, vfs_inode)->socket;
-}
-
-static inline struct inode *SOCK_INODE(struct socket *socket)
-{
-   return &container_of(socket, struct socket_alloc, socket)->vfs_inode;
-}
-
 extern void __sk_stream_mem_reclaim(struct sock *sk);
 extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7a093d0..69f4ad2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -857,6 +857,7 @@ static inline int tcp_prequeue(struct sock *sk, struct sk_buff *skb)
tp->ucopy.memory = 0;
} else if (skb_queue_len(&tp->ucopy.prequeue) == 1) {
wake_up_interruptible(sk->sk_sleep);
+   kevent_socket_notify(sk, KEVENT_SOCKET_RECV|KEVENT_SOCKET_SEND);
if (!inet_csk_ack_scheduled(sk))
inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
  (3 * TCP_RTO_MIN) / 4,
diff --git a/kernel/kevent/kevent_socket.c b/kernel/kevent/kevent_socket.c
new file mode 100644
index 0000000..9c24b5b
--- /dev/null
+++ b/kernel/kevent/kevent_socket.c
@@ -0,0 +1,142 @@
+/*
+ * kevent_socket.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+static int kevent_socket_callback(struct kevent *k)
+{
+   struct inode *inode = k->st->origin;
+   unsigned int events = SOCKET_I(inode)->ops->poll(SOCKET_I(inode)->file, SOCKET_I(inode), NULL);

[take27 6/8] kevent: Pipe notifications.

2006-12-11 Thread Evgeniy Polyakov

Pipe notifications.


diff --git a/fs/pipe.c b/fs/pipe.c
index f3b6f71..aeaee9c 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -312,6 +313,7 @@ redo:
break;
}
if (do_wakeup) {
+   kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
wake_up_interruptible_sync(&pipe->wait);
kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
}
@@ -321,6 +323,7 @@ redo:
 
/* Signal writers asynchronously that there is more room. */
if (do_wakeup) {
+   kevent_pipe_notify(inode, KEVENT_SOCKET_SEND);
wake_up_interruptible(&pipe->wait);
kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
}
@@ -490,6 +493,7 @@ redo2:
break;
}
if (do_wakeup) {
+   kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
wake_up_interruptible_sync(&pipe->wait);
kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
do_wakeup = 0;
@@ -501,6 +505,7 @@ redo2:
 out:
mutex_unlock(&inode->i_mutex);
if (do_wakeup) {
+   kevent_pipe_notify(inode, KEVENT_SOCKET_RECV);
wake_up_interruptible(&pipe->wait);
kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
}
@@ -605,6 +610,7 @@ pipe_release(struct inode *inode, int decr, int decw)
free_pipe_info(inode);
} else {
wake_up_interruptible(&pipe->wait);
+   kevent_pipe_notify(inode, KEVENT_SOCKET_SEND|KEVENT_SOCKET_RECV);
kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
}
diff --git a/kernel/kevent/kevent_pipe.c b/kernel/kevent/kevent_pipe.c
new file mode 100644
index 0000000..d529fa9
--- /dev/null
+++ b/kernel/kevent/kevent_pipe.c
@@ -0,0 +1,121 @@
+/*
+ * kevent_pipe.c
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int kevent_pipe_callback(struct kevent *k)
+{
+   struct inode *inode = k->st->origin;
+   struct pipe_inode_info *pipe = inode->i_pipe;
+   int nrbufs = pipe->nrbufs;
+
+   if (k->event.event & KEVENT_SOCKET_RECV && nrbufs > 0) {
+   if (!pipe->writers)
+   return -1;
+   return 1;
+   }
+   
+   if (k->event.event & KEVENT_SOCKET_SEND && nrbufs < PIPE_BUFFERS) {
+   if (!pipe->readers)
+   return -1;
+   return 1;
+   }
+
+   return 0;
+}
+
+int kevent_pipe_enqueue(struct kevent *k)
+{
+   struct file *pipe;
+   int err = -EBADF;
+   struct inode *inode;
+
+   pipe = fget(k->event.id.raw[0]);
+   if (!pipe)
+   goto err_out_exit;
+
+   inode = igrab(pipe->f_dentry->d_inode);
+   if (!inode)
+   goto err_out_fput;
+
+   err = -EINVAL;
+   if (!S_ISFIFO(inode->i_mode))
+   goto err_out_iput;
+
+   err = kevent_storage_enqueue(&inode->st, k);
+   if (err)
+   goto err_out_iput;
+
+   if (k->event.req_flags & KEVENT_REQ_ALWAYS_QUEUE) {
+   kevent_requeue(k);
+   err = 0;
+   } else {
+   err = k->callbacks.callback(k);
+   if (err)
+   goto err_out_dequeue;
+   }
+
+   fput(pipe);
+
+   return err;
+
+err_out_dequeue:
+   kevent_storage_dequeue(k->st, k);
+err_out_iput:
+   iput(inode);
+err_out_fput:
+   fput(pipe);
+err_out_exit:
+   return err;
+}
+
+int kevent_pipe_dequeue(struct kevent *k)
+{
+   struct inode *inode = k->st->origin;
+
+   kevent_storage_dequeue(k->st, k);
+   iput(inode);
+
+   return 0;
+}
+
+void kevent_pipe_notify(struct inode *inode, u32 event)
+{
+   kevent_storage_ready(&inode->st, NULL, event);
+}
+
+static int __init kevent_init_pipe(void)
+{
+   struct kevent_callbacks sc = {
+   .callback = &kevent_pipe_callback,
+ 

[take27 8/8] kevent: Kevent posix timer notifications.

2006-12-11 Thread Evgeniy Polyakov

Kevent posix timer notifications.

A simple extension to POSIX timers which allows
notification of timer expiration to be delivered
through a kevent queue.

An example application, posix_timer.c, can be found
in the archive on the project homepage.

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>


diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h
index 8786e01..3768746 100644
--- a/include/asm-generic/siginfo.h
+++ b/include/asm-generic/siginfo.h
@@ -235,6 +235,7 @@ typedef struct siginfo {
 #define SIGEV_NONE 1   /* other notification: meaningless */
 #define SIGEV_THREAD   2   /* deliver via thread creation */
 #define SIGEV_THREAD_ID 4  /* deliver to thread */
+#define SIGEV_KEVENT   8   /* deliver through kevent queue */
 
 /*
  * This works because the alignment is ok on all current architectures
@@ -260,6 +261,8 @@ typedef struct sigevent {
void (*_function)(sigval_t);
void *_attribute;   /* really pthread_attr_t */
} _sigev_thread;
+
+   int kevent_fd;
} _sigev_un;
 } sigevent_t;
 
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index a7dd38f..4b9deb4 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 
 union cpu_time_count {
cputime_t cpu;
@@ -49,6 +50,9 @@ struct k_itimer {
sigval_t it_sigev_value;/* value word of sigevent struct */
struct task_struct *it_process; /* process to send signal to */
struct sigqueue *sigq;  /* signal queue entry. */
+#ifdef CONFIG_KEVENT_TIMER
+   struct kevent_storage st;
+#endif
union {
struct {
struct hrtimer timer;
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index e5ebcc1..8d0e7a3 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -48,6 +48,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /*
  * Management arrays for POSIX timers.  Timers are kept in slab memory
@@ -224,6 +226,99 @@ static int posix_ktime_get_ts(clockid_t which_clock, struct timespec *tp)
return 0;
 }
 
+#ifdef CONFIG_KEVENT_TIMER
+static int posix_kevent_enqueue(struct kevent *k)
+{
+   /*
+* It is not ugly - there is no pointer in the id field union, 
+* but its size is 64bits, which is ok for any known pointer size.
+*/
+   struct k_itimer *tmr = (struct k_itimer *)(unsigned long)k->event.id.raw_u64;
+   return kevent_storage_enqueue(&tmr->st, k);
+}
+static int posix_kevent_dequeue(struct kevent *k)
+{
+   struct k_itimer *tmr = (struct k_itimer *)(unsigned long)k->event.id.raw_u64;
+   kevent_storage_dequeue(&tmr->st, k);
+   return 0;
+}
+static int posix_kevent_callback(struct kevent *k)
+{
+   return 1;
+}
+static int posix_kevent_init(void)
+{
+   struct kevent_callbacks tc = {
+   .callback = &posix_kevent_callback,
+   .enqueue = &posix_kevent_enqueue,
+   .dequeue = &posix_kevent_dequeue};
+
+   return kevent_add_callbacks(&tc, KEVENT_POSIX_TIMER);
+}
+
+extern struct file_operations kevent_user_fops;
+
+static int posix_kevent_init_timer(struct k_itimer *tmr, int fd)
+{
+   struct ukevent uk;
+   struct file *file;
+   struct kevent_user *u;
+   int err;
+
+   file = fget(fd);
+   if (!file) {
+   err = -EBADF;
+   goto err_out;
+   }
+
+   if (file->f_op != &kevent_user_fops) {
+   err = -EINVAL;
+   goto err_out_fput;
+   }
+
+   u = file->private_data;
+
+   memset(&uk, 0, sizeof(struct ukevent));
+
+   uk.event = KEVENT_MASK_ALL;
+   uk.type = KEVENT_POSIX_TIMER;
+   uk.id.raw_u64 = (unsigned long)(tmr); /* Just cast to something unique */
+   uk.req_flags = KEVENT_REQ_ONESHOT | KEVENT_REQ_ALWAYS_QUEUE;
+   uk.ptr = tmr->it_sigev_value.sival_ptr;
+
+   err = kevent_user_add_ukevent(&uk, u);
+   if (err)
+   goto err_out_fput;
+
+   fput(file);
+
+   return 0;
+
+err_out_fput:
+   fput(file);
+err_out:
+   return err;
+}
+
+static void posix_kevent_fini_timer(struct k_itimer *tmr)
+{
+   kevent_storage_fini(&tmr->st);
+}
+#else
+static int posix_kevent_init_timer(struct k_itimer *tmr, int fd)
+{
+   return -ENOSYS;
+}
+static int posix_kevent_init(void)
+{
+   return 0;
+}
+static void posix_kevent_fini_timer(struct k_itimer *tmr)
+{
+}
+#endif
+
+
 /*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
@@ -241,6 +336,11 @@ static __init int init_posix_timers(void)
register_posix_clock(CLOCK_REALTIME, &clock_realtime);
register_posix_clock(CLOCK_MONOTONIC, &clock_monotonic);
 
+   if (posix_kevent_init()) {
+   printk(KERN_ERR "Failed to initialize kevent posix timers.\n");
+   BUG();
+   }
+
posix_timers_cache = 

Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread Eric Dumazet

David Miller wrote:

> From: Eric Dumazet <[EMAIL PROTECTED]>
> Date: Tue, 12 Dec 2006 05:09:23 +0100
>
> > We definitely *like* being able to use bigger timeouts on 64-bit platforms.
> >
> > Not that they are mandatory, since the same application should run fine on
> > a 32-bit kernel. But as the standard type for 'tick timestamps' is
> > 'unsigned long', a change would be invasive.
> >
> > Maybe some applications are now relying on being able to
> > sleep()/select()/poll() for periods > 30 days and only run on 64-bit
> > kernels.
>
> I think one possible target would be struct timer, at least
> in theory.
>
> There is also a line of reasoning that says that on 64-bit
> platforms we have some flexibility to set HZ very large, if
> we wanted to at some point, and going to 32-bit jiffies
> storage for some things may eliminate that kind of flexibility.



Yes, good point, and my understanding is that we are going for a tickless
kernel in 2.6.21 or so.

I wonder whether the virtual HZ won't be stuck at a low value.

I suspect that if HZ is raised, we would switch some or most uses of
jiffies_32 to another variable (xtime_32 or whatever), but keep the storage
at 32 bits...


But keeping 64-bit values 'just because the hardware allows us this kind of
expenditure' seems lazy rather than reasonable to me...
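
For illustration of the storage argument: a 32-bit tick counter stays usable across wraparound as long as comparisons go through a signed difference, the same idea the kernel's existing time_after() applies to unsigned long. The sketch below is plain userspace C with hypothetical names (tick32_after, tick32_before, tick32_max_timeout_secs); it is not taken from the patch under discussion. It only holds while the two timestamps are less than 2^31 ticks apart, which is about 24.8 days at HZ=1000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration, not from the jiffies_32 patch. */

/* True if tick a is later than tick b; valid while they differ by < 2^31. */
static inline int tick32_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* True if tick a is earlier than tick b, under the same restriction. */
static inline int tick32_before(uint32_t a, uint32_t b)
{
	return tick32_after(b, a);
}

/* Longest timeout, in whole seconds, that such a compare can represent. */
static inline uint32_t tick32_max_timeout_secs(uint32_t hz)
{
	assert(hz > 0);
	return (uint32_t)INT32_MAX / hz;
}
```

The limit is the trade-off raised above: the higher HZ goes, the shorter the longest timeout a 32-bit comparison can express.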


Eric

Please read the FAQ at  http://www.tux.org/lkml/


BUG: unable to handle kernel paging request in 2.6.19-git

2006-12-11 Thread Ben Castricum

This bug started to show up after the release of 2.6.19 (iirc plain 2.6.19
was still working fine).

The full dmesg is at
http://www.bencastricum.nl/lk/bootmessages-2.6.19-g9202f325.log,
and the .config http://www.bencastricum.nl/lk/config-g9202f325.log

I haven't tried disabling CONFIG_PCI_MULTITHREAD_PROBE. But if this
might help in some way I'll give it a shot.

Thanks,
Ben

e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
BUG: unable to handle kernel paging request at virtual address d880a000
 printing eip:
d880a000
*pde = 01382067
*pte = 
Oops:  [#1]
Modules linked in: e100 mii ext2 unix
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010282   (2.6.19-g9202f325 #15)
EIP is at 0xd880a000
eax: c13c9000   ebx: d8876fe0   ecx: d8876470   edx: d8876470
esi: d8876fe0   edi: ffed   ebp: d8877014   esp: d7a15f7c
ds: 007b   es: 007b   ss: 0068
Process probe-:00:0 (pid: 72, ti=d7a14000 task=d7828560
task.ti=d7a14000)
Stack: c01b009a c13c9000 c01b00ec d8876fe0 c13c9000  c01b0126
c13c9048
   d7821560 c0205b27 d7821560 1fcc 6ab5e081 4ada d7acded0
d7821560
   c0205aa0 fffc c0128186 0001   c01280d0

Call Trace:
 [] pci_call_probe+0xa/0x10
 [] __pci_device_probe+0x4c/0x60
 [] pci_device_probe+0x26/0x50
 [] really_probe+0x87/0x100
 [] really_probe+0x0/0x100
 [] kthread+0xb6/0xc0
 [] kthread+0x0/0xc0
 [] kernel_thread_helper+0x7/0x14
 ===
Code:  Bad EIP value.
EIP: [] 0xd880a000 SS:ESP 0068:d7a15f7c
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.19-mm1

2006-12-11 Thread KAMEZAWA Hiroyuki
On Mon, 11 Dec 2006 22:06:17 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> > When I use ftp on 2.6.19-mm1, transfered file is always broken.
> > like this:
> > ==
> > [EMAIL PROTECTED] ~]$ file ./linux-2.6.19.tar.bz2 (got on 2.6.19-mm1)
> > ./linux-2.6.19.tar.bz2: data
> > (I confirmed original file was not broken.)
> 
> Yes, a couple of people have reported things like this.  Strange. 
> test.kernel.org is showing mostly-green.  There's one fsx-linux failure (for
> unclear reasons) on one of the x86_64 machines, all the rest are happy.
> 
> Which filesystem were you using?
> 
using ext3.
> Can you investigate it a bit further please??  reboot, re-download, work
> out how the data differs, etc?
> 
Hmm, this is a summary of the broken linux-2.6.19.tar.bz2 file (using od and diff):

offset 000000 -> 000b4f  zero cleared.
offset 000b50 -> 000fff  not broken
offset 001000 -> 001c47  zero cleared
offset 001c48 -> 001fff  not broken
offset 002000 -> 002d39  zero cleared
offset 002d40 -> 003fff  not broken.
offset 004000 -> 004f2f  zero cleared
offset 004f30 -> 004fff  not broken
offset 005000 -> 005a79  zero cleared
offset 005a80 -> 005fff  not broken
offset 006000 -> 006b7f  zero cleared
offset 006b80 -> 007fff  not broken
...
 
All broken parts are zero-cleared, and each starts at an offset
aligned to 0x1000. (Note: the broken kernel's PAGE_SIZE is 16384.)
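
A pattern like this can also be located mechanically. The following is a minimal sketch, not the procedure used for the summary (that was od and diff): it scans a buffer for runs of zero bytes that begin on a 0x1000 boundary. The names find_aligned_zero_runs, struct zero_run and CHUNK are assumptions made for illustration, and a real check would still compare against a known-good copy, since a valid file may legitimately contain zeros.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 0x1000u	/* alignment seen in the summary; an assumption here */

struct zero_run {
	size_t start;	/* offset of the first zero byte (CHUNK-aligned) */
	size_t len;	/* number of consecutive zero bytes */
};

/*
 * Record up to max runs of zero bytes that start on a CHUNK boundary.
 * Returns the number of runs stored in out[].
 */
static size_t find_aligned_zero_runs(const unsigned char *buf, size_t size,
				     struct zero_run *out, size_t max)
{
	size_t found = 0;
	size_t off;

	assert(buf != NULL && out != NULL);

	for (off = 0; off < size && found < max; off += CHUNK) {
		size_t end = off;

		/* In this sketch a run never extends past its own chunk. */
		while (end < size && end < off + CHUNK && buf[end] == 0)
			end++;

		if (end > off) {
			out[found].start = off;
			out[found].len = end - off;
			found++;
		}
	}
	return found;
}
```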

I'll investigate as much as possible.

-Kame
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] incorrect error handling inside generic_file_direct_write

2006-12-11 Thread Andrew Morton
On Tue, 12 Dec 2006 12:22:14 +0300
Dmitriy Monakhov <[EMAIL PROTECTED]> wrote:

> >> @@ -2041,6 +2041,14 @@ generic_file_direct_write(struct kiocb *
> >>mark_inode_dirty(inode);
> >>}
> >>*ppos = end;
> >> +  } else if (written < 0) {
> >> +  loff_t isize = i_size_read(inode);
> >> +  /*
> >> +   * generic_file_direct_IO() may have instantiated a few blocks
> >> +   * outside i_size.  Trim these off again.
> >> +   */
> >> +  if (pos + count > isize)
> >> +  vmtruncate(inode, isize);
> >>}
> >>  
> >
> > XFS (at least) can call generic_file_direct_write() with i_mutex not held. 
> How could it be ?
> 
> from mm/filemap.c:2046 generic_file_direct_write() comment right after 
> place where i want to add vmtruncate()
> /*
>* Sync the fs metadata but not the minor inode changes and
>* of course not the data as we did direct DMA for the IO.
>* i_mutex is held, which protects generic_osync_inode() from
>* livelocking.
>*/
> 
> > And vmtruncate() expects i_mutex to be held.
> generic_file_direct_IO must called under i_mutex too
> from mm/filemap.c:2388
>   /*
>* Called under i_mutex for writes to S_ISREG files.   Returns -EIO if 
> something
>* went wrong during pagecache shootdown.
>*/
>   static ssize_t
>   generic_file_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,

yup, the comments are wrong.

> This means XFS generic_file_direct_write() call generic_file_direct_IO() 
> without
> i_mutex held too?

Think so.  XFS uses blockdev_direct_IO_own_locking().  We'd need to check
with the XFS guys regarding its precise operation and what needs to be done
here.

> >
> > I guess a suitable solution would be to push this problem back up to the
> > callers: let them decide whether to run vmtruncate() and if so, to ensure
> > that i_mutex is held.
> >
> > The existence of generic_file_aio_write_nolock() makes that rather messy
> > though.
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH 2.6.19 5/6] add "add" element in /sys/class/misc/netconsole

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

This patch contains the following changes.

To add a port dynamically, create an "add" element in /sys/class/misc/netconsole.

ex)
1. echo "eth0" > /sys/class/misc/netconsole/add
   then the port is added with the default settings.

2. echo "@/eth0,@192.168.0.1/" > /sys/class/misc/netconsole/add
   then the port is added with settings that send kernel messages
   to 192.168.0.1 using the eth0 device.

-+- /sys/class/misc/
 |-+- netconsole/
   |--- add   [-w-------]  write a network interface name or a netconsole
   |                       config string here; a port with that
   |                       configuration is added
   |--- port1/
   |--- port2/
   ...
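
The string handling behind "add" can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code from the patch: build_target() and its is_ifname flag are hypothetical, and the flag stands in for the kernel's interface lookup. Unlike a plain strcpy()/sprintf() pair, it copies with an explicit length and bounds the output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CONFIG_LENGTH 256

/*
 * Turn the text written to "add" into a netconsole config string.
 * buf/count mirror the arguments of a sysfs store handler; is_ifname says
 * whether the text names a network interface (a stand-in for the lookup
 * the kernel would do).
 * Returns 0 on success, -1 if the input is empty or does not fit in out.
 */
static int build_target(const char *buf, size_t count, int is_ifname,
			char *out, size_t outlen)
{
	int n;

	assert(buf != NULL && out != NULL);

	/* Drop one trailing newline, since "echo" appends one. */
	if (count > 0 && buf[count - 1] == '\n')
		count--;
	if (count == 0)
		return -1;

	if (is_ifname)
		n = snprintf(out, outlen, "@/%.*s,@/", (int)count, buf);
	else
		n = snprintf(out, outlen, "%.*s", (int)count, buf);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this shape, writing "eth0" yields the default-settings string "@/eth0,@/", while a full config string is passed through unchanged.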

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
--- linux-2.6.19/drivers/net/netconsole.c	2006-12-06 14:37:26.874827500 +0900
+++ enhanced-netconsole/drivers/net/netconsole.c.add	2006-12-06 13:33:05.661516750 +0900
@@ -321,6 +321,50 @@ static struct miscdevice netcon_miscdev
.name = "netconsole",
 };

+static ssize_t set_netconmisc_add(struct class_device *cdev, const char *buf,
+ size_t count)
+{
+   char *target;
+   char *target_param;
+
+   target_param = (char*)kmalloc(count+1, GFP_ATOMIC);
+   if (!target_param) {
+   printk(KERN_INFO "netconsole: kmalloc() failed!\n");
+   return -ENOMEM;
+   }
+
+   strcpy(target_param, buf);
+   if (target_param[count - 1] == '\n') {
+   target_param[count - 1] = '\0';
+   }
+
+   if (dev_get_by_name(target_param)) {
+   printk(KERN_INFO "netconsole: device name = [%s]\n",
+  target_param);
+   target = (char*)kmalloc(MAX_CONFIG_LENGTH+1, GFP_ATOMIC);
+   if (!target) {
+   printk(KERN_INFO "netconsole: kmalloc() failed!\n");
+   kfree(target_param);
+   return -ENOMEM;
+   }
+   sprintf(target,"@/%s,@/", target_param);
+   add_netcon_dev(target);
+   kfree(target);
+   } else {
+   printk(KERN_INFO "netconsole: config = [%s]\n", target_param);
+   add_netcon_dev(target_param);
+   }
+   kfree(target_param);
+
+   return count;
+}
+
+static CLASS_DEVICE_ATTR(add, S_IWUSR, NULL, set_netconmisc_add);
+
+static struct class_device_attribute *netcon_misc_attr[] = {
+   &class_device_attr_add,
+};
+
 static struct netpoll np = {
.name = "netconsole",
.dev_name = "eth0",
@@ -446,6 +490,7 @@ static int __init init_netconsole(void)
 {
char *p;
char *tmp = config;
+   int i = 0;

if (misc_register(&netcon_miscdev)) {
printk(KERN_INFO
@@ -456,6 +501,11 @@ static int __init init_netconsole(void)
}
register_console(&netconsole);

+   for(i=0; i < ARRAY_SIZE(netcon_misc_attr); i++) {
+   class_device_create_file(netcon_miscdev.class,
+netcon_misc_attr[i]);
+   }
+
if(!strlen(config)) {
printk(KERN_INFO "netconsole: not configured\n");
return 0;

-- 
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH 2.6.19 6/6] update modification history

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

Update modification history.

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
--- linux-2.6.19/drivers/net/netconsole.c	2006-12-12 14:57:45.588967500 +0900
+++ enhanced-netconsole/drivers/net/netconsole.c.sign	2006-12-12 14:54:49.541965250 +0900
@@ -15,6 +15,9 @@
  *   generic card hooks
  *   works non-modular
  * 2003-09-07rewritten with netpoll api
+ * 2006-12-12add extended features for
+ *   dynamic configurable netconsole
+ *   by Keiichi KII <[EMAIL PROTECTED]>
  */

 /

-- 
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]
Please read the FAQ at  http://www.tux.org/lkml/


AIC79XX abort -- hardware fault?

2006-12-11 Thread Daniel Pittman
G'day.  One of the machines I maintain is having real trouble with the
AIC79XX HBA or the tape drive attached to it.  I believe this is a
hardware fault, but I am not certain where the problem lies.

Normally I would blame the cable or, maybe, the tape drive, but the
early stage of the fault and the reported SCSI driver state make me
wonder if this is perhaps an HBA fault?

Regards,
Daniel

scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0

aic7901: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs

 target0:0:0: asynchronous
scsi 0:0:0:0: Attempting to queue an ABORT message:CDB: 0x12 0x0 0x0 0x0 0x24 0x0
scsi0: At time of recovery, card was not paused
>> Dump Card State Begins <
scsi0: Dumping Card State at program address 0x1b1 Mode 0x11
Card was paused
INTSTAT[0x0] SELOID[0x0] SELID[0x0] HS_MAILBOX[0x0]
INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x0] DFFSTAT[0x19]
SCSISIGI[0xa4] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0xa0]
SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x4] QFREEZE_COUNT[0x0]
KERNEL_QFREEZE_COUNT[0x0] MK_MESSAGE_SCB[0xff00]
MK_MESSAGE_SCSIID[0xff] SSTAT0[0x0] SSTAT1[0x0]
SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0] SIMODE1[0xac]
LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x CURRSCB 0x3 NEXTSCB 0x0
qinstart = 1 qinfifonext = 1
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
  3 FIFO_USE[0x0] SCB_CONTROL[0x40] SCB_SCSIID[0x7]
Total 1
Kernel Free SCB list: 2 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
Sequencer On QFreeze and Complete list:


scsi0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]

scsi0: FIFO1 Active, LONGJMP == 0x8063, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x14] SHADDR = 0x06, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
0x0 0x0
scsi0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
scsi0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
SIMODE0[0xc]
CCSCBCTL[0x4]
scsi0: REG0 == 0x3, SINDEX = 0x1e0, DINDEX = 0xe1
scsi0: SCBPTR == 0x3, SCB_NEXT == 0xffc0, SCB_NEXT2 == 0xffed
CDB 12 0 0 0 24 0
STACK: 0x121 0x0 0x0 0x0 0x0 0x0 0x0 0x0
< Dump Card State Ends >>
scsi 0:0:0:0: Device is active, asserting ATN
scsi0: Recovery code sleeping
scsi0: Timer Expired (active 1)
Recovery code awake
scsi0: Command abort returning 0x2003
scsi 0:0:0:0: Attempting to queue a TARGET RESET message:CDB: 0x12 0x0 0x0 0x0 0x24 0x0
scsi0: Device reset code sleeping
scsi0: Device reset timer expired (active 2)
scsi0: Device reset returning 0x2003
Recovery SCB completes
Recovery SCB completes
scsi 0:0:0:0: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
scsi0: At time of recovery, card was not paused
>> Dump Card State Begins <
scsi0: Dumping Card State at program address 0x1b2 Mode 0x11
Card was paused
INTSTAT[0x0] SELOID[0x0] SELID[0x0] HS_MAILBOX[0x0]
INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x0] DFFSTAT[0x19]
SCSISIGI[0xa4] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0xa0]
SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0]
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x4] QFREEZE_COUNT[0x0]
KERNEL_QFREEZE_COUNT[0x0] MK_MESSAGE_SCB[0xff00]
MK_MESSAGE_SCSIID[0xff] SSTAT0[0x0] SSTAT1[0x0]
SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0] SIMODE1[0xac]
LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x CURRSCB 0x3 NEXTSCB 0x0
qinstart = 2 qinfifonext = 2
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
  3 FIFO_USE[0x0] SCB_CONTROL[0x40] SCB_SCSIID[0x7]
Total 1
Kernel Free SCB list: 2 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
Sequencer On QFreeze and Complete list:


scsi0: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]

scsi0: FIFO1 Active, LONGJMP == 0x8063, SCB 0x3
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] DFSTATUS[0x89]
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x14] SHADDR = 0x06, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
0x0 0x0
scsi0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x52
scsi0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
scsi0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0

[RFC][PATCH 2.6.19 3/6] add interface for netconsole using sysfs

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

This patch contains the following changes.

Create a sysfs entry for netconsole in /sys/class/misc.
This entry has the netconsole-related elements shown below.
You can change the netconsole configuration (writable attributes such as IP
address, port number and so on) and check the current configuration.

-+- /sys/class/misc/
 |-+- netconsole/
   |-+- port1/
   | |--- id          [r--r--r--]  unique port id
   | |--- remove      [-w-------]  if you write something to "remove",
   | |                             this port is removed.
   | |--- dev_name    [r--r--r--]  network interface name
   | |--- local_ip    [rw-r--r--]  source IP to use, writable
   | |--- local_port  [rw-r--r--]  source port number for UDP packets, writable
   | |--- local_mac   [r--r--r--]  source MAC address
   | |--- remote_ip   [rw-r--r--]  IP address for logging agent, writable
   | |--- remote_port [rw-r--r--]  port number for logging agent, writable
   | `--- remote_mac  [rw-r--r--]  MAC address for logging agent, writable
   |--- port2/
   |--- port3/
   ...
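
Values written to the writable attributes arrive as text, so a store handler has to parse and range-check them before use. A hedged userspace sketch for the two port attributes follows; parse_port() is a hypothetical helper that is not part of this patch, and the C library's strtol() stands in for the kernel's string conversion.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal UDP port from text such as "6665\n".
 * Returns 0 and stores the port on success, -1 on malformed or
 * out-of-range input (*port is left untouched in that case).
 */
static int parse_port(const char *buf, unsigned short *port)
{
	char *end;
	long val;

	assert(buf != NULL && port != NULL);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno != 0 || end == buf)
		return -1;		/* overflow, or no digits at all */
	if (*end == '\n')
		end++;			/* allow the newline "echo" appends */
	if (*end != '\0')
		return -1;		/* trailing garbage */
	if (val < 1 || val > 65535)
		return -1;		/* not a usable UDP port */

	*port = (unsigned short)val;
	return 0;
}
```

Rejecting bad input instead of storing whatever the conversion returns keeps a typo such as "eth0" from silently setting a port to 0.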

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
--- linux-2.6.19/drivers/net/netconsole.c	2006-12-06 14:37:26.842825500 +0900
+++ enhanced-netconsole/drivers/net/netconsole.c.sysfs	2006-12-06 13:32:47.488381000 +0900
@@ -45,6 +45,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 MODULE_AUTHOR("Maintainer: Matt Mackall <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("Console driver for network interfaces");
@@ -53,6 +55,7 @@ MODULE_LICENSE("GPL");
 enum {
MAX_PRINT_CHUNK = 1000,
MAX_CONFIG_LENGTH = 256,
+   MAC_ADDR_DIGIT = 6,
 };

 static char config[MAX_CONFIG_LENGTH];
@@ -62,19 +65,214 @@ MODULE_PARM_DESC(netconsole, " netconsol

 struct netconsole_device {
struct list_head list;
+   struct kobject obj;
spinlock_t netpoll_lock;
int id;
struct netpoll np;
 };

+struct netcon_dev_attr {
+   struct attribute attr;
+   ssize_t (*show)(struct netconsole_device*, char*);
+   ssize_t (*store)(struct netconsole_device*, const char*,
+size_t count);
+};
+
 static int add_netcon_dev(const char*);
+static void setup_netcon_dev_sysfs(struct netconsole_device*);
 static void cleanup_netconsole(void);
 static void netcon_dev_cleanup(struct netconsole_device *nd);

+static int netcon_miscdev_configured = 0;
+
 static LIST_HEAD(active_netconsole_dev);

 static DEFINE_SPINLOCK(netconsole_dev_list_lock);

+#define SHOW_CLASS_ATTR(field, type, format, ...) \
+static ssize_t show_##field(type, char *buf) \
+{ \
+ return sprintf(buf, format, __VA_ARGS__); \
+} \
+
+SHOW_CLASS_ATTR(id, struct netconsole_device *nd, "%d\n", nd->id);
+SHOW_CLASS_ATTR(dev_name, struct netconsole_device *nd,
+   "%s\n", nd->np.dev_name);
+SHOW_CLASS_ATTR(local_port, struct netconsole_device *nd,
+   "%d\n", nd->np.local_port);
+SHOW_CLASS_ATTR(remote_port, struct netconsole_device *nd,
+   "%d\n", nd->np.remote_port);
+SHOW_CLASS_ATTR(local_ip, struct netconsole_device *nd,
+   "%d.%d.%d.%d\n", HIPQUAD(nd->np.local_ip));
+SHOW_CLASS_ATTR(remote_ip, struct netconsole_device *nd,
+   "%d.%d.%d.%d\n", HIPQUAD(nd->np.remote_ip));
+SHOW_CLASS_ATTR(local_mac, struct netconsole_device *nd,
+   "%02x:%02x:%02x:%02x:%02x:%02x\n",
+   nd->np.local_mac[0], nd->np.local_mac[1], nd->np.local_mac[2],
+   nd->np.local_mac[3], nd->np.local_mac[4], nd->np.local_mac[5]);
+SHOW_CLASS_ATTR(remote_mac, struct netconsole_device *nd,
+   "%02x:%02x:%02x:%02x:%02x:%02x\n",
+   nd->np.remote_mac[0], nd->np.remote_mac[1],
+   nd->np.remote_mac[2], nd->np.remote_mac[3],
+   nd->np.remote_mac[4], nd->np.remote_mac[5]);
+
+static ssize_t store_local_port(struct netconsole_device *nd, const char *buf,
+   size_t count)
+{
+   spin_lock(&nd->netpoll_lock);
+   nd->np.local_port = simple_strtol(buf, NULL, 10);
+   spin_unlock(&nd->netpoll_lock);
+
+   return count;
+}
+
+static ssize_t store_remote_port(struct netconsole_device *nd, const char *buf,
+   size_t count)
+{
+   spin_lock(&nd->netpoll_lock);
+   nd->np.remote_port = simple_strtol(buf, NULL, 10);
+   spin_unlock(&nd->netpoll_lock);
+
+   return count;
+}
+
+static ssize_t store_local_ip(struct netconsole_device *nd, const char *buf,
+ size_t count)
+{
+   spin_lock(&nd->netpoll_lock);
+   nd->np.local_ip = ntohl(in_aton(buf));
+   spin_unlock(&nd->netpoll_lock);
+
+   return count;
+}
+
+static ssize_t store_remote_ip(struct netconsole_device *nd, const char *buf,
+  size_t count)
+{
+   spin_lock(&nd->netpoll_lock);
+   nd->np.remote_ip = ntohl(in_aton(buf));
+   spin_unlock(&nd->netpoll_lock);
+
+   return count;
+}
+
+static ssize_t 

[RFC][PATCH 2.6.19 4/6] switch function of netpoll

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

This patch adds a switch function for netpoll.

If the "enable" attribute of a port is '1', the port is used.
If the "enable" attribute of a port is '0', the port isn't used.

The active_netconsole_dev list holds the active ports.
The inactive_netconsole_dev list holds the inactive ports.

If you write '0' to the "enable" attribute of a port on the
active_netconsole_dev list, the port moves from active_netconsole_dev to
inactive_netconsole_dev and is no longer used to send kernel messages.

-+- /sys/class/misc/
 |-+- netconsole/
   |-+- port1/
   | |--- id  [r--r--r--]  id
   | |--- enable  [rw-r--r--]  0: disable, 1: enable, writable
   | ...
   |--- port2/
   ...
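
The switch amounts to moving one entry between two lists while holding a lock. The sketch below models only the list move, in userspace, with a minimal circular doubly linked list; struct port, move_port() and the list helpers are hypothetical stand-ins for netconsole_device and the kernel's list.h, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *prev, *next;
};

struct port {
	struct node link;	/* must stay the first member, see move_port() */
	int id;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Move the port with the given id from src to dst; returns 0 on success. */
static int move_port(struct node *src, struct node *dst, int id)
{
	struct node *n;

	assert(src != NULL && dst != NULL);

	for (n = src->next; n != src; n = n->next) {
		/* link is the first member, so the cast recovers the port */
		struct port *p = (struct port *)n;

		if (p->id == id) {
			list_unlink(n);
			list_add_tail(n, dst);
			return 0;
		}
	}
	return -1;	/* not on src: the port is already in that state */
}
```

Returning an error when the entry is not on the source list lets a caller tell "already enabled" apart from a successful switch.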

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
--- linux-2.6.19/drivers/net/netconsole.c	2006-12-06 14:37:26.858826500 +0900
+++ enhanced-netconsole/drivers/net/netconsole.c.switch	2006-12-06 13:32:56.744959500 +0900
@@ -86,9 +86,25 @@ static void netcon_dev_cleanup(struct ne
 static int netcon_miscdev_configured = 0;

 static LIST_HEAD(active_netconsole_dev);
+static LIST_HEAD(inactive_netconsole_dev);

 static DEFINE_SPINLOCK(netconsole_dev_list_lock);

+static ssize_t show_enable(struct netconsole_device *nd, char *buf) {
+	struct netconsole_device *dev;
+
+	spin_lock(&netconsole_dev_list_lock);
+	list_for_each_entry(dev, &active_netconsole_dev, list) {
+		if (dev->id == nd->id) {
+			spin_unlock(&netconsole_dev_list_lock);
+			return sprintf(buf, "1\n");
+		}
+	}
+	spin_unlock(&netconsole_dev_list_lock);
+
+	return sprintf(buf, "0\n");
+}
+
 #define SHOW_CLASS_ATTR(field, type, format, ...) \
 static ssize_t show_##field(type, char *buf) \
 { \
@@ -180,6 +196,36 @@ static ssize_t store_remote_mac(struct n
return count;
 }

+static ssize_t store_enable(struct netconsole_device *nd, const char *buf,
+			    size_t count)
+{
+	struct netconsole_device *dev, *tmp;
+	struct list_head *src, *dst;
+
+	if (strncmp(buf, "1", 1) == 0) {
+		src = &inactive_netconsole_dev;
+		dst = &active_netconsole_dev;
+	} else if(strncmp(buf, "0", 1) == 0) {
+		src = &active_netconsole_dev;
+		dst = &inactive_netconsole_dev;
+	} else {
+		printk(KERN_INFO "netconsole: invalid argument: %s\n", buf);
+		return -EINVAL;
+	}
+
+	spin_lock(&netconsole_dev_list_lock);
+	list_for_each_entry_safe(dev, tmp, src, list) {
+		if (dev->id == nd->id) {
+			list_del(&dev->list);
+			list_add(&dev->list, dst);
+			break;
+		}
+	}
+	spin_unlock(&netconsole_dev_list_lock);
+
+	return count;
+}
+
 static ssize_t store_remove(struct netconsole_device *nd, const char *buf,
size_t count)
 {
@@ -204,6 +250,7 @@ static NETCON_CLASS_ATTR(remote_ip, S_IR
 static NETCON_CLASS_ATTR(local_mac, S_IRUGO, show_local_mac, NULL);
 static NETCON_CLASS_ATTR(remote_mac, S_IRUGO | S_IWUSR,
 show_remote_mac, store_remote_mac);
+static NETCON_CLASS_ATTR(enable, S_IRUGO | S_IWUSR, show_enable, store_enable);
 static NETCON_CLASS_ATTR(remove, S_IWUSR, NULL, store_remove);

 static struct attribute *netcon_dev_attrs[] = {
@@ -215,6 +262,7 @@ static struct attribute *netcon_dev_attr
_dev_attr_remote_ip.attr,
_dev_attr_local_mac.attr,
_dev_attr_remote_mac.attr,
+   _dev_attr_enable.attr,
_dev_attr_remove.attr,
NULL
 };
@@ -434,6 +482,9 @@ static void cleanup_netconsole(void)
 	list_for_each_entry_safe(dev, tmp, &active_netconsole_dev, list) {
 		kobject_unregister(&dev->obj);
 	}
+	list_for_each_entry_safe(dev, tmp, &inactive_netconsole_dev, list) {
+		kobject_unregister(&dev->obj);
+	}

 	if (netcon_miscdev_configured) {
 		misc_deregister(&netcon_miscdev);

-- 
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH 2.6.19 1/6] cleanup for netconsole

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

This patch contains the following cleanups.
 - add __init to the initialization functions (option_setup() and
   init_netconsole()).
 - give the magic numbers names.

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
--- linux-2.6.19/drivers/net/netconsole.c	2006-12-06 14:37:06.985584500 +0900
+++ enhanced-netconsole/drivers/net/netconsole.c.cleanup	2006-12-06 14:34:52.561183500 +0900
@@ -50,8 +50,14 @@ MODULE_AUTHOR("Maintainer: Matt Mackall
 MODULE_DESCRIPTION("Console driver for network interfaces");
 MODULE_LICENSE("GPL");

-static char config[256];
-module_param_string(netconsole, config, 256, 0);
+enum {
+   MAX_PRINT_CHUNK = 1000,
+   MAX_CONFIG_LENGTH = 256,
+};
+
+static char config[MAX_CONFIG_LENGTH];
+
+module_param_string(netconsole, config, MAX_CONFIG_LENGTH, 0);
 MODULE_PARM_DESC(netconsole, "
[EMAIL PROTECTED]/[dev],[tgt-port]@/[tgt-macaddr]\n");

 static struct netpoll np = {
@@ -62,9 +68,8 @@ static struct netpoll np = {
.remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
.drop = netpoll_queue,
 };
-static int configured = 0;

-#define MAX_PRINT_CHUNK 1000
+static int configured = 0;

 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
@@ -75,14 +80,12 @@ static void write_msg(struct console *co
return;

local_irq_save(flags);
-
for(left = len; left; ) {
frag = min(left, MAX_PRINT_CHUNK);
netpoll_send_udp(&np, msg, frag);
msg += frag;
left -= frag;
}
-
local_irq_restore(flags);
 }

@@ -92,7 +95,7 @@ static struct console netconsole = {
.write = write_msg
 };

-static int option_setup(char *opt)
+static int __init option_setup(char *opt)
 {
configured = !netpoll_parse_options(&np, opt);
return 1;
@@ -100,7 +103,7 @@ static int option_setup(char *opt)

 __setup("netconsole=", option_setup);

-static int init_netconsole(void)
+static int __init init_netconsole(void)
 {
if(strlen(config))
option_setup(config);

-- 
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]


Re: [PATCH] incorrect error handling inside generic_file_direct_write

2006-12-11 Thread Dmitriy Monakhov
Andrew Morton <[EMAIL PROTECTED]> writes:

> On Mon, 11 Dec 2006 16:34:27 +0300
> Dmitriy Monakhov <[EMAIL PROTECTED]> wrote:
>
>> The OpenVZ team has discovered an error inside generic_file_direct_write().
>> If generic_file_direct_IO() fails (ENOSPC condition), it may have
>> instantiated a few blocks outside i_size, and fsck will complain about a
>> wrong i_size (ext2, ext3 and reiserfs treat a difference between i_size and
>> the biggest block as an error). After fsck fixes the error, i_size is
>> increased to the biggest block, but these blocks contain garbage from the
>> previous write attempt. This is not an information leak, but it is silent
>> file data corruption.
>> We need to truncate any blocks beyond i_size after a write has failed, as
>> is done in the similar generic_file_buffered_write() error path.
>> 
>> Example:
>> open("mnt2/FILE3", O_WRONLY|O_CREAT|O_DIRECT, 0666) = 3
>> write(3, "aa"..., 4096) = -1 ENOSPC (No space left on device)
>> 
>> stat mnt2/FILE3
>> File: `mnt2/FILE3'
>> Size: 0   Blocks: 4  IO Block: 4096   regular empty file
>> >>^^ file size is less than biggest block idx
>> Device: 700h/1792d  Inode: 14  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
>> 
>> fsck.ext2 -f -n  mnt1/fs_img
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 14, i_size is 0, should be 2048.  Fix? no
>> 
>> Signed-off-by: Dmitriy Monakhov <[EMAIL PROTECTED]>
>> --
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 7b84dc8..bf7cf6c 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2041,6 +2041,14 @@ generic_file_direct_write(struct kiocb *
>>  mark_inode_dirty(inode);
>>  }
>>  *ppos = end;
>> +} else if (written < 0) {
>> +loff_t isize = i_size_read(inode);
>> +/*
>> + * generic_file_direct_IO() may have instantiated a few blocks
>> + * outside i_size.  Trim these off again.
>> + */
>> +if (pos + count > isize)
>> +vmtruncate(inode, isize);
>>  }
>>  
>
> XFS (at least) can call generic_file_direct_write() with i_mutex not held. 
How can that be?

From mm/filemap.c:2046, the comment in generic_file_direct_write() right after
the place where I want to add vmtruncate():
/*
 * Sync the fs metadata but not the minor inode changes and
 * of course not the data as we did direct DMA for the IO.
 * i_mutex is held, which protects generic_osync_inode() from
 * livelocking.
 */

> And vmtruncate() expects i_mutex to be held.
generic_file_direct_IO() must be called under i_mutex too.
From mm/filemap.c:2388:
  /*
   * Called under i_mutex for writes to S_ISREG files.   Returns -EIO if something
   * went wrong during pagecache shootdown.
   */
  static ssize_t
  generic_file_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,

Does this mean that XFS's generic_file_direct_write() path also calls
generic_file_direct_IO() without i_mutex held?
>
> I guess a suitable solution would be to push this problem back up to the
> callers: let them decide whether to run vmtruncate() and if so, to ensure
> that i_mutex is held.
>
> The existence of generic_file_aio_write_nolock() makes that rather messy
> though.



[RFC][PATCH 2.6.19 0/6] proposal for dynamic configurable netconsole

2006-12-11 Thread Keiichi KII
From: Keiichi KII <[EMAIL PROTECTED]>

Netconsole is a very useful module for collecting kernel messages under
certain circumstances (e.g. when disk logging fails or the serial port is
unavailable).

But the current netconsole is not flexible. For example, if you want to change
the IP address of the logging agent, then with a built-in netconsole you can't
change the config except by changing the boot parameter and rebooting your
system, and with a modular netconsole you need to reload the netconsole module.

So, I propose the following extended features for netconsole.

1) support for multiple logging agents.
2) add interface to access each parameter of netconsole
   using sysfs.

This patch set is for linux-2.6.19 and is split up by feature.
Your comments are very welcome.

Signed-off-by: Keiichi KII <[EMAIL PROTECTED]>
---
-- 
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]











amd64 iommu causing corruption? (was Re: data corruption with nvidia chipsets and IDE/SATA drives // memory hole mapping related bug?!)

2006-12-11 Thread Chris Wedgwood
On Mon, Dec 11, 2006 at 10:24:02AM +0100, Karsten Weiss wrote:

> We could not reproduce the data corruption anymore if we boot the
> machines with the kernel parameter "iommu=soft" i.e. if we use
> software bounce buffering instead of the hw-iommu. (As mentioned
> before, booting with mem=2g works fine, too, because this disables
> the iommu altogether.)

I can confirm this also seems to be the case for me, I'm still doing
more testing to confirm this.  But it would seem:

nforce4, transfer of a large amount of data with 4GB+ of RAM I get some
corruption.  This is present on both the nv SATA and also Sil 3112
connected drives.

Using iommu=soft so far seems to be working without any corruption.



I still need to do more testing on other machines which have less
memory (so the IOMMU won't be in use there either) and see if there
are problems there.


Re: 2.6.19-mm1

2006-12-11 Thread Andrew Morton
On Tue, 12 Dec 2006 14:53:41 +0900
KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> 
> On Mon, 11 Dec 2006 00:58:07 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > 
> > Temporarily at
> > 
> > http://userweb.kernel.org/~akpm/2.6.19-mm1/
> > 
> > Will appear later at
> > 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/
> > 
> 
> When I use ftp on 2.6.19-mm1, the transferred file is always broken,
> like this:
> ==
> [EMAIL PROTECTED] ~]$ file ./linux-2.6.19.tar.bz2 (got on 2.6.19-mm1)
> ./linux-2.6.19.tar.bz2: data
> (I confirmed original file was not broken.)

Yes, a couple of people have reported things like this.  Strange. 
test.kernel.org is showing mostly-green.  There's one fsx-linux failure (for
unclear reasons) on one of the x86_64 machines, all the rest are happy.

Which filesystem were you using?

Can you investigate it a bit further please??  reboot, re-download, work
out how the data differs, etc?


Re: 2.6.19-mm1

2006-12-11 Thread KAMEZAWA Hiroyuki
On Mon, 11 Dec 2006 00:58:07 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> 
> Temporarily at
> 
>   http://userweb.kernel.org/~akpm/2.6.19-mm1/
> 
> Will appear later at
> 
>   
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/
> 

When I use ftp on 2.6.19-mm1, the transferred file is always broken,
like this:
==
[EMAIL PROTECTED] ~]$ file ./linux-2.6.19.tar.bz2 (got on 2.6.19-mm1)
./linux-2.6.19.tar.bz2: data
(I confirmed original file was not broken.)
==

It seems that the file systems work well, so I suspect the network, but I
have no idea. I am attaching my .config.

hardware config is:
Itanium2 1.3GHz x2, SMP, e1000 network card as eth0

Thanks,
-Kame
==
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.19-mm1
# Tue Dec 12 10:44:08 2006
#
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_CPUSETS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Processor type and features
#
CONFIG_IA64=y
CONFIG_64BIT=y
CONFIG_ZONE_DMA=y
CONFIG_MMU=y
CONFIG_SWIOTLB=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_TIME_INTERPOLATION=y
CONFIG_DMI=y
CONFIG_EFI=y
CONFIG_GENERIC_IOMAP=y
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_AUDIT_ARCH=y
# CONFIG_IA64_GENERIC is not set
CONFIG_IA64_DIG=y
# CONFIG_IA64_HP_ZX1 is not set
# CONFIG_IA64_HP_ZX1_SWIOTLB is not set
# CONFIG_IA64_SGI_SN2 is not set
# CONFIG_IA64_HP_SIM is not set
# CONFIG_ITANIUM is not set
CONFIG_MCKINLEY=y
# CONFIG_IA64_PAGE_SIZE_4KB is not set
# CONFIG_IA64_PAGE_SIZE_8KB is not set
CONFIG_IA64_PAGE_SIZE_16KB=y
# CONFIG_IA64_PAGE_SIZE_64KB is not set
CONFIG_PGTABLE_3=y
# CONFIG_PGTABLE_4 is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_IA64_L1_CACHE_SHIFT=7
CONFIG_IA64_CYCLONE=y
CONFIG_IOSAPIC=y
CONFIG_FORCE_MAX_ZONEORDER=17
CONFIG_SMP=y
CONFIG_NR_CPUS=16
CONFIG_HOTPLUG_CPU=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_SCHED_SMT is not set
CONFIG_PERMIT_BSP_REMOVE=y
CONFIG_FORCE_CPEI_RETARGET=y
# CONFIG_PREEMPT is not set
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_SPARSEMEM_VMEMMAP_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_MIGRATION=y
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_ADAPTIVE_READAHEAD=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_DISCONTIGMEM_ENABLE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_VMEMMAP=y
CONFIG_ARCH_SPARSEMEM_VMEMMAP_STATIC=y
CONFIG_NUMA=y
CONFIG_NODES_SHIFT=10
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
CONFIG_HAVE_ARCH_NODEDATA_EXTENSION=y
CONFIG_IA32_SUPPORT=y
CONFIG_COMPAT=y
CONFIG_IA64_MCA_RECOVERY=y
CONFIG_PERFMON=y
CONFIG_IA64_PALINFO=y
# CONFIG_IA64_ESI is not set
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set

#
# Firmware Drivers
#
CONFIG_EFI_VARS=y
CONFIG_EFI_PCDP=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=m

#
# Power management and ACPI
#
CONFIG_PM=y
CONFIG_PM_LEGACY=y
# CONFIG_PM_DEBUG is not set
# CONFIG_PM_SYSFS_DEPRECATED is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BUTTON=m

Re: [take26-resend1 7/8] kevent: Signal notifications.

2006-12-11 Thread Evgeniy Polyakov
On Mon, Dec 11, 2006 at 12:32:55PM -0400, Mauricio Lin ([EMAIL PROTECTED]) wrote:
> Hi Evgeniy,

Hi Mauricio.

> I have used kobject_uevent() to notify userspace about some events.
> For instance, when memory consumption reaches a predefined watermark,
> a signal is sent to userspace to allow applications to free resources.
> But I am not sure if kobject_uevent() is the more appropriate way for
> that since if I have many different levels of notifications (using
> kobject_uevent()) from kernel space to user space, so how the
> application could know or differentiate from which level of kernel
> notification the signal was sent from?
> 
> The application should perform a specific task according to different
> type of received notification. So I do not know if the current kernel
> provides something like that. Do you know any current kernel (2.6.19)
> implementation for that?
> 
> After reading about your Kevent implementation, I guess that your
> patches are able to do what I need, right? Will it be included in the
> mainline kernel? Do you have examples about how can I use your socket
> and/or signal notifications to establish kernel and user space
> communication?

I do not know if it will be included or not, but would like to hear an
opinion of people added to Cc: list on that point.

I have a lot of examples from trivial applications to real-world web
server patched with kevent support. Although some applications might not
compile with the latest kevent sources due to interface parameters
changes, it is easily fixable looking into other examples.

As for your task - yes, it can be done through kevent: you need to create
your own kevent subsystem if you plan to use something special, register it
with kevent and start committing events.
But it is easier to use a different notification mechanism for that task:
I suggest using the netlink-based connector, genetlink or kobject_uevent
(although the latter is definitely not the best choice), and creating your
own protocol on top of those transports.

> BR,
> 
> Mauricio Lin.

-- 
Evgeniy Polyakov


Re: [take26-resend1 0/8] kevent: Generic event handling mechanism.

2006-12-11 Thread Evgeniy Polyakov
On Mon, Dec 11, 2006 at 10:16:30AM -0500, Jeff Garzik ([EMAIL PROTECTED]) wrote:
> Comments:
> 
> * [oh, everybody will hate me for saying this, but...]  to me, "kevent" 
> implies an internal kernel subsystem.  I would rather call it "uevent" 
> or anything else lacking a 'k' prefix.

It is indeed a kernel subsystem, which exports some of its parts to
userspace.
I previously thought that the prefix 'k' could only be confused with KDE.

> * I like the absolute timespec (and use of timespec itself)

And I do not, but I made them to make at least some progress.

> * more on naming:  I think kevent_open would be more natural than 
> kevent_init, since it opens a file descriptor.

It also initializes the ring buffer.

> * why is KEVENT_MAX not equal to KEVENT_POSIX_TIMER?  (perhaps answer 
> this question in a comment, if it is not a mistake)

I check for error number using '>=' and use it as array size, 
so it is always bigger than the last entry id.
I will add a comment.
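
The sentinel layout described above can be sketched in plain user-space C. This is only an illustration: the demo_* names are hypothetical and are not the kevent identifiers. The last enumerator is one past the final real entry, so it serves both as the array size and as the bound for a '>=' range check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event types; only the trailing-sentinel layout matters. */
enum demo_event_type {
	DEMO_SOCKET,
	DEMO_INODE,
	DEMO_POSIX_TIMER,	/* last real entry */
	DEMO_MAX		/* one past the last entry: count and array size */
};

/* Table indexed by event type, sized by the sentinel. */
static const char *const demo_names[DEMO_MAX] = {
	[DEMO_SOCKET]      = "socket",
	[DEMO_INODE]       = "inode",
	[DEMO_POSIX_TIMER] = "posix-timer",
};

/* Returns the name for 'type', or NULL when 'type' is out of range. */
static const char *demo_lookup(unsigned int type)
{
	if (type >= DEMO_MAX)	/* the '>=' check against the sentinel */
		return NULL;
	return demo_names[type];
}
```

If the sentinel were equal to the last real id, the same '>=' check would reject that id and the table would be one slot too short.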

> * Kill all the CONFIG_KEVENT_xxx sub-options, or hide them under 
> CONFIG_EMBEDDED.  Application developers should NOT be left wondering 
> whether or support for KEVENT_INODE was compiled into the kernel.

Ok, I will put them under !CONFIG_EMBEDDED.

-- 
Evgeniy Polyakov


[PATCH -mm] sata_nv: fix kfree ordering in remove

2006-12-11 Thread Robert Hancock

Jeff Garzik wrote:

It is unwise to free the struct before the ports are even detached.


Right, theoretically something bad could happen here (though not 
likely). Here's a fix. Sorry for attaching with something so trivial, 
but Thunderbird isn't very cooperative..


---

The suspend/resume change for sata_nv introduced a potential bug where 
the hpriv structure could be used after it was freed in nv_remove_one. 
Fix that.


Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

---
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
--- linux-2.6.19-rc6-mm2/drivers/ata/sata_nv.c  2006-12-11 22:13:26.0 -0600
+++ linux-2.6.19-rc6-mm2admafix/drivers/ata/sata_nv.c   2006-12-11 22:15:58.0 -0600
@@ -1555,8 +1555,8 @@ static void nv_remove_one (struct pci_de
struct ata_host *host = dev_get_drvdata(&pdev->dev);
struct nv_host_priv *hpriv = host->private_data;

-   kfree(hpriv);
ata_pci_remove_one(pdev);
+   kfree(hpriv);
 }  
 
 static int nv_pci_device_resume(struct pci_dev *pdev)


Re: Kevent POSIX timers support.

2006-12-11 Thread Evgeniy Polyakov
On Mon, Dec 11, 2006 at 05:36:44PM -0800, David Miller ([EMAIL PROTECTED]) wrote:
> From: Evgeniy Polyakov <[EMAIL PROTECTED]>
> Date: Tue, 28 Nov 2006 22:22:36 +0300
> 
> > And, btw, last time I checked, aligned_u64 was not exported to
> > userspace.
> 
> It is in linux/types.h and not protected by __KERNEL__ ifdefs.
> Perhaps you mean something else?

It looks like I checked wrong #ifdef __KERNEL__/#endif pair.
It is there indeed.

-- 
Evgeniy Polyakov


Re: [RFC] rfkill - Add support for input key to control wireless radio

2006-12-11 Thread Dmitry Torokhov
Hi Ivo,

On Thursday 07 December 2006 16:53, Ivo van Doorn wrote:
> Hi,
> 
> > > > >  2 - Hardware key that does not control the hardware radio and
> > > > >  does not report anything to userspace
> > > >
> > > > Kind of uninteresting button ;)
> > >
> > > And this is the button that rfkill was originally designed for.
> > > Laptops with integrated WiFi cards from Ralink have a hardware button
> > > that don't send anything to userspace (unless the ACPI event is read)
> > > and does not directly control the radio itself.
> > >
> > 
> > So what does such a button do? I am confused here...
> 
> Without a handler like rfkill, it does nothing besides toggling a bit in a
> register.
> The Ralink chipsets have a couple of registers that represent the state of
> that key.
> Besides that, there are no notifications to the userspace nor does it
> directly control the radio.
> That is where rfkill came in with the toggle handler that will listen to the
> register to check if the key has been pressed and properly process the key
> event.

In this case the driver can make the button state available to userspace, so
this is really a type 2) driver as far as I can see. The fact that the button
is not reported to userspace yet should not get in the way of classifying
it.

-- 
Dmitry


2.6.19.1 GFS2_FS_LOCKING_DLM bug still lurking

2006-12-11 Thread Chris Zubrzycki

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I tried building the new kernel and ran into this bug:

WARNING: "kernel_sendmsg" [fs/dlm/dlm.ko] undefined!
WARNING: "sock_release" [fs/dlm/dlm.ko] undefined!
WARNING: "config_item_put" [fs/dlm/dlm.ko] undefined!
WARNING: "sock_create_kern" [fs/dlm/dlm.ko] undefined!
WARNING: "config_item_init_type_name" [fs/dlm/dlm.ko] undefined!
WARNING: "config_group_init_type_name" [fs/dlm/dlm.ko] undefined!
WARNING: "configfs_register_subsystem" [fs/dlm/dlm.ko] undefined!
WARNING: "config_group_find_obj" [fs/dlm/dlm.ko] undefined!
WARNING: "configfs_unregister_subsystem" [fs/dlm/dlm.ko] undefined!
WARNING: "kernel_recvmsg" [fs/dlm/dlm.ko] undefined!
WARNING: "config_item_get" [fs/dlm/dlm.ko] undefined!
WARNING: "config_group_init" [fs/dlm/dlm.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2



I found the solution here: http://www.spinics.net/lists/kernel/msg535532.html

I only changed the one file, fs/gfs2/Kconfig, and added

+   depends on GFS2_FS && INET && (IPV6 || IPV6=n)
+   select IP_SCTP if DLM_SCTP
+   select CONFIGFS_FS

It seems to work fine now. Please CC me on any replies, thank you.

- -chris zubrzycki
- - --
PGP public key: http://homepage.mac.com/beren/publickey.txt
ID: 0xA2ABC070
Fingerprint: 26B0 BA6B A409 FA83 42B3  1688 FBF9 8232 A2AB C070


Remember: it's a "Microsoft virus", not an "email virus",
a "Microsoft worm", not a "computer worm".



-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)
Comment: Please sign reply-http://www.gnupg.org

iEYEARECAAYFAkV+OckACgkQ+/mCMqKrwHBE4gCgr6imhZykVHdTSKvNXF65IPdK
noMAn3ZipJc04zWA3NhzxFbZ84OCLMFt
=8DB0
-END PGP SIGNATURE-


Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 12 Dec 2006 05:09:23 +0100

> We definitly *like* being able to use bigger timeouts on 64bits platforms.
> 
> Not that they are mandatory since the same application should run fine on 
> 32bits kernel. But as the standard type for 'tick timestamps' is 'unsigned 
> long', a change would be invasive.
>
> Maybe some applications are now relying on being able to
> sleep()/select()/poll() for periods > 30 days and only run on 64
> bits kernels.

I think one possible target would be struct timer, at least
in theory.

There is also a line of reasoning that says that on 64-bit
platforms we have some flexibility to set HZ very large, if
we wanted to at some point, and going to 32-bit jiffies
storage for some things may eliminate that kind of flexibility.

Just some food for thought...


Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread Eric Dumazet

David Miller a écrit :

From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 12 Dec 2006 04:47:14 +0100


I doubt being able to extend the expiration of a dst above 2^32
ticks (49 days if HZ=1000, 198 days if HZ=250) is worth the ram
wastage.


And this doesn't apply for all jiffies uses because? :-)

That's the point I'm trying to make and get a discussion on.




Ah ok :)

Maybe my intentions were not clear:

I am not suggesting replacing all uses of jiffies with jiffies_32, just
*selected* ones :)

BTW, the real limit is 2^31 ticks, so it's 24 days (at HZ=1000).

We definitely *like* being able to use bigger timeouts on 64-bit platforms.

Not that they are mandatory, since the same application should run fine on a
32-bit kernel. But as the standard type for 'tick timestamps' is 'unsigned
long', a change would be invasive.


Maybe some applications now rely on being able to
sleep()/select()/poll() for periods > 30 days and only run on 64-bit kernels.
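
To make the 2^31 limit concrete, here is a small user-space sketch. It is an illustration only: the helper names echo the ones in the patch description, but the bodies are an assumption and are not taken from the patch, and ticks_to_days() is a made-up helper. The comparison takes the unsigned 32-bit difference and reads it as signed, which stays correct across a counter wrap only while the two stamps are less than 2^31 ticks apart.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the code from the patch.
 * Nonzero if 32-bit tick stamp 'a' is later than 'b'. The conversion of the
 * wrapped difference to int32_t assumes the usual two's-complement behaviour
 * of mainstream compilers.
 */
static inline int time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(b - a) < 0;
}

static inline int time_before32(uint32_t a, uint32_t b)
{
	return time_after32(b, a);
}

/* Whole days covered by 'ticks' timer ticks at 'hz' ticks per second. */
static inline uint64_t ticks_to_days(uint64_t ticks, uint32_t hz)
{
	return ticks / hz / 86400;
}
```

With these helpers, a stamp of 5 taken just after the counter wrapped compares as later than 0xFFFFFFF0, and ticks_to_days(1ULL << 31, 1000) gives the 24 days mentioned above, while 2^32 ticks give 49 days at HZ=1000 and 198 days at HZ=250.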


Eric


ieee80211 sleeping in invalid context

2006-12-11 Thread Ray Lee
Hey all, more data on my bcm43xx problem report from a few weeks back.

By random chance I acquired a brain, and decided to rebuild my latest kernel
pull with as many debugging options on as I could stand. Got the below, plus
a dead keyboard (except for Magic SysRq) (but only if I let userspace come up
fully -- booting with init=/bin/bash is fine). Since the trace below mentions
scans, I'm hoping it's related to my problem.

In other news, now that I've moved my laptop back to my home office, I'm able
to recreate the dead-keyboard lockups I've been having again, about once every
day or two. What fun. So if there are patches I should try on top of the latest
git, let me know. (Though I'm hoping the below will be a smoking gun to someone
who has a clue, i.e., not me.)

Ray

Dec 11 19:34:18 phoenix syslogd 1.4.1#18ubuntu6: restart.
Dec 11 19:34:18 phoenix kernel: Inspecting /boot/System.map-2.6.19
Dec 11 19:34:19 phoenix kernel: Loaded 26330 symbols from 
/boot/System.map-2.6.19.
Dec 11 19:34:19 phoenix kernel: Symbols match kernel version 2.6.19.
Dec 11 19:34:19 phoenix kernel: No module symbols loaded - kernel modules not 
enabled.
Dec 11 19:34:19 phoenix kernel: [0.00] Linux version 2.6.19 ([EMAIL 
PROTECTED]) (gcc version 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)) 
#1 PREEMPT Mon Dec 11 12:52:41 PST 2006
Dec 11 19:34:19 phoenix kernel: [0.00] Command line: 
root=UUID=bf7dc35f-5eff-4a85-b398-590f37c5679e ro noapic
Dec 11 19:34:19 phoenix kernel: [0.00] BIOS-provided physical RAM map:
Dec 11 19:34:19 phoenix kernel: [0.00]  BIOS-e820:  - 
0009fc00 (usable)
Dec 11 19:34:19 phoenix kernel: [0.00]  BIOS-e820: 0009fc00 - 
000a (reserved)
Dec 11 19:34:19 phoenix kernel: [0.00]  BIOS-e820: 000e - 
0010 (reserved)
Dec 11 19:34:20 phoenix kernel: [0.00]  BIOS-e820: 0010 - 
37fd (usable)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: 37fd - 
37fefc00 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: 37fefc00 - 
37ffb000 (ACPI NVS)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: 37ffb000 - 
4000 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: e000 - 
f000 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: fec0 - 
fec02000 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: ffb8 - 
ffc0 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00]  BIOS-e820: fff8 - 
0001 (reserved)
Dec 11 19:34:21 phoenix kernel: [0.00] end_pfn_map = 1048576
Dec 11 19:34:21 phoenix kernel: [0.00] DMI 2.3 present.
Dec 11 19:34:23 phoenix kernel: [0.00] No mptable found.
Dec 11 19:34:23 phoenix kernel: [0.00] Zone PFN ranges:
Dec 11 19:34:23 phoenix kernel: [0.00]   DMA 0 -> 4096
Dec 11 19:34:23 phoenix kernel: [0.00]   DMA324096 ->  1048576
Dec 11 19:34:24 phoenix kernel: [0.00]   Normal1048576 ->  1048576
Dec 11 19:34:24 phoenix kernel: [0.00] early_node_map[2] active PFN 
ranges
Dec 11 19:34:24 phoenix kernel: [0.00] 0:0 ->  159
Dec 11 19:34:24 phoenix kernel: [0.00] 0:  256 ->   229328
Dec 11 19:34:24 phoenix hpiod: 1.6.9 accepting connections at 2208...
Dec 11 19:34:25 phoenix kernel: [0.00] ACPI: PM-Timer IO Port: 0x8008
Dec 11 19:34:25 phoenix kernel: [0.00] ACPI: LAPIC (acpi_id[0x01] 
lapic_id[0x00] enabled)
Dec 11 19:34:25 phoenix kernel: [0.00] Processor #0 (Bootup-CPU)
Dec 11 19:34:25 phoenix kernel: [0.00] ACPI: LAPIC_NMI (acpi_id[0x01] 
high edge lint[0x1])
Dec 11 19:34:25 phoenix kernel: [0.00] ACPI: Skipping IOAPIC probe due 
to 'noapic' option.
Dec 11 19:34:25 phoenix kernel: [0.00] arch/x86_64/mm/init.c:145: bad 
pte 810001c58fe8(8000fec01173).
Dec 11 19:34:25 phoenix kernel: [0.00] Nosave address range: 
0009f000 - 000a
Dec 11 19:34:25 phoenix kernel: [0.00] Nosave address range: 
000a - 000e
Dec 11 19:34:25 phoenix kernel: [0.00] Nosave address range: 
000e - 0010
Dec 11 19:34:25 phoenix kernel: [0.00] Allocating PCI resources 
starting at 5000 (gap: 4000:a000)
Dec 11 19:34:25 phoenix kernel: [0.00] Built 1 zonelists.  Total pages: 
223940
Dec 11 19:34:25 phoenix kernel: [0.00] Kernel command line: 
root=UUID=bf7dc35f-5eff-4a85-b398-590f37c5679e ro noapic
Dec 11 19:34:25 phoenix kernel: [0.00] Initializing CPU#0
Dec 11 19:34:25 phoenix kernel: [0.00] PID hash table entries: 4096 
(order: 12, 32768 bytes)
Dec 11 19:34:25 phoenix kernel: [   13.705535] time.c: Using 3.579545 MHz WALL 
PM GTOD 

Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 12 Dec 2006 04:47:14 +0100

> I doubt being able to extend the expiration of a dst above 2^32
> ticks (49 days if HZ=1000, 198 days if HZ=250) is worth the ram
> wastage.

And this doesn't apply for all jiffies uses because? :-)

That's the point I'm trying to make and get a discussion on.


Re: [PATCH][SCSI]: Save some bytes in struct scsi_target

2006-12-11 Thread Matthew Wilcox
On Tue, Dec 12, 2006 at 01:17:18AM -0200, Arnaldo Carvalho de Melo wrote:
> }; /* size: 368, cachelines: 12 */
> }; /* size: 364, cachelines: 12 */

Saving space is always good ;-)

> - unsigned int create:1; /* signal that it needs to be added */
> + char scsi_level;
> + unsigned char create:1; /* signal that it needs to be added */
>   unsigned int pdt_1f_for_no_lun;  /* PDT = 0x1f */
>   /* means no lun present */
>  
> - char scsi_level;

However, pdt_1f_for_no_lun is really only one bit, saving another 4 bytes.

>   struct execute_work ew;
>   enum scsi_target_state  state;

enums are a bit of a pain.  Even though scsi_target_state uses only two
values, it's represented as an int.  Unless you're on arm-eabi, when
it'll use less.  And even then, it won't use less than a byte, as it has
to be addressable.  I wonder if we can turn scsi_target_state into a
bit.  That'll save another 8 bytes total.
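
As a rough illustration of the space argument, here is a user-space sketch. The two structs are hypothetical stand-ins, not the real struct scsi_target, and the exact sizes depend on the compiler and ABI. Bit-fields of type unsigned char are implementation-defined in C; the sketch assumes a compiler such as GCC that accepts them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: each flag stored in a full unsigned int. */
struct target_wide {
	unsigned int create;
	unsigned int pdt_1f_for_no_lun;
	char scsi_level;
};

/* Same information, with the flags as single bits next to scsi_level. */
struct target_packed {
	char scsi_level;
	unsigned char create:1;
	unsigned char pdt_1f_for_no_lun:1;
};

/* Bytes saved by the packed layout on this compiler and ABI. */
static size_t target_bytes_saved(void)
{
	return sizeof(struct target_wide) - sizeof(struct target_packed);
}
```

The trade-off is that a bit-field has no address, so code that takes a pointer to one of these flags would have to change; that is the same addressability constraint that keeps an enum member from shrinking below one byte.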


Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread Eric Dumazet

David Miller wrote:
> From: Eric Dumazet <[EMAIL PROTECTED]>
> Date: Mon, 11 Dec 2006 23:58:06 +0100
>
>> Some subsystems dont need more than 32bits timestamps.
>>
>> See for example net/ipv4/inetpeer.c and include/net/tcp.h :
>> #define tcp_time_stamp          ((__u32)(jiffies))
>>
>> Because most timeouts should work with 'normal jiffies' that are 32bits on
>> 32bits platforms, it makes sense to be able to use only 32bits to store them
>> and not 64 bits, to save ram.
>>
>> This patch introduces jiffies_32, and related comparison functions
>> time_after32(), time_before32(), time_after_eq32() and time_before_eq32().
>>
>> I plan to use this infrastructure in network code for example (struct
>> dst_entry comes to mind).
>
> The TCP case is because the protocol limits the size of
> the timestamp we can store in the TCP Timestamp option.
>
> Otherwise we would use the full 64-bit jiffies timestamp,
> in order to have a larger window of values which would not
> overflow.
>
> Since there is no protocol limitation involved in cases
> such as dst_entry, I think we should keep it at 64-bits
> on 64-bit platforms to make the wrap-around window as
> large as possible.
>
> I really don't see any reason to make these changes.  Yes,
> you'd save some space, but one of the chief advantages of
> 64-bit is that we get larger jiffies value windows.  If
> that has zero value, as your intended changes imply, then
> we shouldn't need the default 64-bit jiffies either, by
> implication.


Well, just to have similar functions to manipulate jiffies.

We already know that using a 32bits dtime in struct inet_peer permitted an 
object size of 64 bytes instead of 128 bytes (on 64bits platforms)


On one machine (running linux-2.6.16) I have :

# grep peer /proc/slabinfo
inet_peer_cache    65972  86220    128   30    1 : tunables  120   60    8 : 
slabdata   2874   2874    262


(So the 128 bytes -> 64 bytes is going to save 1437 pages of memory)

# grep dst /proc/slabinfo
ip_dst_cache  1765818 2077380    320   12    1 : tunables   54   27    8 : 
slabdata 173115 173115     39


(So saving 4*4 bytes per dst might save 32 MB of ram).
I doubt being able to extend the expiration of a dst above 2^32 ticks (49 days 
if HZ=1000, 198 days if HZ=250) is worth the ram wastage.


Wouldn't you prefer to be able to apply this patch, for example?

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
--- linux/net/ipv4/inetpeer.c.orig  2006-12-12 05:25:42.0 +0100
+++ linux-ed/net/ipv4/inetpeer.c    2006-12-12 05:29:22.0 +0100
@@ -338,8 +338,7 @@ static int cleanup_once(unsigned long tt
 	spin_lock_bh(&inet_peer_unused_lock);
 	p = inet_peer_unused_head;
 	if (p != NULL) {
-		__u32 delta = (__u32)jiffies - p->dtime;
-		if (delta < ttl) {
+		if (time_after32(p->dtime + ttl, jiffies_32)) {
 			/* Do not prune fresh entries. */
 			spin_unlock_bh(&inet_peer_unused_lock);
 			return -1;
@@ -466,7 +465,7 @@ void inet_putpeer(struct inet_peer *p)
 		p->unused_next = NULL;
 		*inet_peer_unused_tailp = p;
 		inet_peer_unused_tailp = &p->unused_next;
-		p->dtime = (__u32)jiffies;
+		p->dtime = jiffies_32;
 	}
 	spin_unlock_bh(&inet_peer_unused_lock);
 }
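
For readers who want to play with the wrap-around behaviour outside the kernel, here is a minimal user-space sketch. The function names follow the patch description, but the bodies are an assumption modelled on the usual signed-difference trick; they are not copied from the patch. entry_is_fresh() is a hypothetical helper mirroring the cleanup_once() check.

```c
#include <assert.h>
#include <stdint.h>

/* True when a is later than b, provided they are less than 2^31 ticks apart. */
int time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

int time_before32(uint32_t a, uint32_t b)
{
	return time_after32(b, a);
}

/* Mirrors the check in the patch: has the entry been unused for less than ttl? */
int entry_is_fresh(uint32_t dtime, uint32_t ttl, uint32_t now)
{
	assert(ttl < 0x80000000u);	/* the comparison window is 2^31 ticks */
	return time_after32(dtime + ttl, now);
}
```

The comparison stays correct when the 32-bit counter wraps between dtime and now, which is the property both the old delta test and the new helper rely on.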


Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread David Miller
From: Matt Helsley <[EMAIL PROTECTED]>
Date: Mon, 11 Dec 2006 19:09:16 -0800

> Hmm, that GCC assumption conflicts with the prototypes of memcpy() I've
> seen.

When GCC expands __builtin_memcpy() internally it looks at the types
of the arguments, and what it knows about their guaranteed alignment.

memcpy()'s declaration of the first argument as "void *" has
zero influence upon any of this.


Re: [PATCH] WorkStruct: Implement generic UP cmpxchg() where an

2006-12-11 Thread linux
>> to keep the amount of code between ll and sc to an absolute minimum
>> to avoid interference which causes livelock.  Processor timeouts
>> are generally much longer than any reasonable code sequence.

> "Generally" does not mean you can just ignore it and hope the C compiler
> does the right thing. Nor is it enough for just SOME of the architectures
> to have the properties you require.

If it's an order of magnitude larger than the common case, then yes
you can.  Do we worry about writing functions so big that they
exceed branch displacement limits?

That's detected at compile time, but LL/SC pair distance is
in principle straightforward to measure, too.

> Ralf tells us that MIPS cannot execute any loads, stores, or sync
> instructions on MIPS. Ivan says no loads, stores, taken branches etc
> on Alpha.
>
> MIPS also has a limit of 2048 bytes between the ll and sc.

I agree with you about the Alpha, and that will have to be directly
coded.  But on MIPS, the R4000 manual (2nd ed, covering the R4400
as well) says

> The link is broken in the following circumstances:
>   · if any external request (invalidate, snoop, or intervention)
>     changes the state of the line containing the lock variable to
>     invalid
>   · upon completion of an ERET (return from exception)
>     instruction
>   · an external update to the cache line containing the lock
>     variable

Are you absolutely sure of what you are reporting about MIPS?
Have you got a source?  I've been checking the most authoritative
references I can find and can't find mention of such a restriction.
(The R8000 User's Manual doesn't appear to mention LL/SC at all, sigh.)

One thing I DID find is the "R4000MC Errata, Processor Revision 2.2 and
3.0", which documents several LL/SC bugs (Numbers 10, 12, 13) and #12
in particular requires extremely careful coding in the workaround.

That may completely scuttle the idea of using generic LL/SC functions.

> So you almost definitely cannot have gcc generated assembly between. I
> think we agree on that much.

We don't.  I think that if that restriction applies, it's worthless,
because you can't achieve a net reduction in arch-dependent code.

GCC specifically says that if you want a 100% guarantee of no reloads
between asm instructions, place them in a single asm() statement.

> In truth, however, realizing that we're only talking about three
> architectures (two of which have 32 & 64-bit versions) it's probably not
> worth it.  If there were five, it would probably be a savings, but 3x
> code duplication of some small, well-defined primitives is a fair price
> to pay for avoiding another layer of abstraction (a.k.a. obfuscation).
> 
> And it lets you optimize them better.
> 
> I apologize for not having counted them before.

> I also disagree that the architectures don't matter. ARM and PPC are
> pretty important, and I believe Linux on MIPS is growing too.

Er... I definitely don't see where I said, and I don't even see where
I implied - or even hinted - that MIPS, ARM and PPC "don't matter."
I use Linux on ARM daily.

I just thought that writing a nearly-optimal generic primitive is about
3x harder than writing a single-architecture one, so even for primitives
yet to be written, it's just as easy to do it fully arch-specific.

Plus you have corner cases like the R5900 that don't have LL/SC at all.
(Can it be used multiprocessor?)

> One proposal that I could buy is an atomic_ll/sc API, which mapped
> to a cmpxchg emulation even on those llsc architectures which had
> any sort of restriction whatsoever. This could be used in regular C
> code (eg. you indicate powerpc might be able to do this). But it may
> also help cmpxchg architectures optimise their code, because the
> load really wants to be a "load with intent to store" -- and is
> IMO the biggest suboptimal aspect of current atomic_cmpxchg.

Or, possibly, an interface like

	do {
		oldvalue = ll(addr)
		newvalue = ... oldvalue ...
	} while (!sc(addr, oldvalue, newvalue))

Where sc() could be a cmpxchg.  But, more importantly, if the
architecture did implement LL/SC, it could be a "try plain SC; if
that fails try CMPXCHG built out of LL/SC; if that fails, loop"

Actually, I'd want something a bit more integrated, that could
have the option of fetching the new oldvalue as part of the sc()
implementation if that failed.  Something like

	DO_ATOMIC(addr, oldvalue) {
		... code ...
	} UNTIL_ATOMIC(addr, oldvalue, newvalue);

or perhaps, to encourage short code sections,

	DO_ATOMIC(addr, oldvalue, code, newvalue);

The problem is, that's already not optimal for spinlocks, where
you want to use a non-linked load while spinning.
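
As a rough illustration of the loop shape above, here is a user-space sketch using C11 atomics instead of any kernel primitive. add_capped() is a made-up helper. atomic_compare_exchange_weak() stands in for sc(): it is allowed to fail spuriously, and on failure it stores the value it found into the expected-value argument, which gives the "fetch the new oldvalue as part of sc()" behaviour without a separate reload.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: atomically add delta to *addr,
 * capping the result at limit, and return the value that was stored.
 */
int add_capped(atomic_int *addr, int delta, int limit)
{
	int oldval, newval;

	assert(addr != NULL);
	oldval = atomic_load_explicit(addr, memory_order_relaxed);	/* "ll" */

	do {
		newval = oldval + delta;	/* keep this section short */
		if (newval > limit)
			newval = limit;
		/* "sc": on failure oldval is refreshed with the current value */
	} while (!atomic_compare_exchange_weak(addr, &oldval, newval));

	return newval;
}
```

Whether a compiler maps this onto a bare LL/SC pair or onto a cmpxchg instruction is up to the target; the sketch only shows the interface, not the code generation questions discussed above.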


Re: [2.6.19] NFS: server error: fileid changed

2006-12-11 Thread Martin Knoblauch

--- Trond Myklebust <[EMAIL PROTECTED]> wrote:

> On Mon, 2006-12-11 at 15:44 -0800, Martin Knoblauch wrote:
> >  So far, we are only seeing it on amd-mounted filesystems, not on
> > static NFS mounts. Unfortunately, it is difficult to avoid "amd" in
> > our environment.
> 
> Any chance you could try substituting a recent version of autofs? This
> sort of problem is more likely to happen on partitions that are
> unmounted and then remounted often. I'd just like to figure out if this
> is something that we need to fix in the kernel, or if it is purely an
> amd problem.
> 
> Cheers
>   Trond
> 
Hi Trond,

 unfortunately I have no control over the mount maps, as they are
maintained by different people. So the answer is no. Unfortunately
the customer has decided on using am-utils. This has been hurting us
(and them) for years ...

 You are likely correct when you hint towards partitions which are
frequently remounted.

 In any case, your help is appreciated.

Cheers
Martin


--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de


[PATCH][SCSI]: Save some bytes in struct scsi_target

2006-12-11 Thread Arnaldo Carvalho de Melo
Before:

[EMAIL PROTECTED] kpahole-2.6]$ pahole --cacheline 32 /tmp/scsi.o.before scsi_target
/* include/scsi/scsi_device.h:86 */
struct scsi_target {
        struct scsi_device *       starget_sdev_user;    /*     0     4 */
        struct list_head           siblings;             /*     4     8 */
        struct list_head           devices;              /*    12     8 */
        struct device              dev;                  /*    20   300 */
        /* --- cacheline 10 boundary (320 bytes) --- */
        unsigned int               reap_ref;             /*   320     4 */
        unsigned int               channel;              /*   324     4 */
        unsigned int               id;                   /*   328     4 */
        unsigned int               create:1;             /*   332     4 */

        /* XXX 31 bits hole, try to pack */

        unsigned int               pdt_1f_for_no_lun;    /*   336     4 */
        char                       scsi_level;           /*   340     1 */

        /* XXX 3 bytes hole, try to pack */

        struct execute_work        ew;                   /*   344    16 */
        /* --- cacheline 11 boundary (352 bytes) was 8 bytes ago --- */
        enum scsi_target_state     state;                /*   360     4 */
        void *                     hostdata;             /*   364     4 */
        long unsigned int          starget_data[0];      /*   368     0 */
}; /* size: 368, cachelines: 12 */
   /* sum members: 365, holes: 1, sum holes: 3 */
   /* bit holes: 1, sum bit holes: 31 bits */
   /* last cacheline: 16 bytes */

After:

[EMAIL PROTECTED] kpahole-2.6]$ pahole --cacheline 32 drivers/scsi/scsi.o scsi_target
/* include/scsi/scsi_device.h:86 */
struct scsi_target {
        struct scsi_device *       starget_sdev_user;    /*     0     4 */
        struct list_head           siblings;             /*     4     8 */
        struct list_head           devices;              /*    12     8 */
        struct device              dev;                  /*    20   300 */
        /* --- cacheline 10 boundary (320 bytes) --- */
        unsigned int               reap_ref;             /*   320     4 */
        unsigned int               channel;              /*   324     4 */
        unsigned int               id;                   /*   328     4 */
        char                       scsi_level;           /*   332     1 */
        unsigned char              create:1;             /*   333     1 */

        /* XXX 7 bits hole, try to pack */
        /* XXX 2 bytes hole, try to pack */

        unsigned int               pdt_1f_for_no_lun;    /*   336     4 */
        struct execute_work        ew;                   /*   340    16 */
        /* --- cacheline 11 boundary (352 bytes) was 4 bytes ago --- */
        enum scsi_target_state     state;                /*   356     4 */
        void *                     hostdata;             /*   360     4 */
        long unsigned int          starget_data[0];      /*   364     0 */
}; /* size: 364, cachelines: 12 */
   /* sum members: 362, holes: 1, sum holes: 2 */
   /* bit holes: 1, sum bit holes: 7 bits */
   /* last cacheline: 12 bytes */

[EMAIL PROTECTED] kpahole-2.6]$ codiff -V /tmp/scsi.o.before drivers/scsi/scsi.o
drivers/scsi/scsi.c:
  struct scsi_target |   -4
    create:1;
     from: unsigned int  /*   332(31)    4(1) */
     to:   unsigned char /*   333(7)     1(1) */
    scsi_level;
     from: char          /*   340(0)     1(0) */
     to:   char          /*   332(0)     1(0) */

 1 struct changed

Signed-off-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

---

 scsi_device.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index ebf31b1..ab245fc 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -175,11 +175,11 @@ struct scsi_target {
 	unsigned int		channel;
 	unsigned int		id; /* target id ... replace
 				     * scsi_device.id eventually */
-	unsigned int		create:1; /* signal that it needs to be added */
+	char			scsi_level;
+	unsigned char		create:1; /* signal that it needs to be added */
 	unsigned int		pdt_1f_for_no_lun;	/* PDT = 0x1f */
 						/* means no lun present */
 
-	char			scsi_level;
 	struct execute_work	ew;
 	enum scsi_target_state	state;
 	void			*hostdata; /* available to low-level driver */
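
The effect of the reordering can be reproduced without pahole in a small user-space sketch. The two structs below are cut-down stand-ins for the relevant part of struct scsi_target, not the real definition, and the saving they show is whatever the local compiler and ABI produce.

```c
#include <assert.h>
#include <stddef.h>

/* Field order before the patch: the 1-bit flag occupies a whole int slot. */
struct target_before {
	unsigned int id;
	unsigned int create:1;
	unsigned int pdt_1f_for_no_lun;
	char scsi_level;
};

/* Field order after the patch: the char and the 1-bit flag share a slot. */
struct target_after {
	unsigned int id;
	char scsi_level;
	unsigned char create:1;	/* implementation-defined bit-field type,
				 * accepted by GCC and Clang */
	unsigned int pdt_1f_for_no_lun;
};

/* Bytes saved by the reordering on the current ABI. */
size_t bytes_saved(void)
{
	assert(sizeof(struct target_after) <= sizeof(struct target_before));
	return sizeof(struct target_before) - sizeof(struct target_after);
}
```

Printing bytes_saved() on a 32-bit x86 build should give a figure comparable to the codiff output above, but the sketch makes no claim about other ABIs.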


Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread Matt Helsley
On Mon, 2006-12-11 at 17:50 -0800, David Miller wrote:
> From: Pete Zaitcev <[EMAIL PROTECTED]>
> Date: Mon, 11 Dec 2006 17:29:07 -0800
> 
> > On Mon, 11 Dec 2006 15:52:47 -0800, Matt Helsley <[EMAIL PROTECTED]> wrote:
> >
> > >   I'm shocked memcpy() introduces 8-byte stores that violate architecture
> > > alignment rules. Is there any chance this a bug in ia64's memcpy()
> > > implementation? I've tried to read it but since I'm not familiar with
> > > ia64 asm I can't make out significant parts of it in
> > > arch/ia64/lib/memcpy.S.
> >
> > The arch/ia64/lib/memcpy.S is probably fine, it must be gcc doing
> > an inline substitution of a well-known function.
> >
> > A commenter on my blog mentioned seeing the same thing in the past.
> > (http://zaitcev.livejournal.com/107185.html?thread=128945#t128945)
> >
> > It's possible that applying (void *) cast to the first argument of memcpy
> > would disrupt this optimization. But since we have a well understood
> > patch by Erik, which only adds a penalty of 32 bytes of stack waste
> > and 32 bytes of memcpy, I thought it best not to bother with heaping
> > workarounds.
> 
> Yes GCC can assume the object is aligned because of the type
> of the argument to memcpy().

Hmm, that GCC assumption conflicts with the prototypes of memcpy() I've
seen.

Does the code really check the type or just the size argument? If the
latter then I don't think assuming alignment is correct -- we could be
copying a non-nul-terminated string that happens to be a power of 2 in
length.

Cheers,
-Matt Helsley



Re: [PATCH 0/1] V4L/DVB fix

2006-12-11 Thread Linus Torvalds


On Mon, 11 Dec 2006, Mauro Carvalho Chehab wrote:
> 
> Please pull 'master' from:
> git://git.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb.git 
> master
> 
> It fixes a breakage when compiling on ia64.

I get "Already up-to-date."

Did you forget to push out again?

Linus


Re: [PATCH] Whinge in paging_init if noexec is on with a non-PAE kernel

2006-12-11 Thread Kyle McMartin
On second thought, this is probably better.  Most people will
presumably be booting non-PAE kernels, so generating this message when
they've not tried to force the issue seems silly.

This way, the user will only see a warning if they actually go
out and specify "noexec=on" on the command line.

Sucks to have to do #ifdef #else though, I wonder if there's a
better way to initialize that.

Cheers,
Kyle

diff --git a/arch/i386/mm/init.c b/arch/i386/mm/init.c
index 84697df..ff389f1 100644
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -422,7 +422,15 @@ void zap_low_mappings (void)
flush_tlb_all();
 }
 
+/* disable_nx = 0 will generate unwanted warnings if it is
+ * the default case for non-PAE kernels, but we probably want
+ * NX by default on kernels built with PAE.
+ */
+#ifdef CONFIG_X86_PAE
 static int disable_nx __initdata = 0;
+#else
+static int disable_nx __initdata = 1;
+#endif
 u64 __supported_pte_mask __read_mostly = ~_PAGE_NX;
 
 /*
@@ -512,6 +520,9 @@ void __init paging_init(void)
set_nx();
if (nx_enabled)
printk("NX (Execute Disable) protection: active\n");
+#else
+   if (!disable_nx)
+   printk("NX (Execute Disable) only supported with CONFIG_HIGHMEM64G\n");
 #endif
 
pagetable_init();


Re: [patch] pipe: Don't oops when pipe filesystem isn't mounted

2006-12-11 Thread Linus Torvalds


On Mon, 11 Dec 2006, Andrew Morton wrote:
> 
> Looks like this might break pcmcia which for some reason does firmware
> requesting at fs_initcall level (drivers/pcmcia/ds.c).

Ok, that's just strange. 

I think it's fine to do init_pcmcia_bus early to make sure that the PCMCIA 
bus interface is there by the time the driver init stuff happens, but I 
really don't see the point of that firmware load to be there. And all that 
MATCH_FAKE_CIS stuff is about the _devices_, not the PCMCIA bus, so that 
whole thing looks pretty silly. It should be done by the device 
registration (which is obviously device_initcall), not by some bus layer.

Hopefully Dominik can fix whatever up (if it even needs it)

Linus


Re: SATA-performance with AHCI

2006-12-11 Thread Tejun Heo
Martin A. Fink wrote:
> Compared to ICH6R with AHCI OFF the only difference I can see is that with 
> AHCI the system seems to reac much faster on keyboard events and screen 
> redraw seems to be as fast as normal. It looks like that CPU usage has not 
> decreased that dramatically as I would have expected it.
> 
> Thus I did a small calculation:
> Assuming that the processor gives workloads of (a) 1B (b) 1kB (c) 64kB to the 
> DMA controller in AHCI mode to write 45 MB/s to disk, I calculate for 10% CPU 
> time usage of the 3.2 GHz Pentium
> (a) 10% * 3.2GHz / 45M calls = 7.3 CPU cycles per 1B call to DMA
> (b) 10% * 3.2GHz / 45k calls = 7.4E+03 CPU cycles per 1kB call to DMA
> (c) 10% * 3.2GHz / 720 calls = 4.8E+05 CPU cycles per 64kB call to DMA
> 
> For me (a) looks reasonable (some overhead per byte), but stupid - if 
> implemented. Giving bigger packages like (b) and (c) looks better to me, but 
> then I can't understand that huge overhead (1E3 to 1E5 cpu cycles per 
> package) for one package.
> 
> Is this normal or do I still have something wrong in my system?

It's not ata_piix or ahci that's eating up your cpu cycles.  It's
memcpy from your user program to kernel buffer.

[root]# dd if=/dev/zero of=/dev/sda bs=1M &
[1] 2649
[root]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  3      0   6216 477328   3660    0    0  2579 31377  955 1380  2  8 38 51
 0  3      0   6008 477616   3680    0    0     0 55296  404  170  0 14  0 86
 0  3      0   5772 477808   3640    0    0     0 73728  392  207  0 18  0 82
 0  3      0   5896 477680   3656    0    0     0 74240  394  207  0 18  0 82
 0  3      0   6084 477520   3656    0    0     0 73728  393  205  0 18  0 82
 1  2      0   6652 477204   3668    0    0     0 57440  401  197  0 16  0 84
 0  3      0   6136 477496   3664    0    0     0 72168  394  195  0 17  0 83
 0  3      0   6320 477316   3644    0    0     0 73728  392  207  0 19  0 81

[root]# dd if=/dev/zero of=/dev/sda bs=1M oflag=direct &
[1] 2657
[root]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0 494568    224   3836    0    0   835 11149  480  454  1  3 78 18
 0  1      0 494568    224   3836    0    0     0 70656  406  146  0  2  0 98
 0  1      0 494568    224   3840    0    0     0 69632  393  144  0  1  0 99
 0  1      0 494568    232   3832    0    0     0 69680  396  152  0  2  0 98
 0  1      0 494568    232   3840    0    0     0 68608  392  142  0  1  0 99
 1  1      0 494568    232   3840    0    0     0 69632  395  144  0  2  0 98
 0  1      0 494576    232   3840    0    0     0 69632  393  143  0  2  0 98
 0  1      0 494576    232   3840    0    0     0 70656  395  144  0  1  0 99
 0  1      0 494576    232   3840    0    0     0 70656  394  148  0  2  0 98
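
The back-of-the-envelope estimate quoted at the top can be written down as a small helper, in case anyone wants to redo it with other numbers. cycles_per_request() is a made-up name, and the sketch assumes decimal units (1 kB = 1000 bytes), so its results differ slightly from the quoted figures.

```c
#include <assert.h>

/*
 * CPU cycles spent per request, given the busy fraction of one CPU,
 * its clock rate, the sustained throughput and the request size.
 */
double cycles_per_request(double cpu_fraction, double clock_hz,
			  double bytes_per_sec, double request_bytes)
{
	double requests_per_sec;

	assert(bytes_per_sec > 0 && request_bytes > 0);
	requests_per_sec = bytes_per_sec / request_bytes;
	return cpu_fraction * clock_hz / requests_per_sec;
}
```

For 10% of a 3.2 GHz CPU at 45 MB/s, cycles_per_request(0.10, 3.2e9, 45e6, 1.0) comes out at roughly 7 cycles per byte. Read that way, the budget fits a cost that scales with the amount of data copied rather than with the number of requests.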

-- 
tejun


Re: [patch] pipe: Don't oops when pipe filesystem isn't mounted

2006-12-11 Thread Dominik Brodowski
On Mon, Dec 11, 2006 at 06:08:22PM -0800, Andrew Morton wrote:
> > diff --git a/include/linux/init.h b/include/linux/init.h
> > index 5eb5d24..5a593a1 100644
> > --- a/include/linux/init.h
> > +++ b/include/linux/init.h
> > @@ -111,6 +111,7 @@ extern void setup_arch(char **);
> >  #define subsys_initcall_sync(fn)   __define_initcall("4s",fn,4s)
> >  #define fs_initcall(fn)__define_initcall("5",fn,5)
> >  #define fs_initcall_sync(fn)   __define_initcall("5s",fn,5s)
> > +#define rootfs_initcall(fn)    __define_initcall("rootfs",fn,rootfs)
> >  #define device_initcall(fn)__define_initcall("6",fn,6)
> >  #define device_initcall_sync(fn)   __define_initcall("6s",fn,6s)
> >  #define late_initcall(fn)  __define_initcall("7",fn,7)
> 
> Looks like this might break pcmcia which for some reason does firmware
> requesting at fs_initcall level (drivers/pcmcia/ds.c).

That codepath is not triggered before device_initcall()s occur. So it's a
non-issue for PCMCIA.

Thanks,
Dominik


RE: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread Chen, Kenneth W
Pete Zaitcev wrote on Monday, December 11, 2006 5:29 PM
> On Mon, 11 Dec 2006 15:52:47 -0800, Matt Helsley <[EMAIL PROTECTED]> wrote:
> 
> > I'm shocked memcpy() introduces 8-byte stores that violate architecture
> > alignment rules. Is there any chance this a bug in ia64's memcpy()
> > implementation? I've tried to read it but since I'm not familiar with
> > ia64 asm I can't make out significant parts of it in
> > arch/ia64/lib/memcpy.S.
> 
> The arch/ia64/lib/memcpy.S is probably fine, it must be gcc doing
> an inline substitution of a well-known function.

arch/ia64/lib/memcpy.S is fine: it checks alignment at the very beginning
of the function and takes a different path depending on the alignment of
dst/src.  The unaligned access is coming from the gcc-inlined version of
memcpy.

But looking more deeply, the memory allocation for proc_event in
proc_fork_connector doesn't look correct at all:

In drivers/connector/cn_proc.c:
#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event))

void proc_fork_connector(struct task_struct *task)
{
	struct cn_msg *msg;
	struct proc_event *ev;
	__u8 buffer[CN_PROC_MSG_SIZE];

You can't do that because gcc assumes struct proc_event is aligned on a
certain boundary.  Hand-crafting the buffer like that breaks the code
generated by gcc.
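
One possible way to keep a byte buffer while still giving the compiler the alignment it assumes is sketched below in user-space C11. struct msg_hdr and struct event are hypothetical stand-ins, not struct cn_msg and struct proc_event. Note the sketch only works because the stand-in header size is a multiple of the payload alignment; if the real header is not, aligning the buffer alone is not enough.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header and payload types, for illustration only. */
struct msg_hdr {
	uint32_t seq;
	uint16_t len;
	uint16_t flags;
};

struct event {
	uint64_t timestamp_ns;	/* 8-byte member drives the struct alignment */
	uint32_t what;
};

#define MSG_SIZE (sizeof(struct msg_hdr) + sizeof(struct event))

/* Returns nonzero when the payload slot inside buf is suitably aligned. */
int event_is_aligned(const unsigned char *buf)
{
	uintptr_t addr = (uintptr_t)(buf + sizeof(struct msg_hdr));

	return addr % alignof(struct event) == 0;
}

int aligned_buffer_example(void)
{
	/* A plain unsigned char array only guarantees byte alignment;
	 * alignas() asks the compiler for more. */
	alignas(8) unsigned char buffer[MSG_SIZE];
	struct event ev = { .timestamp_ns = 1, .what = 2 };

	assert(sizeof(struct msg_hdr) % alignof(struct event) == 0);
	memcpy(buffer + sizeof(struct msg_hdr), &ev, sizeof(ev));
	return event_is_aligned(buffer);
}
```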

- Ken


Re: [patch] pipe: Don't oops when pipe filesystem isn't mounted

2006-12-11 Thread Andrew Morton
On Mon, 11 Dec 2006 08:01:40 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Mon, 11 Dec 2006, Al Viro wrote:
> 
> > On Mon, Dec 11, 2006 at 02:27:46AM -0800, Andrew Morton wrote:
> > > @@ -115,6 +115,11 @@ extern void setup_arch(char **);
> > >  #define device_initcall_sync(fn) __define_initcall("6s",fn,6s)
> > >  #define late_initcall(fn)__define_initcall("7",fn,7)
> > >  #define late_initcall_sync(fn)   __define_initcall("7s",fn,7s)
> > > +#define populate_rootfs_initcall(fn) __define_initcall("8",fn,8)
> > > +#define populate_rootfs_initcall_sync(fn) __define_initcall("8s",fn,8s)
> > > +#define rootfs_neeeded_initcall(fn)  __define_initcall("9",fn,9)
> > > +#define rootfs_neeeded_initcall_sync(fn) __define_initcall("9s",fn,9s)
> > 
> > Ewww  After module_init()?  Please, don't.  Come on, if it can
> > be a module, it _must_ be ready to run late in the game.
> 
> Yeah, I think you should just run "populate_rootfs()" just before 
> "module_init" (which is the same as "device_initcall()").
> 
> So perhaps something like this? (totally untested)
> 
> Btw, if the linker sorts sections some way (does it?) we could probably 
> just make the vmlinux.lds.S file do
> 
>   *(.initcall*.init)
> 
> or something, and then just let special cases like this use
> 
>   __initcall(myfn, 5.1);
> 
> to show that it's between levels 5 and 6. But that would depend on the 
> linker section being sorted alphabetically. Does anybody know if the 
> linker sorts these things somehow?
> 
> This patch is totally untested, but it looks obvious. It just says that 
> we'll populate rootfs _after_ we've done the fs-level initcalls, but 
> before we do any actual "device" initcalls.
> 
> If any really core stuff needs user-land - tough titties, as they say.
> 
>   Linus
> 
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 6e9fceb..7437cca 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -242,6 +242,7 @@
>   *(.initcall4s.init) \
>   *(.initcall5.init)  \
>   *(.initcall5s.init) \
> + *(.initcallrootfs.init) \
>   *(.initcall6.init)  \
>   *(.initcall6s.init) \
>   *(.initcall7.init)  \
> diff --git a/include/linux/init.h b/include/linux/init.h
> index 5eb5d24..5a593a1 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -111,6 +111,7 @@ extern void setup_arch(char **);
>  #define subsys_initcall_sync(fn) __define_initcall("4s",fn,4s)
>  #define fs_initcall(fn)  __define_initcall("5",fn,5)
>  #define fs_initcall_sync(fn) __define_initcall("5s",fn,5s)
> +#define rootfs_initcall(fn)  __define_initcall("rootfs",fn,rootfs)
>  #define device_initcall(fn)  __define_initcall("6",fn,6)
>  #define device_initcall_sync(fn) __define_initcall("6s",fn,6s)
>  #define late_initcall(fn)__define_initcall("7",fn,7)

Looks like this might break pcmcia which for some reason does firmware
requesting at fs_initcall level (drivers/pcmcia/ds.c).
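
To make the ordering question concrete, here is a small user-space model of the quoted patch. It is only a sketch: the level table stands in for the ordered section list in vmlinux.lds.h, and the callback names are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*initcall_t)(void);

/* Level names in run order; "rootfs" sits between "5s" and "6". */
static const char *const levels[] = {
	"4s", "5", "5s", "rootfs", "6", "6s", "7",
};

struct initcall {
	const char *level;
	initcall_t fn;
};

/* Run every registered call, level by level, in table order. */
void run_initcalls(const struct initcall *calls, size_t n)
{
	for (size_t l = 0; l < sizeof(levels) / sizeof(levels[0]); l++)
		for (size_t i = 0; i < n; i++)
			if (strcmp(calls[i].level, levels[l]) == 0)
				calls[i].fn();
}

/* Example callbacks that record the order in which they ran. */
static char order[8];
static size_t order_len;

static void record(char c)
{
	assert(order_len < sizeof(order) - 1);
	order[order_len++] = c;
}

static void fs_init(void)         { record('f'); }
static void populate_rootfs(void) { record('r'); }
static void device_init(void)     { record('d'); }

const char *initcall_order_example(void)
{
	/* Registration order is deliberately scrambled. */
	const struct initcall calls[] = {
		{ "6", device_init },
		{ "rootfs", populate_rootfs },
		{ "5", fs_init },
	};

	order_len = 0;
	memset(order, 0, sizeof(order));
	run_initcalls(calls, sizeof(calls) / sizeof(calls[0]));
	return order;
}
```

In this model anything registered at level 5 runs before rootfs is populated, which is the situation the pcmcia concern is about.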




Re: [PATCH 4/6] MTHCA driver (infiniband) use new pci interfaces

2006-12-11 Thread Benjamin Herrenschmidt

> Actually even PCIe might not be that easy.  For example with current
> kernels on PowerPC 440SPe (SoC with PCIe), I just get:
> 
> # lspci
> 00:01.0 InfiniBand: Mellanox Technology: Unknown device 6274 (rev a0)
> 
> ie no host bridge / root complex.

Did somebody use the spec as toilet paper again ? Or is it just the
kernel that isn't properly showing the root complex ? 
 
Ben.




Re: Why disable vdso by default with CONFIG_PARAVIRT?

2006-12-11 Thread Jeremy Fitzhardinge
Zachary Amsden wrote:
> It's not for us or Xen.  Perhaps it came from lhype?  

(I suspect it came from Andi's fevered brain.)  If lhype can't deal with
vdso, it can turn it off for itself - but I don't think it's a problem
for lhype.

J


Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread David Miller
From: Pete Zaitcev <[EMAIL PROTECTED]>
Date: Mon, 11 Dec 2006 17:29:07 -0800

> On Mon, 11 Dec 2006 15:52:47 -0800, Matt Helsley <[EMAIL PROTECTED]> wrote:
> 
> > I'm shocked memcpy() introduces 8-byte stores that violate architecture
> > alignment rules. Is there any chance this a bug in ia64's memcpy()
> > implementation? I've tried to read it but since I'm not familiar with
> > ia64 asm I can't make out significant parts of it in
> > arch/ia64/lib/memcpy.S.
> 
> The arch/ia64/lib/memcpy.S is probably fine, it must be gcc doing
> an inline substitution of a well-known function.
> 
> A commenter on my blog mentioned seeing the same thing in the past.
> (http://zaitcev.livejournal.com/107185.html?thread=128945#t128945)
> 
> It's possible that applying (void *) cast to the first argument of memcpy
> would disrupt this optimization. But since we have a well understood
> patch by Erik, which only adds a penalty of 32 bytes of stack waste
> and 32 bytes of memcpy, I thought it best not to bother with heaping
> workarounds.

Yes GCC can assume the object is aligned because of the type
of the argument to memcpy().

I tried myself some games with adding a "packed" attribute to the
pointer declaration (trying to tell it that "the thing pointed to"
might be unaligned), but to no avail.
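
For reference, the general shape of the "fill an aligned local, then copy the bytes" workaround mentioned in the quoted text looks roughly like this in user-space C. This is a sketch of the idea only, not Erik's patch; struct event, put_event() and get_event() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload type; only the 8-byte member matters here. */
struct event {
	uint64_t timestamp_ns;
	uint32_t what;
	uint32_t cpu;
};

/*
 * Fill a properly aligned local, then copy it out as bytes.  The
 * destination is an unsigned char pointer, so the compiler has no
 * grounds to assume more than byte alignment for it.
 */
void put_event(unsigned char *dst, uint64_t ts, uint32_t what, uint32_t cpu)
{
	struct event ev;

	assert(dst != NULL);
	ev.timestamp_ns = ts;
	ev.what = what;
	ev.cpu = cpu;
	memcpy(dst, &ev, sizeof(ev));
}

/* Read side of the same idea: copy into an aligned local before use. */
struct event get_event(const unsigned char *src)
{
	struct event ev;

	assert(src != NULL);
	memcpy(&ev, src, sizeof(ev));
	return ev;
}
```

The cost is one extra copy of the struct and the stack space for the local, in line with the penalty described in the quoted text.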


Re: Why disable vdso by default with CONFIG_PARAVIRT?

2006-12-11 Thread Zachary Amsden

Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>> Jeremy Fitzhardinge wrote:
>>> Hi Andi,
>>>
>>> What problem do they cause together?  There's certainly no problem with
>>> Xen+vdso (in fact, its actually very useful so that it picks up the
>>> right libc with Xen-friendly TLS).
>>
>> Methinks the compat VDSO support got broken in the config?  Paravirt +
>> COMPAT_VDSO are incompatible.
>
> Yes, that's true, but I'm looking at arch/i386/kernel/sysenter.c:
>
> #ifdef CONFIG_PARAVIRT
> unsigned int __read_mostly vdso_enabled = 0;
> #else
> unsigned int __read_mostly vdso_enabled = 1;
> #endif
>
> I can't think of any reason why that should be necessary.

It's not for us or Xen.  Perhaps it came from lhype?


Re: Why disable vdso by default with CONFIG_PARAVIRT?

2006-12-11 Thread Jeremy Fitzhardinge
Zachary Amsden wrote:
> Jeremy Fitzhardinge wrote:
>> Hi Andi,
>>
>> What problem do they cause together?  There's certainly no problem with
>> Xen+vdso (in fact, it's actually very useful so that it picks up the
>> right libc with Xen-friendly TLS).
>>   
>
> Methinks the compat VDSO support got broken in the config?  Paravirt +
> COMPAT_VDSO are incompatible. 

Yes, that's true, but I'm looking at arch/i386/kernel/sysenter.c:

#ifdef CONFIG_PARAVIRT
unsigned int __read_mostly vdso_enabled = 0;
#else
unsigned int __read_mostly vdso_enabled = 1;
#endif

I can't think of any reason why that should be necessary.

J


Re: Why disable vdso by default with CONFIG_PARAVIRT?

2006-12-11 Thread Zachary Amsden

Jeremy Fitzhardinge wrote:
> Hi Andi,
>
> What problem do they cause together?  There's certainly no problem with
> Xen+vdso (in fact, it's actually very useful so that it picks up the
> right libc with Xen-friendly TLS).

Methinks the compat VDSO support got broken in the config?  Paravirt +
COMPAT_VDSO are incompatible.



Re: [PATCH 4/6] MTHCA driver (infiniband) use new pci interfaces

2006-12-11 Thread Roland Dreier
 > I'm worried by this... At no point do you check the host bridge
 > capabilities, and thus will happily set the max read req size to some
 > value larger than the max the host bridge can cope...

Well, it's disabled by default... the option is there as a quick way
to fix "why is my bandwidth so low" when a broken BIOS sets these to
minimum values.  Maybe we should just strip out that code and point
people who want to tweak this at setpci instead.

 > So for PCI-X, if we want that, we need a pcibios hook for the platform
 > to validate the size requested. For PCI-E, we can use standard code to
 > look for the root complex (and bridges on the path to it) and get the
 > proper max value.

Actually even PCIe might not be that easy.  For example with current
kernels on PowerPC 440SPe (SoC with PCIe), I just get:

# lspci
00:01.0 InfiniBand: Mellanox Technology: Unknown device 6274 (rev a0)

ie no host bridge / root complex.

 - R.


Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread Pete Zaitcev
On Mon, 11 Dec 2006 15:52:47 -0800, Matt Helsley <[EMAIL PROTECTED]> wrote:

>   I'm shocked memcpy() introduces 8-byte stores that violate architecture
> alignment rules. Is there any chance this is a bug in ia64's memcpy()
> implementation? I've tried to read it but since I'm not familiar with
> ia64 asm I can't make out significant parts of it in
> arch/ia64/lib/memcpy.S.

The arch/ia64/lib/memcpy.S is probably fine, it must be gcc doing
an inline substitution of a well-known function.

A commenter on my blog mentioned seeing the same thing in the past.
(http://zaitcev.livejournal.com/107185.html?thread=128945#t128945)

It's possible that applying (void *) cast to the first argument of memcpy
would disrupt this optimization. But since we have a well understood
patch by Erik, which only adds a penalty of 32 bytes of stack waste
and 32 bytes of memcpy, I thought it best not to bother with heaping
workarounds.

-- Pete


Re: Kevent POSIX timers support.

2006-12-11 Thread David Miller
From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Tue, 28 Nov 2006 22:22:36 +0300

> And, btw, last time I checked, aligned_u64 was not exported to
> userspace.

It is in linux/types.h and not protected by __KERNEL__ ifdefs.
Perhaps you mean something else?


Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-12-11 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Fri, 1 Dec 2006 15:37:55 +1100

> So in general when allocating packets we have two scenarios:
> 
> 1) The dst is known and fixed, i.e., all datagram protocols.  This is
> the easy case where the headroom is known exactly beforehand.
> 
> 2) The dst is unknown or may vary, this includes TCP, SCTP and DCCP.
> This is where we currently use MAX_HEADER plus some protocol-specific
> headroom.
> 
> Right now the normal (non-IPsec) dst output path always checks for
> sufficient headroom and reallocates if necessary (ip_finish_output2).
> I propose that we make IPsec do the same thing.

Agreed.
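
The "check for sufficient headroom and reallocate if necessary" step can be sketched in userspace roughly as follows. This is a toy model only: `struct toy_buf` and `toy_ensure_headroom()` are hypothetical names and do not reflect the real sk_buff layout or the ip_finish_output2() code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: free space in front of the payload is "headroom". */
struct toy_buf {
    unsigned char *head;  /* start of the allocation */
    size_t headroom;      /* free bytes before the payload */
    size_t len;           /* payload length in bytes */
};

/*
 * Make sure at least `need` bytes of headroom exist, reallocating and
 * moving the payload only when the current headroom is too small.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int toy_ensure_headroom(struct toy_buf *b, size_t need)
{
    unsigned char *n;

    if (b->headroom >= need)
        return 0;              /* common case: nothing to do */

    n = malloc(need + b->len);
    if (!n)
        return -1;

    memcpy(n + need, b->head + b->headroom, b->len);
    free(b->head);
    b->head = n;
    b->headroom = need;
    return 0;
}
```

The point of the design is that the allocation site may guess the headroom, and only the rare packet that meets a larger-than-expected header pays for a copy.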

> For standard MTU-sized packets this discussion is moot since we have
> 2K of memory in each chunk.  However, for ACKs it could save a bit of
> memory.

For linear MTU-sized SKBs yes, but TCP data packets are going out 99%
of the time with paged data these days and thus suffer from the same
set of issues and potential savings.


Re: [PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Mon, 11 Dec 2006 23:58:06 +0100

> Some subsystems dont need more than 32bits timestamps.
> 
> See for example net/ipv4/inetpeer.c and include/net/tcp.h :
> #define tcp_time_stamp ((__u32)(jiffies))
> 
> 
> Because most timeouts should work with 'normal jiffies' that are 32bits on 
> 32bits platforms, it makes sense to be able to use only 32bits to store them 
> and not 64 bits, to save ram.
> 
> This patch introduces jiffies_32, and related comparison functions 
> time_after32(), time_before32(), time_after_eq32() and time_before_eq32().
> 
> I plan to use this infrastructure in network code for example (struct 
> dst_entry comes to mind).

The TCP case is because the protocol limits the size of
the timestamp we can store in the TCP Timestamp option.

Otherwise we would use the full 64-bit jiffies timestamp,
in order to have a larger window of values which would not
overflow.

Since there is no protocol limitation involved in cases
such as dst_entry, I think we should keep it at 64-bits
on 64-bit platforms to make the wrap-around window as
large as possible.

I really don't see any reason to make these changes.  Yes,
you'd save some space, but one of the chief advantages of
64-bit is that we get larger jiffies value windows.  If
that has zero value, as your intended changes imply, then
we shouldn't need the default 64-bit jiffies either, by
implication.
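
For reference, a wrap-safe 32-bit comparison of the kind being proposed might look like the sketch below. The thread does not show the patch's actual definitions, so the `toy_` helpers here are hypothetical userspace analogues. They give the right answer only while the two timestamps are less than 2^31 ticks apart (about 24.8 days if one assumes HZ=1000), which is the narrower window referred to above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * True when a is later than b, treating both as 32-bit tick counters
 * that may have wrapped. Valid only if they are < 2^31 ticks apart.
 * The unsigned subtraction wraps modulo 2^32; the cast to int32_t
 * (two's complement on all common compilers) exposes the sign.
 */
static inline int toy_time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* True when a is earlier than b, under the same assumption. */
static inline int toy_time_before32(uint32_t a, uint32_t b)
{
    return toy_time_after32(b, a);
}
```

A counter value of 5 taken just after the wrap compares as later than 0xFFFFFFF0 taken just before it, which a plain `a > b` would get wrong.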


Why disable vdso by default with CONFIG_PARAVIRT?

2006-12-11 Thread Jeremy Fitzhardinge
Hi Andi,

What problem do they cause together?  There's certainly no problem with
Xen+vdso (in fact, it's actually very useful so that it picks up the
right libc with Xen-friendly TLS).

J


Re: new procfs memory analysis feature

2006-12-11 Thread Joe Green

Albert Cahalan wrote:
> David Singleton writes:
>
>> Add variation of /proc/PID/smaps called /proc/PID/pagemaps.
>> Shows reference counts for individual pages instead of aggregate totals.
>> Allows more detailed memory usage information for memory analysis tools.
>> An example of the output shows the shared text VMA for ld.so and
>> the share depths of the pages in the VMA.
>>
>> a7f4b000-a7f65000 r-xp  00:0d 19185826   /lib/ld-2.5.90.so
>>  11 11 11 11 11 11 11 11 11 13 13 13 13 13 13 13 8 8 8 13 13 13 13 13 13 13
>
> Arrrgh! Not another ghastly maps file!
>
> Now we have /proc/*/smaps, which should make decent programmers cry.

Yes, that's what we based this implementation on.  :)

> Along the way, nobody bothered to add support for describing the
> page size (IMHO your format ***severely*** needs this)

Since the map size and an entry for each page are given, it's possible to
figure out the page size, assuming each map uses only a single page
size.  But adding the page size would be reasonable.

> There can be a million pages in a mapping for a 32-bit process.
> If my guess (since you too failed to document your format) is right,
> you propose to have one decimal value per page.

Yes, that's right.  We considered using repeat counts for sequences of
pages with the same reference count (quite common), but it hasn't been
necessary in our application (see below).

> In other words, the lines of this file can be megabytes long without
> even getting to the issue of 64-bit hardware. This is no text file!
>
> How about a proper system call?

Our use for this is to optimize memory usage on very small embedded
systems, so the number of pages hasn't been a problem.

For the same reason, not needing a special program on the target system
to read the data is an advantage, because each extra program needed adds
to the footprint problem.

The data is taken off the target and interpreted on another system,
which often is of a different architecture, so the portable text format
is useful also.

This isn't meant to say your arguments aren't important, I'm just
explaining why this implementation is useful for us.



--
Joe Green <[EMAIL PROTECTED]>
MontaVista Software, Inc.
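
Two of the points in this exchange can be illustrated with a short userspace sketch: inferring the page size from a mapping's address range and its number of entries, and the repeat-count (run-length) encoding that was considered but not implemented. The `toy_` names are hypothetical and are not part of the proposed /proc interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Infer the page size of one mapping from its address range and the
 * number of per-page entries, assuming a single page size per mapping.
 * Returns 0 if the range does not divide evenly.
 */
static unsigned long toy_page_size(unsigned long start, unsigned long end,
                                   size_t entries)
{
    unsigned long span = end - start;

    if (entries == 0 || span % entries != 0)
        return 0;
    return span / entries;
}

/* One run: `repeat` consecutive pages sharing the same reference count. */
struct toy_run {
    unsigned count;
    unsigned repeat;
};

/*
 * Run-length encode counts[0..n) into out[]. Stops early if max_runs
 * is reached. Returns the number of runs written.
 */
static size_t toy_rle(const unsigned *counts, size_t n,
                      struct toy_run *out, size_t max_runs)
{
    size_t runs = 0, i;

    for (i = 0; i < n; i++) {
        if (runs > 0 && out[runs - 1].count == counts[i]) {
            out[runs - 1].repeat++;
        } else {
            if (runs == max_runs)
                break;
            out[runs].count = counts[i];
            out[runs].repeat = 1;
            runs++;
        }
    }
    return runs;
}
```

Applied to the ld.so example quoted above, the range a7f4b000-a7f65000 with 26 entries gives 4096-byte pages, and the 26 values collapse to four runs (9 x 11, 7 x 13, 3 x 8, 7 x 13).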



doubts about disk scheduling

2006-12-11 Thread xu feng
Hi, 
Please cc your reply to my email. many thanks

I would appreciate any help on the following
questions.

I have looked at disk scheduling algorithms, and the main thing that
struck me is that most of the algorithms that I have read about in the
textbooks (some are explained in the previous ) don't take the
"priority of the process" into consideration. Whether it is the
shortest seek time first, SCAN, or C-SCAN algorithm, all are explained
through a string of block numbers, but no mention is made of the owner
of these blocks. Does that mean all the processes are treated equally?
In my opinion a sort of multilevel queue, as in CPU scheduling
algorithms, could be used to schedule the requests according to the
importance of the processes. Any comments?


My second question is about the implementation, i.e. how the different
requests are actually arranged in the disk queue.

If a process submits a disk I/O request, its PCB should be linked to
the disk queue. My question is: in making the system call, and after
checking the permissions and identifying the address of the requested
data (block) on the disk and the target address in memory, does the
kernel store this information in the PCB and then link this PCB into
the disk queue?

If so, once the disk controller is free, the device driver checks the
PCBs in the queue and reads the requested blocks; depending on the
current location of the disk head and the block requests of the queued
PCBs, the driver orders the controller to process a certain block
request of a certain process. The driver then removes this request from
the PCB.

Is that how it is implemented?

Many thanks


 



Re: oops on 2.6.19-rc6-mm2: deref of 0x28 at permission+0x7

2006-12-11 Thread Neil Brown
On Monday December 11, [EMAIL PROTECTED] wrote:
> On Mon, 11 Dec 2006, Neil Brown wrote:
> 
> > > this nash thing is exactly the command which triggers a bit different 
> > > oops in my case. On my side, the oops is fully reproducible. If you 
> > > manage to make your case also reproducible, could you please try to 
> > > revert md-change-lifetime-rules-for-md-devices.patch? This made the 
> > > oops vanish in my case. I think Neil is working on it.
> > Trying to work on it - not making a lot of progress.  I find it hard to 
> > see how anything in md can cause the inode for a block-device file to 
> > disappear... It is a bit of a long-shot, but this patch might change 
> > things.  It changes the order in which things are de-allocated. Jiri and 
> > Jiri: would either or both of you see if you can reproduce the bug with 
> > this patch on 2.6.19-rc6-mm2 ???
> 
> Hi Neil,
> 
> sorry to say that, but it's still there after applying your patch.

Not a big surprise, but thanks a lot for testing.  I think I'm going
to have to try harder to duplicate it myself.

If I remember rightly you are using FC - which version exactly?  (I've
never installed FC before so this is going to be a learning experience).

And you have no MD arrays at all - is that correct?

And you compile your own kernel.  Is it monolithic, or are you using
modules?  Do you boot with an initrd or just the kernel?

I'd like to duplicate your installation as closely as possible, so any
relevant details or recipes would be greatly appreciated.

Thanks,

NeilBrown


Re: [patch] pipe: Don't oops when pipe filesystem isn't mounted

2006-12-11 Thread Benjamin Herrenschmidt

> So it makes perfect sense to say
> 
>"you won't be getting any notification by anything built-in, until 
> 'device_initcall' (which is the default module_init, of course)".
> 
> which in the case of certain drivers obviously _does_ mean that they had 
> better not try to use any early initcalls to load firmware.

And that will fix some other issues I think I've seen (a while ago, I
might have a memory mixup here) related to /sbin/hotplug being called
before /dev/null & /dev/zero are initialized (they are fs_initcall). At
least with that patch, it won't happen.

Ben.




Re: [2.6.19] NFS: server error: fileid changed

2006-12-11 Thread Trond Myklebust
On Mon, 2006-12-11 at 15:44 -0800, Martin Knoblauch wrote:
>  So far, we are only seeing it on amd-mounted filesystems, not on
> static NFS mounts. Unfortunately, it is difficult to avoid "amd" in
> our environment.

Any chance you could try substituting a recent version of autofs? This
sort of problem is more likely to happen on partitions that are
unmounted and then remounted often. I'd just like to figure out if this
is something that we need to fix in the kernel, or if it is purely an
amd problem.

Cheers
  Trond



[RFC] Patch: dynticks: idle load balancing

2006-12-11 Thread Siddha, Suresh B
Appended patch attempts to fix the process idle load balancing in the
presence of dynticks. cpus for which ticks are stopped will sleep
till the next event wakes them up. Potentially these sleeps can be for large
durations, during which, today, no idle load balancing is done.
There was some discussion (last year) on this topic on lkml, where two
main approaches were debated. One is to back off the idle load
balancing for bigger intervals and the second is a watchdog mechanism where
the busy cpu will trigger the load balance on an idle cpu.  Both of these
mechanisms have their drawbacks.

For the first mechanism, defining the interval will be tricky: if it is too
large, then the response time will also be high and we won't be able to respond
to sudden changes in the load. If it is small, then we won't be able to save
power.

The second mechanism means making changes to the busy load balancing (which will
be doing more load balancing work while the current busy task on that cpu
is eagerly waiting for the cpu cycles). Also busy load balancing intervals are
quite different from idle load balancing intervals. As with the first
mechanism, we won't be able to respond quickly to changes in load. And
figuring out that a cpu is heavily loaded and where that extra load needs to
be moved is a somewhat difficult job, especially in the case of hierarchical
scheduler domains.

Appended patch takes a third route which nominates an owner among
the idle cpus, which does the idle load balancing on behalf of the other
idle cpus. And once all the cpus are completely idle, then we can stop
this idle load balancing too. Checks added in fast path are minimized.
Whenever there are busy cpus in the system, there will be an owner(idle cpu)
doing the system wide idle load balancing. If we nominate this owner
carefully(like an idle core in a busy package), we can minimize the power
wasted also.
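
The owner nomination described above can be modelled in a few lines of userspace C. This is a toy sketch only: the body of select_notick_load_balancer() is not shown in this posting, so the state, the `toy_` function names and the "lowest idle cpu takes over" rule are all assumptions made for illustration. A real implementation would also have to restart the tick on whichever cpu takes over.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NCPUS 4
#define TOY_ALL   ((1u << TOY_NCPUS) - 1)

static uint32_t toy_idle_mask;  /* bit n set: cpu n is idle */
static int toy_owner = -1;      /* cpu doing idle balancing, or -1 */

/* Lowest-numbered idle cpu, or -1 if none is idle. */
static int toy_lowest_idle(void)
{
    int cpu;

    for (cpu = 0; cpu < TOY_NCPUS; cpu++)
        if (toy_idle_mask & (1u << cpu))
            return cpu;
    return -1;
}

/*
 * Called when `cpu` goes idle. Returns 1 if this cpu is the owner and
 * must keep its tick running, 0 if it may stop its tick.
 */
static int toy_cpu_idle(int cpu)
{
    toy_idle_mask |= 1u << cpu;
    if (toy_idle_mask == TOY_ALL) {
        toy_owner = -1;         /* nothing busy: balancing can stop */
        return 0;
    }
    if (toy_owner < 0)
        toy_owner = cpu;        /* first idle cpu becomes the owner */
    return toy_owner == cpu;
}

/* Called when `cpu` becomes busy again; hands ownership on if needed. */
static void toy_cpu_busy(int cpu)
{
    toy_idle_mask &= ~(1u << cpu);
    if (toy_owner < 0 || toy_owner == cpu)
        toy_owner = toy_lowest_idle();
}
```

In this model the only fast-path cost is a mask update and one comparison per idle/busy transition, in the spirit of the "checks added in fast path are minimized" remark.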

Some of the questions I have are: Will this single owner become bottleneck?
Idle load balancing is now serialized among all the idle cpus. This perhaps
will add some delays in load movement to different idle cpus. IMO, these
delays will be small and tolerable. If this comes out to be a concern, we
can offload the actual load movement work to the idle cpu, where the load
will be finally run.

Any more optimizations we can do to start/stop_sched_tick() routines to track
this info more efficiently?

Comments and review feedback welcome. Minimal testing done on couple of
i386 platforms. Perf testing yet to be done.

thanks,
suresh
---
Track the cpus for which ticks are stopped and one among these cpus will
be doing the idle load balancing on behalf of all the remaining cpus.
If the ticks are stopped for all the cpus in the system, idle load balancing
will stop at that moment. And restarts as soon as there is a busy cpu in
the system.

TBD: Select the appropriate idle cpu for doing this idle load balancing.
Such as an idle core in a busy package(which has a busy core). Selecting an idle
thread as the owner when there are other busy thread siblings is
not a good idea.

We can also think of offloading the task movements from the idle load balancing
owner to the idle cpu on behalf of which this work is being done.

Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
---

diff -pNru linux-2.6.19-mm1/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.6.19-mm1/include/linux/sched.h  2006-12-12 06:39:22.0 -0800
+++ linux/include/linux/sched.h 2006-12-12 06:51:03.0 -0800
@@ -195,6 +195,14 @@ extern void sched_init_smp(void);
 extern void init_idle(struct task_struct *idle, int cpu);
 
 extern cpumask_t nohz_cpu_mask;
+#ifdef CONFIG_SMP
+extern int select_notick_load_balancer(int cpu);
+#else
+static inline int select_notick_load_balancer(int cpu)
+{
+   return 0;
+}
+#endif
 
 /*
  * Only dump TASK_* tasks. (-1 for all tasks)
diff -pNru linux-2.6.19-mm1/kernel/hrtimer.c linux/kernel/hrtimer.c
--- linux-2.6.19-mm1/kernel/hrtimer.c   2006-12-12 06:39:22.0 -0800
+++ linux/kernel/hrtimer.c  2006-12-12 06:51:03.0 -0800
@@ -600,6 +600,9 @@ void hrtimer_stop_sched_tick(void)
 * the scheduler tick in hrtimer_restart_sched_tick.
 */
if (!cpu_base->tick_stopped) {
+   if (select_notick_load_balancer(1))
+   goto end;
+
cpu_base->idle_tick = cpu_base->sched_timer.expires;
cpu_base->tick_stopped = 1;
cpu_base->idle_jiffies = last_jiffies;
@@ -616,6 +619,7 @@ void hrtimer_stop_sched_tick(void)
raise_softirq_irqoff(TIMER_SOFTIRQ);
}
 
+end:
local_irq_restore(flags);
 }
 
@@ -630,6 +634,8 @@ void hrtimer_restart_sched_tick(void)
unsigned long ticks;
ktime_t now, delta;
 
+   select_notick_load_balancer(0);
+
if (!cpu_base->hres_active || 

Re: 2.6.19-git13: uts banner changes break SLES9 (at least)

2006-12-11 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Tue, 12 Dec 2006 09:04:41 +1100

> If there is a reliable way to get the version string, great, I'll use
> that.

FWIW, on sparc and sparc64 we have this information block for
the boot loader.

The first two instructions at the entry point simply branch
over the boot loader information block header.

The information block starts with a known magic string "HdrS" which
does not match any valid Sparc instruction.  Any tool can search for
it starting at the symbol "_start" in the kernel image.

Inside this information block we stick a 32-bit word which contains
LINUX_VERSION_CODE.

That only gives you the version, not the whole version string, but you
could put the whole string in such a location when adding such a
facility to powerpc if you wanted to.
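
A tool-side sketch of the lookup described above, under stated assumptions: the version word is taken to follow the "HdrS" magic immediately and to be stored big-endian (sparc is big-endian), and the search runs over a plain byte buffer rather than starting from the `_start` symbol. The `toy_` function names are hypothetical. The split into major/minor/patch follows the usual `(a << 16) + (b << 8) + c` encoding of LINUX_VERSION_CODE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scan an image for the "HdrS" magic and read the 32-bit word after it.
 * Assumption: the word directly follows the magic, big-endian.
 * Returns 0 and stores the word in *code on success, -1 if not found.
 */
static int toy_find_version_code(const unsigned char *img, size_t len,
                                 uint32_t *code)
{
    size_t i;

    for (i = 0; i + 8 <= len; i++) {
        if (memcmp(img + i, "HdrS", 4) == 0) {
            *code = ((uint32_t)img[i + 4] << 24) |
                    ((uint32_t)img[i + 5] << 16) |
                    ((uint32_t)img[i + 6] << 8) |
                    (uint32_t)img[i + 7];
            return 0;
        }
    }
    return -1;
}

/* Split a LINUX_VERSION_CODE-style value into its three components. */
static void toy_split_version(uint32_t code, unsigned *major,
                              unsigned *minor, unsigned *patch)
{
    *major = (code >> 16) & 0xffff;
    *minor = (code >> 8) & 0xff;
    *patch = code & 0xff;
}
```

For a 2.6.19 kernel the word would be 0x00020613, which splits into 2, 6 and 19.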



[PATCH] Whinge in paging_init if noexec is on with a non-PAE kernel

2006-12-11 Thread Kyle McMartin
Signed-off-by: Kyle McMartin <[EMAIL PROTECTED]>

diff --git a/arch/i386/mm/init.c b/arch/i386/mm/init.c
index 84697df..fb61709 100644
--- a/arch/i386/mm/init.c
+++ b/arch/i386/mm/init.c
@@ -512,6 +512,9 @@ void __init paging_init(void)
set_nx();
if (nx_enabled)
printk("NX (Execute Disable) protection: active\n");
+#else
+   if (!disable_nx)
+   printk("NX (Execute Disable) only supported with CONFIG_HIGHMEM64G\n");
 #endif
 
pagetable_init();


Re: 2.6.19-mm1 (md/raid1 randomly drops partitions)

2006-12-11 Thread Neil Brown
On Tuesday December 12, [EMAIL PROTECTED] wrote:
> On Monday, 11 December 2006 23:52, Neil Brown wrote:
> > On Monday December 11, [EMAIL PROTECTED] wrote:
> > > Hi,
> > > 
> > > On Monday, 11 December 2006 09:58, Andrew Morton wrote:
> > > > 
> > > > Temporarily at
> > > > 
> > > > http://userweb.kernel.org/~akpm/2.6.19-mm1/
> > > > 
> > > > Will appear later at
> > > > 
> > > > 
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/
> > > 
> > > It caused all of the md RAID1s on my test box to drop one of their 
> > > partitions,
> > > apparently at random.
> > 
> > That's clever
> > 
> > Do you have any kernel logs of this happening?  My guess would be the
> > underlying device driver is returning more errors than before, but we
> > need the logs to be sure.
> 
> I've only found lots of messages like this:
> 
> md: super_written gets error=-5, uptodate=0

So when md tries to write out the superblock, it gets EIO... Odd that
you aren't getting errors for normal writes.

What devices are the md/raid1 built on?

> 
> I'll try to reproduce it tomorrow and collect some more information.

Thanks.  More information is definitely better than less, so send over
anything you can find.

NeilBrown


Re: [RFC 2.6.19 1/1] fbdev,mm: hecuba/E-Ink fbdev driver v2

2006-12-11 Thread Jaya Kumar

On 12/11/06, Franck Bui-Huu <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > + atomic_t ref_count;
> > + atomic_t vma_count;
>
> what purpose do these counters serve?

You are right. I can remove them.

> > +
> > +void hcb_wait_for_ack(struct hecubafb_par *par)
> > +{
> > +
> > + int timeout;
> > + unsigned char ctl;
> > +
> > + timeout=500;
> > + do {
> > + ctl = hcb_get_ctl(par);
> > + if ((ctl & HCB_ACK_BIT))
> > + return;
> > + udelay(1);
> > + } while (timeout--);
> > + printk(KERN_ERR "timed out waiting for ack\n");
> > +}
>
> When a timeout occurs this function does not return any error value.
> Don't the callers need to be warned in this case?

You are right. I need to figure out what exactly to do. Currently, if
a timeout is observed it normally means the display controller is
hung. However, in some cases the controller does seem to recover
after some period of time. I guess I should probably return an error
and terminate pending activity.
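
One possible shape for "return an error" is sketched below as a userspace model, not as the driver's actual fix. The register read is abstracted behind a callback so the loop can be exercised without hardware; `TOY_ACK_BIT`, `toy_wait_for_ack()` and `toy_fake_ctl()` are hypothetical names standing in for HCB_ACK_BIT, hcb_wait_for_ack() and hcb_get_ctl().

```c
#include <assert.h>
#include <errno.h>

#define TOY_ACK_BIT 0x01    /* hypothetical; stands in for HCB_ACK_BIT */

/*
 * Poll read_ctl() until the ack bit is set or `tries` reads have been
 * made. Returns 0 on ack, -ETIMEDOUT otherwise, so the caller can
 * decide whether to abort pending work.
 */
static int toy_wait_for_ack(unsigned char (*read_ctl)(void *), void *ctx,
                            int tries)
{
    while (tries-- > 0) {
        if (read_ctl(ctx) & TOY_ACK_BIT)
            return 0;
        /* a real driver would delay here, e.g. udelay(1) */
    }
    return -ETIMEDOUT;
}

/* Fake control register: reports ack after *ctx unacknowledged reads. */
static unsigned char toy_fake_ctl(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return TOY_ACK_BIT;
}
```

A caller would then check the result, for example `if (toy_wait_for_ack(...) < 0)`, and stop issuing further commands to the controller instead of continuing as if the ack had arrived.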


> > +
> > +/* this is to find and return the vmalloc-ed fb pages */
> > +static struct page* hecubafb_vm_nopage(struct vm_area_struct *vma,
> > + unsigned long vaddr, int *type)
> > +{
> > + unsigned long offset;
> > + struct page *page;
> > + struct fb_info *info = vma->vm_private_data;
> > +
> > + offset = (vaddr - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
> > + if (offset >= (DPY_W*DPY_H)/8)
> > + return NOPAGE_SIGBUS;
> > +
> > + page = vmalloc_to_page(info->screen_base + offset);
> > + if (!page)
> > + return NOPAGE_OOM;
> > +
> > + get_page(page);
> > + if (type)
> > + *type = VM_FAULT_MINOR;
> > + return page;
> > +}
> > +
>
> so the page can be accessed by using the vma->start virtual address

The userspace app would be doing:

ioctl(fd, FBIOGET_FSCREENINFO, &finfo);
ioctl(fd, FBIOGET_VSCREENINFO, &vinfo);
screensize = ( vinfo.xres * vinfo.yres * vinfo.bits_per_pixel) / 8;
maddr = mmap(finfo.mmio_start, screensize, PROT_WRITE, MAP_SHARED, fd, 0);

> > +static int hecubafb_page_mkwrite(struct vm_area_struct *vma,
>
> [snip]
>
> > +
> > + if (!(videomemory = vmalloc(videomemorysize)))
> > + return retval;
>
> and here the kernel accesses the same page by using the address returned
> by vmalloc, which is different from the previous one. So 2 different
> addresses map the same physical page. In this case are there any cache
> aliasing issues, especially for the x86 arch?

I think that PTEs set up by vmalloc are marked cacheable and via the
above nopage end up as cacheable. I'm not doing DMA. So the accesses
are through the cache so I don't think cache aliasing is an issue for
this case. Please let me know if I misunderstood.

Thanks,
jayakumar


Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread Matt Helsley
On Thu, 2006-12-07 at 17:22 -0600, Erik Jacobson wrote:
> On ia64, the various functions that make up cn_proc.c cause kernel
> unaligned access errors.
> 
> If you are using these, for example, to get notification about
> all tasks forking and exiting, you get multiple unaligned access errors
> per process.
> 
> Here, we just adjust how the variables are declared and use memcpy to
> avoid the error messages.
> 
> Signed-off-by: Erik Jacobson <[EMAIL PROTECTED]>

Acked-by: Matt Helsley <[EMAIL PROTECTED]>

> ---
> 
>  cn_proc.c |   94 +++---
>  1 file changed, 47 insertions(+), 47 deletions(-)
> --- linux.orig/drivers/connector/cn_proc.c 2006-11-29 15:57:37.0 -0600
> +++ linux/drivers/connector/cn_proc.c 2006-12-07 16:50:03.195035791 -0600
> @@ -49,7 +49,7 @@
>  void proc_fork_connector(struct task_struct *task)
>  {
>   struct cn_msg *msg;
> - struct proc_event *ev;
> + struct proc_event ev;
>   __u8 buffer[CN_PROC_MSG_SIZE];
>   struct timespec ts;
> 
> @@ -57,19 +57,19 @@
>   return;
> 
>   msg = (struct cn_msg*)buffer;
> - ev = (struct proc_event*)msg->data;
> - get_seq(&msg->seq, &ev->cpu);
> + get_seq(&msg->seq, &ev.cpu);
>   ktime_get_ts(&ts); /* get high res monotonic timestamp */
> - ev->timestamp_ns = timespec_to_ns(&ts);
> - ev->what = PROC_EVENT_FORK;
> - ev->event_data.fork.parent_pid = task->real_parent->pid;
> - ev->event_data.fork.parent_tgid = task->real_parent->tgid;
> - ev->event_data.fork.child_pid = task->pid;
> - ev->event_data.fork.child_tgid = task->tgid;
> + ev.timestamp_ns = timespec_to_ns(&ts);
> + ev.what = PROC_EVENT_FORK;
> + ev.event_data.fork.parent_pid = task->real_parent->pid;
> + ev.event_data.fork.parent_tgid = task->real_parent->tgid;
> + ev.event_data.fork.child_pid = task->pid;
> + ev.event_data.fork.child_tgid = task->tgid;
> 
>   memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
>   msg->ack = 0; /* not used */
> - msg->len = sizeof(*ev);
> + msg->len = sizeof(ev);
> + memcpy(msg->data, &ev, sizeof(ev));
>   /*  If cn_netlink_send() failed, the data is not sent */
>   cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
>  }
> @@ -77,7 +77,7 @@
>  void proc_exec_connector(struct task_struct *task)
>  {
>   struct cn_msg *msg;
> - struct proc_event *ev;
> + struct proc_event ev;
>   struct timespec ts;
>   __u8 buffer[CN_PROC_MSG_SIZE];
> 
> @@ -85,24 +85,24 @@
>   return;
> 
>   msg = (struct cn_msg*)buffer;
> - ev = (struct proc_event*)msg->data;
> - get_seq(&msg->seq, &ev->cpu);
> + get_seq(&msg->seq, &ev.cpu);
>   ktime_get_ts(&ts); /* get high res monotonic timestamp */
> - ev->timestamp_ns = timespec_to_ns(&ts);
> - ev->what = PROC_EVENT_EXEC;
> - ev->event_data.exec.process_pid = task->pid;
> - ev->event_data.exec.process_tgid = task->tgid;
> + ev.timestamp_ns = timespec_to_ns(&ts);
> + ev.what = PROC_EVENT_EXEC;
> + ev.event_data.exec.process_pid = task->pid;
> + ev.event_data.exec.process_tgid = task->tgid;
> 
>   memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
>   msg->ack = 0; /* not used */
> - msg->len = sizeof(*ev);
> + msg->len = sizeof(ev);
> + memcpy(msg->data, &ev, sizeof(ev));
>   cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
>  }
> 
>  void proc_id_connector(struct task_struct *task, int which_id)
>  {
>   struct cn_msg *msg;
> - struct proc_event *ev;
> + struct proc_event ev;
>   __u8 buffer[CN_PROC_MSG_SIZE];
>   struct timespec ts;
> 
> @@ -110,32 +110,32 @@
>   return;
> 
>   msg = (struct cn_msg*)buffer;
> - ev = (struct proc_event*)msg->data;
> - ev->what = which_id;
> - ev->event_data.id.process_pid = task->pid;
> - ev->event_data.id.process_tgid = task->tgid;
> + ev.what = which_id;
> + ev.event_data.id.process_pid = task->pid;
> + ev.event_data.id.process_tgid = task->tgid;
>   if (which_id == PROC_EVENT_UID) {
> - ev->event_data.id.r.ruid = task->uid;
> - ev->event_data.id.e.euid = task->euid;
> + ev.event_data.id.r.ruid = task->uid;
> + ev.event_data.id.e.euid = task->euid;
>   } else if (which_id == PROC_EVENT_GID) {
> - ev->event_data.id.r.rgid = task->gid;
> - ev->event_data.id.e.egid = task->egid;
> + ev.event_data.id.r.rgid = task->gid;
> + ev.event_data.id.e.egid = task->egid;
>   } else
>   return;
> - get_seq(>seq, >cpu);
> + get_seq(>seq, );
>   ktime_get_ts(); /* get high res monotonic timestamp */
> - ev->timestamp_ns = timespec_to_ns();
> + ev.timestamp_ns = timespec_to_ns();
> 
>   memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
>   msg->ack = 0; /* not used */
> - msg->len = sizeof(*ev);
> + msg->len = sizeof(ev);
> + memcpy(msg->data, &ev, sizeof(ev));
>   

Re: [PATCH] connector: Some fixes for ia64 unaligned access errors

2006-12-11 Thread Matt Helsley
On Sat, 2006-12-09 at 18:34 -0800, Pete Zaitcev wrote:
> On Sat, 9 Dec 2006 15:09:13 -0600, Erik Jacobson <[EMAIL PROTECTED]> wrote:
> 
> > > Please try to declare u64 timestamp_ns, then copy it into the *ev
> > > instead of copying whole *ev. This ought to fix the problem if
> > > buffer[] ends aligned to 32 bits or better.
> >
> > So I took this suggestion for a spin and met with the same result.
> > The unaligned access messages are still produced.
> 
> I see. And I see you went a few steps forward with diagnosing it:
> 
> > dbg fork after timespec_to_ns call, b4 memcpy
> > kernel unaligned access to 0xe03076b6fbe4, ip=0xa001004f1480
> > dbg fork after memcpy, b4 other ev settings...
> 
> > a001004f1470  [MMI]   ld8 r40=[r14]
> > a001004f1476  ld8 r38=[r38]
> > a001004f147c  nop.i 0x0;;
> > a001004f1480  [MIB]   st8 [r39]=r40
> > a001004f1486  nop.i 0x0
> > a001004f148c  br.call.sptk.many b0=a001000a36c0 ;;

I'm not very familiar with ia64 asm but it looks like it's loading and
storing 8 bytes at a time for the memcpy().

> It seems rather strange that memcpy gets optimized this way. I could
> not have foreseen it. Still, it was worth a try, even if putting 32
> extra bytes on stack and running memcpy on them does not seem too
> onerous, for a fork(). Thanks for doing it, and let's go with your
> original patch then... if Matt Helsley does not mind.

OK, I'll ack the original.

> Thank you,
> -- Pete

I'm shocked memcpy() introduces 8-byte stores that violate architecture
alignment rules. Is there any chance this is a bug in ia64's memcpy()
implementation? I've tried to read it, but since I'm not familiar with
ia64 asm I can't make out significant parts of
arch/ia64/lib/memcpy.S.
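
For reference, a minimal userspace sketch of the alignment situation
discussed in this thread. It is not the kernel code: demo_cn_msg and
demo_event are hypothetical, simplified stand-ins, assuming a 20-byte
header in front of the payload. With such a header the payload starts at
offset 20, which is 4-byte aligned but not 8-byte aligned, so a 64-bit
store through a struct pointer into the buffer can trap on ia64. One
possible explanation for the inlined ld8/st8 (an assumption, not
confirmed in the thread) is that the compiler expands a fixed-size
memcpy() inline and derives the access width from the pointer types.
Building the event in an aligned local and copying it through a byte
pointer avoids giving the compiler that assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the message header: 20 bytes, then the payload. */
struct demo_cn_msg {
    uint32_t idx;
    uint32_t val;
    uint32_t seq;
    uint32_t ack;
    uint16_t len;
    uint16_t flags;
    uint8_t  data[];            /* payload starts at offset 20 */
};

/* Simplified stand-in for the event: it contains one 64-bit field. */
struct demo_event {
    uint32_t what;
    uint32_t cpu;
    uint64_t timestamp_ns;
};

/* Distance of the payload from the previous 8-byte boundary,
 * assuming the buffer itself starts on an 8-byte boundary. */
static size_t payload_misalignment(void)
{
    return offsetof(struct demo_cn_msg, data) % sizeof(uint64_t);
}

/* Build header and event in aligned locals, then copy them as bytes. */
static void fill_message(uint8_t *buffer, uint64_t now_ns)
{
    struct demo_cn_msg hdr;
    struct demo_event ev;

    memset(&hdr, 0, sizeof(hdr));
    memset(&ev, 0, sizeof(ev));
    ev.what = 1;
    ev.timestamp_ns = now_ns;
    hdr.len = (uint16_t)sizeof(ev);

    memcpy(buffer, &hdr, offsetof(struct demo_cn_msg, data));
    memcpy(buffer + offsetof(struct demo_cn_msg, data), &ev, sizeof(ev));
}

/* Read the timestamp back without ever dereferencing a misaligned pointer. */
static uint64_t read_timestamp(const uint8_t *buffer)
{
    struct demo_event ev;

    memcpy(&ev, buffer + offsetof(struct demo_cn_msg, data), sizeof(ev));
    return ev.timestamp_ns;
}
```

The point of the sketch is only the layout arithmetic and the
byte-pointer copy; whether a given compiler still emits wide stores
depends on what it can prove about the destination.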



Cheers,
-Matt Helsley



Re: [2.6.19] NFS: server error: fileid changed

2006-12-11 Thread Martin Knoblauch

--- Trond Myklebust <[EMAIL PROTECTED]> wrote:

> On Mon, 2006-12-11 at 08:09 -0800, Martin Knoblauch wrote:
> > Hi, [please CC me, as I am not subscribed]
> > 
> >  after updating a RHEL4 box (EM64T based) to a plain 2.6.19 kernel,
> > we are seeing repeated occurrences of the following messages (about
> > every 45-50 minutes).
> > 
> >  It is always the same server (a NetApp filer, mounted via the
> > user-space automounter "amd") and the expected/got numbers seem to
> > repeat.
> 
> Are you seeing it _without_ amd? The usual reason for the errors you
> see are bogus replay cache replies. For that reason, the kernel is
> usually very careful when initialising its value for the
> XID: we set part of it using the clock value, and part of it
> using a random number generator.
> I'm not so sure that other services are as careful.
>

 So far, we are only seeing it on amd-mounted filesystems, not on
static NFS mounts. Unfortunately, it is difficult to avoid "amd" in
our environment.
 
> >  Is there a way to find out which files are involved? Nothing seems
> > to be obviously breaking, but I do not like to get my logfiles filled
> > up.
> 
> The fileid is the same as the inode number. Just convert those
> hexadecimal values into ordinary numbers, then search for them using
> 'ls -i'.
> 

 thanks. will check that out.
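
The conversion step suggested above can be sketched in a few lines of C.
This is only an illustration; fileid_to_inode is a hypothetical helper
name, and it assumes the fileid from the log message is available as a
hexadecimal string, with or without a leading "0x".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a hexadecimal fileid string into a decimal-printable inode number.
 * Returns 0 on success, -1 if the string is not a clean hex number. */
static int fileid_to_inode(const char *hex, unsigned long long *inode)
{
    char *end;
    unsigned long long value;

    errno = 0;
    value = strtoull(hex, &end, 16);   /* base 16 also accepts "0x" */
    if (errno != 0 || end == hex || *end != '\0')
        return -1;

    *inode = value;
    return 0;
}
```

The resulting number can then be matched against the first column of
'ls -i' output, or handed to 'find -inum' on the affected mount point.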

Cheers
Martin

--
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de


Re: 2.6.19-mm1 (md/raid1 randomly drops partitions)

2006-12-11 Thread Rafael J. Wysocki
On Monday, 11 December 2006 23:52, Neil Brown wrote:
> On Monday December 11, [EMAIL PROTECTED] wrote:
> > Hi,
> > 
> > On Monday, 11 December 2006 09:58, Andrew Morton wrote:
> > > 
> > > Temporarily at
> > > 
> > >   http://userweb.kernel.org/~akpm/2.6.19-mm1/
> > > 
> > > Will appear later at
> > > 
> > >   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/
> > 
> > It caused all of the md RAID1s on my test box to drop one of their
> > partitions, apparently at random.
> 
> That's clever
> 
> Do you have any kernel logs of this happening?  My guess would be the
> underlying device driver is returning more errors than before, but we
> need the logs to be sure.

I've only found lots of messages like this:

md: super_written gets error=-5, uptodate=0

I'll try to reproduce it tomorrow and collect some more information.

Greetings,
Rafael


-- 
If you don't have the time to read,
you don't have the time or the tools to write.
- Stephen King


Re: rdtscp vgettimeofday

2006-12-11 Thread Andrea Arcangeli
On Mon, Dec 11, 2006 at 03:15:44PM -0800, dean gaudet wrote:
> rdtscp gets you 2 of the 5 values you need to compute the time.  anything 
> can happen between when you do the rdtscp and do the other 3 reads:  the 
> computation is (((tsc-A)*B)>>N)+C where N is a constant, and A, B, C are 
> per-cpu data.
> A/B/C change a few times a second (to avoid 32-bit rollover in (tsc-A)), 
> every time there's a halt, and every P-state transition.

This is wrong. There's the D variable too, the seq lock.

The thing I have in mind is something like:

	rdtscp (get tsc and cpu atomically); this is fundamental: without
		tsc and cpu read atomically, nothing below is possible
	read D from the cpu we got from rdtscp (seqlock)
	smp_rmb()
	check that D isn't in the middle of an update (LSB clear
		or similar) or restart
	rdtscp again (tsc and cpu atomic)
	check that cpu is still the same or restart
	index the per-cpu array and get the safe A B C
	smp_rmb()
	read per-cpu D again and check that it didn't change or restart

Then you have tsc, A, B and C all atomic. N is a constant. rdtscp, again,
is fundamental to getting this info atomically without accessing the
southbridge and without expensive asm instructions.
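
The check-and-restart part of the steps above can be sketched in
userspace C11 as follows. This is an illustration only, under several
assumptions: demo_percpu_time, demo_try_read and DEMO_SHIFT are made-up
names, the sequence counter plays the role of D, the tsc value is passed
in rather than read with rdtscp, and the "is the cpu still the same"
re-check is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_SHIFT 10           /* the constant N; value is illustrative */

/* Hypothetical per-cpu record; seq plays the role of D. */
struct demo_percpu_time {
    atomic_uint seq;            /* odd while a writer is updating */
    _Atomic uint64_t a;         /* tsc value at the last update */
    _Atomic uint64_t b;         /* scale factor */
    _Atomic uint64_t c;         /* time at the last update */
};

/* (((tsc - A) * B) >> N) + C */
static uint64_t demo_scale(uint64_t tsc, uint64_t a, uint64_t b, uint64_t c)
{
    return (((tsc - a) * b) >> DEMO_SHIFT) + c;
}

/* One read attempt. Returns 1 and stores the result if A, B and C were
 * read as a consistent snapshot, 0 if the caller has to restart. */
static int demo_try_read(struct demo_percpu_time *t, uint64_t tsc,
                         uint64_t *out)
{
    unsigned int start;
    uint64_t a, b, c;

    start = atomic_load_explicit(&t->seq, memory_order_acquire);
    if (start & 1)
        return 0;               /* update in progress */

    a = atomic_load_explicit(&t->a, memory_order_relaxed);
    b = atomic_load_explicit(&t->b, memory_order_relaxed);
    c = atomic_load_explicit(&t->c, memory_order_relaxed);

    /* order the data reads before the second read of the counter */
    atomic_thread_fence(memory_order_acquire);
    if (atomic_load_explicit(&t->seq, memory_order_relaxed) != start)
        return 0;               /* changed underneath us */

    *out = demo_scale(tsc, a, b, c);
    return 1;
}
```

A caller would loop until demo_try_read() returns 1; the writer side
(incrementing seq before and after changing A, B and C) is not shown.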

> if you lose your tick in the middle of those reads any number of things 
> can happen to screw the computation... including being scheduled on 
> another core and mixing values from two cores.

Being scheduled on another core is normal. Continuing gettimeofday
on another core after you have the tsc value is just fine.

If anything, the problem is generating A B C in per-cpu data with a
per-cpu seqlock around it. That's the job of the per-cpu kernel
thread.

The only real trouble I see is the offset from the last irq. It's
possible that to make this work we need to rotate the timer irq across
all cpus at regular intervals (before the tsc2usec measurement error
shows up).

> oh i think there are several solutions which will work... and i also think 
> rdtscp wasn't a necessary addition to the ISA :)

Please don't suggest the userland rsp manual unwinding to me; that's
orders of magnitude more fragile and it sounds much more complex too ;).


Re: [patch 2.6.19-git] rts-rs5c372 updates: more chips, alarm, 12hr mode, etc

2006-12-11 Thread Dan Williams

On 12/11/06, Voipio Riku <[EMAIL PROTECTED]> wrote:


> > Have you asked around for anyone who may have insights about i2c-iop3xx
> > driver bugs?  Maybe the driver maintainers, or arm-linux folk, or on
> > the i2c list.
>
> I was told to contact Dan Williams, I didn't get any response.


Hi Riku, this is the first message I have received.

According to the latest specification update
(http://www.intel.com/design/iio/specupdt/27351910.pdf) there are no
known issues with the i2c.  I looked through the thread and did not
see what board you are using, can you send those details?

I have not dealt with the i2c-iop3xx driver in the past. Have you
tried contacting the last person to make functional changes to the
driver?
http://kernel.org/git/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=39288e1ac10b3b9a68a629be67d81a0b53512c4e

Regards,
Dan


Re: [PATCH 0/4] Lumpy Reclaim V3

2006-12-11 Thread Andrew Morton
On Wed, 6 Dec 2006 16:59:04 +
Andy Whitcroft <[EMAIL PROTECTED]> wrote:

> This is a repost of the lumpy reclaim patch set.  This is
> basically unchanged from the last post, other than being rebased
> to 2.6.19-rc2-mm2.

The patch sequencing appeared to be designed to make the code hard to
review, so I clumped them all into a single diff:

>  
>  /*
> + * Attempt to remove the specified page from its LRU.  Only take this
> + * page if it is of the appropriate PageActive status.  Pages which
> + * are being freed elsewhere are also ignored.
> + *
> + * @page:page to consider
> + * @active:  active/inactive flag only take pages of this type

I dunno who started adding these @'s into non-kernel-doc comments.  I'll
un-add them.

> + * returns 0 on success, -ve errno on failure.
> + */
> +int __isolate_lru_page(struct page *page, int active)
> +{
> + int ret = -EINVAL;
> +
> + if (PageLRU(page) && (PageActive(page) == active)) {

We hope that all architectures remember that test_bit returns 0 or
1.  We got that wrong a few years back.  What we do now is rather
un-C-like.  And potentially inefficient.  Hopefully the compiler usually
sorts it out though.
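
To illustrate the concern, here is a small userspace sketch (the demo_*
names are hypothetical, not kernel APIs). Comparing a flag test against
an int "active" value only works if the test is normalised to 0 or 1; a
test that returns the masked bit itself breaks the comparison.

```c
#include <assert.h>

/* A flag test that returns the masked bit: 0 or some nonzero value. */
static unsigned long demo_raw_test(unsigned long flags, unsigned int bit)
{
    return flags & (1UL << bit);
}

/* A flag test normalised to exactly 0 or 1. */
static int demo_test_bit(unsigned long flags, unsigned int bit)
{
    return (int)((flags >> bit) & 1UL);
}

/* Broken for any bit other than bit 0 when active == 1. */
static int demo_matches_raw(unsigned long flags, unsigned int bit, int active)
{
    return demo_raw_test(flags, bit) == (unsigned long)active;
}

/* Safe: both sides are 0 or 1. */
static int demo_matches(unsigned long flags, unsigned int bit, int active)
{
    return demo_test_bit(flags, bit) == active;
}
```

With bit 6 set and active == 1, demo_matches_raw() compares 64 against 1
and reports a mismatch, while demo_matches() reports a match.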


> + ret = -EBUSY;
> + if (likely(get_page_unless_zero(page))) {
> + /*
> +  * Be careful not to clear PageLRU until after we're
> +  * sure the page is not being freed elsewhere -- the
> +  * page release code relies on it.
> +  */
> + ClearPageLRU(page);
> + ret = 0;
> + }
> + }
> +
> + return ret;
> +}
> +
> +/*
>   * zone->lru_lock is heavily contended.  Some of the functions that
>   * shrink the lists perform better by taking out a batch of pages
>   * and working on them outside the LRU lock.
> @@ -621,33 +653,71 @@ keep:
>   */
>  static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
>   struct list_head *src, struct list_head *dst,
> - unsigned long *scanned)
> + unsigned long *scanned, int order)
>  {
>   unsigned long nr_taken = 0;
> - struct page *page;
> - unsigned long scan;
> + struct page *page, *tmp;

"tmp" isn't a very good identifier.

> + unsigned long scan, pfn, end_pfn, page_pfn;

One declaration per line is preferred.  This gives you room for a brief
comment, where appropriate.


> + /*
> +  * Attempt to take all pages in the order aligned region
> +  * surrounding the tag page.  Only take those pages of
> +  * the same active state as that tag page.
> +  */
> + zone_id = page_zone_id(page);
> + page_pfn = __page_to_pfn(page);
> + pfn = page_pfn & ~((1 << order) - 1);

Is this always true?  It assumes that the absolute value of the starting
pfn of each zone is a multiple of MAX_ORDER (doesn't it?) I don't see any
reason per-se why that has to be true (although it might be).

hm, I guess it has to be true, else hugetlb pages wouldn't work too well.
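
A small userspace sketch of the mask arithmetic in question (demo_*
names are hypothetical). It only shows which pfn range the expression
selects; it does not settle whether zone boundaries are always aligned.

```c
#include <assert.h>

struct demo_pfn_range {
    unsigned long start;        /* first pfn of the order-aligned block */
    unsigned long end;          /* one past the last pfn of the block */
};

/* The block of 2^order pfns, aligned on an absolute pfn boundary,
 * that contains page_pfn. */
static struct demo_pfn_range demo_order_block(unsigned long page_pfn,
                                              unsigned int order)
{
    struct demo_pfn_range r;

    r.start = page_pfn & ~((1UL << order) - 1);
    r.end = r.start + (1UL << order);
    return r;
}
```

For page_pfn 37 and order 3 this gives the range 32..40. If a zone
happened to start at pfn 36, the range would begin before the zone
start, which is the case the pfn_valid() and page_zone_id() checks in
the loop would have to catch.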

> + end_pfn = pfn + (1 << order);
> + for (; pfn < end_pfn; pfn++) {
> + if (unlikely(pfn == page_pfn))
> + continue;
> + if (unlikely(!pfn_valid(pfn)))
> + break;
> +
> + tmp = __pfn_to_page(pfn);
> + if (unlikely(page_zone_id(tmp) != zone_id))
> + continue;
> + scan++;
> + switch (__isolate_lru_page(tmp, active)) {
> + case 0:
> + list_move(&tmp->lru, dst);
> + nr_taken++;
> + break;
> +
> + case -EBUSY:
> + /* else it is being freed elsewhere */
> + list_move(&tmp->lru, src);
> + default:
> + break;
> + }
> + }

I think each of those

if (expr)
continue;

statements would benefit from a nice comment explaining why.


This physical-scan part of the function will skip pages which happen to be
on *src.  I guess that won't matter much, once the system has been up for a
while and the LRU is nicely scrambled.


If this function is passed a list of 32 pages, and order=4, I think it will
go and give us as many as 512 pages on *dst?  A check of nr_taken might be
needed.
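
The arithmetic behind that estimate, plus one possible shape of such a
check, as a userspace sketch. The demo_* names and the idea of a caller
supplied limit are assumptions made for illustration; the patch itself
has no such parameter.

```c
#include <assert.h>

/* Upper bound on pages moved to *dst if every scanned page drags in
 * its whole order-aligned block: nr_to_scan * 2^order. */
static unsigned long demo_worst_case_taken(unsigned long nr_to_scan,
                                           unsigned int order)
{
    return nr_to_scan << order;
}

/* How many more pages may be taken before a (hypothetical) limit is hit. */
static unsigned long demo_room_left(unsigned long nr_taken,
                                    unsigned long limit)
{
    return nr_taken < limit ? limit - nr_taken : 0;
}
```

With nr_to_scan = 32 and order = 4 the bound is 32 * 16 = 512 pages.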


The patch is pretty simple, isn't it?

I guess a shortcoming is that it doesn't address the situation where
GFP_ATOMIC network rx is trying to allocate order-2 pages for large skbs,
but kswapd doesn't know that.  AFAICT nobody will actually run the nice new
code in this quite common scenario.


Re: [BUG] commit 3c517a61, slab: better fallback allocation behavior

2006-12-11 Thread Jay Cliburn

Christoph Lameter wrote:
> Ahh. Fallback_alloc() does not do the check for GFP_WAIT as done in
> cache_grow(). Thus interrupts are disabled when we call kmem_getpages()
> which results in the failure.
>
> Duplicate the handling of GFP_WAIT in cache_grow().
>
> Jay could you try this patch?


The patch seems to fix the bug.  I've been running about an hour with it now, 
and I haven't seen any error messages.  Prior to the patch, I'd see the messages 
within a few minutes of starting a login session.


Jay


Re: rdtscp vgettimeofday

2006-12-11 Thread dean gaudet
On Mon, 11 Dec 2006, Andrea Arcangeli wrote:

> On Mon, Dec 11, 2006 at 01:17:25PM -0800, dean gaudet wrote:
> > rdtscp doesn't solve anything extra [..]
> > [..] lsl-based vgetcpu is relatively slow
> 
> Well, if you accept to run slow there's nothing to solve in the first
> place indeed.
> 
> If nothing else rdtscp should avoid the mess of restarting a
> vsyscalls, which is quite a difficult problem as it heavily depends on
> the compiler/dwarf.

rdtscp gets you 2 of the 5 values you need to compute the time.  anything 
can happen between when you do the rdtscp and do the other 3 reads:  the 
computation is (((tsc-A)*B)>>N)+C where N is a constant, and A, B, C are 
per-cpu data.

A/B/C change a few times a second (to avoid 32-bit rollover in (tsc-A)), 
every time there's a halt, and every P-state transition.

if you lose your tick in the middle of those reads any number of things 
can happen to screw the computation... including being scheduled on 
another core and mixing values from two cores.


> > even with rdtscp you have to deal with the definite possibility of being 
> > scheduled away in the middle of the computation.  arguably you need to 
> 
> Isn't rdtscp atomic? all you need is to read atomically the current
> contents of the tsc and the index to use in a per-cpu table exported
> in readonly. This table will contain a per-cpu seqlock as well. Then a
> math logic has to be built with per-cpu threads, so that those per-cpu
> tables are updated by cpufreq and at regular intervals.
> 
> If this is all wrong and it's not feasible to implement a safe and
> monotonic vgettimeofday that doesn't access the southbridge and that
> doesn't require restarting the vsyscall manually by patching rip/rsp,
> I have a hard time seeing how rdtscp is useful at all. I hope somebody
> thought about those issues before adding a new instruction to a
> popular CPU ;).

oh i think there are several solutions which will work... and i also think 
rdtscp wasn't a necessary addition to the ISA :)

-dean


Re: race in sysfs between sysfs_remove_file() and read()/write() #2

2006-12-11 Thread Greg KH
On Mon, Dec 11, 2006 at 04:13:06PM +0530, Maneesh Soni wrote:
> On Mon, Dec 04, 2006 at 11:06:41AM -0500, Alan Stern wrote:
> > On Mon, 4 Dec 2006, Maneesh Soni wrote:
> > 
> > > hmm, I guess Greg has to say the final word. The question is either to
> > > fail the IO (-ENODEV) or fail the file removal (-EBUSY). If we are not
> > > going to fail the removal then your patch is the way to go.
> > >
> > > Greg?
> > 
> > Oliver is right that we cannot allow device_remove_file() to fail.  In
> > fact we can't even allow it to block until all the existing open file
> > references are closed.
> > 
> > Our major questions have to do with the details of the patch itself.  In
> > particular, we are worried about possible races with the VFS and the
> > handling of the inode's usage count.  Can you examine the patch carefully
> > to see if it is okay?
> > 
> 
> Sorry for late reply.. I reviewed the patch and it looks ok to me.

Thanks for the review.  Oliver, care to resend it to me so I can give it
some testing in the -mm tree?

thanks,

greg k-h


xfslogd-spinlock bug?

2006-12-11 Thread Haar János
Hello, list,

I am the "big red button man" with the one big 14TB xfs, if somebody can
remember me. :-)

I first found this in 2.6.16.18, then tried 2.6.18.4 and 2.6.19, but the
bug still exists:

Dec 11 22:47:21 dy-base BUG: spinlock bad magic on CPU#3, xfslogd/3/317
Dec 11 22:47:21 dy-base general protection fault:  [1]
Dec 11 22:47:21 dy-base SMP
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base CPU 3
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base Modules linked in: nbd rd netconsole e1000 video
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base Pid: 317, comm: xfslogd/3 Not tainted 2.6.19 #1
Dec 11 22:47:21 dy-base RIP: 0010:[]
Dec 11 22:47:21 dy-base  [] spin_bug+0x69/0xdf
Dec 11 22:47:21 dy-base RSP: 0018:81011fb89bc0  EFLAGS: 00010002
Dec 11 22:47:21 dy-base RAX: 0033 RBX: 6b6b6b6b6b6b6b6b RCX:

Dec 11 22:47:21 dy-base RDX: 808137a0 RSI: 0082 RDI:
0001
Dec 11 22:47:21 dy-base RBP: 81011fb89be0 R08: 00026a70 R09:
6b6b6b6b
Dec 11 22:47:21 dy-base R10: 0082 R11: 81000584d380 R12:
8100db92ad80
Dec 11 22:47:21 dy-base R13: 80642dc6 R14:  R15:
0003
Dec 11 22:47:21 dy-base FS:  ()
GS:81011fc76b90() knlGS:
Dec 11 22:47:21 dy-base CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
Dec 11 22:47:21 dy-base CR2: 2ba00770 CR3: 000108c05000 CR4:
06e0
Dec 11 22:47:21 dy-base Process xfslogd/3 (pid: 317, threadinfo
81011fb88000, task 81011fa7f830)
Dec 11 22:47:21 dy-base Stack:
Dec 11 22:47:21 dy-base  81011fb89be0
Dec 11 22:47:21 dy-base  8100db92ad80
Dec 11 22:47:21 dy-base  
Dec 11 22:47:21 dy-base  
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base  81011fb89c10
Dec 11 22:47:21 dy-base  803f3bdc
Dec 11 22:47:21 dy-base  0282
Dec 11 22:47:21 dy-base  
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base  
Dec 11 22:47:21 dy-base  
Dec 11 22:47:21 dy-base  81011fb89c30
Dec 11 22:47:21 dy-base  805e7f2b
Dec 11 22:47:21 dy-base
Dec 11 22:47:21 dy-base Call Trace:
Dec 11 22:47:21 dy-base  [] _raw_spin_lock+0x23/0xf1
Dec 11 22:47:21 dy-base  [] _spin_lock_irqsave+0x11/0x18
Dec 11 22:47:21 dy-base  [] __wake_up+0x22/0x50
Dec 11 22:47:21 dy-base  [] xfs_buf_unpin+0x21/0x23
Dec 11 22:47:21 dy-base  [] xfs_buf_item_unpin+0x2e/0xa6
Dec 11 22:47:21 dy-base  []
xfs_trans_chunk_committed+0xc3/0xf7
Dec 11 22:47:21 dy-base  [] xfs_trans_committed+0x49/0xde
Dec 11 22:47:21 dy-base  []
xlog_state_do_callback+0x185/0x33f
Dec 11 22:47:21 dy-base  [] xlog_iodone+0x104/0x131
Dec 11 22:47:22 dy-base  [] xfs_buf_iodone_work+0x1a/0x3e
Dec 11 22:47:22 dy-base  [] worker_thread+0x0/0x134
Dec 11 22:47:22 dy-base  [] run_workqueue+0xa8/0xf8
Dec 11 22:47:22 dy-base  [] xfs_buf_iodone_work+0x0/0x3e
Dec 11 22:47:22 dy-base  [] worker_thread+0x0/0x134
Dec 11 22:47:22 dy-base  [] worker_thread+0xfb/0x134
Dec 11 22:47:22 dy-base  [] default_wake_function+0x0/0xf
Dec 11 22:47:22 dy-base  [] worker_thread+0x0/0x134
Dec 11 22:47:22 dy-base  [] kthread+0xd8/0x10b
Dec 11 22:47:22 dy-base  [] schedule_tail+0x45/0xa6
Dec 11 22:47:22 dy-base  [] child_rip+0xa/0x12
Dec 11 22:47:22 dy-base  [] worker_thread+0x0/0x134
Dec 11 22:47:22 dy-base  [] kthread+0x0/0x10b
Dec 11 22:47:22 dy-base  [] child_rip+0x0/0x12
Dec 11 22:47:22 dy-base
Dec 11 22:47:22 dy-base
Dec 11 22:47:22 dy-base Code: 8b 83 0c 01 00 00 48 8d 8b 98 02 00 00 41 8b 54 24 04 41 89
Dec 11 22:47:22 dy-base
Dec 11 22:47:22 dy-base RIP
Dec 11 22:47:22 dy-base  [] spin_bug+0x69/0xdf
Dec 11 22:47:22 dy-base  RSP 
Dec 11 22:47:22 dy-base
Dec 11 22:47:22 dy-base Kernel panic - not syncing: Fatal exception
Dec 11 22:47:22 dy-base
Dec 11 22:47:22 dy-base Rebooting in 5 seconds..

After this, the server sometimes reboots normally, but sometimes it hangs:
no console, no sysrq, nothing.

This is a "simple" crash; not "too much" data is lost, nor anything else.

Can somebody help me track down the problem?

Thanks,
Janos Haar





[PATCH] Introduce jiffies_32 and related compare functions

2006-12-11 Thread Eric Dumazet

Some subsystems don't need more than 32-bit timestamps.

See for example net/ipv4/inetpeer.c and include/net/tcp.h:
#define tcp_time_stamp		((__u32)(jiffies))


Because most timeouts should work with 'normal jiffies', which are 32 bits on
32-bit platforms, it makes sense to be able to store them in only 32 bits
rather than 64, to save RAM.


This patch introduces jiffies_32, and related comparison functions 
time_after32(), time_before32(), time_after_eq32() and time_before_eq32().


I plan to use this infrastructure in network code for example (struct 
dst_entry comes to mind).


Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
--- linux-2.6.19/include/linux/jiffies.h	2006-12-12 00:32:00.0 +0100
+++ linux-2.6.19-ed/include/linux/jiffies.h	2006-12-12 00:41:40.0 +0100
@@ -80,6 +80,11 @@
  */
 extern u64 __jiffy_data jiffies_64;
 extern unsigned long volatile __jiffy_data jiffies;
+/*
+ * Some subsystems need small deltas and can store 32 bits timestamps
+ * instead of 'long', to save space on 64bits platforms.
+ */
+#define jiffies_32 ((u32)jiffies)
 
 #if (BITS_PER_LONG < 64)
 u64 get_jiffies_64(void);
@@ -131,6 +136,22 @@ static inline u64 get_jiffies_64(void)
 #define time_before_eq64(a,b)  time_after_eq64(b,a)
 
 /*
+ * Same as above, but does so with 32bits types.
+ * These must be used when using jiffies_32
+ */
+#define time_after32(a,b)  \
+   (typecheck(__u32, a) && \
+typecheck(__u32, b) && \
+((__s32)(b) - (__s32)(a) < 0))
+#define time_before32(a,b) time_after32(b,a)
+
+#define time_after_eq32(a,b)   \
+   (typecheck(__u32, a) && \
+typecheck(__u32, b) && \
+((__s32)(a) - (__s32)(b) >= 0))
+#define time_before_eq32(a,b)  time_after_eq32(b,a)
+
+/*
  * Have the 32 bit jiffies value wrap 5 minutes after boot
  * so jiffies wrap bugs show up earlier.
  */
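
A userspace sketch of the wraparound-safe comparison these macros rely
on. The demo_* functions are illustrative, not the patch's macros: they
do the subtraction in unsigned arithmetic and then reinterpret the
result as signed, which avoids signed overflow in plain C, whereas the
patch casts each operand to __s32 first. Either way the result is only
meaningful while the two timestamps are less than 2^31 ticks apart.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when a is later than b, treating both as wrapping 32-bit
 * tick counters. */
static int demo_time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Nonzero when a is later than or equal to b. */
static int demo_time_after_eq32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}
```

For example, a timestamp of 5 taken just after the counter wrapped still
compares as later than 0xfffffff0 taken just before the wrap.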


[PATCH] md: Don't assume that READ==0 and WRITE==1 - use the names explicitly.

2006-12-11 Thread NeilBrown
### Comments for Changeset

Thanks Jens for alerting me to this.

Cc: Jens Axboe <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/faulty.c |2 +-
 ./drivers/md/raid1.c  |2 +-
 ./drivers/md/raid10.c |6 +++---
 ./drivers/md/raid5.c  |   20 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff .prev/drivers/md/faulty.c ./drivers/md/faulty.c
--- .prev/drivers/md/faulty.c   2006-12-12 09:47:58.0 +1100
+++ ./drivers/md/faulty.c   2006-12-12 09:48:10.0 +1100
@@ -173,7 +173,7 @@ static int make_request(request_queue_t 
conf_t *conf = (conf_t*)mddev->private;
int failit = 0;
 
-   if (bio->bi_rw & 1) {
+   if (bio_data_dir(bio) == WRITE) {
/* write request */
if (atomic_read(&conf->counters[WriteAll])) {
/* special case - don't decrement, don't generic_make_request,

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c   2006-12-12 09:42:11.0 +1100
+++ ./drivers/md/raid10.c   2006-12-12 09:45:02.0 +1100
@@ -1785,7 +1785,7 @@ static sector_t sync_request(mddev_t *md
biolist = bio;
bio->bi_private = r10_bio;
bio->bi_end_io = end_sync_read;
-   bio->bi_rw = 0;
+   bio->bi_rw = READ;
bio->bi_sector = r10_bio->devs[j].addr +
conf->mirrors[d].rdev->data_offset;
bio->bi_bdev = conf->mirrors[d].rdev->bdev;
@@ -1801,7 +1801,7 @@ static sector_t sync_request(mddev_t *md
biolist = bio;
bio->bi_private = r10_bio;
bio->bi_end_io = end_sync_write;
-   bio->bi_rw = 1;
+   bio->bi_rw = WRITE;
bio->bi_sector = r10_bio->devs[k].addr +
conf->mirrors[i].rdev->data_offset;
bio->bi_bdev = conf->mirrors[i].rdev->bdev;
@@ -1870,7 +1870,7 @@ static sector_t sync_request(mddev_t *md
biolist = bio;
bio->bi_private = r10_bio;
bio->bi_end_io = end_sync_read;
-   bio->bi_rw = 0;
+   bio->bi_rw = READ;
bio->bi_sector = r10_bio->devs[i].addr +
conf->mirrors[d].rdev->data_offset;
bio->bi_bdev = conf->mirrors[d].rdev->bdev;

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c   2006-12-08 12:07:39.0 +1100
+++ ./drivers/md/raid1.c   2006-12-12 09:45:10.0 +1100
@@ -1736,7 +1736,7 @@ static sector_t sync_request(mddev_t *md
/* take from bio_init */
bio->bi_next = NULL;
bio->bi_flags |= 1 << BIO_UPTODATE;
-   bio->bi_rw = 0;
+   bio->bi_rw = READ;
bio->bi_vcnt = 0;
bio->bi_idx = 0;
bio->bi_phys_segments = 0;

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c   2006-12-11 09:54:43.0 +1100
+++ ./drivers/md/raid5.c   2006-12-12 09:49:53.0 +1100
@@ -1827,16 +1827,16 @@ static void handle_stripe5(struct stripe
struct bio *bi;
mdk_rdev_t *rdev;
if (test_and_clear_bit(R5_Wantwrite, &sh->dev[i].flags))
-   rw = 1;
+   rw = WRITE;
else if (test_and_clear_bit(R5_Wantread, &sh->dev[i].flags))
-   rw = 0;
+   rw = READ;
else
continue;
  
bi = &sh->dev[i].req;
  
bi->bi_rw = rw;
-   if (rw)
+   if (rw == WRITE)
bi->bi_end_io = raid5_end_write_request;
else
bi->bi_end_io = raid5_end_read_request;
@@ -1872,7 +1872,7 @@ static void handle_stripe5(struct stripe
atomic_add(STRIPE_SECTORS, &rdev->corrected_errors);
generic_make_request(bi);
} else {
-   if (rw == 1)
+   if (rw == WRITE)
set_bit(STRIPE_DEGRADED, &sh->state);
PRINTK("skip op %ld on disc %d for sector %llu\n",
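
A userspace sketch of why comparing the direction by name is more robust
than testing the raw field. All demo_* names are made up for
illustration, including the extra flag bit; the real READ, WRITE and
bio_data_dir() come from the kernel headers.

```c
#include <assert.h>

enum { DEMO_READ = 0, DEMO_WRITE = 1 };

#define DEMO_RW_EXTRA_FLAG (1UL << 4)   /* made-up non-direction flag */

struct demo_bio {
    unsigned long rw;           /* direction in bit 0, flags above it */
};

/* Extract only the direction bit, ignoring any other flags. */
static int demo_data_dir(const struct demo_bio *bio)
{
    return (int)(bio->rw & 1UL);
}
```

A read request with an extra flag set has a nonzero rw field, so a bare
"if (rw)" would treat it as a write, while the named comparison against
DEMO_WRITE does not.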

Re: 2.6.19-mm1 (md/raid1 randomly drops partitions)

2006-12-11 Thread Neil Brown
On Monday December 11, [EMAIL PROTECTED] wrote:
> Hi,
> 
> On Monday, 11 December 2006 09:58, Andrew Morton wrote:
> > 
> > Temporarily at
> > 
> > http://userweb.kernel.org/~akpm/2.6.19-mm1/
> > 
> > Will appear later at
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/
> 
> It caused all of the md RAID1s on my test box to drop one of their partitions,
> apparently at random.

That's clever

Do you have any kernel logs of this happening?  My guess would be the
underlying device driver is returning more errors than before, but we
need the logs to be sure.

Thanks,
NeilBrown


[PATCH] kvm needs menu structure

2006-12-11 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

KVM config items need to be inside a menu structure instead of
dangling off of Device Drivers.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/kvm/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19-git18.orig/drivers/kvm/Kconfig
+++ linux-2.6.19-git18/drivers/kvm/Kconfig
@@ -1,7 +1,7 @@
 #
 # KVM configuration
 #
-config KVM
+menuconfig KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on X86 && EXPERIMENTAL
---help---


---


Re: 2.6.19-git13: uts banner changes break SLES9 (at least)

2006-12-11 Thread Andy Whitcroft

Linus Torvalds wrote:
> On Mon, 11 Dec 2006, Andy Whitcroft wrote:
> > I am afraid to report that this second version also fails for me, as you point
> > out CIFS can break us if defined.
>
> Olaf, will you admit that the SLES9 code is crap now?
>
> Andy, does just replacing the "__initdata" with "const" fix it for you?
> That should hopefully mean that IN PRACTICE the Linux version string will
> be the first one to be triggered, if only because init/main.c is linked
> reasonably early, and all the other "Linux version" strings will hopefully
> be in the same rodata section.

Yes that does make things 'work' again.  This all seems pretty fragile :(.

> Sad, sad. We shouldn't need to work around tools that are so _obviously_
> broken like this.

-apw


Re: 2.6.19-mm1

2006-12-11 Thread Rafael J. Wysocki
Hi,

On Monday, 11 December 2006 09:58, Andrew Morton wrote:
> 
> Temporarily at
> 
>   http://userweb.kernel.org/~akpm/2.6.19-mm1/
> 
> Will appear later at
> 
>   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/

It caused all of the md RAID1s on my test box to drop one of their partitions,
apparently at random.

Greetings,
Rafael


RE: [PATCH] i386 add idle notifier (take 2)

2006-12-11 Thread Pallipadi, Venkatesh


Stephane,

This patch has the same race as the 64-bit patch, which was fixed here:
http://www.ussg.iu.edu/hypermail/linux/kernel/0611.3/1264.html

With that race, idle callbacks do not work correctly. Even on a
totally idle system, I can see exit_idle called before enter_idle once
every few seconds. Can you update this patch with changes similar to those
in the 64-bit part of the above patch?

Thanks,
Venki 

>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stephane Eranian
>Sent: Wednesday, November 29, 2006 8:41 AM
>To: linux-kernel@vger.kernel.org
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; Stephane Eranian
>Subject: Re: [PATCH] i386 add idle notifier (take 2)
>
>Hello,
>
>[This is the second take due to stray '}' in the patch. Sorry about that]
>
>Here is a patch that adds an idle notifier to the i386 tree.
>The idle notifier functionalities and implementation are
>identical to the x86_64 idle notifier. We use the idle notifier
>in the context of perfmon.
>
>The patch is against Andi Kleen's x86_64-2.6.19-rc6-061128-1.bz2
>kernel. It may apply to other kernels but it needs some updates
>to poll_idle() and default_idle() to work correctly.
>
>changelog:
>   - add an idle notifier mechanism to i386 tree
>
>signed-off-by: stephane eranian <[EMAIL PROTECTED]>
>


Teles PCI not initializing with HiSax

2006-12-11 Thread Stian Jordet
Hi,

since at least 2.6.18, my ISDN card has given me this error in dmesg:

ISDN subsystem Rev: 1.1.2.3/1.1.2.3/1.1.2.2/1.1.2.3/none/1.1.2.2
PPP BSD Compression module registered
HiSax: Linux Driver for passive ISDN cards
HiSax: Version 3.5 (kernel)
HiSax: Layer1 Revision 2.46.2.5
HiSax: Layer2 Revision 2.30.2.4
HiSax: TeiMgr Revision 2.20.2.3
HiSax: Layer3 Revision 2.22.2.3
HiSax: LinkLayer Revision 2.59.2.4
HiSax: Total 1 card defined
HiSax: Card 1 Protocol EDSS1 Id=HiSax (0)
HiSax: Teles/PCI driver Rev. 2.23.2.3
ACPI: PCI Interrupt :00:0d.0[A] -> GSI 16 (level, low) -> IRQ 19
Found: Zoran, base-address: 0xdb80, irq: 0x13
HiSax: Teles PCI config irq:19 mem:e084e000
TelesPCI: ISAC version (0): 2086/2186 V1.1
TelesPCI: HSCX version A: V2.1  B: V2.1
Teles PCI: IRQ 19 count 0
Teles PCI: IRQ 19 count 0
Teles PCI: IRQ(19) getting no interrupts during init 1
Teles PCI: IRQ 19 count 0
Teles PCI: IRQ(19) getting no interrupts during init 2
Teles PCI: IRQ 19 count 0
Teles PCI: IRQ(19) getting no interrupts during init 3
HiSax: Card Teles PCI not installed !

I also get this error when I boot with "acpi=off noapic", so I'm pretty
sure it's a hisax bug, not an acpi/irq error.

Does anyone have an idea what the problem is here?

Please cc me, as I'm not subscribed.

Thanks.

Best regards,
Stian



Re: [RFC: -mm patch] OCFS2: make code static

2006-12-11 Thread Mark Fasheh
Hi Adrian,

On Mon, Dec 11, 2006 at 08:10:01PM +0100, Adrian Bunk wrote:
> On Mon, Dec 11, 2006 at 12:58:07AM -0800, Andrew Morton wrote:
> >...
> > Changes since 2.6.19-rc6-mm2:
> >...
> >  git-ocfs2.patch
> >...
> >  git trees.
> >...
> 
> This patch makes needlessly global code static.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

I hand-merged the tcp.c change as the patch introducing that variable is
going upstream soon. Would you mind sending me the dlm/* stuff as a separate
patch?

Thanks,
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]


Re: [patch 2.6.19-git] rts-rs5c372 updates: more chips, alarm, 12hr mode, etc

2006-12-11 Thread Voipio Riku
> On Sunday 10 December 2006 10:27 pm, Voipio Riku wrote:
>> > Update the rtc-rs5c372 driver:
>> > I suspect the
>> > issue wasn't that "mode 1" didn't work on that board; the original
>> > code to fetch the trim was broken.  If "mode 1" really won't work,
>> > that's almost certainly a bug in that board's I2C driver.

>> It was not related to trim fetching. Yes, it is very likely that the board's
>> i2c controller (i2c-iop3xx) has a bug, but I'm not competent enough to
>> find out what it is actually sending out on the wire.

> I'd expect that would be the controller _driver_ ... although it would
> not surprise me to know there were also (unfixed) silicon bugs to cope
> with, like version-specific differences.  One hopes errata are published
> for the chip you're using, and that they don't lie.

From what I saw, the driver simply passes messages over to the i2c
controller. It even specifically mentions that it supports repeated start
conditions, as needed for read method #1. Comparing against the 80219
manual[1], I did not spot anything obviously wrong.

> Have you asked around for anyone who may have insights about i2c-iop3xx
> driver bugs?  Maybe the driver maintainers, or arm-linux folk, or on
> the i2c list.

I was told to contact Dan Williams, but I didn't get any response.

>> With your patch, the rtc acts like the chip would completely ignore the
>> "address" transfer, and starts reading from the last (default) register
>> anyway. Thus all the regs look shifted by one in the driver.

> That's quite strange.  The docs on the RTC are quite clear about what's
> supposed to happen with what I2C messages.  And I'd expect them to be
> right ... especially since they behaved for me, and the original author
> of that code!  That makes me suspect that your particular I2C controller
> driver must not be issuing the protocol requests it should be, at least
> on your hardware and revision.

Well, at least I'm happy that there is now someone more experienced working
on this driver. When I tried to get it working, I could not find anyone
with another board to verify whether the original and/or my patch works for
them.

>> > +  /* this implements the first (most portable) reading method
>> > +   * specified in the datasheet.
>> > */

>> Why is this method considered more portable? How about making the read
>> method a module parameter?

> Of the three methods, #2 depends on messages that not all I2C masters
> are necessarily going to be able to issue, and #3 assumes that there's
> no other I2C master accessing that chip.

Agreed, I wouldn't consider method #2 either.

> Plus, if I understand things correctly, using mode #3 would break when
> writing

It should not. Writing isn't related to the reading methods according to
the datasheet[2]. It provides one addressing method for writing, and writing
works fine on our Thecus/Allnet hardware.

[1] http://www.intel.com/design/iio/manuals/274017.htm
[2] http://www.ricoh.com/LSI/product_rtc/2wire/5c372/5c372a-e.pdf


Re: powerpc: "IRQ probe failed (0x0)" on powerbook

2006-12-11 Thread Christoph Hellwig
On Tue, Dec 12, 2006 at 07:28:23AM +1100, Benjamin Herrenschmidt wrote:
> > Same here, btw - except that I couldn't catch the exact message as
> > nicely.
> 
> Yeah, fixed in the patch I sent yesterday [PATCH] powerpc: Fix irq
> routing on some PowerMac 32 bit.

Confirmed, everything is fine with that patch.


Re: 2.6.19-git13: uts banner changes break SLES9 (at least)

2006-12-11 Thread Paul Mackerras
Linus Torvalds writes:

> On Mon, 11 Dec 2006, Olaf Hering wrote:
> > 
> > arch/powerpc/boot/wrapper:156:version=`${CROSS}strings "$kernel" | grep 
> > '^Linux version [-0-9.]' | \
> 
> This is also obviously broken (and really sad), but actually ends up being 
> better than what get_kernel_version apparently does, by at least adding 
> the requirement that the string "Linux version" be slightly more correct.
> 
> However, it's also TOTALLY BROKEN. 

It's the minimum effort for the barely acceptable outcome. :)

The wrapper script, although it currently lives in arch/powerpc/boot,
is designed and intended to be standalone, so that people can use it
outside the kernel tree, and possibly even without having the kernel
source easily to hand.  Therefore I didn't want to use any kernel
header files.

Apparently the only reason "mkimage" wants to know the kernel version
is to put it as a comment in the image, which can be displayed to the
user when booting with u-boot (the bootloader used on some embedded
platforms).  So it's not critical if the grep fails, and it's slightly
more useful to do the grep than it would be to not even try to provide
any version string to mkimage.

If there is a reliable way to get the version string, great, I'll use
that.
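
As an illustration of the check discussed above, here is a minimal Python
sketch of the `strings | grep '^Linux version [-0-9.]'` step. It is not part
of the wrapper script. The function name find_kernel_version and the
4-character minimum run length (the default used by strings) are assumptions
made for this example, and it returns the whole matching string rather than
guessing at what the truncated remainder of the pipeline extracts.

```python
import re
from typing import Optional

# Runs of printable ASCII (plus tab) of length >= 4, roughly what
# `strings` prints with its default settings.
PRINTABLE_RUN = re.compile(rb"[\t\x20-\x7e]{4,}")

# Same pattern as the grep in the wrapper: the run must start with
# "Linux version " followed by a digit, dot or dash.
BANNER = re.compile(r"Linux version [-0-9.]")


def find_kernel_version(image: bytes) -> Optional[str]:
    """Return the first printable run that looks like a version banner."""
    for run in PRINTABLE_RUN.findall(image):
        text = run.decode("ascii")
        if BANNER.match(text):  # match() anchors at the start, like '^'
            return text
    return None  # not critical: the caller can simply omit the comment
```

Requiring a version-like character after "Linux version " is what lets the
check skip a bare format string such as "Linux version %s", but it remains a
heuristic over the binary, not a reliable interface.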

Paul.


Re: noexec=on doesn't work

2006-12-11 Thread John Richard Moser

Eric Piel wrote:
> 12/09/2006 09:03 PM, Kyle McMartin wrote/a écrit:
>> On Sat, Dec 09, 2006 at 02:34:47PM -0500, John Richard Moser wrote:
>>> I have filed this as a distro bug with Ubuntu; it may be their issue, I
>>> haven't dug deep enough to find out.  I am posting this here to disperse
>>> the information breadth-first instead of depth-first, which will shorten
>>> the bug's life cycle if it turns out to be an upstream bug.
>>>
>>
>> NX requires the 64-bit page table entries (ie, PAE) which requires
>> CONFIG_HIGHMEM64G.
> 
> Somehow there is a problem: a user can explicitly put "noexec=on" and it
> will be silently ignored if the kernel doesn't have PAE support. I guess
> that currently no message is written because "noexec=on" is the
> _default_. Still, it would be fair to the user who added "noexec=on" on
> its command line that if it is not respected, either because the
> hardware doesn't support it or because the kernel doesn't support it, we
> display a warning saying it's hopeless.
> 

Would have saved me and others a lot of trouble if this happened, yes; I
wouldn't have written a test case and wtf'd at it for 5 days.  :)

> I'll send a patch if it seems meaningful to you,

Telling the user may be better than leaving them to guess; then again, any
knowledgeable user should know based on his config (yes I know, by this
logic I should have known about the HIGHMEM64G thing).
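
For anyone who wants to check the hardware side from userspace, a rough
Python sketch follows. On x86 the "flags" line of /proc/cpuinfo lists pae
and nx when the CPU supports them. This says nothing about whether the
running kernel was built with CONFIG_HIGHMEM64G, which is the other half of
the problem. The helper name cpu_has_flags is made up for this example.

```python
def cpu_has_flags(cpuinfo_text, wanted):
    """Return True if every flag in `wanted` is on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(wanted) <= set(value.split())
    return False  # no flags line found at all


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo text; kept separate so the parser is easy to test."""
    with open(path) as f:
        return f.read()
```

Calling cpu_has_flags(read_cpuinfo(), ["pae", "nx"]) on the affected machine
would show whether the CPU advertises both features.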

> c u
> Eric
> 
> 
> 
> 




Re: [patch 2.6.19-git] watchdog: at91_wdt build fix

2006-12-11 Thread Wim Van Sebroeck
Hi David,

> > See also Andrew Victor's patch (dated 04/Dec/2006) in the
> > linux-2.6-watchdog tree. It's indeed the at91rm9200_wdt, the mpcore_wdt
> > and the omap_wdt that are affected by the miscdev changes
> 
> I was just following the "fix brown paper bags ASAP" policy.
> One that seems followed less closely than usual in the current
> kernel tree.  :(

Hmm, re-reading this, my original message wasn't really clear...
I actually wanted to say that I had already included Andrew's patches and
therefore didn't add your patches as well. Sorry about that.
But you're right: we need to fix things as soon as possible.

> Hmm, I'm a bit surprised that not all the watchdog drivers have
> this issue.  Is the problem that most of them don't actually
> adhere to the driver model ... that is, most don't have any kind
> of (platform or other) device backing the watchdog?

Most of them don't use the driver model yet, and those that do
don't use the miscdev as a "parent". But this is on the roadmap
as part of the conversion to the generic watchdog structure.
(Hmm, I need to talk to Rudolf Marek again and continue our
conversation about the generic watchdog functions; I will do
that tomorrow).

Greetings,
Wim.



Re: [PATCH] group xtime, xtime_lock, wall_to_monotonic, avenrun, calc_load_count fields together in ktimed

2006-12-11 Thread Andrew Morton
On Mon, 11 Dec 2006 21:44:34 +0100
Eric Dumazet <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > 
> > hm, the patch seems to transform a mess into a mess.  I guess it's a messy
> > problem.
> > 
> > I agree that aggregating all the time-related things into a struct like
> > this makes some sense.  As does aggregating them all into a similar-looking
> > namespace, but that'd probably be too intrusive - too late for that.
> 
> 
> Hi Andrew, thanks for your comments.
> 
> I sent two patches for the __attribute__((weak)) xtime_lock thing, and 
> calc_load() optimization, which dont depend on ktimed.

yup, thanks.

> Should I now send patches for aggregating things or is it considered too 
> intrusive ?

The previous version didn't look too intrusive.  But it would be nice to
have a plan to get rid of the macros:

#define xtime_lock  ktimed.xtime_lock

and just open-code this everywhere.

> (Sorry if I didnt understand your last sentence)

What I meant was: if we're not going to aggregate all these globals like
this:

ktimed.xtime_lock
ktimed.wall_to_monotonic

then it would be nice if they were at least aggregated by naming convention:

time_management_time_lock
time_management_wall_to_monotonic
etc

so the reader can see that these things are all part of the same subsystem.

But the proposed ktimed.xtime_lock achieves that, and has runtime benefits
too.

Can we please not call it ktimed?  That sounds like a kernel thread to me. 
time_data would be better.

> If yes, should I send separate patches to :
> 
> 1) define an empty ktimed (or with a placeholder for jiffies64, not yet used)
> 2) move xtime into ktimed
> 3) move xtime_lock into ktimed
> 4) move wall_to_monotonic into ktimed
> 5) move calc_load.count into ktimed
> 6) move avenrun into ktimed.

A single patch there would suffice, I suspect.

> 7) patches to use ktimed.jiffies64 on various arches (with the problem of 
> aliasing jiffies)

That might be a sprinkle of per-arch patches, but I'm not sure what is
entailed here.



PCI resource allocation problem

2006-12-11 Thread Steve Murphy

Hi
I've got a card that presents a PCIe-to-PCI transparent
bridge to the slot connector, behind which is a non-transparent
bridge with 3 BARs: 1 non-prefetchable and 2 prefetchable.
The non-prefetchable BAR is not assigned after boot on some
machines. It seems that if resource allocation fails on Linux,
it fails on XP too, so I presume the OS has failed to correct
a BIOS device enumeration error. An identical install runs
fine on some machines (some HPs and Dells), but I have some
ASUS and Intel board machines that show this fault.

I've tried booting the kernel with pci=assign-busses (BTW,
how can I verify the params a running kernel is using?) but
with no result.  The non-prefetchable BAR in question is only
64K; I never see a problem assigning the PF BARs (256M and 64M).
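
On the side question of verifying the parameters of a running kernel: the
boot command line is exposed in /proc/cmdline. A small Python sketch that
reads and splits it is below. The function names are invented for this
example, and using shlex for the quoting is an approximation of how the
kernel treats quoted values, not an exact match.

```python
import shlex


def parse_cmdline(text):
    """Split a kernel command line into {name: value-or-None}."""
    params = {}
    for token in shlex.split(text):
        name, sep, value = token.partition("=")
        # Bare flags such as "ro" or "quiet" carry no value.
        params[name] = value if sep else None
    return params


def running_kernel_params(path="/proc/cmdline"):
    """Parse the command line the running kernel was booted with."""
    with open(path) as f:
        return parse_cmdline(f.read())
```

If "pci" is missing from running_kernel_params(), the option never reached
the kernel (for example because the bootloader entry was not the one edited).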

from syslog

Dec 11 18:45:28 brambling kernel: [   42.691399] PCI: Failed to allocate mem resource #8:[EMAIL PROTECTED] for :01:00.0
Dec 11 18:45:28 brambling kernel: [   42.691498] PCI: Failed to allocate mem resource #0:[EMAIL PROTECTED] for :02:0c.0

Dec 11 18:45:28 brambling kernel: [   42.691561] PCI: Bridge: :01:00.0
Dec 11 18:45:28 brambling kernel: [   42.691617]   IO window: disabled.
Dec 11 18:45:28 brambling kernel: [   42.691679]   MEM window: disabled.
Dec 11 18:45:28 brambling kernel: [   42.691739]   PREFETCH window: 3000-47ff



from lspci -v

:01:00.0 PCI bridge: PLX Technology, Inc.: Unknown device 8114 (rev ba) (prog-if 00 [Normal decode])
   Flags: bus master, fast devsel, latency 0
   Memory at 2820 (32-bit, non-prefetchable) [size=8K]
   Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
   Prefetchable memory behind bridge: 3000-47f0
   Capabilities: [40] Power Management version 3
   Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
   Capabilities: [58] PCI-X bridge device.
   Capabilities: [68] #10 [0071]

:02:0c.0 Bridge: Intel Corporation: Unknown device 5378 (rev 02)
   Subsystem: Aspex Semiconductor Ltd: Unknown device 3416
   Flags: 66MHz, medium devsel
   Memory at  (64-bit, non-prefetchable) [disabled]
   Memory at 3000 (64-bit, prefetchable) [disabled] [size=256M]
   Memory at 4000 (64-bit, prefetchable) [disabled] [size=64M]
   Expansion ROM at 4400 [disabled] [size=32M]
   Capabilities: [c8] Slot ID: 0 slots, First-, chassis 00
   Capabilities: [cc] Power Management version 2
   Capabilities: [d4] #06 []
   Capabilities: [e0] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
   Capabilities: [f0] PCI-X non-bridge device.


I'd be very grateful for any help. 
Steve


Dec 11 18:45:28 brambling syslogd 1.4.1#17ubuntu7: restart.
Dec 11 18:45:28 brambling kernel: Inspecting /boot/System.map-2.6.19
Dec 11 18:45:28 brambling kernel: Loaded 23693 symbols from /boot/System.map-2.6.19.
Dec 11 18:45:28 brambling kernel: Symbols match kernel version 2.6.19.
Dec 11 18:45:28 brambling kernel: No module symbols loaded - kernel modules not enabled.
Dec 11 18:45:28 brambling kernel: [0.00] Linux version 2.6.19 ([EMAIL PROTECTED]) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP Sat Dec 9 01:41:11 GMT 2006
Dec 11 18:45:28 brambling kernel: [0.00] BIOS-provided physical RAM map:
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820:  - 0009fc00 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 0009fc00 - 0010 (reserved)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 0010 - 1e9b5000 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1e9b5000 - 1ea9e000 (ACPI NVS)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1ea9e000 - 1fe0e000 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1fe0e000 - 1fe5f000 (reserved)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1fe5f000 - 1fe82000 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1fe82000 - 1fedf000 (ACPI NVS)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1fedf000 - 1fef2000 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1fef2000 - 1feff000 (ACPI data)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: 1feff000 - 1ff0 (usable)
Dec 11 18:45:28 brambling kernel: [0.00]  BIOS-e820: fffc - fffd (reserved)
Dec 11 18:45:28 brambling kernel: [0.00] 0MB HIGHMEM available.
Dec 11 18:45:28 brambling kernel: [0.00] 511MB LOWMEM available.
Dec 11 18:45:28 brambling kernel: [0.00] found SMP MP-table at 000fd980
Dec 11 18:45:28 brambling kernel: [0.00] NX (Execute Disable) protection: active
Dec 11 18:45:28 brambling kernel: [0.00] Entering 

[PATCH] reorder struct pipe_buf_operations

2006-12-11 Thread Eric Dumazet
The fields of struct pipe_inode_info do not have a deliberate layout (i.e.
they are not arranged to fit cache lines or to reduce cache line ping-pong).

The bufs[] array is *large* and is placed near the beginning of the structure,
so all following fields have a large offset. This is unfortunate because many
archs have smaller instructions when using small offsets relative to a base
register. On x86, for example, offsets that fit in a signed 8-bit displacement
(up to 127) give shorter instruction encodings.

Moving bufs[] to the end of pipe_inode_info lets all the other fields have
small offsets, which reduces text size and icache pressure.
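
To make the offset argument concrete, here is a rough Python/ctypes sketch
that compares field offsets with the array first and with the array last.
It is only an illustration: the PipeBuffer fields, the PIPE_BUFFERS value of
16 and the trimmed-down field list are assumptions for the example, not the
real kernel definitions.

```python
import ctypes

PIPE_BUFFERS = 16  # assumed value, for illustration only


class PipeBuffer(ctypes.Structure):
    # Hypothetical stand-in for struct pipe_buffer.
    _fields_ = [
        ("page", ctypes.c_void_p),
        ("offset", ctypes.c_uint),
        ("len", ctypes.c_uint),
        ("ops", ctypes.c_void_p),
    ]


class ArrayFirst(ctypes.Structure):
    # Large array near the start: every later field gets a large offset.
    _fields_ = [
        ("nrbufs", ctypes.c_uint),
        ("curbuf", ctypes.c_uint),
        ("bufs", PipeBuffer * PIPE_BUFFERS),
        ("tmp_page", ctypes.c_void_p),
        ("readers", ctypes.c_uint),
        ("writers", ctypes.c_uint),
    ]


class ArrayLast(ctypes.Structure):
    # Same fields, array moved to the end: scalar fields stay near offset 0.
    _fields_ = [
        ("nrbufs", ctypes.c_uint),
        ("curbuf", ctypes.c_uint),
        ("tmp_page", ctypes.c_void_p),
        ("readers", ctypes.c_uint),
        ("writers", ctypes.c_uint),
        ("bufs", PipeBuffer * PIPE_BUFFERS),
    ]


def fits_disp8(offset):
    """True if the offset fits an x86 signed 8-bit displacement."""
    return -128 <= offset <= 127


def offsets(struct):
    """Map each field name of a ctypes Structure class to its byte offset."""
    return {name: getattr(struct, name).offset for name, _ in struct._fields_}
```

Comparing offsets(ArrayFirst) with offsets(ArrayLast) shows the scalar
fields moving from beyond the array to within the first few dozen bytes,
while the total structure size is unchanged.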


# size vmlinux.pre vmlinux
   text    data     bss     dec     hex filename
3268989  664356  492196 4425541  438745 vmlinux.pre
3268765  664356  492196 4425317  438665 vmlinux

So this patch reduces text size by 224 bytes on my x86_64 machine. Similar 
results on ia32.



Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
--- linux-2.6.19/include/linux/pipe_fs_i.h	2006-12-11 23:06:57.0 +0100
+++ linux-2.6.19-ed/include/linux/pipe_fs_i.h	2006-12-11 22:58:42.0 +0100
@@ -41,7 +41,6 @@ struct pipe_buf_operations {
 struct pipe_inode_info {
 	wait_queue_head_t wait;
 	unsigned int nrbufs, curbuf;
-	struct pipe_buffer bufs[PIPE_BUFFERS];
 	struct page *tmp_page;
 	unsigned int readers;
 	unsigned int writers;
@@ -51,6 +50,7 @@ struct pipe_inode_info {
 	struct fasync_struct *fasync_readers;
 	struct fasync_struct *fasync_writers;
 	struct inode *inode;
+	struct pipe_buffer bufs[PIPE_BUFFERS];
 };
 
 /* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual


Re: 2.6.19-git3 panics on boot - ata_piix/PCI related [still in -git17]

2006-12-11 Thread Steve Wise
I'm also hitting this running at commit:

commit 7bf65382caeecea4ae7206138e92e732b676d6e5
Author: Andrew Morton <[EMAIL PROTECTED]>
Date:   Fri Dec 8 02:41:14 2006 -0800

I was at 2.6.19, then merged up to Linus's tree Friday 12/8 and now I
hit this. I have 2 identical systems with one difference, one has a DVD
ROM device hooked to the ATA controller.  This system displays the same
problem.  Since the other system without the DVD worked fine with the
same code, I removed the DVD from the problem system and it boots ok.
However, I need the DVD, so I guess I'll start bisecting to see what
caused this. There are about 2000 commits from 2.6.19 to my head...
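
For a sense of scale, a bisection over roughly 2000 commits needs about 11
build-and-boot cycles. The Python sketch below models that with a plain
binary search. It assumes a linear history in which every commit after the
first bad one is also bad, which real history with merges only approximates;
the function names are made up for the example.

```python
import math


def first_bad_commit(n_commits, is_bad):
    """Binary-search commits 0..n_commits-1; the last one is known bad.

    Returns (index of first bad commit, number of commits tested).
    """
    lo, hi = 0, n_commits - 1
    builds = 0
    while lo < hi:
        mid = (lo + hi) // 2
        builds += 1  # one kernel build and boot test per step
        if is_bad(mid):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid + 1  # first bad commit is after mid
    return lo, builds


def max_builds(n_commits):
    """Upper bound on the number of tests for n_commits candidates."""
    return math.ceil(math.log2(n_commits))
```

With n_commits = 2000, max_builds gives 11, so even a slow build-and-reboot
cycle keeps the search to an afternoon.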

More to come...

Steve.



On Mon, 2006-12-11 at 14:26 +0100, Alessandro Suardi wrote:
> On 12/3/06, Alessandro Suardi <[EMAIL PROTECTED]> wrote:
> > On 12/3/06, Alan <[EMAIL PROTECTED]> wrote:
> > > > > ACPI: PCI Interrupt :00:1f.2[B] -> Link [LNKB] -> GSI 5 (level, 
> > > > > low) -> IRQ5
> > > > > PCI: Unable to reserve I/O region #1:[EMAIL PROTECTED] for device 
> > > > > :00:1f.2
> > > > > ata_piix: probe of :00:1f.2 failed with error -16
> > > > > [snip]
> > > > > mount: could not find filesystem '/dev/root'
> > > >
> > > > Same failure is also in 2.6.19-git4...
> > >
> > > Thats the PCI updates - you need the matching fix to libata-sff where it
> > > tries to reserve stuff it shouldn't.
> >
> > Thanks Alan. Indeed -git1 is where stuff breaks for me.
> > I'll watch out for when libata-sff gets fixed in the -git
> >  snapshots and will then report back.
> 
> Alan,
> 
>   I still have this problem in 2.6.19-git17. Is this expected behavior
>   or should it have been fixed by now ?
> 
> Thanks,
> 
> --alessandro
> 
> "...when I get it, I _get_ it"
> 
>  (Lara Eidemiller)


