Re: [PATCHv2 net-next] dropwatch: Support monitoring of dropped frames

2020-08-31 Thread Michal Schmidt

On 04. 08. 20 at 18:09, izabela.bakoll...@gmail.com wrote:

From: Izabela Bakollari 

Dropwatch is a utility that monitors dropped frames by having userspace
record them over the dropwatch protocol over a file. This augmentation
allows live monitoring of dropped frames using tools like tcpdump.

With this feature, dropwatch allows two additional commands (start and
stop interface) which allow the assignment of a net_device to the
dropwatch protocol. When assigned, dropwatch will clone dropped frames,
and receive them on the assigned interface, allowing tools like tcpdump
to monitor for them.

With this feature, create a dummy ethernet interface (ip link add dev
dummy0 type dummy), assign it to the dropwatch kernel subsystem, by using
these new commands, and then monitor dropped frames in real time by
running tcpdump -i dummy0.

Signed-off-by: Izabela Bakollari 
---
Changes in v2:
- protect the dummy ethernet interface from being changed by another
thread/cpu
---
  include/uapi/linux/net_dropmon.h |  3 ++
  net/core/drop_monitor.c  | 84 
  2 files changed, 87 insertions(+)

[...]

@@ -255,6 +259,21 @@ static void trace_drop_common(struct sk_buff *skb, void *location)
  
  out:

spin_unlock_irqrestore(&data->lock, flags);
+   spin_lock_irqsave(&interface_lock, flags);
+   if (interface && interface != skb->dev) {
+   skb = skb_clone(skb, GFP_ATOMIC);


I suggest naming the cloned skb "nskb". Less potential for confusion 
that way.



+   if (skb) {
+   skb->dev = interface;
+   spin_unlock_irqrestore(&interface_lock, flags);
+   netif_receive_skb(skb);
+   } else {
+   spin_unlock_irqrestore(&interface_lock, flags);
+   pr_err("dropwatch: Not enough memory to clone dropped skb\n");


Maybe avoid logging the error here. In NET_DM_ALERT_MODE_PACKET mode, 
drop monitor does not log about the skb_clone() failure either.
We don't want to open the possibility to flood the logs in case this 
somehow gets triggered by every packet.


A coding style suggestion - can you rearrange it so that the error path 
code is spelled out first? Then the regular path does not have to be 
indented further:


  nskb = skb_clone(skb, GFP_ATOMIC);
  if (!nskb) {
  spin_unlock_irqrestore(&interface_lock, flags);
  return;
  }

  /* ... implicit else ... Proceed normally ... */


+   return;
+   }
+   } else {
+   spin_unlock_irqrestore(&interface_lock, flags);
+   }
  }
  
  static void trace_kfree_skb_hit(void *ignore, struct sk_buff *skb, void *location)

@@ -1315,6 +1334,53 @@ static int net_dm_cmd_trace(struct sk_buff *skb,
return -EOPNOTSUPP;
  }
  
+static int net_dm_interface_start(struct net *net, const char *ifname)

+{
+   struct net_device *nd = dev_get_by_name(net, ifname);
+
+   if (nd)
+   interface = nd;
+   else
+   return -ENODEV;
+
+   return 0;


Similarly here, consider:

  if (!nd)
  return -ENODEV;

  interface = nd;
  return 0;

But maybe I'm nitpicking ...


+}
+
+static int net_dm_interface_stop(struct net *net, const char *ifname)
+{
+   dev_put(interface);
+   interface = NULL;
+
+   return 0;
+}
+
+static int net_dm_cmd_ifc_trace(struct sk_buff *skb, struct genl_info *info)
+{
+   struct net *net = sock_net(skb->sk);
+   char ifname[IFNAMSIZ];
+
+   if (net_dm_is_monitoring())
+   return -EBUSY;
+
+   memset(ifname, 0, IFNAMSIZ);
+   nla_strlcpy(ifname, info->attrs[NET_DM_ATTR_IFNAME], IFNAMSIZ - 1);
+
+   switch (info->genlhdr->cmd) {
+   case NET_DM_CMD_START_IFC:
+   if (!interface)
+   return net_dm_interface_start(net, ifname);
+   else
+   return -EBUSY;
+   case NET_DM_CMD_STOP_IFC:
+   if (interface)
+   return net_dm_interface_stop(net, interface->name);
+   else
+   return -ENODEV;


... and here too.

Best regards,
Michal



Re: [GIT PULL] kdbus for 4.1-rc1

2015-04-15 Thread Michal Schmidt
On 04/15/2015 09:31 AM, Mike Galbraith wrote:
> it seems [systemd] has now mandated group scheduling.

What makes you think so? Was it the fact that by default you have a
populated /sys/fs/cgroup/cpu/ hierarchy? This is either because some
unit requests the use of the cpu controller using one of the CPU*=
directives from systemd.resource-control(5), or (perhaps more likely)
because there is a privileged unit with Delegate=yes. The most likely
candidate is user@0.service, and so you could try preventing it from
starting:
  systemctl mask user@0.service

Note that systemd still works without group scheduling or any cgroup
subsystems enabled in the kernel:

  $ grep GROUP .config
  CONFIG_CGROUPS=y
  # CONFIG_CGROUP_DEBUG is not set
  # CONFIG_CGROUP_FREEZER is not set
  # CONFIG_CGROUP_DEVICE is not set
  # CONFIG_CGROUP_CPUACCT is not set
  # CONFIG_CGROUP_HUGETLB is not set
  # CONFIG_CGROUP_PERF is not set
  # CONFIG_CGROUP_SCHED is not set
  # CONFIG_BLK_CGROUP is not set
  # CONFIG_SCHED_AUTOGROUP is not set
  # CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
  # CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set
  # CONFIG_NET_CLS_CGROUP is not set
  # CONFIG_CGROUP_NET_PRIO is not set
  # CONFIG_CGROUP_NET_CLASSID is not set

Michal
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: The ext3 way of journalling

2008-01-09 Thread Michal Schmidt
On Wed, 9 Jan 2008 07:25:56 -0500
Theodore Tso <[EMAIL PROTECTED]> wrote:

> On Wed, Jan 09, 2008 at 10:54:11AM +0100, Martin Schwidefsky wrote:
> > On Jan 8, 2008 7:15 PM, Theodore Tso <[EMAIL PROTECTED]> wrote:
> > > That will fix this issue.  The problem you are facing is that
> > > you have your hardware clock set to ticking localtime, instead of
> > > GMT. Windows ticks localtime, which is a mistake carried over
> > > from the 1970's and MS-DOS.  Ticking localtime has all sorts of
> > > problems, among which is if you reboot around the transition
> > > between Summer Time (or Daylight Savings Time, depending on your
> > > country) and normal time, the OS has no idea whether the DST
> > > adjustment has been applied or not.
> > 
> > Actually you can force Windows to accept a hardware clock in UTC:
> > HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\TimeZoneInformation\RealTimeIsUniversal
> 
> Oh, so cool!!!  Do you know off hand what version of Windows started
> honoring that registry setting? 
> 
> And what do you set that registry value to?  Just a boolean "true"?
> 
> Now, how to convince Ubuntu to put this in their FAQ so I stop having
> their ahhh, less than clueful dual-booting Windows users who happen to
> live in Europe stop submitting bugs on this issue

According to http://www.cl.cam.ac.uk/~mgk25/mswish/ut-rtc.html it's
been there since Windows NT, but it is more or less broken in all newer
versions.

Michal


Re: [PATCH] kthread: always create the kernel threads with normal priority

2008-01-08 Thread Michal Schmidt
On Mon, 7 Jan 2008 09:29:56 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 7 Jan 2008 12:09:04 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > 
> > > > This causes a practical problem. When a runaway real-time task
> > > > is eating 100% CPU and we attempt to put the CPU offline,
> > > > sometimes we block while waiting for the creation of the
> > > > highest-priority "kstopmachine" thread.
> > 
> > sched-devel.git has new mechanisms against runaway RT tasks.
> > There's a new RLIMIT_RTTIME rlimit - if an RT task exceeds that
> > rlimit then it is sent SIGXCPU.
> 
> Is that "total RT CPU time" or "elapsed time since last schedule()"?
> 
> If the former, it is not useful for this problem.

It's "runtime since last sleep" so it is useful.

I still think the kthread patch is good to have anyway. The user can
have other reasons to change kthreadd's priority/cpumask.
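For reference, here is a minimal userspace sketch of how a task could cap its own real-time runtime with RLIMIT_RTTIME, assuming a kernel and libc that expose it (the rlimit was still in sched-devel at the time of this thread). The helper name limit_rt_runtime is made up for illustration. The limit is in microseconds of CPU time consumed under a real-time policy without a blocking system call; going past the soft limit delivers SIGXCPU.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: cap how long the calling process may run under a
 * real-time policy without blocking. 'usec' is the soft limit in
 * microseconds; the hard limit is left unchanged.
 * Returns 0 on success, -1 on error (errno set by getrlimit/setrlimit).
 */
int limit_rt_runtime(rlim_t usec)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_RTTIME, &rl) != 0)
		return -1;

	/* The soft limit may not exceed the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && usec > rl.rlim_max)
		usec = rl.rlim_max;

	rl.rlim_cur = usec;
	return setrlimit(RLIMIT_RTTIME, &rl);
}
```

A call such as limit_rt_runtime(500000) before switching to SCHED_FIFO would allow at most half a second of uninterrupted real-time CPU use; the value is only an example.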

Michal


Re: [PATCH] kthread: always create the kernel threads with normal priority

2008-01-07 Thread Michal Schmidt
On Mon, 7 Jan 2008 02:25:13 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 7 Jan 2008 11:06:03 +0100 Michal Schmidt
> <[EMAIL PROTECTED]> wrote:
> 
> > On Sat, 22 Dec 2007 01:30:21 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > 
> > > On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> > > <[EMAIL PROTECTED]> wrote:
> > > 
> > > > kthreadd, the creator of other kernel threads, runs as a normal
> > > > priority task. This is a potential for priority inversion when a
> > > > task wants to spawn a high-priority kernel thread. A middle
> > > > priority SCHED_FIFO task can block kthreadd's execution
> > > > indefinitely and thus prevent the timely creation of the
> > > > high-priority kernel thread. 
> > > > This causes a practical problem. When a runaway real-time task
> > > > is eating 100% CPU and we attempt to put the CPU offline,
> > > > sometimes we block while waiting for the creation of the
> > > > highest-priority "kstopmachine" thread. 
> > > > 
> > > > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > > > priority. Its children must still run as slightly negatively
> > > > reniced SCHED_NORMAL tasks.
> > > 
> > > Did you hit this problem with the stock kernel, or have you been
> > > working on other stuff?
> > 
> > This was with RHEL5 and with current Fedora kernels.
> > 
> > > A locked-up SCHED_FIFO process will cause kernel threads all
> > > sorts of problems.  You've hit one instance, but there will be
> > > others. (pdflush stops working, for one).
> > > 
> > > The general approach we've taken to this is "don't do that".
> > > Yes, we could boost lots of kernel threads in the way which this
> > > patch does but this actually takes control *away* from
> > > userspace.  Userspace no longer has the ability to guarantee
> > > itself minimum possible latency without getting preempted by
> > > kernel threads.
> > > 
> > > And yes, giving userspace this minimum-latency capability does
> > > imply that userspace has a responsibility to not 100% starve
> > > kernel threads.  It's a reasonable compromise, I think?
> > 
> > You're right. We should not run kthreadd with SCHED_FIFO by default.
> > But the user should be able to change it using chrt if he wants to
> > avoid this particular problem. So how about this instead?:
> > 
> > 
> > 
> > kthreadd, the creator of other kernel threads, runs as a normal
> > priority task. This is a potential for priority inversion when a
> > task wants to spawn a high-priority kernel thread. A middle
> > priority SCHED_FIFO task can block kthreadd's execution
> > indefinitely and thus prevent the timely creation of the
> > high-priority kernel thread.
> > 
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread.
> > 
> > This could be solved by always running kthreadd with the highest
> > possible SCHED_FIFO priority, but that would be an undesirable policy
> > decision in the kernel. kthreadd would cause unwanted latencies
> > even for the realtime users who know what they're doing.
> > 
> > Let's not make the decision for the user. Just allow the
> > administrator to change kthreadd's priority safely if he chooses to
> > do it. Ensure that the kernel threads are created with the usual
> > nice level even if kthreadd's priority is changed from the default.
> > 
> > Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
> > ---
> >  kernel/kthread.c |   11 +++
> >  1 files changed, 11 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index dcfe724..e832a85 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
> > if (pid < 0) {
> > create->result = ERR_PTR(pid);
> > } else {
> > +   struct sched_param param = { .sched_priority = 0 };
> > wait_for_completion(&create->started);
> > read_lock(&tasklist_lock);
> > create->result = find_task_by_pid(pid);
> > read_unl

Re: [PATCH] kthread: always create the kernel threads with normal priority

2008-01-07 Thread Michal Schmidt
On Mon, 7 Jan 2008 12:22:51 +0100
"Remy Bohmer" <[EMAIL PROTECTED]> wrote:

> Hello Michal and Andrew,
> 
> > Let's not make the decision for the user. Just allow the
> > administrator to change kthreadd's priority safely if he chooses to
> > do it. Ensure that the kernel threads are created with the usual
> > nice level even if kthreadd's priority is changed from the default.
> 
> Last year, I posted a patchset (that was meant for Preempt-RT at that
> time) to be able to prioritise the interrupt-handler-threads (which
> are kthreads) and softirq-threads from the kernel commandline. See
> http://lkml.org/lkml/2007/12/19/208
> 
> Maybe we can find a way to use a similar mechanism as I used in my
> patchset for the priorities of the remaining kthreads.
> I do not like the way of forcing userland to change the priorities,
> because that would require a userland with the chrt tool installed,
> and that is not that practical for embedded systems (in which there
> could be cases that there is no userland at all, or the init-process
> is the whole embedded application). In that case an option to do it on
> the kernel commandline is more practical.
> 
> I propose this kernel cmd-line option:
> kthread_pmap=somethread:50,otherthread:12,34

I see. kthreadd would look up the priority for itself and
kthread_create would consult the map for all other kernel threads.
That should work.
Your sirq_pmap would not be needed anymore, as kthread_pmap could be
used for softirq threads too, right?

Michal


[PATCH] kthread: always create the kernel threads with normal priority

2008-01-07 Thread Michal Schmidt
On Sat, 22 Dec 2007 01:30:21 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> <[EMAIL PROTECTED]> wrote:
> 
> > kthreadd, the creator of other kernel threads, runs as a normal
> > priority task. This is a potential for priority inversion when a
> > task wants to spawn a high-priority kernel thread. A middle priority
> > SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> > prevent the timely creation of the high-priority kernel thread.
> > 
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread. 
> > 
> > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > priority. Its children must still run as slightly negatively reniced
> > SCHED_NORMAL tasks.
> 
> Did you hit this problem with the stock kernel, or have you been
> working on other stuff?

This was with RHEL5 and with current Fedora kernels.

> A locked-up SCHED_FIFO process will cause kernel threads all sorts of
> problems.  You've hit one instance, but there will be others.
> (pdflush stops working, for one).
> 
> The general approach we've taken to this is "don't do that".  Yes, we
> could boost lots of kernel threads in the way which this patch does
> but this actually takes control *away* from userspace.  Userspace no
> longer has the ability to guarantee itself minimum possible latency
> without getting preempted by kernel threads.
> 
> And yes, giving userspace this minimum-latency capability does imply
> that userspace has a responsibility to not 100% starve kernel
> threads.  It's a reasonable compromise, I think?

You're right. We should not run kthreadd with SCHED_FIFO by default.
But the user should be able to change it using chrt if he wants to
avoid this particular problem. So how about this instead?:



kthreadd, the creator of other kernel threads, runs as a normal priority task.
This is a potential for priority inversion when a task wants to spawn a
high-priority kernel thread. A middle priority SCHED_FIFO task can block
kthreadd's execution indefinitely and thus prevent the timely creation of the
high-priority kernel thread.

This causes a practical problem. When a runaway real-time task is eating 100%
CPU and we attempt to put the CPU offline, sometimes we block while waiting for
the creation of the highest-priority "kstopmachine" thread.

This could be solved by always running kthreadd with the highest possible
SCHED_FIFO priority, but that would be an undesirable policy decision in the
kernel. kthreadd would cause unwanted latencies even for the realtime users who
know what they're doing.

Let's not make the decision for the user. Just allow the administrator to
change kthreadd's priority safely if he chooses to do it. Ensure that the
kernel threads are created with the usual nice level even if kthreadd's
priority is changed from the default.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
---
 kernel/kthread.c |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index dcfe724..e832a85 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
if (pid < 0) {
create->result = ERR_PTR(pid);
} else {
+   struct sched_param param = { .sched_priority = 0 };
wait_for_completion(&create->started);
read_lock(&tasklist_lock);
create->result = find_task_by_pid(pid);
read_unlock(&tasklist_lock);
+   /*
+* root may want to change our (kthreadd's) priority to
+* realtime to solve a corner case priority inversion problem
+* (a realtime task consuming 100% CPU blocking the creation of
+* kernel threads). The kernel thread should not inherit the
+* higher priority. Let's always create it with the usual nice
+* level.
+*/
+   sched_setscheduler(create->result, SCHED_NORMAL, &param);
+   set_user_nice(create->result, -5);
}
complete(&create->done);
 }
-- 
1.5.3.3
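
With this patch applied, the administrator's side of the arrangement is a single chrt invocation on kthreadd. A minimal C sketch of the equivalent call follows. It assumes kthreadd is PID 2 (typical, but worth checking with ps) and that the caller has CAP_SYS_NICE; the helper name make_task_fifo is made up for illustration.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: switch task 'pid' to SCHED_FIFO at 'prio', roughly
 * what "chrt -f -p <prio> <pid>" does. Returns 0 on success, -1 on error
 * with errno set (EINVAL for a bad priority, EPERM without CAP_SYS_NICE,
 * ESRCH for a nonexistent pid).
 */
int make_task_fifo(pid_t pid, int prio)
{
	struct sched_param param = { .sched_priority = prio };

	/* Reject priorities outside the range valid for SCHED_FIFO. */
	if (prio < sched_get_priority_min(SCHED_FIFO) ||
	    prio > sched_get_priority_max(SCHED_FIFO)) {
		errno = EINVAL;
		return -1;
	}

	return sched_setscheduler(pid, SCHED_FIFO, &param);
}
```

Under those assumptions, make_task_fifo(2, 99) corresponds to "chrt -f -p 99 2". Threads created afterwards would still start as SCHED_NORMAL at nice -5 because of the reset in create_kthread().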



[PATCH] kthread: run kthreadd with max priority SCHED_FIFO

2007-12-17 Thread Michal Schmidt
kthreadd, the creator of other kernel threads, runs as a normal
priority task. This is a potential for priority inversion when a task
wants to spawn a high-priority kernel thread. A middle priority
SCHED_FIFO task can block kthreadd's execution indefinitely and thus
prevent the timely creation of the high-priority kernel thread.

This causes a practical problem. When a runaway real-time task is
eating 100% CPU and we attempt to put the CPU offline, sometimes we
block while waiting for the creation of the highest-priority
"kstopmachine" thread. 

The fix is to run kthreadd with the highest possible SCHED_FIFO
priority. Its children must still run as slightly negatively reniced
SCHED_NORMAL tasks.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/kernel/kthread.c b/kernel/kthread.c
index dcfe724..a7ce932 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -94,10 +94,17 @@ static void create_kthread(struct kthread_create_info *create)
if (pid < 0) {
create->result = ERR_PTR(pid);
} else {
+   struct sched_param param = { .sched_priority = 0 };
wait_for_completion(&create->started);
read_lock(&tasklist_lock);
create->result = find_task_by_pid(pid);
read_unlock(&tasklist_lock);
+   /*
+* We (kthreadd) run with SCHED_FIFO, but we don't want
+* the kthreads we create to have it too by default.
+*/
+   sched_setscheduler(create->result, SCHED_NORMAL, &param);
+   set_user_nice(create->result, -5);
}
complete(&create->done);
 }
@@ -217,11 +224,12 @@ EXPORT_SYMBOL(kthread_stop);
 int kthreadd(void *unused)
 {
struct task_struct *tsk = current;
+   struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
 
/* Setup a clean context for our children to inherit. */
set_task_comm(tsk, "kthreadd");
ignore_signals(tsk);
-   set_user_nice(tsk, -5);
+   sched_setscheduler(tsk, SCHED_FIFO, &param);
set_cpus_allowed(tsk, CPU_MASK_ALL);
 
current->flags |= PF_NOFREEZE;


Re: kernel panic - help!?

2007-12-12 Thread Michal Schmidt
On Wed, 12 Dec 2007 07:24:36 -0700
Justin Banks <[EMAIL PROTECTED]> wrote:

> > > (2.6.9-55.0.9.ELsmp)
> -^^
> 
> It's really really old :)

No, it's actually a kernel less than 3 months old, from RHEL-4 or CentOS.

Michal


Re: [PATCH] isapnp driver semaphore to mutex

2007-12-03 Thread Michal Schmidt
On Mon, 03 Dec 2007 10:35:01 -0800
Daniel Walker <[EMAIL PROTECTED]> wrote:

> Speaking of automating.. I created a little .vimrc add-on which helps
> doing sem2mutex type changes. Here's the chunk I added,
> 
> function Semtomutex( lo )
> exe '%s/down(&'.a:lo.')/mutex_lock\(\&'.a:lo.'\)/g'
> exe '%s/down_trylock(&'.a:lo.')/mutex_trylock\(\&'.a:lo.'\)/g'

From the comment above mutex_trylock():
 * NOTE: this function follows the spin_trylock() convention, so
 * it is negated to the down_trylock() return values! Be careful
 * about this when converting semaphore users to mutexes.
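
In practice that means a blind textual substitution inverts every test. A small userspace sketch illustrates the conversion; mock_down_trylock and mock_mutex_trylock are hypothetical stand-ins that only model the two return conventions, not the kernel implementations.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock: just a flag. Not thread-safe; it only models return values. */
struct toy_lock { bool held; };

/* Semaphore convention: 0 when acquired, nonzero on contention. */
int mock_down_trylock(struct toy_lock *l)
{
	if (l->held)
		return 1;
	l->held = true;
	return 0;
}

/* Mutex/spinlock convention: 1 when acquired, 0 on contention. */
int mock_mutex_trylock(struct toy_lock *l)
{
	if (l->held)
		return 0;
	l->held = true;
	return 1;
}

/* Before the conversion: failure is the nonzero case. */
int grab_with_semaphore(struct toy_lock *l)
{
	if (mock_down_trylock(l))
		return -EBUSY;
	return 0;
}

/* After the conversion: the condition has to be negated. */
int grab_with_mutex(struct toy_lock *l)
{
	if (!mock_mutex_trylock(l))
		return -EBUSY;
	return 0;
}
```

A script that only renames down_trylock to mutex_trylock would keep the first form of the test and so report -EBUSY exactly when the lock was taken.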

Michal


Re: Use of mutex in interrupt context flawed/impossible, need advice.

2007-11-22 Thread Michal Schmidt
On Thu, 22 Nov 2007 17:19:44 +0100
"Leon Woestenberg" <[EMAIL PROTECTED]> wrote:

> I forgot to mention that I would like to be prepared for, and use the
> -rt patch soon. I understand (maybe wrongly?) that semaphores are not
> real-time pre-emptible, mutexes and spinlocks are.

Semaphores are preemptible, but they don't do priority inheritance.

Michal


[PATCH] proc: loadavg reading race

2007-11-12 Thread Michal Schmidt
The avenrun[] values are supposed to be protected by xtime_lock.
loadavg_read_proc does not use it. Theoretically this may result in an
occasional glitch when the value read from /proc/loadavg would be as much as
1<<11 times higher than it should be.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
---
 fs/proc/proc_misc.c |   11 ---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/proc/proc_misc.c b/fs/proc/proc_misc.c
index e0d064e..10cc9ad 100644
--- a/fs/proc/proc_misc.c
+++ b/fs/proc/proc_misc.c
@@ -83,10 +83,15 @@ static int loadavg_read_proc(char *page, char **start, off_t off,
 {
int a, b, c;
int len;
+   unsigned long seq;
+
+   do {
+   seq = read_seqbegin(&xtime_lock);
+   a = avenrun[0] + (FIXED_1/200);
+   b = avenrun[1] + (FIXED_1/200);
+   c = avenrun[2] + (FIXED_1/200);
+   } while (read_seqretry(&xtime_lock, seq));
 
-   a = avenrun[0] + (FIXED_1/200);
-   b = avenrun[1] + (FIXED_1/200);
-   c = avenrun[2] + (FIXED_1/200);
len = sprintf(page,"%d.%02d %d.%02d %d.%02d %ld/%d %d\n",
LOAD_INT(a), LOAD_FRAC(a),
LOAD_INT(b), LOAD_FRAC(b),
-- 
1.5.3.3
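
For context on the arithmetic the patch moves inside the retry loop: avenrun[] holds fixed-point values with 11 fractional bits, and the FIXED_1/200 term rounds to the nearest hundredth before printing. The standalone sketch below reproduces the macro definitions from memory of include/linux/sched.h of that era, so they should be treated as an assumption and checked against the tree; format_load is a hypothetical helper.

```c
#include <stdio.h>

#define FSHIFT		11			/* bits of fractional precision */
#define FIXED_1		(1 << FSHIFT)		/* 1.0 in fixed point */
#define LOAD_INT(x)	((x) >> FSHIFT)
#define LOAD_FRAC(x)	LOAD_INT(((x) & (FIXED_1 - 1)) * 100)

/*
 * Hypothetical helper: format one raw avenrun[] sample the way
 * /proc/loadavg does, rounded to two decimal places.
 * Returns what snprintf() returns.
 */
int format_load(unsigned long raw, char *buf, size_t len)
{
	int v = (int)raw + (FIXED_1 / 200);	/* add about 0.005 to round */

	return snprintf(buf, len, "%d.%02d", LOAD_INT(v), LOAD_FRAC(v));
}
```

A raw sample of 3072 (that is, 1.5 * FIXED_1) formats as "1.50".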



Re: Truecrypt in kernel ?

2007-11-05 Thread Michal Schmidt
On Mon, 5 Nov 2007 20:42:39 -0500
"Zurk Tech" <[EMAIL PROTECTED]> wrote:
> just wondering why the truecrypt module isnt in the mainline kernel ?
> its the only cross platform encrypted disk solution out there and it
> should be less of a chore to use it in linux...is there something
> wrong with the truecrypt kernel driver ?

Two reasons:
The author hasn't sent patches.
It looks to me like the license is incompatible with the GPLv2.

Michal


Re: get amount of "entropy" in /dev/random ?

2007-10-02 Thread Michal Schmidt
Yakov Lerner wrote:
> From the userlevel, can I get an estimate of "amount of entropy"
> in /dev/random, that is, the estimate of number of bytes
> readable until it blocks ? Of course multiple processes
> can read bytes and this would not be exact ... but still .. as an upper
> boundary estimate ?
>
> Thanks
> Yakov

Try ioctl(fd, RNDGETENTCNT, &entropy_count)
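
A minimal sketch of that call, for illustration; the helper name read_entropy_bits is made up. Note that the kernel reports the estimate in bits, so divide by 8 for an upper bound in bytes.

```c
#include <fcntl.h>
#include <linux/random.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the random driver behind 'path' for its current
 * entropy estimate, in bits. Returns 0 and stores the estimate in *bits on
 * success, -1 on any error.
 */
int read_entropy_bits(const char *path, int *bits)
{
	int fd = open(path, O_RDONLY | O_NONBLOCK);
	int ret;

	if (fd < 0)
		return -1;

	ret = ioctl(fd, RNDGETENTCNT, bits);
	close(fd);

	return ret < 0 ? -1 : 0;
}
```

Typical use would be read_entropy_bits("/dev/random", &n). As the question already notes, the value is only a snapshot, since other readers can drain the pool right after the call.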

Michal



[PATCH] pci: use pci=bfsort for HP DL385 G2, DL585 G2

2007-09-27 Thread Michal Schmidt
Hello,

HP ProLiant systems DL385 G2 and DL585 G2 need pci=bfsort to enumerate PCI
devices in the expected order.

(John, can you please confirm and ACK this?)

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/arch/i386/pci/common.c b/arch/i386/pci/common.c
index ebc6f3c..8737c53 100644
--- a/arch/i386/pci/common.c
+++ b/arch/i386/pci/common.c
@@ -287,6 +287,22 @@ static struct dmi_system_id __devinitdata pciprobe_dmi_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant BL685c G1"),
},
},
+   {
+   .callback = set_bf_sort,
+   .ident = "HP ProLiant DL385 G2",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "HP"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL385 G2"),
+   },
+   },
+   {
+   .callback = set_bf_sort,
+   .ident = "HP ProLiant DL585 G2",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "HP"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL585 G2"),
+   },
+   },
{}
 };
 




Re: [PATCH] ppp_mppe: Don't put InterimKey on the stack

2007-09-21 Thread Michal Schmidt
Matt Domsch wrote:
> On Fri, Sep 21, 2007 at 04:08:09PM +0200, Michal Schmidt wrote:
>   
>> Hello,
>>
>> The interrupt stack can be in the __START_KERNEL_map region in which
> >> virt_to_page will not work. This caused ppp_mppe to crash on CentOS 5 on x86_64
>> (http://bugs.centos.org/view.php?id=2076).
>>
>> The fix is to avoid copying the interim key. We can simply use it in its
>> original place, which is kmalloc'd.
>> 
>
> Needs a Signed-off-by: line, but otherwise, looks good, and even saves
> some stack space.  Thanks for tracking this down.
>
> -Matt
>   

Sorry about the forgotten sign-off. Here it is.
Andrew, please apply.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

---
 drivers/net/ppp_mppe.c |   14 ++
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ppp_mppe.c b/drivers/net/ppp_mppe.c
index f79cf87..c0b6d19 100644
--- a/drivers/net/ppp_mppe.c
+++ b/drivers/net/ppp_mppe.c
@@ -136,7 +136,7 @@ struct ppp_mppe_state {
  * Key Derivation, from RFC 3078, RFC 3079.
  * Equivalent to Get_Key() for MS-CHAP as described in RFC 3079.
  */
-static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *InterimKey)
+static void get_new_key_from_sha(struct ppp_mppe_state * state)
 {
struct hash_desc desc;
struct scatterlist sg[4];
@@ -153,8 +153,6 @@ static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *I
desc.flags = 0;
 
crypto_hash_digest(&desc, sg, nbytes, state->sha1_digest);
-
-   memcpy(InterimKey, state->sha1_digest, state->keylen);
 }
 
 /*
@@ -163,21 +161,21 @@ static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *I
  */
 static void mppe_rekey(struct ppp_mppe_state * state, int initial_key)
 {
-   unsigned char InterimKey[MPPE_MAX_KEY_LEN];
struct scatterlist sg_in[1], sg_out[1];
struct blkcipher_desc desc = { .tfm = state->arc4 };
 
-   get_new_key_from_sha(state, InterimKey);
+   get_new_key_from_sha(state);
if (!initial_key) {
-   crypto_blkcipher_setkey(state->arc4, InterimKey, state->keylen);
-   setup_sg(sg_in, InterimKey, state->keylen);
+   crypto_blkcipher_setkey(state->arc4, state->sha1_digest,
+   state->keylen);
+   setup_sg(sg_in, state->sha1_digest, state->keylen);
setup_sg(sg_out, state->session_key, state->keylen);
if (crypto_blkcipher_encrypt(&desc, sg_out, sg_in,
 state->keylen) != 0) {
printk(KERN_WARNING "mppe_rekey: cipher_encrypt failed\n");
}
} else {
-   memcpy(state->session_key, InterimKey, state->keylen);
+   memcpy(state->session_key, state->sha1_digest, state->keylen);
}
if (state->keylen == 8) {
/* See RFC 3078 */



[PATCH] ppp_mppe: Don't put InterimKey on the stack

2007-09-21 Thread Michal Schmidt
Hello,

The interrupt stack can be in the __START_KERNEL_map region in which
virt_to_page will not work. This caused ppp_mppe to crash on CentOS 5 on x86_64
(http://bugs.centos.org/view.php?id=2076).

The fix is to avoid copying the interim key. We can simply use it in its
original place, which is kmalloc'd.

Michal
---
 drivers/net/ppp_mppe.c |   14 ++
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ppp_mppe.c b/drivers/net/ppp_mppe.c
index f79cf87..c0b6d19 100644
--- a/drivers/net/ppp_mppe.c
+++ b/drivers/net/ppp_mppe.c
@@ -136,7 +136,7 @@ struct ppp_mppe_state {
  * Key Derivation, from RFC 3078, RFC 3079.
  * Equivalent to Get_Key() for MS-CHAP as described in RFC 3079.
  */
-static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *InterimKey)
+static void get_new_key_from_sha(struct ppp_mppe_state * state)
 {
struct hash_desc desc;
struct scatterlist sg[4];
@@ -153,8 +153,6 @@ static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *I
desc.flags = 0;
 
crypto_hash_digest(&desc, sg, nbytes, state->sha1_digest);
-
-   memcpy(InterimKey, state->sha1_digest, state->keylen);
 }
 
 /*
@@ -163,21 +161,21 @@ static void get_new_key_from_sha(struct ppp_mppe_state * state, unsigned char *I
  */
 static void mppe_rekey(struct ppp_mppe_state * state, int initial_key)
 {
-   unsigned char InterimKey[MPPE_MAX_KEY_LEN];
struct scatterlist sg_in[1], sg_out[1];
struct blkcipher_desc desc = { .tfm = state->arc4 };
 
-   get_new_key_from_sha(state, InterimKey);
+   get_new_key_from_sha(state);
if (!initial_key) {
-   crypto_blkcipher_setkey(state->arc4, InterimKey, state->keylen);
-   setup_sg(sg_in, InterimKey, state->keylen);
+   crypto_blkcipher_setkey(state->arc4, state->sha1_digest,
+   state->keylen);
+   setup_sg(sg_in, state->sha1_digest, state->keylen);
setup_sg(sg_out, state->session_key, state->keylen);
if (crypto_blkcipher_encrypt(&desc, sg_out, sg_in,
 state->keylen) != 0) {
printk(KERN_WARNING "mppe_rekey: cipher_encrypt failed\n");
}
} else {
-   memcpy(state->session_key, InterimKey, state->keylen);
+   memcpy(state->session_key, state->sha1_digest, state->keylen);
}
if (state->keylen == 8) {
/* See RFC 3078 */



Re: [PATCH 2.6.21] Return available first timeslice to the creator, not parent

2007-08-30 Thread Michal Schmidt

Vitaly Mayatskikh wrote:
> Short-living process returns its timeslice to the parent, this
> affects process that creates a lot of such short-living threads,
> because its not a parent for new threads.

I don't see the point of sending patches for old Linux versions such as
2.6.21, unless it's something applicable to the -stable tree.

Do recent kernels with CFS have the same problem?

> Patch fixes this issue and
> doesn't break kabi as does the patch from reporter:
> http://lkml.org/lkml/2007/4/7/21

There's no kabi.

Michal



Re: nanosleep() accuracy

2007-08-17 Thread Michal Schmidt
GolovaSteek wrote:
> 2007/8/17, Michal Schmidt <[EMAIL PROTECTED]>:
>   
>> GolovaSteek wrote:
>> 
>>> Hello!
>>> I need use sleep with accurat timing.
>>> I use 2.6.21 with rt-prempt patch.
>>> with enabled rt_preempt, dyn_ticks, and local_apic
>>> But
>>>
>>> req.tv_nsec = 300000;
>>> req.tv_sec = 0;
>>> nanosleep(&req,NULL)
>>>
>>> make pause around 310-330 microseconds.
>>>   
>> How do you measure this?
>> If you want to have something done every 300 microseconds, you must not
>> sleep for 300 microseconds in each iteration, because you'd accumulate
>> errors. Use a periodic timer or use the current time to compute how long
>> to sleep in each iteration. Take a look how cyclictest does it.
>> 
>
> no. I just want my programm go to sleep sometimes and wake up in correct time.
>   

What does your program do that it has such a strict requirement on the
exact length of sleeping?

>>> I tried to understend how work nanosleep(), but it not depends from
>>> jiffies and from smp_apic_timer_interrupt.
>>>
>>> When can accuracy be lost?
>>> And how are process waked up?
>>>
>>>
>>> GolovaSteek
>>>   
>> Don't forget the process will always have non-zero wakeup latency. It
>> takes some time to process an interrupt, wakeup the process and schedule
>> it to run on the CPU. 10-30 microseconds is not unreasonable.
>> 
>
> But 2 operations can be done in 10 microseconds?
> and why is there that inconstancy? Why sametimes 10 and sometimes 30?
> In which points of implementation it happens?
>
> GolovaSteek
>   

If a jitter of 20 microseconds is unacceptable for your application,
don't use PC hardware. Consider using a microcontroller.

Michal



Re: nanosleep() accuracy

2007-08-17 Thread Michal Schmidt

GolovaSteek wrote:
> Hello!
> I need use sleep with accurat timing.
> I use 2.6.21 with rt-prempt patch.
> with enabled rt_preempt, dyn_ticks, and local_apic
> But
>
> req.tv_nsec = 300000;
> req.tv_sec = 0;
> nanosleep(&req,NULL)
>
> make pause around 310-330 microseconds.


How do you measure this?
If you want to have something done every 300 microseconds, you must not 
sleep for 300 microseconds in each iteration, because you'd accumulate 
errors. Use a periodic timer or use the current time to compute how long 
to sleep in each iteration. Take a look how cyclictest does it.
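
To make the "compute how long to sleep from the current time" idea concrete, here is a minimal userspace sketch. It is not taken from cyclictest; the helper names (timespec_add_ns, run_periodic) are made up for illustration. It assumes a POSIX system that provides clock_nanosleep() with CLOCK_MONOTONIC and TIMER_ABSTIME. Each deadline is derived from the previous deadline rather than from the moment the process woke up, so a late wakeup in one iteration does not push all the following ones back.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance *t by ns nanoseconds, keeping tv_nsec in [0, NSEC_PER_SEC). */
static void timespec_add_ns(struct timespec *t, long ns)
{
	assert(ns >= 0);
	t->tv_sec += ns / NSEC_PER_SEC;
	t->tv_nsec += ns % NSEC_PER_SEC;
	if (t->tv_nsec >= NSEC_PER_SEC) {
		t->tv_nsec -= NSEC_PER_SEC;
		t->tv_sec++;
	}
}

/*
 * Call work() once per period (work may be NULL). The next deadline is
 * always "previous deadline + period", an absolute time, so wakeup
 * latency does not accumulate across iterations.
 * Returns 0 on success or an errno value.
 */
static int run_periodic(long period_ns, int iterations, void (*work)(void))
{
	struct timespec next;
	int i, err;

	if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
		return errno;

	for (i = 0; i < iterations; i++) {
		timespec_add_ns(&next, period_ns);
		do {
			/* clock_nanosleep() returns the error number directly */
			err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
					      &next, NULL);
		} while (err == EINTR);
		if (err != 0)
			return err;
		if (work)
			work();
	}
	return 0;
}
```

Something like run_periodic(300000L, 1000, do_step) would then run a hypothetical do_step() every 300 microseconds. Each individual wakeup still has jitter, but the errors do not add up.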



> I tried to understend how work nanosleep(), but it not depends from
> jiffies and from smp_apic_timer_interrupt.
>
> When can accuracy be lost?
> And how are process waked up?
>
>
> GolovaSteek


Don't forget the process will always have non-zero wakeup latency. It 
takes some time to process an interrupt, wakeup the process and schedule 
it to run on the CPU. 10-30 microseconds is not unreasonable.


Michal


Re: [PATCH] destroy_workqueue() can livelock

2007-07-13 Thread Michal Schmidt
Oleg Nesterov wrote:
> Pointed out by Michal Schmidt <[EMAIL PROTECTED]>.
> 
> The bug was introduced in 2.6.22 by me.
> 
> cleanup_workqueue_thread() does flush_cpu_workqueue(cwq) in a loop until
> ->worklist becomes empty. This is live-lockable, a re-niced caller can
> get CPU after wake_up() and insert a new barrier before the lower-priority
> cwq->thread has a chance to clear ->current_work.
> 
> Change cleanup_workqueue_thread() to do flush_cpu_workqueue(cwq) only once.
> We can rely on the fact that run_workqueue() won't return until it flushes
> all works. So it is safe to call kthread_stop() after that, the "should stop"
> request won't be noticed until run_workqueue() returns.
> 
> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

I confirm the patch fixes the bug I was seeing.

Michal



destroy_workqueue can livelock

2007-07-11 Thread Michal Schmidt
Hi,

While using SystemTap I noticed an interesting situation. When my stap
probe was exiting, there was a several seconds long delay, during which
the CPU was 100% loaded. I narrowed the problem down to destroy_workqueue.

The attached module is a minimized testcase. To reproduce it, load the
module and then try to rmmod it from a higher priority process:
 nice -n -10 rmmod wqtest.ko  # that's how SystemTap's staprun behaves
or:
 chrt -f  90 rmmod wqtest.ko  # this may be more reliably reproducible

I tested it (with "nice") on Linux 2.6.22. The rmmod process took about
55% CPU, the workqueue thread consumed the rest. This situation can last
for minutes. As soon as the rmmod process is reniced to 0, the workqueue
is destroyed successfully and the module is unloaded.

Here's what happens in detail:

When rmmod executes cancel_rearming_delayed_workqueue() ->
wait_on_work() -> wait_on_cpu_work(), the work is the current_work on
the workqueue (it's in ssleep(1)). So wait_on_cpu_work() inserts a
wq_barrier on the workqueue and waits for the completion. As soon as
wq_barrier_func signals the completion, it is most likely preempted by
the rmmod process. At this moment, the worklist is already empty, but
cwq->current_work still points to the barrier. run_workqueue() didn't
get to reset it to NULL yet.

Now rmmod calls destroy_workqueue() -> cleanup_workqueue_thread() ->
flush_cpu_workqueue(). Because cwq->current_work!=NULL it decides to
insert another wq_barrier and wait for it to complete. But
cwq->current_work will never be reset to NULL, so
cleanup_workqueue_thread() keeps trying flush_cpu_workqueue()
indefinitely, inserting wq_barriers and waiting for them.

If rmmod's priority is lowered, run_workqueue() will not be preempted by
it and manages to reset cwq->current_work. This ends the livelock.
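
For readers who prefer code to prose, the sequence above can be modelled in plain userspace C. This is only a toy model under my own simplifying assumptions, not kernel code: the toy_* names are invented, there are no real threads, and "the flusher preempts the worker" is reduced to a boolean that decides whether the worker gets to clear current_work after completing a barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-CPU workqueue state. */
struct toy_cwq {
	int queued;               /* number of items on the worklist */
	const void *current_work; /* item the worker is "running", or NULL */
};

/*
 * Worker side: run one queued barrier. If the flusher outranks the worker,
 * the worker is preempted right after signalling completion, that is,
 * before it gets to clear current_work.
 */
static void toy_run_barrier(struct toy_cwq *cwq, const void *barrier,
			    bool flusher_preempts)
{
	assert(cwq->queued > 0);
	cwq->queued--;
	cwq->current_work = barrier;
	/* the completion would be signalled here */
	if (!flusher_preempts)
		cwq->current_work = NULL;
}

/* Flusher side: returns true if a barrier had to be inserted. */
static bool toy_flush(struct toy_cwq *cwq, bool flusher_preempts)
{
	static const char barrier = 0;

	if (cwq->queued == 0 && cwq->current_work == NULL)
		return false;
	cwq->queued++;
	toy_run_barrier(cwq, &barrier, flusher_preempts);
	return true;
}

/*
 * "Flush until idle" cleanup. Returns the number of flushes needed,
 * or -1 if the queue never looked idle within max_rounds.
 */
static int toy_cleanup_loop(struct toy_cwq *cwq, bool flusher_preempts,
			    int max_rounds)
{
	int rounds = 0;

	while (toy_flush(cwq, flusher_preempts)) {
		rounds++;
		if (rounds >= max_rounds)
			return -1;
	}
	return rounds;
}

/* Alternative: flush exactly once and do not look at current_work again. */
static void toy_cleanup_once(struct toy_cwq *cwq, bool flusher_preempts)
{
	toy_flush(cwq, flusher_preempts);
}
```

With a stale current_work and a flusher that always preempts, toy_cleanup_loop() never sees an idle queue and hits max_rounds; with an equal-priority flusher it finishes after a single flush. toy_cleanup_once() terminates in both cases.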

Can this be fixed? Or is it just a case of "Don't do that then!"?
("that" meaning destroying workqueues from negatively reniced processes)

Michal
#include 
#include 
#include 
#include 

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Michal Schmidt");

static void wq_func(struct work_struct *w);
static DECLARE_DELAYED_WORK(wq_work, wq_func);
static struct workqueue_struct *wq;

static DECLARE_WAIT_QUEUE_HEAD(ctl_wq);

static void wq_func(struct work_struct *w)
{
	/*
	 * So that this work is most likely cwq->current_work
	 * when destroy_workqueue comes...
	 */
	ssleep(1);

	queue_delayed_work(wq, &wq_work, HZ/100);
}

static int wqtest_start(void)
{
	wq = create_workqueue("wqtest");
	if (!wq)
		return -1;

	queue_delayed_work(wq, &wq_work, HZ/100);

	return 0;
}

static void wqtest_stop(void)
{
	printk(KERN_CRIT "wqtest: cancelling the work\n");
	cancel_rearming_delayed_work(&wq_work);
	printk(KERN_CRIT "wqtest: destroying the wq\n");
	destroy_workqueue(wq);
	printk(KERN_CRIT "wqtest: done\n");
}

module_init(wqtest_start);
module_exit(wqtest_stop);



Re: Need help making sense of IRQ API

2007-06-29 Thread Michal Schmidt
LOL ER wrote:
> Hello,
>   I've been trying to make sense of how the kernel (on an i386) calls
> __do_IRQ() from do_IRQ() for the past few days to no avail. [...]

Since i386 was switched to the generic-IRQ architecture (see "Linux
generic IRQ handling" in Documentation/DocBook) it does not use __do_IRQ().

common_interrupt (in assembler) calls do_IRQ(), which calls
desc->handle_irq() that is usually one of:
 handle_fasteoi_irq()
 handle_level_irq()
 handle_edge_irq()
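
The dispatch itself is just a call through a function pointer stored in the per-IRQ descriptor. A stripped-down userspace sketch of that pattern follows. All toy_* names are invented for illustration; the real struct irq_desc and flow handlers do much more (locking, masking, calling the registered actions).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_IRQS 16

enum toy_flow { TOY_FLOW_NONE, TOY_FLOW_EDGE, TOY_FLOW_LEVEL };

struct toy_irq_desc;
typedef void (*toy_flow_handler_t)(unsigned int irq, struct toy_irq_desc *desc);

struct toy_irq_desc {
	toy_flow_handler_t handle_irq; /* flow handler chosen at setup time */
	unsigned int count;            /* how often this IRQ was handled */
	enum toy_flow last_flow;       /* which flow handler ran last */
};

static struct toy_irq_desc toy_descs[TOY_NR_IRQS];

static void toy_handle_edge_irq(unsigned int irq, struct toy_irq_desc *desc)
{
	(void)irq;
	desc->last_flow = TOY_FLOW_EDGE;
	desc->count++;
}

static void toy_handle_level_irq(unsigned int irq, struct toy_irq_desc *desc)
{
	(void)irq;
	desc->last_flow = TOY_FLOW_LEVEL;
	desc->count++;
}

/* Setup: pick the flow handler for one interrupt line. */
static void toy_set_handler(unsigned int irq, toy_flow_handler_t handler)
{
	assert(irq < TOY_NR_IRQS);
	toy_descs[irq].handle_irq = handler;
}

/* Counterpart of do_IRQ(): look up the descriptor, call its flow handler. */
static int toy_do_IRQ(unsigned int irq)
{
	struct toy_irq_desc *desc;

	if (irq >= TOY_NR_IRQS)
		return -1;
	desc = &toy_descs[irq];
	if (desc->handle_irq == NULL)
		return -1;
	desc->handle_irq(irq, desc);
	return 0;
}
```

The point is that the entry function no longer contains the edge/level logic itself; it only looks up the descriptor and lets the handler installed for that line decide.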

Michal


Re: [PATCH -rt] irq nobody cared workaround for i386

2007-06-21 Thread Michal Schmidt
Steven Rostedt wrote:
> Michal Schmidt wrote:
>   
>> I came to the conclusion that the IO-APICs which need the fix for the
>> nobody cared bug don't have the issue ack_ioapic_quirk_irq is designed
>> to work-around. It should be safe simply to use the normal
>> ack_ioapic_irq as the .eoi method in pcix_ioapic_chip.
>> So this is the port of Steven's fix for the nobody cared bug to i386. It
>> works fine on IBM LS21 I have access to.
>>
>> 
> You want to make that "apic > 0".  Note the spacing. If it breaks
> 80 characters, then simply put it to a new line.
>   
> [...]
> ACK
>
> -- Steve
>   

OK, I fixed the spacing in both occurrences.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

--- arch/i386/kernel/io_apic.c.orig 2007-06-19 08:40:05.0 -0400
+++ arch/i386/kernel/io_apic.c  2007-06-21 06:51:16.0 -0400
@@ -261,6 +261,18 @@ static void __unmask_IO_APIC_irq (unsign
__modify_IO_APIC_irq(irq, 0, 0x00010000);
 }
 
+/* trigger = 0 (edge mode) */
+static void __pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0, 0x8000);
+}
+
+/* mask = 0, trigger = 1 (level mode) */
+static void __pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0x8000, 0x00010000);
+}
+
 static void mask_IO_APIC_irq (unsigned int irq)
 {
unsigned long flags;
@@ -279,6 +291,24 @@ static void unmask_IO_APIC_irq (unsigned
spin_unlock_irqrestore(&ioapic_lock, flags);
 }
 
+static void pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_mask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
+static void pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_unmask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
 static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
 {
struct IO_APIC_route_entry entry;
@@ -1257,22 +1287,27 @@ static int assign_irq_vector(int irq)
 
return vector;
 }
+
 static struct irq_chip ioapic_chip;
+static struct irq_chip pcix_ioapic_chip;
 
 #define IOAPIC_AUTO    -1
 #define IOAPIC_EDGE    0
 #define IOAPIC_LEVEL   1
 
-static void ioapic_register_intr(int irq, int vector, unsigned long trigger)
+static void ioapic_register_intr(int irq, int vector, unsigned long trigger,
+int pcix)
 {
+   struct irq_chip *chip = pcix ? &pcix_ioapic_chip : &ioapic_chip;
+
if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
trigger == IOAPIC_LEVEL)
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_fasteoi_irq, "fasteoi");
-   else {
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_edge_irq, "edge");
-   }
+   set_irq_chip_and_handler_name(irq, chip, handle_fasteoi_irq,
+ pcix ? "pcix-fasteoi" : "fasteoi");
+   else
+   set_irq_chip_and_handler_name(irq, chip, handle_edge_irq,
+ pcix ? "pcix-edge" : "edge");
+   
set_intr_gate(vector, interrupt[irq]);
 }
 
@@ -1336,7 +1371,8 @@ static void __init setup_IO_APIC_irqs(vo
if (IO_APIC_IRQ(irq)) {
vector = assign_irq_vector(irq);
entry.vector = vector;
-   ioapic_register_intr(irq, vector, IOAPIC_AUTO);
+   ioapic_register_intr(irq, vector, IOAPIC_AUTO,
+apic > 0);

if (!apic && (irq < 16))
disable_8259A_irq(irq);
@@ -2058,6 +2094,18 @@ static struct irq_chip ioapic_chip __rea
.retrigger  = ioapic_retrigger_irq,
 };
 
+static struct irq_chip pcix_ioapic_chip __read_mostly = {
+   .name   = "IO-APIC",
+   .startup= startup_ioapic_irq,
+   .mask   = pcix_mask_IO_APIC_irq,
+   .unmask = pcix_unmask_IO_APIC_irq,
+   .ack= ack_ioapic_irq,
+   .eoi= ack_ioapic_irq,
+#ifdef CONFIG_SMP
+   .set_affinity   = set_ioapic_affinity_irq,
+#endif
+   .retrigger  = ioapic_retrigger_irq,
+};
 
 static inline void init_IO_APIC_traps(void)
 {
@@ -2858,7 +2906,7 @@ int io_apic_set_pci_routing (int ioapic,
mp_ioapics[ioapic].mpc_apicid, pin, entry.vector, irq,
edge_level, active_high_low);
 
-   ioapic_register_intr(irq, entry.vec

Re: [PATCH -rt] irq nobody cared workaround for i386

2007-06-20 Thread Michal Schmidt
Michal Schmidt wrote:
> Steven Rostedt wrote:
>   
>> This is the final "design" for the nobody cared bug. For all IO-APICS 
>> other than the first one (the chained IO-APICS) we use the PCIX version 
>> of the mask and unmask interrupt routines.  This changes the interrupt 
>> from level to edge for mask and edge to level for unmask. This keeps the 
>> PCI-E from thinking it's in legacy mode and assert an old fashion INT# 
>> interrupt which might spread to other interrupts.
>>
>>   
>> 
>
> Here's a port of the workaround to i386. I tested it successfully on IBM
> LS21.
> Notice I had to disable the quirk handling in ack_ioapic_quirk_irq. The
> code path was triggering on LS21 and because it plays with the Interrupt
> Mask bit, it produced the doubled interrupts again. I don't like it and
> I need to think about a solution which would handle both quirks correctly.
>   

I came to the conclusion that the IO-APICs which need the fix for the
nobody cared bug don't have the issue ack_ioapic_quirk_irq is designed
to work-around. It should be safe simply to use the normal
ack_ioapic_irq as the .eoi method in pcix_ioapic_chip.
So this is the port of Steven's fix for the nobody cared bug to i386. It
works fine on IBM LS21 I have access to.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

--- arch/i386/kernel/io_apic.c.orig 2007-06-19 08:40:05.0 -0400
+++ arch/i386/kernel/io_apic.c  2007-06-20 09:03:55.0 -0400
@@ -261,6 +261,18 @@ static void __unmask_IO_APIC_irq (unsign
__modify_IO_APIC_irq(irq, 0, 0x00010000);
 }
 
+/* trigger = 0 (edge mode) */
+static void __pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0, 0x8000);
+}
+
+/* mask = 0, trigger = 1 (level mode) */
+static void __pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0x8000, 0x00010000);
+}
+
 static void mask_IO_APIC_irq (unsigned int irq)
 {
unsigned long flags;
@@ -279,6 +291,24 @@ static void unmask_IO_APIC_irq (unsigned
spin_unlock_irqrestore(&ioapic_lock, flags);
 }
 
+static void pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_mask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
+static void pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_unmask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
 static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
 {
struct IO_APIC_route_entry entry;
@@ -1257,22 +1287,27 @@ static int assign_irq_vector(int irq)
 
return vector;
 }
+
 static struct irq_chip ioapic_chip;
+static struct irq_chip pcix_ioapic_chip;
 
 #define IOAPIC_AUTO    -1
 #define IOAPIC_EDGE    0
 #define IOAPIC_LEVEL   1
 
-static void ioapic_register_intr(int irq, int vector, unsigned long trigger)
+static void ioapic_register_intr(int irq, int vector, unsigned long trigger,
+int pcix)
 {
+   struct irq_chip *chip = pcix ? &pcix_ioapic_chip : &ioapic_chip;
+
if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
trigger == IOAPIC_LEVEL)
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_fasteoi_irq, "fasteoi");
-   else {
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_edge_irq, "edge");
-   }
+   set_irq_chip_and_handler_name(irq, chip, handle_fasteoi_irq,
+ pcix ? "pcix-fasteoi" : "fasteoi");
+   else
+   set_irq_chip_and_handler_name(irq, chip, handle_edge_irq,
+ pcix ? "pcix-edge" : "edge");
+   
set_intr_gate(vector, interrupt[irq]);
 }
 
@@ -1336,7 +1371,7 @@ static void __init setup_IO_APIC_irqs(vo
if (IO_APIC_IRQ(irq)) {
vector = assign_irq_vector(irq);
entry.vector = vector;
-   ioapic_register_intr(irq, vector, IOAPIC_AUTO);
+   ioapic_register_intr(irq, vector, IOAPIC_AUTO, apic>0);

if (!apic && (irq < 16))
disable_8259A_irq(irq);
@@ -2058,6 +2093,18 @@ static struct irq_chip ioapic_chip __rea
.retrigger  = ioapic_retrigger_irq,
 };
 
+static struct irq_chip pcix_ioapic_chip __read_mostly = {
+   .name   = "IO-APIC",
+   .startup= startup_ioapic_irq,
+   

[PATCH -rt] irq nobody cared workaround for i386

2007-06-19 Thread Michal Schmidt
Steven Rostedt wrote:
> This is the final "design" for the nobody cared bug. For all IO-APICS 
> other than the first one (the chained IO-APICS) we use the PCIX version 
> of the mask and unmask interrupt routines.  This changes the interrupt 
> from level to edge for mask and edge to level for unmask. This keeps the 
> PCI-E from thinking it's in legacy mode and assert an old fashion INT# 
> interrupt which might spread to other interrupts.
>
>   

Here's a port of the workaround to i386. I tested it successfully on IBM
LS21.
Notice I had to disable the quirk handling in ack_ioapic_quirk_irq. The
code path was triggering on LS21 and because it plays with the Interrupt
Mask bit, it produced the doubled interrupts again. I don't like it and
I need to think about a solution which would handle both quirks correctly.
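
For anyone not familiar with the redirection table bits, here is what the PCIX mask/unmask pair in the patch below boils down to, written as plain userspace C on a 32-bit value. This is only a sketch: the toy_* names are mine, and it assumes the usual IO-APIC redirection entry layout (bit 16 = interrupt mask, bit 15 = trigger mode with 1 meaning level).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RTE_MASKED (UINT32_C(1) << 16) /* assumed: bit 16 = mask */
#define TOY_RTE_LEVEL  (UINT32_C(1) << 15) /* assumed: bit 15 = level trigger */

/* Same shape as __modify_IO_APIC_irq(): clear 'disable', then set 'enable'. */
static uint32_t toy_modify(uint32_t reg, uint32_t enable, uint32_t disable)
{
	assert((enable & disable) == 0);
	reg &= ~disable;
	reg |= enable;
	return reg;
}

/* "Mask" by switching the pin to edge mode; the mask bit is left alone. */
static uint32_t toy_pcix_mask(uint32_t reg)
{
	return toy_modify(reg, 0, TOY_RTE_LEVEL);
}

/* Unmask: switch back to level mode and make sure the mask bit is clear. */
static uint32_t toy_pcix_unmask(uint32_t reg)
{
	return toy_modify(reg, TOY_RTE_LEVEL, TOY_RTE_MASKED);
}
```

So for these IO-APICs masking never sets the mask bit; it only flips the trigger mode, and unmasking flips it back.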

Michal

--- arch/i386/kernel/io_apic.c.orig 2007-06-19 08:40:05.0 -0400
+++ arch/i386/kernel/io_apic.c  2007-06-19 08:58:00.0 -0400
@@ -261,6 +261,18 @@ static void __unmask_IO_APIC_irq (unsign
__modify_IO_APIC_irq(irq, 0, 0x00010000);
 }
 
+/* trigger = 0 (edge mode) */
+static void __pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0, 0x8000);
+}
+
+/* mask = 0, trigger = 1 (level mode) */
+static void __pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   __modify_IO_APIC_irq(irq, 0x8000, 0x00010000);
+}
+
 static void mask_IO_APIC_irq (unsigned int irq)
 {
unsigned long flags;
@@ -279,6 +291,24 @@ static void unmask_IO_APIC_irq (unsigned
spin_unlock_irqrestore(&ioapic_lock, flags);
 }
 
+static void pcix_mask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_mask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
+static void pcix_unmask_IO_APIC_irq (unsigned int irq)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioapic_lock, flags);
+   __pcix_unmask_IO_APIC_irq(irq);
+   spin_unlock_irqrestore(&ioapic_lock, flags);
+}
+
 static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
 {
struct IO_APIC_route_entry entry;
@@ -1257,22 +1287,27 @@ static int assign_irq_vector(int irq)
 
return vector;
 }
+
 static struct irq_chip ioapic_chip;
+static struct irq_chip pcix_ioapic_chip;
 
 #define IOAPIC_AUTO    -1
 #define IOAPIC_EDGE    0
 #define IOAPIC_LEVEL   1
 
-static void ioapic_register_intr(int irq, int vector, unsigned long trigger)
+static void ioapic_register_intr(int irq, int vector, unsigned long trigger,
+int pcix)
 {
+   struct irq_chip *chip = pcix ? &pcix_ioapic_chip : &ioapic_chip;
+
if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
trigger == IOAPIC_LEVEL)
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_fasteoi_irq, "fasteoi");
-   else {
-   set_irq_chip_and_handler_name(irq, &ioapic_chip,
-handle_edge_irq, "edge");
-   }
+   set_irq_chip_and_handler_name(irq, chip, handle_fasteoi_irq,
+ pcix ? "pcix-fasteoi" : "fasteoi");
+   else
+   set_irq_chip_and_handler_name(irq, chip, handle_edge_irq,
+ pcix ? "pcix-edge" : "edge");
+   
set_intr_gate(vector, interrupt[irq]);
 }
 
@@ -1336,7 +1371,7 @@ static void __init setup_IO_APIC_irqs(vo
if (IO_APIC_IRQ(irq)) {
vector = assign_irq_vector(irq);
entry.vector = vector;
-   ioapic_register_intr(irq, vector, IOAPIC_AUTO);
+   ioapic_register_intr(irq, vector, IOAPIC_AUTO, apic>0);

if (!apic && (irq < 16))
disable_8259A_irq(irq);
@@ -2027,6 +2062,7 @@ static void ack_ioapic_quirk_irq(unsigne
 
ack_APIC_irq();
 
+#if 0
if (!(v & (1 << (i & 0x1f)))) {
atomic_inc(&irq_mis_count);
spin_lock(&ioapic_lock);
@@ -2036,6 +2072,7 @@ static void ack_ioapic_quirk_irq(unsigne
__modify_IO_APIC_irq(irq, 0x8000, 0x00010000);
spin_unlock(&ioapic_lock);
}
+#endif
 }
 
 static int ioapic_retrigger_irq(unsigned int irq)
@@ -2058,6 +2095,18 @@ static struct irq_chip ioapic_chip __rea
.retrigger  = ioapic_retrigger_irq,
 };
 
+static struct irq_chip pcix_ioapic_chip __read_mostly = {
+   .name   = "IO-APIC",
+   .startup= startup_ioapic_irq,
+   .mask   = pcix_mask_IO_APIC_irq,
+   .unmask = pcix_unmask_IO_APIC_irq,
+   .ack= ack_ioapic_irq,
+   .eoi= ack_ioapic_quirk_irq,
+#ifdef CONFIG_SMP
+   .set_affinity   = set_ioapic_affinity_irq,
+#endif
+   

Re: ZFS with Linux: An Open Plea

2007-04-17 Thread Michal Schmidt
linux-os (Dick Johnson) wrote:
> if you never look at somebody else's'
> implementation details, you certainly should not be violating a patent.

Oh, it would be a beautiful world in which this was true!

Michal


Re: [PATCH 58/59] sysctl: Reimplement the sysctl proc support

2007-03-14 Thread Michal Schmidt
Ingo Molnar wrote:
> * Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
>   
>> Use register_sysctl_table() for sysctls.
>> 
>
> yes - i just wanted to point out the incompatibility and subtle breakage 
> that this change caused. I'll now have to convert the current code over 
> to sysctl_table, which isnt that hard but not trivial either, and i 
> certainly could make use that time for other purposes.
>
>   Ingo
>   

How about this? It works for me.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/kernel/latency_trace.c b/kernel/latency_trace.c
index e07bb95..a13d001 100644
--- a/kernel/latency_trace.c
+++ b/kernel/latency_trace.c
@@ -19,7 +19,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -2661,66 +2661,94 @@ void print_traces(struct task_struct *task)
 }
 #endif
 
-static int preempt_read_proc(char *page, char **start, off_t off,
-int count, int *eof, void *data)
+#if defined(CONFIG_WAKEUP_TIMING) || defined(CONFIG_EVENT_TRACE)
+
+static int preempt_proc_handler(ctl_table *table, int write, struct file *filp,
+   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-   cycle_t *max = data;
+#define TMPBUFLEN 21
+   char buf[TMPBUFLEN];
+   size_t left = *lenp;
+   cycle_t *max = table->data;
 
-   return sprintf(page, "%ld\n", cycles_to_usecs(*max));
-}
+   if (!table->data || table->maxlen!=sizeof(cycles_t) || !*lenp ||
+   (*ppos && !write)) {
+   *lenp = 0;
+   return 0;
+   }
 
-static int preempt_write_proc(struct file *file, const char __user *buffer,
- unsigned long count, void *data)
-{
-   unsigned int c, done = 0, val, sum = 0;
-   cycle_t *max = data;
+   if (!write) {
+   int len;
 
-   while (count) {
-   if (get_user(c, buffer))
-   return -EFAULT;
-   val = c - '0';
-   buffer++;
-   done++;
-   count--;
-   if (c == 0 || c == '\n')
-   break;
-   if (val > 9)
+   len = snprintf(buf, TMPBUFLEN, "%ld\n", cycles_to_usecs(*max));
+   if (len >= TMPBUFLEN)
return -EINVAL;
-   sum *= 10;
-   sum += val;
+   if (len > left)
+   len = left;
+   if (copy_to_user(buffer, buf, len))
+   return -EFAULT;
+   left -= len;
+   } else {
+   unsigned int c, val, sum = 0;
+
+   while (left) {
+   if (get_user(c, (char __user *)buffer))
+   return -EFAULT;
+   val = c - '0';
+   buffer++;
+   left--;
+   if (c == 0 || c == '\n')
+   break;
+   if (val > 9)
+   return -EINVAL;
+   sum *= 10;
+   sum += val;
+   }
+   *max = usecs_to_cycles(sum);
}
-   *max = usecs_to_cycles(sum);
-   return done;
+
+   *lenp -= left;
+   *ppos += *lenp;
+   return 0;
 }
 
-#if defined(CONFIG_WAKEUP_TIMING) || defined(CONFIG_EVENT_TRACE)
+static ctl_table preempt_latency_table[] = {
+   {
+   .ctl_name   = CTL_UNNUMBERED,
+   .procname   = "preempt_max_latency",
+   .data   = &preempt_max_latency,
+   .maxlen = sizeof(cycles_t),
+   .mode   = 0644,
+   .proc_handler   = &preempt_proc_handler,
+   },
+#ifdef CONFIG_EVENT_TRACE
+   {
+   .ctl_name   = CTL_UNNUMBERED,
+   .procname   = "preempt_thresh",
+   .data   = &preempt_thresh,
+   .maxlen = sizeof(cycles_t),
+   .mode   = 0644,
+   .proc_handler   = &preempt_proc_handler,
+   },
+#endif
+   { .ctl_name = 0 }
+};
 
-#define PROCNAME_PML "sys/kernel/preempt_max_latency"
-#define PROCNAME_PT "sys/kernel/preempt_thresh"
+static ctl_table kernel_root[] = {
+   {
+   .ctl_name   = CTL_KERN,
+   .procname   = "kernel",
+   .mode   = 0555,
+   .child  = preempt_latency_table,
+   },
+   { .ctl_name = 0 }
+};
+
+static struct ctl_table_header *sysctl_header;
 
 static __init int latency_fs_init(void)
 {
-   struct proc_dir_entry *entry;
-
-   if (!(entry = create_proc_entry(PROCNAME_PML, 0644, NULL)

[PATCH -rt] fix preempt count underflow in user_trace_stop

2007-03-12 Thread Michal Schmidt
When playing with trace_user_trigger_irq in order to trace
IRQ->userspace latencies, I encountered a bug in the latency tracer. If
I have wakeup_timing enabled and attempt to stop the trace in my
userspace program, the system crashes. This is caused by an unbalanced
preempt_enable() which underflows the preempt count.
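
To illustrate what "unbalanced" means here, a toy userspace model of the counter is enough. None of this is the tracer's real code; the toy_* names and the simplified stop functions are assumptions made for the illustration. The only thing it shows is an error path that calls the enable side without a matching disable.

```c
#include <assert.h>

static int toy_preempt_count; /* 0 means preemptible */

static void toy_preempt_disable(void) { toy_preempt_count++; }
static void toy_preempt_enable(void)  { toy_preempt_count--; }

/* A correctly paired critical section leaves the count where it was. */
static void toy_balanced_section(void)
{
	int before = toy_preempt_count;

	toy_preempt_disable();
	/* ... work that must not be preempted ... */
	toy_preempt_enable();
	assert(toy_preempt_count == before);
}

/* Error path with a stray enable: nothing was disabled on this path. */
static int toy_stop_buggy(int caller_owns_trace)
{
	if (!caller_owns_trace) {
		toy_preempt_enable(); /* no matching disable */
		return -1;
	}
	return 0;
}

/* Same error path with the stray enable removed. */
static int toy_stop_fixed(int caller_owns_trace)
{
	if (!caller_owns_trace)
		return -1;
	return 0;
}
```

After one failed toy_stop_buggy(0) the counter sits at -1; in the kernel that state means the scheduler's idea of "preemption disabled" no longer matches reality.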

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/kernel/latency_trace.c b/kernel/latency_trace.c
index e07bb95..29dfb79 100644
--- a/kernel/latency_trace.c
+++ b/kernel/latency_trace.c
@@ -2396,7 +2396,6 @@ long user_trace_stop(void)
if (current != sch.task) {
__raw_spin_unlock(&sch.trace_lock);
local_irq_restore(flags);
-   preempt_enable();
return -EINVAL;
}
sch.task = NULL;




[PATCH -rt] airo: threaded IRQ handler sleeps forever

2007-03-06 Thread Michal Schmidt
The airo driver tries to avoid excessive latencies when issuing commands
to the card by calling schedule() after several retries. But
issuecommand() can be run from an interrupt handler. The function is
careful enough to check with in_atomic() if it is safe to call schedule().
This check breaks when the interrupt handler is threaded, because then
in_atomic() is always false there. The handler is run as
TASK_INTERRUPTIBLE, so schedule() takes it off the runqueue and it never
wakes up again.
Here's an obvious fix - simply don't call schedule() when using
preemptible hardirqs.
An improved solution might be to identify the commands that take so long
to issue and avoid sending them from the interrupt handler. In my
testing there was only one such command: CMD_ACCESS. I need to
investigate if it's always possible to delay it to airo's kthread.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/drivers/net/wireless/airo.c b/drivers/net/wireless/airo.c
index 44a2270..98014c0 100644
--- a/drivers/net/wireless/airo.c
+++ b/drivers/net/wireless/airo.c
@@ -3938,8 +3938,10 @@ static u16 issuecommand(struct airo_info *ai, Cmd *pCmd, Resp *pRsp) {
if ((IN4500(ai, COMMAND)) == pCmd->cmd)
// PC4500 didn't notice command, try again
OUT4500(ai, COMMAND, pCmd->cmd);
+#ifndef CONFIG_PREEMPT_HARDIRQS
if (!in_atomic() && (max_tries & 255) == 0)
schedule();
+#endif
}
 
if ( max_tries == -1 ) {




Re: [PATCH] Fix compilation of drivers with -O0

2007-02-23 Thread Michal Schmidt

[that still wasn't right, here's for the 3rd and final time.]

It is sometimes useful to compile individual drivers with optimization
disabled for easier debugging. Currently drivers which use htonl() and
similar functions don't compile with -O0. This patch fixes it.
It also removes obsolete and misleading comments. This header is not
for userspace, so we don't have to care about strange programs these
comments mention.
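
As a side note on why these stay macros: a macro that expands to a pure expression can be used where C requires a constant expression, which a function call cannot. The following userspace sketch shows the idea. The TOY_* names are made up, it is not the kernel header, and it assumes a compiler that predefines __BYTE_ORDER__ (GCC and Clang do); anything else is treated as little-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pure-expression byte swap, usable in constant expressions. */
#define TOY_SWAB32(x) ((uint32_t)( \
	(((uint32_t)(x) & 0x000000ffu) << 24) | \
	(((uint32_t)(x) & 0x0000ff00u) <<  8) | \
	(((uint32_t)(x) & 0x00ff0000u) >>  8) | \
	(((uint32_t)(x) & 0xff000000u) >> 24)))

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define TOY_HTONL(x) ((uint32_t)(x))
#else
#define TOY_HTONL(x) TOY_SWAB32(x) /* assumption: little-endian host */
#endif

/*
 * File-scope initializer: legal only because TOY_HTONL() expands to a
 * constant expression. A function call could not appear here.
 */
static const uint32_t toy_magic_be = TOY_HTONL(0x01020304u);

/* Copy the in-memory bytes of v into out[0..3]. */
static void toy_bytes(uint32_t v, unsigned char out[4])
{
	assert(out != NULL);
	memcpy(out, &v, sizeof(v));
}
```

Whatever the host byte order, the bytes of toy_magic_be in memory come out as 01 02 03 04.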

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/include/linux/byteorder/generic.h b/include/linux/byteorder/generic.h
index e86e4a9..3dc715b 100644
--- a/include/linux/byteorder/generic.h
+++ b/include/linux/byteorder/generic.h
@@ -124,19 +124,8 @@
#define be32_to_cpus __be32_to_cpus
#define cpu_to_be16s __cpu_to_be16s
#define be16_to_cpus __be16_to_cpus
-#endif

-
-#if defined(__KERNEL__)
/*
- * Handle ntohl and suches. These have various compatibility
- * issues - like we want to give the prototype even though we
- * also have a macro for them in case some strange program
- * wants to take the address of the thing or something..
- *
- * Note that these used to return a "long" in libc5, even though
- * long is often 64-bit these days.. Thus the casts.
- *
 * They have to be macros in order to do the constant folding
 * correctly - if the argument passed into a inline function
 * it is no longer constant according to gcc..
@@ -147,17 +136,6 @@
#undef htonl
#undef htons

-/*
- * Do the prototypes. Somebody might want to take the
- * address or some such sick thing..
- */
-extern __u32   ntohl(__be32);
-extern __be32  htonl(__u32);
-extern __u16   ntohs(__be16);
-extern __be16  htons(__u16);
-
-#if defined(__GNUC__) && defined(__OPTIMIZE__)
-
#define ___htonl(x) __cpu_to_be32(x)
#define ___htons(x) __cpu_to_be16(x)
#define ___ntohl(x) __be32_to_cpu(x)
@@ -168,9 +146,6 @@ extern __be16   htons(__u16);
#define htons(x) ___htons(x)
#define ntohs(x) ___ntohs(x)

-#endif /* OPTIMIZE */
-
#endif /* KERNEL */

-
#endif /* _LINUX_BYTEORDER_GENERIC_H */




Re: [PATCH] Fix compilation of drivers with -O0

2007-02-23 Thread Michal Schmidt
[Sorry, the patch was corrupted by the mailer. Hopefully it's ok this time.]

It is sometimes useful to compile individual drivers with optimization
disabled for easier debugging. Currently drivers which use htonl() and
similar functions don't compile with -O0. This patch fixes it.
It also removes obsolete and misleading comments. This header is not
for userspace, so we don't have to care about strange programs these
comments mention.


Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/include/linux/byteorder/generic.h
b/include/linux/byteorder/generic.h
index e86e4a9..3dc715b 100644
--- a/include/linux/byteorder/generic.h
+++ b/include/linux/byteorder/generic.h
@@ -124,19 +124,8 @@
 #define be32_to_cpus __be32_to_cpus
 #define cpu_to_be16s __cpu_to_be16s
 #define be16_to_cpus __be16_to_cpus
-#endif

-
-#if defined(__KERNEL__)
 /*
- * Handle ntohl and suches. These have various compatibility
- * issues - like we want to give the prototype even though we
- * also have a macro for them in case some strange program
- * wants to take the address of the thing or something..
- *
- * Note that these used to return a "long" in libc5, even though
- * long is often 64-bit these days.. Thus the casts.
- *
  * They have to be macros in order to do the constant folding
  * correctly - if the argument passed into a inline function
  * it is no longer constant according to gcc..
@@ -147,17 +136,6 @@
 #undef htonl
 #undef htons

-/*
- * Do the prototypes. Somebody might want to take the
- * address or some such sick thing..
- */
-extern __u32	ntohl(__be32);
-extern __be32	htonl(__u32);
-extern __u16	ntohs(__be16);
-extern __be16	htons(__u16);
-
-#if defined(__GNUC__) && defined(__OPTIMIZE__)
-
 #define ___htonl(x) __cpu_to_be32(x)
 #define ___htons(x) __cpu_to_be16(x)
 #define ___ntohl(x) __be32_to_cpu(x)
@@ -168,9 +146,6 @@ extern __be16	htons(__u16);
 #define htons(x) ___htons(x)
 #define ntohs(x) ___ntohs(x)

-#endif /* OPTIMIZE */
-
 #endif /* KERNEL */

-
 #endif /* _LINUX_BYTEORDER_GENERIC_H */
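
For readers of the archive, the constant-folding remark in the comment the patch keeps can be illustrated outside the kernel. The sketch below is a userspace demo under stated assumptions, not the kernel's implementation: every DEMO_/demo_ name is made up for illustration. It shows that a macro built from plain arithmetic can be used where a constant expression is required (a static initializer), which a call to a function cannot, and that this does not depend on the optimization level.

```c
#include <stdint.h>

/*
 * Hypothetical byte-swap macro (not the kernel's __cpu_to_be32).
 * With a constant argument the whole expression is a constant
 * expression, so the compiler folds it even at -O0.
 */
#define DEMO_SWAB32(x) ((uint32_t)( \
	(((uint32_t)(x) & 0x000000ffU) << 24) | \
	(((uint32_t)(x) & 0x0000ff00U) <<  8) | \
	(((uint32_t)(x) & 0x00ff0000U) >>  8) | \
	(((uint32_t)(x) & 0xff000000U) >> 24)))

/* Function form: fine for run-time values, but not a constant expression. */
static inline uint32_t demo_swab32_fn(uint32_t x)
{
	return DEMO_SWAB32(x);
}

/*
 * A static initializer must be a constant expression. The macro works
 * here; writing demo_swab32_fn(0x12345678U) instead would not compile.
 */
static const uint32_t demo_magic = DEMO_SWAB32(0x12345678U);
```

The design point is the same as in the patch description: if the macros are only defined when the compiler optimizes, code built with -O0 is left with bare prototypes and nothing to link against.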




[PATCH] Fix compilation of drivers with -O0

2007-02-23 Thread Michal Schmidt
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It is sometimes useful to compile individual drivers with optimization
disabled for easier debugging. Currently drivers which use htonl() and
similar functions don't compile with -O0. This patch fixes it.
It also removes obsolete and misleading comments. This header is not
for userspace, so we don't have to care about strange programs these
comments mention.


Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>

diff --git a/include/linux/byteorder/generic.h
b/include/linux/byteorder/generic.h
index e86e4a9..3dc715b 100644
- --- a/include/linux/byteorder/generic.h
+++ b/include/linux/byteorder/generic.h
@@ -124,19 +124,8 @@
 #define be32_to_cpus __be32_to_cpus
 #define cpu_to_be16s __cpu_to_be16s
 #define be16_to_cpus __be16_to_cpus
- -#endif
 
- -
- -#if defined(__KERNEL__)
 /*
- - * Handle ntohl and suches. These have various compatibility
- - * issues - like we want to give the prototype even though we
- - * also have a macro for them in case some strange program
- - * wants to take the address of the thing or something..
- - *
- - * Note that these used to return a "long" in libc5, even though
- - * long is often 64-bit these days.. Thus the casts.
- - *
  * They have to be macros in order to do the constant folding
  * correctly - if the argument passed into a inline function
  * it is no longer constant according to gcc..
@@ -147,17 +136,6 @@
 #undef htonl
 #undef htons
 
- -/*
- - * Do the prototypes. Somebody might want to take the
- - * address or some such sick thing..
- - */
- -extern __u32	ntohl(__be32);
- -extern __be32	htonl(__u32);
- -extern __u16	ntohs(__be16);
- -extern __be16	htons(__u16);
- -
- -#if defined(__GNUC__) && defined(__OPTIMIZE__)
- -
 #define ___htonl(x) __cpu_to_be32(x)
 #define ___htons(x) __cpu_to_be16(x)
 #define ___ntohl(x) __be32_to_cpu(x)
@@ -168,9 +146,6 @@ extern __be16	htons(__u16);
 #define htons(x) ___htons(x)
 #define ntohs(x) ___ntohs(x)
 
- -#endif /* OPTIMIZE */
- -
 #endif /* KERNEL */
 
- -
 #endif /* _LINUX_BYTEORDER_GENERIC_H */
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Red Hat - http://enigmail.mozdev.org

iD8DBQFF3w75abKV90ewf0QRAi+iAJ4g/NZXKdspLSi5wiRlzu5U0ytJFwCdEKD9
RUDYj69LURttm8qyCUCHz3k=
=73ft
-----END PGP SIGNATURE-----



Re: [OOPS] on 2.6.20-rc5-rt10

2007-01-30 Thread Michal Schmidt

Remy Bohmer wrote:

Hello All,

Once in a while we see the following stacktrace.
We do not know yet the exact condition that generates this, but is
there anyone that recognises this oops?

Kind Regards,

Remy Bohmer

[...]
Jan 30 14:09:20 localhost kernel: Modules linked in: cap_over
commoncap i2c_dev uhci_hcd i2c_i801 i2c_core ehci_hcd


What's the cap_over module? I can't find it in my kernel anywhere.
Michal


Re: What happen when hangs !!

2006-12-07 Thread Michal Schmidt

Jaswinder Singh wrote:

Sometimes my machine hangs in userspace area like this :-

VFS: Mounted root (ext3 filesystem).
Freeing init memory: 124K
INIT:
<>

OR

VFS: Mounted root (ext3 filesystem).
Freeing init memory: 124K
INIT: version 2.85 booting
<>

How can I debug this hang, what are the cases.


When it hangs, try to capture the list of processes using Alt+SysRq+T. 
You need to have CONFIG_MAGIC_SYSRQ enabled in the kernel.


Michal


Re: PREEMPT is messing with everyone

2006-12-05 Thread Michal Schmidt

Jaswinder Singh wrote:

Yes, Compiler will remove it but this looks ugly and confusing.

Why dont we use like this :-

#ifdef CONFIG_PREEMPT
#include 
#endif

#ifdef CONFIG_PREEMPT
 preempt_disable();
#endif

#ifdef CONFIG_PREEMPT
 preempt_enable();
#endif


Surely you're joking.
It is much more readable and maintainable to hide the #ifdef-hackery in 
header files than to clutter the *.c files.


Michal



Re: PREEMPT is messing with everyone

2006-12-05 Thread Michal Schmidt

Jaswinder Singh wrote:

Hi,

preempt stuff SHOULD only stay in #ifdef CONFIG_PREEMP_* , but it is
messing with everyone even though not defined.

e.g.

1. linux-2.6.19/kernel/spinlock.c

Line 18: #include 

Line 26:  preempt_disable();

Line 32:  preempt_disable();

and so on .


Don't worry. These compile into "do { } while (0)" (i.e. nothing) when 
CONFIG_PREEMPT is not set.
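
A minimal sketch of that pattern, for illustration only. The DEMO_/demo_ names are hypothetical stand-ins, not the kernel's definitions; the point is just that the callers contain no #ifdef and that the empty variant still behaves like a single statement.

```c
/* Hypothetical stand-in for CONFIG_PREEMPT; uncomment to get the counting variant. */
/* #define DEMO_CONFIG_PREEMPT */

static int demo_preempt_depth;	/* illustrative counter, not the kernel's */

#ifdef DEMO_CONFIG_PREEMPT
#define demo_preempt_disable()	do { demo_preempt_depth++; } while (0)
#define demo_preempt_enable()	do { demo_preempt_depth--; } while (0)
#else
/* Empty, but still one statement: safe after "if" and before "else". */
#define demo_preempt_disable()	do { } while (0)
#define demo_preempt_enable()	do { } while (0)
#endif

/* The caller is written once and needs no #ifdef of its own. */
static int demo_critical_section(int x)
{
	int result;

	if (x < 0)
		demo_preempt_disable();
	else
		demo_preempt_disable();

	result = x * 2;
	demo_preempt_enable();
	return result;
}
```

With the option undefined, the two macro calls expand to empty loops that the compiler drops entirely, so nothing of them is left in the object code.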




2. linux-2.6.19/kernel/sched.c

Line 1096:  int preempted;

Line 1104:   preempted = !task_running(rq, p);

Line 1106:   if (preempted)

Line 2059:  if (TASK_PREEMPTS_CURR(p, this_rq))


Linux always does preemptive multitasking of user tasks. These have 
nothing to do with CONFIG_PREEMPT.



Line 3355:current->comm, preempt_count(), current->pid);

Line 3342:  preempt_disable();

Line 3375:  if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {


preempt_count() is useful in !CONFIG_PREEMPT kernels too. It stores 
information about the current context (hardirq, softirq, ...).
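
To make that concrete, here is a rough sketch of how one counter can carry several kinds of context. The field widths below are an assumption chosen for illustration; the real layout depends on the kernel version and configuration, and all DEMO_/demo_ names are hypothetical.

```c
/* Assumed, illustrative field widths; not taken from any particular kernel. */
#define DEMO_PREEMPT_BITS	8
#define DEMO_SOFTIRQ_BITS	8
#define DEMO_HARDIRQ_BITS	12

#define DEMO_PREEMPT_SHIFT	0
#define DEMO_SOFTIRQ_SHIFT	(DEMO_PREEMPT_SHIFT + DEMO_PREEMPT_BITS)
#define DEMO_HARDIRQ_SHIFT	(DEMO_SOFTIRQ_SHIFT + DEMO_SOFTIRQ_BITS)

#define DEMO_MASK(bits, shift)	(((1UL << (bits)) - 1) << (shift))
#define DEMO_PREEMPT_MASK	DEMO_MASK(DEMO_PREEMPT_BITS, DEMO_PREEMPT_SHIFT)
#define DEMO_SOFTIRQ_MASK	DEMO_MASK(DEMO_SOFTIRQ_BITS, DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_MASK	DEMO_MASK(DEMO_HARDIRQ_BITS, DEMO_HARDIRQ_SHIFT)

/* Adding one of these offsets marks entry into the given context. */
#define DEMO_SOFTIRQ_OFFSET	(1UL << DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_OFFSET	(1UL << DEMO_HARDIRQ_SHIFT)

/* Nonzero if the count says we are inside a hardware interrupt handler. */
static int demo_in_hardirq(unsigned long count)
{
	return (count & DEMO_HARDIRQ_MASK) != 0;
}

/* Nonzero if the count says we are in softirq context. */
static int demo_in_softirq(unsigned long count)
{
	return (count & DEMO_SOFTIRQ_MASK) != 0;
}

/* Nonzero in either kind of interrupt context. */
static int demo_in_interrupt(unsigned long count)
{
	return (count & (DEMO_HARDIRQ_MASK | DEMO_SOFTIRQ_MASK)) != 0;
}
```

Only the lowest field counts preemption nesting; the upper fields answer "what context am I in?", which is why the counter is consulted even in kernels built without CONFIG_PREEMPT.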



[...]

70 to 80 % of this code is removed when compiled.

but 20 to 30 % code left in binary kernel image.

Why Linux kernel is wasting its resources which is not defined at all.


I don't think that's the case.


Any solution ?

Thank you,

Best Regards,

Jaswinder Singh.


Michal


Re: 2.6.19-rc6-rt3, yum repo

2006-11-18 Thread Michal Schmidt

Ingo Molnar wrote:


i've released the 2.6.18-rc6-rt3 tree

Hi Ingo,
lockdep doesn't compile on UP. per_cpu_offset only makes sense on SMP.

Michal

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 8f6ba22..d46082d 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -1194,8 +1194,13 @@ register_lock_class(struct lockdep_map *
 	 */
 	if (!static_obj(lock->key)) {
 		debug_locks_off();
+#ifdef CONFIG_SMP
 		printk("INFO: trying to register non-static key %p (%016lx).\n",
 			lock->key, per_cpu_offset(raw_smp_processor_id()));
+#else
+		printk("INFO: trying to register non-static key %p.\n",
+			lock->key);
+#endif
 		printk("the code is fine but needs lockdep annotation.\n");
 		printk("turning off the locking correctness validator.\n");
 		dump_stack();




Re: How to find out kernel stack over flow?

2005-09-07 Thread Michal Schmidt

nazim khan wrote:

I suspect that one of my module that I am inserting in
the kernel may be causing the stack overflow which is
leading to kernel crash (may because it is corrupting
some one lese memory).

How can I find this out?


You could enable CONFIG_DEBUG_STACKOVERFLOW.
If you showed us your module's source code, someone might see the bug.

Michal


Re: 2.6.13-rc6-rt1

2005-08-16 Thread Michal Schmidt

Ingo Molnar wrote:
i've released the 2.6.13-rc6-rt1 tree, which can be downloaded from the 
usual place:


  http://redhat.com/~mingo/realtime-preempt/

as the name already suggests, i've switched to a new, simplified naming 
scheme, which follows the usual naming convention of trees tracking the 
mainline kernel. The numbering will be restarted for every new upstream 
kernel the -RT tree is merged to.


Great! With this naming scheme it is easy to teach Matt Mackall's 
ketchup script about the -RT tree.

The modified ketchup script can be downloaded from:
http://www.uamt.feec.vutbr.cz/rizeni/pom/ketchup-0.9+rt

Matt, would you release a new ketchup version with this support for 
Ingo's tree?


Michal
--- ketchup-0.9 2005-08-16 14:06:20.0 +0200
+++ ketchup-0.9+rt  2005-08-16 14:24:05.0 +0200
@@ -307,7 +307,11 @@ version_info = {
 '2.6-mjb': (latest_mjb,
  kernel_url + "/people/mbligh/%(prebase)s/patch-%(full)s.bz2",
  r'patch-(2.6.*?).bz2',
- 1, "Martin Bligh's random collection 'o crap")
+ 1, "Martin Bligh's random collection 'o crap"),
+'2.6-rt': (latest_dir,
+           "http://people.redhat.com/mingo/realtime-preempt/patch-%(full)s",
+   r'patch-(2.6.*?)',
+   0, "Ingo Molnar's realtime-preempt kernel")
 }
 
 def version_url(ver, sign = 0):


Re: captive-ntfs FUSE support?

2005-08-10 Thread Michal Schmidt

Kristoffer wrote:

captive ntfs: http://www.jankratochvil.net/project/captive/
http://www.jankratochvil.net/project/captive/CVS.html.pl

Can someone please port cvs captive-ntfs to FUSE?


OK. How much do you pay me?
Michal


Re: amd64-agp vs. swsusp

2005-08-07 Thread Michal Schmidt

Pavel Machek wrote:

I assume it is in -rc6, too; it is long-standing bug and I am not
aware of any attempts at fixing it. Please file bug report, assign to
me.


I've filed it as Bug 5018.
Michal


Re: [linux-pm] [PATCH] swsusp: simpler calculation of number of pages in PBE list

2005-07-29 Thread Michal Schmidt

Rafael J. Wysocki wrote:

On Friday, 29 of July 2005 21:46, Michal Schmidt wrote:

The function calc_nr uses an iterative algorithm to calculate the number 
of pages needed for the image and the pagedir. Exactly the same result 
can be obtained with a one-line expression.



Could you please post the proof?

Rafael


OK, attached is a proof-by-brute-force program. It compares the results 
of the original function and the simplified one.


This is its output:

$ ./calc_nr2
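checked 0 ...
checked 100000000 ...
checked 200000000 ...
checked 300000000 ...
checked 400000000 ...
checked 500000000 ...
checked 600000000 ...
checked 700000000 ...
checked 800000000 ...
checked 900000000 ...
checked 1000000000 ...
checked 1100000000 ...
checked 1200000000 ...
checked 1300000000 ...
checked 1400000000 ...
checked 1500000000 ...
checked 1600000000 ...
checked 1700000000 ...
checked 1800000000 ...
checked 1900000000 ...
checked 2000000000 ...
checked 2100000000 ...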
First difference at 2130706433:  -2147483646 x -2147483647

It means that the two functions give the same results for sensible 
values of the input argument.
Their results differ only when they overflow into negative values. At 
this point both of the results are useless.
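
For those who prefer an argument to brute force, here is a short derivation sketch. The notation is introduced here and is not from the kernel source: n is nr_copy, P is PBES_PER_PAGE (assumed to be at least 2), and e is the number of extra pages added for the pagedir. It assumes non-negative n and no integer overflow.

```latex
% The loop computes e_0 = 0 and e_{k+1} = \lceil (n + e_k)/P \rceil,
% stopping as soon as the value no longer grows. Since the map
% e \mapsto \lceil (n+e)/P \rceil is monotone and starts from 0,
% the loop ends at the smallest e satisfying
\[
  e \;\ge\; \left\lceil \frac{n + e}{P} \right\rceil .
\]
% For an integer e this is equivalent to
\[
  eP \ge n + e
  \iff e\,(P-1) \ge n
  \iff e \ge \frac{n}{P-1},
\]
% so the smallest such e is
\[
  e \;=\; \left\lceil \frac{n}{P-1} \right\rceil
    \;=\; \left\lfloor \frac{n + P - 2}{P - 1} \right\rfloor ,
\]
% and the function returns
\[
  n + e \;=\; n + \left\lfloor \frac{n + P - 2}{P - 1} \right\rfloor .
\]
```

The last line is the one-line expression, with the floor realized by C integer division.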


Michal
#include <stdio.h>

typedef struct {
	unsigned long val;
} swp_entry_t;

typedef struct pbe {
	unsigned long address;
	unsigned long orig_address;
	swp_entry_t swap_address;
	struct pbe *next;
} suspend_pagedir_t;

#define PAGE_SIZE 4096
#define PBES_PER_PAGE (PAGE_SIZE/sizeof(struct pbe))

static int calc_nr_orig(int nr_copy)
{
	int extra = 0;
	int mod = !!(nr_copy % PBES_PER_PAGE);
	int diff = (nr_copy / PBES_PER_PAGE) + mod;

	do {
		extra += diff;
		nr_copy += diff;
		mod = !!(nr_copy % PBES_PER_PAGE);
		diff = (nr_copy / PBES_PER_PAGE) + mod - extra;
	} while (diff > 0);
	
	return nr_copy;
}

static int calc_nr(int nr_copy)
{
	return nr_copy + (nr_copy+PBES_PER_PAGE-2)/(PBES_PER_PAGE-1);
}

int main()
{
	int i;
	for (i=0; i>=0; i++) {
		if (i%100000000 == 0)
			printf("checked %d ...\n", i);
		if (calc_nr(i) != calc_nr_orig(i)) {
			printf("First difference at %d:  %d x %d\n", i, calc_nr(i), calc_nr_orig(i));
			break;
		}
	}
	return 0;
}



[PATCH] swsusp: simpler calculation of number of pages in PBE list

2005-07-29 Thread Michal Schmidt
The function calc_nr uses an iterative algorithm to calculate the number 
of pages needed for the image and the pagedir. Exactly the same result 
can be obtained with a one-line expression.


Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
diff -Nurp -X dontdiff.new linux-mm/kernel/power/swsusp.c linux-mm.mich/kernel/power/swsusp.c
--- linux-mm/kernel/power/swsusp.c	2005-07-28 13:57:53.0 +0200
+++ linux-mm.mich/kernel/power/swsusp.c	2005-07-29 21:01:46.0 +0200
@@ -737,18 +737,7 @@ static void copy_data_pages(void)
 
 static int calc_nr(int nr_copy)
 {
-	int extra = 0;
-	int mod = !!(nr_copy % PBES_PER_PAGE);
-	int diff = (nr_copy / PBES_PER_PAGE) + mod;
-
-	do {
-		extra += diff;
-		nr_copy += diff;
-		mod = !!(nr_copy % PBES_PER_PAGE);
-		diff = (nr_copy / PBES_PER_PAGE) + mod - extra;
-	} while (diff > 0);
-
-	return nr_copy;
+	return nr_copy + (nr_copy+PBES_PER_PAGE-2)/(PBES_PER_PAGE-1);
 }
 
 /**


Re: [RFT] solve "swsusp plays yoyo" with disks

2005-07-21 Thread Michal Schmidt

Michal Schmidt wrote:

Pavel Machek wrote:


Hi!

I'd like to get this tested under as many configurations as
possible. With this, your hdd should no longer do "yoyo" (spindown,
spinup, spindown) during suspend...



It looks like the patch is now in -mm (I use 2.6.13-rc3-mm1).
But my disks still yoyo during suspend. What more is needed? Some patch 
to ide-disk.c ?


I think I've found the problem.
The attached patch stops the disks from spinning down and up on suspend.
The patch applies to 2.6.13-rc3-mm1.

Signed-off-by: Michal Schmidt <[EMAIL PROTECTED]>
diff -Nurp -X dontdiff.new linux-mm/drivers/ide/ide-io.c linux-mm.mich/drivers/ide/ide-io.c
--- linux-mm/drivers/ide/ide-io.c	2005-06-30 01:00:53.0 +0200
+++ linux-mm.mich/drivers/ide/ide-io.c	2005-07-21 16:59:46.0 +0200
@@ -150,7 +150,7 @@ static void ide_complete_power_step(ide_
 
 	switch (rq->pm->pm_step) {
 	case ide_pm_flush_cache:	/* Suspend step 1 (flush cache) complete */
-		if (rq->pm->pm_state == 4)
+		if (rq->pm->pm_state == PM_EVENT_FREEZE)
 			rq->pm->pm_step = ide_pm_state_completed;
 		else
 			rq->pm->pm_step = idedisk_pm_standby;


Re: amd64-agp vs. swsusp

2005-07-21 Thread Michal Schmidt

Pavel Machek wrote:

I'm trying to do something similar for x86_64. See the attached patch.
Unfortunately, it doesn't help. The behaviour seems unchanged (resume 
still works iff amd64-agp wasn't loaded before suspend).



Are you sure problem is on level4_pgt? We probably use constant
level4_pgt but split pages at some deeper level. You may want try
saving 3rd-level table, instead.


I'm not sure about that at all. That was just my attempt at cargo-cult 
programming :-)
OK, I'll try saving the 3rd-level table. It'll take me some time to 
figure out how to do that, however :-)


Michal


Re: [RFT] solve "swsusp plays yoyo" with disks

2005-07-21 Thread Michal Schmidt

Pavel Machek wrote:

Hi!

I'd like to get this tested under as many configurations as
possible. With this, your hdd should no longer do "yoyo" (spindown,
spinup, spindown) during suspend...


It looks like the patch is now in -mm (I use 2.6.13-rc3-mm1).
But my disks still yoyo during suspend. What more is needed? Some patch 
to ide-disk.c ?


Michal


Re: amd64-agp vs. swsusp

2005-07-21 Thread Michal Schmidt

Pavel Machek wrote:

Long time ago there were i386 problems because we assumed that kernel
is mapped in one big mapping and agp broke that assumption. Copying
pages backwards "fixed" it (and then we done proper fix). It should
not be, but it seems similar to this problem


Do you mean this patch of yours?:
http://www.ussg.iu.edu/hypermail/linux/kernel/0404.3/0640.html

I'm trying to do something similar for x86_64. See the attached patch.
Unfortunately, it doesn't help. The behaviour seems unchanged (resume 
still works iff amd64-agp wasn't loaded before suspend).


Michal
diff -Nurp -X dontdiff.new linux-mm/arch/x86_64/kernel/suspend_asm.S linux-mm.mich/arch/x86_64/kernel/suspend_asm.S
--- linux-mm/arch/x86_64/kernel/suspend_asm.S	2005-06-30 01:00:53.0 +0200
+++ linux-mm.mich/arch/x86_64/kernel/suspend_asm.S	2005-07-21 11:53:17.0 +0200
@@ -41,7 +41,7 @@ ENTRY(swsusp_arch_suspend)
 
 ENTRY(swsusp_arch_resume)
 	/* set up cr3 */	
-	leaq	init_level4_pgt(%rip),%rax
+	leaq	swsusp_level4_pgt(%rip),%rax
 	subq	$__START_KERNEL_map,%rax
 	movq	%rax,%cr3
 
diff -Nurp -X dontdiff.new linux-mm/arch/x86_64/mm/init.c linux-mm.mich/arch/x86_64/mm/init.c
--- linux-mm/arch/x86_64/mm/init.c	2005-07-18 19:48:12.0 +0200
+++ linux-mm.mich/arch/x86_64/mm/init.c	2005-07-21 11:21:36.0 +0200
@@ -310,10 +310,32 @@ void __init init_memory_mapping(unsigned
 
 extern struct x8664_pda cpu_pda[NR_CPUS];
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+/*
+ * Swap suspend & friends need this for resume because things like the intel-agp
+ * driver might have split up a kernel 4MB mapping.
+ */
+char __nosavedata swsusp_level4_pgt[PAGE_SIZE]
+	__attribute__ ((aligned (PAGE_SIZE)));
+
+static inline void save_pg_dir(void)
+{
+	memcpy(swsusp_level4_pgt, init_level4_pgt, PAGE_SIZE);
+}
+#else
+static inline void save_pg_dir(void)
+{
+}
+#endif
+
 /* Assumes all CPUs still execute in init_mm */
 void zap_low_mappings(void)
 {
-	pgd_t *pgd = pgd_offset_k(0UL);
+	pgd_t *pgd;
+
+	save_pg_dir();
+
+	pgd = pgd_offset_k(0UL);
 	pgd_clear(pgd);
 	flush_tlb_all();
 }


Re: amd64-agp vs. swsusp

2005-07-20 Thread Michal Schmidt

Rafael J. Wysocki wrote:

On Thursday, 21 of July 2005 00:07, Michal Schmidt wrote:
I also tried putting a printk before restore_processor_state(), but I'm 
not sure if it is safe to use printk there.



Yes, it is, but you may be unable to see the message if the box reboots before
it can be displayed.


OK, but then I also tried putting a 5s long busy wait there and the 
reset was not delayed. Therefore, the reset must be occurring before 
restore_processor_state().

Or is there a reason why
	for(i=0; i<5000; i++)
		udelay(1000);
wouldn't work as expected?

Michal


Re: amd64-agp vs. swsusp

2005-07-20 Thread Michal Schmidt

Rafael J. Wysocki wrote:

On Tuesday, 19 of July 2005 23:26, Michal Schmidt wrote:
I have rebuilt agpgart and amd64-agp into the kernel and now it has 
resumed successfully for the first time. Thank you for the hint!


But I still wonder why that makes a difference.



Before resume the module is not present.  When it gets loaded from the
image it probably runs with the assumption that the hardware was initialized
which is not correct.


It seems that the module doesn't even get a chance to run after resume. 
I've put some printks and udelays into kernel/power/swsusp.c and other 
places and found that the spontaneous reset already occurs in 
swsusp_arch_resume(), i.e. before the drivers get their resume methods 
called. This is what I have in swsusp_suspend() now:

	...
	save_processor_state();
	if ((error = swsusp_arch_suspend()))
		printk(KERN_ERR "Error %d suspending\n", error);
	/* Restore control flow magically appears here */
	restore_processor_state();
	printk(KERN_INFO "processor state restored!\n");	/* I added this */
	BUG_ON (nr_copy_pages_check != nr_copy_pages);
	restore_highmem();
	device_power_up();
	...

I'm recording the screen during resume with a digital camera to see 
whether the added printk is displayed before the reset, and I am now sure 
that the reset occurs before that. The last thing I see is:


Stopping tasks: --|
Freeing memory... done (0 pages freed)
swsusp: Need to copy 8121 pages

Then on the next frame of the recorded MPEG, the display is already 
beginning to dim as the computer is resetting.


I also tried putting a printk before restore_processor_state(), but I'm 
not sure if it is safe to use printk there.
So I tried putting a loop of 5000 x udelay(1000) there to see if the 
reset would be delayed by 5s. It was not delayed, so I think that the 
reset occurs before restore_processor_state().


Michal


Re: amd64-agp vs. swsusp

2005-07-19 Thread Michal Schmidt

Andreas Steinmetz wrote:

Michal Schmidt wrote:

Does resuming from swsuspend work for anyone with amd64-agp loaded?

On my system when I suspend with amd64-agp loaded, I get a spontaneous
reboot on resume. It reboots immediately after reading the saved image
from disk.
This is 100% reproducible.

Athlon 64 FX-53, Asus A8V Deluxe, Linux 2.6.13-rc3-mm1.



AMD Athlon(tm) 64 Processor 3000+, Acer Aspire

Linux gringo 2.6.13-rc3-gringo #36 Sun Jul 17 15:57:17 CEST 2005 x86_64
unknown unknown GNU/Linux

CONFIG_AGP=y
CONFIG_AGP_AMD64=y

swsusp works for me. Could it be mm, agp as a module or some speciality

   ^^^
That seems to be the problem!

of your hardware?


I have rebuilt agpgart and amd64-agp into the kernel and now it has 
resumed successfully for the first time. Thank you for the hint!


But I still wonder why that makes a difference.

Michal


amd64-agp vs. swsusp

2005-07-19 Thread Michal Schmidt

Hello,

Does resuming from swsuspend work for anyone with amd64-agp loaded?

On my system when I suspend with amd64-agp loaded, I get a spontaneous 
reboot on resume. It reboots immediately after reading the saved image 
from disk.

This is 100% reproducible.

Athlon 64 FX-53, Asus A8V Deluxe, Linux 2.6.13-rc3-mm1.

Regards,
Michal


Re: rt-preempt and x86_64?

2005-07-17 Thread Michal Schmidt

Alistair John Strachan wrote:

Hi Ingo,

(I searched the list for rt realtime x86_64 x86-64 before posting this, so I 
hope it's not a duplicate).


I've noticed -31 compiles without notable error or warning on x86-64, so I 
thought maybe it was a valid time to file a bug report about it not working.


The machine currently runs 2.6.12 but when booting with PREEMPT_RT mode on the 
same machine I get:


init[1]: segfault at 8010e9c4 rip 8010e9c4 rsp 
7fe28018

[...]


Do you have latency tracing enabled in the kernel config? Try disabling 
it. It's a known problem that it doesn't work on x86_64.


Michal


Re: Realtime Preemption, 2.6.12, Beginners Guide?

2005-07-06 Thread Michal Schmidt

Fernando Lopez-Lezcano wrote:

I see the same thing. "CONFIG_PRINTK_IGNORE_LOGLEVEL is not set" but
still printk ignores the loglevel (I commented out the #ifdef in
kernel/printk.c to make the spurious messages go away). 


The condition is reversed.
The '#ifdef CONFIG_PRINTK_IGNORE_LOGLEVEL' should be
'#ifndef CONFIG_PRINTK_IGNORE_LOGLEVEL'.
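
To spell out why the direction of the test matters, here is a small sketch. It is not the code from kernel/printk.c; DEMO_PRINTK_IGNORE_LOGLEVEL, demo_console_loglevel and demo_should_print() are hypothetical names, and the only assumption carried over is that a lower level number means a more important message.

```c
/* Hypothetical stand-in for the config option; leave undefined to honour log levels. */
/* #define DEMO_PRINTK_IGNORE_LOGLEVEL */

/* Messages with a level numerically below this value reach the console. */
static int demo_console_loglevel = 4;

/* Return 1 if a message at msg_level should be shown on the console. */
static int demo_should_print(int msg_level)
{
#ifndef DEMO_PRINTK_IGNORE_LOGLEVEL
	/* Option not set: apply the usual log level filter. */
	if (msg_level >= demo_console_loglevel)
		return 0;
#endif
	/* Option set: the filter above is compiled out, everything is shown. */
	return 1;
}
```

With #ifdef in place of #ifndef the filter would be compiled in only when the option is set, which gives exactly the reported symptom: the option is off and yet the log level is ignored.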

Michal


Re: PROBLEM: please remove reserved word "new" from kernel headers

2005-07-06 Thread Michal Schmidt

Rob Prowel wrote:
[1.] One line summary of the problem:


2.4 and 2.6 kernel headers use c++ reserved word "new"
as identifier in function prototypes.


Yes, the kernel is written in C, not C++.


using the identifier "new" in kernel headers that are
visible to applications programs is a bad idea.


Programs are not supposed to include kernel headers.
This is a FAQ, see the archives.

Michal


Re: How's the nforce4 support in Linux?

2005-03-26 Thread Michal Schmidt
Julien Wajsberg wrote:

Good point... I just tried, but forcedeth doesn't support netpoll. If
you have a pointer, I could try to implement it ;-)


Can you try the attached patch for forcedeth?
It compiles for me, but I don't have nForce hardware to test it.

Michal
--- linux-2.6.12-rc1/drivers/net/forcedeth.c.orig	2005-03-26 15:00:12.0 +0100
+++ linux-2.6.12-rc1/drivers/net/forcedeth.c	2005-03-26 15:08:56.0 +0100
@@ -1480,6 +1480,13 @@ static void nv_do_nic_poll(unsigned long
 	enable_irq(dev->irq);
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void nv_poll_controller(struct net_device *dev)
+{
+	nv_do_nic_poll((long) dev);
+}
+#endif
+
 static void nv_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
 	struct fe_priv *np = get_nvpriv(dev);
@@ -1962,6 +1969,9 @@ static int __devinit nv_probe(struct pci
 	dev->get_stats = nv_get_stats;
 	dev->change_mtu = nv_change_mtu;
 	dev->set_multicast_list = nv_set_multicast;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = nv_poll_controller;
+#endif
 	SET_ETHTOOL_OPS(dev, &ops);
 	dev->tx_timeout = nv_tx_timeout;
 	dev->watchdog_timeo = NV_WATCHDOG_TIMEO;


Re: dmesg command output

2005-01-27 Thread Michal Schmidt
cranium2003 wrote:

[...] On my RH9
i386 arch i got 16kb output from dmesg. how to
increase it?


man dmesg (parameter -s).
You may also want to increase the kernel buffer size in General Setup -> 
Kernel log buffer size (CONFIG_LOG_BUF_SHIFT).
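
As a quick worked example of what the shift means: the option is the base-2 logarithm of the buffer size in bytes. The helper below is hypothetical and only does the arithmetic. Whether the 16 kB seen here comes from a shift of 14 or from the read buffer dmesg uses by default is an assumption either way, which is why both knobs are worth checking.

```c
#include <stddef.h>

/* Size in bytes of a log buffer configured with the given shift (2^shift). */
static size_t demo_log_buf_len(unsigned int shift)
{
	return (size_t)1 << shift;
}
```

So a shift of 14 gives 16384 bytes (16 kB), and each increment of the shift doubles the buffer, e.g. 17 gives 131072 bytes (128 kB).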

Michal


Re: [PATCH] fix verify_command to allow burning more than 1 DVD

2005-01-18 Thread Michal Schmidt
Peter Osterlund wrote:

Michal Schmidt <[EMAIL PROTECTED]> writes:

--- linux-2.6.11-mm1/drivers/block/scsi_ioctl.c.orig	2005-01-17 20:42:40.0 +0100
+++ linux-2.6.11-mm1/drivers/block/scsi_ioctl.c	2005-01-17 20:43:14.0 +0100
@@ -197,9 +197,7 @@ static int verify_command(struct file *f
 	if (type & CMD_WRITE_SAFE) {
 		if (file->f_mode & FMODE_WRITE)
 			return 0;
-	}
-
-	if (!(type & CMD_WARNED)) {
+	} else if (!(type & CMD_WARNED)) {
 		cmd_type[cmd[0]] = CMD_WARNED;
 		printk(KERN_WARNING "scsi: unknown opcode 0x%02x\n", cmd[0]);
 	}

That patch will not write the warning message in some cases. 


Yes. In cases when the device is opened for reading and the command is 
known as safe_for_write.
Do we really want to print this warning in that case?


I think this patch is better:
---
 linux-petero/drivers/block/scsi_ioctl.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)
diff -puN drivers/block/scsi_ioctl.c~scsi-filter drivers/block/scsi_ioctl.c
--- linux/drivers/block/scsi_ioctl.c~scsi-filter	2005-01-18 23:38:37.966026728 +0100
+++ linux-petero/drivers/block/scsi_ioctl.c	2005-01-18 23:38:37.970026120 +0100
@@ -200,7 +200,7 @@ static int verify_command(struct file *f
 	}
 
 	if (!(type & CMD_WARNED)) {
-		cmd_type[cmd[0]] = CMD_WARNED;
+		cmd_type[cmd[0]] |= CMD_WARNED;
 		printk(KERN_WARNING "scsi: unknown opcode 0x%02x\n", cmd[0]);
 	}
 
_



Re: Unable to burn DVDs

2005-01-18 Thread Michal Schmidt
Bill Davidsen wrote:

Nick Sanders wrote:

For me when running growisofs with user permissions on 2.6.10 (ide-cd) it 
works perfectly 1st time but 2nd time fails with the error below. It works 
fine when run as root.

:-( unable to PREVENT MEDIA REMOVAL: Operation not permitted

As an aside audio cd burning with cdrecord works as long as the '-text' option 
isn't used, if it is the process hangs.

I reported a similar thing with cdrecord, writing a first session 
successfully using the -multi flag, but not being able to append to it 
or read the size with the "-msinfo" flag. I was totally blown off and 
told I didn't have permissions on the device, even though I was able to 
write to it.

I believe the true answer is that the SCSI command filter is blocking a 
command needed to perform the operation, probably a command to lock the 
door of the drive. In my case I have permissions to write the CD, just 
not to read the info needed to write additional sessions.

Hello,
Bill and Nick, could you try the attached patch that I sent to Jens 
Axboe yesterday? (You can see the mail with an explanation at
http://marc.theaimsgroup.com/?l=linux-kernel&m=110599420505734&w=2 )

Michal
--- linux-2.6.11-mm1/drivers/block/scsi_ioctl.c.orig	2005-01-17 20:42:40.0 +0100
+++ linux-2.6.11-mm1/drivers/block/scsi_ioctl.c	2005-01-17 20:43:14.0 +0100
@@ -197,9 +197,7 @@ static int verify_command(struct file *f
 	if (type & CMD_WRITE_SAFE) {
 		if (file->f_mode & FMODE_WRITE)
 			return 0;
-	}
-
-	if (!(type & CMD_WARNED)) {
+	} else if (!(type & CMD_WARNED)) {
 		cmd_type[cmd[0]] = CMD_WARNED;
 		printk(KERN_WARNING "scsi: unknown opcode 0x%02x\n", cmd[0]);
 	}


[PATCH] fix verify_command to allow burning more than 1 DVD

2005-01-17 Thread Michal Schmidt
Hello,
I use K3B with growisofs to burn DVDs. After boot I can burn a DVD as a 
normal user. But only the first one. When I want to burn another one, 
K3B complains that it is unable to prevent media removal. Then only root 
can burn DVDs.
The bug is in the kernel, in the function verify_command.
When a process opens the DVD recorder with O_RDONLY and issues a command 
which is marked safe_for_write, this function is supposed to just return 
-EPERM and do nothing more. However, it falls through to the warning 
branch, where the assignment cmd_type[cmd[0]] = CMD_WARNED overwrites the 
command's flags. From then on no non-privileged process is able to issue 
this command, even if it correctly opens the device with O_RDWR, because 
the command is no longer marked as CMD_WRITE_SAFE.
A patch is attached.
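
The sequence described above can be reproduced with a small userspace
model. This is only a sketch under stated assumptions: the flag values, the
MODEL_EPERM constant, the `patched` switch and the function name
verify_command_model are all made up for illustration, and the model covers
an unprivileged caller only (the root case is left out).

```c
#include <stdio.h>

/* Flag values are assumed for illustration; only the names come from the patch. */
#define CMD_READ_SAFE  0x01
#define CMD_WRITE_SAFE 0x02
#define CMD_WARNED     0x04
#define MODEL_EPERM    (-1)	/* stand-in for -EPERM */

/* Hypothetical stand-in for the per-opcode table inside verify_command. */
static unsigned char cmd_type[256];

/*
 * Simplified model of the check for an unprivileged process.
 * writable: nonzero if the device was opened for writing.
 * patched:  nonzero to model the "else if" change from the attached patch.
 * Returns 0 if the command is allowed, MODEL_EPERM otherwise.
 */
static int verify_command_model(int writable, unsigned char opcode, int patched)
{
	unsigned char type = cmd_type[opcode];

	if (type & CMD_READ_SAFE)
		return 0;

	if (type & CMD_WRITE_SAFE) {
		if (writable)
			return 0;
		if (patched)
			return MODEL_EPERM;	/* warning branch is skipped */
	}

	if (!(type & CMD_WARNED)) {
		cmd_type[opcode] = CMD_WARNED;	/* wipes CMD_WRITE_SAFE */
		printf("scsi: unknown opcode 0x%02x\n", (unsigned int)opcode);
	}

	return MODEL_EPERM;
}
```

With an entry initialised to CMD_WRITE_SAFE and patched set to 0, a writable
call succeeds, a single read-only call then fails and rewrites the entry,
and every later writable call fails as well. With patched set to 1 the
read-only call still fails, but the entry is left untouched and writable
calls keep working.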

Michal
--- linux-2.6.11-mm1/drivers/block/scsi_ioctl.c.orig	2005-01-17 20:42:40.0 +0100
+++ linux-2.6.11-mm1/drivers/block/scsi_ioctl.c	2005-01-17 20:43:14.0 +0100
@@ -197,9 +197,7 @@ static int verify_command(struct file *f
 	if (type & CMD_WRITE_SAFE) {
 		if (file->f_mode & FMODE_WRITE)
 			return 0;
-	}
-
-	if (!(type & CMD_WARNED)) {
+	} else if (!(type & CMD_WARNED)) {
 		cmd_type[cmd[0]] = CMD_WARNED;
 		printk(KERN_WARNING "scsi: unknown opcode 0x%02x\n", cmd[0]);
 	}