Re: [-mm patch] make "struct menu_governor" static (again)
This is already fixed in the most recent ACPI CPUIDLE tree.

Thanks,
Adam

On Mon, 2007-08-27 at 23:27 +0200, Adrian Bunk wrote:
> On Wed, Aug 22, 2007 at 02:06:48AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm2:
> >...
> >  git-acpi.patch
> >...
> >  git trees
> >...
>
> "struct menu_governor" needlessly again became global.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> ---
> cb33b296204127cf50df54b84b2d79e152fb924b
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index f5a8865..8d3fdc5 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -117,7 +117,7 @@ static int menu_enable_device(struct cpuidle_device *dev)
>  	return 0;
>  }
>
> -struct cpuidle_governor menu_governor = {
> +static struct cpuidle_governor menu_governor = {
>  	.name =		"menu",
>  	.rating =	20,
>  	.enable =	menu_enable_device,

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: 2.6.21-rc6-mm1
On Tue, 2007-04-10 at 15:20 -0700, Venki Pallipadi wrote:
> On Mon, Apr 09, 2007 at 07:40:52PM +0200, Rafael J. Wysocki wrote:
> > On Monday, 9 April 2007 18:14, Pallipadi, Venkatesh wrote:
> > >
> > > >-----Original Message-----
> > > >From: Rafael J. Wysocki [mailto:[EMAIL PROTECTED]
> > > >Sent: Monday, April 09, 2007 9:08 AM
> > > >To: Andrew Morton
> > > >Cc: linux-kernel@vger.kernel.org; [EMAIL PROTECTED];
> > > >[EMAIL PROTECTED]; Pallipadi, Venkatesh
> > > >Subject: Re: 2.6.21-rc6-mm1
> > > >
> > > >On Sunday, 8 April 2007 23:35, Andrew Morton wrote:
> > > >>
> > > >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/
> > > >>
> > > >> - Lots of x86 updates
> > > >>
> > > >> - This is a 25MB diff against mainline, which is rather large.
> > > >
> > > >The cpuidle thing tends to hang my x86-64 machines on boot.
> > > >
> > >
> > > Hi Rafael,
> > >
> > > At what point during boot does it hang?
> >
> > When mounting the root filesystem.  It hangs completely, even the magic
> > SysRq doesn't work
>
> Rafael: Below patch should fix the hang.
> Len: Please include this patch in acpi-test.
>
> Thanks,
> Venki
>
> Prevent hang on x86-64, when ACPI processor driver is added as a module on
> a system that does not support C-states.
>
> x86-64 expects all idle handlers to enable interrupts before returning from
> idle handler. This is due to enter_idle(), exit_idle() races. Make
> cpuidle_idle_call() conform to this when there is no pm_idle_old.
>
> Also, cpuidle look at the return values of attch_driver() and set
> current_driver to NULL if attach fails on all CPUs.

My vote would be to instead remove enter_idle() and exit_idle() from
x86-64, just as was done with i386.  Performance monitoring
infrastructure shouldn't be interfering with the idle interrupt
delivery, as that could only hurt performance...  Besides, there's
probably a better way of doing this than an idle notifier anyway.

-Adam
Re: [RFC][PATCH 3/3] add the 'menu' cpuidle governor
On Mon, 2007-03-26 at 13:36 +0800, Shaohua Li wrote:
> Hi,
> On Sat, 2007-03-24 at 03:47 -0400, Adam Belay wrote:
> > This patch adds the 'menu' governor, as was described in my first email.
> >
>
> > +/**
> > + * menu_select - selects the next idle state to enter
> > + * @dev: the CPU
> > + */
> > +static int menu_select(struct cpuidle_device *dev)
> > +{
> > +	struct menu_device *data = &__get_cpu_var(menu_devices);
> > +	int i, expected_us, max_state = dev->state_count;
> > +
> > +	/* discard BM history because it is sticky */
> > +	cpuidle_get_bm_activity();
> Why discard BM history here? This way the next bm check almost always
> return 0.

Yes, although in testing it detects BM activity more often than one might
think, I agree, this is probably too aggressive.  At the time, I was
trying to avoid situations where BM_STS goes high early during a long
busy period and as a result becomes stale.

> BTW, bm activity is global (Not cpu specific), we'd better account it
> system wide.

Yes, but do we need to support BM_STS in the SMP case?

>
> > +	/* determine the expected residency time */
> > +	expected_us = (s32) ktime_to_ns(tick_nohz_get_sleep_length()) / 1000;
> > +	expected_us = min(expected_us, data->break_last_us);
> > +
> > +	/* determine the maximum state compatible with current BM status */
> > +	if (cpuidle_get_bm_activity())
> > +		data->bm_elapsed_us = 0;
> > +	if (data->bm_elapsed_us <= data->bm_holdoff_us)
> > +		max_state = data->deepest_bm_state + 1;
> > +
> > +	/* find the deepest idle state that satisfies our constraints */
> > +	for (i = 1; i < max_state; i++) {
> > +		struct cpuidle_state *s = &dev->states[i];
> > +		if (s->target_residency > expected_us)
> > +			break;
> > +		if (s->exit_latency > system_latency_constraint())
> > +			break;
> > +	}
> > +
> > +	data->last_state_idx = i - 1;
> > +	data->idle_jiffies = tick_nohz_get_idle_jiffies();
> > +	return i - 1;
> > +}
> > +
> > +/**
> > + * menu_reflect - attempts to guess what happened after entry
> > + * @dev: the CPU
> > + *
> > + * NOTE: it's important to be fast here because this operation will add to
> > + *       the overall exit latency.
> > + */
> > +static void menu_reflect(struct cpuidle_device *dev)
> > +{
> > +	struct menu_device *data = &__get_cpu_var(menu_devices);
> > +	int last_idx = data->last_state_idx;
> > +	int measured_us = cpuidle_get_last_residency(dev);
> > +	struct cpuidle_state *target = &dev->states[last_idx];
> > +
> > +	/*
> > +	 * Ugh, this idle state doesn't support residency measurements, so we
> > +	 * are basically lost in the dark.  As a compromise, assume we slept
> > +	 * for one full standard timer tick.  However, be aware that this
> > +	 * could potentially result in a suboptimal state transition.
> > +	 */
> > +	if (!(target->flags & CPUIDLE_FLAG_TIME_VALID))
> > +		measured_us = USEC_PER_SEC / HZ;
> > +
> > +	data->bm_elapsed_us += measured_us;
> > +	data->break_elapsed_us += measured_us;
> See the system state: idle->running->idle
> Looks the bm_elapsed_us and break_elapsed_us account ignored the running
> state between the two idles. Eg, the 'running' might generate a lot of
> bm activity, then maybe we should reset bm_elapsed_us in the next
> 'idle'.

I ignore the time between idle states because I'm only interested in
accounting the idle sleep behavior.  A more sophisticated strategy might
also account the running time between idles in some way.  However, it is
worth noting that a busy system has the indirect effect of shortening the
idle residency times.  I think removing the BM_STS clear attempt at the
beginning should help to reset bm_elapsed_us after sufficiently long busy
periods.

>
> Thanks,
> Shaohua

Thanks for the feedback.

-Adam
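The accounting discussed in this exchange can be sketched in plain user-space C. This is only an illustration, assuming hypothetical names (`struct menu_stats`, `reflect_sketch`, the `woken_by_timer` flag); it is not the governor code itself. It shows that only idle residency is accumulated, and that the running time between two idle periods never enters either counter.

```c
/* Hypothetical stand-in for the per-CPU counters kept by the governor. */
struct menu_stats {
	int break_last_us;	/* idle time between the last two non-timer breaks */
	int break_elapsed_us;	/* idle time accumulated since the last non-timer break */
	int bm_elapsed_us;	/* idle time accumulated since bus master activity was seen */
};

/*
 * Account one completed idle period of measured_us microseconds.
 * woken_by_timer is an assumed input; the patch infers it by comparing
 * idle jiffies before and after the sleep.
 */
void reflect_sketch(struct menu_stats *s, int measured_us, int woken_by_timer)
{
	s->bm_elapsed_us += measured_us;
	s->break_elapsed_us += measured_us;

	/* A non-timer break closes the current interval and starts a new one. */
	if (!woken_by_timer) {
		s->break_last_us = s->break_elapsed_us;
		s->break_elapsed_us = 0;
	}
}
```

Two idle periods of 100 us and 250 us, the second ended by a device interrupt, leave break_last_us at 350 regardless of how long the CPU ran in between.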
[RFC][PATCH 2/3] export time until next timer interrupt using NOHZ
This patch exposes information about the time remaining until the next
timer interrupt expires by utilizing the dynticks infrastructure.  It
also modifies the main idle loop to allow dynticks to handle
non-interrupt break events (e.g. DMA).  Finally, it exposes sleep ticks
information to external code.  Thomas Gleixner is responsible for much of
the code in this patch.  However, I've made some additional changes, so
I'm probably responsible if there are any bugs or oversights :)

Thanks,
Adam

 arch/i386/kernel/process.c |    3 ++-
 include/linux/tick.h       |   10 ++
 kernel/softirq.c           |    5 -
 kernel/time/tick-sched.c   |   24
 4 files changed, 36 insertions(+), 6 deletions(-)

diff -urN a/arch/i386/kernel/process.c b/arch/i386/kernel/process.c
--- a/arch/i386/kernel/process.c	2007-03-23 23:02:16.0 -0400
+++ b/arch/i386/kernel/process.c	2007-03-24 01:48:33.0 -0400
@@ -174,13 +174,14 @@
 
 	/* endless idle loop with no priority at all */
 	while (1) {
-		tick_nohz_stop_sched_tick();
 		while (!need_resched()) {
 			void (*idle)(void);
 
 			if (__get_cpu_var(cpu_idle_state))
 				__get_cpu_var(cpu_idle_state) = 0;
 
+			tick_nohz_stop_sched_tick();
+
 			rmb();
 			idle = pm_idle;
 
diff -urN a/include/linux/tick.h b/include/linux/tick.h
--- a/include/linux/tick.h	2007-03-23 23:03:03.0 -0400
+++ b/include/linux/tick.h	2007-03-24 01:39:03.0 -0400
@@ -40,6 +40,7 @@
  * @idle_sleeps:	Number of idle calls, where the sched tick was stopped
  * @idle_entrytime:	Time when the idle call was entered
  * @idle_sleeptime:	Sum of the time slept in idle with sched tick stopped
+ * @sleep_length:	Duration of the current idle sleep
  */
 struct tick_sched {
 	struct hrtimer			sched_timer;
@@ -52,6 +53,7 @@
 	unsigned long			idle_sleeps;
 	ktime_t				idle_entrytime;
 	ktime_t				idle_sleeptime;
+	ktime_t				sleep_length;
 	unsigned long			last_jiffies;
 	unsigned long			next_jiffies;
 	ktime_t				idle_expires;
@@ -100,10 +102,18 @@
 extern void tick_nohz_stop_sched_tick(void);
 extern void tick_nohz_restart_sched_tick(void);
 extern void tick_nohz_update_jiffies(void);
+extern ktime_t tick_nohz_get_sleep_length(void);
+extern unsigned long tick_nohz_get_idle_jiffies(void);
 # else
 static inline void tick_nohz_stop_sched_tick(void) { }
 static inline void tick_nohz_restart_sched_tick(void) { }
 static inline void tick_nohz_update_jiffies(void) { }
+static inline ktime_t tick_nohz_get_sleep_length(void)
+{
+	ktime_t len = { .tv64 = NSEC_PER_SEC/HZ };
+
+	return len;
+}
 # endif /* !NO_HZ */
 
 #endif
diff -urN a/kernel/softirq.c b/kernel/softirq.c
--- a/kernel/softirq.c	2007-03-23 23:03:03.0 -0400
+++ b/kernel/softirq.c	2007-03-24 01:54:11.0 -0400
@@ -303,11 +303,6 @@
 	if (!in_interrupt() && local_softirq_pending())
 		invoke_softirq();
 
-#ifdef CONFIG_NO_HZ
-	/* Make sure that timer wheel updates are propagated */
-	if (!in_interrupt() && idle_cpu(smp_processor_id()) && !need_resched())
-		tick_nohz_stop_sched_tick();
-#endif
 	preempt_enable_no_resched();
 }
 
diff -urN a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
--- a/kernel/time/tick-sched.c	2007-03-23 23:03:03.0 -0400
+++ b/kernel/time/tick-sched.c	2007-03-24 01:44:55.0 -0400
@@ -153,6 +153,7 @@
 	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies, flags;
 	struct tick_sched *ts;
 	ktime_t last_update, expires, now, delta;
+	struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
 	int cpu;
 
 	local_irq_save(flags);
@@ -250,11 +251,34 @@
 out:
 	ts->next_jiffies = next_jiffies;
 	ts->last_jiffies = last_jiffies;
+	ts->sleep_length = ktime_sub(dev->next_event, now);
 end:
 	local_irq_restore(flags);
 }
 
 /**
+ * tick_nohz_get_sleep_length - return the length of the current sleep
+ *
+ * Called from power state control code with interrupts disabled
+ */
+ktime_t tick_nohz_get_sleep_length(void)
+{
+	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+
+	return ts->sleep_length;
+}
+
+/**
+ * tick_nohz_get_idle_jiffies - returns the current idle jiffie count
+ */
+unsigned long tick_nohz_get_idle_jiffies(void)
+{
+	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+
+	return ts->idle_jiffies;
+}
+
+/**
  * nohz_restart_sched_tick - restart the idle tick from the idle task
  *
  * Restart the idle tick when the CPU is woken up from idle
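The sleep length exported by this patch comes down to one subtraction when dynticks is active, and to one tick period otherwise. A minimal user-space sketch, assuming a hypothetical helper `sleep_length_ns_sketch` with plain 64-bit nanosecond arguments in place of `ktime_t` (the `nohz_active` and `hz` parameters are also assumptions made for illustration):

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000LL

/*
 * Time remaining until the next timer event, in nanoseconds.
 * With dynticks: next programmed event minus the current time.
 * Without dynticks: the fallback is one regular tick period.
 */
int64_t sleep_length_ns_sketch(int nohz_active, int64_t next_event_ns,
			       int64_t now_ns, int hz)
{
	if (!nohz_active)
		return SKETCH_NSEC_PER_SEC / hz;

	return next_event_ns - now_ns;
}
```

With HZ=250 the fallback is 4,000,000 ns, which matches the `NSEC_PER_SEC/HZ` stub in the `!NO_HZ` branch of tick.h.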
[RFC][PATCH 3/3] add the 'menu' cpuidle governor
This patch adds the 'menu' governor, as was described in my first email.

Thanks,
Adam

 Kconfig            |   11 +++
 governors/Makefile |    1
 governors/menu.c   |  152 +
 3 files changed, 164 insertions(+)

diff -urN a/drivers/cpuidle/governors/Makefile b/drivers/cpuidle/governors/Makefile
--- a/drivers/cpuidle/governors/Makefile	2007-03-23 23:09:45.0 -0400
+++ b/drivers/cpuidle/governors/Makefile	2007-03-24 02:10:29.0 -0400
@@ -3,3 +3,4 @@
 #
 
 obj-$(CONFIG_CPU_IDLE_GOV_LADDER) += ladder.o
+obj-$(CONFIG_CPU_IDLE_GOV_MENU) += menu.o
diff -urN a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
--- a/drivers/cpuidle/governors/menu.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/cpuidle/governors/menu.c	2007-03-23 23:51:15.0 -0400
@@ -0,0 +1,152 @@
+/*
+ * menu.c - the menu idle governor
+ *
+ * Copyright (C) 2006-2007 Adam Belay <[EMAIL PROTECTED]>
+ *
+ * This code is licenced under the GPL.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define BM_HOLDOFF	20000	/* 20 ms */
+
+struct menu_device {
+	int		last_state_idx;
+	int		deepest_bm_state;
+
+	int		break_last_us;
+	int		break_elapsed_us;
+
+	int		bm_elapsed_us;
+	int		bm_holdoff_us;
+
+	unsigned long	idle_jiffies;
+};
+
+static DEFINE_PER_CPU(struct menu_device, menu_devices);
+
+/**
+ * menu_select - selects the next idle state to enter
+ * @dev: the CPU
+ */
+static int menu_select(struct cpuidle_device *dev)
+{
+	struct menu_device *data = &__get_cpu_var(menu_devices);
+	int i, expected_us, max_state = dev->state_count;
+
+	/* discard BM history because it is sticky */
+	cpuidle_get_bm_activity();
+
+	/* determine the expected residency time */
+	expected_us = (s32) ktime_to_ns(tick_nohz_get_sleep_length()) / 1000;
+	expected_us = min(expected_us, data->break_last_us);
+
+	/* determine the maximum state compatible with current BM status */
+	if (cpuidle_get_bm_activity())
+		data->bm_elapsed_us = 0;
+	if (data->bm_elapsed_us <= data->bm_holdoff_us)
+		max_state = data->deepest_bm_state + 1;
+
+	/* find the deepest idle state that satisfies our constraints */
+	for (i = 1; i < max_state; i++) {
+		struct cpuidle_state *s = &dev->states[i];
+		if (s->target_residency > expected_us)
+			break;
+		if (s->exit_latency > system_latency_constraint())
+			break;
+	}
+
+	data->last_state_idx = i - 1;
+	data->idle_jiffies = tick_nohz_get_idle_jiffies();
+	return i - 1;
+}
+
+/**
+ * menu_reflect - attempts to guess what happened after entry
+ * @dev: the CPU
+ *
+ * NOTE: it's important to be fast here because this operation will add to
+ *       the overall exit latency.
+ */
+static void menu_reflect(struct cpuidle_device *dev)
+{
+	struct menu_device *data = &__get_cpu_var(menu_devices);
+	int last_idx = data->last_state_idx;
+	int measured_us = cpuidle_get_last_residency(dev);
+	struct cpuidle_state *target = &dev->states[last_idx];
+
+	/*
+	 * Ugh, this idle state doesn't support residency measurements, so we
+	 * are basically lost in the dark.  As a compromise, assume we slept
+	 * for one full standard timer tick.  However, be aware that this
+	 * could potentially result in a suboptimal state transition.
+	 */
+	if (!(target->flags & CPUIDLE_FLAG_TIME_VALID))
+		measured_us = USEC_PER_SEC / HZ;
+
+	data->bm_elapsed_us += measured_us;
+	data->break_elapsed_us += measured_us;
+
+	/*
+	 * Did something other than the timer interrupt cause the break event?
+	 */
+	if (tick_nohz_get_idle_jiffies() == data->idle_jiffies) {
+		data->break_last_us = data->break_elapsed_us;
+		data->break_elapsed_us = 0;
+	}
+}
+
+/**
+ * menu_scan_device - scans a CPU's states and does setup
+ * @dev: the CPU
+ */
+static void menu_scan_device(struct cpuidle_device *dev)
+{
+	struct menu_device *data = &per_cpu(menu_devices, dev->cpu);
+	int i;
+
+	data->last_state_idx = 0;
+	data->break_last_us = 0;
+	data->break_elapsed_us = 0;
+	data->bm_elapsed_us = 0;
+	data->bm_holdoff_us = BM_HOLDOFF;
+
+	for (i = 1; i < dev->state_count; i++)
+		if (dev->states[i].flags & CPUIDLE_FLAG_CHECK_BM)
+			break;
+	data->deepest_bm_state = i - 1;
+}
+
+struct cpuidle_governor menu_governor = {
+	.name =		"menu",
+	.scan =		menu_scan_device,
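One detail in menu_select() may be worth a second look. In `(s32) ktime_to_ns(...) / 1000` the cast binds tighter than the division, so the nanosecond count appears to be narrowed to 32 bits before it is scaled, and a sleep longer than roughly 2.1 seconds would not fit. The sketch below is a user-space illustration of dividing first and then clamping; the helper name `ns_to_us_clamped` and the clamping policy are assumptions, not something taken from the patch.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Convert a 64-bit nanosecond duration to microseconds held in an int.
 * The division happens in 64 bits, so long sleeps are scaled before they
 * are narrowed; out-of-range results are clamped instead of wrapping.
 */
int ns_to_us_clamped(int64_t ns)
{
	int64_t us;

	if (ns <= 0)
		return 0;

	us = ns / 1000;
	return us > INT_MAX ? INT_MAX : (int)us;
}
```

A three second sleep (3,000,000,000 ns) comes out as 3,000,000 us here, whereas narrowing to 32 bits first cannot represent the input at all.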
[RFC][PATCH 0/3] A Dynticks Aware Processor Idle PM Governor
Hi All,

Here is my first take at implementing an idle PM governor that takes
full advantage of NO_HZ.  I call it the 'menu' governor because it
considers the full list of idle states before each entry.

I've kept the implementation fairly simple.  It attempts to guess the
next residency time and then chooses a state that would meet at least
the break-even point between power savings and entry cost.  To this end,
it selects the deepest idle state that satisfies the following
constraints:

     1. If the idle time elapsed since bus master activity was detected
        is below a threshold (currently 20 ms), then limit the selection
        to C2-type or above.
     2. Do not choose a state with a break-even residency that exceeds
        the expected time remaining until the next timer interrupt.
     3. Do not choose a state with a break-even residency that exceeds
        the elapsed time between the last pair of break events,
        excluding timer interrupts.

This governor has an advantage over the "ladder" governor because it
proactively checks how much time remains until the next timer interrupt
using the tick infrastructure.  Also, it handles device interrupt
activity more intelligently by not including timer interrupts in break
event calculations.  Finally, it doesn't make policy decisions using the
number of state entries, which can have variable residency times (NO_HZ
makes these potentially very large), and instead only considers sleep
time deltas.

The menu governor can be selected during runtime using the cpuidle sysfs
interface like so:

"echo "menu" > /sys/devices/system/cpu/cpuidle/current_governor"

This patchset applies against 2.6.21-rc4 plus the latest from the acpi
testing tree, which is available here:

ftp://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.6.21/acpi-test-20070126-2.6.21-rc4.diff.bz2

I'd really appreciate any comments, benchmarks, or suggestions.

Cheers,
Adam
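The three constraints in the cover letter can be sketched as a small user-space function. This is a simplified illustration under stated assumptions: `struct idle_state_sketch`, `pick_state_sketch` and all parameter names are hypothetical, state 0 is assumed to be always allowed, and the latency check that the real patch also applies is left out.

```c
/* Hypothetical, reduced description of one idle state. */
struct idle_state_sketch {
	int break_even_us;	/* minimum residency for the state to pay off */
};

/*
 * Return the index of the deepest state that satisfies the constraints.
 * bm_recent is nonzero while bus master activity was seen within the
 * holdoff window; deepest_bm_state is the deepest state usable then.
 */
int pick_state_sketch(const struct idle_state_sketch *states, int count,
		      int next_timer_us, int last_break_us,
		      int bm_recent, int deepest_bm_state)
{
	/* Constraints 2 and 3: expect the shorter of the two intervals. */
	int expected_us = next_timer_us < last_break_us ?
			  next_timer_us : last_break_us;
	/* Constraint 1: cap the depth while bus master activity is recent. */
	int max_state = bm_recent ? deepest_bm_state + 1 : count;
	int i;

	for (i = 1; i < max_state; i++) {
		if (states[i].break_even_us > expected_us)
			break;
	}

	return i - 1;
}
```

With break-even times of 0, 20, 200 and 1000 us, an expected idle time of 500 us selects state 2; recent bus master activity with deepest_bm_state set to 1 lowers that to state 1.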
[RFC][PATCH 1/3] cpuidle governor API changes
This patch prepares cpuidle for the menu governor.  It adds an optional
stage after idle state entry to give the governor an opportunity to
check why the state was exited.  Also it makes sure the idle loop
returns after each state entry, allowing the appropriate dynticks code
to run.

Thanks,
Adam

 drivers/cpuidle/cpuidle.c          |   33 ++---
 drivers/cpuidle/governor.c         |    2 +-
 drivers/cpuidle/governors/ladder.c |    2 +-
 include/linux/cpuidle.h            |    4 ++--
 4 files changed, 18 insertions(+), 23 deletions(-)

diff -urN a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
--- a/drivers/cpuidle/cpuidle.c	2007-03-23 23:09:45.0 -0400
+++ b/drivers/cpuidle/cpuidle.c	2007-03-24 00:22:09.0 -0400
@@ -30,12 +30,10 @@
  * cpuidle_idle_call - the main idle loop
  *
  * NOTE: no locks or semaphores should be used here
- * FIXME: DYNTICKS handling
  */
 static void cpuidle_idle_call(void)
 {
 	struct cpuidle_device *dev = &__get_cpu_var(cpuidle_devices);
-
 	struct cpuidle_state *target_state;
 	int next_state;
 
@@ -46,24 +44,21 @@
 		return;
 	}
 
-	if (cpuidle_curr_governor->prepare_idle)
-		cpuidle_curr_governor->prepare_idle(dev);
-
-	while(!need_resched()) {
-		next_state = cpuidle_curr_governor->select_state(dev);
-		if (need_resched())
-			break;
-
-		target_state = &dev->states[next_state];
-
-		dev->last_residency = target_state->enter(dev, target_state);
-		dev->last_state = target_state;
-		target_state->time += dev->last_residency;
-		target_state->usage++;
+	/* ask the governor for the next state */
+	next_state = cpuidle_curr_governor->select(dev);
+	if (need_resched())
+		return;
+	target_state = &dev->states[next_state];
 
-		if (dev->status != CPUIDLE_STATUS_DOIDLE)
-			break;
-	}
+	/* enter the state and update stats */
+	dev->last_residency = target_state->enter(dev, target_state);
+	dev->last_state = target_state;
+	target_state->time += dev->last_residency;
+	target_state->usage++;
+
+	/* give the governor an opportunity to reflect on the outcome */
+	if (cpuidle_curr_governor->reflect)
+		cpuidle_curr_governor->reflect(dev);
 }
 
 /**
diff -urN a/drivers/cpuidle/governor.c b/drivers/cpuidle/governor.c
--- a/drivers/cpuidle/governor.c	2007-03-23 23:09:45.0 -0400
+++ b/drivers/cpuidle/governor.c	2007-03-24 00:31:04.0 -0400
@@ -124,7 +124,7 @@
 {
 	int ret = -EEXIST;
 
-	if (!gov || !gov->select_state)
+	if (!gov || !gov->select)
 		return -EINVAL;
 
 	mutex_lock(&cpuidle_lock);
diff -urN a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
--- a/drivers/cpuidle/governors/ladder.c	2007-03-23 23:09:45.0 -0400
+++ b/drivers/cpuidle/governors/ladder.c	2007-03-23 23:26:06.0 -0400
@@ -202,7 +202,7 @@
 	.init		= ladder_init_device,
 	.exit		= ladder_exit_device,
 	.scan		= ladder_scan_device,
-	.select_state	= ladder_select_state,
+	.select		= ladder_select_state,
 	.owner		= THIS_MODULE,
 };
diff -urN a/include/linux/cpuidle.h b/include/linux/cpuidle.h
--- a/include/linux/cpuidle.h	2007-03-23 23:09:46.0 -0400
+++ b/include/linux/cpuidle.h	2007-03-23 23:24:02.0 -0400
@@ -158,8 +158,8 @@
 	void (*exit)(struct cpuidle_device *dev);
 	void (*scan)(struct cpuidle_device *dev);
 
-	void (*prepare_idle)(struct cpuidle_device *dev);
-	int  (*select_state)(struct cpuidle_device *dev);
+	int  (*select)		(struct cpuidle_device *dev);
+	void (*reflect)		(struct cpuidle_device *dev);
 
 	struct module		*owner;
 };
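The shape of the reworked governor interface can be sketched outside the kernel as follows. Everything prefixed `sim_` is hypothetical and exists only for illustration: a mandatory select hook, an optional reflect hook, and an idle call that enters one state and returns instead of looping.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU device. */
struct sim_device {
	int entered_state;	/* last state "entered" by sim_idle_call() */
	int reflect_calls;	/* number of times reflect ran */
};

struct sim_governor {
	int  (*select)(struct sim_device *dev);		/* mandatory */
	void (*reflect)(struct sim_device *dev);	/* optional */
};

/* Registration rejects a governor that cannot select a state. */
int sim_register_governor(const struct sim_governor *gov)
{
	if (!gov || !gov->select)
		return -EINVAL;
	return 0;
}

/* One pass: select, enter (simulated), then reflect if provided. */
int sim_idle_call(const struct sim_governor *gov, struct sim_device *dev)
{
	int next_state = gov->select(dev);

	dev->entered_state = next_state;	/* stands in for ->enter() */

	if (gov->reflect)
		gov->reflect(dev);

	return next_state;
}

/* Sample callbacks used to exercise the sketch. */
int sim_select_one(struct sim_device *dev)
{
	(void)dev;
	return 1;
}

void sim_count_reflect(struct sim_device *dev)
{
	dev->reflect_calls++;
}
```

Returning after a single state entry leaves the outer idle loop in charge, which is what lets the dynticks code run between entries.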
[RFC][PATCH 0/3] A Dynticks Aware Processor Idle PM Governor
Hi All, Here is my first take at implementing an idle PM governor that takes full advantage of NO_HZ. I call it the 'menu' governor because it considers the full list of idle states before each entry. I've kept the implementation fairly simple. It attempts to guess the next residency time and then chooses a state that would meet at least the break-even point between power savings and entry cost. To this end, it selects the deepest idle state that satisfies the following constraints: 1. If the idle time elapsed since bus master activity was detected is below a threshold (currently 20 ms), then limit the selection to C2-type or above. 2. Do not choose a state with a break-even residency that exceeds the expected time remaining until the next timer interrupt. 3. Do not choose a state with a break-even residency that exceeds the elapsed time between the last pair of break events, excluding timer interrupts. This governor has an advantage over ladder governor because it proactively checks how much time remains until the next timer interrupt using the tick infrastructure. Also, it handles device interrupt activity more intelligently by not including timer interrupts in break event calculations. Finally, it doesn't make policy decisions using the number of state entries, which can have variable residency times (NO_HZ makes these potentially very large), and instead only considers sleep time deltas. The menu governor can be selected during runtime using the cpuidle sysfs interface like so: echo menu /sys/devices/system/cpu/cpuidle/current_governor This patchset applies against 2.6.21-rc4 plus the latest from the acpi testing tree, which is available here: ftp://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.6.21/acpi-test-20070126-2.6.21-rc4.diff.bz2 I'd really appreciate any comments, benchmarks, or suggestions. 
Cheers, Adam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC][PATCH 1/3] cpuidle governor API changes
This patch prepares cpuidle for the menu governor. It adds an optional stage after idle state entry to give the governor an opportunity to check why the state was exited. Also it makes sure the idle loop returns after each state entry, allowing the appropriate dynticks code to run. Thanks, Adam drivers/cpuidle/cpuidle.c | 33 ++--- drivers/cpuidle/governor.c |2 +- drivers/cpuidle/governors/ladder.c |2 +- include/linux/cpuidle.h|4 ++-- 4 files changed, 18 insertions(+), 23 deletions(-) diff -urN a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c --- a/drivers/cpuidle/cpuidle.c 2007-03-23 23:09:45.0 -0400 +++ b/drivers/cpuidle/cpuidle.c 2007-03-24 00:22:09.0 -0400 @@ -30,12 +30,10 @@ * cpuidle_idle_call - the main idle loop * * NOTE: no locks or semaphores should be used here - * FIXME: DYNTICKS handling */ static void cpuidle_idle_call(void) { struct cpuidle_device *dev = __get_cpu_var(cpuidle_devices); - struct cpuidle_state *target_state; int next_state; @@ -46,24 +44,21 @@ return; } - if (cpuidle_curr_governor-prepare_idle) - cpuidle_curr_governor-prepare_idle(dev); - - while(!need_resched()) { - next_state = cpuidle_curr_governor-select_state(dev); - if (need_resched()) - break; - - target_state = dev-states[next_state]; - - dev-last_residency = target_state-enter(dev, target_state); - dev-last_state = target_state; - target_state-time += dev-last_residency; - target_state-usage++; + /* ask the governor for the next state */ + next_state = cpuidle_curr_governor-select(dev); + if (need_resched()) + return; + target_state = dev-states[next_state]; - if (dev-status != CPUIDLE_STATUS_DOIDLE) - break; - } + /* enter the state and update stats */ + dev-last_residency = target_state-enter(dev, target_state); + dev-last_state = target_state; + target_state-time += dev-last_residency; + target_state-usage++; + + /* give the governor an opportunity to reflect on the outcome */ + if (cpuidle_curr_governor-reflect) + cpuidle_curr_governor-reflect(dev); } /** diff -urN 
a/drivers/cpuidle/governor.c b/drivers/cpuidle/governor.c --- a/drivers/cpuidle/governor.c2007-03-23 23:09:45.0 -0400 +++ b/drivers/cpuidle/governor.c2007-03-24 00:31:04.0 -0400 @@ -124,7 +124,7 @@ { int ret = -EEXIST; - if (!gov || !gov-select_state) + if (!gov || !gov-select) return -EINVAL; mutex_lock(cpuidle_lock); diff -urN a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c --- a/drivers/cpuidle/governors/ladder.c2007-03-23 23:09:45.0 -0400 +++ b/drivers/cpuidle/governors/ladder.c2007-03-23 23:26:06.0 -0400 @@ -202,7 +202,7 @@ .init = ladder_init_device, .exit = ladder_exit_device, .scan = ladder_scan_device, - .select_state = ladder_select_state, + .select = ladder_select_state, .owner =THIS_MODULE, }; diff -urN a/include/linux/cpuidle.h b/include/linux/cpuidle.h --- a/include/linux/cpuidle.h 2007-03-23 23:09:46.0 -0400 +++ b/include/linux/cpuidle.h 2007-03-23 23:24:02.0 -0400 @@ -158,8 +158,8 @@ void (*exit)(struct cpuidle_device *dev); void (*scan)(struct cpuidle_device *dev); - void (*prepare_idle)(struct cpuidle_device *dev); - int (*select_state)(struct cpuidle_device *dev); + int (*select) (struct cpuidle_device *dev); + void (*reflect) (struct cpuidle_device *dev); struct module *owner; }; - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
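The reworked cpuidle_idle_call() flow can be sketched outside the kernel roughly as follows. This is a user-space mock, not the patch itself: every `mock_` type, the residency values, and the `reflect_calls` counter are invented for illustration, and the need_resched() check is left out. Only the select, enter, reflect ordering and the optional reflect hook are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct mock_state {
	int residency_us;	/* value the faked "enter" step reports */
	long time;		/* accumulated residency */
	long usage;		/* number of times the state was entered */
};

struct mock_device {
	struct mock_state states[3];
	int last_residency;
	struct mock_state *last_state;
};

struct mock_governor {
	int  (*select)(struct mock_device *dev);	/* mandatory */
	void (*reflect)(struct mock_device *dev);	/* optional, may be NULL */
};

static int reflect_calls;

/* Trivial example callbacks: always pick state 2, count reflect calls. */
static int pick_deepest(struct mock_device *dev) { (void)dev; return 2; }
static void count_reflect(struct mock_device *dev) { (void)dev; reflect_calls++; }

/* One pass: ask the governor, "enter" the state, update stats, reflect. */
static void mock_idle_call(struct mock_governor *gov, struct mock_device *dev)
{
	struct mock_state *target;
	int next_state;

	assert(gov && gov->select);
	next_state = gov->select(dev);
	target = &dev->states[next_state];

	/* the real code calls target_state->enter(); here the result is faked */
	dev->last_residency = target->residency_us;
	dev->last_state = target;
	target->time += dev->last_residency;
	target->usage++;

	/* give the governor a chance to look at the outcome */
	if (gov->reflect)
		gov->reflect(dev);
	/* return to the caller so the outer idle loop runs again */
}
```

Returning after a single state entry, instead of looping inside the function, is what lets the outer idle loop run the dynticks code between entries.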
[RFC][PATCH 2/3] export time until next timer interrupt using NOHZ
This patch exposes information about the time remaining until the next timer interrupt expires by utilizing the dynticks infrastructure. It also modifies the main idle loop to allow dynticks to handle non-interrupt break events (e.g. DMA). Finally, it exposes sleep ticks information to external code. Thomas Gleixner is responsible for much of the code in this patch. However, I've made some additional changes, so I'm probably responsible if there are any bugs or oversights :) Thanks, Adam arch/i386/kernel/process.c |3 ++- include/linux/tick.h | 10 ++ kernel/softirq.c |5 - kernel/time/tick-sched.c | 24 4 files changed, 36 insertions(+), 6 deletions(-) diff -urN a/arch/i386/kernel/process.c b/arch/i386/kernel/process.c --- a/arch/i386/kernel/process.c2007-03-23 23:02:16.0 -0400 +++ b/arch/i386/kernel/process.c2007-03-24 01:48:33.0 -0400 @@ -174,13 +174,14 @@ /* endless idle loop with no priority at all */ while (1) { - tick_nohz_stop_sched_tick(); while (!need_resched()) { void (*idle)(void); if (__get_cpu_var(cpu_idle_state)) __get_cpu_var(cpu_idle_state) = 0; + tick_nohz_stop_sched_tick(); + rmb(); idle = pm_idle; diff -urN a/include/linux/tick.h b/include/linux/tick.h --- a/include/linux/tick.h 2007-03-23 23:03:03.0 -0400 +++ b/include/linux/tick.h 2007-03-24 01:39:03.0 -0400 @@ -40,6 +40,7 @@ * @idle_sleeps: Number of idle calls, where the sched tick was stopped * @idle_entrytime:Time when the idle call was entered * @idle_sleeptime:Sum of the time slept in idle with sched tick stopped + * @sleep_length: Duration of the current idle sleep */ struct tick_sched { struct hrtimer sched_timer; @@ -52,6 +53,7 @@ unsigned long idle_sleeps; ktime_t idle_entrytime; ktime_t idle_sleeptime; + ktime_t sleep_length; unsigned long last_jiffies; unsigned long next_jiffies; ktime_t idle_expires; @@ -100,10 +102,18 @@ extern void tick_nohz_stop_sched_tick(void); extern void tick_nohz_restart_sched_tick(void); extern void tick_nohz_update_jiffies(void); +extern ktime_t 
tick_nohz_get_sleep_length(void); +extern unsigned long tick_nohz_get_idle_jiffies(void); # else static inline void tick_nohz_stop_sched_tick(void) { } static inline void tick_nohz_restart_sched_tick(void) { } static inline void tick_nohz_update_jiffies(void) { } +static inline ktime_t tick_nohz_get_sleep_length(void) +{ + ktime_t len = { .tv64 = NSEC_PER_SEC/HZ }; + + return len; +} # endif /* !NO_HZ */ #endif diff -urN a/kernel/softirq.c b/kernel/softirq.c --- a/kernel/softirq.c 2007-03-23 23:03:03.0 -0400 +++ b/kernel/softirq.c 2007-03-24 01:54:11.0 -0400 @@ -303,11 +303,6 @@ if (!in_interrupt() local_softirq_pending()) invoke_softirq(); -#ifdef CONFIG_NO_HZ - /* Make sure that timer wheel updates are propagated */ - if (!in_interrupt() idle_cpu(smp_processor_id()) !need_resched()) - tick_nohz_stop_sched_tick(); -#endif preempt_enable_no_resched(); } diff -urN a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c --- a/kernel/time/tick-sched.c 2007-03-23 23:03:03.0 -0400 +++ b/kernel/time/tick-sched.c 2007-03-24 01:44:55.0 -0400 @@ -153,6 +153,7 @@ unsigned long seq, last_jiffies, next_jiffies, delta_jiffies, flags; struct tick_sched *ts; ktime_t last_update, expires, now, delta; + struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev; int cpu; local_irq_save(flags); @@ -250,11 +251,34 @@ out: ts-next_jiffies = next_jiffies; ts-last_jiffies = last_jiffies; + ts-sleep_length = ktime_sub(dev-next_event, now); end: local_irq_restore(flags); } /** + * tick_nohz_get_sleep_length - return the length of the current sleep + * + * Called from power state control code with interrupts disabled + */ +ktime_t tick_nohz_get_sleep_length(void) +{ + struct tick_sched *ts = __get_cpu_var(tick_cpu_sched); + + return ts-sleep_length; +} + +/** + * tick_nohz_get_idle_jiffies - returns the current idle jiffie count + */ +unsigned long tick_nohz_get_idle_jiffies(void) +{ + struct tick_sched *ts = __get_cpu_var(tick_cpu_sched); + + return ts-idle_jiffies; +} + +/** * 
nohz_restart_sched_tick - restart the idle tick from the idle task * * Restart the idle tick when the CPU is woken up from idle
[RFC][PATCH 3/3] add the 'menu' cpuidle governor
This patch adds the 'menu' governor, as was described in my first email. Thanks, Adam Kconfig| 11 +++ governors/Makefile |1 governors/menu.c | 152 + 3 files changed, 164 insertions(+) diff -urN a/drivers/cpuidle/governors/Makefile b/drivers/cpuidle/governors/Makefile --- a/drivers/cpuidle/governors/Makefile2007-03-23 23:09:45.0 -0400 +++ b/drivers/cpuidle/governors/Makefile2007-03-24 02:10:29.0 -0400 @@ -3,3 +3,4 @@ # obj-$(CONFIG_CPU_IDLE_GOV_LADDER) += ladder.o +obj-$(CONFIG_CPU_IDLE_GOV_MENU) += menu.o diff -urN a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c --- a/drivers/cpuidle/governors/menu.c 1969-12-31 19:00:00.0 -0500 +++ b/drivers/cpuidle/governors/menu.c 2007-03-23 23:51:15.0 -0400 @@ -0,0 +1,152 @@ +/* + * menu.c - the menu idle governor + * + * Copyright (C) 2006-2007 Adam Belay [EMAIL PROTECTED] + * + * This code is licenced under the GPL. + */ + +#include linux/kernel.h +#include linux/cpuidle.h +#include linux/latency.h +#include linux/time.h +#include linux/ktime.h +#include linux/tick.h +#include linux/hrtimer.h + +#define BM_HOLDOFF 2 /* 20 ms */ + +struct menu_device { + int last_state_idx; + int deepest_bm_state; + + int break_last_us; + int break_elapsed_us; + + int bm_elapsed_us; + int bm_holdoff_us; + + unsigned long idle_jiffies; +}; + +static DEFINE_PER_CPU(struct menu_device, menu_devices); + +/** + * menu_select - selects the next idle state to enter + * @dev: the CPU + */ +static int menu_select(struct cpuidle_device *dev) +{ + struct menu_device *data = __get_cpu_var(menu_devices); + int i, expected_us, max_state = dev-state_count; + + /* discard BM history because it is sticky */ + cpuidle_get_bm_activity(); + + /* determine the expected residency time */ + expected_us = (s32) ktime_to_ns(tick_nohz_get_sleep_length()) / 1000; + expected_us = min(expected_us, data-break_last_us); + + /* determine the maximum state compatible with current BM status */ + if (cpuidle_get_bm_activity()) + data-bm_elapsed_us = 0; + if 
(data-bm_elapsed_us = data-bm_holdoff_us) + max_state = data-deepest_bm_state + 1; + + /* find the deepest idle state that satisfies our constraints */ + for (i = 1; i max_state; i++) { + struct cpuidle_state *s = dev-states[i]; + if (s-target_residency expected_us) + break; + if (s-exit_latency system_latency_constraint()) + break; + } + + data-last_state_idx = i - 1; + data-idle_jiffies = tick_nohz_get_idle_jiffies(); + return i - 1; +} + +/** + * menu_reflect - attempts to guess what happened after entry + * @dev: the CPU + * + * NOTE: it's important to be fast here because this operation will add to + * the overall exit latency. + */ +static void menu_reflect(struct cpuidle_device *dev) +{ + struct menu_device *data = __get_cpu_var(menu_devices); + int last_idx = data-last_state_idx; + int measured_us = cpuidle_get_last_residency(dev); + struct cpuidle_state *target = dev-states[last_idx]; + + /* +* Ugh, this idle state doesn't support residency measurements, so we +* are basically lost in the dark. As a compromise, assume we slept +* for one full standard timer tick. However, be aware that this +* could potentially result in a suboptimal state transition. +*/ + if (!(target-flags CPUIDLE_FLAG_TIME_VALID)) + measured_us = USEC_PER_SEC / HZ; + + data-bm_elapsed_us += measured_us; + data-break_elapsed_us += measured_us; + + /* +* Did something other than the timer interrupt cause the break event? 
+*/ + if (tick_nohz_get_idle_jiffies() == data-idle_jiffies) { + data-break_last_us = data-break_elapsed_us; + data-break_elapsed_us = 0; + } +} + +/** + * menu_scan_device - scans a CPU's states and does setup + * @dev: the CPU + */ +static void menu_scan_device(struct cpuidle_device *dev) +{ + struct menu_device *data = per_cpu(menu_devices, dev-cpu); + int i; + + data-last_state_idx = 0; + data-break_last_us = 0; + data-break_elapsed_us = 0; + data-bm_elapsed_us = 0; + data-bm_holdoff_us = BM_HOLDOFF; + + for (i = 1; i dev-state_count; i++) + if (dev-states[i].flags CPUIDLE_FLAG_CHECK_BM) + break; + data-deepest_bm_state = i - 1; +} + +struct cpuidle_governor menu_governor = { + .name = menu, + .scan = menu_scan_device, + .select = menu_select, + .reflect = menu_reflect
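The selection loop in menu_select() can be sketched in isolation as below. The archive has dropped the comparison operators from the patch text, so this sketch assumes they were `<` in the loop bound and `>` in the two break tests. `struct sketch_state` and `sketch_menu_select()` are hypothetical names, and the latency limit is passed in as a plain parameter rather than read from system_latency_constraint().

```c
#include <assert.h>

/* Hypothetical, simplified view of an idle state (times in microseconds). */
struct sketch_state {
	int target_residency;	/* minimum sleep for the state to pay off */
	int exit_latency;	/* time needed to wake up again */
};

/*
 * Return the index of the deepest state whose target residency fits in
 * expected_us and whose exit latency fits in latency_limit_us. State 0 is
 * always acceptable, mirroring the loop in the patch that starts at i = 1.
 */
static int sketch_menu_select(const struct sketch_state *states, int max_state,
			      int expected_us, int latency_limit_us)
{
	int i;

	assert(states && max_state >= 1);
	for (i = 1; i < max_state; i++) {
		if (states[i].target_residency > expected_us)
			break;
		if (states[i].exit_latency > latency_limit_us)
			break;
	}
	return i - 1;
}
```

The loop stops at the first state that violates either constraint, so it relies on the states being ordered from shallowest to deepest.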
Re: [PATCH 1/3] Introducing cpuidle: core cpuidle infrastructure
On Tue, 2007-02-13 at 05:31 -0800, Venkatesh Pallipadi wrote:
> On Mon, Feb 12, 2007 at 08:22:01PM -0500, Dave Jones wrote:
> > On Mon, Feb 12, 2007 at 10:39:25AM -0800, Venkatesh Pallipadi wrote:
> > >
> > > Introducing 'cpuidle', a new CPU power management infrastructure to manage
> > > idle CPUs in a clean and efficient manner.
> > > cpuidle separates out the drivers that can provide support for multiple types
> > > of idle states and policy governors that decide on what idle state to use
> > > at run time.
> > > A cpuidle driver can support multiple idle states based on parameters like
> > > varying power consumption, wakeup latency, etc (ACPI C-states for example).
> > > A cpuidle governor can be usage model specific (laptop, server,
> > > laptop on battery etc).
> > > Main advantage of the infrastructure being, it allows independent development
> > > of drivers and governors and allows for better CPU power management.
> > >
> > > A huge thanks to Adam Belay and Shaohua Li who were part of this mini-project
> > > since its beginning and are greatly responsible for this patchset.
> >
> > interesting. Though I wonder about giving admins _more_ knobs to twiddle.
> > It took cpufreq a long time to settle down in this area, and typically
> > 'ondemand' was the answer in the end for 99.9% of people. I question the usefulness
> > for the whole multiple governors interface, because in the case of cpuidle
> > there shouldn't be any real trade-off between one algorithm and another afaics?
> > So why can't we just have one, that just 'does the right thing' ?
> > The only differentiator that I can think of would be latency, but that seems
> > to be a) covered in a different tunable, and b) probably wouldn't affect
> > most people enough where it matters.
> >
>
> Agreed. In long term, I think cpuidle will also have one governor that will be
> used in most of the cases. But, we have to go through the process of
> experimenting with different governors, just like cpufreq and let the best
> governor win. I think this interface helps to experiment with new
> governors in a non-disruptive way. I mean, any new experiments will not have
> side effects on people already using currently established drivers in
> distributions.
>
> Also, one of the things we are looking at is to have ratings for different
> drivers and governors (similar to time subsystem), with which we can control
> best driver and best governor for a platform from inside the kernel, instead
> of depending on admin/init script to do the right thing.
>
> Having said that, I do feel we may need a different governor for things like
> handhelds. I heard them saying there idle routines has more than one
> dimension of low power-high latency idle states. But, that do not suggest the
> need for runtime switch in sysfs, as it will still be one proper governor for
> a platform.

Learning from the past, I think a good comparison would be the support
for several block IO schedulers (e.g. deadline, cfq, anticipatory, etc).
The added flexibility of a pluggable architecture allowed for a lot of
innovation and experimentation that might not have happened otherwise.
There even is a "noop" scheduler that makes sense for some hardware
devices but not others. In short, Linux processor idle power management
support needs some growing room to find its "ondemand" equivalent.

In my opinion, the best sort of a tunable would be a variable that
indicates userspace's intentions to the cpuidle governor. Maybe
something to the effect of the following...

- Maximum Performance
- Balanced (attempt to do well in both)
- Maximum Battery-life

Of course governors can have their own specific tunables, but it would
probably be best to not touch them in the typical use-case.

Thanks,
Adam
Re: [PATCH] Custom IORESOURCE Class
On Mon, Aug 08, 2005 at 09:00:21AM -0700, Greg KH wrote:
> On Mon, Aug 08, 2005 at 11:11:45AM -0700, Matthew Gilbert wrote:
> > Below is a patch that adds an additional resource class to the platform
> > resource types. This is to support additional resources that need to be passed
> > to drivers without overloading the existing specific types. In my case, I need
> > to send clock information to the driver to enable power management.
> >
> > Signed-off-by: Matthew Gilbert <[EMAIL PROTECTED]>
>
> Hm, you do realize that Pat's no longer the driver core maintainer? :)
>
> Anyway, Russell and Adam, any objections to this patch?

I'm not sure if I agree with this patch. "struct resource" is used
primarily for I/O resource assignment. Although I agree we may need to
add new IORESOURCE types, I'm not sure if clock data belongs here. I
don't think "start" and "end" would be useful for most platform data.

Could you provide more information about this specific issue and
resource type? Maybe we could create a new sysfs attribute?

Thanks,
Adam
Re: [RFC][PATCH] Add PCI<->PCI bridge driver [4/9]
On Fri, 2005-07-15 at 09:58 +0100, Russell King wrote:
> On Thu, Jul 14, 2005 at 04:55:19AM -0400, Adam Belay wrote:
> > This patch adds a basic PCI<->PCI bridge driver that utilizes the new
> > PCI bus class API.
>
> Thanks. I think this breaks Cardbus.
>
> The whole point of the way PCI is _presently_ organised is that it allows
> busses to be configured and setup _before_ the devices are made available
> to drivers. This breaks that completely:

Hi Russell,

I'm aware of this issue. These changes are major and will need more than
one pass to be correct. I'll be redoing most of the bus configuration
code in the next patch set. I have a strategy for proper device and bus
configuration. These are my current thoughts:

1.) When bound to its device PCI bridge drivers will add their current
devices to the bus device list, but will not register them with the
driver model.

2.) The bus class driver will initiate a procedure similar to
pci_bus_add_devices(), but only for host (root) bridges and hot-plugged
devices.

pci_register_bus_devices(struct pci_bus *bus)
{
	- register all bios configured bridges
	- call pci_register_bus_devices() for each previously registered
	  bridge
	- register remaining uninitialized bridges and call
	  pci_register_bus_devices() for each bridge as it's registered.
}

pci_register_devices(struct pci_bus *bus)
{
	- register all remaining PCI devices, including those of child
	  pci buses
}

* pci_register_bus_devices() will be called first followed by
  pci_register_devices().

3.) Bridge windows will not be configured until a child device is
enabled. In other words, resource configuration is lazy much like we
handle PCI IRQ routing. We will, however, verify the validity of BIOS
assignments. If the assignments are incorrect, the bridge will be
disabled and then reconfigured when needed.

Thanks,
Adam
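The two-pass ordering described in point 2 above can be mocked up in user space like this. It is only an illustration of the ordering, under loudly stated assumptions: the `sketch_` types, the fixed-size arrays, and the registration log are all invented here, and none of it is the PCI core's real data model.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEVS 4
#define SKETCH_MAX_LOG  16

struct sketch_bus;

/* Hypothetical device: a bridge if it has a child bus, a leaf otherwise. */
struct sketch_dev {
	const char *name;
	int bios_configured;		/* only meaningful for bridges */
	struct sketch_bus *child;	/* non-NULL for bridges */
};

struct sketch_bus {
	struct sketch_dev *devs[SKETCH_MAX_DEVS];
	int ndevs;
};

static const char *reg_log[SKETCH_MAX_LOG];	/* registration order */
static int reg_count;

static void sketch_register(struct sketch_dev *d)
{
	assert(reg_count < SKETCH_MAX_LOG);
	reg_log[reg_count++] = d->name;
}

/* First pass: bridges only, BIOS-configured ones before the rest. */
static void sketch_register_bus_devices(struct sketch_bus *bus)
{
	int i;

	for (i = 0; i < bus->ndevs; i++)
		if (bus->devs[i]->child && bus->devs[i]->bios_configured)
			sketch_register(bus->devs[i]);

	/* descend below the bridges registered in the loop above */
	for (i = 0; i < bus->ndevs; i++)
		if (bus->devs[i]->child && bus->devs[i]->bios_configured)
			sketch_register_bus_devices(bus->devs[i]->child);

	/* remaining bridges, descending as each one is registered */
	for (i = 0; i < bus->ndevs; i++)
		if (bus->devs[i]->child && !bus->devs[i]->bios_configured) {
			sketch_register(bus->devs[i]);
			sketch_register_bus_devices(bus->devs[i]->child);
		}
}

/* Second pass: all non-bridge devices, including those on child buses. */
static void sketch_register_devices(struct sketch_bus *bus)
{
	int i;

	for (i = 0; i < bus->ndevs; i++) {
		if (bus->devs[i]->child)
			sketch_register_devices(bus->devs[i]->child);
		else
			sketch_register(bus->devs[i]);
	}
}
```

Calling sketch_register_bus_devices() on the root bus and then sketch_register_devices() registers every bridge before any ordinary device, which matches the ordering constraint raised in the quoted reply.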
Re: [RFC][PATCH] split PCI probing code [1/9]
On Thu, 2005-07-14 at 12:30 -0700, Greg KH wrote:
> On Thu, Jul 14, 2005 at 07:10:14PM +0200, Francois Romieu wrote:
> > Adam Belay <[EMAIL PROTECTED]> :
> > [...]
> >
> > Some nits + a suspect error branch. It seems nice otherwise.
>
> If I'm correct, this patch only moves the code into different files, it
> doesn't change any of it, so your comments apply to the current code
> today, not Adam's changes :)

Correct. I've been trying to make my changes incremental. Nonetheless,
I do appreciate the comments. I'll try to apply these fixes to my
current tree.

Thanks,
Adam
Re: [RFC][PATCH] add PCI bus registration support [2/9]
On Thu, 2005-07-14 at 12:33 -0700, Greg KH wrote:
> On Thu, Jul 14, 2005 at 04:55:12AM -0400, Adam Belay wrote:
> > +EXPORT_SYMBOL(pci_add_bus);
>
> This doesn't need to be exported, right? No module uses it. But if
> they do, I suggest EXPORT_SYMBOL_GPL() instead, is that ok?
>
> thanks,
>
> greg k-h

Yes, no module currently uses it, but now that "pci_driver" is
supported, any PCI bridge driver could potentially be made into a
module. In theory, this could even include the PCI<->PCI bridge driver.
I also wanted to export this as a module so that it would be easier to
add new drivers for more unusual bridge hardware.

EXPORT_SYMBOL_GPL() would be fine.

Thanks,
Adam
[RFC][PATCH] don't bind to PCI express links [8/9]
This patch prevents the PCI<->PCI bridge driver from binding to PCI
express devices. This is needed to coexist with the PCI express root
port driver. Eventually we may want to rework and better integrate
linux PCI express link support, but for now this should work.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-14 02:30:09.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 02:46:12.0 -0400
@@ -132,6 +132,10 @@
 	if (dev->subordinate)
 		return -ENODEV;

+	/* don't bind to pci express links */
+	if (pci_find_capability(dev, PCI_CAP_ID_EXP))
+		return -ENODEV;
+
 	bus = ppb_detect_bus(dev);
 	if (!bus)
 		return -ENODEV;
[RFC][PATCH] master abort on scanning fixes [6/9]
The PCI bridge driver now checks if changing bridge_ctl is necessary.
It also restores the original bridge_ctl settings when finished
scanning for devices. Finally, a pci_bus setup fix is included.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-12 01:45:46.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 02:09:15.0 -0400
@@ -30,7 +30,7 @@
 	bus->bridge = &dev->dev;
 	bus->ops = bus->parent->ops;
 	bus->sysdata = bus->parent->sysdata;
-	bus->bridge = get_device(&dev->dev);
+	bus->self = dev;

 	/* Set up default resource pointers and names.. */
 	for (i = 0; i < 4; i++) {
@@ -82,12 +82,7 @@
 	if (!bus)
 		return NULL;

-	/* Disable MasterAbortMode during probing to avoid reporting
-	 * of bus errors (in some architectures)
-	 */
 	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
-	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
-			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);

 	bus->number = bus->secondary = busnr;
 	bus->primary = buses & 0xFF;
@@ -105,10 +100,22 @@
 {
 	unsigned int devfn;

+	/* Disable MasterAbortMode during probing to avoid reporting
+	 * of bus errors (in some architectures)
+	 */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+			bus->bridge_ctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
 	/* Go find them, Rover! */
 	for (devfn = 0; devfn < 0x100; devfn += 8)
 		pci_scan_slot(bus, devfn);

+	/* restore the original bridge_ctl configuration */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+				      bus->bridge_ctl);
+
 	pcibios_fixup_bus(bus);
 	pci_bus_add_devices(bus);
 }
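The mask-then-restore idea around the scan can be sketched with a simulated register as below. Everything prefixed `sketch_` is hypothetical, and the bit value 0x20 is the one PCI_BRIDGE_CTL_MASTER_ABORT has in the kernel's pci_regs.h. One assumption deserves emphasis: the sketch writes the register only when the bit is currently set, which is one reading of "checks if changing bridge_ctl is necessary". The hunk as posted tests the negated condition, so treat the sense of the test here as an assumption rather than as the patch's behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CTL_MASTER_ABORT 0x20	/* same bit as PCI_BRIDGE_CTL_MASTER_ABORT */

/* Hypothetical bridge with a simulated PCI_BRIDGE_CONTROL register. */
struct sketch_bridge {
	uint16_t ctl_reg;	/* what the "hardware" currently holds */
	uint16_t saved_ctl;	/* cached copy, playing the role of bus->bridge_ctl */
	int writes;		/* number of simulated config writes */
};

static uint16_t seen_during_scan;

/* Example scan callback: records the register value seen while scanning. */
static void record_scan(struct sketch_bridge *b)
{
	seen_during_scan = b->ctl_reg;
}

static void sketch_write_ctl(struct sketch_bridge *b, uint16_t val)
{
	b->ctl_reg = val;
	b->writes++;
}

/* Scan with MasterAbortMode masked, then put the original value back. */
static void sketch_scan(struct sketch_bridge *b,
			void (*scan)(struct sketch_bridge *))
{
	int need_mask = (b->saved_ctl & SKETCH_CTL_MASTER_ABORT) != 0;

	if (need_mask)
		sketch_write_ctl(b, (uint16_t)(b->saved_ctl &
					       ~SKETCH_CTL_MASTER_ABORT));

	if (scan)
		scan(b);

	/* restore only if something was changed above */
	if (need_mask)
		sketch_write_ctl(b, b->saved_ctl);
}
```

Skipping both writes when the bit is already clear avoids two config-space accesses per bridge that would not change anything.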
[RFC][PATCH] split PCI probing code [1/9]
This patch divides the PCI probing code into three smaller files:

config.c - PCI configuration space parsing
probe.c  - PCI bus detection routines
bus.c    - the PCI bus class driver core

These files are placed in the new directory "drivers/pci/bus". The directory
will hold functions related to the PCI bus class driver and PCI device
detection in general.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/Makefile	2005-07-08 17:06:19.0 -0400
+++ b/drivers/pci/Makefile	2005-07-10 22:32:53.0 -0400
@@ -2,9 +2,9 @@
 # Makefile for the PCI bus specific drivers.
 #

-obj-y		+= access.o bus.o probe.o remove.o pci.o quirks.o \
-			names.o pci-driver.o search.o pci-sysfs.o \
-			rom.o
+obj-y		+= access.o bus.o remove.o pci.o quirks.o names.o \
+			pci-driver.o search.o pci-sysfs.o rom.o bus/
+
 obj-$(CONFIG_PROC_FS) += proc.o

 ifndef CONFIG_SPARC64
--- a/drivers/pci/bus/Makefile	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/Makefile	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,5 @@
+#
+# Makefile for the PCI device detection
+#
+
+obj-y := bus.o config.o probe.o
--- a/drivers/pci/bus/bus.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/bus.c	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,69 @@
+/*
+ * bus.c - the PCI bus class driver
+ *
+ */
+
+#include
+#include
+#include
+#include <linux/module.h>
+
+#include "bus.h"
+
+#undef DEBUG
+
+#ifdef DEBUG
+#define DBG(x...) printk(x)
+#else
+#define DBG(x...)
+#endif
+
+
+/*
+ * PCI Bus Class
+ */
+
+static void pci_release_bus_classdev(struct class_device *class_dev)
+{
+	struct pci_bus *pci_bus = to_pci_bus(class_dev);
+
+	if (pci_bus->bridge)
+		put_device(pci_bus->bridge);
+	kfree(pci_bus);
+}
+
+struct class pcibus_class = {
+	.name		= "pci_bus",
+	.release	= &pci_release_bus_classdev,
+};
+
+static int __init pcibus_class_init(void)
+{
+	return class_register(&pcibus_class);
+}
+
+postcore_initcall(pcibus_class_init);
+
+
+/*
+ * Registration
+ */
+
+/**
+ * pci_alloc_bus - allocates a "pci_bus" structure
+ */
+struct pci_bus * pci_alloc_bus(void)
+{
+	struct pci_bus *b;
+
+	b = kmalloc(sizeof(*b), GFP_KERNEL);
+	if (b) {
+		memset(b, 0, sizeof(*b));
+		INIT_LIST_HEAD(&b->node);
+		INIT_LIST_HEAD(&b->children);
+		INIT_LIST_HEAD(&b->devices);
+	}
+	return b;
+}
+
+EXPORT_SYMBOL(pci_alloc_bus);
--- a/drivers/pci/bus/bus.h	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/bus.h	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,5 @@
+/*
+ * bus.h - functions internal to PCI device detection
+ */
+
+extern struct class pcibus_class;
--- a/drivers/pci/bus/config.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/config.c	2005-07-12 00:52:35.147664368 -0400
@@ -0,0 +1,466 @@
+/*
+ * config.c - PCI configuration space parsing code
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include "../pci.h"
+
+#define PCI_CFG_SPACE_SIZE	256
+#define PCI_CFG_SPACE_EXP_SIZE	4096
+
+LIST_HEAD(pci_devices);
+
+/**
+ * pci_release_dev - free a pci device structure when all users of it are finished.
+ * @dev: device that's been disconnected
+ *
+ * Will be called only by the device core when all users of this pci device are
+ * done.
+ */
+static void pci_release_dev(struct device *dev)
+{
+	struct pci_dev *pci_dev;
+
+	pci_dev = to_pci_dev(dev);
+	kfree(pci_dev);
+}
+
+/*
+ * Translate the low bits of the PCI base
+ * to the resource type
+ */
+static inline unsigned int pci_calc_resource_flags(unsigned int flags)
+{
+	if (flags & PCI_BASE_ADDRESS_SPACE_IO)
+		return IORESOURCE_IO;
+
+	if (flags & PCI_BASE_ADDRESS_MEM_PREFETCH)
+		return IORESOURCE_MEM | IORESOURCE_PREFETCH;
+
+	return IORESOURCE_MEM;
+}
+
+/*
+ * Find the extent of a PCI decode..
+ */
+static u32 pci_size(u32 base, u32 maxbase, unsigned long mask)
+{
+	u32 size = mask & maxbase;	/* Find the significant bits */
+	if (!size)
+		return 0;
+
+	/* Get the lowest of them to find the decode size, and
+	   from that the extent.  */
+	size = (size & ~(size-1)) - 1;
+
+	/* base == maxbase can be valid only if the BAR has
+	   already been programmed with all 1s.  */
+	if (base == maxbase && ((base | size) & mask) != mask)
+		return 0;
+
+	return size;
+}
+
+static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
+{
+	unsigned int pos, reg, next;
+	u32 l, sz;
+	struct resource *res;
+
+	for(pos=0; pos<howmany; pos = next) {
+		next = pos+1;
+		res = &dev->resource[pos];
+		res->name = pci_name(dev);
+		reg = PCI_BASE_ADDRESS_0 + (pos <<
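For anyone following the BAR sizing logic in pci_size() above, here is a minimal userspace sketch of the same arithmetic. It assumes the usual probing sequence (save the BAR, write all 1s, read back "maxbase") and takes the mask as a plain 32-bit value rather than unsigned long; the sample BAR values in the note below are made up for illustration and do not come from the patch.

```c
#include <stdint.h>

/*
 * Userspace restatement of pci_size() from the patch.
 *   base:    BAR contents before probing
 *   maxbase: value read back after writing all 1s to the BAR
 *   mask:    address bits of the BAR (flag bits cleared); a 32-bit
 *            type is an assumption of this sketch
 * Returns the extent (size - 1) of the decoded region, or 0 if the
 * BAR decodes nothing.
 */
uint32_t pci_size(uint32_t base, uint32_t maxbase, uint32_t mask)
{
	uint32_t size = mask & maxbase;	/* keep only the address bits */

	if (!size)
		return 0;		/* no writable address bits */

	/* Isolate the lowest set bit (the decode size), then subtract
	 * one to turn the size into an extent. */
	size = (size & ~(size - 1)) - 1;

	/* An unchanged value is only plausible if the BAR already held
	 * all 1s in its address bits. */
	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;

	return size;
}
```

With a memory BAR mask of 0xfffffff0, a read-back of 0xffff0000 gives an extent of 0xffff, i.e. a 64 KiB window. A BAR that reads back exactly what it held before (and is not all 1s) is treated as not decoding anything.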
[RFC][PATCH] PCI root bridge detection fix [7/9]
This patch prevents the root bridge drivers from using the legacy API. It
also updates the PCI<->PCI bridge driver to better coexist with the legacy
code.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-14 02:17:04.735566464 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 02:24:29.577940144 -0400
@@ -128,6 +128,10 @@
 	if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
 		return -ENODEV;

+	/* don't bind to devices controlled by legacy code */
+	if (dev->subordinate)
+		return -ENODEV;
+
 	bus = ppb_detect_bus(dev);
 	if (!bus)
 		return -ENODEV;
--- a/drivers/pci/bus/probe.c	2005-07-14 02:26:30.660532792 -0400
+++ b/drivers/pci/bus/probe.c	2005-07-14 02:25:20.455205624 -0400
@@ -395,6 +395,7 @@
 struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent, int bus, struct pci_ops *ops, void *sysdata)
 {
 	int error;
+	unsigned int devfn;
 	struct pci_bus *b;
 	struct device *dev;

@@ -440,7 +441,9 @@
 	/* Create legacy_io and legacy_mem files for this bus */
 	pci_create_legacy_files(b);

-	b->subordinate = pci_scan_child_bus(b);
+	/* Go find them, Rover! */
+	for (devfn = 0; devfn < 0x100; devfn += 8)
+		pci_scan_slot(b, devfn);

 	return b;

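The slot scan above steps devfn by 8 because a devfn packs a 5-bit slot number and a 3-bit function number into one byte. A small sketch of that encoding follows; the three macros mirror the ones in include/linux/pci.h, while count_scanned_slots() is a hypothetical helper added here only to show how many slots the loop visits.

```c
/* devfn layout: bits 7..3 = slot (device) number, bits 2..0 = function */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: walk devfn the same way the patch does and
 * count the iterations. Each step of 8 lands on function 0 of the
 * next slot, which is the value handed to pci_scan_slot().
 */
unsigned int count_scanned_slots(void)
{
	unsigned int devfn, slots = 0;

	for (devfn = 0; devfn < 0x100; devfn += 8)
		slots++;

	return slots;
}
```

So the loop touches all 32 possible slots on a bus, and a bridge sitting at 00:1e.0 corresponds to devfn 0xf0.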
[RFC][PATCH] Add PCI<->PCI bridge driver [4/9]
This patch adds a basic PCI<->PCI bridge driver that utilizes the new PCI
bus class API.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-08 02:18:43.0 -0400
@@ -0,0 +1,165 @@
+/*
+ * pci-bridge.c - a generic PCI bus driver for PCI<->PCI bridges
+ *
+ */
+
+#include
+#include
+#include
+
+static struct pci_device_id ppb_id_tbl[] = {
+	{ PCI_DEVICE_CLASS(PCI_CLASS_BRIDGE_PCI << 8, 0xffff00) },
+	{ 0 },
+};
+
+MODULE_DEVICE_TABLE(pci, ppb_id_tbl);
+
+/**
+ * ppb_create_bus - allocates a bus and fills in basic information
+ * @dev: the pci bridge device
+ */
+static struct pci_bus * ppb_create_bus(struct pci_dev *dev)
+{
+	int i;
+	struct pci_bus *bus = pci_alloc_bus();
+
+	if (!bus)
+		return NULL;
+
+	bus->parent = dev->bus;
+	bus->bridge = &dev->dev;
+	bus->ops = bus->parent->ops;
+	bus->sysdata = bus->parent->sysdata;
+	bus->bridge = get_device(&dev->dev);
+
+	/* Set up default resource pointers and names.. */
+	for (i = 0; i < 4; i++) {
+		bus->resource[i] = &dev->resource[PCI_BRIDGE_RESOURCES+i];
+		bus->resource[i]->name = bus->name;
+	}
+
+	return bus;
+}
+
+/**
+ * ppb_detect_bus - creates a bus and reads configuration space data
+ * @dev: the pci bridge device
+ *
+ * This function will do some verification to ensure we should drive this
+ * bridge.
+ */
+static struct pci_bus * ppb_detect_bus(struct pci_dev *dev)
+{
+	struct pci_bus *bus;
+	u32 buses;
+	u16 bctl;
+	unsigned int busnr;
+
+	pci_read_config_dword(dev, PCI_PRIMARY_BUS, &buses);
+	busnr = (buses >> 8) & 0xFF;
+
+	/*
+	 * FIXME: This driver currently doesn't support bridges that haven't
+	 * been configured by the BIOS.
+	 */
+	if (!(buses & 0xffff00)) {
+		printk(KERN_INFO "PCI: Unable to drive bus %04x:%02x\n",
+			pci_domain_nr(dev->bus), busnr);
+		return NULL;
+	}
+
+	/*
+	 * If we already got to this bus through a different bridge,
+	 * ignore it.  This can happen with the i450NX chipset.
+	 */
+	if (pci_find_bus(pci_domain_nr(dev->bus), busnr)) {
+		printk(KERN_INFO "PCI: Bus %04x:%02x already known\n",
+			pci_domain_nr(dev->bus), busnr);
+		return NULL;
+	}
+
+	bus = ppb_create_bus(dev);
+	if (!bus)
+		return NULL;
+
+	/* Disable MasterAbortMode during probing to avoid reporting
+	 * of bus errors (in some architectures)
+	 */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
+			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
+	bus->number = bus->secondary = busnr;
+	bus->primary = buses & 0xFF;
+	bus->subordinate = (buses >> 16) & 0xFF;
+	bus->bridge_ctl = bctl;
+
+	return bus;
+}
+
+/**
+ * ppb_detect_children - detects and registers child devices
+ * @bus: pci bus
+ */
+static void ppb_detect_children(struct pci_bus *bus)
+{
+	unsigned int devfn;
+
+	/* Go find them, Rover! */
+	for (devfn = 0; devfn < 0x100; devfn += 8)
+		pci_scan_slot(bus, devfn);
+
+	pcibios_fixup_bus(bus);
+	pci_bus_add_devices(bus);
+}
+
+static int ppb_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+	int err;
+	struct pci_bus *bus;
+
+	if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+		return -ENODEV;
+
+	bus = ppb_detect_bus(dev);
+	if (!bus)
+		return -ENODEV;
+
+	err = pci_add_bus(bus);
+	if (err)
+		goto out;
+
+	dev->subordinate = bus;
+	ppb_detect_children(bus);
+	return 0;
+
+out:
+	kfree(bus);
+	return err;
+}
+
+static void ppb_remove(struct pci_dev *dev)
+{
+	pci_remove_behind_bridge(dev);
+	pci_remove_bus(dev->subordinate);
+}
+
+static struct pci_driver ppb_driver = {
+	.name		= "pci-bridge",
+	.id_table	= ppb_id_tbl,
+	.probe		= ppb_probe,
+	.remove		= ppb_remove,
+};
+
+static int __init ppb_init(void)
+{
+	return pci_register_driver(&ppb_driver);
+}
+
+static void __exit ppb_exit(void)
+{
+	pci_unregister_driver(&ppb_driver);
+}
+
+module_init(ppb_init);
+module_exit(ppb_exit);
--- a/drivers/pci/bus/Makefile	2005-07-07 22:22:49.0 -0400
+++ b/drivers/pci/bus/Makefile	2005-07-08 02:16:39.0 -0400
@@ -2,4 +2,4 @@
 # Makefile for the PCI device detection
 #

-obj-y := bus.o config.o device.o probe.o
+obj-y := bus.o conf
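To make the register decoding in ppb_detect_bus() easier to follow, here is a small standalone sketch of how the dword at PCI_PRIMARY_BUS breaks down. The struct and both function names are hypothetical, and the 0xffff00 mask used for the "configured by the BIOS" test is my assumption: it covers the secondary and subordinate fields, matching the shifts the driver uses.

```c
#include <stdint.h>

/* Hypothetical container for the three bus number fields. */
struct bridge_buses {
	uint8_t primary;	/* bits  7..0: bus the bridge sits on */
	uint8_t secondary;	/* bits 15..8: bus directly behind it */
	uint8_t subordinate;	/* bits 23..16: highest bus behind it */
};

/* Split the dword with the same shifts as ppb_detect_bus(). */
struct bridge_buses decode_buses(uint32_t buses)
{
	struct bridge_buses b;

	b.primary = buses & 0xFF;
	b.secondary = (buses >> 8) & 0xFF;
	b.subordinate = (buses >> 16) & 0xFF;
	return b;
}

/*
 * Assumed check: firmware has programmed the bridge if either the
 * secondary or the subordinate field is nonzero.
 */
int bridge_is_configured(uint32_t buses)
{
	return (buses & 0xffff00) != 0;
}
```

For a sample value of 0x40030200 this yields primary 0, secondary 2 and subordinate 3; the top byte is the secondary latency timer and is ignored here.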
[RFC][PATCH] device registration cleanups [3/9]
This patch moves all device registration related functions to bus/device.c.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/device.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/device.c	2005-07-12 01:32:41.0 -0400
@@ -0,0 +1,187 @@
+/*
+ * device.c - PCI device registration
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/init.h>
+
+#include "../pci.h"
+
+/**
+ * add a single device
+ * @dev: device to add
+ *
+ * This adds a single pci device to the global
+ * device list and adds sysfs and procfs entries
+ */
+void __devinit pci_bus_add_device(struct pci_dev *dev)
+{
+	device_add(&dev->dev);
+
+	spin_lock(&pci_bus_lock);
+	list_add_tail(&dev->global_list, &pci_devices);
+	spin_unlock(&pci_bus_lock);
+
+	pci_proc_attach_device(dev);
+	pci_create_sysfs_dev_files(dev);
+}
+
+EXPORT_SYMBOL_GPL(pci_bus_add_device);
+
+/**
+ * pci_bus_add_devices - insert newly discovered PCI devices
+ * @bus: bus to check for new devices
+ *
+ * Add newly discovered PCI devices (which are on the bus->devices
+ * list) to the global PCI device list, add the sysfs and procfs
+ * entries.  Where a bridge is found, add the discovered bus to
+ * the parents list of child buses, and recurse (breadth-first
+ * to be compatible with 2.4)
+ *
+ * Call hotplug for each new devices.
+ */
+void __devinit pci_bus_add_devices(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		/*
+		 * Skip already-present devices (which are on the
+		 * global device list.)
+		 */
+		if (!list_empty(&dev->global_list))
+			continue;
+		pci_bus_add_device(dev);
+	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+
+		BUG_ON(list_empty(&dev->global_list));
+
+		/*
+		 * If there is an unattached subordinate bus, attach
+		 * it and then scan for unattached PCI devices.
+		 */
+		if (dev->subordinate) {
+			if (list_empty(&dev->subordinate->node)) {
+				spin_lock(&pci_bus_lock);
+				list_add_tail(&dev->subordinate->node,
+						&dev->bus->children);
+				spin_unlock(&pci_bus_lock);
+			}
+			pci_bus_add_devices(dev->subordinate);
+
+			sysfs_create_link(&dev->subordinate->class_dev.kobj, &dev->dev.kobj, "bridge");
+		}
+	}
+}
+
+EXPORT_SYMBOL(pci_bus_add_devices);
+
+static void pci_free_resources(struct pci_dev *dev)
+{
+	int i;
+
+	msi_remove_pci_irq_vectors(dev);
+
+	pci_cleanup_rom(dev);
+	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+		struct resource *res = dev->resource + i;
+		if (res->parent)
+			release_resource(res);
+	}
+}
+
+static void pci_destroy_dev(struct pci_dev *dev)
+{
+	if (!list_empty(&dev->global_list)) {
+		pci_proc_detach_device(dev);
+		pci_remove_sysfs_dev_files(dev);
+		device_unregister(&dev->dev);
+		spin_lock(&pci_bus_lock);
+		list_del(&dev->global_list);
+		dev->global_list.next = dev->global_list.prev = NULL;
+		spin_unlock(&pci_bus_lock);
+	}
+
+	/* Remove the device from the device lists, and prevent any further
+	 * list accesses from this device */
+	spin_lock(&pci_bus_lock);
+	list_del(&dev->bus_list);
+	dev->bus_list.next = dev->bus_list.prev = NULL;
+	spin_unlock(&pci_bus_lock);
+
+	pci_free_resources(dev);
+	pci_dev_put(dev);
+}
+
+/**
+ * pci_remove_device_safe - remove an unused hotplug device
+ * @dev: the device to remove
+ *
+ * Delete the device structure from the device lists and
+ * notify userspace (/sbin/hotplug), but only if the device
+ * in question is not being used by a driver.
+ * Returns 0 on success.
+ */
+int pci_remove_device_safe(struct pci_dev *dev)
+{
+	if (pci_dev_driver(dev))
+		return -EBUSY;
+	pci_destroy_dev(dev);
+	return 0;
+}
+
+EXPORT_SYMBOL(pci_remove_device_safe);
+
+/**
+ * pci_remove_bus_device - remove a PCI device and any children
+ * @dev: the device to remove
+ *
+ * Remove a PCI device from the device lists, informing the drivers
+ * that the device has been removed.  We also remove any subordinate
+ * buses and children in a depth-first manner.
+ *
+ * For each device we remove, delete the device structure from the
+ * device lists, remove the /proc entry, and notify userspace
+ * (/sbin/hotplug).
+ */
+void pci_remove_bus_device(struct pci_dev *dev)
+{
+	if (dev->subordinate) {
+		struct pci_bus *b = dev->subordinate;
+
+
[RFC][PATCH] add PCI bus registration support [2/9]
This patch adds pci_add_bus() for PCI bus registration. It also moves
pci_remove_bus() from remove.c to bus/bus.c for consistency.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/bus.c	2005-07-12 00:59:58.0 -0400
+++ b/drivers/pci/bus/bus.c	2005-07-12 01:01:13.992787920 -0400
@@ -9,6 +9,7 @@
 #include <linux/module.h>

 #include "bus.h"
+#include "../pci.h"

 #undef DEBUG

@@ -50,7 +51,7 @@
  */

 /**
- * pci_alloc_bus - allocates a "pci_bus" structure
+ * pci_alloc_bus - allocates a "pci_bus" structure
  */
 struct pci_bus * pci_alloc_bus(void)
 {
@@ -67,3 +68,61 @@
 }

 EXPORT_SYMBOL(pci_alloc_bus);
+
+/**
+ * pci_add_bus - registers a bus with the pci bus class
+ * @bus: the bus
+ *
+ * Setup class data, register with the driver core, proc, etc...
+ */
+int pci_add_bus(struct pci_bus *bus)
+{
+	int ret;
+
+	bus->class_dev.class = &pcibus_class;
+	sprintf(bus->class_dev.class_id, "%04x:%02x", pci_domain_nr(bus),
+		bus->primary);
+
+	ret = class_device_register(&bus->class_dev);
+	if (ret)
+		return ret;
+
+	class_device_create_file(&bus->class_dev,
+				 &class_device_attr_cpuaffinity);
+	if (bus->self)
+		sysfs_create_link(&bus->class_dev.kobj,
+				  &bus->self->dev.kobj, "bridge");
+
+	spin_lock(&pci_bus_lock);
+	list_add_tail(&bus->node, &bus->parent->children);
+	spin_unlock(&pci_bus_lock);
+
+	pci_proc_attach_bus(bus);
+
+	return 0;
+}
+
+EXPORT_SYMBOL(pci_add_bus);
+
+/**
+ * pci_remove_bus - unregisters a bus with the pci bus class
+ * @bus: the bus
+ *
+ * Remove the bus from bus lists, remove proc/sysfs files, and unregister
+ * from the driver core.
+ */
+void pci_remove_bus(struct pci_bus *pci_bus)
+{
+	pci_proc_detach_bus(pci_bus);
+
+	spin_lock(&pci_bus_lock);
+	list_del(&pci_bus->node);
+	spin_unlock(&pci_bus_lock);
+	pci_remove_legacy_files(pci_bus);
+	class_device_remove_file(&pci_bus->class_dev,
+		&class_device_attr_cpuaffinity);
+	sysfs_remove_link(&pci_bus->class_dev.kobj, "bridge");
+	class_device_unregister(&pci_bus->class_dev);
+}
+
+EXPORT_SYMBOL(pci_remove_bus);
--- a/drivers/pci/remove.c	2005-07-08 17:06:20.0 -0400
+++ b/drivers/pci/remove.c	2005-07-12 01:01:13.998787008 -0400
@@ -57,20 +57,6 @@
 }
 EXPORT_SYMBOL(pci_remove_device_safe);

-void pci_remove_bus(struct pci_bus *pci_bus)
-{
-	pci_proc_detach_bus(pci_bus);
-
-	spin_lock(&pci_bus_lock);
-	list_del(&pci_bus->node);
-	spin_unlock(&pci_bus_lock);
-	pci_remove_legacy_files(pci_bus);
-	class_device_remove_file(&pci_bus->class_dev,
-		&class_device_attr_cpuaffinity);
-	sysfs_remove_link(&pci_bus->class_dev.kobj, "bridge");
-	class_device_unregister(&pci_bus->class_dev);
-}
-EXPORT_SYMBOL(pci_remove_bus);

 /**
  * pci_remove_bus_device - remove a PCI device and any children
--- a/include/linux/pci.h	2005-07-12 00:59:58.0 -0400
+++ b/include/linux/pci.h	2005-07-12 01:01:14.065776824 -0400
@@ -734,6 +734,8 @@
 /* Generic PCI functions used internally */

 extern struct pci_bus * pci_alloc_bus(void);
+extern int pci_add_bus(struct pci_bus *bus);
+extern void pci_remove_bus(struct pci_bus *bus);
 extern struct pci_bus *pci_find_bus(int domain, int busnr);
 void pci_bus_add_devices(struct pci_bus *bus);
 struct pci_bus *pci_scan_bus_parented(struct device *parent, int bus, struct pci_ops *ops, void *sysdata);
@@ -756,7 +758,6 @@
 int pci_get_interrupt_pin(struct pci_dev *dev, struct pci_dev **bridge);
 extern struct pci_dev *pci_dev_get(struct pci_dev *dev);
 extern void pci_dev_put(struct pci_dev *dev);
-extern void pci_remove_bus(struct pci_bus *b);
 extern void pci_remove_bus_device(struct pci_dev *dev);

 /* Generic PCI functions exported to card drivers */

[RFC][PATCH] root PCI bridge registration updates [5/9]
This patch updates pci_scan_bus_parented() and also has some important
fixes to the PCI bus class.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/bus.c	2005-07-12 01:08:20.0 -0400
+++ b/drivers/pci/bus/bus.c	2005-07-13 02:01:57.0 -0400
@@ -81,7 +81,7 @@

 	bus->class_dev.class = &pcibus_class;
 	sprintf(bus->class_dev.class_id, "%04x:%02x", pci_domain_nr(bus),
-		bus->primary);
+		bus->number);

 	ret = class_device_register(&bus->class_dev);
 	if (ret)
@@ -89,13 +89,15 @@

 	class_device_create_file(&bus->class_dev,
 				 &class_device_attr_cpuaffinity);
-	if (bus->self)
+	if (bus->bridge)
 		sysfs_create_link(&bus->class_dev.kobj,
-				  &bus->self->dev.kobj, "bridge");
+				  &bus->bridge->kobj, "bridge");

-	spin_lock(&pci_bus_lock);
-	list_add_tail(&bus->node, &bus->parent->children);
-	spin_unlock(&pci_bus_lock);
+	if (bus->parent) {
+		spin_lock(&pci_bus_lock);
+		list_add_tail(&bus->node, &bus->parent->children);
+		spin_unlock(&pci_bus_lock);
+	}

 	pci_proc_attach_bus(bus);

--- a/drivers/pci/bus/probe.c	2005-07-12 00:59:58.0 -0400
+++ b/drivers/pci/bus/probe.c	2005-07-13 01:58:54.0 -0400
@@ -427,37 +427,24 @@
 	error = device_register(dev);
 	if (error)
 		goto dev_reg_err;
+
 	b->bridge = get_device(dev);
+	b->number = b->secondary = bus;
+	b->resource[0] = &ioport_resource;
+	b->resource[1] = &iomem_resource;

-	b->class_dev.class = &pcibus_class;
-	sprintf(b->class_dev.class_id, "%04x:%02x", pci_domain_nr(b), bus);
-	error = class_device_register(&b->class_dev);
-	if (error)
-		goto class_dev_reg_err;
-	error = class_device_create_file(&b->class_dev, &class_device_attr_cpuaffinity);
+	error = pci_add_bus(b);
 	if (error)
-		goto class_dev_create_file_err;
+		goto bus_class_reg_err;

 	/* Create legacy_io and legacy_mem files for this bus */
 	pci_create_legacy_files(b);

-	error = sysfs_create_link(&b->class_dev.kobj, &b->bridge->kobj, "bridge");
-	if (error)
-		goto sys_create_link_err;
-
-	b->number = b->secondary = bus;
-	b->resource[0] = &ioport_resource;
-	b->resource[1] = &iomem_resource;
-
 	b->subordinate = pci_scan_child_bus(b);

 	return b;

-sys_create_link_err:
-	class_device_remove_file(&b->class_dev, &class_device_attr_cpuaffinity);
-class_dev_create_file_err:
-	class_device_unregister(&b->class_dev);
-class_dev_reg_err:
+bus_class_reg_err:
 	device_unregister(dev);
 dev_reg_err:
 	spin_lock(&pci_bus_lock);
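The class_id fix above (bus->primary replaced by bus->number) matters because the id is what shows up as the bus name under the pci_bus class, and the primary field holds the number of the bus the bridge sits on rather than the bus being registered. A minimal userspace sketch of the naming follows; format_bus_id() is a hypothetical helper that reuses the "%04x:%02x" format string from pci_add_bus() and is not part of the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a bus class id the way pci_add_bus()
 * does, as "<domain>:<bus number>" in hex. snprintf() is used here
 * instead of the patch's sprintf() so the sketch cannot overrun buf.
 * Returns the number of characters the full id needs.
 */
int format_bus_id(char *buf, size_t len, unsigned int domain,
		  unsigned int number)
{
	return snprintf(buf, len, "%04x:%02x", domain, number);
}
```

For a bridge on bus 0 whose secondary bus is 2, naming by bus->number gives "0000:02". Naming by bus->primary would give "0000:00" for every bus directly behind bus 0, so the ids would collide.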
[RFC][PATCH] basic PCI<->PCI bridge PM (suspend/resume) [9/9]
This patch adds very simplistic suspend/resume support for the PCI bridge
driver. Soon this will be replaced with bridge-specific code, but for now
we'll try using pci_save/restore_state().

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-14 04:22:13.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 04:26:17.257004064 -0400
@@ -159,11 +159,29 @@
 	pci_remove_bus(dev->subordinate);
 }

+static int ppb_suspend(struct pci_dev *dev, pm_message_t state)
+{
+	pci_save_state(dev);
+	pci_set_power_state(dev, pci_choose_state(dev, state));
+
+	return 0;
+}
+
+static int ppb_resume(struct pci_dev *dev)
+{
+	pci_set_power_state(dev, PCI_D0);
+	pci_restore_state(dev);
+
+	return 0;
+}
+
 static struct pci_driver ppb_driver = {
 	.name		= "pci-bridge",
 	.id_table	= ppb_id_tbl,
 	.probe		= ppb_probe,
 	.remove		= ppb_remove,
+	.suspend	= ppb_suspend,
+	.resume		= ppb_resume,
 };

 static int __init ppb_init(void)
[RFC][PATCH] PCI bus class driver rewrite for 2.6.13-rc2 [0/9]
Hi all,

I'm in the process of overhauling some aspects of the PCI subsystem. This
patch series is a rewrite of the PCI probing and detection code. It creates
a well-defined PCI bus class API and allows a standard PCI driver to bind to
PCI bridge devices. This results in the following:

* cleaner code
* improved driver core support
* the option of adding new PCI bridge drivers
* better power management

Example from sysfs: (/sys/bus/pci/drivers)

|-- pci-bridge
|   |-- 0000:00:1e.0 -> ../../../../devices/pci0000:00/0000:00:1e.0
|   |-- bind
|   |-- new_id
|   `-- unbind

Summary:

 drivers/pci/Makefile         |   10
 drivers/pci/bus.c            |   69 ---
 drivers/pci/bus/Makefile     |    9
 drivers/pci/bus/bus.c        |  144 ++
 drivers/pci/bus/bus.h        |    5
 drivers/pci/bus/config.c     |  466
 drivers/pci/bus/device.c     |  187
 drivers/pci/bus/pci-bridge.c |  206 -
 drivers/pci/bus/probe.c      |  512 +-
 drivers/pci/probe.c          |  971 ---
 drivers/pci/remove.c         |  122 -
 include/linux/pci.h          |    4
 12 files changed, 1501 insertions(+), 1204 deletions(-)

For these changes to be fully effective, the following code (some of which
was broken by these changes) will need to be fixed:

1.) PCI resource management and bus numbers
- We need to utilize ACPI-provided PCI root bridge resource information.
- Lazy allocation should be used for device resource assignments.
- The PCI bus resource assignment API needs to be refined.
- We need smarter bus number assignment algorithms that maintain BIOS
  configuration when possible.

2.) PCI Hotplug
- Hotplug drivers should use PCI subsystem resource assignment and
  configuration code whenever possible (e.g. the recent changes to ACPI PCI
  hotplug were a step in the right direction).
- I have some changes planned for device registration.

3.) ACPI
- The new probing code breaks _PRT handling.
- We need to register ACPI devices in the /sys/devices tree so we can bind
  to the root bridge device.

4.) Platform Specific PCI support
- I'd like to improve the "pcibios" API.

5.) PCMCIA/Cardbus
- This needs to use the new PCI bus class driver.

I'm currently working on these issues. I look forward to any comments or
suggestions.

Cheers,
Adam
[RFC][PATCH] device registration cleanups [3/9]
This patch moves all device registration related functions to
bus/device.c.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/device.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/device.c	2005-07-12 01:32:41.0 -0400
@@ -0,0 +1,187 @@
+/*
+ * device.c - PCI device registration
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/init.h>
+
+#include "../pci.h"
+
+/**
+ * add a single device
+ * @dev: device to add
+ *
+ * This adds a single pci device to the global
+ * device list and adds sysfs and procfs entries
+ */
+void __devinit pci_bus_add_device(struct pci_dev *dev)
+{
+	device_add(&dev->dev);
+
+	spin_lock(&pci_bus_lock);
+	list_add_tail(&dev->global_list, &pci_devices);
+	spin_unlock(&pci_bus_lock);
+
+	pci_proc_attach_device(dev);
+	pci_create_sysfs_dev_files(dev);
+}
+
+EXPORT_SYMBOL_GPL(pci_bus_add_device);
+
+/**
+ * pci_bus_add_devices - insert newly discovered PCI devices
+ * @bus: bus to check for new devices
+ *
+ * Add newly discovered PCI devices (which are on the bus->devices
+ * list) to the global PCI device list, add the sysfs and procfs
+ * entries.  Where a bridge is found, add the discovered bus to
+ * the parents list of child buses, and recurse (breadth-first
+ * to be compatible with 2.4)
+ *
+ * Call hotplug for each new devices.
+ */
+void __devinit pci_bus_add_devices(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		/*
+		 * Skip already-present devices (which are on the
+		 * global device list.)
+		 */
+		if (!list_empty(&dev->global_list))
+			continue;
+		pci_bus_add_device(dev);
+	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+
+		BUG_ON(list_empty(&dev->global_list));
+
+		/*
+		 * If there is an unattached subordinate bus, attach
+		 * it and then scan for unattached PCI devices.
+		 */
+		if (dev->subordinate) {
+			if (list_empty(&dev->subordinate->node)) {
+				spin_lock(&pci_bus_lock);
+				list_add_tail(&dev->subordinate->node,
+						&dev->bus->children);
+				spin_unlock(&pci_bus_lock);
+			}
+			pci_bus_add_devices(dev->subordinate);
+
+			sysfs_create_link(&dev->subordinate->class_dev.kobj, &dev->dev.kobj, "bridge");
+		}
+	}
+}
+
+EXPORT_SYMBOL(pci_bus_add_devices);
+
+static void pci_free_resources(struct pci_dev *dev)
+{
+	int i;
+
+	msi_remove_pci_irq_vectors(dev);
+
+	pci_cleanup_rom(dev);
+	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+		struct resource *res = dev->resource + i;
+		if (res->parent)
+			release_resource(res);
+	}
+}
+
+static void pci_destroy_dev(struct pci_dev *dev)
+{
+	if (!list_empty(&dev->global_list)) {
+		pci_proc_detach_device(dev);
+		pci_remove_sysfs_dev_files(dev);
+		device_unregister(&dev->dev);
+		spin_lock(&pci_bus_lock);
+		list_del(&dev->global_list);
+		dev->global_list.next = dev->global_list.prev = NULL;
+		spin_unlock(&pci_bus_lock);
+	}
+
+	/* Remove the device from the device lists, and prevent any further
+	 * list accesses from this device */
+	spin_lock(&pci_bus_lock);
+	list_del(&dev->bus_list);
+	dev->bus_list.next = dev->bus_list.prev = NULL;
+	spin_unlock(&pci_bus_lock);
+
+	pci_free_resources(dev);
+	pci_dev_put(dev);
+}
+
+/**
+ * pci_remove_device_safe - remove an unused hotplug device
+ * @dev: the device to remove
+ *
+ * Delete the device structure from the device lists and
+ * notify userspace (/sbin/hotplug), but only if the device
+ * in question is not being used by a driver.
+ * Returns 0 on success.
+ */
+int pci_remove_device_safe(struct pci_dev *dev)
+{
+	if (pci_dev_driver(dev))
+		return -EBUSY;
+	pci_destroy_dev(dev);
+	return 0;
+}
+
+EXPORT_SYMBOL(pci_remove_device_safe);
+
+/**
+ * pci_remove_bus_device - remove a PCI device and any children
+ * @dev: the device to remove
+ *
+ * Remove a PCI device from the device lists, informing the drivers
+ * that the device has been removed.  We also remove any subordinate
+ * buses and children in a depth-first manner.
+ *
+ * For each device we remove, delete the device structure from the
+ * device lists, remove the /proc entry, and notify userspace
+ * (/sbin/hotplug).
+ */
+void pci_remove_bus_device(struct pci_dev *dev)
+{
+	if (dev->subordinate) {
+		struct pci_bus *b = dev->subordinate;
+
+		pci_remove_behind_bridge
[RFC][PATCH] Add PCI-PCI bridge driver [4/9]
This patch adds a basic PCI-PCI bridge driver that utilizes the new PCI
bus class API.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-08 02:18:43.0 -0400
@@ -0,0 +1,165 @@
+/*
+ * pci-bridge.c - a generic PCI bus driver for PCI-PCI bridges
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/init.h>
+#include <linux/module.h>
+
+static struct pci_device_id ppb_id_tbl[] = {
+	{ PCI_DEVICE_CLASS(PCI_CLASS_BRIDGE_PCI << 8, 0xffff00) },
+	{ 0 },
+};
+
+MODULE_DEVICE_TABLE(pci, ppb_id_tbl);
+
+/**
+ * ppb_create_bus - allocates a bus and fills in basic information
+ * @dev: the pci bridge device
+ */
+static struct pci_bus * ppb_create_bus(struct pci_dev *dev)
+{
+	int i;
+	struct pci_bus *bus = pci_alloc_bus();
+
+	if (!bus)
+		return NULL;
+
+	bus->parent = dev->bus;
+	bus->bridge = &dev->dev;
+	bus->ops = bus->parent->ops;
+	bus->sysdata = bus->parent->sysdata;
+	bus->bridge = get_device(&dev->dev);
+
+	/* Set up default resource pointers and names.. */
+	for (i = 0; i < 4; i++) {
+		bus->resource[i] = &dev->resource[PCI_BRIDGE_RESOURCES+i];
+		bus->resource[i]->name = bus->name;
+	}
+
+	return bus;
+}
+
+/**
+ * ppb_detect_bus - creates a bus and reads configuration space data
+ * @dev: the pci bridge device
+ *
+ * This function will do some verification to ensure we should drive this
+ * bridge.
+ */
+static struct pci_bus * ppb_detect_bus(struct pci_dev *dev)
+{
+	struct pci_bus *bus;
+	u32 buses;
+	u16 bctl;
+	unsigned int busnr;
+
+	pci_read_config_dword(dev, PCI_PRIMARY_BUS, &buses);
+	busnr = (buses >> 8) & 0xFF;
+
+	/*
+	 * FIXME: This driver currently doesn't support bridges that haven't
+	 * been configured by the BIOS.
+	 */
+	if (!(buses & 0xffff00)) {
+		printk(KERN_INFO "PCI: Unable to drive bus %04x:%02x\n",
+			pci_domain_nr(dev->bus), busnr);
+		return NULL;
+	}
+
+	/*
+	 * If we already got to this bus through a different bridge,
+	 * ignore it.  This can happen with the i450NX chipset.
+	 */
+	if (pci_find_bus(pci_domain_nr(dev->bus), busnr)) {
+		printk(KERN_INFO "PCI: Bus %04x:%02x already known\n",
+			pci_domain_nr(dev->bus), busnr);
+		return NULL;
+	}
+
+	bus = ppb_create_bus(dev);
+	if (!bus)
+		return NULL;
+
+	/* Disable MasterAbortMode during probing to avoid reporting
+	 * of bus errors (in some architectures)
+	 */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
+			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
+	bus->number = bus->secondary = busnr;
+	bus->primary = buses & 0xFF;
+	bus->subordinate = (buses >> 16) & 0xFF;
+	bus->bridge_ctl = bctl;
+
+	return bus;
+}
+
+/**
+ * ppb_detect_children - detects and registers child devices
+ * @bus: pci bus
+ */
+static void ppb_detect_children(struct pci_bus *bus)
+{
+	unsigned int devfn;
+
+	/* Go find them, Rover! */
+	for (devfn = 0; devfn < 0x100; devfn += 8)
+		pci_scan_slot(bus, devfn);
+
+	pcibios_fixup_bus(bus);
+	pci_bus_add_devices(bus);
+}
+
+static int ppb_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+	int err;
+	struct pci_bus *bus;
+
+	if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+		return -ENODEV;
+
+	bus = ppb_detect_bus(dev);
+	if (!bus)
+		return -ENODEV;
+
+	err = pci_add_bus(bus);
+	if (err)
+		goto out;
+
+	dev->subordinate = bus;
+	ppb_detect_children(bus);
+	return 0;
+
+out:
+	kfree(bus);
+	return err;
+}
+
+static void ppb_remove(struct pci_dev *dev)
+{
+	pci_remove_behind_bridge(dev);
+	pci_remove_bus(dev->subordinate);
+}
+
+static struct pci_driver ppb_driver = {
+	.name		= "pci-bridge",
+	.id_table	= ppb_id_tbl,
+	.probe		= ppb_probe,
+	.remove		= ppb_remove,
+};
+
+static int __init ppb_init(void)
+{
+	return pci_register_driver(&ppb_driver);
+}
+
+static void __exit ppb_exit(void)
+{
+	pci_unregister_driver(&ppb_driver);
+}
+
+module_init(ppb_init);
+module_exit(ppb_exit);
--- a/drivers/pci/bus/Makefile	2005-07-07 22:22:49.0 -0400
+++ b/drivers/pci/bus/Makefile	2005-07-08 02:16:39.0 -0400
@@ -2,4 +2,4 @@
 # Makefile for the PCI device detection
 #
 
-obj-y := bus.o config.o device.o probe.o
+obj-y := bus.o config.o device.o probe.o pci-bridge.o
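For readers following ppb_detect_bus() above: the dword read at
PCI_PRIMARY_BUS packs the primary, secondary and subordinate bus numbers
into its low three bytes.  The standalone sketch below shows that decoding
outside the kernel.  The names bridge_buses, decode_bridge_buses() and
bridge_is_configured() are made up for illustration and are not part of
the patch; the 0xffff00 test is assumed to mirror the "configured by the
BIOS" check in the driver.

```c
#include <stdint.h>

/* Hypothetical container for the three bus numbers (not a kernel type). */
struct bridge_buses {
	uint8_t primary;	/* bus the bridge itself sits on */
	uint8_t secondary;	/* bus directly behind the bridge */
	uint8_t subordinate;	/* highest bus number behind the bridge */
};

/* Split the dword read from config offset 0x18 (PCI_PRIMARY_BUS). */
static struct bridge_buses decode_bridge_buses(uint32_t buses)
{
	struct bridge_buses b;

	b.primary = buses & 0xFF;
	b.secondary = (buses >> 8) & 0xFF;
	b.subordinate = (buses >> 16) & 0xFF;
	return b;
}

/* Non-zero secondary/subordinate fields: firmware has set the bridge up. */
static int bridge_is_configured(uint32_t buses)
{
	return (buses & 0xffff00) != 0;
}
```

With a register value of 0x00050200, for example, the bridge sits on bus
0, bus 2 is directly behind it, and bus numbers up to 5 are routed
through it.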
[RFC][PATCH] split PCI probing code [1/9]
This patch divides the PCI probing code into three smaller files:

config.c - PCI configuration space parsing
probe.c  - PCI bus detection routines
bus.c    - the PCI bus class driver core

These files are placed in the new directory drivers/pci/bus.  It will be
used for functions related to the PCI bus class driver and PCI device
detection in general.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/Makefile	2005-07-08 17:06:19.0 -0400
+++ b/drivers/pci/Makefile	2005-07-10 22:32:53.0 -0400
@@ -2,9 +2,9 @@
 # Makefile for the PCI bus specific drivers.
 #
 
-obj-y		+= access.o bus.o probe.o remove.o pci.o quirks.o \
-			names.o pci-driver.o search.o pci-sysfs.o \
-			rom.o
+obj-y		+= access.o bus.o remove.o pci.o quirks.o names.o \
+			pci-driver.o search.o pci-sysfs.o rom.o bus/
+
 obj-$(CONFIG_PROC_FS) += proc.o
 
 ifndef CONFIG_SPARC64
--- a/drivers/pci/bus/Makefile	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/Makefile	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,5 @@
+#
+# Makefile for the PCI device detection
+#
+
+obj-y := bus.o config.o probe.o
--- a/drivers/pci/bus/bus.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/bus.c	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,69 @@
+/*
+ * bus.c - the PCI bus class driver
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+
+#include "bus.h"
+
+#undef DEBUG
+
+#ifdef DEBUG
+#define DBG(x...) printk(x)
+#else
+#define DBG(x...)
+#endif
+
+
+/*
+ * PCI Bus Class
+ */
+
+static void pci_release_bus_classdev(struct class_device *class_dev)
+{
+	struct pci_bus *pci_bus = to_pci_bus(class_dev);
+
+	if (pci_bus->bridge)
+		put_device(pci_bus->bridge);
+	kfree(pci_bus);
+}
+
+struct class pcibus_class = {
+	.name		= "pci_bus",
+	.release	= pci_release_bus_classdev,
+};
+
+static int __init pcibus_class_init(void)
+{
+	return class_register(&pcibus_class);
+}
+
+postcore_initcall(pcibus_class_init);
+
+
+/*
+ * Registration
+ */
+
+/**
+ * pci_alloc_bus - allocates a pci_bus structure
+ */
+struct pci_bus * pci_alloc_bus(void)
+{
+	struct pci_bus *b;
+
+	b = kmalloc(sizeof(*b), GFP_KERNEL);
+	if (b) {
+		memset(b, 0, sizeof(*b));
+		INIT_LIST_HEAD(&b->node);
+		INIT_LIST_HEAD(&b->children);
+		INIT_LIST_HEAD(&b->devices);
+	}
+	return b;
+}
+
+EXPORT_SYMBOL(pci_alloc_bus);
--- a/drivers/pci/bus/bus.h	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/bus.h	2005-07-10 22:32:53.0 -0400
@@ -0,0 +1,5 @@
+/*
+ * bus.h - functions internal to PCI device detection
+ */
+
+extern struct class pcibus_class;
--- a/drivers/pci/bus/config.c	1969-12-31 19:00:00.0 -0500
+++ b/drivers/pci/bus/config.c	2005-07-12 00:52:35.147664368 -0400
@@ -0,0 +1,466 @@
+/*
+ * config.c - PCI configuration space parsing code
+ */
+
+#include <linux/delay.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/cpumask.h>
+
+#include "../pci.h"
+
+#define PCI_CFG_SPACE_SIZE	256
+#define PCI_CFG_SPACE_EXP_SIZE	4096
+
+LIST_HEAD(pci_devices);
+
+/**
+ * pci_release_dev - free a pci device structure when all users of it are finished.
+ * @dev: device that's been disconnected
+ *
+ * Will be called only by the device core when all users of this pci device are
+ * done.
+ */
+static void pci_release_dev(struct device *dev)
+{
+	struct pci_dev *pci_dev;
+
+	pci_dev = to_pci_dev(dev);
+	kfree(pci_dev);
+}
+
+/*
+ * Translate the low bits of the PCI base
+ * to the resource type
+ */
+static inline unsigned int pci_calc_resource_flags(unsigned int flags)
+{
+	if (flags & PCI_BASE_ADDRESS_SPACE_IO)
+		return IORESOURCE_IO;
+
+	if (flags & PCI_BASE_ADDRESS_MEM_PREFETCH)
+		return IORESOURCE_MEM | IORESOURCE_PREFETCH;
+
+	return IORESOURCE_MEM;
+}
+
+/*
+ * Find the extent of a PCI decode..
+ */
+static u32 pci_size(u32 base, u32 maxbase, unsigned long mask)
+{
+	u32 size = mask & maxbase;	/* Find the significant bits */
+	if (!size)
+		return 0;
+
+	/* Get the lowest of them to find the decode size, and
+	   from that the extent.  */
+	size = (size & ~(size-1)) - 1;
+
+	/* base == maxbase can be valid only if the BAR has
+	   already been programmed with all 1s.  */
+	if (base == maxbase && ((base | size) & mask) != mask)
+		return 0;
+
+	return size;
+}
+
+static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
+{
+	unsigned int pos, reg, next;
+	u32 l, sz;
+	struct resource *res;
+
+	for(pos=0; pos<howmany; pos = next) {
+		next = pos+1;
+		res = &dev->resource[pos];
+		res->name
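The pci_size() helper moved by this patch is easier to follow with
concrete numbers.  The sketch below repeats the same arithmetic in a
standalone function; the name bar_extent() and the use of fixed-width
types are choices made for this illustration, not something taken from
the patch.  "maxbase" is the value read back after writing all 1s to the
BAR, and "mask" strips the low flag bits (0xfffffff0 for a 32-bit memory
BAR, 0xfffffffc for an I/O BAR).

```c
#include <stdint.h>

/* Same arithmetic as pci_size() in the patch, with fixed-width types. */
static uint32_t bar_extent(uint32_t base, uint32_t maxbase, uint32_t mask)
{
	uint32_t size = mask & maxbase;	/* keep only the address bits */

	if (!size)
		return 0;		/* BAR not implemented */

	/* Isolate the lowest set bit (the decode size), then subtract one. */
	size = (size & ~(size - 1)) - 1;

	/* base == maxbase is only valid if the BAR really held all 1s. */
	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;

	return size;
}
```

For a memory BAR that reads back 0xfffff000 the lowest writable address
bit is 0x1000, so the function returns 0xfff and the region spans base
through base + 0xfff (4 KiB).  An I/O BAR that reads back 0xffffff01
gives 0xff, i.e. 256 ports.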
[RFC][PATCH] master abort on scanning fixes [6/9]
The PCI bridge driver now checks if changing bridge_ctl is necessary.
It also restores the original bridge_ctl settings when finished scanning
for devices.  Finally, a pci_bus setup fix is included.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-12 01:45:46.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 02:09:15.0 -0400
@@ -30,7 +30,7 @@
 	bus->bridge = &dev->dev;
 	bus->ops = bus->parent->ops;
 	bus->sysdata = bus->parent->sysdata;
-	bus->bridge = get_device(&dev->dev);
+	bus->self = dev;
 
 	/* Set up default resource pointers and names.. */
 	for (i = 0; i < 4; i++) {
@@ -82,12 +82,7 @@
 	if (!bus)
 		return NULL;
 
-	/* Disable MasterAbortMode during probing to avoid reporting
-	 * of bus errors (in some architectures)
-	 */
 	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
-	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
-			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
 
 	bus->number = bus->secondary = busnr;
 	bus->primary = buses & 0xFF;
@@ -105,10 +100,22 @@
 {
 	unsigned int devfn;
 
+	/* Disable MasterAbortMode during probing to avoid reporting
+	 * of bus errors (in some architectures)
+	 */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+			bus->bridge_ctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
 	/* Go find them, Rover! */
 	for (devfn = 0; devfn < 0x100; devfn += 8)
 		pci_scan_slot(bus, devfn);
 
+	/* restore the original bridge_ctl configuration */
+	if (!(bus->bridge_ctl & PCI_BRIDGE_CTL_MASTER_ABORT))
+		pci_write_config_word(bus->self, PCI_BRIDGE_CONTROL,
+			bus->bridge_ctl);
+
 	pcibios_fixup_bus(bus);
 	pci_bus_add_devices(bus);
 }
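The "mask a control bit while scanning, then restore it" idea in this
patch can be tried outside the kernel with a small simulation.
Everything here is hypothetical scaffolding: fake_bridge,
fake_write_ctl(), record_scan() and scan_with_master_abort_masked() are
invented for the sketch, and only the bit value 0x20 matches
PCI_BRIDGE_CTL_MASTER_ABORT.  Note also that the sketch writes the
register only when the bit is currently set.  That is an assumption about
the intent ("checks if changing bridge_ctl is necessary"); the posted
hunks test !(bridge_ctl & ...), so this is not a restatement of the
patch.

```c
#include <stdint.h>

#define BRIDGE_CTL_MASTER_ABORT 0x20	/* same bit as PCI_BRIDGE_CTL_MASTER_ABORT */

/* Hypothetical stand-in for a bridge: one cached and one live register. */
struct fake_bridge {
	uint16_t bridge_ctl;	/* value saved at detect time */
	uint16_t hw_ctl;	/* what the "hardware" currently holds */
	int writes;		/* number of config writes issued */
};

static void fake_write_ctl(struct fake_bridge *b, uint16_t val)
{
	b->hw_ctl = val;
	b->writes++;
}

/* Example scan callback: just records what the register held mid-scan. */
static uint16_t seen_during_scan;

static void record_scan(struct fake_bridge *b)
{
	seen_during_scan = b->hw_ctl;
}

/* Clear the bit for the duration of scan(), then put the saved value back. */
static void scan_with_master_abort_masked(struct fake_bridge *b,
					  void (*scan)(struct fake_bridge *))
{
	int was_set = (b->bridge_ctl & BRIDGE_CTL_MASTER_ABORT) != 0;

	if (was_set)
		fake_write_ctl(b, (uint16_t)(b->bridge_ctl & ~BRIDGE_CTL_MASTER_ABORT));

	scan(b);

	if (was_set)
		fake_write_ctl(b, b->bridge_ctl);
}
```

Keeping the saved value in the structure, as the patch does with
bus->bridge_ctl, means the restore step never has to re-read the
register.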
[RFC][PATCH] don't bind to PCI express links [8/9]
This patch prevents the PCI-PCI bridge driver from binding to PCI express
devices.  This is needed to coexist with the PCI express root port driver.
Eventually we may want to rework and better integrate Linux PCI express
link support, but for now this should work.

Signed-off-by: Adam Belay <[EMAIL PROTECTED]>

--- a/drivers/pci/bus/pci-bridge.c	2005-07-14 02:30:09.0 -0400
+++ b/drivers/pci/bus/pci-bridge.c	2005-07-14 02:46:12.0 -0400
@@ -132,6 +132,10 @@
 	if (dev->subordinate)
 		return -ENODEV;
 
+	/* don't bind to pci express links */
+	if (pci_find_capability(dev, PCI_CAP_ID_EXP))
+		return -ENODEV;
+
 	bus = ppb_detect_bus(dev);
 	if (!bus)
 		return -ENODEV;
Re: [RFC][PATCH] add PCI bus registration support [2/9]
On Thu, 2005-07-14 at 12:33 -0700, Greg KH wrote:
> On Thu, Jul 14, 2005 at 04:55:12AM -0400, Adam Belay wrote:
> > +EXPORT_SYMBOL(pci_add_bus);
>
> This doesn't need to be exported, right?  No module uses it.  But if
> they do, I suggest EXPORT_SYMBOL_GPL() instead, is that ok?
>
> thanks,
>
> greg k-h

Yes, no module currently uses it, but now that pci_driver is supported,
any PCI bridge driver could potentially be made into a module.  In theory,
this could even include the PCI-PCI bridge driver.  I also wanted to
export this symbol so that it would be easier to add new drivers for more
unusual bridge hardware.

EXPORT_SYMBOL_GPL() would be fine.

Thanks,
Adam
Re: [RFC][PATCH] split PCI probing code [1/9]
On Thu, 2005-07-14 at 12:30 -0700, Greg KH wrote:
> On Thu, Jul 14, 2005 at 07:10:14PM +0200, Francois Romieu wrote:
> > Adam Belay <[EMAIL PROTECTED]> :
> > [...]
> >
> > Some nits + a suspect error branch. It seems nice otherwise.
>
> If I'm correct, this patch only moves the code into different files, it
> doesn't change any of it, so your comments apply to the current code
> today, not Adam's changes :)

Correct.  I've been trying to make my changes incremental.  Nonetheless, I
do appreciate the comments.  I'll try to apply these fixes to my current
tree.

Thanks,
Adam
Re: [patch 2.6.13-rc2] pci: restore BAR values from pci_set_power_state for D3hot->D0
On Fri, Jul 08, 2005 at 02:34:56PM -0400, John W. Linville wrote:
> Some PCI devices lose all configuration (including BARs) when
> transitioning from D3hot->D0.  This leaves such a device in an
> inaccessible state.  The patch below causes the BARs to be restored
> when enabling such a device, so that its driver will be able to
> access it.
>
> Signed-off-by: John W. Linville <[EMAIL PROTECTED]>
> ---
> Some firmware leaves devices in D3hot after a (re)boot.  Most drivers
> call pci_enable_device very early, so devices left in D3hot that lose
> configuration during the D3hot->D0 transition will be inaccessible to
> their drivers.

Also, I think there is a possibility of only enabling boot devices for
ACPI S4.  However, for the reboot case, we're not restoring anything.
Instead new resource assignments are being made.  Doesn't the PCI
subsystem already handle this?

>
> Drivers could be modified to account for this, but it would
> be difficult to know which drivers need modification.  This is
> especially true since often many devices are covered by the same
> driver.  It likely would be necessary to replicate code across dozens
> of drivers.

Agreed.

>
> The patch below should trigger only when transitioning from D3hot->D0
> (or at boot), and only for devices that have the "no soft reset" bit
> cleared in the PM control register.  I believe it is safe to include as
> part of the PCI infrastructure.

>   * pci_set_power_state - Set the power state of a PCI device
>   * @dev: PCI device to be suspended
>   * @state: PCI power state (D0, D1, D2, D3hot, D3cold) we're entering
> @@ -239,7 +270,7 @@ pci_find_parent_resource(const struct pc
>  int
>  pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  {

Couldn't this be in pci_restore_state() instead?  I was thinking it would
(in part) replace the ugly dword reads we have now.  They include many
registers we don't need to touch.  I wonder if we'll need pci_save_state()
at all or if we can derive all the information from the pci_dev.  I'll
have to look into it further.  Also we need a way to restore specific PCI
capabilities.

Thanks,
Adam
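The condition John describes ("only when transitioning from D3hot->D0 ...
and only for devices that have the no soft reset bit cleared") boils down
to a test on the PM control/status register value read before the
transition.  A minimal sketch of that test follows.  The function name
needs_bar_restore() is invented for this illustration, and the bit
layout is assumed from the PCI power management capability: bits 1:0
hold the power state and bit 3 is No_Soft_Reset.

```c
#include <stdint.h>

#define PM_CTRL_STATE_MASK	0x0003	/* current power state, D0..D3hot */
#define PM_CTRL_NO_SOFT_RESET	0x0008	/* set: config survives D3hot->D0 */

/* Decide from the PMCSR value read *before* the transition. */
static int needs_bar_restore(uint16_t pmcsr_before, unsigned int new_state)
{
	unsigned int old_state = pmcsr_before & PM_CTRL_STATE_MASK;

	if (old_state != 3 || new_state != 0)
		return 0;	/* not a D3hot -> D0 transition */

	/* Bit clear means the device does an internal reset on the way up. */
	return !(pmcsr_before & PM_CTRL_NO_SOFT_RESET);
}
```

Whether the restore itself then lives in pci_set_power_state() or in
pci_restore_state(), as discussed above, does not change this check.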
Re: Bug in pcmcia-core
On Thu, Jun 16, 2005 at 11:37:30PM +0100, James Courtier-Dutton wrote:
> Hi,
>
> I have tried contacting the mailing list for the PCMCIA subsystem in
> Linux, but no-one seems to respond.
>
> PCMCIA SUBSYSTEM
> L:      http://lists.infradead.org/mailman/listinfo/linux-pcmcia
> S:      Unmaintained
>
> I am trying to write a Linux ALSA driver for the Creative Audigy 2 NX
> Notebook PCMCIA card.
> This is a cardbus card, that uses ioports.
> When it is inserted into the laptop, the entry appears in "lspci -vv"
> showing ioports used by the card.
> As soon as my driver uses "outb()" to anything in the address range
> shown in "lspci -vv", the PC hangs.
>
> I can only conclude from this that ioport resources are not being
> allocated correctly to the PCMCIA card.

It's possible.

>
> Can anybody help me track this down. If someone could tell me which
> PCMCIA and PCI registers should be set for it to work, I could then find
> out which pcmcia registers have not been set correctly, and fix the bug.
>
> It seems that the PCMCIA specification is not open and free, so I cannot
> refer to it in order to fix this myself.
>
> Can anybody help me?
>
> James

Please provide more information: /proc/ioports, lspci -vv, the ranges
assigned to your driver, and your driver code if it's available.  I'll try
to look into the problem.

Thanks,
Adam
Re: [2.6 patch] drivers/pnp/pnpbios/rsparser.c: fix an array overflow
On Sat, Apr 09, 2005 at 08:03:52PM +0200, Adrian Bunk wrote:
> This patch fixes an array overflow found by the Coverity checker.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>

Looks good.

Thanks,
Adam
Re: [linux-pm] Re: [RFC] Driver States
On Tue, 2005-04-05 at 11:24 +0200, Pavel Machek wrote:
> Hi!
>
> > > You have a few things here that can easily conflict, and that will be
> > > developed at different paces. I like the direction that it's going, but
> > > how do you intend to do it gradually. I.e. what to do first?
> >
> > I think the first step would be for us to all agree on a design, whether
> > it be this one or another, so we can begin planning for long term
> > changes.
> >
> > My arguments for these changes are as follows:
>
> 0. I do not see how to gradually roll this in.
>
> > 4. Having responsibilities at each driver level encourages a
> > layered and object based design, reducing code duplication and
> > complexity.
>
> Unfortunately, you'll be retrofitting this to existing drivers. AFAICS,
> trying to force existing driver to "layered and object based design"
> can only result in mess.
> 								Pavel

Fair enough.  How does this sound?

I'd like to add "*attach" and "*detach" to "struct device_driver".  These
functions would act as one-time initializers and destructors.  Then we
could rename "*probe" to "*start", and "*remove" to "*stop", which should
be rather trivial to fix up.  From there drivers could slowly be converted
to use "*attach" and "*detach", but will not be broken along the way.

So the basic flow would be like this:

1.) a driver is bound to a device
2.) *attach is called to allocate data structures
3.) *start when it's time to probe the device
4.) *stop when the user disables the device
5.) repeat steps 3 and 4 any number of times
6.) *detach is called when unbinding the driver

The driver layering stuff could come later, but just implementing these
specific components would have immediate benefits.  In this early stage in
development, I'd like to at least be able to start and stop drivers for
reasons outside of power management (e.g. user preference or resource
re-balancing).  If a "*resume" function can also utilize this
functionality, then all the better.

Thanks,
Adam
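The six-step flow above can be sketched as a small userspace state
machine.  This is only an illustration of the ordering rules, not
proposed kernel code: toy_device, toy_driver, the TOY_* states and the
toy_*() helpers are all hypothetical names.  Callbacks are treated as
optional, on the assumption that unconverted drivers would not yet
provide "*attach" and "*detach".

```c
#include <stddef.h>

/* Hypothetical states implied by the proposed callbacks. */
enum toy_state { TOY_UNBOUND, TOY_ATTACHED, TOY_STARTED };

struct toy_device {
	enum toy_state state;
	int starts;		/* how many times start has succeeded */
};

/* Hypothetical ops table mirroring the proposed names. */
struct toy_driver {
	int  (*attach)(struct toy_device *dev);
	int  (*start)(struct toy_device *dev);
	void (*stop)(struct toy_device *dev);
	void (*detach)(struct toy_device *dev);
};

/* Steps 1 and 2: bind, then one-time allocation via attach. */
static int toy_bind(const struct toy_driver *drv, struct toy_device *dev)
{
	int err;

	if (dev->state != TOY_UNBOUND)
		return -1;
	err = drv->attach ? drv->attach(dev) : 0;
	if (err)
		return err;
	dev->state = TOY_ATTACHED;
	return 0;
}

/* Step 3: may be repeated, but only from the attached state. */
static int toy_start(const struct toy_driver *drv, struct toy_device *dev)
{
	int err;

	if (dev->state != TOY_ATTACHED)
		return -1;
	err = drv->start ? drv->start(dev) : 0;
	if (err)
		return err;
	dev->state = TOY_STARTED;
	dev->starts++;
	return 0;
}

/* Step 4: back to attached; data structures stay allocated. */
static int toy_stop(const struct toy_driver *drv, struct toy_device *dev)
{
	if (dev->state != TOY_STARTED)
		return -1;
	if (drv->stop)
		drv->stop(dev);
	dev->state = TOY_ATTACHED;
	return 0;
}

/* Step 6: stop first if still running, then one-time teardown. */
static int toy_unbind(const struct toy_driver *drv, struct toy_device *dev)
{
	if (dev->state == TOY_UNBOUND)
		return -1;
	if (dev->state == TOY_STARTED)
		toy_stop(drv, dev);
	if (drv->detach)
		drv->detach(dev);
	dev->state = TOY_UNBOUND;
	return 0;
}
```

The point of splitting things this way is that toy_start() and
toy_stop() can cycle any number of times without repeating the
allocation done in attach.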
Re: [patch 1/3] pnpbios eliminate bad section references
On Mon, Apr 04, 2005 at 12:56:32PM -0700, Randy.Dunlap wrote:
> maximilian attems wrote:
> >one of the last buildcheck errors on i386,
> >thanks Randy again for double checking.
> >
> >Fix pnpbios section references:
> >make dmi_system_id pnpbios_dmi_table __initdata
> >
> >Error: ./drivers/pnp/pnpbios/core.o .data refers to 0100 R_386_32
> >.init.text
> >Error: ./drivers/pnp/pnpbios/core.o .data refers to 012c R_386_32
> >.init.text
> >
> >Signed-off-by: maximilian attems <[EMAIL PROTECTED]>
> >
> >
> >--- linux-2.6.12-rc1-bk5/drivers/pnp/pnpbios/core.c.orig	2005-04-04 19:11:37.814477672 +0200
> >+++ linux-2.6.12-rc1-bk5/drivers/pnp/pnpbios/core.c	2005-04-04 19:25:50.074402365 +0200
> >@@ -512,7 +512,7 @@
> > 	return 0;
> > }
> >
> >-static struct dmi_system_id pnpbios_dmi_table[] = {
> >+static struct dmi_system_id pnpbios_dmi_table[] __initdata = {
> > 	{	/* PnPBIOS GPF on boot */
> > 		.callback = exploding_pnp_bios,
> > 		.ident = "Higraded P14H",
>
> Looks OK to me, but I'd prefer to leave it up to Adam.

Thank you for forwarding this to me.  It looks good.

Cheers,
Adam
Re: PCI bridge devices questions
On Sat, Apr 02, 2005 at 01:04:33PM -0500, Marty Leisner wrote:
> I have to write some code to insert a non-standard bridge
> (it identifies itself as bridge-other, but it functions
> as a pci-pci bridge).
>
> I'm going to be using 2.4.2x and eventually 2.6.x for intel
> and ppc...

I'm currently working on a new pci bridge class framework for 2.6.  The
most significant change is that you will be able to bind to the bridge
using a "struct pci_driver".

>
> In the pci_dev structure (for 2.4.29)
> there's
> (in include/linux/pci.h)
>
> #define DEVICE_COUNT_RESOURCE   12
> struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regions + expansion ROMs */
>
> We also have:
> /*
>  *  For PCI devices, the region numbers are assigned this way:
>  *
>  *      0-5     standard PCI regions
>  *      6       expansion ROM
>  *      7-10    bridges: address space assigned to buses behind the bridge
>  */
>
> #define PCI_ROM_RESOURCE 6
> #define PCI_BRIDGE_RESOURCES 7
> #define PCI_NUM_RESOURCES 11
>
> Now where my confusion sets in:
>       1) PCI_NUM_RESOURCES + 1 == DEVICE_COUNT_RESOURCE
>               Why?

At a glance it looks like it's because the array starts at 0.

>       2) I understand the first 6 regions (standard) and the expansion rom --
>               why 5 more?

I'm currently redesigning this to use a resource array in "struct device".

>       3) I've only seen instances of 3 bus regions used -- IO, MEM prefetch,
>               MEM nonprefetch -- are they order dependent?

There are 4 on cardbus bridges.  In my implementation, they will probably
not be very order dependent.

>
> Thanks...
>
> Marty Leisner
> [EMAIL PROTECTED]

Could you provide any additional details about this bridge?

Thanks,
Adam
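To make the region numbering quoted above easier to check, here is a
small sketch that classifies an index into pci_dev->resource[] using the
same four constants.  The enum and the function resource_slot_kind() are
invented for illustration.  Counting with it shows that indices 0
through 10 account for the PCI_NUM_RESOURCES = 11 described entries,
which leaves index 11 as one slot the quoted header comment does not
describe; the sketch simply labels it "spare".

```c
#define PCI_ROM_RESOURCE	6
#define PCI_BRIDGE_RESOURCES	7
#define PCI_NUM_RESOURCES	11
#define DEVICE_COUNT_RESOURCE	12

/* Hypothetical labels for the slots in pci_dev->resource[]. */
enum slot_kind {
	SLOT_INVALID,		/* outside the array */
	SLOT_BAR,		/* 0-5: standard PCI regions */
	SLOT_ROM,		/* 6: expansion ROM */
	SLOT_BRIDGE_WINDOW,	/* 7-10: windows behind a bridge */
	SLOT_SPARE		/* 11: not described by the header comment */
};

static enum slot_kind resource_slot_kind(int idx)
{
	if (idx < 0 || idx >= DEVICE_COUNT_RESOURCE)
		return SLOT_INVALID;
	if (idx < PCI_ROM_RESOURCE)
		return SLOT_BAR;
	if (idx == PCI_ROM_RESOURCE)
		return SLOT_ROM;
	if (idx < PCI_NUM_RESOURCES)
		return SLOT_BRIDGE_WINDOW;
	return SLOT_SPARE;
}
```

The four bridge-window slots (7 through 10) match the "4 on cardbus
bridges" mentioned in the reply; an ordinary PCI-PCI bridge uses three
of them.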
[RFC] Driver States
Dynamic power management may require devices and drivers to transition
between various physical and logical states.  I would like to start a
discussion on how these might be defined at the bus, driver, and class
levels.

Bus Level
=========
At the bus level, there are two state attributes, power and
enable/disable.  Enable/disable may mean different things on different
buses, but they generally refer to resource decoding.  A device can only
be enabled during a non-off power state.

A possible API:

struct bus_type {
	char			* name;

	struct subsystem	subsys;
	struct kset		drivers;
	struct kset		devices;

	struct bus_attribute	* bus_attrs;
	struct device_attribute	* dev_attrs;
	struct driver_attribute	* drv_attrs;

	int	(*match)(struct device * dev, struct device_driver * drv);
	int	(*hotplug)(struct device *dev, char **envp,
			   int num_envp, char *buffer, int buffer_size);
	int	(*suspend)(struct device * dev, pm_message_t state);
	int	(*resume)(struct device * dev);
	int	(*enable)(struct device * dev);
	int	(*disable)(struct device * dev);
};

Driver Level
============
At the driver level there are two areas of interest, physical and logical
state.  There is an additional concern of transitioning between these
states multiple times.  Because a driver acts as a bridge between physical
and logical components, separating these steps seems natural.

A possible API:

struct device_driver {
	char			* name;
	struct bus_type		* bus;

	struct semaphore	unload_sem;
	struct kobject		kobj;
	struct list_head	devices;

	struct module		* owner;

	int	(*attach)(struct device * dev);
	int	(*start)(struct device * dev);
	int	(*open)(struct device * dev);
	int	(*close)(struct device * dev);
	void	(*stop)(struct device * dev);
	void	(*detach)(struct device * dev);
	void	(*shutdown)(struct device * dev);
	int	(*suspend)(struct device * dev, u32 state, u32 level);
	int	(*resume)(struct device * dev, u32 level);
};

*attach - allocates data structures, creates sysfs entries, prepares
          driver to handle the hardware.
*start  - sets up device resources and configures the hardware.  Loads
          firmware, etc. (physical)
*open   - engages the hardware, and makes it usable by the class device.
          (logical and physical)
*close  - disengages the hardware, and stops class level access
          (logical and physical)
*stop   - physically disables the hardware (physical)
*detach - tears down the driver and releases it from the "struct device"

The idea behind *attach and *detach is to move code that would only need
to be called once out of *probe and *remove.

A table could be defined that indicates what should be called for each
power level transition.  *suspend and *resume could handle any extra steps
(e.g. saving state).  As an example, *start and *stop may only be called
when power is going to be lost entirely.

Additional states are class specific and would only be used after *open is
called.

Class Level
===========
At the class level, we could have a simple start/stop mechanism.

A possible API:

struct class_device {
	struct list_head	node;

	struct kobject		kobj;
	struct class		* class;
	struct device		* dev;
	void			* class_data;

	char	class_id[BUS_ID_SIZE];

	int	(*attach)(struct device * dev);
	int	(*start)(struct device * dev);
	void	(*stop)(struct device * dev);
	void	(*detach)(struct device * dev);
};

*attach - allocates data structures, creates sysfs entries, prepares
          class to handle the device.
*start  - start the logical class device, accept userspace interaction
*stop   - stop the logical class device, deny userspace interaction
*detach - tear down the class driver's bindings with this class device

These are just rough ideas.  I look forward to any comments or alternative
approaches.

Thanks,
Adam
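One way to picture the ordering of the proposed driver callbacks is as a small state machine. The sketch below is a userspace model only: the state names, the `drv_op` enum and the `drv_transition` helper are assumptions made for illustration and are not part of the proposal, which does not spell out the legal transitions.

```c
#include <assert.h>

/* Hypothetical states implied by the attach/start/open ordering. */
enum drv_state {
	DRV_DETACHED,   /* no driver bound */
	DRV_ATTACHED,   /* after *attach: data structures and sysfs set up */
	DRV_STARTED,    /* after *start: hardware configured (physical) */
	DRV_OPEN        /* after *open: usable by the class device */
};

enum drv_op { OP_ATTACH, OP_START, OP_OPEN, OP_CLOSE, OP_STOP, OP_DETACH };

/*
 * Apply one callback to the current state.  Returns 0 and updates *state
 * when the call is legal in that state, or -1 and leaves *state unchanged.
 */
int drv_transition(enum drv_state *state, enum drv_op op)
{
	static const struct {
		enum drv_op op;
		enum drv_state from, to;
	} table[] = {
		{ OP_ATTACH, DRV_DETACHED, DRV_ATTACHED },
		{ OP_START,  DRV_ATTACHED, DRV_STARTED  },
		{ OP_OPEN,   DRV_STARTED,  DRV_OPEN     },
		{ OP_CLOSE,  DRV_OPEN,     DRV_STARTED  },
		{ OP_STOP,   DRV_STARTED,  DRV_ATTACHED },
		{ OP_DETACH, DRV_ATTACHED, DRV_DETACHED },
	};
	unsigned int i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].op == op && table[i].from == *state) {
			*state = table[i].to;
			return 0;
		}
	}
	return -1;
}
```

Under this model a transition that loses power entirely would run close, stop and later start, open, without repeating attach, which matches the stated goal of keeping one-time setup out of the paths that are called repeatedly.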
Re: [RFC] Some thoughts on device drivers and sysfs
On Sun, 2005-03-27 at 23:43 +0200, Dominik Brodowski wrote:
> On Sun, Mar 27, 2005 at 04:27:24PM -0500, Adam Belay wrote:
> > > extern int device_create_file(struct device *device,
> > >                               struct device_attribute * entry);
> > > and delete them (e.g. in ->remove) using
> > > extern void device_remove_file(struct device * dev,
> > >                                struct device_attribute * attr);
> > >
> > > and there's also
> > >
> > > extern int driver_create_file(struct device_driver *,
> > >                               struct driver_attribute *);
> > > extern void driver_remove_file(struct device_driver *,
> > >                                struct driver_attribute *);
> > >
> > >
> > > 	Dominik
> >
> > Yes, I'm aware of these functions but they pollute the bus level
> > namespace.  I'm interested in reactions to this alternative approach.  I
> > wanted to explore the possibility of making a device driver instance a
> > separate component with its own individual state and relationships.
>
> To be honest, I don't consider this to be a pollution of the "bus"
> namespace, but I fear that having two different places for somewhat similar,
> or even equal, data adds unneeded complexity to the driver model. In what
> specific instances has the current design limited or obstructed your
> intentions?
>

Fair enough.  I just wanted to float this possibility.  I appreciate your
comments.  The original intention for this design was to begin working on
a framework for driver layering. (ex. snd-intel8x0m -> ac97, or the pci
express bus abstraction)  I was considering the possibility of having
driver devices with parent and child relationships that reflect the
internal layering of Linux drivers.  I haven't really had a chance to
fully develop this idea, so at this point, driver layering and my original
email are just abstract concepts.

Adam
[RFC] Some thoughts on device drivers and sysfs
One of the original design goals of sysfs was to provide a standardized
location to keep driver configuration attributes.  Although sysfs handles
this very well for bus devices and class devices, there isn't currently a
method to export attributes for device drivers and their specific bound
device instances to userspace.

I would like to propose that we create a new type of device that would act
as the layer between physical (bus devices) and logical (class devices).
It could be referred to as a "driver device".  Driver devices would bind
to a bus device and create one or more class devices.  Their type would
be of "struct device_driver".  As an example, this would allow us to move
something like /proc/driver/emu10k1/:01:09.0 into sysfs.

(physical)            |            (logical)
                      |
bus device --> driver device --> class device
                      |

struct driver_device {
	struct list_head	node;
	unsigned long		id;

	struct kobject		kobj;
	struct device_driver	*drv;
	struct device		*dev;

	int			state;
};

In sysfs, a new directory could be created to represent driver devices.
It might look like the following:

bus
|
\- pci
   |
   \- devices
      |
      \- link to device0
      \- link to device1
   \- drivers
      |
      \- link to random_drv (in other words random_drv can drive this bus)
device
|
\- device0 [...]
\- device1 [...]
driver (this directory is new)
|
\- random_drv
   |
   \- 0 (a sequential instance number) <-- this is a driver device
      |
      \- link to device0
      \- link to class0
      \- a file to control driver state (start, stop, etc.)
      \- driver attributes for this link
   \- 1
      |
      \- link to device1
      \- link to class1
      \- a file to control driver state
      \- driver attributes for this link
class
|
\- some_type
   |
   \- class0 [...]
   \- class1 [...]

This would allow us to represent per-device driver attributes in sysfs.
As an added benefit, driver devices would allow the tracking and control
of driver state, which may be needed for dynamic power management.

I look forward to any comments.

Thanks,
Adam
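To make the proposed layout a little more concrete, here is a userspace sketch of how a driver device instance might map onto a directory name. The struct is a reduced form of the one in the proposal (list node and kobject omitted), the kernel types are replaced by minimal stand-ins, and `driver_device_path` is a hypothetical helper; none of this is an existing kernel interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins for the kernel types named in the proposal. */
struct device_driver { const char *name; };
struct device        { const char *bus_id; };

/* Reduced form of the proposed structure (list node and kobject omitted). */
struct driver_device {
	unsigned long id;           /* sequential instance number */
	struct device_driver *drv;  /* driver this instance belongs to */
	struct device *dev;         /* bus device it is bound to */
	int state;
};

/*
 * Hypothetical helper: build the directory name a driver device would get
 * under the proposed top-level "driver" directory, e.g. "driver/random_drv/0".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int driver_device_path(const struct driver_device *dd, char *buf, size_t len)
{
	int n = snprintf(buf, len, "driver/%s/%lu", dd->drv->name, dd->id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The instance number rather than the bus id names the directory here, following the tree in the proposal; the bound bus device would then appear as a link inside that directory.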
Re: 2.6.11: USB broken on nforce4, ipv6 still broken, centrino speedstep even more broken than in 2.6.10
On Mon, 2005-03-21 at 19:32, Andrew Morton wrote:
> Adam Belay <[EMAIL PROTECTED]> wrote:
> >
> > On Fri, 2005-03-11 at 17:35 -0800, Andrew Morton wrote:
> > > Felix von Leitner <[EMAIL PROTECTED]> wrote:
> > > >
> > > > Finally Centrino SpeedStep.
> > > > I have a "Intel(R) Pentium(R) M processor 1.80GHz" in my notebook.
> > > > Linux does not support it.  This architecture has been out there for
> > > > months now, and there even was a patch to support it posted here a in
> > > > October last year or so.  Linux still does not include it.  Until
> > > > 2.6.11-rc4-bk8 or so, the old patched file from back then still worked.
> > > > Now it doesn't.  Because some interface changed.  Now what?  Using a
> > > > Centrino notebook without CPU throttling is completely out of the
> > > > question.  Linux might as well not boot on it at all.
> > >
> > > Could you please dig out the old patch, send it?
> >
> > Why not use ACPI for CPU scaling?
> >
>
> Felix, did you try this?
>

ACPI is the preferred (and only standardized) method of controlling cpu
throttling on x86 systems.  Also, as I said earlier, I wanted to see an
lspci for the usb issues.

Thanks,
Adam
Re: 2.6.11: USB broken on nforce4, ipv6 still broken, centrino speedstep even more broken than in 2.6.10
On Fri, 2005-03-11 at 17:35 -0800, Andrew Morton wrote:
> Felix von Leitner <[EMAIL PROTECTED]> wrote:
> >
> > Finally Centrino SpeedStep.
> > I have a "Intel(R) Pentium(R) M processor 1.80GHz" in my notebook.
> > Linux does not support it.  This architecture has been out there for
> > months now, and there even was a patch to support it posted here a in
> > October last year or so.  Linux still does not include it.  Until
> > 2.6.11-rc4-bk8 or so, the old patched file from back then still worked.
> > Now it doesn't.  Because some interface changed.  Now what?  Using a
> > Centrino notebook without CPU throttling is completely out of the
> > question.  Linux might as well not boot on it at all.
>
> Could you please dig out the old patch, send it?

Why not use ACPI for CPU scaling?

Thanks,
Adam
Re: 2.6.11: USB broken on nforce4, ipv6 still broken, centrino speedstep even more broken than in 2.6.10
On Fri, 2005-03-11 at 20:21 +, Felix von Leitner wrote:
> Linux is getting less and less usable for me. :-(
>
>
> My new nForce 4 mainboard has 10 or so USB 2.0 outlets.  In Windows,
> they all work.  In Linux, two of them work.  Putting my USB stick or
> anything else in one of the others produces nothing in Linux.
> Apparently no IRQ getting through or something?

Could you also include the output of lspci -vv?

Thanks,
Adam
Re: IBM Thinkpad G41 PCMCIA problems [Was: Yenta TI: ... no PCI interrupts. Fish. Please report.]
On Sun, 2005-02-20 at 09:23 -0800, Linus Torvalds wrote:
>
> On Sun, 20 Feb 2005, Russell King wrote:
> > On Sat, Feb 19, 2005 at 08:36:12PM -0500, Steven Rostedt wrote:
> > >  BIOS-e820:  - 0009f000 (usable)
> > >  BIOS-e820: 0009f000 - 000a (reserved)
> > >  BIOS-e820: 000d - 000d4000 (reserved)
> > >  BIOS-e820: 000dc000 - 0010 (reserved)
> > >  BIOS-e820: 0010 - 0f6f (usable)
> > >  BIOS-e820: 0f6f - 0f70 (reserved)
> > >  BIOS-e820: 0f70 - 3fef (usable)
> > >  BIOS-e820: 3fef - 3fef8000 (ACPI data)
> > >  BIOS-e820: 3fef8000 - 3fefa000 (ACPI NVS)
> > >  BIOS-e820: 3ff0 - 4000 (reserved)
> >
> > Your BIOS is broken.  You probably have 1GB of RAM which extends from
> > 0x to 0x4000.  However, there's a hole in the ACPI map
> > between 0x3fefa000 and 0x3ff0.
>
> Good point. And dammit, we've had that problem too many times before.

ACPI will report the ranges available to a PCI root bridge, even on
single root machines.  I'm hoping to take advantage of this in my PCI bus
changes.  It should help with these sorts of problems.

Thanks,
Adam
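The kind of check Russell describes, spotting a gap between neighbouring entries of a firmware memory map, can be sketched in a few lines of userspace C. The `mem_range` type and `find_first_hole` function are hypothetical names for this illustration and are not taken from the kernel's e820 code. The sketch only reports that a gap exists; whether a particular gap is a BIOS bug is a separate judgment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a memory map: covers [start, end), end exclusive. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Scan a map that is sorted by start address and report the first gap.
 * Returns 1 and fills *hole if two neighbouring entries do not touch,
 * otherwise returns 0 and leaves *hole untouched.
 */
int find_first_hole(const struct mem_range *map, size_t n,
		    struct mem_range *hole)
{
	size_t i;

	for (i = 1; i < n; i++) {
		if (map[i].start > map[i - 1].end) {
			hole->start = map[i - 1].end;
			hole->end = map[i].start;
			return 1;
		}
	}
	return 0;
}
```

Anything that later hands out address space, such as a PCI resource allocator, could land in such a gap if it treats "not listed" as "free", which is why a second source of range information is attractive.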
Re: IBM Thinkpad G41 PCMCIA problems [Was: Yenta TI: ... no PCI interrupts. Fish. Please report.]
On Sat, 2005-02-19 at 20:02 -0800, Linus Torvalds wrote:
>
> On Sat, 19 Feb 2005, Steven Rostedt wrote:
> >
> > On Sat, 2005-02-19 at 18:10 -0800, Linus Torvalds wrote:
> >
> > > I _think_ it's the code in arch/i386/pci/fixup.c that does this. See the
> > >
> > > 	static void __devinit pci_fixup_transparent_bridge(struct pci_dev *dev)
> > >
> > > thing, and try to disable it. Maybe that rule is wrong, and triggers much
> > > too often?
> > >
> >
> > Linus,
> >
> > Thank you very much! That was it.  The following patch made everything
> > look good.
>
> Ok. I've fired off an email to some Intel people asking what the
> real rules are wrt Intel PCI-PCI bridges. It may be that it's not that
> particular chip, but some generic rule (like "all Intel bridges act like
> they are subtractive decode _except_ if they actually have the IO
> start/stop ranges set" or something like that).
>
> If anybody on the list can figure the Intel bridge decoding rules out,
> please holler..
>
> 		Linus

Actually, I've run into a similar situation on my hardware.  After looking
into it for a while, I'm pretty sure it's actually a transparent bridge
(despite it not indicating such in the programming interface class code).
Have you heard anything more?

Thanks,
Adam
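For reference, a sketch of the class-code check being discussed. It assumes the usual PCI encoding, where a PCI-PCI bridge has base class 0x06 and subclass 0x04, and a programming interface of 0x01 marks subtractive decode. The macro names and `bridge_reports_subtractive` are hypothetical and written for this illustration; this is not the code in arch/i386/pci/fixup.c.

```c
#include <assert.h>
#include <stdint.h>

#define CLASS_BRIDGE_PCI    0x0604  /* base class 0x06, subclass 0x04 */
#define PROGIF_SUBTRACTIVE  0x01    /* programming interface byte */

/*
 * class24 packs base class, subclass and programming interface into the
 * low 24 bits, one byte each, most significant first.
 * Returns 1 if the device is a PCI-PCI bridge that advertises
 * subtractive decode, 0 otherwise.
 */
int bridge_reports_subtractive(uint32_t class24)
{
	if ((class24 >> 8) != CLASS_BRIDGE_PCI)
		return 0;
	return (class24 & 0xff) == PROGIF_SUBTRACTIVE;
}
```

A bridge like the one described in the reply would return 0 here even though it behaves transparently, which is the situation a fixup keyed on something other than the class code would have to cover.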
Re: [2.6 patch] drivers/pnp/: possible cleanups
> So in short, I'd rather not remove them, because they take away from the
> original design of the PnP layer.

s/they/it would
Re: [2.6 patch] drivers/pnp/: possible cleanups
On Fri, 2005-03-11 at 16:23 -0800, Andrew Morton wrote:
> Adam Belay <[EMAIL PROTECTED]> wrote:
> >
> > This patch essentially makes it impossible for PnP protocols to be
> > modules.  Currently, they are all in-kernel.  If that is acceptable...,
> > then this patch looks fine to me.  Any comments?
>
> You're the maintainer...

I've been holding off on making many changes to PnP at the moment, because
I have been considering replacing it with a new (more modern and ACPI
capable) ISA/LPC bridge driver.  This work would likely begin after my PCI
bridge driver rewrite is finished and merged (as the PCI work is in some
ways a prerequisite).

http://marc.theaimsgroup.com/?l=linux-kernel&m=111023821617705&w=2

Still, if there are changes to fix actual bugs, then I'm all for them.
Also a few features could be added.  Specifically PnPBIOS hotplug/docking
station support.  If anyone's interested, I may implement it (and it would
use some functions that were removed by this patch).  Furthermore, ISAPnP
could be made a module.  PnPBIOS probably couldn't.

>
> If someone converts a protocol to be modular, presumably they will re-add
> the needed exports to support that.

Correct.

>
> Are there likely to be any out-of-tree modular protocols in existence?
>

Not that I'm aware of.

So in short, I'd rather not remove them, because they take away from the
original design of the PnP layer.

Thanks,
Adam
Re: [2.6 patch] drivers/pnp/: possible cleanups
This patch essentially makes it impossible for PnP protocols to be
modules.  Currently, they are all in-kernel.  If that is acceptable...,
then this patch looks fine to me.  Any comments?

Thanks,
Adam

On Fri, 2005-03-11 at 19:16 +0100, Adrian Bunk wrote:
> This patch contains the following possible cleanups:
> - make needlessly global code static
> - #if 0 the following unused global function:
>   - core.c: pnp_remove_device
> - remove the following unneeded EXPORT_SYMBOL's:
>   - card.c: pnp_add_card
>   - card.c: pnp_remove_card
>   - card.c: pnp_add_card_device
>   - card.c: pnp_remove_card_device
>   - card.c: pnp_add_card_id
>   - core.c: pnp_register_protocol
>   - core.c: pnp_unregister_protocol
>   - core.c: pnp_add_device
>   - core.c: pnp_remove_device
>   - pnpacpi/core.c: pnpacpi_protocol
>   - driver.c: pnp_add_id
>   - isapnp/core.c: isapnp_read_byte
>   - manager.c: pnp_auto_config_dev
>   - resource.c: pnp_register_dependent_option
>   - resource.c: pnp_register_independent_option
>   - resource.c: pnp_register_irq_resource
>   - resource.c: pnp_register_dma_resource
>   - resource.c: pnp_register_port_resource
>   - resource.c: pnp_register_mem_resource
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
Re: [2.6 patch] drivers/pnp/: possible cleanups
This patch essential makes it impossible for PnP protocols to be modules. Currently, they are all in-kernel. If that is acceptable..., then this patch looks fine to me. Any comments? Thanks, Adam On Fri, 2005-03-11 at 19:16 +0100, Adrian Bunk wrote: This patch contains the following possible cleanups: - make needlessly global code static - #if 0 the following unused global function: - core.c: pnp_remove_device - remove the following unneeded EXPORT_SYMBOL's: - card.c: pnp_add_card - card.c: pnp_remove_card - card.c: pnp_add_card_device - card.c: pnp_remove_card_device - card.c: pnp_add_card_id - core.c: pnp_register_protocol - core.c: pnp_unregister_protocol - core.c: pnp_add_device - core.c: pnp_remove_device - pnpacpi/core.c: pnpacpi_protocol - driver.c: pnp_add_id - isapnp/core.c: isapnp_read_byte - manager.c: pnp_auto_config_dev - resource.c: pnp_register_dependent_option - resource.c: pnp_register_independent_option - resource.c: pnp_register_irq_resource - resource.c: pnp_register_dma_resource - resource.c: pnp_register_port_resource - resource.c: pnp_register_mem_resource Signed-off-by: Adrian Bunk [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] drivers/pnp/: possible cleanups
On Fri, 2005-03-11 at 16:23 -0800, Andrew Morton wrote: Adam Belay [EMAIL PROTECTED] wrote: This patch essential makes it impossible for PnP protocols to be modules. Currently, they are all in-kernel. If that is acceptable..., then this patch looks fine to me. Any comments? You're the maintainer... I've been holding off on making many changes to PnP at the moment, because I have been considering replacing it with a new (more modern and ACPI capable) ISA/LPC bridge driver. This work would likely begin after my PCI bridge driver rewrite is finished and merged (as the PCI work is in some ways a prerequisite). http://marc.theaimsgroup.com/?l=linux-kernelm=111023821617705w=2 Still, if there are changes to fix actual bugs, then I'm all for them. Also a few features could be added. Specifically PnPBIOS hotplug/docking station support. If anyone's interested, I may implement it (and it would use some functions that were removed by this patch). Furthermore, ISAPnP could be made a module. PnPBIOS probably couldn't. If someone converts a protocol to be moduar, presumably they will re-add the needed exports to support that. Correct. Are there likely to be any out-of-tree modular protocols in existence? Not that I'm aware of. So in short, I'd rather not remove them, because they take away from the original design of the PnP layer. Thanks, Adam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] drivers/pnp/: possible cleanups
> So in short, I'd rather not remove them, because they take away from the
> original design of the PnP layer.

s/they/it would
Re: IBM Thinkpad G41 PCMCIA problems [Was: Yenta TI: ... no PCI interrupts. Fish. Please report.]
On Sat, 2005-02-19 at 20:02 -0800, Linus Torvalds wrote:
> On Sat, 19 Feb 2005, Steven Rostedt wrote:
> > On Sat, 2005-02-19 at 18:10 -0800, Linus Torvalds wrote:
> > > I _think_ it's the code in arch/i386/pci/fixup.c that does this. See
> > > the
> > >
> > >     static void __devinit pci_fixup_transparent_bridge(struct pci_dev *dev)
> > >
> > > thing, and try to disable it. Maybe that rule is wrong, and triggers
> > > much too often?
> >
> > Linus,
> >
> > Thank you very much! That was it. The following patch made everything
> > look good.
>
> Ok. I've fired off an email to some Intel people asking what the real
> rules are wrt Intel PCI-PCI bridges. It may be that it's not that
> particular chip, but some generic rule (like all Intel bridges act like
> they are subtractive decode _except_ if they actually have the IO
> start/stop ranges set or something like that).
>
> If anybody on the list can figure the Intel bridge decoding rules out,
> please holler..
>
> Linus

Actually, I've run into a similar situation on my hardware. After looking
into it for a while, I'm pretty sure it's actually a transparent bridge
(despite it not indicating such in the programming interface class code).
Have you heard anything more?

Thanks,
Adam
Re: IBM Thinkpad G41 PCMCIA problems [Was: Yenta TI: ... no PCI interrupts. Fish. Please report.]
On Sun, 2005-02-20 at 09:23 -0800, Linus Torvalds wrote:
> On Sun, 20 Feb 2005, Russell King wrote:
> > On Sat, Feb 19, 2005 at 08:36:12PM -0500, Steven Rostedt wrote:
> > > BIOS-e820: 00000000 - 0009f000 (usable)
> > > BIOS-e820: 0009f000 - 000a0000 (reserved)
> > > BIOS-e820: 000d0000 - 000d4000 (reserved)
> > > BIOS-e820: 000dc000 - 00100000 (reserved)
> > > BIOS-e820: 00100000 - 0f6f0000 (usable)
> > > BIOS-e820: 0f6f0000 - 0f700000 (reserved)
> > > BIOS-e820: 0f700000 - 3fef0000 (usable)
> > > BIOS-e820: 3fef0000 - 3fef8000 (ACPI data)
> > > BIOS-e820: 3fef8000 - 3fefa000 (ACPI NVS)
> > > BIOS-e820: 3ff00000 - 40000000 (reserved)
> >
> > Your BIOS is broken. You probably have 1GB of RAM which extends from
> > 0x00000000 to 0x40000000. However, there's a hole in the ACPI map
> > between 0x3fefa000 and 0x3ff00000.
>
> Good point. And dammit, we've had that problem too many times before.

ACPI will report the ranges available to a PCI root bridge, even on single
root machines. I'm hoping to take advantage of this in my PCI bus changes.
It should help with these sorts of problems.

Thanks,
Adam
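
The hole Russell describes can be found mechanically by walking the sorted
e820 entries and comparing each start with the previous end. The following is
only a sketch, not kernel code: the struct, the array names and the
find_holes() helper are hypothetical, and the table assumes the addresses
quoted in the message expand to the 32-bit values implied by the "1GB of RAM"
remark.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type; 'end' is exclusive, as in the e820 printout. */
struct mem_range {
    uint32_t start;
    uint32_t end;
};

/* Assumed expansion of the map quoted in the message (sorted by start). */
static const struct mem_range sketch_map[] = {
    { 0x00000000u, 0x0009f000u },  /* usable */
    { 0x0009f000u, 0x000a0000u },  /* reserved */
    { 0x000d0000u, 0x000d4000u },  /* reserved */
    { 0x000dc000u, 0x00100000u },  /* reserved */
    { 0x00100000u, 0x0f6f0000u },  /* usable */
    { 0x0f6f0000u, 0x0f700000u },  /* reserved */
    { 0x0f700000u, 0x3fef0000u },  /* usable */
    { 0x3fef0000u, 0x3fef8000u },  /* ACPI data */
    { 0x3fef8000u, 0x3fefa000u },  /* ACPI NVS */
    { 0x3ff00000u, 0x40000000u },  /* reserved */
};

/* Scratch space for the holes that are found. */
static struct mem_range sketch_holes[8];

/*
 * Record every gap between consecutive entries of a sorted map.
 * Returns the number of holes written to 'out' (at most 'max').
 */
static size_t find_holes(const struct mem_range *map, size_t n,
                         struct mem_range *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 1; i < n; i++) {
        if (map[i].start > map[i - 1].end && found < max) {
            out[found].start = map[i - 1].end;
            out[found].end = map[i].start;
            found++;
        }
    }
    return found;
}
```

Under those assumptions the walk reports three gaps. The two below 1MB are
the usual legacy video and option ROM areas; the last one, from 0x3fefa000 to
0x3ff00000, is the unreported region near the top of RAM that the message
calls out.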
Re: 2.6.11: USB broken on nforce4, ipv6 still broken, centrino speedstep even more broken than in 2.6.10
On Fri, 2005-03-11 at 20:21 +0000, Felix von Leitner wrote:
> Linux is getting less and less usable for me. :-(
>
> My new nForce 4 mainboard has 10 or so USB 2.0 outlets. In Windows, they
> all work. In Linux, two of them work. Putting my USB stick or anything
> else in one of the others produces nothing in Linux. Apparently no IRQ
> getting through or something?

Could you also include the output of lspci -vv?

Thanks,
Adam
Re: 2.6.11: USB broken on nforce4, ipv6 still broken, centrino speedstep even more broken than in 2.6.10
On Fri, 2005-03-11 at 17:35 -0800, Andrew Morton wrote:
> Felix von Leitner <[EMAIL PROTECTED]> wrote:
> > Finally Centrino SpeedStep. I have an Intel(R) Pentium(R) M processor
> > 1.80GHz in my notebook. Linux does not support it. This architecture
> > has been out there for months now, and there even was a patch to
> > support it posted here in October last year or so. Linux still does
> > not include it. Until 2.6.11-rc4-bk8 or so, the old patched file from
> > back then still worked. Now it doesn't. Because some interface
> > changed. Now what? Using a Centrino notebook without CPU throttling is
> > completely out of the question. Linux might as well not boot on it at
> > all.
>
> Could you please dig out the old patch, send it?

Why not use ACPI for CPU scaling?

Thanks,
Adam
Re: [RFC][PATCH] PCI bridge driver rewrite (rev 02)
On Mon, 2005-03-07 at 18:03 -0500, Jon Smirl wrote:
> What about a bridge driver for ISA LPC bridges? That would also
> provide a logical place to hang serial ports, floppy, parallel port,
> ps2 port, etc. Things in /sys/bus/platform are really attached to the
> LPC bridge.

I agree that /sys/bus/platform isn't really the right place. It doesn't
show the correct parent device, among other issues. I've done some work on
this issue in the past, and it turns out to be a very complicated problem.
Basically there are many protocols that feed into the pool of devices known
as *legacy hardware*. They include the following:

1.) ACPI
  * provides resource information
  * provides identification information
  * the only protocol that accurately describes topology
  * often provides power management features
  * sometimes a few device specific methods (e.g. floppy drives)

2.) PnPBIOS
  * outdated by ACPI, generally useful for x86 boxes before 2000
  * provides resource information
  * provides identification information, and a class code not found in ACPI
  * all devices reported are considered root devices, no sense of true
    topology
  * In theory could handle some hotplugging of these devices (e.g. docking
    stations)

3.) ISAPnP
  * Even more outdated.
  * Provides resource information.
  * provides identification, including a card id not found in PnPBIOS or
    ACPI (obviously).
  * Only used for ISA expansion cards

4.) SuperIO drivers
  * In theory it is possible to determine configuration information from
    the SuperIO directly.
  * Some, but very limited, work has been done in this area.
  * ACPI generally handles this because there is little standardization at
    this level.

5.) Legacy Probing
  * Driver attempts to find the hardware directly by reading various ports.
  * Can be dangerous.
  * Drivers of this type encourage vendors to include legacy compatibility
    (which in the long run holds us back).
  * Very difficult to integrate with the driver model.

6.) Open Firmware
  * I don't know much about it, but I believe it does do similar things to
    ACPI.
  * Hopefully it uses EISA ids, but not really sure. If not, it wouldn't be
    included.

So basically we have to handle all (or most) of these. The question becomes
should driver developers have to write code for all 6 of these interfaces
(which seems a little overwhelming), or should they share a common layer.
If so, the driver model would need a way to represent this.

One idea I had was to make "buses" a special type of "class". And then
allow classes to be layered. So it would look something like

ISA/LPC
  ->ACPI
  ->PnPBIOS
  ->ISAPnP
  ->SuperIO
  ->legacy
  ->Open Firmware

Where each of the 6 classes inherit characteristics from "ISA/LPC". A
driver could then choose to bind to the more general "ISA/LPC" interfaces,
or if necessary a more specific interface like "ACPI". "ISA/LPC" would be
sort of a least common denominator.

Of course this would require big changes to the driver model, so it would
have to be really worth it. I look forward to any comments or suggestions
for alternative approaches.

Thanks,
Adam
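
To make the layering idea a little more concrete, here is a minimal sketch of
how a class could point at a more general parent class, so that a driver
asking for "ISA/LPC" also accepts a device enumerated through ACPI or PnPBIOS.
None of this is existing driver model code: struct dev_class, the three
example objects and class_is_a() are hypothetical names made up for the
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: a device class that may inherit from a more general one. */
struct dev_class {
    const char *name;
    const struct dev_class *parent;  /* NULL for the most general class */
};

/* "ISA/LPC" is the least common denominator; the others refine it. */
static const struct dev_class isa_lpc_class = { "ISA/LPC", NULL };
static const struct dev_class acpi_class    = { "ACPI", &isa_lpc_class };
static const struct dev_class pnpbios_class = { "PnPBIOS", &isa_lpc_class };

/*
 * Return 1 if 'cls' is the wanted class or inherits from it, 0 otherwise.
 * A driver bound to "ISA/LPC" would accept any of the layered classes,
 * while a driver bound to "ACPI" would accept only ACPI devices.
 */
static int class_is_a(const struct dev_class *cls, const char *wanted)
{
    for (; cls != NULL; cls = cls->parent) {
        if (strcmp(cls->name, wanted) == 0)
            return 1;
    }
    return 0;
}
```

With this shape, a generic serial driver could ask class_is_a(dev_class,
"ISA/LPC") and work with every enumeration method, and a driver that needs
ACPI-specific methods could ask for "ACPI" instead. A real design would
presumably match on something sturdier than strings.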
Re: [RFC][PATCH] PCI bridge driver rewrite (rev 02)
On Mon, 2005-03-07 at 15:43 -0800, Jesse Barnes wrote:
> On Monday, March 7, 2005 3:39 pm, Jon Smirl wrote:
> > How is sys/bus/platform/* going to work for IA64 machines like SGI SVN?
> > SVN supports multiple simultaneously active legacy spaces, that means
> > that there can be multiple floppy, serial, ps/2, etc controllers.
> > Should these devices be hung off from the bridge they are on?
>
> Probably, though no one in their right mind is going to put anything like
> that on these machines (sn2 btw) :). VGA cards will hopefully be the only
> devices of this type that we'll have installed.
>
> Jesse

Well, if the system supports ACPI, then in theory it could have any
number/configuration of legacy devices, and we'd know everything about them
including exactly where to put them in the device tree. However, I agree
that legacy hardware will be less common in this architecture.

Adam
Re: Kernel hangs on PCI config register access ???
On Fri, Feb 18, 2005 at 08:49:58AM +0100, Matthias Urlichs wrote:
> Hi,
>
> we have a bunch of systems which semi-reproducibly (chance of 1:1000) hang
> when a PCMCIA card is removed from its PCI->PCMCIA interface via "cardctl
> eject". Right *here*, in fact:
>
> static int pci_conf1_read (int seg, int bus, int devfn, int reg, int len,
>                            u32 *value) {
> [...]
>     case 2:
>         debug("you see me \n");
>         *value = inw(0xCFC + (reg & 2));
>         debug("but you don't get here \n");
>         break;
> [...]
>
> Does anybody have *any* idea what could possibly be the cause of this?
> Using pci=bios still hangs; pci=conf2 doesn't work.
>
> FWIW, the call sequence is:
>
> shutdown_socket
> yenta_sock_init
> yenta_clear_maps
> yenta_set_socket
> pci_bus_read_config_word
> pci_conf1_read
>
> The systems in question are wildly different (VIA vs. Intel CPUs, standard
> mainboard vs. PCI backplane, Ricoh vs. ENE cardbus bridges), so I'm
> inclined to rule out hardware problems. The NMI monitor doesn't trigger
> (yes I tested it), kgdb is unresponsive -- the system hangs hard at that
> point, as far as I can determine.
>
> Kernel: tested with various 2.6.1? plus -rc* and/or -mm*, no change.

Is this still an issue with recent kernels? Where in the PCI configuration
space is it reading? In other words, could you show me the line that calls
pci_bus_read_config_word?

Thanks,
Adam
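
For readers following the quoted snippet: with PCI configuration mechanism #1
the kernel first writes an address word to port 0xCF8 and then reads the data
from 0xCFC plus a byte offset, which is where the "0xCFC + (reg & 2)" in the
hanging inw() comes from. The sketch below only computes those two values and
touches no hardware. The helper names are made up for the illustration; the
bit layout is the standard mechanism #1 encoding.

```c
#include <stdint.h>

#define PCI_CONF1_ADDR_PORT 0xCF8u  /* address register */
#define PCI_CONF1_DATA_PORT 0xCFCu  /* data window (4 bytes wide) */

/*
 * Value written to 0xCF8 before a config access: enable bit (31),
 * bus number (23:16), device/function (15:8) and the dword-aligned
 * register offset (7:2).
 */
static uint32_t pci_conf1_address(unsigned int bus, unsigned int devfn,
                                  unsigned int reg)
{
    return 0x80000000u | ((bus & 0xffu) << 16) | ((devfn & 0xffu) << 8) |
           (reg & 0xfcu);
}

/*
 * I/O port used for a 16-bit read: bit 1 of the register offset selects
 * the low or high half of the dword behind 0xCFC.
 */
static unsigned int pci_conf1_word_port(unsigned int reg)
{
    return PCI_CONF1_DATA_PORT + (reg & 2u);
}
```

So a word read of register 0x3e on bus 1, devfn 0x18 would write 0x8001183c
to 0xCF8 and then read from port 0xCFE. Knowing which register offset
yenta_set_socket is reading tells us which of these values was in flight when
the machine hung, which is why the calling line matters.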
Re: [RFC] PCI bridge driver rewrite
On Mon, 2005-02-28 at 15:38 -0800, Jesse Barnes wrote:
> On Monday, February 28, 2005 3:27 pm, Adam Belay wrote:
> > How can we specify which bus to target?
>
> Maybe we could have a list of legacy (ISA?) devices for drivers like
> vgacon to attach to? The bus info could be stuffed into the legacy device
> structure itself so that the platform code would know what to do.

Are these devices actually legacy, or PCI with compatibility interfaces? I
think a "struct isa_device" would be useful. Would a pointer to the
"struct pci_bus" do the trick?

> > Also is the legacy IO space mapped to IO Memory on the other side of the
> > bridge?
>
> How do you mean? Legacy I/O port accesses just become strongly ordered
> memory transactions, afaik, and legacy memory accesses are dealt with the
> same way.
>
> Jesse

I was just wondering if we have to reserve a memory range for this?

Thanks,
Adam
Re: [RFC][PATCH] add driver matching priorities
On Fri, 2005-02-25 at 15:41 -0800, Greg KH wrote:
> On Thu, Feb 10, 2005 at 04:37:03PM -0500, Adam Belay wrote:
> > On Thu, 2005-02-10 at 18:45 +0000, Russell King wrote:
> > > On Thu, Feb 10, 2005 at 12:18:37PM -0500, Adam Belay wrote:
> > > > > I think the issue that Al raises about drivers grabbing devices, and
> > > > > then trying to unbind them might be a real problem.
> > > >
> > > > I agree. Do you think registering every in-kernel driver before probing
> > > > hardware would solve this problem?
> > >
> > > In which case, consider whether we should be tainting the kernel if
> > > someone loads a device driver, it binds to a device, and then they
> > > unload that driver.
> > >
> > > It's precisely the same situation, and precisely the same mechanics
> > > as what I've suggested should be going on here. If one scenario is
> > > inherently buggy, so is the other.
> >
> > I think it would depend on whether the user makes the device busy before
> > the driver is unloaded. Different device classes may have different
> > requirements for when and how a device can be removed. Are there other
> > issues as well? Maybe there are ways to improve driver start and stop
> > mechanics.
>
> We never fail a device unbind from a driver, so this isn't as big a deal
> as I originally thought. Yes, userspace can get messy, but as userspace
> was the one that loaded the new driver to bind, it's acceptable.
>
> So, care to resubmit your patch?

Would you like me to include the portion that adds "*match" to "struct
device_driver"? After some more thought, I began considering having driver
priority be a static quality of a device driver. The question is whether we
want a device driver to be able to return a variable priority based on bind
device. Also, "*match" could be used to split some detection and validation
out of "*probe". What are your reactions to this?

Finally, should every in-kernel driver be registered before devices are
detected?

Thanks,
Adam
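
As an illustration of the "variable priority" option discussed above, the
sketch below lets each driver's match callback return 0 for "no match" or a
positive priority that may depend on the device, and binds the device to the
highest bidder. This is not the posted patch: the structures, the example
drivers and pick_driver() are hypothetical and heavily simplified.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real kernel structures. */
struct sketch_device {
    unsigned int vendor;
    unsigned int class_code;
};

struct sketch_driver {
    const char *name;
    /* Returns 0 for "no match", otherwise a positive priority. */
    int (*match)(const struct sketch_device *dev);
};

/* Generic driver: accepts any display-class device at low priority. */
static int generic_vga_match(const struct sketch_device *dev)
{
    return dev->class_code == 0x0300 ? 1 : 0;
}

/* Vendor driver: accepts only its own hardware, at high priority. */
static int vendor_vga_match(const struct sketch_device *dev)
{
    return (dev->class_code == 0x0300 && dev->vendor == 0x1234) ? 10 : 0;
}

static const struct sketch_driver sketch_drivers[] = {
    { "generic-vga", generic_vga_match },
    { "vendor-vga",  vendor_vga_match  },
};

/* Example devices (made-up IDs). */
static const struct sketch_device vendor_card = { 0x1234, 0x0300 };
static const struct sketch_device other_card  = { 0x5678, 0x0300 };
static const struct sketch_device net_card    = { 0x5678, 0x0200 };

/* Return the registered driver reporting the highest priority, or NULL. */
static const struct sketch_driver *
pick_driver(const struct sketch_driver *drivers, size_t count,
            const struct sketch_device *dev)
{
    const struct sketch_driver *best = NULL;
    int best_prio = 0;

    for (size_t i = 0; i < count; i++) {
        int prio = drivers[i].match(dev);

        if (prio > best_prio) {
            best = &drivers[i];
            best_prio = prio;
        }
    }
    return best;
}
```

The selection is only meaningful if every candidate driver is already in the
list when the device is examined, which is the reason for the closing
question about registering in-kernel drivers before detection. A static
priority would replace the return value of match with a constant field in
the driver structure.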
Re: [RFC] PCI bridge driver rewrite
On Fri, 2005-02-25 at 15:38 -0800, Greg KH wrote:
> On Thu, Feb 24, 2005 at 01:22:01AM -0500, Adam Belay wrote:
> > I look forward to any comments or suggestions.
>
> I like it all :)
>
> If you want to submit patches now that rearrange the code to make it
> easier for you to modify in the future to achieve the above goals, feel
> free, I'll gladly take them.
>
> thanks,
>
> greg k-h

I'm going to do an updated release soon. It should take care of some of the
issues on the TODO list and also will be based on previous feedback. From
there, I'll start planning a strategy for merging with mainline. I
appreciate the comments.

Thanks,
Adam
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 10:03 +0000, Russell King wrote:
> On Thu, Feb 24, 2005 at 01:22:01AM -0500, Adam Belay wrote:
> > 5.) write a bridge driver for Cardbus hardware
>
> We have this already - it's called "yenta".

Yes, I'm aware. It should read:

5.) adapt the Yenta driver to the new PCI bus class

:)

> What you need to be aware of is that cardbus hardware is special - it
> may change its resource requirements at any time, both in terms of the
> number of BUS IDs it wishes to consume, and the number and size of
> IO and memory resources.

We can have default sizes allocated for these windows. Maybe we'll even
have rebalancing at some point. As for BUS IDs, I'm not sure about the best
behavior. I don't really like reserving 4 positions like we do now. It has
a tendency to create conflicts, and seems to be unnecessary. How common are
PCI bridge devices that attach to cardbus controllers? Does the BIOS ever
preconfigure the cardbus bridge for this situation?

I think it's important that we get bus numbering correct. Some hardware has
problems now.

> Note also that if a cardbus bridge isn't on the root bus (it happens on
> some laptops) these resource changes may impact on upstream bridges and
> devices.

Yeah, also legacy resources can't pass through properly if the parent
bridge isn't transparent. Complex bus topologies make the problem much more
difficult when legacy hardware is involved.

Thanks,
Adam
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 02:25 -0500, Jon Smirl wrote:
> When you start writing the PCI root bridge driver you'll run into the
> AGP drivers that are already attached to the bridge. I was surprised
> by this since I expected AGP to be attached to the AGP bridge but now
> I learned that it is a root bridge function.

I'm going to have the PCI root bridge driver bind to a device on the
primary side of the bridge. The device could be enumerated by ACPI or
created manually when the bridge is detected. It will not, however, be a
PCI device.

> An ISA LPC bridge driver would be nice too. It would let you turn off
> serial ports, etc and let other systems know how many ports there are.
> No real need for this, just a nice toy.

I think this would make a lot of sense. ACPI could be used to enumerate
child devices for this bridge. I'd like to begin work on a generic ISA bus
driver soon.

> Does this work to cause a probe based on PCI class?
> static struct pci_device_id p2p_id_tbl[] = {
>    { PCI_DEVICE_CLASS(PCI_CLASS_BRIDGE_PCI << 8, 0x00) },
>    { 0 },
> };

Yes, the macro is used when matching against only a class of device.

> I would like to install a driver that gets called whenever new
> CLASS_VGA hardware shows up via hotplug. It won't attach to the
> device, it will just add some sysfs attributes. The framebuffer
> drivers need to attach the device. If I add attributes this way how
> can I remove them?

It would be possible, but probably not a clean solution. Ideally we want
one driver to bind to the graphics controller and remain bound. It will
then create class devices for each graphics subsystem, such as framebuffer.
Much work remains to be done before this can happen.

Thanks,
Adam
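
A note on the class table quoted above. As far as I understand the PCI core's
ID matching, a class entry matches when the device class and the table class
agree in every bit selected by the class mask. The function below is a
stand-alone sketch of that comparison, not the kernel's own code, and
class_matches() is a made-up name. The value 0x060400 is
PCI_CLASS_BRIDGE_PCI (0x0604) shifted left by 8, as in the quoted table.

```c
#include <stdint.h>

/*
 * Sketch of class-based ID matching: compare the 24-bit device class
 * (base class, subclass, programming interface) with the table entry,
 * looking only at the bits set in the mask.
 */
static int class_matches(uint32_t dev_class, uint32_t id_class,
                         uint32_t id_mask)
{
    return ((dev_class ^ id_class) & id_mask) == 0;
}
```

If that reading is right, a mask of 0xffff00 matches PCI-to-PCI bridges
regardless of programming interface (so both 0x060400 and the subtractive
decode variant 0x060401), while the 0x00 mask in the quoted table would
compare no bits at all and so accept any device. That may be worth
double-checking before relying on the table as written.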
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 15:02 -0800, Jesse Barnes wrote:
> On Wednesday, February 23, 2005 11:03 pm, Adam Belay wrote:
> > > Jesse can comment on the specific support needed for multiple legacy IO
> > > spaces.
> >
> > That would be great. Most of my experience has been with only a couple
> > legacy IO port ranges passing through the bridge.
>
> Well, I'll give you one, somewhat perverse, example. On SGI sn2 machines,
> each host<->pci bridge (either xio<->pci or numalink<->pci) has two pci
> busses and some additional host bus ports. The bridges are capable of
> generating low address bus cycles on both busses simultaneously, so we can
> do ISA memory access and legacy port I/O on every bus in the system at the
> same time.
>
> The main host chipset has no notion of VGA or legacy routing though, so
> doing a port access to say 0x3c8 is ambiguous--we need a bus to target
> (though the platform code could provide a 'default' bus for such accesses
> to go to, this may be what VGA or legacy routing means for us under your
> scheme). Likewise, accessing ISA memory space like 0xa0000 needs a bus to
> target.
>
> It would be nice if this sort of thing was taken into account in your new
> model, so that for example we could have the vgacon driver talking to
> multiple different VGA cards at the same time.
>
> Thanks,
> Jesse

How can we specify which bus to target? Also is the legacy IO space mapped
to IO Memory on the other side of the bridge?

Thanks,
Adam
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 15:02 -0800, Jesse Barnes wrote: On Wednesday, February 23, 2005 11:03 pm, Adam Belay wrote: Jesse can comment on the specific support needed for multiple legacy IO spaces. That would be great. Most of my experience has been with only a couple legacy IO port ranges passing through the bridge. Well, I'll give you one, somewhat perverse, example. On SGI sn2 machines, each host-pci bridge (either xio-pci or numalink-pci) has two pci busses and some additional host bus ports. The bridges are capable of generating low address bus cycles on both busses simultaneously, so we can do ISA memory access and legacy port I/O on every bus in the system at the same time. The main host chipset has no notion of VGA or legacy routing though, so doing a port access to say 0x3c8 is ambiguous--we need a bus to target (though the platform code could provide a 'default' bus for such accesses to go to, this may be what VGA or legacy routing means for us under your scheme). Likewise, accessing ISA memory space like 0xa needs a bus to target. It would be nice if this sort of thing was taken into account in your new model, so that for example we could have the vgacon driver talking to multiple different VGA cards at the same time. Thanks, Jesse How can we specify which bus to target? Also is the legacy IO space mapped to IO Memory on the other side of the bridge? Thanks, Adam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 02:25 -0500, Jon Smirl wrote: When you start writing the PCI root bridge driver you'll run into the AGP drivers that are already attached to the bridge. I was surprised by this since I expected AGP to be attached to the AGP bridge but now I learned that it is a root bridge function. I'm going to have the PCI root bridge driver bind to a device on the primary side of the bridge. The device could be enumerated by ACPI or created manually when the bridge is detected. It will not, however, be a PCI device. An ISA LPC bridge driver would be nice too. It would let you turn off serial ports, etc and let other systems know how many ports there are. No real need for this, just a nice toy. I think this would make a lot of sense. ACPI could be used to enumerate child devices for this bridge. I'd like to begin work on a generic ISA bus driver soon. Does this work to cause a probe based on PCI class? static struct pci_device_id p2p_id_tbl[] = { { PCI_DEVICE_CLASS(PCI_CLASS_BRIDGE_PCI 8, 0x00) }, { 0 }, }; Yes, the macro is used when matching against only a class of device. I would like to install a driver that gets called whenever new CLASS_VGA hardware shows up via hotplug. It won't attach to the device, it will just add some sysfs attributes. The framebuffer drivers need to attach the device. If I add attributes this way how can I remove them? It would be possible, but probably not a clean solution. Ideally we want one driver to bind to the graphics controller and remain bound. It will then create class devices for each graphics subsystem, such as framebuffer. Much work remains to be done before this can happen. Thanks, Adam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 10:03 +, Russell King wrote: On Thu, Feb 24, 2005 at 01:22:01AM -0500, Adam Belay wrote: 5.) write a bridge driver for Cardbus hardware We have this already - it's called yenta. Yes, I'm aware. It should read: 5.) adapt the Yenta driver to the new PCI bus class :) What you need to be aware of is that cardbus hardware is special - it may change its resource requirements at any time, both in terms of the number of BUS IDs it wishes to consume, and the number and size of IO and memory resources. We can have default sizes allocated for these windows. Maybe, we'll even have rebalancing at some point. As for BUS IDs, I'm not sure about the best behavior. I don't really like reserving 4 positions like we do now. It has a tendency to create conflicts, and seems to be unnecessary. How common are PCI bridge devices that attach to cardbus controllers? Does the BIOS ever preconfigure the cardbus bridge for this situation? I think it's important that we get bus numbering correct. Some hardware has problems now. Note also that if a cardbus bridge isn't on the root bus (it happens on some laptops) these resource changes may impact on upstream bridges and devices. Yeah, also legacy resources can't pass through properly if the parent bridge isn't transparent. Complex bus topologies make the problem much more difficult when legacy hardware is involved. Thanks, Adam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] PCI bridge driver rewrite
On Fri, 2005-02-25 at 15:38 -0800, Greg KH wrote:
> On Thu, Feb 24, 2005 at 01:22:01AM -0500, Adam Belay wrote:
> > I look forward to any comments or suggestions.
>
> I like it all :)
>
> If you want to submit patches now that rearrange the code to make it
> easier for you to modify in the future to achieve the above goals,
> feel free, I'll gladly take them.
>
> thanks,
>
> greg k-h

I'm going to do an updated release soon. It should take care of some of
the issues on the TODO list and also will be based on previous
feedback. From there, I'll start planning a strategy for merging with
mainline. I appreciate the comments.

Thanks,
Adam
Re: [RFC][PATCH] add driver matching priorities
On Fri, 2005-02-25 at 15:41 -0800, Greg KH wrote:
> On Thu, Feb 10, 2005 at 04:37:03PM -0500, Adam Belay wrote:
> > On Thu, 2005-02-10 at 18:45 +0000, Russell King wrote:
> > > On Thu, Feb 10, 2005 at 12:18:37PM -0500, Adam Belay wrote:
> > > > > I think the issue that Al raises about drivers grabbing
> > > > > devices, and then trying to unbind them might be a real
> > > > > problem.
> > > >
> > > > I agree. Do you think registering every in-kernel driver before
> > > > probing hardware would solve this problem?
> > >
> > > In which case, consider whether we should be tainting the kernel
> > > if someone loads a device driver, it binds to a device, and then
> > > they unload that driver.
> > >
> > > It's precisely the same situation, and precisely the same
> > > mechanics as what I've suggested should be going on here. If one
> > > scenario is inherently buggy, so is the other.
> >
> > I think it would depend on whether the user makes the device busy
> > before the driver is unloaded. Different device classes may have
> > different requirements for when and how a device can be removed.
> > Are there other issues as well? Maybe there are ways to improve
> > driver start and stop mechanics.
>
> We never fail a device unbind from a driver, so this isn't as big a
> deal as I originally thought. Yes, userspace can get messy, but as
> userspace was the one that loaded the new driver to bind, it's
> acceptable.
>
> So, care to resubmit your patch?

Would you like me to include the portion that adds *match to struct
device_driver? After some more thought, I began considering having
driver priority be a static quality of a device driver. The question is
whether we want a device driver to be able to return a variable
priority based on bind device. Also, *match could be used to split some
detection and validation out of *probe. What are your reactions to
this?

Finally, should every in-kernel driver be registered before devices are
detected?

Thanks,
Adam
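The "variable priority" option raised in this message can be illustrated with a small userspace model. This is only a sketch of the idea, not the patch under discussion: `struct device`, `struct driver`, the `match` callback signature, the priority values and the vendor ID 0x1234 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal device description. */
struct device {
	unsigned int class_code;
	unsigned int vendor;
};

/* Hypothetical driver with a match callback that returns a priority. */
struct driver {
	const char *name;
	/* 0 means "does not match"; larger values mean a better match. */
	int (*match)(const struct device *dev);
};

/* A generic driver claims any PCI-to-PCI bridge with a low priority. */
static int generic_match(const struct device *dev)
{
	return dev->class_code == 0x0604 ? 1 : 0;
}

/* A vendor-specific driver claims the same class with a higher priority. */
static int vendor_match(const struct device *dev)
{
	return (dev->class_code == 0x0604 && dev->vendor == 0x1234) ? 10 : 0;
}

static const struct driver generic_bridge = { "generic-bridge", generic_match };
static const struct driver vendor_bridge = { "vendor-bridge", vendor_match };
static const struct driver *const all_drivers[] = { &generic_bridge, &vendor_bridge };

/* Ask every registered driver and keep the one with the highest priority. */
static const struct driver *pick_driver(const struct driver *const *drivers,
					size_t count, const struct device *dev)
{
	const struct driver *best = NULL;
	int best_prio = 0;
	size_t i;

	assert(drivers != NULL && dev != NULL);
	for (i = 0; i < count; i++) {
		int prio = drivers[i]->match(dev);

		if (prio > best_prio) {
			best_prio = prio;
			best = drivers[i];
		}
	}
	return best;
}
```

A selection like this only gives a stable answer if every candidate driver is already registered when the device is examined, which is the concern behind the closing question about registering in-kernel drivers before devices are detected.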
Re: [RFC] PCI bridge driver rewrite
On Mon, 2005-02-28 at 15:38 -0800, Jesse Barnes wrote: On Monday, February 28, 2005 3:27 pm, Adam Belay wrote: How can we specify which bus to target? Maybe we could have a list of legacy (ISA?) devices for drivers like vgacon to attach to? The bus info could be stuffed into the legacy device structure itself so that the platform code would know what to do. Are these devices actually legacy, or PCI with compatibility interfaces? I think a struct isa_device would be useful. Would a pointer to the struct pci_bus do the trick? Also is the legacy IO space mapped to IO Memory on the other side of the bridge? How do you mean? Legacy I/O port accesses just become strongly ordered memory transactions, afaik, and legacy memory accesses are dealt with the same way. Jesse I was just wondering if we have to reserve a memory range for this? Thanks, Adam
Re: [RFC] PCI bridge driver rewrite
On Thu, 2005-02-24 at 01:45 -0500, Jon Smirl wrote:
> On Thu, 24 Feb 2005 01:22:01 -0500, Adam Belay <[EMAIL PROTECTED]> wrote:
> > For the past couple weeks I have been reorganizing the PCI subsystem to
> > better utilize the driver model. Specifically, the bus detection code
> > is now using a standard PCI driver. It turns out to be a major
>
> What about VGA routing? Most PCI buses do it with the normal VGA bit
> but big hardware supports multiple legacy IO spaces via the bridge
> chips.
>
> Are you going to make sysfs entries for the bridges? If so I'd like a
> VGA attribute that directly reads the VGA bit from the hardware and
> display it instead of using the shadow copy.

Yeah, actually I've been thinking about this issue a lot. I think it
would make a lot of sense to export this sort of thing under the
"pci_bus" class in sysfs. The ISA enable bit should probably also be
exported. Furthermore, we should be verifying the BIOS's configuration
of VGA and ISA. I'll try to integrate this in my future releases. I
appreciate the code.

I also have a number of resource management plans for the VGA enable
bit that I'll get into in my next set of patches.

>
> Jesse can comment on the specific support needed for multiple legacy IO
> spaces.
>

That would be great. Most of my experience has been with only a couple
legacy IO port ranges passing through the bridge.

Thanks,
Adam
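The VGA and ISA enable bits mentioned above live in the bridge control register of a PCI-to-PCI bridge. The following userspace sketch decodes them from a raw register value and formats the VGA bit the way a one-value sysfs attribute might. The bit positions (ISA enable at bit 2, VGA enable at bit 3) follow the PCI-to-PCI bridge specification; the names `BRIDGE_CTL_ISA`, `BRIDGE_CTL_VGA`, `decode_bridge_ctl()` and `show_vga_enable()` are hypothetical, and a real attribute would read the register from the hardware rather than take it as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bridge control register bits (local names chosen for this sketch). */
#define BRIDGE_CTL_ISA 0x04 /* ISA enable, bit 2 */
#define BRIDGE_CTL_VGA 0x08 /* VGA enable, bit 3 */

struct legacy_routing {
	bool isa_enable;
	bool vga_enable;
};

/* Decode the legacy routing bits from a raw bridge control value. */
static struct legacy_routing decode_bridge_ctl(uint16_t bctl)
{
	struct legacy_routing r;

	r.isa_enable = (bctl & BRIDGE_CTL_ISA) != 0;
	r.vga_enable = (bctl & BRIDGE_CTL_VGA) != 0;
	return r;
}

/* Format the VGA enable bit as "0\n" or "1\n"; returns the length written. */
static int show_vga_enable(uint16_t bctl, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%d\n", (bctl & BRIDGE_CTL_VGA) ? 1 : 0);
}
```

Taking the register value as an argument keeps the decoding logic separate from the question of whether it comes from a fresh config-space read or from a cached copy such as the bridge_ctl field saved during probing.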
[RFC] PCI bridge driver rewrite
Hi all,

For the past couple weeks I have been reorganizing the PCI subsystem to
better utilize the driver model. Specifically, the bus detection code
is now using a standard PCI driver. It turns out to be a major
undertaking, as the PCI probing code is closely tied into a lot of
other PCI components, and is spread throughout various architecture
specific areas. I'm hoping that these changes will allow for a much
cleaner and more functional PCI implementation.

The basic flow of the new code is as follows:
1.) A standard "driver core" driver binds to a bridge device.
2.) When "*probe" is called it sets up the hardware and allocates a
    "struct pci_bus".
3.) The "struct pci_bus" is filled with information about the detected
    bridge.
4.) The driver then registers the "struct pci_bus" with the PCI Bus
    Class.
5.) The PCI Bus Class makes the bridge available to sysfs.
6.) It then detects hardware attached to the bridge.
7.) Each new PCI bridge device is registered with the driver model.
8.) All remaining PCI devices are registered with the driver model.

Steps 7 and 8 allow for better resource management.

I've attached an early version of my code. It has most of the new PCI
bus class registration code in place, and an early implementation of
the PCI-to-PCI bridge driver.

The following remains to be done:
1.) refine and cleanup the new PCI Bus API
2.) export the new API in "linux/pci.h", and cleanup any users of the
    old code.
3.) fix every PCI hotplug driver.
4.) write a bridge driver for the PCI root bridge
5.) write a bridge driver for Cardbus hardware
6.) refine device registration order
7.) redesign PCI bus number assignment and support bus renumbering
8.) redesign PCI resource management to be compatible with the new code
9.) testing on various architectures
10.) Write "*suspend" and "*resume" routines for PCI bridges. Any ideas
     on what needs to be done?
11.) fix "PCI_LEGACY" (I may have broke it, but it should be trivial)

I look forward to any comments or suggestions.

Thanks,
Adam

diffstat:
 Makefile      |    9
 bus-class.c   |  225 +++
 bus/Makefile  |    6
 bus/bus-p2p.c |  133 ++
 device.c      |  142 +++
 pci.h         |    4
 probe.c       |  546 --
 remove.c      |  126 -
 9 files changed, 598 insertions(+), 593 deletions(-)

Patch is against 2.6.11-RC3.

diff -urN linux/drivers/pci/bus/bus-p2p.c linux-pci/drivers/pci/bus/bus-p2p.c
--- linux/drivers/pci/bus/bus-p2p.c	1969-12-31 19:00:00.000000000 -0500
+++ linux-pci/drivers/pci/bus/bus-p2p.c	2005-02-24 00:19:05.000000000 -0500
@@ -0,0 +1,133 @@
+/*
+ * bus-p2p.c - a generic PCI bus driver for PCI<->PCI bridges
+ *
+ */
+
+#include <linux/pci.h>
+#include <linux/init.h>
+#include <linux/module.h>
+
+static struct pci_device_id p2p_id_tbl[] = {
+	{ PCI_DEVICE_CLASS(PCI_CLASS_BRIDGE_PCI << 8, 0x00) },
+	{ 0 },
+};
+MODULE_DEVICE_TABLE(pci, p2p_id_tbl);
+
+static void p2p_setup_bus_numbers(struct pci_dev *dev, struct pci_bus *bus)
+{
+	u32 buses;
+
+	pci_read_config_dword(dev, PCI_PRIMARY_BUS, &buses);
+
+	bus->primary = buses & 0xFF;
+	bus->secondary = (buses >> 8) & 0xFF;
+	bus->subordinate = (buses >> 16) & 0xFF;
+}
+
+static void pci_enable_crs(struct pci_dev *dev)
+{
+	u16 cap, rpctl;
+	int rpcap = pci_find_capability(dev, PCI_CAP_ID_EXP);
+	if (!rpcap)
+		return;
+
+	pci_read_config_word(dev, rpcap + PCI_CAP_FLAGS, &cap);
+	if (((cap & PCI_EXP_FLAGS_TYPE) >> 4) != PCI_EXP_TYPE_ROOT_PORT)
+		return;
+
+	pci_read_config_word(dev, rpcap + PCI_EXP_RTCTL, &rpctl);
+	rpctl |= PCI_EXP_RTCTL_CRSSVE;
+	pci_write_config_word(dev, rpcap + PCI_EXP_RTCTL, rpctl);
+}
+
+static void p2p_prepare_hardware(struct pci_dev *dev, struct pci_bus *bus)
+{
+	u16 bctl;
+
+	/* Disable MasterAbortMode during probing to avoid reporting
+	   of bus errors (in some architectures) */
+	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &bctl);
+	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
+			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+
+	bus->bridge_ctl = bctl;
+
+	pci_enable_crs(dev);
+}
+
+/* FIXME: these need to be defined in linux/pci.h */
+extern struct pci_bus * pci_alloc_bus(void);
+extern int pci_add_bus(struct pci_bus *bus);
+extern struct pci_bus * pci_derive_parent(struct device *);
+
+static int p2p_probe(struct pci_dev *dev, const struct pci_device_id *id)
+{
+	int err, i;
+	struct pci_bus *bus;
+
+	if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE)
+		return -ENODEV;
+
+	bus = pci_alloc_bus();
+
+	if (!bus)
+		return -ENOMEM;
+
+	bus->bridge = &dev->dev;
+	bus->parent = pci_derive_parent(&bus->self->dev);
+	if (!bus->parent) {
+		err = -ENODEV;
+		goto out;
+	}
+
+	bus->ops = bus->parent->ops;
+	bus->sysdata =
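The bus number handling in p2p_setup_bus_numbers() above can be tried out in userspace. The sketch below mirrors the same shifts and masks on a plain 32-bit value. `struct bus_numbers`, `decode_bus_numbers()` and `encode_bus_numbers()` are hypothetical names, and the encode helper is an addition for illustration (the posted patch only reads the register). It assumes the usual layout of the dword at PCI_PRIMARY_BUS: primary, secondary and subordinate bus numbers in the low three bytes, with the secondary latency timer in the top byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three bus numbers of a bridge. */
struct bus_numbers {
	uint8_t primary;
	uint8_t secondary;
	uint8_t subordinate;
};

/* Split the 32-bit register value into its three bus number bytes. */
static void decode_bus_numbers(uint32_t buses, struct bus_numbers *out)
{
	assert(out != NULL);
	out->primary = buses & 0xFF;
	out->secondary = (buses >> 8) & 0xFF;
	out->subordinate = (buses >> 16) & 0xFF;
}

/* Build a new register value, preserving the top byte of the old one. */
static uint32_t encode_bus_numbers(uint32_t old, const struct bus_numbers *in)
{
	assert(in != NULL);
	return (old & 0xFF000000u) |
	       ((uint32_t)in->subordinate << 16) |
	       ((uint32_t)in->secondary << 8) |
	       (uint32_t)in->primary;
}
```

Preserving the top byte matters for any later renumbering support, since rewriting the whole dword would otherwise clobber the latency timer that shares it.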
Re: [RFC][PATCH] add driver matching priorities
On Thu, 2005-02-10 at 18:45 +0000, Russell King wrote:
> On Thu, Feb 10, 2005 at 12:18:37PM -0500, Adam Belay wrote:
> > > I think the issue that Al raises about drivers grabbing devices, and
> > > then trying to unbind them might be a real problem.
> >
> > I agree. Do you think registering every in-kernel driver before probing
> > hardware would solve this problem?
>
> In which case, consider whether we should be tainting the kernel if
> someone loads a device driver, it binds to a device, and then they
> unload that driver.
>
> It's precisely the same situation, and precisely the same mechanics
> as what I've suggested should be going on here. If one scenario is
> inherently buggy, so is the other.
>

I think it would depend on whether the user makes the device busy
before the driver is unloaded. Different device classes may have
different requirements for when and how a device can be removed.

Are there other issues as well? Maybe there are ways to improve driver
start and stop mechanics.

Thanks,
Adam