Re: Random panic in load_balance() with 3.16-rc

2014-08-04 Thread Steven Rostedt
On Fri, 25 Jul 2014 11:29:06 -0700
Linus Torvalds torva...@linux-foundation.org wrote:

 On Fri, Jul 25, 2014 at 7:02 AM, Steven Rostedt rost...@goodmis.org wrote:
 
  But wouldn't it be rather trivial to run a static analyzer on the final
  vmlinux to make sure there are no red zones? I mean, you would only need
  to read each function and check to make sure that the offset of rbp is
  within the change of rsp, wouldn't you?
 
  Almost seems like an objdump -rd into a perl script could do this.
 
 I'm sure it's possible, but it sounds potentially complicated. It's
 not like the function prologue is fixed, and gcc will create code
 (including conditional branches etc) before the whole frame setup if
 there are simple things that can be done purely with the
 callee-clobbered registers etc.
 
 Some simple pattern to make sure that the sub $frame-size,%rsp comes
 before any accesses to (%rbp) (when frame pointers are enabled)
 *might* work, but it might also end up missing things.
 
 You want to try?
 

I tried :-)

Seems to stay clean with my x86_64 vmlinux build. I compiled the fair.s
file and tested it and it got this:

$ objump -dr fair.o | ./check-vmlinux.pl
ERROR: load_balance: depth=-40 offset=136 at 4210
3572:   48 c7 85 78 ff ff ffmovq   $0x0,-0x88(%rbp)
ERROR: sched_group_set_shares: depth=8 offset=48 at 5734
4a79:   48 89 45 d0 mov%rax,-0x30(%rbp)
ERROR: sched_group_set_shares: depth=8 offset=48 at 5743
4a9c:   48 8b 75 d0 mov-0x30(%rbp),%rsi

The sched_group_set_shares errors seem to be bogus as that is done
after %rbp has been popped. Looks to be code that was jumped to from
the main body. I could probably fix this by ignoring code after a pop
of %rbp. Or make it more complex and be a bit smarter about branches.

The script will read a file if passed by parameter, otherwise it reads
standard input.

e.g.

$ objump -dr fair.o | ./check-vmlinux.pl

or

$ objdump -dr fair.o  /tmp/fair.dump
$ ./check-vmlinux.pl /tmp/fair.dump

produce the same.

It's a little hacky, but I only spent a half hour at most on it.

-- Steve


check-vmlinux.pl
Description: Perl program


Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Thu, Jul 24, 2014 at 08:55:28PM -0700, Alexei Starovoitov wrote:
 
 -mno-red-zone only affected prologue emition in gcc. This part didn't
 change between the releases. So the bug is quite deep.
 What seems to be happening is that 2nd pass of instruction scheduler
 (after emit prologue and reg alloc) is ignoring barrier properties
 of 'subq $184, %rsp' and moving 'movq $.., -136(%rbp)' instruction
 ahead of it. afaik rtl sched was never aware of 'red-zone'.
 As an ex-compiler guy, I'm worried that this bug exists in earlier
 releases. rtl backend guys need to take a serious look at it.
 imo this is very serious bug, since broken red-zone is extremelly
 hard to debug.

But wouldn't it be rather trivial to run a static analyzer on the final
vmlinux to make sure there are no red zones? I mean, you would only need
to read each function and check to make sure that the offset of rbp is
within the change of rsp, wouldn't you?

Almost seems like an objdump -rd into a perl script could do this.

-- Steve


 There are two weak test in gcc testsuite related to -mno-red-zone,
 but not a single test that actually check that it is doing
 the right thing. It is scary. I hope I'm wrong with this analysis.
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20140725140237.gb32...@home.goodmis.org



Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Fri, 25 Jul 2014 11:29:06 -0700
Linus Torvalds torva...@linux-foundation.org wrote:

 On Fri, Jul 25, 2014 at 7:02 AM, Steven Rostedt rost...@goodmis.org wrote:
 
  But wouldn't it be rather trivial to run a static analyzer on the final
  vmlinux to make sure there are no red zones? I mean, you would only need
  to read each function and check to make sure that the offset of rbp is
  within the change of rsp, wouldn't you?
 
  Almost seems like an objdump -rd into a perl script could do this.
 
 I'm sure it's possible, but it sounds potentially complicated. It's
 not like the function prologue is fixed, and gcc will create code
 (including conditional branches etc) before the whole frame setup if
 there are simple things that can be done purely with the
 callee-clobbered registers etc.
 
 Some simple pattern to make sure that the sub $frame-size,%rsp comes
 before any accesses to (%rbp) (when frame pointers are enabled)
 *might* work, but it might also end up missing things.
 
 You want to try?
 

Yeah, I could write something up. I probably wont get to it for a week
or two, but it shouldn't be too hard.

As you said, it will probably miss the complex cases where gcc finishes
the frame later in the function or with branches and such. But at least
it should be able to catch any totally retard set up. I compiled
Michel's file and I'll make sure that it at least catches that:

3572:   48 c7 85 78 ff ff ffmovq   $0x0,-0x88(%rbp)
3579:   00 00 00 00 
3579: R_X86_64_32S  load_balance_mask
357d:   48 81 ec b8 00 00 00sub$0xb8,%rsp


-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20140725151011.148db...@gandalf.local.home



Re: Random panic in load_balance() with 3.16-rc

2014-07-25 Thread Steven Rostedt
On Fri, 25 Jul 2014 13:01:11 -0700
Linus Torvalds torva...@linux-foundation.org wrote:


 For example, gcc will not create a small stack frame with sub
 $8,%rsp. No, what gcc does is to use a random push instruction.
 Fair enough, but that really makes things much harder to see. Here's
 an example:
 
   813143a3 dock_notify:
   813143a3:   55  push   %rbp
   813143a4:   48 89 e5mov%rsp,%rbp
   813143a7:   41 57   push   %r15
   813143a9:   41 56   push   %r14
   813143ab:   49 89 femov%rdi,%r14
   813143ae:   41 55   push   %r13
   813143b0:   41 89 f5mov%esi,%r13d
   813143b3:   41 54   push   %r12
   813143b5:   53  push   %rbx
   813143b6:   51  push   %rcx
   ...
   81314501:   48 8b 7e 08 mov0x8(%rsi),%rdi
   81314505:   48 89 75 d0 mov%rsi,-0x30(%rbp)
   81314509:   e8 5f d1 ff ff  callq
 8131166d acpi_bus_scan
   8131450e:   85 c0   test   %eax,%eax
   ...
   813145d6:   5a  pop%rdx
   813145d7:   5b  pop%rbx
   813145d8:   44 89 e0mov%r12d,%eax
   813145db:   41 5c   pop%r12
   813145dd:   41 5d   pop%r13
   813145df:   41 5e   pop%r14
   813145e1:   41 5f   pop%r15
   813145e3:   5d  pop%rbp
   813145e4:   c3  retq
 
 note the use (deep down in the function) of -0x30(%rbp), and note how
 it does pop %rdx twice to undo the push %rcx. It was just to
 allocate space.

I don't see a pop %rdx twice. Sure you're not suffering from a little
dyslexia? ;-)  But I do get your point. The rdx is popped where the rcx
was, and both are useless, as rcx and rdx are volatile regs.


 
 So you definitely have to track the actual stack pointer updates, not
 just the patterns of add/sub to %rsp.

With Perl that would be rather trivial. I'm more concerned with branch
logic. I'll see if I can include some simple branch logic too to
flatten paths. But I wont really know the depth of this until I start
hacking at it.

-- Steve




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20140725161318.3dd77...@gandalf.local.home



Bug#709647: [75/88] genirq: Fix can_request_irq() for IRQs without an action

2013-08-13 Thread Steven Rostedt
3.6.11.7-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Ben Hutchings b...@decadent.org.uk

[ Upstream commit 2779db8d37d4b542d9ca2575f5f178dbeaca6c86 ]

Commit 02725e7471b8 ('genirq: Use irq_get/put functions'),
inadvertently changed can_request_irq() to return 0 for IRQs that have
no action.  This causes pcibios_lookup_irq() to select only IRQs that
already have an action with IRQF_SHARED set, or to fail if there are
none.  Change can_request_irq() to return 1 for IRQs that have no
action (if the first two conditions are met).

Reported-by: Bjarni Ingi Gislason bjarn...@rhi.hi.is
Tested-by: Bjarni Ingi Gislason bjarn...@rhi.hi.is (against 3.2)
Signed-off-by: Ben Hutchings b...@decadent.org.uk
Cc: 709...@bugs.debian.org
Cc: sta...@vger.kernel.org # 2.6.39+
Link: http://bugs.debian.org/709647
Link: 
http://lkml.kernel.org/r/1372383630.23847.40.ca...@deadeye.wl.decadent.org.uk
Signed-off-by: Thomas Gleixner t...@linutronix.de
Signed-off-by: Steven Rostedt rost...@goodmis.org
---
 kernel/irq/manage.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 4c69326..50a0a66 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -554,9 +554,9 @@ int can_request_irq(unsigned int irq, unsigned long 
irqflags)
return 0;
 
if (irq_settings_can_request(desc)) {
-   if (desc-action)
-   if (irqflags  desc-action-flags  IRQF_SHARED)
-   canrequest =1;
+   if (!desc-action ||
+   irqflags  desc-action-flags  IRQF_SHARED)
+   canrequest = 1;
}
irq_put_desc_unlock(desc, flags);
return canrequest;
-- 
1.7.10.4


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130813155835.439750...@goodmis.org



Bug#700333: [ 059/136 ] clockevents: Set dummy handler on CPU_DEAD shutdown

2013-05-17 Thread Steven Rostedt
3.6.11.4 stable review patch.
If anyone has any objections, please let me know.

--

From: Thomas Gleixner t...@linutronix.de

[ Upstream commit 6f7a05d7018de222e40ca003721037a530979974 ]

Vitaliy reported that a per cpu HPET timer interrupt crashes the
system during hibernation. What happens is that the per cpu HPET timer
gets shut down when the nonboot cpus are stopped. When the nonboot
cpus are onlined again the HPET code sets up the MSI interrupt which
fires before the clock event device is registered. The event handler
is still set to hrtimer_interrupt, which then crashes the machine due
to highres mode not being active.

See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700333

There is no real good way to avoid that in the HPET code. The HPET
code alrady has a mechanism to detect spurious interrupts when event
handler == NULL for a similar reason.

We can handle that in the clockevent/tick layer and replace the
previous functional handler with a dummy handler like we do in
tick_setup_new_device().

The original clockevents code did this in clockevents_exchange_device(),
but that got removed by commit 7c1e76897 (clockevents: prevent
clockevent event_handler ending up handler_noop) which forgot to fix
it up in tick_shutdown(). Same issue with the broadcast device.

Reported-by: Vitaliy Fillipov vita...@yourcmc.ru
Cc: Ben Hutchings b...@decadent.org.uk
Cc: sta...@vger.kernel.org
Cc: 700...@bugs.debian.org
Signed-off-by: Thomas Gleixner t...@linutronix.de
Signed-off-by: Steven Rostedt rost...@goodmis.org
---
 kernel/time/tick-broadcast.c |4 
 kernel/time/tick-common.c|1 +
 2 files changed, 5 insertions(+)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index a13987a..239a323 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -66,6 +66,8 @@ static void tick_broadcast_start_periodic(struct 
clock_event_device *bc)
  */
 int tick_check_broadcast_device(struct clock_event_device *dev)
 {
+   struct clock_event_device *cur = tick_broadcast_device.evtdev;
+
if ((dev-features  CLOCK_EVT_FEAT_DUMMY) ||
(tick_broadcast_device.evtdev 
 tick_broadcast_device.evtdev-rating = dev-rating) ||
@@ -73,6 +75,8 @@ int tick_check_broadcast_device(struct clock_event_device 
*dev)
return 0;
 
clockevents_exchange_device(tick_broadcast_device.evtdev, dev);
+   if (cur)
+   cur-event_handler = clockevents_handle_noop;
tick_broadcast_device.evtdev = dev;
if (!cpumask_empty(tick_get_broadcast_mask()))
tick_broadcast_start_periodic(dev);
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index da6c9ec..ead79bc 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -323,6 +323,7 @@ static void tick_shutdown(unsigned int *cpup)
 */
dev-mode = CLOCK_EVT_MODE_UNUSED;
clockevents_exchange_device(dev, NULL);
+   dev-event_handler = clockevents_handle_noop;
td-evtdev = NULL;
}
raw_spin_unlock_irqrestore(tick_device_lock, flags);
-- 
1.7.10.4


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130518021651.264575...@goodmis.org



Re: PREEMPT_RT vs 'hrtimer: Prevent hrtimer_enqueue_reprogram race'

2013-03-21 Thread Steven Rostedt
On Fri, 2013-03-22 at 01:11 +, Ben Hutchings wrote:
 Commit b22affe0aef4 'hrtimer: Prevent hrtimer_enqueue_reprogram race'
 conflicts with the RT patches
 hrtimer-fixup-hrtimer-callback-changes-for-preempt-r.patch and
 peter_zijlstra-frob-hrtimer.patch, as they all change
 hrtimer_enqueue_reprogram().  It seems that the changes in the RT
 patches now belong in __hrtimer_start_range_ns().
 
 Since I haven't seen any RT releases in a while, here's what I came up
 with for 3.2-rt:

Note, I posted a fix on Tuesday:

https://lkml.org/lkml/2013/3/19/369

I'm waiting for Thomas to give his OK on it before releasing the series.
He told me he'll have a look at it tomorrow. I've already ran the series
through all my tests, and will post it immediately after I get the OK.
Or if there's a issue I will have to fix it and rerun my tests.


 
 ---
 From: Thomas Gleixner t...@linutronix.de
 Date: Fri, 3 Jul 2009 08:44:31 -0500
 Subject: hrtimer: fixup hrtimer callback changes for preempt-rt
 
 In preempt-rt we can not call the callbacks which take sleeping locks
 from the timer interrupt context.
 
 Bring back the softirq split for now, until we fixed the signal
 delivery problem for real.
 
 Signed-off-by: Thomas Gleixner t...@linutronix.de
 Signed-off-by: Ingo Molnar mi...@elte.hu
 [bwh: Pull the changes to hrtimer_enqueue_reprogram() up into
  __hrtimer_start_range_ns(), following changes in
  commit b22affe0aef4 'hrtimer: Prevent hrtimer_enqueue_reprogram race'
  backported into 3.2.40]
 Signed-off-by: Ben Hutchings b...@decadent.org.uk
 ---
 --- a/include/linux/hrtimer.h
 +++ b/include/linux/hrtimer.h
 @@ -111,6 +111,8 @@ struct hrtimer {
   enum hrtimer_restart(*function)(struct hrtimer *);
   struct hrtimer_clock_base   *base;
   unsigned long   state;
 + struct list_headcb_entry;
 + int irqsafe;
  #ifdef CONFIG_TIMER_STATS
   int start_pid;
   void*start_site;
 @@ -147,6 +149,7 @@ struct hrtimer_clock_base {
   int index;
   clockid_t   clockid;
   struct timerqueue_head  active;
 + struct list_headexpired;
   ktime_t resolution;
   ktime_t (*get_time)(void);
   ktime_t softirq_time;
 --- a/kernel/hrtimer.c
 +++ b/kernel/hrtimer.c
 @@ -589,8 +589,7 @@ static int hrtimer_reprogram(struct hrti
* When the callback is running, we do not reprogram the clock event
* device. The timer callback is either running on a different CPU or
* the callback is executed in the hrtimer_interrupt context. The
 -  * reprogramming is handled either by the softirq, which called the
 -  * callback or at the end of the hrtimer_interrupt.
 +  * reprogramming is handled at the end of the hrtimer_interrupt.
*/
   if (hrtimer_callback_running(timer))
   return 0;
 @@ -625,6 +624,9 @@ static int hrtimer_reprogram(struct hrti
   return res;
  }
  
 +static void __run_hrtimer(struct hrtimer *timer, ktime_t *now);
 +static int hrtimer_rt_defer(struct hrtimer *timer);
 +
  /*
   * Initialize the high resolution related parts of cpu_base
   */
 @@ -730,6 +732,11 @@ static inline int hrtimer_enqueue_reprog
  }
  static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base) { }
  static inline void retrigger_next_event(void *arg) { }
 +static inline int hrtimer_reprogram(struct hrtimer *timer,
 + struct hrtimer_clock_base *base)
 +{
 + return 0;
 +}
  
  #endif /* CONFIG_HIGH_RES_TIMERS */
  
 @@ -861,9 +868,9 @@ void hrtimer_wait_for_timer(const struct
  {
   struct hrtimer_clock_base *base = timer-base;
  
 - if (base  base-cpu_base  !hrtimer_hres_active(base-cpu_base))
 + if (base  base-cpu_base  !timer-irqsafe)
   wait_event(base-cpu_base-wait,
 - !(timer-state  HRTIMER_STATE_CALLBACK));
 +!(timer-state  HRTIMER_STATE_CALLBACK));
  }
  
  #else
 @@ -913,6 +920,11 @@ static void __remove_hrtimer(struct hrti
   if (!(timer-state  HRTIMER_STATE_ENQUEUED))
   goto out;
  
 + if (unlikely(!list_empty(timer-cb_entry))) {
 + list_del_init(timer-cb_entry);
 + goto out;
 + }
 +
   next_timer = timerqueue_getnext(base-active);
   timerqueue_del(base-active, timer-node);
   if (timer-node == next_timer) {
 @@ -1011,6 +1023,26 @@ int __hrtimer_start_range_ns(struct hrti
*/
   if (leftmost  new_base-cpu_base == __get_cpu_var(hrtimer_bases)
hrtimer_enqueue_reprogram(timer, new_base)) {
 +#ifdef CONFIG_PREEMPT_RT_BASE
 + again:

What kernel are you working with? I don't see anywhere the again:
within a PREEMPT_RT_BASE block.

 + /*
 +  * Move softirq based timers away from the rbtree in
 + 

Re: PREEMPT_RT vs 'hrtimer: Prevent hrtimer_enqueue_reprogram race'

2013-03-21 Thread Steven Rostedt
On Fri, 2013-03-22 at 03:24 +, Ben Hutchings wrote:

  Note, I posted a fix on Tuesday:
  
  https://lkml.org/lkml/2013/3/19/369
 
 Thanks.  I did search GMANE with some obvious terms but I think its
 index is lagging.

It didn't help that my subject had no mention of -rt in it :-(

 I'm rebasing the rt patch series generated with
 'git format-patch v3.2.39..v3.2.39-rt59-rebase' on top of v3.2.41 (plus
 Debian changes, which introduce some trivial textual conflicts).

Ah, OK, I do it the opposite way. As the stable rt git never rebases, I
always merge the latest stable into my tree. I then create a rebased
tree as well. When I do that, I'm sure I'll end up fixing the patches up
pretty much the same as you have.

Thanks,

-- Steve




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1363923275.6345.93.ca...@gandalf.local.home



Re: [opensuse-kernel] Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Sat, Jul 14, 2012 at 07:48:27PM +0200, Borislav Petkov wrote:
 
 Let's have an example: when I have to build upstream on a distro here,
 I take the distro config and use it despite that it takes a long time
 to build since everything is module - it is still better for me to
 wait that one time instead of doing a dozen of trial and errors after
 forgetting a config option each time.

This is where 'make localmodconfig' does help. It can remove a lot of
modules for you. And I just recently fixed a bug in the tool that it now
removes even more modules (The fix is in linux-next).

Also, if you are building on another box than what the kernel is for,
you can go to that box and run 'lsmod  /tmp/lsmod'. Copy that file to
the build machine (into /tmp/lsmod), and then run
'make LSMOD=/tmp/lsmod localmodconfig', and this will remove the modules
not used by the target box.

-- Steve


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120719144217.gc16...@home.goodmis.org



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Fri, Jul 13, 2012 at 02:17:30PM -0700, Linus Torvalds wrote:
 
 The *two* requirements (and they're really the same theme) I
 personally think we should have for this are
 
  -  I think every single select for these things should come with a
 comment about what it is about and why the distro needs it (to show
 there was some thought involved and not just a blind took it from the
 distro config)

What about expanding on Alan's idea. I'm guessing that 99% of the users
build the kernel for the box that they are running. If this is the case,
perhaps we can get the distros to add a:

  /usr/share/Linux/Kconfig

And this Kconfig would have something like:

bool Distro X config
 select A
 select B
 select C
 [...]

Perhaps with a comment for each select. Or have the comments in the help
section.

Then have the kernel kbuild system check if this file exists and include
it.

Of course the kbuild system would need to verify that the selects exist,
and perhaps warn if they do not. But the nice thing about this is that
you would get the minconfig for the system you are running. When the
system is updated to a new version, the minconfig would be updated too.
The list of selects would not have to live in the kernel, nor would the
kernel need to maintain the list for N+1 different distributions.


 
  - It should be about *minimal* settings. I'd rather have too few
 things and the occasional complaint about oh, it didn't work because
 it missed XYZ than have it grow to contain all the options just
 because somebody decided to just add random things until things
 worked.

Side note, and this is for the 1%. If you want a true minconfig for your
system, ktest can do that for you. You can set it up to run a test to
create a minimum config that will boot (and optionally run some test you
specify). It turns off configs in order of importance (chooses those
that select a lot, or are depended on most, first), and sees if it can
boot without the config. The end result can be rather a very small set
of configs.

See tools/testing/ktest/examples/include/min-config.conf for more
details.

-- Steve

 
 Other than that, even if it only gets you *closer* to a kernel that
 works with that distro, I think it doesn't have to be all that
 perfect. Because the alternative is what we have now.
 
Linus
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120719152618.gd16...@home.goodmis.org



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 11:45 -0400, Josh Boyer wrote:
 Of course the kbuild system would need to verify that the selects exist,
  and perhaps warn if they do not. But the nice thing about this is that
  you would get the minconfig for the system you are running. When the
  system is updated to a new version, the minconfig would be updated too.
  The list of selects would not have to live in the kernel, nor would the
  kernel need to maintain the list for N+1 different distributions.
 
 Is there a reason you don't want distro maintainers to maintain these
 files in the upstream git tree?  (You said the kernel need to
 maintain, but I would expect the distro maintainers to be doing that
 work.)
 
 I think it would actually be beneficial to maintain them upstream
 instead of in distro kernel packaging.  You'd be able to track the
 history of changes with git.  You would see for a given kernel
 version what options are set for each distro (e.g. F17 can support
 NEW_FOO_THING but F16 userspace can't so it doesn't select that).
 Perhaps most importantly, it provides a consolidated view of what
 options various distros are setting and allows the distro maintainers to
 easily do comparisons.

Then we'll have a list of options in each kernel:

 Fedora 16
 Fedora 17
 Fedora 18
 [...]
 Debian x
 Debian x+1
 Debian x+2
 [...]
 Ubuntu y
 Ubuntu y+1
 [...]

What about older kernels? Say you installed Fedora 18 with an older
kernel that doesn't know what to select? Having the distro tell the
kernel what it needs seems to me the easiest for the 99% case.

Also, if something isn't supported by the older kernel, it would warn
the user about it. That way the user can be told that their older kernel
won't work with this version of the distro. And there wont be as many
surprises. If the user is told your init wont work with this kernel
before they compile it, then they shouldn't complain if they decide to
install this older kernel and their box doesn't boot.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342714088.12353.33.ca...@gandalf.stny.rr.com



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 08:43 -0700, Linus Torvalds wrote:
 On Thu, Jul 19, 2012 at 8:26 AM, Steven Rostedt rost...@goodmis.org wrote:
 
  Side note, and this is for the 1%. If you want a true minconfig for your
  system, ktest can do that for you.
 
 Try it, it's actually much harder than it seems. Like allmodconfig, it
 handles the minimum hardware well, but it tends to handle the subtle
 issues really badly.
 
 Many config options cause *very* subtle failures that are almost
 impossible to see. Like firewalls not loading correctly (and leaving
 the machine completely open), or just stuff that you didn't happen to
 test (USB sticks, printers, certain programs) not working. Not having
 the right audit options will make things still work, but you'll get
 warnings at bootup, and who knows what that causes etc etc.
 
 These kinds of things are exactly why I'd like to have a distro config.

This is why it was more of a side note, and for the 1%. If there's
things you have tests for, to confirm that they work, you could add
those to the TEST option, and the config generated will guarantee to fix
them.

But as you stated, there's lots of subtle things that can go wrong. I
was just posting this as a plug for ktest ;-)

For what you want, I think having the distro supply
a /usr/share/Linux/Kconfig that the linux build system can use would be
very helpful.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342714322.12353.36.ca...@gandalf.stny.rr.com



Re: [opensuse-kernel] Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 18:48 +0200, Borislav Petkov wrote:

  Also, if you are building on another box than what the kernel is for,
  you can go to that box and run 'lsmod  /tmp/lsmod'. Copy that file to
  the build machine (into /tmp/lsmod), and then run
  'make LSMOD=/tmp/lsmod localmodconfig', and this will remove the modules
  not used by the target box.
 
 Seriously, this helps only in the cases where the stuff the distro
 actually needs is in modules. So, there probably are obscure situations
 where you need to enable stuff which is bool and not M. Hopefully those
 cases are seldom enough so thanks for this, I'll try that the next time.
 

This is why I created the make-min-config in ktest. It keeps on
disabling configs to see what the machine needs to boot (and optionally
run some test), and what configs it can disable. It does not touch the
multi options though.

It creates two configs. One that has the configs that it can't turn off
(still enabled with a make allnoconfig, or selected by something that it
must have), and a config that just has the configs that 'if I disable
this, the box doesn't boot'.

Here's an example:

For my min-config files with the configs that couldn't be turned off:

$ wc -l config-min*
  117 config-min
  139 config-min-net

The config-min will get the box to boot (no network). The -net, adds
enough to ssh to the box.

$ wc -l config-skip*
 11 config-skip
 14 config-skip-net

The above are the configs that ktest found if it disabled, would not
boot (or ssh).

$ cat config-skip-net
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SATA_AHCI=y
CONFIG_E1000=y
CONFIG_QUOTA=y
CONFIG_ATA=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_DEVTMPFS=y
CONFIG_EXT4_FS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_SERIAL_8250=y
CONFIG_BLK_DEV_SD=y
CONFIG_NET=y
CONFIG_NETDEVICES=y

I can pass the above to a allnoconfig, and the box will boot and allow
ssh. Note, the reason for the serial config, is that this ktest run uses
a serial port to see if the box booted. If the serial isn't there, then
it thinks it failed.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342717366.12353.48.ca...@gandalf.stny.rr.com



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 13:19 -0400, Josh Boyer wrote:

  What about older kernels? Say you installed Fedora 18 with an older
  kernel that doesn't know what to select? Having the distro tell the
  kernel what it needs seems to me the easiest for the 99% case.
 
 How is the above not telling the kernel what it needs?  I'm confused how
 the location of such a file makes it's functionality and usefulness
 differ...  Quite possible I missed what you meant originally, but it
 sounds like we're talking about the same thing?

The point is, the user wont have to think What distro am I running? and
what version am I running?. I don't even know what version of the disto
I'm currently running (Debian testing).

The point is, the current running distro supplies what is needed from
the kernel in order to work properly. The user does not need to 'select'
it. They would only have to select a 'add my distro min configs'.

A developer working with a user could just say, select disto config
without needing to know what distro the user has.

What happens if someone does a yum update, and the kernel requirement
changes slightly. The yum update should update
this /usr/share/Linux/Kconfig. But it's still set at Fedora X. The
kernel can not be updated for these slight changes.

 
 Also, I'm not very convinced the 99% are going to be wanting to install
 shiny new versions of a distro with a kernel older than what the distro
 ships with.  I could be very wrong, but it seems like in-general the
 whole premise of this RFC was geared towards using new kernels on
 distros.

There are times when the update breaks something. A user may backport to
an older kernel where their Gizmo worked. I've done this to get webcams
working. I know I'm not the 99%, but the rational for my operation was a
99% thing to do: Crap, I upgraded my kernel and now my webcam doesn't
work. Oh well, download an older version and boot that one.

 
  Also, if something isn't supported by the older kernel, it would warn
  the user about it. That way the user can be told that their older kernel
  won't work with this version of the distro. And there wont be as many
  surprises. If the user is told your init wont work with this kernel
  before they compile it, then they shouldn't complain if they decide to
  install this older kernel and their box doesn't boot.
 
 kconfig already spits out warnings for symbols being selected that
 don't exist.

We can make these even bigger :-)   Add lots of stars (*) around them!

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342719222.12353.58.ca...@gandalf.stny.rr.com



Re: [opensuse-kernel] Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 19:34 +0200, Borislav Petkov wrote:

  I can pass the above to a allnoconfig, and the box will boot and allow
  ssh. Note, the reason for the serial config, is that this ktest run uses
  a serial port to see if the box booted. If the serial isn't there, then
  it thinks it failed.
 
 I agree with all this and you've explained this to me live already so
 you're preaching to the choir.

Yes, I know you know this already, as we discussed it in a pub over a
beer (choir practice). But this is a public forum on LKML (the church),
where I now have an audience of heathens. Convert! Convert! You are all
sinners!

 
 But it would be a lot faster/easier if users can select, let's call'em
 profiles which are not mutually exclusive and can speed up the
 configuration process. They can either be distro-specific or generic,
 selecting certain features you need.
 
 So configuring your kernel would be like shopping without paying too
 much attention to details. Let's look into the head of a person doing a
 config like that and read some of her thoughts :):
 
 Hm, ok, this new configurator is cool, a lot faster I gotta say... So,
 what do I need, ah, yes, it is an AMD laptop so from vendors I select
 AMD, then I probably need ext4, then I'd like to do packet filtering
 so I should enable iptables.. Oh, I'd like to do tracing too so let's
 enable tracing and trust Steven with the options he's added by default,
 then I need ahci, I'd also like to do encrypted partitions so I'll
 enable device mapper with crypto... 
 
 So all those things could be selectable from that profiles menu without
 having to go through the gazillion of little suboptions and having to
 read help (which is sometimes completely helpless) and figure out do I
 need it or not.
 
 And this would simplify configuration a lot. IMHO, anyway.
 

I totally agree with this. It would be nice to have a profile list where
you can pick and chose what you have installed:

network
nfs
ext3
serial
xen
kvm
etc etc

Where you can pick and choose what general features you want and it
selects all the core infrastructure to get those features usable. It
wouldn't select the device modules needed, you will still need to select
what hardware you have. But it gets most of the work done for you.

But this still doesn't solve Linus's initial request. That would be a
single option that makes your distro boot, and work well. Again, that
option wont have the drivers needed, but it will enable all the core
infrastructure that you need.

Going with my /usr/share/Linux/Kconfig, this could use the profile
options as well. And just select those that are required. But then
again, Linus did want a minimum selection of stuff.

Side note (again), IIRC, select has a bug. If you have Config X
selecting config Y but Y depends on Z, if you enable X, it will enable Y
without enabling Z. I think there were some patches to address this, but
I don't remember.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342720646.12353.67.ca...@gandalf.stny.rr.com



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 13:56 -0400, Josh Boyer wrote:

 Distros aren't stationary things.

Exactly my point.

   I mean, some of them certainly aim
 for that goal, but userspace and kernels get upgraded all the time.  So
 if this distro-Kconfig file is provided by some package _other_ than the
 kernel the distros are going to have a bit of a hassle keeping track of
 it.

How about a directory called /usr/share/Linux/Kconfig.d/

Then have anything installed that needs to work correctly put in its
minimum (must have) requirement configs of the kernel.

Say you are running Debian, and decide to try out systemd. If you set up
your system to run that it would add a file called:

/usr/share/Linux/Kconfig.d/systemd.conf

or something, and this would select things like CGROUPS and the like. We
could make the kernel build select all, or individual files in this
directory. All for the 'make my distro work' or individual for a 'I want
part of my distro to work' option.



 Upgraded the kernel within the confines of that distro, right?  So you
 go back to what was already installed and working.  You don't go back
 arbitrarily far just to see what happens.  I would think a reasonably
 crafted distro config would work in those scenarios.

A reasonable one, but still not the minimum.

One issue with Linus's proposal is that he's asking us to focus on the
99%. But the 99% of who? Because 99% of Linux users do not compile their
own kernels, so he must be asking about the 99% of Linux users that
compile their own kernels. This 99% does not just simply compile their
kernels, but only want to compile the absolutely necessary stuff. That
is, they want their kernels not to include anything they are not using.

A reasonable config would probably need to include a lot that's not
used.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342721620.12353.75.ca...@gandalf.stny.rr.com



Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-19 Thread Steven Rostedt
On Thu, 2012-07-19 at 18:35 -0400, Josh Boyer wrote:

  2... yeah.  I don't really know if that is going to pan out, but I am
  ever hopeful.  I'd be mostly concerned with people that are coding
  userspace applications using every whiz-bang kernel feature.  Or not
  paying attention at all to the kernel after the initial file creation
  and the options going stale (don't follow renames, etc).
  
  it would be determined by the distro maintainers who maintain the
  kernel config for that distro.
 
 Erm... not in Steven's scheme.  At least I don't think distro kernel
 maintainers are going to willingly crawl through every application
 package that might depend on a kernel feature being enabled and maintain
 those files across X number of packages.

Correct. If we keep the selects in the kernel proper, then it would be
the kernel maintainer to make sure it works for the necessary
applications.

If we have a directory called /usr/share/Linux/Kconfig.d/ then the
individual packages could add their needed selects. Now this wouldn't be
for the average package. We don't need emacs developers adding any
configs here. It's just for those packages that are already tightly
coupled with the kernel (systemd, iptables, SELinux tools, etc).

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1342738194.12353.80.ca...@gandalf.stny.rr.com



Re: About ARCH=sparc and what to pass to recordmcount.pl

2011-08-15 Thread Steven Rostedt
On Mon, 2011-08-15 at 17:21 +0200, Uwe Kleine-König wrote:
 Hello,
 
 when I enable CONFIG_FUNCTION_TRACER and CONFIG_DYNAMIC_FTRACE for the
 Debian kernel the build fails with:
 
   CC  init/do_mounts_initrd.o
 Arch sparc is not supported with CONFIG_FTRACE_MCOUNT_RECORD at 
 .../scripts/recordmcount.pl line 368.
 
 This happens because the kernel package build scripts call
 
   make ARCH=sparc
 
 $(ARCH) is passed to scripts/recordmcount.pl and the latter only knows
 about sparc64.
 
 There are several options to fix that:
 
  a) let the package pass ARCH=sparc64 instead of ARCH=sparc
 Then sparc would be the only architecture that doesn't pass the
 (Debian) arch here. I'm not sure there are other reasons to stick to
 ARCH=sparc. Bastian?
 It seems only recordmcount.pl chokes on that because Debian uses
 ARCH=sparc since at least 2.6.32, probably much longer.
 
  b) pass SRCARCH instead of ARCH to recordmcount.pl and do
 s/sparc64/sparc/ in recordmcount.pl.

b may break other archs.

 
  c) do the following in recordmcount.pl:
   if ($arch =~ /sparc(32|64)?/) {
   if ($bits == 64) {
   $arch = sparc64;
   } else {
   $arch = sparc; # or sparc32?
   }
   }

c is fine with me.

 
 Something like that is already done for x86.
  
 Personally I prefer b), but I'm not sure I see all the consequences.

It may break other archs. I rather not do a change that affects all
archs. This is a sparc specific problem. The solution should only affect
sparc. Which to me is option c.

-- Steve

 
 What do you think?
 
 Best regards
 Uwe
 



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1313422385.15704.7.ca...@gandalf.stny.rr.com



Re: About ARCH=sparc and what to pass to recordmcount.pl

2011-08-15 Thread Steven Rostedt
On Mon, 2011-08-15 at 11:33 -0400, Steven Rostedt wrote:
 On Mon, 2011-08-15 at 17:21 +0200, Uwe Kleine-König wrote:
  Hello,
  
  when I enable CONFIG_FUNCTION_TRACER and CONFIG_DYNAMIC_FTRACE for the
  Debian kernel the build fails with:
  
CC  init/do_mounts_initrd.o
  Arch sparc is not supported with CONFIG_FTRACE_MCOUNT_RECORD at 
  .../scripts/recordmcount.pl line 368.
  
  This happens because the kernel package build scripts call
  
  make ARCH=sparc
  
  $(ARCH) is passed to scripts/recordmcount.pl and the latter only knows
  about sparc64.
  
  There are several options to fix that:
  
   a) let the package pass ARCH=sparc64 instead of ARCH=sparc
  Then sparc would be the only architecture that doesn't pass the
  (Debian) arch here. I'm not sure there are other reasons to stick to
  ARCH=sparc. Bastian?
  It seems only recordmcount.pl chokes on that because Debian uses
  ARCH=sparc since at least 2.6.32, probably much longer.
  
   b) pass SRCARCH instead of ARCH to recordmcount.pl and do
  s/sparc64/sparc/ in recordmcount.pl.
 
 b may break other archs.
 
  
   c) do the following in recordmcount.pl:
  if ($arch =~ /sparc(32|64)?/) {
  if ($bits == 64) {
  $arch = sparc64;
  } else {
  $arch = sparc; # or sparc32?
  }
  }
 
 c is fine with me.
 
  
  Something like that is already done for x86.
   
  Personally I prefer b), but I'm not sure I see all the consequences.
 
 It may break other archs. I rather not do a change that affects all
 archs. This is a sparc specific problem. The solution should only affect
 sparc. Which to me is option c.
 

Actually, I think option d) is the best.

d) have sparc support recordmcount.c

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1313422823.15704.8.ca...@gandalf.stny.rr.com



Re: About ARCH=sparc and what to pass to recordmcount.pl

2011-08-15 Thread Steven Rostedt
On Mon, 2011-08-15 at 12:44 -0700, David Miller wrote:
 From: Steven Rostedt rost...@goodmis.org
 Date: Mon, 15 Aug 2011 11:40:23 -0400
 
  Actually, I think option d) is the best.
  
  d) have sparc support recordmcount.c
 
 Maybe you misunderstand what these guys are doing.
 
 The recordmcount.pl script wants to look at the output of the
 architecture of the built kernel.  It's intimitately tied to
 the architecture the kernel was built as, it must therefore
 be invoked with exactly what the kernel was bult with.
 
 These guys are building a 64-bit sparc kernel then trying to use
 recordmcount.pl with the 32-bit sparc ARCH set.
 
 That can't work.
 
 Next, recordmcount.pl doesn't have 32-bit sparc support because we
 don't support building the kernel with profiling options enabled for
 that architecture.

But if they use recordmcount.c instead, then nothing needs to be done
with recordmcount.pl.

recordmcount.c looks at the elf file itself to determine what arch it is
for. If this is supported, then everything should work, and you get a
faster build of the kernel as an extra bonus.

But sure, if we test recordmcount.pl on all archs, and using the SRCARCH
doesn't break anything, then I'm fine with that change too.

-- Steve



-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1313438100.15704.20.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-19 Thread Steven Rostedt
After applying David's remove align patch, I got it to boot on x86_64
with the following two patches. I thought just adding the align to the
structure declaration would work, but it still failed on the syscall for
init_module. By removing the double declaration of event_exit_##sname,
removed this problem.

I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
of them out for 38. Should these go to 37 stable too?

-- Steve

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index cacc27a..0e3bce6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -142,8 +142,6 @@ extern struct trace_event_functions 
exit_syscall_print_funcs;
 #define SYSCALL_TRACE_EXIT_EVENT(sname)
\
static struct syscall_metadata  \
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
-   static struct ftrace_event_call \
-   __attribute__((__aligned__(4))) event_exit_##sname; \
static struct ftrace_event_call __used  \
  __attribute__((__aligned__(4)))   \
  __attribute__((section(_ftrace_events)))\


Index: linux-trace.git/include/linux/ftrace_event.h
===
--- linux-trace.git.orig/include/linux/ftrace_event.h
+++ linux-trace.git/include/linux/ftrace_event.h
@@ -194,7 +194,7 @@ struct ftrace_event_call {
int perf_refcount;
struct hlist_head __percpu  *perf_events;
 #endif
-};
+} __attribute__((__aligned__(32)));
 
 #define PERF_MAX_TRACE_SIZE2048
 





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295451961.12215.1291.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-19 Thread Steven Rostedt
On Wed, 2011-01-19 at 11:15 -0500, Mathieu Desnoyers wrote:
 * Steven Rostedt (rost...@goodmis.org) wrote:
  After applying David's remove align patch, I got it to boot on x86_64
  with the following two patches. I thought just adding the align to the
  structure declaration would work, but it still failed on the syscall for
  init_module. By removing the double declaration of event_exit_##sname,
  removed this problem.
  
  I'll test this on x86 32bit and PPC 64. If it works there, I'll push all
  of them out for 38. Should these go to 37 stable too?
 
 Please hold before adding these patches into git. They don't seem to address 
 the
 underlying problem correctly. See the latest exchanges between David Miller 
 and
 myself for more info.
 
 We need to come up with something better than it boots as an explanation for
 the fix.

Yes, I agree that we should solve this issue correctly. But if there is
a work around to the problem, we could implement that if the real
solution is not in our grasp yet.

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295460822.12215.1389.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-19 Thread Steven Rostedt
On Wed, 2011-01-19 at 13:40 -0800, David Miller wrote:

 My concern is that if there is ever a u64 or similarly long long
 typed member in these tracing structures, it will not be aligned
 sufficiently to avoid unaligned access traps on 32-bit systems.

The structure that gets placed in this section is the ftrace_event_call.
It consists only of pointers, a struct list_head, a couple of int, and
a struct trace_event, which consists of: a struct hlist_node, a struct
list_head, an int, and a pointer.

None of these are more than long and I don't foresee them needing a
long long type. I think assuming that a long is the largest item should
due.

We can add a comment next to these structures specifying this
dependency, and hopefully it would be updated if we ever do include a
long long in them.

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295474423.12215.1612.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Mon, 2011-01-17 at 22:27 -0800, David Miller wrote:

 I'm beginning to think that the align directive is there purposely to
 down-align the structure so that the amount of space that tracing
 information consumes is minimized.
 
 I honestly can't tell, only Steven Rostedt can tell us for sure,
 because there are no commit messages or comments that explain why
 these things need to be there.

You may have missed my earlier repsonse.

The alignment was there to keep the linker from adding holes into the
sections that store this data. As the tracepoints are processed at boot
up like an array (as the initcalls are done). If the linker adds space
into the section as it puts the sections from all the object files
together, it will crash the kernel as we read that section as an array.

The min alignment I used solved this issue (which I hit on x86_64). But
I could also have made the structures aligned to a bigger alignment,
which should also work. But this will add holes between each item in the
array.

 
 If the align directives exist for that reason, then your suggestion
 would likely completely undo what these directives are trying to
 achieve.
 
 Someone mentioned that the default struct alignment on x86-64 is
 something rather rediculious like 32 bytes or something like that.
 Yet someone else suggested to use __aligned__(32) to fix this, so
 color me completely confused.

If you have another idea to keep the linker from breaking the section up
where it can't be read as an array, I'm all ears :-)

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295370344.12215.18.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Mon, 2011-01-17 at 22:35 -0800, David Miller wrote:
 From: Steven Rostedt rost...@goodmis.org
 Date: Mon, 17 Jan 2011 09:15:41 -0500
 
  Again, this is to help the linker keep arrays in tacked. Tracepoints are
  allocated into the tracepoint section, and then read like an array. If
  the linker adds holes as it links sections into one big one, then the
  reading of the array breaks.
  
  We either need to compact it (with the align(4)) which is undesirable,
  or add our own holes like the above does.
 
 Just take away all of the align directives, it should just work.

If that was the case, we would have never added it :-)

 
 If it doesn't, then we should investigate to find out the real reason
 why.

OK, we can remove it and I can see what breaks.

 
 The linker has no reason to add holes, and in fact if we are to believe
 the commit message for 86c38a31aa7f2dd6e74a262710bf8ebf7455acc5
 (tracing: Fix ftrace_event_call alignment for use with gcc 4.5), with
 gcc-4.x where x  5, it did work even though some ftrace_event_call
 declarations lacked the align directive.
 
 If this align thing is so critical, why don't we need it for the
 exception table entries which live in the __ex_table section in the
 kernel?  It's the same _exact_ kind of thing, and the asm sequence we
 use to populate it is identical to what GCC emits for the tracing
 object declarations in question.

But aren't the __ex_table entries just two words? Which would align
nicely regardless.

The ftrace_event_call is a relatively large structure, as it may end on
a 4 byte alignement, the linker may start the next section with more
space to get it back to a 8 byte alignment. This is assuming that x86_64
packs the array in 4 bytes.

Hmm, perhaps changing the alignment in vmlinux.lds.h would fix that.

Anyway, I'll revert that commit and see if I can break it again. If so,
then I'll look for other solutions.

Thanks!

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295371856.12215.29.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Tue, 2011-01-18 at 11:46 -0500, Mathieu Desnoyers wrote:
 * David Miller (da...@davemloft.net) wrote:
  From: David Miller da...@davemloft.net
  Date: Mon, 17 Jan 2011 22:00:39 -0800 (PST)
  
   ftrace: Remove unnecessary alignment tag from ftrace_event_call.
   
   It's completely unnecessary and causes problems on platforms
   where this tag down-aligns the structure's alignment.
   
   Signed-off-by: David S. Miller da...@davemloft.net
   ...
  
  Ok, unless we can explain why these alignments are needed at all, we
  should kill all of them:
 
 ftrace: linker script add missing struct align
 
 We should add the missing STRUCT_ALIGN(); in
 include/asm-generic/vmlinux.lds.h as a preliminary step to remove the ftrace
 bogus structure alignments. Moving all STRUCT_ALIGN() for FTRACE_EVENTS()
 and TRACE_SYSCALLS() into the definitions, so the alignment is only done if
 these infrastructures are configured in. 
 
 Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the section 
 is
 aligned on pointer size.

If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.

Thanks,

-- Steve

 
 Signed-off-by: Mathieu Desnoyers mathieu.desnoy...@efficios.com
 ---
  include/asm-generic/vmlinux.lds.h |   19 ++-
  1 file changed, 10 insertions(+), 9 deletions(-)
 
 Index: linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
 ===
 --- linux-2.6-lttng.orig/include/asm-generic/vmlinux.lds.h
 +++ linux-2.6-lttng/include/asm-generic/vmlinux.lds.h
 @@ -107,7 +107,8 @@
  #endif
  
  #ifdef CONFIG_TRACE_BRANCH_PROFILING
 -#define LIKELY_PROFILE() 
 VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
 +#define LIKELY_PROFILE() STRUCT_ALIGN(); 
   \
 + 
 VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
   *(_ftrace_annotated_branch) 
   \
   VMLINUX_SYMBOL(__stop_annotated_branch_profile) 
 = .;
  #else
 @@ -115,7 +116,8 @@
  #endif
  
  #ifdef CONFIG_PROFILE_ALL_BRANCHES
 -#define BRANCH_PROFILE() VMLINUX_SYMBOL(__start_branch_profile) = .;   \
 +#define BRANCH_PROFILE() STRUCT_ALIGN();   \
 + VMLINUX_SYMBOL(__start_branch_profile) = .;   \
   *(_ftrace_branch) \
   VMLINUX_SYMBOL(__stop_branch_profile) = .;
  #else
 @@ -123,7 +125,8 @@
  #endif
  
  #ifdef CONFIG_EVENT_TRACING
 -#define FTRACE_EVENTS()  VMLINUX_SYMBOL(__start_ftrace_events) = .;  
 \
 +#define FTRACE_EVENTS()  STRUCT_ALIGN(); 
 \
 + VMLINUX_SYMBOL(__start_ftrace_events) = .;  \
   *(_ftrace_events)   \
   VMLINUX_SYMBOL(__stop_ftrace_events) = .;
  #else
 @@ -131,7 +134,8 @@
  #endif
  
  #ifdef CONFIG_TRACING
 -#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .; 
  \
 +#define TRACE_PRINTKS()  . = ALIGN(8);   
\
 +  VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .;  \
*(__trace_printk_fmt) /* Trace_printk fmt' pointer */ \
VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
  #else
 @@ -139,7 +143,8 @@
  #endif
  
  #ifdef CONFIG_FTRACE_SYSCALLS
 -#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .;  
 \
 +#define TRACE_SYSCALLS() STRUCT_ALIGN(); \
 +  VMLINUX_SYMBOL(__start_syscalls_metadata) = .; \
*(__syscalls_metadata) \
VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
  #else
 @@ -169,11 +174,7 @@
   LIKELY_PROFILE()\
   BRANCH_PROFILE()\
   TRACE_PRINTKS() \
 - \
 - STRUCT_ALIGN(); \
   FTRACE_EVENTS() \
 - \
 - STRUCT_ALIGN(); \
   TRACE_SYSCALLS()
  
  /*
 
 
 





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295372019.12215.30.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Tue, 2011-01-18 at 12:33 -0500, Steven Rostedt wrote:
 On Tue, 2011-01-18 at 11:46 -0500, Mathieu Desnoyers wrote:

  Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the 
  section is
  aligned on pointer size.
 
 If I can make it crash without the alignments and this fixes the issue,
 I'll apply both patches.

After applying David's patch, I do indeed get a crash. I'll now apply
yours and see if it fixes the issue.

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295374595.12215.32.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Tue, 2011-01-18 at 13:16 -0500, Steven Rostedt wrote:
 On Tue, 2011-01-18 at 12:33 -0500, Steven Rostedt wrote:
  On Tue, 2011-01-18 at 11:46 -0500, Mathieu Desnoyers wrote:
 
   Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the 
   section is
   aligned on pointer size.
  
  If I can make it crash without the alignments and this fixes the issue,
  I'll apply both patches.
 
 After applying David's patch, I do indeed get a crash. I'll now apply
 yours and see if it fixes the issue.

Your patch doesn't seem to fix it either. I'll investigate this further.

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295375178.12215.33.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-18 Thread Steven Rostedt
On Tue, 2011-01-18 at 15:13 -0500, Mathieu Desnoyers wrote:
 * Steven Rostedt (rost...@goodmis.org) wrote:
  On Tue, 2011-01-18 at 13:16 -0500, Steven Rostedt wrote:
   On Tue, 2011-01-18 at 12:33 -0500, Steven Rostedt wrote:
On Tue, 2011-01-18 at 11:46 -0500, Mathieu Desnoyers wrote:
   
 Also align TRACE_PRINTKS on 8 bytes to make sure the beginning of the 
 section is
 aligned on pointer size.

If I can make it crash without the alignments and this fixes the issue,
I'll apply both patches.
   
   After applying David's patch, I do indeed get a crash. I'll now apply
   yours and see if it fixes the issue.
  
  Your patch doesn't seem to fix it either. I'll investigate this further.
 
 I think David's patch missed kernel/trace/trace_export.c
 
 struct ftrace_event_call __used \
 __attribute__((__aligned__(4))) \
 __attribute__((section(_ftrace_events))) event_##call = { \
 
 and kernel/trace/trace.h:
 
 #define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
 extern struct ftrace_event_call \
 __attribute__((__aligned__(4))) event_##call;
 
 does it help if you remove these ?

I doubt it, but I'll try it anyway.

-- Steve






-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295382144.12215.123.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-17 Thread Steven Rostedt
[ Added Mathieu on Cc, since he likes alignments ;-) ]

On Sun, 2011-01-16 at 11:39 -0800, David Miller wrote:
 From: Richard Mortimer ri...@oldelvet.org.uk
 Date: Sun, 16 Jan 2011 14:17:49 +
 
  I'm wondering if gcc is just getting better at honouring the source
  code. The DEFINE_EVENT macros in include/trace/ftrace.h have a
  __aligned__(4) attribute in them. Maybe that should be 8 on sparc64
  systems.
  The aligned 4 seems to be unchanged since include/trace/ftrace.h was
  created in f42c85e74faa422cf0bc747ed808681145448f88 in April 2009.
 
 That needs to be at least 8 on 64-bit systems.  Why is this aligned
 directive there at all?

IIRC, the problem showed up in 64-bit systems. OK, x86-64 (but of
course ;-).

The problem comes when the linker puts these sections together. We read
all the sections as one big array. If the linker puts in holes, then
this breaks the array, and the kernel crashes while reading the section.

I guess one solution is to remove the alignment at the allocation and
place it at the structure. This will mean all accesses to this structure
will need to be on an alignment.

-- Steve





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295273486.16479.15.ca...@gandalf.stny.rr.com



Bug#609371: linux-image-2.6.37-trunk-sparc64: module scsi_mod: Unknown relocation: 36

2011-01-17 Thread Steven Rostedt
On Mon, 2011-01-17 at 10:22 +, Richard Mortimer wrote:
 On Sun, 2011-01-16 at 22:07 -0800, David Miller wrote:
  From: David Miller da...@davemloft.net
  Date: Sat, 15 Jan 2011 21:17:22 -0800 (PST)
 
   I think the problem we have here is that the _ftrace_events section is
   not aligned sufficiently.  That .align 4 mnemonic is a good indication
   of this.  It should at least 8 on sparc64.
  
 I noticed another potentially 64 bit unfriendly alignment on struct
 tracepoint in include/linux/tracepoint.h. I don't think that the
 alignment of 32 breaks anything but it does leave a 24 byte hole. I
 don't know enough about tracing to know if that is necessary.
 
 struct tracepoint {
 const char *name;   /* Tracepoint name */
 int state;  /* State. */
 void (*regfunc)(void);
 void (*unregfunc)(void);
 struct tracepoint_func *funcs;
 }  __attribute__((aligned(32)));/*
  * Aligned on 32 bytes because it is
  * globally visible and gcc happily
  * align these on the structure size.
  * Keep in sync with vmlinux.lds.h.
  */
 
 Note I spotted this when looking at some residual sparc64 relocation
 issues when _ftrace_events alignment is changed to 8. I'll follow those
 issues up in a separate email when I get time later today.

Again, this is to help the linker keep arrays in tacked. Tracepoints are
allocated into the tracepoint section, and then read like an array. If
the linker adds holes as it links sections into one big one, then the
reading of the array breaks.

We either need to compact it (with the align(4)) which is undesirable,
or add our own holes like the above does.

-- Steve






-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1295273741.16479.19.ca...@gandalf.stny.rr.com