Bug#357538: immediate cause of problem diagnosed

2006-03-27 Thread Bastian Blank
On Mon, Mar 27, 2006 at 01:55:53AM +0200, maximilian attems wrote:
> the major fe is the old evms major nr. afaik evms switched to newer 117.
> no idea if the libdevmapper patch from lilo got updated in debian?

evms has no major of its own in 2.6, as it uses devmapper and md to do
the work. devmapper has a dynamically allocated major, which is usually
254 in current kernels. md uses the statically allocated major 9.
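
Because the devmapper major is allocated dynamically, it is safer to look
it up in /proc/devices than to assume 254. A minimal sketch, assuming the
usual "major name" layout of that file and the driver name
"device-mapper"; the function name blkdev_major is made up for
illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the block major registered under a driver name.
# Reads a /proc/devices-style listing on stdin.
blkdev_major() {
    awk -v name="$1" '
        /^Block devices:/     { inblock = 1; next }
        /^Character devices:/ { inblock = 0; next }
        inblock && $2 == name { print $1; exit }
    '
}

# Typical use on a running system (output depends on the kernel):
#   blkdev_major device-mapper < /proc/devices
#   blkdev_major md < /proc/devices
```

Only the "Block devices:" section is searched, so a character device with
the same name is not picked up by mistake.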

> parse_numeric() is rudimentary lilo support of initramfs-tools.
> it should work now for most cases of usual block devices.
> my small test program shows that it is not able to parse
> "fe" correctly: it translates that to 254 instead of 63

Can you please describe the format used? Aren't those hexadecimal bytes?

Bastian

-- 
It is more rational to sacrifice one life than six.
-- Spock, "The Galileo Seven", stardate 2822.3



Bug#357538: immediate cause of problem diagnosed

2006-03-26 Thread Ross Boylan
On Mon, Mar 27, 2006 at 01:55:53AM +0200, maximilian attems wrote:
> hello ross,
> 
> adding the evms maintainer on cc.

Trimming Mark Garey, our sysadmin, from the distribution, as I doubt
he wants the blow-by-blow.

> 
> On Mon, 20 Mar 2006, Ross Boylan wrote:
> 
> > On Sun, 2006-03-19 at 19:35 -0800, Ross Boylan wrote:
> > > I've looked at the initramfs image and tried to trace through
> > > execution by inspection (I'm not at the machine).  I may be off base,
> > > but here's what I noticed.  In short, it looks as if the root
> > > parameter of /dev/evms/newroot in lilo.conf is not making it to the
> > > scripts that run off the ramdisk.
> > > 
> > 
> > I've verified this at run (or boot) time. /proc/cmdline has "root=fe06"
> > not the "root=/dev/evms/newroot" from lilo.conf.  As a result, the evms
> > scripts don't think evms is in play, and never run evms_activate. lilo
> > has resolved the partition name to a BIOS address (I guess); that's fine
> > for lilo, but not great for others.
> 
> the major fe is the old evms major nr. afaik evms switched to newer 117.
> no idea if the libdevmapper patch from lilo got updated in debian?
>  

One of the evms developers suggested checking if lilo was properly
patched, and that the behavior I saw suggested it wasn't.  Debian's
lilo sounds as if it has the right patch, but maybe it doesn't.  Also
I need to compare the way I passed parameters in to the way on the
evms recommendation.  As I indicated earlier, append= rather than
root= may be required.

Further investigation of the Debian lilo patches, and of my lilo.conf
vs the one recommended by evms, is on my to-do list.

I don't completely follow the discussion of the device number for
evms.  But it does seem that the device name, as well as the number,
is necessary for things to work with evms.

[snip]

> > I'm less sure what the solution is, or even who is misbehaving.  It
> > looks like a lilo problem.  Possible solutions include

> > 4. work around this in initramfs-tools, or maybe in evms.
> the small compat stuff in initramfs-tools should be fixed.
> patches welcome :)

I don't know what the "small compat stuff" refers to.

> > 5. other.
> evms recommends/depends grub?

If it turns out the current lilo doesn't have the necessary patch,
that might be good to note in evms and/or lilo.  However, even in the
current state, there probably are cases in which lilo and evms will
work together OK (e.g., 2.4 kernel, or evms with bd-claim patch
applied, or root volume on a different disk from those managed by
evms, ...).

> > P.S. I had a lot of unsuccessful attempts to build my own initrd to
> > diagnose the problem, and ultimately hacked /usr/share/initramfs-tools/
> > to get the value of /proc/cmdline output.  Then I ran update-initramfs.
> > Any pointers on this would be great.

> you can pass break on the kernel append line to drop at special moments
> in the boot to check. i'm not sure i understood your question?

Thanks.  I didn't discover that until I started looking at the
initramfs code in more detail.  Will debug work also?  It looked as if
it might.

My question was two-fold, with neither part strictly relevant to the
bug :) One was "what's the best way to diagnose problems in the initrd
boot process?"; it looks as if the answer is to append parameters.

Unfortunately, I didn't figure that out, and tried something harder:
making my own alternative initramfs.  I did zcat initrd.img | cpio to
pull out the contents.  I then edited the files in the directory and
reversed the cpio and gzip to make a new initrd.img.  Finally I ran
lilo.  This never worked.  So my second question was "should that have
worked?  What was I doing wrong?"  Looking at the code in initramfs-tools
I see it does a few options differently (uses -dereference), and adds
some stuff afterward.  And maybe it needs to modify the kernel image?
At any rate, when I used updateramfs, that worked.

Names of options, commands, and files in the previous paragraph are
from memory and likely not quite right.
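
For reference, a minimal sketch of the unpack/repack round trip described
above. It is a guess at what may have been missing, not a diagnosis: the
kernel only unpacks an initramfs from a cpio archive in the "newc" format,
so an archive written with cpio's default format will not boot. The
function names and paths are hypothetical.

```shell
#!/bin/sh
# Unpack a gzip-compressed initramfs image ($1) into a directory ($2).
unpack_initramfs() {
    image=$1 dir=$2
    mkdir -p "$dir" || return 1
    # -i extracts, -d creates leading directories as needed.
    gzip -dc "$image" | (cd "$dir" && cpio -i -d)
}

# Pack a directory tree ($1) into a gzip-compressed initramfs image ($2).
repack_initramfs() {
    dir=$1 image=$2
    # -H newc selects the cpio format the kernel's initramfs unpacker expects.
    (cd "$dir" && find . | cpio -o -H newc) | gzip -9 > "$image"
}

# Example (paths are placeholders):
#   unpack_initramfs /boot/initrd.img /tmp/initrd-work
#   ... edit files under /tmp/initrd-work ...
#   repack_initramfs /tmp/initrd-work /boot/initrd.img.test
```

lilo still has to be rerun after the image changes, as was done in the
attempt described above.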

> regards and sorry for the late reply.

Hey, we all are busy.  Thanks for your response, and I'll try to get
to the further investigations noted above.

Ross


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Bug#357538: immediate cause of problem diagnosed

2006-03-26 Thread maximilian attems
hello sesse,

On Mon, Mar 27, 2006 at 02:04:58AM +0200, Steinar H. Gunderson wrote:
> On Mon, Mar 27, 2006 at 01:55:53AM +0200, maximilian attems wrote:
> > parse_numeric() is rudimentary lilo support of initramfs-tools.
> > it should work now for most cases of usual block devices.
> > my small test program shows that it is not able to parse
> > "fe" correctly: it translates that to 254 instead of 63
> 
> What did I miss here? Isn't 0xFE = 254?

hmm i may have missed basic courses..

afaik the old evms major is 63
$ bc
FE
63

thanks for your input.

--
maks





Bug#357538: immediate cause of problem diagnosed

2006-03-26 Thread Steinar H. Gunderson
On Mon, Mar 27, 2006 at 01:55:53AM +0200, maximilian attems wrote:
> parse_numeric() is rudimentary lilo support of initramfs-tools.
> it should work now for most cases of usual block devices.
> my small test program shows that it is not able to parse
> "fe" correctly: it translates that to 254 instead of 63

What did I miss here? Isn't 0xFE = 254?

/* Steinar */
-- 
Homepage: http://www.sesse.net/





Bug#357538: immediate cause of problem diagnosed

2006-03-26 Thread maximilian attems
hello ross,

adding the evms maintainer on cc.

On Mon, 20 Mar 2006, Ross Boylan wrote:

> On Sun, 2006-03-19 at 19:35 -0800, Ross Boylan wrote:
> > I've looked at the initramfs image and tried to trace through
> > execution by inspection (I'm not at the machine).  I may be off base,
> > but here's what I noticed.  In short, it looks as if the root
> > parameter of /dev/evms/newroot in lilo.conf is not making it to the
> > scripts that run off the ramdisk.
> > 
> 
> I've verified this at run (or boot) time. /proc/cmdline has "root=fe06"
> not the "root=/dev/evms/newroot" from lilo.conf.  As a result, the evms
> scripts don't think evms is in play, and never run evms_activate. lilo
> has resolved the partition name to a BIOS address (I guess); that's fine
> for lilo, but not great for others.

the major fe is the old evms major nr. afaik evms switched to newer 117.
no idea if the libdevmapper patch from lilo got updated in debian?
 
> In general, I don't think there's any guarantee that the root= option
> will resolve to anything at the time lilo is run; it might only be
> visible after the evms_activate in the new environment.
> 
> parse_numeric() has code that deals with this case, but it doesn't solve
> the evms problem.

parse_numeric() is rudimentary lilo support of initramfs-tools.
it should work now for most cases of usual block devices.
my small test program shows that it is not able to parse
"fe" correctly: it translates that to 254 instead of 63

--
#!/bin/dash
test () {
echo "$1"
minora=$((0x${1#??}))
minorb=$((0x${1#?}))
major=$((0x${1%??}))
echo "$major $minora $minorb"
}
# broken major
test 'fe06'
# works
test '341'
test '2206'
--
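
For comparison, a minimal sketch that decodes the number arithmetically
instead of slicing the string. It assumes the old 16-bit device-number
layout (high byte major, low byte minor), which may not cover every value
lilo can emit; decode_root is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: split a hex device number such as "fe06" into
# "major minor", assuming the old 16-bit layout (major << 8 | minor).
decode_root() {
    value=$((0x$1))
    echo "$((value >> 8)) $((value & 0xff))"
}

decode_root fe06   # 254 6
decode_root 341    # 3 65
decode_root 2206   # 34 6
```

Under this layout "fe06" is major 254, minor 6, and the three-digit "341"
needs no special case because the shift handles the missing leading zero.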


 
> I'm less sure what the solution is, or even who is misbehaving.  It
> looks like a lilo problem.  Possible solutions include
> 1. don't use lilo
grub gets much more testing due to being the default bootloader these days.

> 2. use lilo but pass explicit kernel options with append=.  I'm not sure
> if that would work.
> 3. change lilo to preserve the value given to it in the root= parameter,
> at least with initrd.
> 4. work around this in initramfs-tools, or maybe in evms.
the small compat stuff in initramfs-tools should be fixed.
patches welcome :)

> 5. other.
evms recommends/depends grub?
 
> P.S. I had a lot of unsuccessful attempts to build my own initrd to
> diagnose the problem, and ultimately hacked /usr/share/initramfs-tools/
> to get the value of /proc/cmdline output.  Then I ran update-initramfs.
> Any pointers on this would be great. 

you can pass break on the kernel append line to drop at special moments
in the boot to check. i'm not sure i understood your question?

regards and sorry for the late reply.

-- 
maks





Bug#357538: immediate cause of problem diagnosed

2006-03-20 Thread Ross Boylan
On Sun, 2006-03-19 at 19:35 -0800, Ross Boylan wrote:
> I've looked at the initramfs image and tried to trace through
> execution by inspection (I'm not at the machine).  I may be off base,
> but here's what I noticed.  In short, it looks as if the root
> parameter of /dev/evms/newroot in lilo.conf is not making it to the
> scripts that run off the ramdisk.
> 

I've verified this at run (or boot) time. /proc/cmdline has "root=fe06"
not the "root=/dev/evms/newroot" from lilo.conf.  As a result, the evms
scripts don't think evms is in play, and never run evms_activate. lilo
has resolved the partition name to a BIOS address (I guess); that's fine
for lilo, but not great for others.
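
The check above can be reproduced with a few lines of shell. A minimal
sketch; cmdline_root is a hypothetical name, and this is not the code
initramfs-tools itself uses:

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last root= argument in a
# kernel command line passed as $1. The body runs in a subshell so that
# disabling globbing does not leak into the caller.
cmdline_root() (
    set -f
    root=
    for arg in $1; do
        case $arg in
            root=*) root=${arg#root=} ;;
        esac
    done
    echo "$root"
)

# On a booted system:
#   cmdline_root "$(cat /proc/cmdline)"
```

A result made of hex digits rather than a /dev path suggests the boot
loader replaced the device name with a device number before the kernel
saw it.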

In general, I don't think there's any guarantee that the root= option
will resolve to anything at the time lilo is run; it might only be
visible after the evms_activate in the new environment.

parse_numeric() has code that deals with this case, but it doesn't solve
the evms problem.

I'm less sure what the solution is, or even who is misbehaving.  It
looks like a lilo problem.  Possible solutions include
1. don't use lilo
2. use lilo but pass explicit kernel options with append=.  I'm not sure
if that would work.
3. change lilo to preserve the value given to it in the root= parameter,
at least with initrd.
4. work around this in initramfs-tools, or maybe in evms.
5. other.

Anyone who is confident should feel free to reassign or reclassify this
bug.  My next step will probably be to switch to grub.

Again, these issues appear to be evms-specific.  They may affect other
ramdisk generators, however.

P.S. I had a lot of unsuccessful attempts to build my own initrd to
diagnose the problem, and ultimately hacked /usr/share/initramfs-tools/
to get the value of /proc/cmdline output.  Then I ran update-initramfs.
Any pointers on this would be great. 


