Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-12 Thread Jason Gunthorpe
On Wed, Oct 10, 2012 at 10:55:44AM +0100, Dave Martin wrote:

> As a side comment, some things such as MAC addresses can be probed and
> set from userspace after kernel boot, assuming that you have a way
> to fetch the device description blob from userspace.  If it's accessible
> via MTD I'm guessing you may have some chance of doing that.

Yes, there is a whole wack of other data in this structure (serial
numbers, etc) that are captured and used by userspace, so access is
always made possible.

Setting the MAC via userspace is interesting, I will bear that in
mind. However, we have several development-support boot scenarios that
enable the ethernet before the userspace code to fetch the data
structure is loaded - one of interest has the kernel's initramfs
perform a NBD mount of the actual application filesystem. So having
the kernel tend to this is very convenient.
 
> However, I worry that if you have to run hardware-dependent code in
> order to fetch or generate parts of the DT, then we have a chicken-
> and-egg problem with no guaranteed solution with the frameworks that
> exist today -- although I am not familiar with how DT gets used on
> all other arches, so you might have counterexamples I'm not aware of.

Certainly, in some cases, (like what I have done) such kernels are
non-generic and could only work with the hardware they are intended
for.

However, if you assume the bootloader provides a dtb (even an
incomplete one would do) then fixups/replacements/whatever could be
run based on tags in the dtb without creating a chicken-and-egg
problem.

> At least in ARM-land, the DT is inherently monolithic and static: the
> DT is not expected to change once you enter the kernel.

Sort of, it is the same on all arches, the DT block is expected to
remain valid, but all the of_node pointers go directly into the block
so you can alter values but not the structure without breaking any
invariants. As an example our DTBs all have mac address blocks, with a
0 value. The fixup simply overwrites with the correct value before the
ethernet driver reads from it.

> > I'm not sure there is a great interest in this? What are other folks
> > working on production embedded stuff doing? I suppose that will be
> > more clear as device tree is rolled out.
> 
> For now, the architecturally simplest solution still seems to me to be
> to write your own boot shim which customises the DT before booting the 
> kernel. As discussed, this can continue to look like a simple ELF image
> from the bootloader's point of view -- but I appreciate that it will
> involve effort and some duplication of some low-level driver components.

I admit, I am a bit conflicted. Viewing from the kernel perspective, I
definitely think this kind of very low level extremely board specific
stuff should not be in the kernel. That could be a maintenance
nightmare. However, as an OEM, I need the smallest, simplest boot
loader possible so I want to push this into the kernel to lighten my
development load.

> Your problem is unlikely to affect people outside the embedded space
> too much, but it doesn't sound likely to be unique.

Agreed..

This thread has left the original topic, what do you think I should do
with the linker patch?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-12 Thread Jason Gunthorpe
On Wed, Oct 10, 2012 at 10:55:44AM +0100, Dave Martin wrote:

 As a side comment, some things such as MAC addresses can be probed and
 set from userspace after kernel boot, assuming that you have a way
 to fetch the device description blob from userspace.  If it's accessible
 via MTD I'm guessing you may have some chance of doing that.

Yes, there is a whole wack of other data in this structure (serial
numbers, etc) that are captured and used by userspace, so access is
always made possible.

Setting the MAC via userspace is interesting, I will bear that in
mind. However, we have several development-support boot scenarios that
enable the ethernet before the userspace code to fetch the data
structure is loaded - one of interest has the kernel's initramfs
perform a NBD mount of the actual application filesystem. So having
the kernel tend to this is very convenient.
 
 However, I worry that if you have to run hardware-dependent code in
 order to fetch or generate parts of the DT, then we have a chicken-
 and-egg problem with no guaranteed solution with the frameworks that
 exist today -- although I am not familiar with how DT gets used on
 all other arches, so you might have counterexamples I'm not aware of.

Certainly, in some cases, (like what I have done) such kernels are
non-generic and could only work with the hardware they are intended
for.

However, if you assume the bootloader provides a dtb (even an
incomplete one would do) then fixups/replacements/whatever could be
run based on tags in the dtb without creating a chicken-and-egg
problem.

 At least in ARM-land, the DT is inherently monolithic and static: the
 DT is not expected to change once you enter the kernel.

Sort of, it is the same on all arches, the DT block is expected to
remain valid, but all the of_node pointers go directly into the block
so you can alter values but not the structure without breaking any
invariants. As an example our DTBs all have mac address blocks, with a
0 value. The fixup simply overwrites with the correct value before the
ethernet driver reads from it.

  I'm not sure there is a great interest in this? What are other folks
  working on production embedded stuff doing? I suppose that will be
  more clear as device tree is rolled out.
 
 For now, the architecturally simplest solution still seems to me to be
 to write your own boot shim which customises the DT before booting the 
 kernel. As discussed, this can continue to look like a simple ELF image
 from the bootloader's point of view -- but I appreciate that it will
 involve effort and some duplication of some low-level driver components.

I admit, I am a bit conflicted. Viewing from the kernel perspective, I
definitely think this kind of very low level extremely board specific
stuff should not be in the kernel. That could be a maintenance
nightmare. However, as an OEM, I need the smallest, simplest boot
loader possible so I want to push this into the kernel to lighten my
development load.

 Your problem is unlikely to affect people outside the embedded space
 too much, but it doesn't sound likely to be unique.

Agreed..

This thread has left the original topic, what do you think I should do
with the linker patch?

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-10 Thread Dave Martin
On Tue, Oct 09, 2012 at 12:25:14PM -0600, Jason Gunthorpe wrote:
> On Mon, Oct 08, 2012 at 11:46:49AM +0100, Dave Martin wrote:
> 
> > > Yes, but we still need rely on complex code like I2C/MTD to create a
> > > correct DTB, which again puts us back to patching the kernel for that
> > > functionality.
> > 
> > I'm still confused as to where this complexity is coming from.
> > 
> > If you need to run some complex I2C and MTD code to generate the DT, what
> > is that code doing?  If it is probing to get the information, then can you
> > avoid putting the info in the DT at all?  Primarily the DT is to describe
> > those aspects of the hardware which can't be probed.
> 
> At manufacturing the unit is programmed with a small datastructure that
> contains all the unique ID's (MAC addresses, etc), manufacturing
> variations (ie vendor A or B was used for a socket) and other
> ancillary data.

As a side comment, some things such as MAC addresses can be probed and
set from userspace after kernel boot, assuming that you have a way
to fetch the device description blob from userspace.  If it's accessible
via MTD I'm guessing you may have some chance of doing that.

Identifying the board variant is harder to defer though.

> So a fair amount of I2C and MTD stack is required just to fetch this
> structure - a full CFI probe, for instance, is needed in the NOR flash
> case just to locate the structure.
> 
> Once read, things like MAC addresses are copied into the DTB, and
> certain sections of the DTB are NOP'd out - we have stanzas for chip
> vendor A and vendor B, the one that was not put on the board is
> replaced with NOP.
> 
> Similar to the A/B fix, a further fixup is done based on a runtime
> probe of the programmable devices to learn their current
> configuration/memory map.
> 
> It seems desirable to present a complete/correct DTB to the kernel,
> it doesn't seem there are great places to hook in custom discovery
> procedures.
> 
> From a maintenance perspective we already have to test/etc the kernel
> code for all of this, we don't want to do that twice by duplicating
> this stuff outside the kernel.

OK, that gives me a good understanding of what you're trying to achieve
here.

However, I worry that if you have to run hardware-dependent code in
order to fetch or generate parts of the DT, then we have a chicken-
and-egg problem with no guaranteed solution with the frameworks that
exist today -- although I am not familiar with how DT gets used on
all other arches, so you might have counterexamples I'm not aware of.

At least in ARM-land, the DT is inherently monolithic and static: the
DT is not expected to change once you enter the kernel.

Of course, more or less anything can be done with local patches, but
merging such functionality in a clean way might be a challenge.

> > Otherwise, you that have a few static configurations: you could have one
> > pre-baked DT per hardware platform, and choose the correct one once you
> > have detected the platform.
> 
> We do that too, for instance the PPC kernel we build supports 4
> different circuit boards, each served by a separate DTB, that needs a
> fixup pass.
> 
> I think the biggest DTB describes about 49 devices..
>   
> > > Where the 'dev tree provider' would use the stored bootloader
> > > registers and any other information to return the proper DTB. 
> > 
> > It would need developing a bit, but something like that might be
> > possible -- it should probably be discussed via devicetree-discuss.
> > 
> > If it is doing anything less trivial than picking a pre-baked DT, the
> > rationale would need to be carefully argued.
> 
> I'm not sure there is a great interest in this? What are other folks
> working on production embedded stuff doing? I suppose that will be
> more clear as device tree is rolled out.

For now, the architecturally simplest solution still seems to me to be
to write your own boot shim which customises the DT before booting the 
kernel. As discussed, this can continue to look like a simple ELF image
from the bootloader's point of view -- but I appreciate that it will
involve effort and some duplication of some low-level driver components.

If we could do dynamic DT properly (either fully dynamic, or dynamic
in a more constrained way as in your suggested patch), that would
provided an architected way to solve this problem from within the kernel.

This could certainly be worth raising on devicetree-discuss.

Your problem is unlikely to affect people outside the embedded space
too much, but it doesn't sound likely to be unique.

Cheers
---Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-10 Thread Dave Martin
On Tue, Oct 09, 2012 at 11:37:06AM -0600, Jason Gunthorpe wrote:
> On Mon, Oct 08, 2012 at 11:24:13AM +0100, Dave Martin wrote:
> 
> > Partly this came from some side speculation about whether we could do
> > things like privileged read-only permissions on newer CPUs, for preventing
> > unintended or undesired writes to the kernel's code or read-only data.
> 
> Some other arches page protect the kernel, but that tends to be at
> odds with the desire to use huge pages for the kernel mapping, and
> independent of the load headers..

This wasn't so much about that headers themselves as about fragmentation
of the page permissions which makes it difficult to map everything using
huge pages / sections.  But as you say, there are conflicting concerns
here, and it seems not to be a priority.

Privileged write-protect is nice to have if non-disruptive, but not
essential.

Cheers
---Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-10 Thread Dave Martin
On Tue, Oct 09, 2012 at 11:37:06AM -0600, Jason Gunthorpe wrote:
 On Mon, Oct 08, 2012 at 11:24:13AM +0100, Dave Martin wrote:
 
  Partly this came from some side speculation about whether we could do
  things like privileged read-only permissions on newer CPUs, for preventing
  unintended or undesired writes to the kernel's code or read-only data.
 
 Some other arches page protect the kernel, but that tends to be at
 odds with the desire to use huge pages for the kernel mapping, and
 independent of the load headers..

This wasn't so much about that headers themselves as about fragmentation
of the page permissions which makes it difficult to map everything using
huge pages / sections.  But as you say, there are conflicting concerns
here, and it seems not to be a priority.

Privileged write-protect is nice to have if non-disruptive, but not
essential.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-10 Thread Dave Martin
On Tue, Oct 09, 2012 at 12:25:14PM -0600, Jason Gunthorpe wrote:
 On Mon, Oct 08, 2012 at 11:46:49AM +0100, Dave Martin wrote:
 
   Yes, but we still need rely on complex code like I2C/MTD to create a
   correct DTB, which again puts us back to patching the kernel for that
   functionality.
  
  I'm still confused as to where this complexity is coming from.
  
  If you need to run some complex I2C and MTD code to generate the DT, what
  is that code doing?  If it is probing to get the information, then can you
  avoid putting the info in the DT at all?  Primarily the DT is to describe
  those aspects of the hardware which can't be probed.
 
 At manufacturing the unit is programmed with a small datastructure that
 contains all the unique ID's (MAC addresses, etc), manufacturing
 variations (ie vendor A or B was used for a socket) and other
 ancillary data.

As a side comment, some things such as MAC addresses can be probed and
set from userspace after kernel boot, assuming that you have a way
to fetch the device description blob from userspace.  If it's accessible
via MTD I'm guessing you may have some chance of doing that.

Identifying the board variant is harder to defer though.

 So a fair amount of I2C and MTD stack is required just to fetch this
 structure - a full CFI probe, for instance, is needed in the NOR flash
 case just to locate the structure.
 
 Once read, things like MAC addresses are copied into the DTB, and
 certain sections of the DTB are NOP'd out - we have stanzas for chip
 vendor A and vendor B, the one that was not put on the board is
 replaced with NOP.
 
 Similar to the A/B fix, a further fixup is done based on a runtime
 probe of the programmable devices to learn their current
 configuration/memory map.
 
 It seems desirable to present a complete/correct DTB to the kernel,
 it doesn't seem there are great places to hook in custom discovery
 procedures.
 
 From a maintenance perspective we already have to test/etc the kernel
 code for all of this, we don't want to do that twice by duplicating
 this stuff outside the kernel.

OK, that gives me a good understanding of what you're trying to achieve
here.

However, I worry that if you have to run hardware-dependent code in
order to fetch or generate parts of the DT, then we have a chicken-
and-egg problem with no guaranteed solution with the frameworks that
exist today -- although I am not familiar with how DT gets used on
all other arches, so you might have counterexamples I'm not aware of.

At least in ARM-land, the DT is inherently monolithic and static: the
DT is not expected to change once you enter the kernel.

Of course, more or less anything can be done with local patches, but
merging such functionality in a clean way might be a challenge.

  Otherwise, you that have a few static configurations: you could have one
  pre-baked DT per hardware platform, and choose the correct one once you
  have detected the platform.
 
 We do that too, for instance the PPC kernel we build supports 4
 different circuit boards, each served by a separate DTB, that needs a
 fixup pass.
 
 I think the biggest DTB describes about 49 devices..
   
   Where the 'dev tree provider' would use the stored bootloader
   registers and any other information to return the proper DTB. 
  
  It would need developing a bit, but something like that might be
  possible -- it should probably be discussed via devicetree-discuss.
  
  If it is doing anything less trivial than picking a pre-baked DT, the
  rationale would need to be carefully argued.
 
 I'm not sure there is a great interest in this? What are other folks
 working on production embedded stuff doing? I suppose that will be
 more clear as device tree is rolled out.

For now, the architecturally simplest solution still seems to me to be
to write your own boot shim which customises the DT before booting the 
kernel. As discussed, this can continue to look like a simple ELF image
from the bootloader's point of view -- but I appreciate that it will
involve effort and some duplication of some low-level driver components.

If we could do dynamic DT properly (either fully dynamic, or dynamic
in a more constrained way as in your suggested patch), that would
provided an architected way to solve this problem from within the kernel.

This could certainly be worth raising on devicetree-discuss.

Your problem is unlikely to affect people outside the embedded space
too much, but it doesn't sound likely to be unique.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-09 Thread Jason Gunthorpe
On Mon, Oct 08, 2012 at 11:46:49AM +0100, Dave Martin wrote:

> > Yes, but we still need rely on complex code like I2C/MTD to create a
> > correct DTB, which again puts us back to patching the kernel for that
> > functionality.
> 
> I'm still confused as to where this complexity is coming from.
> 
> If you need to run some complex I2C and MTD code to generate the DT, what
> is that code doing?  If it is probing to get the information, then can you
> avoid putting the info in the DT at all?  Primarily the DT is to describe
> those aspects of the hardware which can't be probed.

At manufacturing the unit is programmed with a small datastructure that
contains all the unique ID's (MAC addresses, etc), manufacturing
variations (ie vendor A or B was used for a socket) and other
ancillary data.

So a fair amount of I2C and MTD stack is required just to fetch this
structure - a full CFI probe, for instance, is needed in the NOR flash
case just to locate the structure.

Once read, things like MAC addresses are copied into the DTB, and
certain sections of the DTB are NOP'd out - we have stanzas for chip
vendor A and vendor B, the one that was not put on the board is
replaced with NOP.

Similar to the A/B fix, a further fixup is done based on a runtime
probe of the programmable devices to learn their current
configuration/memory map.

It seems desirable to present a complete/correct DTB to the kernel,
it doesn't seem there are great places to hook in custom discovery
procedures.

>From a maintenance perspective we already have to test/etc the kernel
code for all of this, we don't want to do that twice by duplicating
this stuff outside the kernel.

> Otherwise, you that have a few static configurations: you could have one
> pre-baked DT per hardware platform, and choose the correct one once you
> have detected the platform.

We do that too, for instance the PPC kernel we build supports 4
different circuit boards, each served by a separate DTB, that needs a
fixup pass.

I think the biggest DTB describes about 49 devices..
  
> > Where the 'dev tree provider' would use the stored bootloader
> > registers and any other information to return the proper DTB. 
> 
> It would need developing a bit, but something like that might be
> possible -- it should probably be discussed via devicetree-discuss.
> 
> If it is doing anything less trivial than picking a pre-baked DT, the
> rationale would need to be carefully argued.

I'm not sure there is a great interest in this? What are other folks
working on production embedded stuff doing? I suppose that will be
more clear as device tree is rolled out.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-09 Thread Jason Gunthorpe
On Mon, Oct 08, 2012 at 11:24:13AM +0100, Dave Martin wrote:

> Partly this came from some side speculation about whether we could do
> things like privileged read-only permissions on newer CPUs, for preventing
> unintended or undesired writes to the kernel's code or read-only data.

Some other arches page protect the kernel, but that tends to be at
odds with the desire to use huge pages for the kernel mapping, and
independent of the load headers..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-09 Thread Jason Gunthorpe
On Mon, Oct 08, 2012 at 11:24:13AM +0100, Dave Martin wrote:

 Partly this came from some side speculation about whether we could do
 things like privileged read-only permissions on newer CPUs, for preventing
 unintended or undesired writes to the kernel's code or read-only data.

Some other arches page protect the kernel, but that tends to be at
odds with the desire to use huge pages for the kernel mapping, and
independent of the load headers..

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-09 Thread Jason Gunthorpe
On Mon, Oct 08, 2012 at 11:46:49AM +0100, Dave Martin wrote:

  Yes, but we still need rely on complex code like I2C/MTD to create a
  correct DTB, which again puts us back to patching the kernel for that
  functionality.
 
 I'm still confused as to where this complexity is coming from.
 
 If you need to run some complex I2C and MTD code to generate the DT, what
 is that code doing?  If it is probing to get the information, then can you
 avoid putting the info in the DT at all?  Primarily the DT is to describe
 those aspects of the hardware which can't be probed.

At manufacturing the unit is programmed with a small datastructure that
contains all the unique ID's (MAC addresses, etc), manufacturing
variations (ie vendor A or B was used for a socket) and other
ancillary data.

So a fair amount of I2C and MTD stack is required just to fetch this
structure - a full CFI probe, for instance, is needed in the NOR flash
case just to locate the structure.

Once read, things like MAC addresses are copied into the DTB, and
certain sections of the DTB are NOP'd out - we have stanzas for chip
vendor A and vendor B, the one that was not put on the board is
replaced with NOP.

Similar to the A/B fix, a further fixup is done based on a runtime
probe of the programmable devices to learn their current
configuration/memory map.

It seems desirable to present a complete/correct DTB to the kernel,
it doesn't seem there are great places to hook in custom discovery
procedures.

From a maintenance perspective we already have to test/etc the kernel
code for all of this, we don't want to do that twice by duplicating
this stuff outside the kernel.

 Otherwise, you that have a few static configurations: you could have one
 pre-baked DT per hardware platform, and choose the correct one once you
 have detected the platform.

We do that too, for instance the PPC kernel we build supports 4
different circuit boards, each served by a separate DTB, that needs a
fixup pass.

I think the biggest DTB describes about 49 devices..
  
  Where the 'dev tree provider' would use the stored bootloader
  registers and any other information to return the proper DTB. 
 
 It would need developing a bit, but something like that might be
 possible -- it should probably be discussed via devicetree-discuss.
 
 If it is doing anything less trivial than picking a pre-baked DT, the
 rationale would need to be carefully argued.

I'm not sure there is a great interest in this? What are other folks
working on production embedded stuff doing? I suppose that will be
more clear as device tree is rolled out.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-08 Thread Dave Martin
On Thu, Oct 04, 2012 at 11:59:07AM -0600, Jason Gunthorpe wrote:
> On Thu, Oct 04, 2012 at 12:36:37PM +0100, Dave Martin wrote:
> 
> > OK, so it is supported, but not for ARM, yet.  I'm not sure that
> > such a patch would be rejected, since building in a DTB is not
> > really that different from building in a command-line.
> 
> Maybe I will be able to look at this in a few months..
> 
> > The other solution to this problem is to distribute the dtb
> > alongside the kernel as a supplementary image (similar to
> > initramfs), with the bootloader doing the bare minimum of
> > modifications to it.
> 
> Yes, but we still need rely on complex code like I2C/MTD to create a
> correct DTB, which again puts us back to patching the kernel for that
> functionality.

I'm still confused as to where this complexity is coming from.

If you need to run some complex I2C and MTD code to generate the DT, what
is that code doing?  If it is probing to get the information, then can you
avoid putting the info in the DT at all?  Primarily the DT is to describe
those aspects of the hardware which can't be probed.

Otherwise, you that have a few static configurations: you could have one
pre-baked DT per hardware platform, and choose the correct one once you
have detected the platform.
 
> > Naturally, if zero modifications are needed then the bootloader does not
> > need libfdt (or equivalent code) at all.
> 
> If only :)
> 
> > One other option open to you would be to provide an ELF image loadable
> > by your bootloader via a simple wrapper.  This would probably be a more
> > robust approach than carrying patches against head.S, since it removes
> > any intimate dependency on the kernel code itself.
> 
> This is what I ment by having a 2nd stage loader, but this example is
> not complete because at this point you also have to patch into the DTB
> the initrd address (passed in r0/r1), and the various MAC addresses
> (getting them is the hard part..).
> 
> The nice thing about the head.S patch is that it lives right after
> _entry, so there actually are no dependencies on the kernel code, it
> executes in the same environment as the example you presented. 

Well, logically there's little difference unless your patch were to be
merged.  An out-of-tree patch against head.S could be more effort to
maintain in some situations, but that code doesn't evolve rapidly.

> Anyhow, exploring a CONFIG_ARM_BUILT_IN_DTB.. I think I'd add
> something like this to head.S:
> 
> #ifdef CONFIG_ARM_BUILT_IN_DTB
>adr r3,bootloader_r0
>stmia   r3,{r0,r1,r2}
>mov r2, #0
>mov r1, #0
>mov r0, #0
> #else
>bl  __vet_atags
> #endif
> 
> And something like this to early_dt_fixup:
> 
> #ifdef CONFIG_ARM_BUILT_IN_DTB
> for_each_dt_provider(dtp) {
> devtree = dtp->get_dtb();
> if (devtree)
>  break;
> }
> #else
> devtree = phys_to_virt(dt_phys);
> #endif
> 
> Where the 'dev tree provider' would use the stored bootloader
> registers and any other information to return the proper DTB. 

It would need developing a bit, but something like that might be
possible -- it should probably be discussed via devicetree-discuss.

If it is doing anything less trivial than picking a pre-baked DT, the
rationale would need to be carefully argued.

Cheers
---Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-08 Thread Dave Martin
On Fri, Oct 05, 2012 at 09:45:00AM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
> > On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
> > > >   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg 
> > > > Align
> > > >   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
> > > > 0x8000
> > > 
> > > Not related directly to your patch, but I wonder why we don't we see
> > > separate r-x and rw- segments?
> > 
> > I think this is because the sections are not aligned when the
> > protections change, and the sections are not sorted by protection
> > type.
> 
> They aren't sorted by protection type, they're ordered according to what's
> required for the kernel - which is to have the init sections together as
> one complete block so that it can be freed, and to place all of the kernel
> text as close to the beginning of the image as possible.

Ah, right.

Partly this came from some side speculation about whether we could do
things like privileged read-only permissions on newer CPUs, for preventing
unintended or undesired writes to the kernel's code or read-only data.

There would be various things to solve in order to make that work, so I
guess there's no great urgency for it right now.

Cheers
---Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-08 Thread Dave Martin
On Fri, Oct 05, 2012 at 09:45:00AM +0100, Russell King - ARM Linux wrote:
 On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
  On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
  Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg 
Align
  LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
0x8000
   
   Not related directly to your patch, but I wonder why we don't we see
   separate r-x and rw- segments?
  
  I think this is because the sections are not aligned when the
  protections change, and the sections are not sorted by protection
  type.
 
 They aren't sorted by protection type, they're ordered according to what's
 required for the kernel - which is to have the init sections together as
 one complete block so that it can be freed, and to place all of the kernel
 text as close to the beginning of the image as possible.

Ah, right.

Partly this came from some side speculation about whether we could do
things like privileged read-only permissions on newer CPUs, for preventing
unintended or undesired writes to the kernel's code or read-only data.

There would be various things to solve in order to make that work, so I
guess there's no great urgency for it right now.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-08 Thread Dave Martin
On Thu, Oct 04, 2012 at 11:59:07AM -0600, Jason Gunthorpe wrote:
 On Thu, Oct 04, 2012 at 12:36:37PM +0100, Dave Martin wrote:
 
  OK, so it is supported, but not for ARM, yet.  I'm not sure that
  such a patch would be rejected, since building in a DTB is not
  really that different from building in a command-line.
 
 Maybe I will be able to look at this in a few months..
 
  The other solution to this problem is to distribute the dtb
  alongside the kernel as a supplementary image (similar to
  initramfs), with the bootloader doing the bare minimum of
  modifications to it.
 
 Yes, but we still need rely on complex code like I2C/MTD to create a
 correct DTB, which again puts us back to patching the kernel for that
 functionality.

I'm still confused as to where this complexity is coming from.

If you need to run some complex I2C and MTD code to generate the DT, what
is that code doing?  If it is probing to get the information, then can you
avoid putting the info in the DT at all?  Primarily the DT is to describe
those aspects of the hardware which can't be probed.

Otherwise, you that have a few static configurations: you could have one
pre-baked DT per hardware platform, and choose the correct one once you
have detected the platform.
 
  Naturally, if zero modifications are needed then the bootloader does not
  need libfdt (or equivalent code) at all.
 
 If only :)
 
  One other option open to you would be to provide an ELF image loadable
  by your bootloader via a simple wrapper.  This would probably be a more
  robust approach than carrying patches against head.S, since it removes
  any intimate dependency on the kernel code itself.
 
 This is what I ment by having a 2nd stage loader, but this example is
 not complete because at this point you also have to patch into the DTB
 the initrd address (passed in r0/r1), and the various MAC addresses
 (getting them is the hard part..).
 
 The nice thing about the head.S patch is that it lives right after
 _entry, so there actually are no dependencies on the kernel code, it
 executes in the same environment as the example you presented. 

Well, logically there's little difference unless your patch were to be
merged.  An out-of-tree patch against head.S could be more effort to
maintain in some situations, but that code doesn't evolve rapidly.

 Anyhow, exploring a CONFIG_ARM_BUILT_IN_DTB.. I think I'd add
 something like this to head.S:
 
 #ifdef CONFIG_ARM_BUILT_IN_DTB
adr r3,bootloader_r0
stmia   r3,{r0,r1,r2}
mov r2, #0
mov r1, #0
mov r0, #0
 #else
bl  __vet_atags
 #endif
 
 And something like this to early_dt_fixup:
 
 #ifdef CONFIG_ARM_BUILT_IN_DTB
 for_each_dt_provider(dtp) {
 devtree = dtp-get_dtb();
 if (devtree)
  break;
 }
 #else
 devtree = phys_to_virt(dt_phys);
 #endif
 
 Where the 'dev tree provider' would use the stored bootloader
 registers and any other information to return the proper DTB. 

It would need developing a bit, but something like that might be
possible -- it should probably be discussed via devicetree-discuss.

If it is doing anything less trivial than picking a pre-baked DT, the
rationale would need to be carefully argued.

Cheers
---Dave

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-05 Thread Russell King - ARM Linux
On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
> On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
> > >   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > >   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
> > > 0x8000
> > 
> > Not related directly to your patch, but I wonder why we don't we see
> > separate r-x and rw- segments?
> 
> I think this is because the sections are not aligned when the
> protections change, and the sections are not sorted by protection
> type.

They aren't sorted by protection type, they're ordered according to what's
required for the kernel - which is to have the init sections together as
one complete block so that it can be freed, and to place all of the kernel
text as close to the beginning of the image as possible.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-05 Thread Russell King - ARM Linux
On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
 On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
 Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
   0x8000
  
  Not related directly to your patch, but I wonder why we don't we see
  separate r-x and rw- segments?
 
 I think this is because the sections are not aligned when the
 protections change, and the sections are not sorted by protection
 type.

They aren't sorted by protection type, they're ordered according to what's
required for the kernel - which is to have the init sections together as
one complete block so that it can be freed, and to place all of the kernel
text as close to the beginning of the image as possible.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-04 Thread Jason Gunthorpe
On Thu, Oct 04, 2012 at 12:36:37PM +0100, Dave Martin wrote:

> OK, so it is supported, but not for ARM, yet.  I'm not sure that
> such a patch would be rejected, since building in a DTB is not
> really that different from building in a command-line.

Maybe I will be able to look at this in a few months..

> The other solution to this problem is to distribute the dtb
> alongside the kernel as a supplementary image (similar to
> initramfs), with the bootloader doing the bare minimum of
> modifications to it.

Yes, but we still need rely on complex code like I2C/MTD to create a
correct DTB, which again puts us back to patching the kernel for that
functionality.
 
> Naturally, if zero modifications are needed then the bootloader does not
> need libfdt (or equivalent code) at all.

If only :)

> One other option open to you would be to provide an ELF image loadable
> by your bootloader via a simple wrapper.  This would probably be a more
> robust approach than carrying patches against head.S, since it removes
> any intimate dependency on the kernel code itself.

This is what I ment by having a 2nd stage loader, but this example is
not complete because at this point you also have to patch into the DTB
the initrd address (passed in r0/r1), and the various MAC addresses
(getting them is the hard part..).

The nice thing about the head.S patch is that it lives right after
_entry, so there actually are no dependencies on the kernel code, it
executes in the same environment as the example you presented. 

Anyhow, exploring a CONFIG_ARM_BUILT_IN_DTB.. I think I'd add
something like this to head.S:

#ifdef CONFIG_ARM_BUILT_IN_DTB
   adr r3,bootloader_r0
   stmia   r3,{r0,r1,r2}
   mov r2, #0
   mov r1, #0
   mov r0, #0
#else
   bl  __vet_atags
#endif

And something like this to early_dt_fixup:

#ifdef CONFIG_ARM_BUILT_IN_DTB
for_each_dt_provider(dtp) {
  devtree = dtp->get_dtb();
  if (devtree)
   break;
}
#else
devtree = phys_to_virt(dt_phys);
#endif

Where the 'dev tree provider' would use the stored bootloader
registers and any other information to return the proper DTB. 

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-04 Thread Dave Martin
On Wed, Oct 03, 2012 at 12:44:38PM -0600, Jason Gunthorpe wrote:
> On Wed, Oct 03, 2012 at 11:43:35AM +0100, Dave Martin wrote:
> 
> > I'm not sure exactly what you mean by linking the DTB into vmlinux.
> > I don't think this is supported upstream, at least for ARM.  It could
> > be done externally by post-processing vmlinux to add extra sections
> > and some boot shim code which initialises the registers appropriately
> > for kernel entry ... but you're really inventing a bootloader if you
> > start to do that.
> 
> We use the existing build infrastructure (see cmd_dt_S_dtb) and add a
> 4 line patch to head.S to put the dtb address into the right
> register. I think it would be great to have a CONFIG_ARM_BUILT_IN_DTB
> for applications like mine, but it is really no problem to carry the
> head.S patch. The normal build process produces a vmlinux that is self
> contained, no post processing required.

OK, so it is supported, but not for ARM, yet.  I'm not sure that such
a patch would be rejected, since building in a DTB is not really that
different from building in a command-line.

The post-processing that would otherwise be needed is likely to be
pretty trivial, though (see example below).

> > Making a simple bootloader DT-aware is not actually that hard: libfdt
> > is intentionally pretty straightforward to integrate.
> 
> Sure, but we have several major issues with that entire idea:
>  1) Our bootloader (on PPC and ARM) is in redundant NOR flash and
> is limited to 16K. That is just enough to do gzip, elf, cramfs
> and tftpv6 - there is no space left for dtb or fdt code.
>  2) Experience using DT with PPC has shown that the vmlinux is very
> sensitive to the DTB. Every DTB/vmlinux combination must be tested
> to ensure it works. By far the best way to manage this *for us* is
> to guarentee that the vmlinux sees only the DTB it was designed
> for by including it in vmlinux.
>  3) The DTB *changes*. We change it because we've changed things in the
> programmable part of our environment and we change it because the
> kernel has shifted. Planning for this inevitable change is
> necessary.
>  4) On our ARM platforms updating the boot loader in the field is more
> challenging than updating the vmlinux, not impossible, but not
> something we want to do every day. Our PPC boot loader has
> survived without change since 2007, and has booted kernels from
> 2.6.12 to 3.6 with and without DT. This is a time tested,
> field proven approach.
> 
> Fundamentally we treat the DTB like the .config - an integral,
> internal, part of vmlinux.
> 
> As far as I can see, the *only* argument to put the DT in the boot
> loader is to support standard, distribution kernels.

DT-aware does not have to mean that the DTB is part of the bootloader.
Generally, that's considered to be a bad idea for the reasons you
describe (among others).

The other solution to this problem is to distribute the dtb alongside the
kernel as a supplementary image (similar to initramfs), with the
bootloader doing the bare minimum of modifications to it.

Naturally, if zero modifications are needed then the bootloader does not
need libfdt (or equivalent code) at all.

> > It's more than 100 lines, but not _that_ much more, and contains
> > some implementation features you probably don't need.
> 
> But that is not enough, variations of our platforms use bit bang I2C,
> SPI and NOR FLASH to store things like the MAC address and other
> DT-relevant information. So, swaths of that code would need to go into
> the boot loader/zImage as well.
> 
> At that point we may as well take our chances with uBoot. :)
> 
> > From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
> > closer to some other common arches means that this is low risk and should
> > be mostly harmless, but I will take another look at the patch first.
> 
> This matches my opinion..
> 
> > > > You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
> > > > compressed the kernel twice?
> > > 
> > > The compression can either be intrisic to cramfs or a raw elf that has
> > > been gzip'd.
> > 
> > If you can have files with intrinsic compression such that they don't
> > get double-compressed, why can you not do this for zImage?
> > 
> > Am I still misunderstanding something?
> 
> We have two boot modes, one boots ELF.gz (for development), the other
> boots ELF in CRAMFS (for deployment). Both are compressed, one is
> compressed using gzip and one is compressed using zlib in CRAMFS. It
> is hard to see how to store a zImage in CRAMFS and avoid double
> de-compression. We never store an ELF.gz in CRAMFS.

OK, now I understand.
 
> Since the entire point of zImage is to introduce compression it is
> hard to see *why* we'd want to store a zImage in a CRAMFS.

Because zImage has grown one or two extra features (which in fact have
nothing to do with compression).  But I can see why you're not keen
on the idea.  In 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-04 Thread Dave Martin
On Wed, Oct 03, 2012 at 12:44:38PM -0600, Jason Gunthorpe wrote:
 On Wed, Oct 03, 2012 at 11:43:35AM +0100, Dave Martin wrote:
 
  I'm not sure exactly what you mean by linking the DTB into vmlinux.
  I don't think this is supported upstream, at least for ARM.  It could
  be done externally by post-processing vmlinux to add extra sections
  and some boot shim code which initialises the registers appropriately
  for kernel entry ... but you're really inventing a bootloader if you
  start to do that.
 
 We use the existing build infrastructure (see cmd_dt_S_dtb) and add a
 4 line patch to head.S to put the dtb address into the right
 register. I think it would be great to have a CONFIG_ARM_BUILT_IN_DTB
 for applications like mine, but it is really no problem to carry the
 head.S patch. The normal build process produces a vmlinux that is self
 contained, no post processing required.

OK, so it is supported, but not for ARM, yet.  I'm not sure that such
a patch would be rejected, since building in a DTB is not really that
different from building in a command-line.

The post-processing that would otherwise be needed is likely to be
pretty trivial, though (see example below).

  Making a simple bootloader DT-aware is not actually that hard: libfdt
  is intentionally pretty straightforward to integrate.
 
 Sure, but we have several major issues with that entire idea:
  1) Our bootloader (on PPC and ARM) is in redundant NOR flash and
 is limited to 16K. That is just enough to do gzip, elf, cramfs
 and tftpv6 - there is no space left for dtb or fdt code.
  2) Experience using DT with PPC has shown that the vmlinux is very
 sensitive to the DTB. Every DTB/vmlinux combination must be tested
 to ensure it works. By far the best way to manage this *for us* is
 to guarentee that the vmlinux sees only the DTB it was designed
 for by including it in vmlinux.
  3) The DTB *changes*. We change it because we've changed things in the
 programmable part of our environment and we change it because the
 kernel has shifted. Planning for this inevitable change is
 necessary.
  4) On our ARM platforms updating the boot loader in the field is more
 challenging than updating the vmlinux, not impossible, but not
 something we want to do every day. Our PPC boot loader has
 survived without change since 2007, and has booted kernels from
 2.6.12 to 3.6 with and without DT. This is a time tested,
 field proven approach.
 
 Fundamentally we treat the DTB like the .config - an integral,
 internal, part of vmlinux.
 
 As far as I can see, the *only* argument to put the DT in the boot
 loader is to support standard, distribution kernels.

DT-aware does not have to mean that the DTB is part of the bootloader.
Generally, that's considered to be a bad idea for the reasons you
describe (among others).

The other solution to this problem is to distribute the dtb alongside the
kernel as a supplementary image (similar to initramfs), with the
bootloader doing the bare minimum of modifications to it.

Naturally, if zero modifications are needed then the bootloader does not
need libfdt (or equivalent code) at all.

  It's more than 100 lines, but not _that_ much more, and contains
  some implementation features you probably don't need.
 
 But that is not enough, variations of our platforms use bit bang I2C,
 SPI and NOR FLASH to store things like the MAC address and other
 DT-relevant information. So, swaths of that code would need to go into
 the boot loader/zImage as well.
 
 At that point we may as well take our chances with uBoot. :)
 
  From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
  closer to some other common arches means that this is low risk and should
  be mostly harmless, but I will take another look at the patch first.
 
 This matches my opinion..
 
You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
compressed the kernel twice?
   
   The compression can either be intrisic to cramfs or a raw elf that has
   been gzip'd.
  
  If you can have files with intrinsic compression such that they don't
  get double-compressed, why can you not do this for zImage?
  
  Am I still misunderstanding something?
 
 We have two boot modes, one boots ELF.gz (for development), the other
 boots ELF in CRAMFS (for deployment). Both are compressed, one is
 compressed using gzip and one is compressed using zlib in CRAMFS. It
 is hard to see how to store a zImage in CRAMFS and avoid double
 de-compression. We never store an ELF.gz in CRAMFS.

OK, now I understand.
 
 Since the entire point of zImage is to introduce compression it is
 hard to see *why* we'd want to store a zImage in a CRAMFS.

Because zImage has grown one or two extra features (which in fact have
nothing to do with compression).  But I can see why you're not keen
on the idea.  In principle we could have an uncompressed zImage, but
for now this doesn't exist.


One other option open to you 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-04 Thread Jason Gunthorpe
On Thu, Oct 04, 2012 at 12:36:37PM +0100, Dave Martin wrote:

 OK, so it is supported, but not for ARM, yet.  I'm not sure that
 such a patch would be rejected, since building in a DTB is not
 really that different from building in a command-line.

Maybe I will be able to look at this in a few months..

 The other solution to this problem is to distribute the dtb
 alongside the kernel as a supplementary image (similar to
 initramfs), with the bootloader doing the bare minimum of
 modifications to it.

Yes, but we still need rely on complex code like I2C/MTD to create a
correct DTB, which again puts us back to patching the kernel for that
functionality.
 
 Naturally, if zero modifications are needed then the bootloader does not
 need libfdt (or equivalent code) at all.

If only :)

 One other option open to you would be to provide an ELF image loadable
 by your bootloader via a simple wrapper.  This would probably be a more
 robust approach than carrying patches against head.S, since it removes
 any intimate dependency on the kernel code itself.

This is what I ment by having a 2nd stage loader, but this example is
not complete because at this point you also have to patch into the DTB
the initrd address (passed in r0/r1), and the various MAC addresses
(getting them is the hard part..).

The nice thing about the head.S patch is that it lives right after
_entry, so there actually are no dependencies on the kernel code, it
executes in the same environment as the example you presented. 

Anyhow, exploring a CONFIG_ARM_BUILT_IN_DTB.. I think I'd add
something like this to head.S:

#ifdef CONFIG_ARM_BUILT_IN_DTB
   adr r3,bootloader_r0
   stmia   r3,{r0,r1,r2}
   mov r2, #0
   mov r1, #0
   mov r0, #0
#else
   bl  __vet_atags
#endif

And something like this to early_dt_fixup:

#ifdef CONFIG_ARM_BUILT_IN_DTB
for_each_dt_provider(dtp) {
  devtree = dtp-get_dtb();
  if (devtree)
   break;
}
#else
devtree = phys_to_virt(dt_phys);
#endif

Where the 'dev tree provider' would use the stored bootloader
registers and any other information to return the proper DTB. 

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-03 Thread Jason Gunthorpe
On Wed, Oct 03, 2012 at 11:43:35AM +0100, Dave Martin wrote:

> I'm not sure exactly what you mean by linking the DTB into vmlinux.
> I don't think this is supported upstream, at least for ARM.  It could
> be done externally by post-processing vmlinux to add extra sections
> and some boot shim code which initialises the registers appropriately
> for kernel entry ... but you're really inventing a bootloader if you
> start to do that.

We use the existing build infrastructure (see cmd_dt_S_dtb) and add a
4 line patch to head.S to put the dtb address into the right
register. I think it would be great to have a CONFIG_ARM_BUILT_IN_DTB
for applications like mine, but it is really no problem to carry the
head.S patch. The normal build process produces a vmlinux that is self
contained, no post processing required.

> Making a simple bootloader DT-aware is not actually that hard: libfdt
> is intentionally pretty straightforward to integrate.

Sure, but we have several major issues with that entire idea:
 1) Our bootloader (on PPC and ARM) is in redundant NOR flash and
is limited to 16K. That is just enough to do gzip, elf, cramfs
and tftpv6 - there is no space left for dtb or fdt code.
 2) Experience using DT with PPC has shown that the vmlinux is very
sensitive to the DTB. Every DTB/vmlinux combination must be tested
to ensure it works. By far the best way to manage this *for us* is
to guarentee that the vmlinux sees only the DTB it was designed
for by including it in vmlinux.
 3) The DTB *changes*. We change it because we've changed things in the
programmable part of our environment and we change it because the
kernel has shifted. Planning for this inevitable change is
necessary.
 4) On our ARM platforms updating the boot loader in the field is more
challenging than updating the vmlinux, not impossible, but not
something we want to do every day. Our PPC boot loader has
survived without change since 2007, and has booted kernels from
2.6.12 to 3.6 with and without DT. This is a time tested,
field proven approach.

Fundamentally we treat the DTB like the .config - an integral,
internal, part of vmlinux.

As far as I can see, the *only* argument to put the DT in the boot
loader is to support standard, distribution kernels.

> It's more than 100 lines, but not _that_ much more, and contains
> some implementation features you probably don't need.

But that is not enough, variations of our platforms use bit bang I2C,
SPI and NOR FLASH to store things like the MAC address and other
DT-relevant information. So, swaths of that code would need to go into
the boot loader/zImage as well.

At that point we may as well take our chances with uBoot. :)

> From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
> closer to some other common arches means that this is low risk and should
> be mostly harmless, but I will take another look at the patch first.

This matches my opinion..

> > > You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
> > > compressed the kernel twice?
> > 
> > The compression can either be intrisic to cramfs or a raw elf that has
> > been gzip'd.
> 
> If you can have files with intrinsic compression such that they don't
> get double-compressed, why can you not do this for zImage?
> 
> Am I still misunderstanding something?

We have two boot modes, one boots ELF.gz (for development), the other
boots ELF in CRAMFS (for deployment). Both are compressed, one is
compressed using gzip and one is compressed using zlib in CRAMFS. It
is hard to see how to store a zImage in CRAMFS and avoid double
de-compression. We never store an ELF.gz in CRAMFS.

Since the entire point of zImage is to introduce compression it is
hard to see *why* we'd want to store a zImage in a CRAMFS.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-03 Thread Dave Martin
[Nicolas, do you have a view on this thread with regard to XIP?]

Hi,

On Tue, Oct 02, 2012 at 11:47:59AM -0600, Jason Gunthorpe wrote:
> On Tue, Oct 02, 2012 at 11:23:46AM +0100, Dave Martin wrote:
> 
> > > Well, no, it boots ELFs, so it can boot anything, with any memory
> > > layout. A 2nd stage loader would be required to boot standard kernels,
> > > that loader would be an ELF with 1 section for the 2nd stage, 1
> > > section for the zImage and 1 section for the initrd, with proper load
> > > headers.
> > 
> > Don't you already have to treat Linux as a special case though?  How
> > do you know where to load ATAGs, DT and/or initramfs, and how to
> > initalise the registers?  None of that is part of any ELF specification,
> > and would be inappropriate if you boot any non-linux images.
> 
> Experience with our PPC systems has been that linking the DTB into
> vmlinux is the way to go - so we have a trivally small,
> non-upstreamble change (for PPC and ARM) to put the DTB(s) in vmlinux
> and capture the CPU registers values (entry call function arguments)
> that are set from the bootloader. 
> 
> Some DTB fixup code (supported by the OF kernel routines) in vmlinux
> edits the DTB chosen node to include the information from the
> bootloader.
> 
> From there any other DTB fixups (eg fetching a MAC address from I2C)
> plus other register initializations are done in Linux (typically by
> the upstream drivers, though not 100%), with the MMU on, with full
> kernel services. This is a much easier development environment :)
> 
> All in all it is basically about 100 lines of code to do what I've
> described in vmlinux. This is dramatically less code than would be
> required if we tried to conform to the standard vmlinux boot protocol.

[1]

I'm not sure exactly what you mean by linking the DTB into vmlinux.
I don't think this is supported upstream, at least for ARM.  It could
be done externally by post-processing vmlinux to add extra sections
and some boot shim code which initialises the registers appropriately
for kernel entry ... but you're really inventing a bootloader if you
start to do that.

For ARM, I believe that bootloaders that do not pass the dtb properly are
considered considered legacy, or broken (in the case of platforms which
are DT-based).  We don't really attempt to support this.

ARM_APPENDED_DTB is the workaround for that, but that is only available
as part of the zImage loader.  This is meant to ease migration, not as
a permanent solution... but it will probably be available for quite
a while yet.


Making a simple bootloader DT-aware is not actually that hard: libfdt
is intentionally pretty straightforward to integrate.

For an example of how this might look in a simple scenario, you could
take a look at the zImage loader implementation, or

git://git.linaro.org/arm/models/boot-wrapper.git

(Particularly update_fdt() and load_kernel() in semi_loader.c)

It's more than 100 lines, but not _that_ much more, and contains
some implementation features you probably don't need.

There may still be an argument against this if your bootloader is
exteremely space-constrained.

> > As you observe, GNU ld behaviour in this area tends to be rather patchily
> > specified, buggy or both.  That does argue in favour of reusing the
> > same techniques already used for other arches, though.
> 
> It is good to know PHDRS seems to be working better now, maybe things
> can migrate someday!

I guess we shouldn't rush to do that unnecessarily.  Old tools hang
around for a long, long time...

> > A question does occur to me: do your changes work with XIP_KERNEL?
> > I'm not very familiar with XIP_KERNEL myself, so I'm currently not
> > clear on whether there's an impact here.
> 
> I looked at the XIP defines and they didn't seem to conflict. Since
> this change only effects the values in the LOAD headers, not the
> actual image I doubt it has any impact.
> 
> > Beyond this, I think the approach doesn't look unreasonable.
> 
> Great, should I do anything further to get this patch into mainline.

I think that someone more familiar with XIP should at least comment on
it.

Since you admit [1] that you are deliberately not following the boot
protocol proper, you may get some pushback about whether the change
is justified.

>From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
closer to some other common arches means that this is low risk and should
be mostly harmless, but I will take another look at the patch first.

> > You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
> > compressed the kernel twice?
> 
> The compression can either be intrisic to cramfs or a raw elf that has
> been gzip'd.

If you can have files with intrinsic compression such that they don't
get double-compressed, why can you not do this for zImage?

Am I still misunderstanding something?

Cheers
---Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-03 Thread Dave Martin
[Nicolas, do you have a view on this thread with regard to XIP?]

Hi,

On Tue, Oct 02, 2012 at 11:47:59AM -0600, Jason Gunthorpe wrote:
 On Tue, Oct 02, 2012 at 11:23:46AM +0100, Dave Martin wrote:
 
   Well, no, it boots ELFs, so it can boot anything, with any memory
   layout. A 2nd stage loader would be required to boot standard kernels,
   that loader would be an ELF with 1 section for the 2nd stage, 1
   section for the zImage and 1 section for the initrd, with proper load
   headers.
  
  Don't you already have to treat Linux as a special case though?  How
  do you know where to load ATAGs, DT and/or initramfs, and how to
  initalise the registers?  None of that is part of any ELF specification,
  and would be inappropriate if you boot any non-linux images.
 
 Experience with our PPC systems has been that linking the DTB into
 vmlinux is the way to go - so we have a trivally small,
 non-upstreamble change (for PPC and ARM) to put the DTB(s) in vmlinux
 and capture the CPU registers values (entry call function arguments)
 that are set from the bootloader. 
 
 Some DTB fixup code (supported by the OF kernel routines) in vmlinux
 edits the DTB chosen node to include the information from the
 bootloader.
 
 From there any other DTB fixups (eg fetching a MAC address from I2C)
 plus other register initializations are done in Linux (typically by
 the upstream drivers, though not 100%), with the MMU on, with full
 kernel services. This is a much easier development environment :)
 
 All in all it is basically about 100 lines of code to do what I've
 described in vmlinux. This is dramatically less code than would be
 required if we tried to conform to the standard vmlinux boot protocol.

[1]

I'm not sure exactly what you mean by linking the DTB into vmlinux.
I don't think this is supported upstream, at least for ARM.  It could
be done externally by post-processing vmlinux to add extra sections
and some boot shim code which initialises the registers appropriately
for kernel entry ... but you're really inventing a bootloader if you
start to do that.

For ARM, I believe that bootloaders that do not pass the dtb properly are
considered considered legacy, or broken (in the case of platforms which
are DT-based).  We don't really attempt to support this.

ARM_APPENDED_DTB is the workaround for that, but that is only available
as part of the zImage loader.  This is meant to ease migration, not as
a permanent solution... but it will probably be available for quite
a while yet.


Making a simple bootloader DT-aware is not actually that hard: libfdt
is intentionally pretty straightforward to integrate.

For an example of how this might look in a simple scenario, you could
take a look at the zImage loader implementation, or

git://git.linaro.org/arm/models/boot-wrapper.git

(Particularly update_fdt() and load_kernel() in semi_loader.c)

It's more than 100 lines, but not _that_ much more, and contains
some implementation features you probably don't need.

There may still be an argument against this if your bootloader is
exteremely space-constrained.

  As you observe, GNU ld behaviour in this area tends to be rather patchily
  specified, buggy or both.  That does argue in favour of reusing the
  same techniques already used for other arches, though.
 
 It is good to know PHDRS seems to be working better now, maybe things
 can migrate someday!

I guess we shouldn't rush to do that unnecessarily.  Old tools hang
around for a long, long time...

  A question does occur to me: do your changes work with XIP_KERNEL?
  I'm not very familiar with XIP_KERNEL myself, so I'm currently not
  clear on whether there's an impact here.
 
 I looked at the XIP defines and they didn't seem to conflict. Since
 this change only effects the values in the LOAD headers, not the
 actual image I doubt it has any impact.
 
  Beyond this, I think the approach doesn't look unreasonable.
 
 Great, should I do anything further to get this patch into mainline.

I think that someone more familiar with XIP should at least comment on
it.

Since you admit [1] that you are deliberately not following the boot
protocol proper, you may get some pushback about whether the change
is justified.

From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
closer to some other common arches means that this is low risk and should
be mostly harmless, but I will take another look at the patch first.

  You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
  compressed the kernel twice?
 
 The compression can either be intrisic to cramfs or a raw elf that has
 been gzip'd.

If you can have files with intrinsic compression such that they don't
get double-compressed, why can you not do this for zImage?

Am I still misunderstanding something?

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-03 Thread Jason Gunthorpe
On Wed, Oct 03, 2012 at 11:43:35AM +0100, Dave Martin wrote:

 I'm not sure exactly what you mean by linking the DTB into vmlinux.
 I don't think this is supported upstream, at least for ARM.  It could
 be done externally by post-processing vmlinux to add extra sections
 and some boot shim code which initialises the registers appropriately
 for kernel entry ... but you're really inventing a bootloader if you
 start to do that.

We use the existing build infrastructure (see cmd_dt_S_dtb) and add a
4 line patch to head.S to put the dtb address into the right
register. I think it would be great to have a CONFIG_ARM_BUILT_IN_DTB
for applications like mine, but it is really no problem to carry the
head.S patch. The normal build process produces a vmlinux that is self
contained, no post processing required.

 Making a simple bootloader DT-aware is not actually that hard: libfdt
 is intentionally pretty straightforward to integrate.

Sure, but we have several major issues with that entire idea:
 1) Our bootloader (on PPC and ARM) is in redundant NOR flash and
is limited to 16K. That is just enough to do gzip, elf, cramfs
and tftpv6 - there is no space left for dtb or fdt code.
 2) Experience using DT with PPC has shown that the vmlinux is very
sensitive to the DTB. Every DTB/vmlinux combination must be tested
to ensure it works. By far the best way to manage this *for us* is
to guarentee that the vmlinux sees only the DTB it was designed
for by including it in vmlinux.
 3) The DTB *changes*. We change it because we've changed things in the
programmable part of our environment and we change it because the
kernel has shifted. Planning for this inevitable change is
necessary.
 4) On our ARM platforms updating the boot loader in the field is more
challenging than updating the vmlinux, not impossible, but not
something we want to do every day. Our PPC boot loader has
survived without change since 2007, and has booted kernels from
2.6.12 to 3.6 with and without DT. This is a time tested,
field proven approach.

Fundamentally we treat the DTB like the .config - an integral,
internal, part of vmlinux.

As far as I can see, the *only* argument to put the DT in the boot
loader is to support standard, distribution kernels.

 It's more than 100 lines, but not _that_ much more, and contains
 some implementation features you probably don't need.

But that is not enough, variations of our platforms use bit bang I2C,
SPI and NOR FLASH to store things like the MAC address and other
DT-relevant information. So, swaths of that code would need to go into
the boot loader/zImage as well.

At that point we may as well take our chances with uBoot. :)

 From my side, I think that the fact this takes arch/arm/vmlinux.lds.S
 closer to some other common arches means that this is low risk and should
 be mostly harmless, but I will take another look at the patch first.

This matches my opinion..

   You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
   compressed the kernel twice?
  
  The compression can either be intrisic to cramfs or a raw elf that has
  been gzip'd.
 
 If you can have files with intrinsic compression such that they don't
 get double-compressed, why can you not do this for zImage?
 
 Am I still misunderstanding something?

We have two boot modes, one boots ELF.gz (for development), the other
boots ELF in CRAMFS (for deployment). Both are compressed, one is
compressed using gzip and one is compressed using zlib in CRAMFS. It
is hard to see how to store a zImage in CRAMFS and avoid double
de-compression. We never store an ELF.gz in CRAMFS.

Since the entire point of zImage is to introduce compression it is
hard to see *why* we'd want to store a zImage in a CRAMFS.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-02 Thread Jason Gunthorpe
On Tue, Oct 02, 2012 at 11:23:46AM +0100, Dave Martin wrote:

> > Well, no, it boots ELFs, so it can boot anything, with any memory
> > layout. A 2nd stage loader would be required to boot standard kernels,
> > that loader would be an ELF with 1 section for the 2nd stage, 1
> > section for the zImage and 1 section for the initrd, with proper load
> > headers.
> 
> Don't you already have to treat Linux as a special case though?  How
> do you know where to load ATAGs, DT and/or initramfs, and how to
> initalise the registers?  None of that is part of any ELF specification,
> and would be inappropriate if you boot any non-linux images.

Experience with our PPC systems has been that linking the DTB into
vmlinux is the way to go - so we have a trivally small,
non-upstreamble change (for PPC and ARM) to put the DTB(s) in vmlinux
and capture the CPU registers values (entry call function arguments)
that are set from the bootloader. 

Some DTB fixup code (supported by the OF kernel routines) in vmlinux
edits the DTB chosen node to include the information from the
bootloader.

>From there any other DTB fixups (eg fetching a MAC address from I2C)
plus other register initializations are done in Linux (typically by
the upstream drivers, though not 100%), with the MMU on, with full
kernel services. This is a much easier development environment :)

All in all it is basically about 100 lines of code to do what I've
described in vmlinux. This is dramatically less code than would be
required if we tried to conform to the standard vmlinux boot protocol.

> As you observe, GNU ld behaviour in this area tends to be rather patchily
> specified, buggy or both.  That does argue in favour of reusing the
> same techniques already used for other arches, though.

It is good to know PHDRS seems to be working better now, maybe things
can migrate someday!

> A question does occur to me: do your changes work with XIP_KERNEL?
> I'm not very familiar with XIP_KERNEL myself, so I'm currently not
> clear on whether there's an impact here.

I looked at the XIP defines and they didn't seem to conflict. Since
this change only effects the values in the LOAD headers, not the
actual image I doubt it has any impact.

> Beyond this, I think the approach doesn't look unreasonable.

Great, should I do anything further to get this patch into mainline.
 
> You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
> compressed the kernel twice?

The compression can either be intrisic to cramfs or a raw elf that has
been gzip'd.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-02 Thread Dave Martin
On Mon, Oct 01, 2012 at 12:35:43PM -0600, Jason Gunthorpe wrote:
> On Mon, Oct 01, 2012 at 06:56:47PM +0100, Dave Martin wrote:
> 
> > > > If the kernel is intended to be loadable at a physical address which is
> > > > not statically known, no ELF loader that does not ignore the ELF
> > > > phdr
> > > 
> > > In this case you can't really use a standard ELF loader to load the
> > > kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
> > > and I have set options for a static physical load address.
> > 
> > Generally, people should try to be compatible with the single kernel
> > image effort unless there's a really compelling reason not to.
> 
> Well, all the embedded kernels we use are always custom built and
> minimized for the target. So as long as there are options to minimize
> the kernel size/increase performance by taking out the relocation
> stuff, we are going to use them.
> 
> All our boards on PPC and ARM use DT kernels now, and we try to flow
> back all the generic stuff as best we can. As someone that makes
> custom boards, I really like DT, it makes things much easier :)
> 
> > Wouldn't your firmware be incapable of loading a multiplatform kernel?
> 
> Well, no, it boots ELFs, so it can boot anything, with any memory
> layout. A 2nd stage loader would be required to boot standard kernels,
> that loader would be an ELF with 1 section for the 2nd stage, 1
> section for the zImage and 1 section for the initrd, with proper load
> headers.

Don't you already have to treat Linux as a special case though?  How
do you know where to load ATAGs, DT and/or initramfs, and how to
initalise the registers?  None of that is part of any ELF specification,
and would be inappropriate if you boot any non-linux images.

> Creating such a system is a lot of annoyance, so we never have - it is
> *so much* easier to just boot vmlinux ELF directly.
> 
> > Well, that was a bit of a pedantic point I admit, but there are
> > conflicting definitions of what "virtual address" really means in these
> > situations.  The original SYSV ABI spec explicitly specifies that
> > e_entry is a virtual address, but is also rather vague about how the
> > paddr fields should be interpreted.
> 
> Granted the spec is vauge, but convention for loaders seems to be that
> it is a physical address these days.
> 
> It could be a virtual address, and the loader could translate it by
> looking at the phdrs, but I don't see any other arches doing that?

You would just give .head.text a virtual address matching its load
address.  But as you say, no other arches bother with this, and it's
not obviously worthwhile.

> > All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
> > at least consistent with other architectures then it may not such a
> > disaster.  It's not universal though: less than 50% of the arches in
> > the kernel currently seem to use this.
> 
> I agree it is not nice, but I once did try to make PHDRS work as you
> described, but was never successful. IIRC there were serious linker
> bugs) As you note the AT method is consistent with other arches, and
> the generic vmlinux.lds.h

A quick experiment shows that 

PHDRS {
kernel PT_LOAD AT(PHYS_OFFSET + LOAD_OFFSET);
}

/* ... */

SECTIONS {
.head.text {
/* ... */
} :kernel

/* ... */
}

can produce a sensible-looking vmlinux at least with my version of the
tools.

As you observe, GNU ld behaviour in this area tends to be rather patchily
specified, buggy or both.  That does argue in favour of reusing the
same techniques already used for other arches, though.


A question does occur to me: do your changes work with XIP_KERNEL?
I'm not very familiar with XIP_KERNEL myself, so I'm currently not
clear on whether there's an impact here.

Beyond this, I think the approach doesn't look unreasonable.

> > > Yes, if you use more loader stages then the load headers are ignored.
> > > Our boot loaders on our boards boot straight ELF vmlinux.gz so they
> > > need correct load headers.
>  
> > If your image is compressed anyway though, why are you not using zImage?
> 
> We store the kernel in a CRAMFS, the loader pulls it out and
> decompresses it, processes the ELF sections 'on the fly' and jumps to
> it. Using zImage would result in double-decompression, and reallly has
> no benefits to us.

You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
compressed the kernel twice?


Cheers
---Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-02 Thread Dave Martin
On Mon, Oct 01, 2012 at 12:35:43PM -0600, Jason Gunthorpe wrote:
 On Mon, Oct 01, 2012 at 06:56:47PM +0100, Dave Martin wrote:
 
If the kernel is intended to be loadable at a physical address which is
not statically known, no ELF loader that does not ignore the ELF
phdr
   
   In this case you can't really use a standard ELF loader to load the
   kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
   and I have set options for a static physical load address.
  
  Generally, people should try to be compatible with the single kernel
  image effort unless there's a really compelling reason not to.
 
 Well, all the embedded kernels we use are always custom built and
 minimized for the target. So as long as there are options to minimize
 the kernel size/increase performance by taking out the relocation
 stuff, we are going to use them.
 
 All our boards on PPC and ARM use DT kernels now, and we try to flow
 back all the generic stuff as best we can. As someone that makes
 custom boards, I really like DT, it makes things much easier :)
 
  Wouldn't your firmware be incapable of loading a multiplatform kernel?
 
 Well, no, it boots ELFs, so it can boot anything, with any memory
 layout. A 2nd stage loader would be required to boot standard kernels,
 that loader would be an ELF with 1 section for the 2nd stage, 1
 section for the zImage and 1 section for the initrd, with proper load
 headers.

Don't you already have to treat Linux as a special case though?  How
do you know where to load ATAGs, DT and/or initramfs, and how to
initalise the registers?  None of that is part of any ELF specification,
and would be inappropriate if you boot any non-linux images.

 Creating such a system is a lot of annoyance, so we never have - it is
 *so much* easier to just boot vmlinux ELF directly.
 
  Well, that was a bit of a pedantic point I admit, but there are
  conflicting definitions of what virtual address really means in these
  situations.  The original SYSV ABI spec explicitly specifies that
  e_entry is a virtual address, but is also rather vague about how the
  paddr fields should be interpreted.
 
 Granted the spec is vauge, but convention for loaders seems to be that
 it is a physical address these days.
 
 It could be a virtual address, and the loader could translate it by
 looking at the phdrs, but I don't see any other arches doing that?

You would just give .head.text a virtual address matching its load
address.  But as you say, no other arches bother with this, and it's
not obviously worthwhile.

  All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
  at least consistent with other architectures then it may not such a
  disaster.  It's not universal though: less than 50% of the arches in
  the kernel currently seem to use this.
 
 I agree it is not nice, but I once did try to make PHDRS work as you
 described, but was never successful. IIRC there were serious linker
 bugs) As you note the AT method is consistent with other arches, and
 the generic vmlinux.lds.h

A quick experiment shows that 

PHDRS {
kernel PT_LOAD AT(PHYS_OFFSET + LOAD_OFFSET);
}

/* ... */

SECTIONS {
.head.text {
/* ... */
} :kernel

/* ... */
}

can produce a sensible-looking vmlinux at least with my version of the
tools.

As you observe, GNU ld behaviour in this area tends to be rather patchily
specified, buggy or both.  That does argue in favour of reusing the
same techniques already used for other arches, though.


A question does occur to me: do your changes work with XIP_KERNEL?
I'm not very familiar with XIP_KERNEL myself, so I'm currently not
clear on whether there's an impact here.

Beyond this, I think the approach doesn't look unreasonable.

   Yes, if you use more loader stages then the load headers are ignored.
   Our boot loaders on our boards boot straight ELF vmlinux.gz so they
   need correct load headers.
  
  If your image is compressed anyway though, why are you not using zImage?
 
 We store the kernel in a CRAMFS, the loader pulls it out and
 decompresses it, processes the ELF sections 'on the fly' and jumps to
 it. Using zImage would result in double-decompression, and reallly has
 no benefits to us.

You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
compressed the kernel twice?


Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-02 Thread Jason Gunthorpe
On Tue, Oct 02, 2012 at 11:23:46AM +0100, Dave Martin wrote:

  Well, no, it boots ELFs, so it can boot anything, with any memory
  layout. A 2nd stage loader would be required to boot standard kernels,
  that loader would be an ELF with 1 section for the 2nd stage, 1
  section for the zImage and 1 section for the initrd, with proper load
  headers.
 
 Don't you already have to treat Linux as a special case though?  How
 do you know where to load ATAGs, DT and/or initramfs, and how to
 initalise the registers?  None of that is part of any ELF specification,
 and would be inappropriate if you boot any non-linux images.

Experience with our PPC systems has been that linking the DTB into
vmlinux is the way to go - so we have a trivally small,
non-upstreamble change (for PPC and ARM) to put the DTB(s) in vmlinux
and capture the CPU registers values (entry call function arguments)
that are set from the bootloader. 

Some DTB fixup code (supported by the OF kernel routines) in vmlinux
edits the DTB chosen node to include the information from the
bootloader.

From there any other DTB fixups (eg fetching a MAC address from I2C)
plus other register initializations are done in Linux (typically by
the upstream drivers, though not 100%), with the MMU on, with full
kernel services. This is a much easier development environment :)

All in all it is basically about 100 lines of code to do what I've
described in vmlinux. This is dramatically less code than would be
required if we tried to conform to the standard vmlinux boot protocol.

 As you observe, GNU ld behaviour in this area tends to be rather patchily
 specified, buggy or both.  That does argue in favour of reusing the
 same techniques already used for other arches, though.

It is good to know PHDRS seems to be working better now, maybe things
can migrate someday!

 A question does occur to me: do your changes work with XIP_KERNEL?
 I'm not very familiar with XIP_KERNEL myself, so I'm currently not
 clear on whether there's an impact here.

I looked at the XIP defines and they didn't seem to conflict. Since
this change only effects the values in the LOAD headers, not the
actual image I doubt it has any impact.

 Beyond this, I think the approach doesn't look unreasonable.

Great, should I do anything further to get this patch into mainline.
 
 You store vmlinux.gz in a cramfs?  Is that a typo, or have you already
 compressed the kernel twice?

The compression can either be intrisic to cramfs or a raw elf that has
been gzip'd.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Jason Gunthorpe
On Mon, Oct 01, 2012 at 06:56:47PM +0100, Dave Martin wrote:

> > > If the kernel is intended to be loadable at a physical address which is
> > > not statically known, no ELF loader that does not ignore the ELF
> > > phdr
> > 
> > In this case you can't really use a standard ELF loader to load the
> > kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
> > and I have set options for a static physical load address.
> 
> Generally, people should try to be compatible with the single kernel
> image effort unless there's a really compelling reason not to.

Well, all the embedded kernels we use are always custom built and
minimized for the target. So as long as there are options to minimize
the kernel size/increase performance by taking out the relocation
stuff, we are going to use them.

All our boards on PPC and ARM use DT kernels now, and we try to flow
back all the generic stuff as best we can. As someone that makes
custom boards, I really like DT, it makes things much easier :)

> Wouldn't your firmware be incapable of loading a multiplatform kernel?

Well, no, it boots ELFs, so it can boot anything, with any memory
layout. A 2nd stage loader would be required to boot standard kernels,
that loader would be an ELF with 1 section for the 2nd stage, 1
section for the zImage and 1 section for the initrd, with proper load
headers.

Creating such a system is a lot of annoyance, so we never have - it is
*so much* easier to just boot vmlinux ELF directly.

> Well, that was a bit of a pedantic point I admit, but there are
> conflicting definitions of what "virtual address" really means in these
> situations.  The original SYSV ABI spec explicitly specifies that
> e_entry is a virtual address, but is also rather vague about how the
> paddr fields should be interpreted.

Granted the spec is vauge, but convention for loaders seems to be that
it is a physical address these days.

It could be a virtual address, and the loader could translate it by
looking at the phdrs, but I don't see any other arches doing that?

> All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
> at least consistent with other architectures then it may not such a
> disaster.  It's not universal though: less than 50% of the arches in
> the kernel currently seem to use this.

I agree it is not nice, but I once did try to make PHDRS work as you
described, but was never successful. IIRC there were serious linker
bugs) As you note the AT method is consistent with other arches, and
the generic vmlinux.lds.h

> > Yes, if you use more loader stages then the load headers are ignored.
> > Our boot loaders on our boards boot straight ELF vmlinux.gz so they
> > need correct load headers.
 
> If your image is compressed anyway though, why are you not using zImage?

We store the kernel in a CRAMFS, the loader pulls it out and
decompresses it, processes the ELF sections 'on the fly' and jumps to
it. Using zImage would result in double-decompression, and reallly has
no benefits to us.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Dave Martin
On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
> On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
> 
> > >   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > >   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
> > > 0x8000
> > 
> > Not related directly to your patch, but I wonder why we don't we see
> > separate r-x and rw- segments?
> 
> I think this is because the sections are not aligned when the
> protections change, and the sections are not sorted by protection
> type.
> 
>   [ 1] .head.textPROGBITSc0008000 008000 0001c0 00  AX  0   0 
> 32
>   [ 2] .text PROGBITSc00081c0 0081c0 1dc880 00  AX  0   0 
> 32
>   [ 3] .rodata   PROGBITSc01e5000 1e5000 060a80 00   A  0   0 
> 32
>   [ 4] __bug_table   PROGBITSc0245a80 245a80 002a78 00   A  0   0 
>  1
>   [ 5] .pci_fixupPROGBITSc02484f8 2484f8 90 00   A  0   0 
>  4
>   [ 6] __param   PROGBITSc0248588 248588 000280 00   A  0   0 
>  4
>   [ 7] __modver  PROGBITSc0248808 248808 0007f8 00   A  0   0 
>  4
>   [ 8] .init.textPROGBITSc0249000 249000 01af20 00  AX  0   0 
> 32
>   [ 9] .exit.textPROGBITSc0263f20 263f20 000c0c 00  AX  0   0 
>  4
>   [10] .init.proc.info   PROGBITSc0264b2c 264b2c 9c 00  AX  0   0 
>  1
>   [11] .init.arch.info   PROGBITSc0264bc8 264bc8 44 00   A  0   0 
>  4
>   [12] .init.tagtablePROGBITSc0264c0c 264c0c 40 00   A  0   0 
>  4
>   [13] .init.dataPROGBITSc0264c60 264c60 0f481c 00  WA  0   0 
> 32
>   [14] .data PROGBITSc035a000 35a000 020220 00  WA  0   0 
> 32
>   [15] .notesNOTEc037a220 37a220 24 00  AX  0   0 
>  4
>   [16] .bss  NOBITS  c037a260 37a244 0320b0 00  WA  0   0 
> 32
>   [17] .comment  PROGBITS 37a244 21 01  MS  0   0 
>  1
> 
>  
> > > +/* If we have a known, fixed physical load address then set LOAD_OFFSET
> > > +   and generate an ELF that has the physical load address in the program
> > > +   headers. */
> > > +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
> > > +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
> > > +#endif
> > > +
> > 
> > What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
> > increasingly, especially for multiplatform kernels.
> 
> Then LOAD_OFFSET is unset and include/asm-generic/vmlinux.lds.h does:
> 
> #ifndef LOAD_OFFSET
> #define LOAD_OFFSET 0
> #endif

OK, good.
 
> Which is exactly the same case we have today. LOAD_OFFSET is not a
> name I invented, it is the standard kernel name for the functionality.
>  
> > If the kernel is intended to be loadable at a physical address which is
> > not statically known, no ELF loader that does not ignore the ELF
> > phdr
> 
> In this case you can't really use a standard ELF loader to load the
> kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
> and I have set options for a static physical load address.

Generally, people should try to be compatible with the single kernel
image effort unless there's a really compelling reason not to.

Wouldn't your firmware be incapable of loading a multiplatform kernel?

> > >  OUTPUT_ARCH(arm)
> > > -ENTRY(stext)
> > > +ENTRY(phys_start)
> > 
> > This is debatable.  In fact, stext has the property that its virtual
> > (runtime) and load addresses are the same.
> 
> This is what other arches using LOAD_OFFSET do (eg see ia64), and is
> sensible. ENTRY is *only* used by the ELF loader and specifies where
> the loader should jump. It must be a physical address since the loader
> only works with physical addresse. I don't understand your comment
> that .stext has the same load and virtual address, that is clearly not
> true for the ELF images my build is producing:
> 
> c0008000 T _text
> c0008000 T stext
> 
> The physical address of that symbol is 0x8000.

Well, that was a bit of a pedantic point I admit, but there are
conflicting definitions of what "virtual address" really means in these
situations.  The original SYSV ABI spec explicitly specifies that
e_entry is a virtual address, but is also rather vague about how the
paddr fields should be interpreted.


All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
at least consistent with other architectures then it may not such a
disaster.  It's not universal though: less than 50% of the arches in
the kernel currently seem to use this.

An alternative, cleaner approach might be to specify segment load
addresses directly using PHDRS { }.  This should at leat allow the
load address to be specified once per phdr, rather than once per section.

> > To represent this correctly in the linker scripts, the
> > position-independent head.S code should be split out into a separate
> > section to which LOAD_OFFSET is not applied.
> 
> I'm not sure, the linker script 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Jason Gunthorpe
On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:

> >   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> >   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000
> 
> Not related directly to your patch, but I wonder why we don't we see
> separate r-x and rw- segments?

I think this is because the sections are not aligned when the
protections change, and the sections are not sorted by protection
type.

  [ 1] .head.textPROGBITSc0008000 008000 0001c0 00  AX  0   0 32
  [ 2] .text PROGBITSc00081c0 0081c0 1dc880 00  AX  0   0 32
  [ 3] .rodata   PROGBITSc01e5000 1e5000 060a80 00   A  0   0 32
  [ 4] __bug_table   PROGBITSc0245a80 245a80 002a78 00   A  0   0  1
  [ 5] .pci_fixupPROGBITSc02484f8 2484f8 90 00   A  0   0  4
  [ 6] __param   PROGBITSc0248588 248588 000280 00   A  0   0  4
  [ 7] __modver  PROGBITSc0248808 248808 0007f8 00   A  0   0  4
  [ 8] .init.textPROGBITSc0249000 249000 01af20 00  AX  0   0 32
  [ 9] .exit.textPROGBITSc0263f20 263f20 000c0c 00  AX  0   0  4
  [10] .init.proc.info   PROGBITSc0264b2c 264b2c 9c 00  AX  0   0  1
  [11] .init.arch.info   PROGBITSc0264bc8 264bc8 44 00   A  0   0  4
  [12] .init.tagtablePROGBITSc0264c0c 264c0c 40 00   A  0   0  4
  [13] .init.dataPROGBITSc0264c60 264c60 0f481c 00  WA  0   0 32
  [14] .data PROGBITSc035a000 35a000 020220 00  WA  0   0 32
  [15] .notesNOTEc037a220 37a220 24 00  AX  0   0  4
  [16] .bss  NOBITS  c037a260 37a244 0320b0 00  WA  0   0 32
  [17] .comment  PROGBITS 37a244 21 01  MS  0   0  1

 
> > +/* If we have a known, fixed physical load address then set LOAD_OFFSET
> > +   and generate an ELF that has the physical load address in the program
> > +   headers. */
> > +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
> > +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
> > +#endif
> > +
> 
> What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
> increasingly, especially for multiplatform kernels.

Then LOAD_OFFSET is unset and include/asm-generic/vmlinux.lds.h does:

#ifndef LOAD_OFFSET
#define LOAD_OFFSET 0
#endif

Which is exactly the same case we have today. LOAD_OFFSET is not a
name I invented, it is the standard kernel name for the functionality.
 
> If the kernel is intended to be loadable at a physical address which is
> not statically known, no ELF loader that does not ignore the ELF
> phdr

In this case you can't really use a standard ELF loader to load the
kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
and I have set options for a static physical load address.

> >  OUTPUT_ARCH(arm)
> > -ENTRY(stext)
> > +ENTRY(phys_start)
> 
> This is debatable.  In fact, stext has the property that its virtual
> (runtime) and load addresses are the same.

This is what other arches using LOAD_OFFSET do (eg see ia64), and is
sensible. ENTRY is *only* used by the ELF loader and specifies where
the loader should jump. It must be a physical address since the loader
only works with physical addresse. I don't understand your comment
that .stext has the same load and virtual address, that is clearly not
true for the ELF images my build is producing:

c0008000 T _text
c0008000 T stext

The physical address of that symbol is 0x8000.

> To represent this correctly in the linker scripts, the
> position-independent head.S code should be split out into a separate
> section to which LOAD_OFFSET is not applied.

I'm not sure, the linker script should use addresses that are correct
during runtime, and if the PIC code does not care about its PC then it
is fine to keep it at the virtual address to minimize confusion once
the MMU is on.

> Setting vaddr and paddr to PAGE_OFFSET (as we do now) and having the
> loader choose the appropriate board-specific place to load the kernel
> image makes this irrelevant, if I've understood the situation correctly.

Yes, if you use more loader stages then the load headers are ignored.
Our boot loaders on our boards boot straight ELF vmlinux.gz so they
need correct load headers.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Dave Martin
On Sun, Sep 30, 2012 at 05:21:16PM -0600, Jason Gunthorpe wrote:
> The standard linux asm-generic/vmlinux.lds.h already supports this,
> and it seems other architectures do as well.
> 
> The goal is to create an ELF file that has correct program headers. We
> want to see the VirtAddr be the runtime address of the kernel with the
> MMU turned on, and PhysAddr be the physical load address for the section
> with no MMU.
> 
> This allows ELF based boot loaders to properly load vmlinux:
> 
> $ readelf -l vmlinux
> Entry point 0x8000
>   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000

Not related directly to your patch, but I wonder why we don't we see
separate r-x and rw- segments?

Perhaps the linker script needs tidyup, or there are wrong attributes on
some sections somewhere.

> Signed-off-by: Jason Gunthorpe 
> ---
>  arch/arm/include/asm/memory.h |2 +-
>  arch/arm/kernel/vmlinux.lds.S |   47 
>  2 files changed, 29 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 5f6ddcc..4ce5b6d 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -283,7 +283,7 @@ static inline __deprecated void *bus_to_virt(unsigned 
> long x)
>  #define arch_is_coherent()   0
>  #endif
>  
> -#endif
> +#endif /* __ASSEMBLY__ */
>  
>  #include 
>  
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index 36ff15b..07942b6 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -3,6 +3,13 @@
>   * Written by Martin Mares 
>   */
>  
> +/* If we have a known, fixed physical load address then set LOAD_OFFSET
> +   and generate an ELF that has the physical load address in the program
> +   headers. */
> +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
> +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
> +#endif
> +

What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
increasingly, especially for multiplatform kernels.

Currently, it looks like we will just fail to link the kernel with
ARM_PATCH_PHYS_VIRT... or have I overlooked something?

If the kernel is intended to be loadable at a physical address which is
not statically known, no ELF loader that does not ignore the ELF phdr
address fields can work.  There is no correct way to represent this
situation using an ET_EXEC image, unless we specify explicitly that
the addresses must be ignored as part of our boot protocol.  (We ran into
the same situation with uImages, which bake the load address into the
image.  The eventual answer was to fix U-Boot.)


Really, LOAD_OFFSET (in your terminology) is something that has to be
inferred by the loader in the general case, and in that case the ELF
paddr might as well be the same as the vaddr.


>  #include 
>  #include 
>  #include 
> @@ -39,7 +46,7 @@
>  #endif
>  
>  OUTPUT_ARCH(arm)
> -ENTRY(stext)
> +ENTRY(phys_start)

This is debatable.  In fact, stext has the property that its virtual
(runtime) and load addresses are the same.  To represent this correctly
in the linker scripts, the position-independent head.S code should be
split out into a separate section to which LOAD_OFFSET is not applied.

This may cause confusion elsewhere though, since _text probably needs
to coincide with the virtual image of stext.


Setting vaddr and paddr to PAGE_OFFSET (as we do now) and having the
loader choose the appropriate board-specific place to load the kernel
image makes this irrelevant, if I've understood the situation correctly.

Cheers
---Dave

>  
>  #ifndef __ARMEB__
>  jiffies = jiffies_64;
> @@ -86,11 +93,13 @@ SECTIONS
>  #else
>   . = PAGE_OFFSET + TEXT_OFFSET;
>  #endif
> - .head.text : {
> + .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
>   _text = .;
> + phys_start = . - LOAD_OFFSET;
>   HEAD_TEXT
>   }
> - .text : {   /* Real text segment*/
> + /* Real text segment */
> + .text :  AT(ADDR(.text) - LOAD_OFFSET) {
>   _stext = .; /* Text and read-only data  */
>   __exception_text_start = .;
>   *(.exception.text)
> @@ -119,12 +128,12 @@ SECTIONS
>* Stack unwinding tables
>*/
>   . = ALIGN(8);
> - .ARM.unwind_idx : {
> + .ARM.unwind_idx : AT(ADDR(.ARM.unwind_idx) - LOAD_OFFSET) {
>   __start_unwind_idx = .;
>   *(.ARM.exidx*)
>   __stop_unwind_idx = .;
>   }
> - .ARM.unwind_tab : {
> + .ARM.unwind_tab : AT(ADDR(.ARM.unwind_tab) - LOAD_OFFSET) {
>   __start_unwind_tab = .;
>   *(.ARM.extab*)
>   __stop_unwind_tab = .;
> @@ -139,35 +148,35 @@ SECTIONS
>  #endif
>  
>   INIT_TEXT_SECTION(8)
> - .exit.text : {
> + .exit.text : AT(ADDR(.exit.text) - 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Dave Martin
On Sun, Sep 30, 2012 at 05:21:16PM -0600, Jason Gunthorpe wrote:
 The standard linux asm-generic/vmlinux.lds.h already supports this,
 and it seems other architectures do as well.
 
 The goal is to create an ELF file that has correct program headers. We
 want to see the VirtAddr be the runtime address of the kernel with the
 MMU turned on, and PhysAddr be the physical load address for the section
 with no MMU.
 
 This allows ELF based boot loaders to properly load vmlinux:
 
 $ readelf -l vmlinux
 Entry point 0x8000
   Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
   LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000

Not related directly to your patch, but I wonder why we don't we see
separate r-x and rw- segments?

Perhaps the linker script needs tidyup, or there are wrong attributes on
some sections somewhere.

 Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com
 ---
  arch/arm/include/asm/memory.h |2 +-
  arch/arm/kernel/vmlinux.lds.S |   47 
  2 files changed, 29 insertions(+), 20 deletions(-)
 
 diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
 index 5f6ddcc..4ce5b6d 100644
 --- a/arch/arm/include/asm/memory.h
 +++ b/arch/arm/include/asm/memory.h
 @@ -283,7 +283,7 @@ static inline __deprecated void *bus_to_virt(unsigned 
 long x)
  #define arch_is_coherent()   0
  #endif
  
 -#endif
 +#endif /* __ASSEMBLY__ */
  
  #include asm-generic/memory_model.h
  
 diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
 index 36ff15b..07942b6 100644
 --- a/arch/arm/kernel/vmlinux.lds.S
 +++ b/arch/arm/kernel/vmlinux.lds.S
 @@ -3,6 +3,13 @@
   * Written by Martin Mares m...@atrey.karlin.mff.cuni.cz
   */
  
 +/* If we have a known, fixed physical load address then set LOAD_OFFSET
 +   and generate an ELF that has the physical load address in the program
 +   headers. */
 +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
 +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
 +#endif
 +

What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
increasingly, especially for multiplatform kernels.

Currently, it looks like we will just fail to link the kernel with
ARM_PATCH_PHYS_VIRT... or have I overlooked something?

If the kernel is intended to be loadable at a physical address which is
not statically known, no ELF loader that does not ignore the ELF phdr
address fields can work.  There is no correct way to represent this
situation using an ET_EXEC image, unless we specify explicitly that
the addresses must be ignored as part of our boot protocol.  (We ran into
the same situation with uImages, which bake the load address into the
image.  The eventual answer was to fix U-Boot.)


Really, LOAD_OFFSET (in your terminology) is something that has to be
inferred by the loader in the general case, and in that case the ELF
paddr might as well be the same as the vaddr.


  #include asm-generic/vmlinux.lds.h
  #include asm/cache.h
  #include asm/thread_info.h
 @@ -39,7 +46,7 @@
  #endif
  
  OUTPUT_ARCH(arm)
 -ENTRY(stext)
 +ENTRY(phys_start)

This is debatable.  In fact, stext has the property that its virtual
(runtime) and load addresses are the same.  To represent this correctly
in the linker scripts, the position-independent head.S code should be
split out into a separate section to which LOAD_OFFSET is not applied.

This may cause confusion elsewhere though, since _text probably needs
to coincide with the virtual image of stext.


Setting vaddr and paddr to PAGE_OFFSET (as we do now) and having the
loader choose the appropriate board-specific place to load the kernel
image makes this irrelevant, if I've understood the situation correctly.

Cheers
---Dave

  
  #ifndef __ARMEB__
  jiffies = jiffies_64;
 @@ -86,11 +93,13 @@ SECTIONS
  #else
   . = PAGE_OFFSET + TEXT_OFFSET;
  #endif
 - .head.text : {
 + .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
   _text = .;
 + phys_start = . - LOAD_OFFSET;
   HEAD_TEXT
   }
 - .text : {   /* Real text segment*/
 + /* Real text segment */
 + .text :  AT(ADDR(.text) - LOAD_OFFSET) {
   _stext = .; /* Text and read-only data  */
   __exception_text_start = .;
   *(.exception.text)
 @@ -119,12 +128,12 @@ SECTIONS
* Stack unwinding tables
*/
   . = ALIGN(8);
 - .ARM.unwind_idx : {
 + .ARM.unwind_idx : AT(ADDR(.ARM.unwind_idx) - LOAD_OFFSET) {
   __start_unwind_idx = .;
   *(.ARM.exidx*)
   __stop_unwind_idx = .;
   }
 - .ARM.unwind_tab : {
 + .ARM.unwind_tab : AT(ADDR(.ARM.unwind_tab) - LOAD_OFFSET) {
   __start_unwind_tab = .;
   *(.ARM.extab*)
   __stop_unwind_tab = .;
 @@ -139,35 +148,35 @@ SECTIONS
  #endif
  
   INIT_TEXT_SECTION(8)
 - .exit.text : {
 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Jason Gunthorpe
On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:

Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000
 
 Not related directly to your patch, but I wonder why we don't we see
 separate r-x and rw- segments?

I think this is because the sections are not aligned when the
protections change, and the sections are not sorted by protection
type.

  [ 1] .head.textPROGBITSc0008000 008000 0001c0 00  AX  0   0 32
  [ 2] .text PROGBITSc00081c0 0081c0 1dc880 00  AX  0   0 32
  [ 3] .rodata   PROGBITSc01e5000 1e5000 060a80 00   A  0   0 32
  [ 4] __bug_table   PROGBITSc0245a80 245a80 002a78 00   A  0   0  1
  [ 5] .pci_fixupPROGBITSc02484f8 2484f8 90 00   A  0   0  4
  [ 6] __param   PROGBITSc0248588 248588 000280 00   A  0   0  4
  [ 7] __modver  PROGBITSc0248808 248808 0007f8 00   A  0   0  4
  [ 8] .init.textPROGBITSc0249000 249000 01af20 00  AX  0   0 32
  [ 9] .exit.textPROGBITSc0263f20 263f20 000c0c 00  AX  0   0  4
  [10] .init.proc.info   PROGBITSc0264b2c 264b2c 9c 00  AX  0   0  1
  [11] .init.arch.info   PROGBITSc0264bc8 264bc8 44 00   A  0   0  4
  [12] .init.tagtablePROGBITSc0264c0c 264c0c 40 00   A  0   0  4
  [13] .init.dataPROGBITSc0264c60 264c60 0f481c 00  WA  0   0 32
  [14] .data PROGBITSc035a000 35a000 020220 00  WA  0   0 32
  [15] .notesNOTEc037a220 37a220 24 00  AX  0   0  4
  [16] .bss  NOBITS  c037a260 37a244 0320b0 00  WA  0   0 32
  [17] .comment  PROGBITS 37a244 21 01  MS  0   0  1

 
  +/* If we have a known, fixed physical load address then set LOAD_OFFSET
  +   and generate an ELF that has the physical load address in the program
  +   headers. */
  +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
  +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
  +#endif
  +
 
 What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
 increasingly, especially for multiplatform kernels.

Then LOAD_OFFSET is unset and include/asm-generic/vmlinux.lds.h does:

#ifndef LOAD_OFFSET
#define LOAD_OFFSET 0
#endif

Which is exactly the same case we have today. LOAD_OFFSET is not a
name I invented, it is the standard kernel name for the functionality.
 
 If the kernel is intended to be loadable at a physical address which is
 not statically known, no ELF loader that does not ignore the ELF
 phdr

In this case you can't really use a standard ELF loader to load the
kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
and I have set options for a static physical load address.

   OUTPUT_ARCH(arm)
  -ENTRY(stext)
  +ENTRY(phys_start)
 
 This is debatable.  In fact, stext has the property that its virtual
 (runtime) and load addresses are the same.

This is what other arches using LOAD_OFFSET do (eg see ia64), and is
sensible. ENTRY is *only* used by the ELF loader and specifies where
the loader should jump. It must be a physical address since the loader
only works with physical addresse. I don't understand your comment
that .stext has the same load and virtual address, that is clearly not
true for the ELF images my build is producing:

c0008000 T _text
c0008000 T stext

The physical address of that symbol is 0x8000.

 To represent this correctly in the linker scripts, the
 position-independent head.S code should be split out into a separate
 section to which LOAD_OFFSET is not applied.

I'm not sure, the linker script should use addresses that are correct
during runtime, and if the PIC code does not care about its PC then it
is fine to keep it at the virtual address to minimize confusion once
the MMU is on.

 Setting vaddr and paddr to PAGE_OFFSET (as we do now) and having the
 loader choose the appropriate board-specific place to load the kernel
 image makes this irrelevant, if I've understood the situation correctly.

Yes, if you use more loader stages then the load headers are ignored.
Our boot loaders on our boards boot straight ELF vmlinux.gz so they
need correct load headers.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Dave Martin
On Mon, Oct 01, 2012 at 10:06:39AM -0600, Jason Gunthorpe wrote:
 On Mon, Oct 01, 2012 at 04:39:53PM +0100, Dave Martin wrote:
 
 Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 
   0x8000
  
  Not related directly to your patch, but I wonder why we don't we see
  separate r-x and rw- segments?
 
 I think this is because the sections are not aligned when the
 protections change, and the sections are not sorted by protection
 type.
 
   [ 1] .head.textPROGBITSc0008000 008000 0001c0 00  AX  0   0 
 32
   [ 2] .text PROGBITSc00081c0 0081c0 1dc880 00  AX  0   0 
 32
   [ 3] .rodata   PROGBITSc01e5000 1e5000 060a80 00   A  0   0 
 32
   [ 4] __bug_table   PROGBITSc0245a80 245a80 002a78 00   A  0   0 
  1
   [ 5] .pci_fixupPROGBITSc02484f8 2484f8 90 00   A  0   0 
  4
   [ 6] __param   PROGBITSc0248588 248588 000280 00   A  0   0 
  4
   [ 7] __modver  PROGBITSc0248808 248808 0007f8 00   A  0   0 
  4
   [ 8] .init.textPROGBITSc0249000 249000 01af20 00  AX  0   0 
 32
   [ 9] .exit.textPROGBITSc0263f20 263f20 000c0c 00  AX  0   0 
  4
   [10] .init.proc.info   PROGBITSc0264b2c 264b2c 9c 00  AX  0   0 
  1
   [11] .init.arch.info   PROGBITSc0264bc8 264bc8 44 00   A  0   0 
  4
   [12] .init.tagtablePROGBITSc0264c0c 264c0c 40 00   A  0   0 
  4
   [13] .init.dataPROGBITSc0264c60 264c60 0f481c 00  WA  0   0 
 32
   [14] .data PROGBITSc035a000 35a000 020220 00  WA  0   0 
 32
   [15] .notesNOTEc037a220 37a220 24 00  AX  0   0 
  4
   [16] .bss  NOBITS  c037a260 37a244 0320b0 00  WA  0   0 
 32
   [17] .comment  PROGBITS 37a244 21 01  MS  0   0 
  1
 
  
   +/* If we have a known, fixed physical load address then set LOAD_OFFSET
   +   and generate an ELF that has the physical load address in the program
   +   headers. */
   +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
   +#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
   +#endif
   +
  
  What happens if CONFIG_ARM_PATCH_PHYS_VIRT=y?  This will be used
  increasingly, especially for multiplatform kernels.
 
 Then LOAD_OFFSET is unset and include/asm-generic/vmlinux.lds.h does:
 
 #ifndef LOAD_OFFSET
 #define LOAD_OFFSET 0
 #endif

OK, good.
 
 Which is exactly the same case we have today. LOAD_OFFSET is not a
 name I invented, it is the standard kernel name for the functionality.
  
  If the kernel is intended to be loadable at a physical address which is
  not statically known, no ELF loader that does not ignore the ELF
  phdr
 
 In this case you can't really use a standard ELF loader to load the
 kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
 and I have set options for a static physical load address.

Generally, people should try to be compatible with the single kernel
image effort unless there's a really compelling reason not to.

Wouldn't your firmware be incapable of loading a multiplatform kernel?

OUTPUT_ARCH(arm)
   -ENTRY(stext)
   +ENTRY(phys_start)
  
  This is debatable.  In fact, stext has the property that its virtual
  (runtime) and load addresses are the same.
 
 This is what other arches using LOAD_OFFSET do (eg see ia64), and is
 sensible. ENTRY is *only* used by the ELF loader and specifies where
 the loader should jump. It must be a physical address since the loader
 only works with physical addresse. I don't understand your comment
 that .stext has the same load and virtual address, that is clearly not
 true for the ELF images my build is producing:
 
 c0008000 T _text
 c0008000 T stext
 
 The physical address of that symbol is 0x8000.

Well, that was a bit of a pedantic point I admit, but there are
conflicting definitions of what virtual address really means in these
situations.  The original SYSV ABI spec explicitly specifies that
e_entry is a virtual address, but is also rather vague about how the
paddr fields should be interpreted.


All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
at least consistent with other architectures then it may not such a
disaster.  It's not universal though: less than 50% of the arches in
the kernel currently seem to use this.

An alternative, cleaner approach might be to specify segment load
addresses directly using PHDRS { }.  This should at leat allow the
load address to be specified once per phdr, rather than once per section.

  To represent this correctly in the linker scripts, the
  position-independent head.S code should be split out into a separate
  section to which LOAD_OFFSET is not applied.
 
 I'm not sure, the linker script should use addresses that are correct
 during runtime, and if the PIC code does not care about its PC then it
 is fine to keep it at the virtual 

Re: [PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-10-01 Thread Jason Gunthorpe
On Mon, Oct 01, 2012 at 06:56:47PM +0100, Dave Martin wrote:

   If the kernel is intended to be loadable at a physical address which is
   not statically known, no ELF loader that does not ignore the ELF
   phdr
  
  In this case you can't really use a standard ELF loader to load the
  kernel so, LOAD_OFFSET = 0 is fine. My case is using an ELF loader,
  and I have set options for a static physical load address.
 
 Generally, people should try to be compatible with the single kernel
 image effort unless there's a really compelling reason not to.

Well, all the embedded kernels we use are always custom built and
minimized for the target. So as long as there are options to minimize
the kernel size/increase performance by taking out the relocation
stuff, we are going to use them.

All our boards on PPC and ARM use DT kernels now, and we try to flow
back all the generic stuff as best we can. As someone that makes
custom boards, I really like DT, it makes things much easier :)

 Wouldn't your firmware be incapable of loading a multiplatform kernel?

Well, no, it boots ELFs, so it can boot anything, with any memory
layout. A 2nd stage loader would be required to boot standard kernels,
that loader would be an ELF with 1 section for the 2nd stage, 1
section for the zImage and 1 section for the initrd, with proper load
headers.

Creating such a system is a lot of annoyance, so we never have - it is
*so much* easier to just boot vmlinux ELF directly.

 Well, that was a bit of a pedantic point I admit, but there are
 conflicting definitions of what virtual address really means in these
 situations.  The original SYSV ABI spec explicitly specifies that
 e_entry is a virtual address, but is also rather vague about how the
 paddr fields should be interpreted.

Granted the spec is vauge, but convention for loaders seems to be that
it is a physical address these days.

It could be a virtual address, and the loader could translate it by
looking at the phdrs, but I don't see any other arches doing that?

 All that AT(ADDR(blah) - LOAD_OFFSET) stuff is cumbersome, but if it's
 at least consistent with other architectures then it may not such a
 disaster.  It's not universal though: less than 50% of the arches in
 the kernel currently seem to use this.

I agree it is not nice, but I once did try to make PHDRS work as you
described, but was never successful. IIRC there were serious linker
bugs) As you note the AT method is consistent with other arches, and
the generic vmlinux.lds.h

  Yes, if you use more loader stages then the load headers are ignored.
  Our boot loaders on our boards boot straight ELF vmlinux.gz so they
  need correct load headers.
 
 If your image is compressed anyway though, why are you not using zImage?

We store the kernel in a CRAMFS, the loader pulls it out and
decompresses it, processes the ELF sections 'on the fly' and jumps to
it. Using zImage would result in double-decompression, and reallly has
no benefits to us.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-09-30 Thread Jason Gunthorpe
The standard linux asm-generic/vmlinux.lds.h already supports this,
and it seems other architectures do as well.

The goal is to create an ELF file that has correct program headers. We
want to see the VirtAddr be the runtime address of the kernel with the
MMU turned on, and PhysAddr be the physical load address for the section
with no MMU.

This allows ELF based boot loaders to properly load vmlinux:

$ readelf -l vmlinux
Entry point 0x8000
  Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000

Signed-off-by: Jason Gunthorpe 
---
 arch/arm/include/asm/memory.h |2 +-
 arch/arm/kernel/vmlinux.lds.S |   47 
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 5f6ddcc..4ce5b6d 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -283,7 +283,7 @@ static inline __deprecated void *bus_to_virt(unsigned long 
x)
 #define arch_is_coherent() 0
 #endif
 
-#endif
+#endif /* __ASSEMBLY__ */
 
 #include 
 
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 36ff15b..07942b6 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -3,6 +3,13 @@
  * Written by Martin Mares 
  */
 
+/* If we have a known, fixed physical load address then set LOAD_OFFSET
+   and generate an ELF that has the physical load address in the program
+   headers. */
+#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
+#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
+#endif
+
 #include 
 #include 
 #include 
@@ -39,7 +46,7 @@
 #endif
 
 OUTPUT_ARCH(arm)
-ENTRY(stext)
+ENTRY(phys_start)
 
 #ifndef __ARMEB__
 jiffies = jiffies_64;
@@ -86,11 +93,13 @@ SECTIONS
 #else
. = PAGE_OFFSET + TEXT_OFFSET;
 #endif
-   .head.text : {
+   .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
_text = .;
+   phys_start = . - LOAD_OFFSET;
HEAD_TEXT
}
-   .text : {   /* Real text segment*/
+   /* Real text segment */
+   .text :  AT(ADDR(.text) - LOAD_OFFSET) {
_stext = .; /* Text and read-only data  */
__exception_text_start = .;
*(.exception.text)
@@ -119,12 +128,12 @@ SECTIONS
 * Stack unwinding tables
 */
. = ALIGN(8);
-   .ARM.unwind_idx : {
+   .ARM.unwind_idx : AT(ADDR(.ARM.unwind_idx) - LOAD_OFFSET) {
__start_unwind_idx = .;
*(.ARM.exidx*)
__stop_unwind_idx = .;
}
-   .ARM.unwind_tab : {
+   .ARM.unwind_tab : AT(ADDR(.ARM.unwind_tab) - LOAD_OFFSET) {
__start_unwind_tab = .;
*(.ARM.extab*)
__stop_unwind_tab = .;
@@ -139,35 +148,35 @@ SECTIONS
 #endif
 
INIT_TEXT_SECTION(8)
-   .exit.text : {
+   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) {
ARM_EXIT_KEEP(EXIT_TEXT)
}
-   .init.proc.info : {
+   .init.proc.info : AT(ADDR(.init.proc.info) - LOAD_OFFSET) {
ARM_CPU_DISCARD(PROC_INFO)
}
-   .init.arch.info : {
+   .init.arch.info : AT(ADDR(.init.arch.info) - LOAD_OFFSET) {
__arch_info_begin = .;
*(.arch.info.init)
__arch_info_end = .;
}
-   .init.tagtable : {
+   .init.tagtable : AT(ADDR(.init.tagtable) - LOAD_OFFSET) {
__tagtable_begin = .;
*(.taglist.init)
__tagtable_end = .;
}
 #ifdef CONFIG_SMP_ON_UP
-   .init.smpalt : {
+   .init.smpalt : AT(ADDR(.init.smpalt) - LOAD_OFFSET) {
__smpalt_begin = .;
*(.alt.smp.init)
__smpalt_end = .;
}
 #endif
-   .init.pv_table : {
+   .init.pv_table : AT(ADDR(.init.pv_table) - LOAD_OFFSET) {
__pv_table_begin = .;
*(.pv_table)
__pv_table_end = .;
}
-   .init.data : {
+   .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) {
 #ifndef CONFIG_XIP_KERNEL
INIT_DATA
 #endif
@@ -178,7 +187,7 @@ SECTIONS
INIT_RAM_FS
}
 #ifndef CONFIG_XIP_KERNEL
-   .exit.data : {
+   .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) {
ARM_EXIT_KEEP(EXIT_DATA)
}
 #endif
@@ -196,7 +205,7 @@ SECTIONS
__data_loc = .;
 #endif
 
-   .data : AT(__data_loc) {
+   .data : AT(__data_loc - LOAD_OFFSET) {
_data = .;  /* address in memory */
_sdata = .;
 
@@ -245,7 +254,7 @@ SECTIONS
 * free it after init has commenced and TCM contents have
 * been copied to its destination.
 */
-   .tcm_start : {
+   .tcm_start : AT(ADDR(.tcm_start) - LOAD_OFFSET) {
  

[PATCH] [ARM] Use AT() in the linker script to create correct program headers

2012-09-30 Thread Jason Gunthorpe
The standard linux asm-generic/vmlinux.lds.h already supports this,
and it seems other architectures do as well.

The goal is to create an ELF file that has correct program headers. We
want to see the VirtAddr be the runtime address of the kernel with the
MMU turned on, and PhysAddr be the physical load address for the section
with no MMU.

This allows ELF based boot loaders to properly load vmlinux:

$ readelf -l vmlinux
Entry point 0x8000
  Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD   0x008000 0xc0008000 0x8000 0x372244 0x3a4310 RWE 0x8000

Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com
---
 arch/arm/include/asm/memory.h |2 +-
 arch/arm/kernel/vmlinux.lds.S |   47 
 2 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 5f6ddcc..4ce5b6d 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -283,7 +283,7 @@ static inline __deprecated void *bus_to_virt(unsigned long 
x)
 #define arch_is_coherent() 0
 #endif
 
-#endif
+#endif /* __ASSEMBLY__ */
 
 #include asm-generic/memory_model.h
 
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 36ff15b..07942b6 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -3,6 +3,13 @@
  * Written by Martin Mares m...@atrey.karlin.mff.cuni.cz
  */
 
+/* If we have a known, fixed physical load address then set LOAD_OFFSET
+   and generate an ELF that has the physical load address in the program
+   headers. */
+#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
+#define LOAD_OFFSET (PAGE_OFFSET - PHYS_OFFSET)
+#endif
+
 #include asm-generic/vmlinux.lds.h
 #include asm/cache.h
 #include asm/thread_info.h
@@ -39,7 +46,7 @@
 #endif
 
 OUTPUT_ARCH(arm)
-ENTRY(stext)
+ENTRY(phys_start)
 
 #ifndef __ARMEB__
 jiffies = jiffies_64;
@@ -86,11 +93,13 @@ SECTIONS
 #else
. = PAGE_OFFSET + TEXT_OFFSET;
 #endif
-   .head.text : {
+   .head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
_text = .;
+   phys_start = . - LOAD_OFFSET;
HEAD_TEXT
}
-   .text : {   /* Real text segment*/
+   /* Real text segment */
+   .text :  AT(ADDR(.text) - LOAD_OFFSET) {
_stext = .; /* Text and read-only data  */
__exception_text_start = .;
*(.exception.text)
@@ -119,12 +128,12 @@ SECTIONS
 * Stack unwinding tables
 */
. = ALIGN(8);
-   .ARM.unwind_idx : {
+   .ARM.unwind_idx : AT(ADDR(.ARM.unwind_idx) - LOAD_OFFSET) {
__start_unwind_idx = .;
*(.ARM.exidx*)
__stop_unwind_idx = .;
}
-   .ARM.unwind_tab : {
+   .ARM.unwind_tab : AT(ADDR(.ARM.unwind_tab) - LOAD_OFFSET) {
__start_unwind_tab = .;
*(.ARM.extab*)
__stop_unwind_tab = .;
@@ -139,35 +148,35 @@ SECTIONS
 #endif
 
INIT_TEXT_SECTION(8)
-   .exit.text : {
+   .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) {
ARM_EXIT_KEEP(EXIT_TEXT)
}
-   .init.proc.info : {
+   .init.proc.info : AT(ADDR(.init.proc.info) - LOAD_OFFSET) {
ARM_CPU_DISCARD(PROC_INFO)
}
-   .init.arch.info : {
+   .init.arch.info : AT(ADDR(.init.arch.info) - LOAD_OFFSET) {
__arch_info_begin = .;
*(.arch.info.init)
__arch_info_end = .;
}
-   .init.tagtable : {
+   .init.tagtable : AT(ADDR(.init.tagtable) - LOAD_OFFSET) {
__tagtable_begin = .;
*(.taglist.init)
__tagtable_end = .;
}
 #ifdef CONFIG_SMP_ON_UP
-   .init.smpalt : {
+   .init.smpalt : AT(ADDR(.init.smpalt) - LOAD_OFFSET) {
__smpalt_begin = .;
*(.alt.smp.init)
__smpalt_end = .;
}
 #endif
-   .init.pv_table : {
+   .init.pv_table : AT(ADDR(.init.pv_table) - LOAD_OFFSET) {
__pv_table_begin = .;
*(.pv_table)
__pv_table_end = .;
}
-   .init.data : {
+   .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) {
 #ifndef CONFIG_XIP_KERNEL
INIT_DATA
 #endif
@@ -178,7 +187,7 @@ SECTIONS
INIT_RAM_FS
}
 #ifndef CONFIG_XIP_KERNEL
-   .exit.data : {
+   .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) {
ARM_EXIT_KEEP(EXIT_DATA)
}
 #endif
@@ -196,7 +205,7 @@ SECTIONS
__data_loc = .;
 #endif
 
-   .data : AT(__data_loc) {
+   .data : AT(__data_loc - LOAD_OFFSET) {
_data = .;  /* address in memory */
_sdata = .;
 
@@ -245,7 +254,7 @@ SECTIONS
 * free it after init has commenced and TCM contents have