Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-17 Thread Ralf Baechle
On Tue, Jun 16, 2009 at 09:26:02PM -0700, H. Peter Anvin wrote:

  I2C or similar busses can be particularly annoying if they contain
  essential configuration information such as memory size which is needed
  long before anything else.  So far a common solution is that platforms
  are carrying a private (aka redundant, ugly) early-i2c system that's just
  about sufficient for this purpose.
 
 For what it's worth, this is true for pretty much ALL systems with
 removable memory modules, since Serial Presence Detect (SPD) is
 electrically equivalent to I2C.
 
 However, on most systems, even embedded, bringing up memory falls on
 firmware (sometimes in the form of a boot loader) so Linux rarely sees it.

There are embedded systems where the firmware does not provide a usable
memory map or where that is plain broken.  Or Linux with some extra init
code serves as the firmware.  Often there is a single serial EEPROM for
the entire system.  If there is an atrocity that can save a penny it will
be committed, at least in the embedded world.

  Ralf
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-17 Thread H. Peter Anvin
Ralf Baechle wrote:

 However, on most systems, even embedded, bringing up memory falls on
 firmware (sometimes in the form of a boot loader) so Linux rarely sees it.
 
 There are embedded systems where the firmware does not provide a usable
 memory map or where that is plain broken.  Or Linux with some extra init
 code serves as the firmware.  Often there is a single serial EEPROM for
 the entire system.  If there is an atrocity that can save a penny it will
 be committed, at least in the embedded world.
 

"Rarely" is certainly not "never".  Quite on the contrary.  Also, I
think you can remove "that can save a penny" from your last sentence...

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-16 Thread Ralf Baechle
On Tue, Jun 16, 2009 at 04:06:48AM -0400, Mike Frysinger wrote:

 On Tue, Jun 16, 2009 at 02:42, Mike Rapoport wrote:
  James Bottomley wrote:
  We've got to the point where there are simply too many embedded
  architectures to invite all the arch maintainers to the kernel summit.
  So, this year, we thought we'd do embedded via topic driven invitations
  instead.  So what we're looking for is a proposal to discuss the issues
  most affecting embedded architectures, or preview any features affecting
  the main kernel which embedded architectures might need ... or any other
  topics from embedded architectures which might need discussion or
  debate.
 
  Another issue that affects embedded architectures is driver initialization
  order. There are a lot of cases where you need the drivers to be initialized
  in a particular order, and the current initcall scheme does not allow
  fine-grained control over it.
 
 example: device configuration information stored in i2c eeprom (i.e.
 dimensions of attached framebuffer), but i2c is not available when
 framebuffer layer is set up.  framebuffer driver has to be built as a
 module and loaded by userspace, or i2c information is read by
 bootloader and passed down to the kernel.

I2C or similar busses can be particularly annoying if they contain
essential configuration information such as memory size which is needed
long before anything else.  So far a common solution is that platforms
are carrying a private (aka redundant, ugly) early-i2c system that's just
about sufficient for this purpose.

  Ralf


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-16 Thread H. Peter Anvin
Ralf Baechle wrote:
 
 I2C or similar busses can be particularly annoying if they contain
 essential configuration information such as memory size which is needed
 long before anything else.  So far a common solution is that platforms
 are carrying a private (aka redundant, ugly) early-i2c system that's just
 about sufficient for this purpose.
 

For what it's worth, this is true for pretty much ALL systems with
removable memory modules, since Serial Presence Detect (SPD) is
electrically equivalent to I2C.

However, on most systems, even embedded, bringing up memory falls on
firmware (sometimes in the form of a boot loader) so Linux rarely sees it.

-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-10 Thread Thomas Petazzoni
On Wed, 03 Jun 2009 12:19:57 -0400,
James Bottomley james.bottom...@hansenpartnership.com wrote:

 So ZONE_DMA and coherent memory allocation as represented by the
 coherent mask are really totally separate things.  The idea of
 ZONE_DMA was really that if you had an ISA device, allocations from
 ZONE_DMA would be able to access the allocated memory without
 bouncing.  Since ISA is really going away, this definition has been
 hijacked.  If your problem is just that you need memory allocated on
 a certain physical mask and neither GFP_DMA nor GFP_DMA32 cut it for
 you, then we could revisit the kmalloc_mask() proposal again ... but
 the consensus last time was that no-one really had a compelling use
 case that couldn't be covered by GFP_DMA32.

Back a few years ago, I was working on a MIPS platform which had 256 MB
of RAM attached to the CPU memory controller and 128 MB attached to an
external memory controller. The layout of the memories was: 256 MB
CPU-attached memory first, and then the 128 MB
external-controller-attached memory.

Now, back to the DMA discussion: the Ethernet controller, which was
part of that external controller also driving the 128 MB bank of
memory, could only DMA to and from memory controlled by that same
controller (i.e. only to the *top* 128 MB of the physical address
space). I'm far from being an mm expert, but as far as I could understand
the zone mechanism, it was not possible to describe such a
physical memory configuration, where DMA-able memory is only at the top.

In the end, I resorted to passing mem=..., manually managing a few
megabytes of memory at the top of the physical address space, and
hacking the Ethernet driver to copy the skb contents back and forth
between the main memory and the DMA-reserved memory.

So when Catalin Marinas says "currently ZONE_DMA is assumed to be in
the bottom part of the memory which isn't always the case", I cannot
agree more.

Reference:
http://www.linux-mips.org/archives/linux-mips/2004-09/msg00152.html

Sincerely,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers and embedded Linux development,
consulting, training and support.
http://free-electrons.com


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-04 Thread Mel Gorman
On Wed, Jun 03, 2009 at 11:11:13PM -0400, David VomLehn (dvomlehn) wrote:
 David Delaney has a proof-of-concept of an idea of his which was
 presented at the last CELF, which is basically to put the kernel and
 loadable kernel modules closely enough together that you can avoid the
 use of long jumps. He sees a better than 1% improvement in performance,
 which we've duplicated using a slightly different approach. This is nice
 payback for little work and, though it doesn't help on all processors,
 it helps on several.
 
 The problem is: how do you allocate memory with the magical "close to
 the kernel" attribute? We have something that adds a new ZONE_KERNEL
 (this name has some problems, actually).

As this is about addresses of text, I imagine that you really care about the
virtual address the module is loaded at which is what the virtual address
space is responsible for and not the physical addresses which zones are
concerned with. Would that be right?

If that was the case, this could potentially be done by moving where
the vmalloc address space is located or possibly splitting it in two. By
locating some portion of the vmalloc address space above the kernel
image, the kernel modules could be loaded in there.

The DMA problem is different: it really requires particular physical
addresses. That no one has tried to implement Andrew's suggestion is a bit
of a surprise because, as far as I can see, it would do the job; but it is
not an issue that hurts me, so I never sat down to try implementing it.
Granted, increasing the number of zones adds its own problems, but only for
large numbers of zones, and there are other things that could be done to the
allocator to reduce its cache footprint. The big plus is that it plays very
well with reclaim, and I would expect it to perform better than searching
the free lists for a suitable page, which would be a bit of a hatchet job.

 It seems like a pretty good
 solution if you look at zones as conceptually concentric usages, but
 with the current zone implementation, each zone must be contiguous. So,
 if we're talking about changing what zones are done, I'd like to throw
 this into the pot.
 

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Josh Boyer
On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
  * Mixed endianness devices in the same system - this may only need
dedicated readl_be/writel_be etc. macros but it could also be
done by having bus-aware readl/writel-like macros

ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?

josh


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Catalin Marinas
On Wed, 2009-06-03 at 09:18 -0400, Josh Boyer wrote:
 On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
   * Mixed endianness devices in the same system - this may only need
 dedicated readl_be/writel_be etc. macros but it could also be
 done by having bus-aware readl/writel-like macros
 
 ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?

Yes, but there are many drivers that only use readl/writel (and
arch/arm makes the assumption, maybe correctly, that this is little
endian only).

I think that's useful even if the outcome of such discussion is better
documentation on the above functions/macros (grepping Documentation/
doesn't show any reference).

Thanks.

-- 
Catalin



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Josh Boyer
On Wed, Jun 03, 2009 at 02:45:37PM +0100, Catalin Marinas wrote:
On Wed, 2009-06-03 at 09:18 -0400, Josh Boyer wrote:
 On Wed, Jun 03, 2009 at 02:04:46PM +0100, Catalin Marinas wrote:
   * Mixed endianness devices in the same system - this may only need
 dedicated readl_be/writel_be etc. macros but it could also be
 done by having bus-aware readl/writel-like macros
 
 ioread/iowrite{8,16,32} and ioread/iowrite{8,16,32}_be don't suffice here?

Yes, but there are many drivers that only use readl/writel (and
arch/arm makes the assumption, maybe correctly, that this is little
endian only).

readl/writel are little-endian only.

I think that's useful even if the outcome of such discussion is better
documentation on the above functions/macros (grepping Documentation/
doesn't show any reference).

I think we could perhaps start with just writing some of that documentation
and trying to get it into the kernel.  I'm not sure this specific item is
really worthy of a KS topic.

josh


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Jean-Christophe PLAGNIOL-VILLARD
 
   * Asymmetric MP:
   * Different CPU frequencies
   * Different CPU features (e.g. floating point only on
 some CPUs): scheduler awareness, per-CPU hwcap bits (in
 case user space wants to set the affinity) 
   * Asymmetric workload balancing for power consumption (may
 be better to load 1 CPU at 60% than 4 at 15%) 
I'll add
* Different Core ARCH
* FDT or similar to describe I/O (MEM, PCI, GPIO) accessible for
  each instance
* Mailbox Architecture
* boot procedure (bootloader as example done by Kumar Gala
  for the mpc8572ds in linux  U-Boot)
* sharing rootfs (RO) (reduce the rootfs size on embedded)

Best Regards,
J.


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread James Bottomley
On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
 Hi,
 
 On Tue, 2009-06-02 at 15:22 +, James Bottomley wrote:
  So what we're looking for is a proposal to discuss the issues
  most affecting embedded architectures, or preview any features affecting
  the main kernel which embedded architectures might need ... or any other
  topics from embedded architectures which might need discussion or
  debate.
 
 Some issues that come up on embedded systems (and not only):
 
   * Multiple coherency domains for devices - the system may have
 multiple bus levels, coherency ports, cache levels etc. Some
 devices in the system (but not all) may be able to see various
 cache levels but the DMA API (at least on ARM) cannot handle
 this. It may be useful to discuss how other embedded
 architectures handle this and come up with a unified solution

So this is partially what the dma_sync_for_{device|cpu} is supposed to
be helping with.  By and large, the DMA API tries to hide the
complexities of coherency domains from the user.  The actual API, as far
as it goes, seems to do this OK.  We have synchronisation issues that
mmiowb() and friends help with ... what's the actual problem here?

   * Better support for coherent DMA mask - currently ZONE_DMA is
 assumed to be in the bottom part of the memory which isn't
 always the case. Enabling NUMA may help but it is overkill for
 some systems. As above, a more unified solution across
 architectures would help

So ZONE_DMA and coherent memory allocation as represented by the
coherent mask are really totally separate things.  The idea of ZONE_DMA
was really that if you had an ISA device, allocations from ZONE_DMA
would be able to access the allocated memory without bouncing.  Since
ISA is really going away, this definition has been hijacked.  If your
problem is just that you need memory allocated on a certain physical
mask and neither GFP_DMA nor GFP_DMA32 cut it for you, then we could
revisit the kmalloc_mask() proposal again ... but the consensus last
time was that no-one really had a compelling use case that couldn't be
covered by GFP_DMA32.

   * PIO block devices and non-coherent hardware - code like mpage.c
 assumes that either the hardware is coherent or the device
 driver performs the cache flushing. The latter is true for
 DMA-capable device but not for PIO. The issue becomes visible
 with write-allocate caches and the device driver may not have
 the struct page information to call flush_dcache_page(). A
 proposed solution on the ARM lists was to differentiate (via
 some flags) between PIO and DMA block devices and use this
 information in mpage.c

flush_dcache_page() is supposed to be for making the data visible to the
user ... that coherency is supposed to be managed by the block layer.
The DMA API is specifically aimed at device to kernel space
coherency ... although if you line up all your aliases, that can also be
device to userspace.  Technically though we have two separate APIs for
user-kernel coherency and device-kernel coherency.  What's the path
you're seeing this problem down?  SG_IO to a device doing PIO should be
handling this correctly.

   * Mixed endianness devices in the same system - this may only need
 dedicated readl_be/writel_be etc. macros but it could also be
 done by having bus-aware readl/writel-like macros

We have ioreadXbe for this exact case (similar problem on parisc)

   * Asymmetric MP:
   * Different CPU frequencies
   * Different CPU features (e.g. floating point only on
 some CPUs): scheduler awareness, per-CPU hwcap bits (in
 case user space wants to set the affinity) 
   * Asymmetric workload balancing for power consumption (may
 be better to load 1 CPU at 60% than 4 at 15%) 

This actually just works(tm) for me on a voyager system running SMP with
a mixed 486/586 set of processors ... what's the problem?  The only
issue I see is that you have to set the capabilities of the boot CPU to
the intersection of the mixture otherwise setup goes wrong, but
otherwise it seems to work OK.

James




Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Russell King
On Wed, Jun 03, 2009 at 12:19:57PM -0400, James Bottomley wrote:
 On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
* Better support for coherent DMA mask - currently ZONE_DMA is
  assumed to be in the bottom part of the memory which isn't
  always the case. Enabling NUMA may help but it is overkill for
  some systems. As above, a more unified solution across
  architectures would help
 
 So ZONE_DMA and coherent memory allocation as represented by the
 coherent mask are really totally separate things.  The idea of ZONE_DMA
 was really that if you had an ISA device, allocations from ZONE_DMA
 would be able to access the allocated memory without bouncing.  Since
 ISA is really going away, this definition has been hijacked.  If your
 problem is just that you need memory allocated on a certain physical
 mask and neither GFP_DMA nor GFP_DMA32 cut it for you, then we could
 revisit the kmalloc_mask() proposal again ... but the consensus last
 time was that no-one really had a compelling use case that couldn't be
 covered by GFP_DMA32.

I'm not aware of such a discussion; I keep running into issues here.  In
fact, on ARM the DMA mask is exactly that - it's a 100% proper mask.  It's
not a bunch of zeros in the MSB followed by a bunch of ones down to the
LSB.  It can be a bunch of ones, a bunch of zeros, followed by a bunch of
ones.

The way we occasionally have to deal with this is to trial an allocation,
see if the physical address fits, if not free the page and try again with
GFP_DMA set.

We do certain checks on the DMA mask - notably that a GFP_DMA allocation
will satisfy the mask which has been passed.

I've never submitted the patch which does this in the ARM coherent DMA
allocator, but it's something that occasionally crops up as being
necessary - I've always thought the allocate-by-mask stuff would
eventually be merged.

* PIO block devices and non-coherent hardware - code like mpage.c
  assumes that either the hardware is coherent or the device
  driver performs the cache flushing. The latter is true for
  DMA-capable device but not for PIO. The issue becomes visible
  with write-allocate caches and the device driver may not have
  the struct page information to call flush_dcache_page(). A
  proposed solution on the ARM lists was to differentiate (via
  some flags) between PIO and DMA block devices and use this
  information in mpage.c
 
 flush_dcache_page() is supposed to be for making the data visible to the
 user ... that coherency is supposed to be managed by the block layer.
 The DMA API is specifically aimed at device to kernel space
 coherency ... although if you line up all your aliases, that can also be
 device to userspace.  Technically though we have two separate APIs for
 user-kernel coherency and device-kernel coherency.  What's the path
 you're seeing this problem down?  SG_IO to a device doing PIO should be
 handling this correctly.

There's many stories I've heard on what is supposed to take care of the
coherency that I now just close my ears to the problem and chant it
doesn't exist, people aren't seeing it, mainline folk just don't give
a damn.  Really.  It is a problem on _some_ ARM devices and has been
for several years now, and I've 100% given up caring about it.

So people who see the problem just have to suffer with it, and they have
to accept that the Linux kernel sucks with PIO on ARM hardware.

Unless they use a driver I've written which has the necessary callbacks
in it to ensure cache coherency (like MMC).  IDE... forget it.

Yes, that taste you're experiencing is my bitterness on this subject.

-- 
Russell King
 Linux kernel maintainer of: 2.6 ARM Linux - http://www.arm.linux.org.uk/


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Andrew Morton
On Wed, 3 Jun 2009 18:09:25 +0100
Russell King r...@arm.linux.org.uk wrote:

 In
 fact, on ARM the DMA mask is exactly that - it's a 100% proper mask.  It's
 not a bunch of zeros in the MSB followed by a bunch of ones down to the
 LSB.  It can be a bunch of ones, a bunch of zeros, followed by a bunch of
 ones.
 
 The way we occasionally have to deal with this is to trial an allocation,
 see if the physical address fits, if not free the page and try again with
 GFP_DMA set.

A couple of times I've suggested that we have the ability to allocate
one zone per address bit, so a 32-bit machine with 4k pages would end
up having 20 zones.  Then, your funny DMA mask can be directly passed
into the page allocator as a zone mask and voila, I think.

 There's many stories I've heard on what is supposed to take care of the
 coherency that I now just close my ears to the problem and chant it
 doesn't exist, people aren't seeing it, mainline folk just don't give
 a damn.  Really.  It is a problem on _some_ ARM devices and has been
 for several years now, and I've 100% given up caring about it.

I wasn't even aware that there was an issue here.  Please don't blame
mainline folk for something they weren't told about!



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread James Bottomley
On Wed, 2009-06-03 at 11:43 -0700, Andrew Morton wrote:
 On Wed, 3 Jun 2009 18:09:25 +0100
 Russell King r...@arm.linux.org.uk wrote:
 
  In
  fact, on ARM the DMA mask is exactly that - it's a 100% proper mask.  It's
  not a bunch of zeros in the MSB followed by a bunch of ones down to the
  LSB.  It can be a bunch of ones, a bunch of zeros, followed by a bunch of
  ones.
  
  The way we occasionally have to deal with this is to trial an allocation,
  see if the physical address fits, if not free the page and try again with
  GFP_DMA set.
 
 A couple of times I've suggested that we have the ability to allocate
 one zone per address bit, so a 32-bit machine with 4k pages would end
 up having 20 zones.  Then, your funny DMA mask can be directly passed
 into the page allocator as a zone mask and voila, I think.

The objection I heard to that one is that the zone machinery works
better with fewer zones ... but we could certainly align them along
known boundaries for allocations (if it's only bit X that's the problem,
say, you only need an additional zone covering that one).

Based on this, I dug up the initial proposal, it was the Ottawa Kernel
Summit in 2005 (I'm a packrat; I keep all my old presentations):

http://www.hansenpartnership.com/sites/hansenpartnership.com/files/jejb/kernel_summit_iommu.pdf

kmalloc_mask() is right at the end.  It basically died for lack of
interest and the fact that GFP_DMA32 satisfied 99% of the actual use
cases.

James




Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Catalin Marinas
On Wed, 2009-06-03 at 12:19 -0400, James Bottomley wrote:
 On Wed, 2009-06-03 at 14:04 +0100, Catalin Marinas wrote:
  On Tue, 2009-06-02 at 15:22 +, James Bottomley wrote:
   So what we're looking for is a proposal to discuss the issues
   most affecting embedded architectures, or preview any features affecting
   the main kernel which embedded architectures might need ... or any other
   topics from embedded architectures which might need discussion or
   debate.
  
  Some issues that come up on embedded systems (and not only):
  
* Multiple coherency domains for devices - the system may have
  multiple bus levels, coherency ports, cache levels etc. Some
  devices in the system (but not all) may be able to see various
  cache levels but the DMA API (at least on ARM) cannot handle
  this. It may be useful to discuss how other embedded
  architectures handle this and come up with a unified solution
 
 So this is partially what the dma_sync_for_{device|cpu} is supposed to
 be helping with.  By and large, the DMA API tries to hide the
 complexities of coherency domains from the user.  The actual API, as far
 as it goes, seems to do this OK.

Yes, the dma_sync_* API is probably OK. The actual implementation should
become aware of various coherency domains on the same system (it could
hold this information in one of the bus-related structures). Currently,
devices that can access the CPU (inner or outer) cache have drivers
modified to avoid calling the dma_sync_* functions (since other devices
need such functions).

If other embedded architectures face similar issues, it is worth
discussing and maybe coming up with a common solution (of course, like
most topics, they could simply be discussed on the mailing lists rather
than at the KS).

* Better support for coherent DMA mask - currently ZONE_DMA is
  assumed to be in the bottom part of the memory which isn't
  always the case. Enabling NUMA may help but it is overkill for
  some systems. As above, a more unified solution across
  architectures would help
 
 So ZONE_DMA and coherent memory allocation as represented by the
 coherent mask are really totally separate things.  The idea of ZONE_DMA
 was really that if you had an ISA device, allocations from ZONE_DMA
 would be able to access the allocated memory without bouncing.  Since
 ISA is really going away, this definition has been hijacked.  If your
 problem is just that you need memory allocated on a certain physical
 mask and neither GFP_DMA nor GFP_DMA32 cut it for you, then we could
 revisit the kmalloc_mask() proposal again ... but the consensus last
 time was that no-one really had a compelling use case that couldn't be
 covered by GFP_DMA32.

Russell already commented on this. As an example, I have a platform with
two blocks of RAM - 512MB @ 0x2000 and 512MB @ 0x7000 - but only
the higher one allows DMA.

* PIO block devices and non-coherent hardware - code like mpage.c
  assumes that either the hardware is coherent or the device
  driver performs the cache flushing. The latter is true for
  DMA-capable device but not for PIO. The issue becomes visible
  with write-allocate caches and the device driver may not have
  the struct page information to call flush_dcache_page(). A
  proposed solution on the ARM lists was to differentiate (via
  some flags) between PIO and DMA block devices and use this
  information in mpage.c
 
 flush_dcache_page() is supposed to be for making the data visible to the
 user ... that coherency is supposed to be managed by the block layer.

I'm referring to kernel-user coherency issues and yes,
flush_dcache_page() is the function supposed to handle this. It's only
that it isn't always called in the block or VFS layers (for example, to
be able to use ext2 over CompactFlash using PATA I had to add a hack so
that flush_dcache_page() is called from mpage_end_io_read()).

Some devices like Russell's mmci.c use scatter lists and they have
access to the page structure and perform the flushing. I noticed that
for some block devices you can't easily retrieve the page structure (I
would need to check the code for more precise references). But if the
driver is somehow marked as PIO, the VFS layer could ensure the
coherency.
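
A rough userspace model of that idea, as a sketch only: every name
below (model_page, model_blkdev, the pio flag, model_end_io_read) is
hypothetical and stands in for the real structures; it is not how
mpage.c is written today.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page. */
struct model_page {
	bool dcache_flushed; /* set once the kernel mapping has been flushed */
};

/* Hypothetical stand-in for a block device carrying a PIO marker. */
struct model_blkdev {
	bool pio; /* assumed flag: data is copied by the CPU, not by DMA */
};

/* Stand-in for flush_dcache_page(). */
static inline void model_flush_dcache_page(struct model_page *page)
{
	page->dcache_flushed = true;
}

/*
 * Read-completion hook in the generic layer. A DMA driver is assumed
 * to have handled coherency itself; for a PIO device the generic
 * layer flushes on the driver's behalf.
 */
static inline void model_end_io_read(const struct model_blkdev *dev,
				     struct model_page *page)
{
	if (dev->pio)
		model_flush_dcache_page(page);
}
```

The point of the flag is that the driver no longer needs access to the
page structure; the layer that already holds it does the flush.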

* Mixed endianness devices in the same system - this may only need
  dedicated readl_be/writel_be etc. macros but it could also be
  done by having bus-aware readl/writel-like macros
 
 We have ioreadXbe for this exact case (similar problem on parisc)

OK, probably not worth a new topic. As it was mentioned on
linux-embedded already, it may just need better documentation (there is no
reference to ioread* in Documentation/ and most devices seem to use
readl/writel etc.).
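
For readers unfamiliar with the distinction, the following userspace
sketch shows what an explicitly big-endian accessor does differently
from a little-endian one. In the kernel the real accessors are
ioread32be()/iowrite32be(); model_read32_be() and model_read32_le()
below are hypothetical names operating on a plain byte array rather
than on device memory.

```c
#include <stdint.h>

/* Read a 32-bit register image whose bytes are stored big-endian. */
static inline uint32_t model_read32_be(const volatile uint8_t *reg)
{
	return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
	       ((uint32_t)reg[2] << 8) | (uint32_t)reg[3];
}

/* Read the same four bytes as a little-endian value. */
static inline uint32_t model_read32_le(const volatile uint8_t *reg)
{
	return ((uint32_t)reg[3] << 24) | ((uint32_t)reg[2] << 16) |
	       ((uint32_t)reg[1] << 8) | (uint32_t)reg[0];
}
```

Because each accessor fixes the byte order itself, the result does not
depend on the endianness of the CPU running the code, which is what a
system with mixed-endian devices needs.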

* Asymmetric MP:
  * Different CPU frequencies
  * Different CPU features

RE: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread David VomLehn (dvomlehn)
David Delaney has a proof-of-concept of an idea of his which was
presented at the last CELF, which is basically to put the kernel and
loadable kernel modules close enough together that you can avoid the
use of long jumps. He sees a better than 1% improvement in performance,
which we've duplicated using a slightly different approach. This is nice
payback for little work and, though it doesn't help on all processors,
it helps on several.

The problem is: how do you allocate memory with the magical "close to
the kernel" attribute? We have something that adds a new ZONE_KERNEL
(this name has some problems, actually). It seems like a pretty good
solution if you look at zones as conceptually concentric usages, but
with the current zone implementation, each zone must be contiguous. So,
if we're talking about changing how zones are done, I'd like to throw
this into the pot.
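
As a sketch of the underlying constraint: a short PC-relative call only
works if the module lands within the branch instruction's reach of the
kernel text. within_branch_reach() is a hypothetical helper, and it
treats the reach as symmetric, which real instruction encodings only
approximate.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a PC-relative branch with roughly +/- `reach` bytes of range
 * can get from address `from` to address `to`.
 */
static inline bool within_branch_reach(uint64_t from, uint64_t to,
				       uint64_t reach)
{
	uint64_t dist = from > to ? from - to : to - from;

	return dist <= reach;
}
```

With an illustrative reach of 32MB (the figure usually quoted for the
ARM BL instruction) and made-up addresses, a module at 0xBF000000 can
reach kernel text at 0xC0008000, while one at 0xD0000000 cannot and
would need long calls.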

 -----Original Message-----
 From: linux-embedded-ow...@vger.kernel.org [mailto:linux-embedded-ow...@vger.kernel.org] On Behalf Of Andrew Morton
 Sent: Wednesday, June 03, 2009 11:44 AM
 To: Russell King
 Cc: james.bottom...@hansenpartnership.com; linux-a...@vger.kernel.org; linux-embedded@vger.kernel.org; ksummit-2009-disc...@lists.linux-foundation.org
 Subject: Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit
 
 On Wed, 3 Jun 2009 18:09:25 +0100
 Russell King r...@arm.linux.org.uk wrote:
 
  In fact, on ARM the DMA mask is exactly that - it's a 100% proper
  mask.  It's not a bunch of zeros in the MSB followed by a bunch of
  ones down to the LSB.  It can be a bunch of ones, a bunch of zeros,
  followed by a bunch of ones.
  
  The way we occasionally have to deal with this is to trial an
  allocation, see if the physical address fits, if not free the page
  and try again with GFP_DMA set.
 
 A couple of times I've suggested that we have the ability to allocate
 one zone per address bit, so a 32-bit machine with 4k pages would end
 up having 20 zones.  Then, your funny DMA mask can be directly passed
 into the page allocator as a zone mask and voila, I think.
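
The two ideas in this exchange can be sketched in userspace C. This is
a model under stated assumptions, not kernel code: the helper names are
hypothetical, the mask is assumed to have all in-page offset bits set
(so checking a page's base address is enough), and the "trial
allocation" is modelled as a scan over candidate page addresses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if every address bit set in `phys` is also set in `mask`. */
static inline bool page_fits_mask(uint64_t phys, uint64_t mask)
{
	return (phys & ~mask) == 0;
}

/*
 * Model of trial allocation: try candidate pages in order and return
 * the index of the first one the device can address, or -1 if none.
 */
static inline int first_fitting_page(const uint64_t *pages, size_t n,
				     uint64_t mask)
{
	for (size_t i = 0; i < n; i++)
		if (page_fits_mask(pages[i], mask))
			return (int)i;
	return -1;
}

/* One zone per address bit above the page offset. */
static inline unsigned int zones_per_address_bit(unsigned int addr_bits,
						 unsigned int page_shift)
{
	return addr_bits - page_shift;
}
```

With a made-up mask of 0xC0FFFFFF (ones, then zeros, then ones), a page
at 0x40001000 fits while one at 0x01000000 does not, even though the
latter is the lower address. zones_per_address_bit(32, 12) gives the 20
zones mentioned above for a 32-bit machine with 4k pages.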
 
  There's many stories I've heard on what is supposed to take care of
  the coherency that I now just close my ears to the problem and chant
  it doesn't exist, people aren't seeing it, mainline folk just don't
  give a damn.  Really.  It is a problem on _some_ ARM devices and has
  been for several years now, and I've 100% given up caring about it.
 
 I wasn't even aware that there was an issue here.  Please don't blame
 mainline folk for something they weren't told about!
 


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-03 Thread Mike Frysinger
On Wed, Jun 3, 2009 at 23:11, David VomLehn (dvomlehn) wrote:
 David Delaney has a proof-of-concept of an idea of his which was
 presented at the last CELF, which is basically to put the kernel and
 loadable kernel modules close enough together that you can avoid the
 use of long jumps. He sees a better than 1% improvement in performance,
 which we've duplicated using a slightly different approach. This is nice
 payback for little work and, though it doesn't help on all processors,
 it helps on several.

it would help on the Blackfin architecture.  we compile all kernel
modules with -mlong-call because of this issue.

 The problem is: how do you allocate memory with the magical "close to
 the kernel" attribute? We have something that adds a new ZONE_KERNEL
 (this name has some problems, actually). It seems like a pretty good
 solution if you look at zones as conceptually concentric usages, but
 with the current zone implementation, each zone must be contiguous. So,
 if we're talking about changing how zones are done, I'd like to throw
 this into the pot.

what do you do if the alloc fails ?  return back to userspace with
something like ENOMEM and have it retry with a module that was
compiled with -mlong-call ?
-mike


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread James Bottomley
On Tue, 2009-06-02 at 12:30 -0700, Tim Bird wrote:
 Josh Boyer wrote:
  2) Encouraging upstream participation of Embedded distros
  
  Things like Moblin and Android are getting a lot of press these days, but
  embedded distros have been around for a while.  Are we getting good
  participation from these vendors?  Is there something we could be doing to
  encourage such participation?  Has CELF helped with this at all?  etc
 
 CELF tries, but the progress is exceedingly slow.  Recently
 we've been more focused on contracting specific feature work.
 (E.g. Squashfs mainlining).
 
 James Bottomley wrote:
  Even for someone as inattentive as me, the general problems of getting
  embedded people to agree the sky is blue did impinge on the peripheral
  consciousness.  Thus: If you can come up with such a process in a timely
  fashion then fine ... if not, we'll do the topic based one suggested by
  the PC.
 
 With regard to a process to determine representatives, I'm not
 sure we need one.  Based on participation and inclusion in
 MAINTAINERS, either Matt Mackall or David Woodhouse can
 represent most embedded issues just fine.  And I can say that
 officially on behalf of CELF and its members, which would
 account for a large fraction of the overall embedded community.
 
 With regard to topics, do topics drive attendee invitations,
 or vice-versa?
 
 Here's my own issue list:
 
 tracing - already well (over?) represented
 
 bloat - tracing will help identify performance bloat.
 As for size bloat, a smaller kernel is always desirable, but we
 are seeing signs that Moore's law is catching up and making
 this less an issue (for the kernel - apps still have big
 problems here.)
 
 power management - Use cases for products that spend most
 of their time off (even while appearing to be running) are
 of interest. I don't know what the status of 'wakelock-like'
 solutions is.
 
 fast boot - kernel is almost done? (!!!)  The new target for
 kernel boot time is 300 milliseconds.  Once there, almost
 all problems are then user space issues.  It is interesting
 how much of a differentiator fast boot became for Linux
 in netbooks and dual-boot configurations, in just the last
 2 years - which just shows that sometimes it pays off to
 optimize something. ;-)

OK, if that's what you all want, that's what we can do ... however, it
would likely be the same people discussing the same issues.

 participation - talking about this is like beating a dead horse
 (for me at least).  I've been working on this for 5 years now,
 making baby steps forward.  The issues are, by now, well understood
 (I hope).  I'm not sure what a KS discussion is going to do
 to drive issues here.

This is what made us suggest the presentation driven approach.  We can
send out people who understand the kernel development process,
anointed as embedded maintainers.  However, looking at the arch
directory, you have a ton of new kids on the block.  We wondered if,
perhaps, rather than having seasoned kernel developers reach out to the
embedded community, we might try giving the embedded community the
opportunity to reach out to us.  The topic of flattened device tree
looks interesting to me (perhaps because I'm a hardened device driver
person and things like that always look interesting to me) ... if we can
get a few more like that out of the woodwork, this approach might end up
being successful.

James




Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread Bill Gatliff

James Bottomley wrote:

On Tue, 2009-06-02 at 12:30 -0700, Tim Bird wrote:
  

With regard to a process to determine representatives, I'm not
sure we need one.  Based on participation and inclusion in
MAINTAINERS, either Matt Mackall or David Woodhouse can
represent most embedded issues just fine.  And I can say that
officially on behalf of CELF and its members, which would
account for a large fraction of the overall embedded community.



What about others like myself who would like to get more involved, but 
don't show up in the MAINTAINERS list?  I might be interested in lending 
a hand in helping to represent the embedded issues...



b.g.

--
Bill Gatliff
b...@billgatliff.com



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread Robert Schwebel
On Tue, Jun 02, 2009 at 03:37:44PM -0500, James Bottomley wrote:
 The topic of flattened device tree looks interesting to me (perhaps
 because I'm a hardened device driver person and things like that
 always look interesting to me) ...

The recent oftree activities look indeed very promising; the different
boot-information-passing methods, mainly in powerpc and arm land, are
IMHO an important field where a generic kernel infrastructure would
make sense for the embedded people. Given that oftree has created
robustness and compatibility problems in the past but seems to move in
a good direction recently, feedback from core kernel developers would
certainly be a good thing.

 if we can get a few more like that out of the woodwork, this approach
 might end up being successful.

Could flickerfree-bootsplash be a topic? Or is that completely pushed
into the userspace these fastboot days?

rsc
-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread Eric W. Biederman
David VomLehn dvoml...@cisco.com writes:

 On Tue, Jun 02, 2009 at 03:37:44PM -0500, James Bottomley wrote:
 ...
 This is what made us suggest the presentation driven approach.  We can
 send out people who understand the kernel development process,
 anointed as embedded maintainers.  However, looking at the arch
 directory, you have a ton of new kids on the block.  We wondered if,
 perhaps, rather than having seasoned kernel developers reach out to the
 embedded community, we might try giving the embedded community the
 opportunity to reach out to us.  The topic of flattened device tree
 looks interesting to me (perhaps because I'm a hardened device driver
 person and things like that always look interesting to me) ... if we can
 get a few more like that out of the woodwork, this approach might end up
 being successful.

 Failure reporting is the one area where embedded applications have
 little overlap with other Linux application domains. The cable settop box
 environment has:
 o Limited persistent storage
 o Low or no upstream bandwidth
 o Little access to hundreds of thousands of devices in the field

 When a kernel panics in the field, we have no place to put a core dump
 and, if we had a place to put it, it would take way too long to upload
 it when the box comes back up. And most people just don't understand when
 you knock at their door at midnight, JTAG probe in hand.

 We hook in a panic notifier and have it generate a really rich report.
 At present, this report stays in memory until we reboot and send it
 upstream (or write it to flash), but we could really write it to any
 device with which we can use polled I/O (interrupts being questionable
 at this point). Generic interfaces to support this would be useful.

 Many embedded devices have highly integrated stacks, so failures in user
 space lead to device reboots, and you want to leverage much of the same
 ability to store and send failure reports.

 Our failure report includes things you'd expect as well as various pieces
 of history, such as:
 o IRQs
 o softirq dispatches (including max times)
 o selected /proc info, e.g. /proc/meminfo

 We also report info on the current thread, like backtracing and
 /proc/pid/maps, though I'm not sure it's as useful as it might be.

 Though I'm working on pushing this stuff out, other things that might be
 helpful are:
 o If you get to panic() by way of die(), you've lost the registers passed to
   die(). We save a pointer off, but it's really a kludge.
 o The implementation of die() varies from platform to platform and isn't even
   called die() everywhere.
 o It is truly nasty trying to get /proc information when you are in a panic
   situation--any semaphores being held are not going to be released, so you
   have to duplicate a lot of the code, minus the semaphores. Pretty gross
   and there is no way our implementation will be acceptable.
 o Increased reporting on what's happening in user/kernel space interaction.
   For example, a signal sent in good faith might kill a buggy process. It
   would be helpful to log signals that result in a process' death.
 o Then there is more speculative stuff. For example, your caches would
   have a copy of the most recently accessed code and data.  If your
   processor supports dumping cache, it might help determine what went wrong.

Have you looked at doing this with the kexec on panic infrastructure?

Tools like makedumpfile now have enough information to dump this.

If you are space constrained, a standalone executable could be used
instead of a linux kernel to marshal the information into your buffer.
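
As a sketch of what marshalling into a preallocated buffer could look
like, here is a small userspace C model. The struct, the 256-byte size
and report_append() are all hypothetical; a real post-panic environment
would supply its own formatting routine and a buffer at an agreed
location.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define REPORT_SIZE 256 /* hypothetical size of the reserved region */

struct panic_report {
	char buf[REPORT_SIZE];
	size_t len; /* bytes used, excluding the terminating NUL */
};

/*
 * Append formatted text to the fixed buffer. Returns the number of
 * bytes actually stored; output that does not fit is truncated rather
 * than written past the end of the buffer.
 */
static inline size_t report_append(struct panic_report *r,
				   const char *fmt, ...)
{
	size_t room = sizeof(r->buf) - r->len; /* includes the NUL */
	va_list ap;
	int n;

	if (room <= 1)
		return 0; /* buffer is full */

	va_start(ap, fmt);
	n = vsnprintf(r->buf + r->len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	if ((size_t)n >= room)
		n = (int)(room - 1); /* output was truncated */
	r->len += (size_t)n;
	return (size_t)n;
}
```

Keeping the report in one bounded, NUL-terminated region means whatever
runs after the crash only needs the buffer's address and length to send
it upstream or write it to flash.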

Eric


Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread Steven Rostedt

On Tue, 2 Jun 2009, David VomLehn wrote:
 On Tue, Jun 02, 2009 at 03:37:44PM -0500, James Bottomley wrote:
 Our failure report includes things you'd expect as well as various pieces
 of history, such as:
 o IRQs
 o softirq dispatches (including max times)
 o selected /proc info, e.g. /proc/meminfo
 
 We also report info on the current thread, like backtracing and
 /proc/pid/maps, though I'm not sure it's as useful as it might be.
 
 Though I'm working on pushing this stuff out, other things that might be
 helpful are:
 o If you get to panic() by way of die(), you've lost the registers passed to
   die(). We save a pointer off, but it's really a kludge.
 o The implementation of die() varies from platform to platform and isn't even
   called die() everywhere.
 o It is truly nasty trying to get /proc information when you are in a panic
   situation--any semaphores being held are not going to be released, so you
   have to duplicate a lot of the code, minus the semaphores. Pretty gross
   and there is no way our implementation will be acceptable.
 o Increased reporting on what's happening in user/kernel space interaction.
   For example, a signal sent in good faith might kill a buggy process. It
   would be helpful to log signals that result in a process' death.
 o Then there is more speculative stuff. For example, your caches would
   have a copy of the most recently accessed code and data.  If your
   processor supports dumping cache, it might help determine what went wrong.

If your system is hooked up to a serial console, another helpful thing to 
do is to pass in ftrace_dump_on_oops in the command line and also keep 
the event tracer running, where you enable just the events you are 
interested in. Then if the system crashes, it will dump out the ftrace 
buffer to the console, which could be logged by a serial console and then 
you at least have a trace of the events that led up to the crash.

The event tracer adds little overhead while active and could be 
enabled in a production system.

-- Steve



Re: [Ksummit-2009-discuss] Representing Embedded Architectures at the Kernel Summit

2009-06-02 Thread Greg KH
On Tue, Jun 02, 2009 at 11:34:52PM +0200, Robert Schwebel wrote:
 Could flickerfree-bootsplash be a topic? Or is that completely pushed
 into the userspace these fastboot days?

We have that working today, no in-kernel work needed other than the
already-present KMS stuff.  See the recent Moblin images for proof of
this.

thanks,

greg k-h