Re: [PATCH V2 10/16] Squashfs: cache operations

2008-10-31 Thread Jörn Engel
On Fri, 31 October 2008 04:43:46 +, Phillip Lougher wrote:
 
 Simplicity and speed is extremely important.  The 
 squashfs_metadata_read() wrapper around the cache is designed to step 
 through the metadata a structure at a time (20 or so bytes), updating 
 the read position in the metadata each call, with more metadata cache 
 blocks being read and decompressed as necessary.  The common case where 
 the metadata is already in the cache (because we're stepping through it 
 20 or so bytes at a time), is designed to be extremely fast - a spinlock 
 and array search only.  I recently optimised the cache to use spinlocks 
 rather than mutexes and reduced the lock holding time (necessary to move 
 to spinlocks), and this resulted in a 20%+ speed improvement in reading 
 squashfs filesystems.

For the page cache, you can use read_cache_page() and
page_cache_release().  This does not even take a spinlock, hence
multiple readers can work in parallel.  The radix tree walk will take a
similar time to your array search.

You are aware that multiple cpus will play pingpong with your spinlock?

 Given the above using an address space in the page cache will result in 
 greater complexity, more memory overhead, and much slower operation. 
 There's a number of reasons for this.
 
 1. The amount and life-span of the data stored in the page cache is 
 outside of Squashfs' control.  As explained above it only makes sense to 
 temporarily cache the last couple of metadata and fragment blocks. 
 Using the page cache (if a 'large' address space is used) for these 
 keeps more of them around for longer than necessary, and will 
 potentially cause more worthy datablocks to be flushed on memory pressure.

I personally find it hard to guess access patterns.  If I constructed a
workload just large enough that your cache is slightly too small, it
will start thrashing.  Hit rates will go close to zero.  And unless I
missed something, such a workload can be constructed and may even be
used in real life.  With growing numbers of cores, this becomes
increasingly likely.

In other cases, an idle squashfs will still hold the 64k of unused cache
for ransom.  By giving up control, you allow your cache to grow and
shrink as desired and can avoid both cases.

 2. The address space will be caching uncompressed data, the squashfs 
 references to this data are the compressed locations within the 
 filesystem.  There doesn't exist a one-to-one linear mapping from 
 compressed location to decompressed location in the address space.  This 
 means a lookup table still needs to be used to store the mapping from 
 compressed location to decompressed location in the address space.  Now 
 this lookup table (however implemented) is itself at least as complex as 
 my current cache implementation.

You are currently using physical offset of the medium to address blocks
in the cache, which are 64bit.  Page cache may use 32bit to address
pages.  And since blocks can be larger than pages, you effectively need
some bits extra.  This is indeed a problem.

 3. Once the locations of the decompressed pages in the address space 
 have been found, they'll need to be looked up in the page cache, and 
 this has to be done for every 4K page.  With the default fragment size 
 of 128 KiB this means 32 separate lookups.  Somewhat slower than one 
 spinlock and array search per 128 KiB block in the squashfs cache 
 implementation.

Above you claimed the complete cache to be just 64k in size.  How can
you access 128k blocks that way?  One of the numbers appears wrong.

Ignoring that, you don't need to take either the spinlock or do a lookup
in a fast path.  If you currently have this code:

for (some condition) {
err = squashfs_read_metadata(address, ...);
}

You can transform it to:

void *handle = squashfs_get_metadata(address, ...);
for (some condition) {
err = squashfs_read_metadata(handle, address, ...);
}
squashfs_put_metadata(handle);

With the handle you can keep a reference count to your cached object.
squashfs_read_metadata() only has to call squashfs_cache_get() or
squashfs_cache_put() when moving across object boundaries.  In the
common case it simply returns data from the object referenced through
the handle.

This might be a worthwhile optimization independently of whether you use
the page cache or not.

 Comments, especially those of the form you've got this completely 
 wrong, and you can use the page cache like this, which will be simpler 
 and faster than your current implementation welcome :)  I'm not adverse 
  to using the page cache, but I can't see how it will be simpler or 
 faster than the current implementation.

Only one of your problems seems to be real.  Not sure if or how we can
solve that one, though.

Jörn

-- 
Security vulnerabilities are here to stay.
-- Scott Culp, Manager of the Microsoft Security Response Center, 2001
--
To unsubscribe from this list: 

Re: [PATCH V2 10/16] Squashfs: cache operations

2008-10-31 Thread Phillip Lougher

Jörn Engel wrote:



Only one of your problems seems to be real.  Not sure if or how we can
solve that one, though.



Sorry don't agree.  But I'm not going to argue this like a couple of old 
maids.


I'll keep what I currently do unless Andrew Morton or someone else says 
it's not getting mainlined unless it's changed.


Phillip

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] phylib: make mdio-gpio work without OF

2008-10-31 Thread Grant Likely
On Fri, Oct 31, 2008 at 10:49 AM, Paulius Zaleckas
[EMAIL PROTECTED] wrote:
 make mdio-gpio work with non OpenFirmware gpio implementation.


Looking good, but it has too many #ifdef blocks for my taste.  In
particular, if the driver is refactored a bit then all the OF stuff
can be contained within a single ifdef block, as can all the non-OF
stuff.  Comments below...

 +#ifdef CONFIG_OF_GPIO
  static int __devinit mdio_ofgpio_bitbang_init(struct mii_bus *bus,
  struct device_node *np)
  {
 @@ -87,34 +102,60 @@ static int __devinit mdio_ofgpio_bitbang_init(struct 
 mii_bus *bus,
snprintf(bus-id, MII_BUS_ID_SIZE, %x, bitbang-mdc);
return 0;
  }
 -
 -static void __devinit add_phy(struct mii_bus *bus, struct device_node *np)
 +#else
 +static void __devinit mdio_gpio_bitbang_init(struct mii_bus *bus,
 +struct mdio_gpio_platform_data 
 *pdata,
 +   int bus_id)
  {
 -   const u32 *data;
 -   int len, id, irq;
 +   struct mdio_gpio_info *bitbang = bus-priv;
 +
 +   bitbang-mdc = pdata-mdc;
 +   bitbang-mdio = pdata-mdio;
 +
 +   snprintf(bus-id, MII_BUS_ID_SIZE, phy%i, bus_id);
 +}
 +#endif

mdio_ofgpio_bitbang_init() is such short function and it is only
called once inside the probe() function.  Rather than duplicate it, it
can probably be moved inside the OF probe function and do the same
thing for the non-OF probe().


 -   data = of_get_property(np, reg, len);
 -   if (!data || len != 4)
 +static void __devinit add_phy(struct mii_bus *bus, unsigned int phy_addr,
 + int phy_irq)
 +{
 +   if (phy_addr = PHY_MAX_ADDR) {
 +   dev_err(bus-parent,
 +   Failed to add phy with invalid address: 0x%x,
 +   phy_addr);
return;
 +   }

 -   id = *data;
 -   bus-phy_mask = ~(1  id);
 +   bus-phy_mask = ~(1  phy_addr);

 -   irq = of_irq_to_resource(np, 0, NULL);
 -   if (irq != NO_IRQ)
 -   bus-irq[id] = irq;
 +   if (phy_irq != NO_IRQ)
 +   bus-irq[phy_addr] = phy_irq;
  }

I like the refactoring of add_phy


 +#ifdef CONFIG_OF_GPIO
  static int __devinit mdio_ofgpio_probe(struct of_device *ofdev,
 const struct of_device_id *match)
  {
struct device_node *np = NULL;
 +   struct device *dev = ofdev-dev;
 +#else
 +static int __devinit mdio_gpio_probe(struct platform_device *pdev)
 +{
 +   struct mdio_gpio_platform_data *pdata;
 +   struct device *dev = pdev-dev;
 +#endif

Instead of doing multiple #ifdef sections throughout the probe
function, use one #ifdef block for the OF stuff and another for all
the non-OF stuff.  You can factor out any non-trivial common code
blocks into shared helper functions.

Otherwise, looking good.

Thanks,
g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] phylib: make mdio-gpio work without OF

2008-10-31 Thread Mike Frysinger
On Fri, Oct 31, 2008 at 12:49, Paulius Zaleckas wrote:
 +#ifndef NO_IRQ
 +#define NO_IRQ  0
 +#endif

discussions elsewhere indicate this should die.
-if (irq != NO_IRQ)
+if (irq)
-mike
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html