Re: [PATCH V2 10/16] Squashfs: cache operations
On Fri, 31 October 2008 04:43:46 +, Phillip Lougher wrote: Simplicity and speed is extremely important. The squashfs_metadata_read() wrapper around the cache is designed to step through the metadata a structure at a time (20 or so bytes), updating the read position in the metadata each call, with more metadata cache blocks being read and decompressed as necessary. The common case where the metadata is already in the cache (because we're stepping through it 20 or so bytes at a time), is designed to be extremely fast - a spinlock and array search only. I recently optimised the cache to use spinlocks rather than mutexes and reduced the lock holding time (necessary to move to spinlocks), and this resulted in a 20%+ speed improvement in reading squashfs filesystems. For the page cache, you can use read_cache_page() and page_cache_release(). This does not even take a spinlock, hence multiple readers can work in parallel. The radix tree walk will take a similar time to your array search. You are aware that multiple cpus will play pingpong with your spinlock? Given the above using an address space in the page cache will result in greater complexity, more memory overhead, and much slower operation. There's a number of reasons for this. 1. The amount and life-span of the data stored in the page cache is outside of Squashfs' control. As explained above it only makes sense to temporarily cache the last couple of metadata and fragment blocks. Using the page cache (if a 'large' address space is used) for these keeps more of them around for longer than necessary, and will potentially cause more worthy datablocks to be flushed on memory pressure. I personally find it hard to guess access patterns. If I constructed a workload just large enough that your cache is slightly too small, it will start thrashing. Hit rates will go close to zero. And unless I missed something, such a workload can be constructed and may even be used in real life. With growing numbers of cores, this becomes increasingly likely. In other cases, an idle squashfs will still hold the 64k of unused cache for ransom. By giving up control, you allow your cache to grow and shrink as desired and can avoid both cases. 2. The address space will be caching uncompressed data, the squashfs references to this data are the compressed locations within the filesystem. There doesn't exist a one-to-one linear mapping from compressed location to decompressed location in the address space. This means a lookup table still needs to be used to store the mapping from compressed location to decompressed location in the address space. Now this lookup table (however implemented) is itself at least as complex as my current cache implementation. You are currently using physical offset of the medium to address blocks in the cache, which are 64bit. Page cache may use 32bit to address pages. And since blocks can be larger than pages, you effectively need some bits extra. This is indeed a problem. 3. Once the locations of the decompressed pages in the address space have been found, they'll need to be looked up in the page cache, and this has to be done for every 4K page. With the default fragment size of 128 KiB this means 32 separate lookups. Somewhat slower than one spinlock and array search per 128 KiB block in the squashfs cache implementation. Above you claimed the complete cache to be just 64k in size. How can you access 128k blocks that way? One of the numbers appears wrong. Ignoring that, you don't need to take either the spinlock or do a lookup in a fast path. If you currently have this code: for (some condition) { err = squashfs_read_metadata(address, ...); } You can transform it to: void *handle = squashfs_get_metadata(address, ...); for (some condition) { err = squashfs_read_metadata(handle, address, ...); } squashfs_put_metadata(handle); With the handle you can keep a reference count to your cached object. squashfs_read_metadata() only has to call squashfs_cache_get() or squashfs_cache_put() when moving across object boundaries. In the common case it simply returns data from the object referenced through the handle. This might be a worthwhile optimization independently of whether you use the page cache or not. Comments, especially those of the form you've got this completely wrong, and you can use the page cache like this, which will be simpler and faster than your current implementation welcome :) I'm not adverse to using the page cache, but I can't see how it will be simpler or faster than the current implementation. Only one of your problems seems to be real. Not sure if or how we can solve that one, though. Jörn -- Security vulnerabilities are here to stay. -- Scott Culp, Manager of the Microsoft Security Response Center, 2001 -- To unsubscribe from this list:
Re: [PATCH V2 10/16] Squashfs: cache operations
Jörn Engel wrote: Only one of your problems seems to be real. Not sure if or how we can solve that one, though. Sorry don't agree. But I'm not going to argue this like a couple of old maids. I'll keep what I currently do unless Andrew Morton or someone else says it's not getting mainlined unless it's changed. Phillip -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] phylib: make mdio-gpio work without OF
On Fri, Oct 31, 2008 at 10:49 AM, Paulius Zaleckas [EMAIL PROTECTED] wrote: make mdio-gpio work with non OpenFirmware gpio implementation. Looking good, but it has too many #ifdef blocks for my taste. In particular, if the driver is refactored a bit then all the OF stuff can be contained within a single ifdef block, as can all the non-OF stuff. Comments below... +#ifdef CONFIG_OF_GPIO static int __devinit mdio_ofgpio_bitbang_init(struct mii_bus *bus, struct device_node *np) { @@ -87,34 +102,60 @@ static int __devinit mdio_ofgpio_bitbang_init(struct mii_bus *bus, snprintf(bus-id, MII_BUS_ID_SIZE, %x, bitbang-mdc); return 0; } - -static void __devinit add_phy(struct mii_bus *bus, struct device_node *np) +#else +static void __devinit mdio_gpio_bitbang_init(struct mii_bus *bus, +struct mdio_gpio_platform_data *pdata, + int bus_id) { - const u32 *data; - int len, id, irq; + struct mdio_gpio_info *bitbang = bus-priv; + + bitbang-mdc = pdata-mdc; + bitbang-mdio = pdata-mdio; + + snprintf(bus-id, MII_BUS_ID_SIZE, phy%i, bus_id); +} +#endif mdio_ofgpio_bitbang_init() is such short function and it is only called once inside the probe() function. Rather than duplicate it, it can probably be moved inside the OF probe function and do the same thing for the non-OF probe(). - data = of_get_property(np, reg, len); - if (!data || len != 4) +static void __devinit add_phy(struct mii_bus *bus, unsigned int phy_addr, + int phy_irq) +{ + if (phy_addr = PHY_MAX_ADDR) { + dev_err(bus-parent, + Failed to add phy with invalid address: 0x%x, + phy_addr); return; + } - id = *data; - bus-phy_mask = ~(1 id); + bus-phy_mask = ~(1 phy_addr); - irq = of_irq_to_resource(np, 0, NULL); - if (irq != NO_IRQ) - bus-irq[id] = irq; + if (phy_irq != NO_IRQ) + bus-irq[phy_addr] = phy_irq; } I like the refactoring of add_phy +#ifdef CONFIG_OF_GPIO static int __devinit mdio_ofgpio_probe(struct of_device *ofdev, const struct of_device_id *match) { struct device_node *np = NULL; + struct device *dev = ofdev-dev; +#else +static int __devinit mdio_gpio_probe(struct platform_device *pdev) +{ + struct mdio_gpio_platform_data *pdata; + struct device *dev = pdev-dev; +#endif Instead of doing multiple #ifdef sections throughout the probe function, use one #ifdef block for the OF stuff and another for all the non-OF stuff. You can factor out any non-trivial common code blocks into shared helper functions. Otherwise, looking good. Thanks, g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] phylib: make mdio-gpio work without OF
On Fri, Oct 31, 2008 at 12:49, Paulius Zaleckas wrote: +#ifndef NO_IRQ +#define NO_IRQ 0 +#endif discussions elsewhere indicate this should die. -if (irq != NO_IRQ) +if (irq) -mike -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html