Re: Invitation and RFC: Linux Plumbers Device Tree track proposed
On Tuesday 14 April 2015 10:36:15 Rob Herring wrote:

4) Identifying additional people who should attend the device tree track.

Arnd Bergmann
Matt Porter
Jon Loeliger
Gaurav Minocha

Sorry, I won't be there. I should have replied earlier, but I'll be on parental leave at the time.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Handling of modular boards
On Friday 04 May 2012, Mark Brown wrote:

Quite a few reference platforms (including Wolfson ones, which is why I'm particularly interested) use replaceable modules to allow configuration changes. Since we can often identify the configuration at runtime we should ideally do that, but currently there's no infrastructure to help with that. Generally this seems to be done in arch code for the machine, but this doesn't scale when even the CPU might change and isn't terribly device tree compatible either. For reference the code for current Wolfson plugin modules is in arch/arm/mach-s3c64xx/mach-crag6410-module.c.

Hi Mark,

Thanks for getting the discussion started. I've seen the same issue come up for arch/arm/mach-ux500/board-mop500*uib.c and for the beaglebone. I'm sure there are many more, but we should make sure that every one of these can live with whatever we come up with.

The most obvious current fit here is the MFD subsystem but it feels like we need some slightly different infrastructure to what MFD currently provides. MFD is really set up to handle platform devices with a core and linear ranges of resources fanning out from that core since they're really oriented around chips. In contrast these boards are more about remapping random collections of potentially unrelated resources and instantiating devices on all sorts of buses and share more with board files. I'm just starting to put some stuff together for this so I was wondering if anyone had been thinking about this and had any bright ideas for how to handle it, and also if people think that MFD is a good fit for this or if we should split the silicon MFDs from these PCBs.

One idea that I've heard before is to put device tree fragments into the kernel and dynamically add them to the device tree that was passed by the boot loader whenever we detect the presence of a specific device. This obviously means it works only for boards using DT for booting, but it allows us to use some infrastructure that we already have.

Another idea was to put all the possible extensions into the device tree for a given board and disable them by default, making it the responsibility of the boot loader to enable the one that is actually being used. This has serious scalability problems when there are many possible extensions and also relies more on the boot loader than I would like.

An intermediate solution that I really like is the ability to store device tree fragments on the extension boards themselves, but that can only work for new designs and causes problems when that information is not actually correct.

Arnd
Re: Handling of modular boards
On Friday 04 May 2012, Wolfgang Denk wrote:

In message 201205041934.08830.a...@arndb.de you wrote:

One idea that I've heard before is to put device tree fragments into the kernel and dynamically add them to the device tree that was passed by the boot loader whenever we detect the presence of a specific device. This obviously means it works only for boards using DT for booting, but it allows us to use some infrastructure that we already have.

Another idea was to put all the possible extensions into the device tree for a given board and disable them by default, making it the responsibility of the boot loader to enable the one that is actually being used. This has serious scalability problems when there are many possible extensions and also relies more on the boot loader than I would like.

On the other hand, some of the issues we're trying to solve here for the kernel are also present in the boot loader, so this needs to do this anyway - whether by inserting new or modifying (enabling or disabling) existing properties in the DT is not really relevant here.

I haven't seen a case where the add-on board is actually required for booting. What examples are you thinking of?

Arnd
Re: Handling of modular boards
On Friday 04 May 2012, Wolfgang Denk wrote:

There are systems (and I bet it will be a growing number) where U-Boot itself uses the DT for configuration. Also, there are functions that are needed both by the boot loader and the kernel - for example to display a splash screen the boot loader needs to initialize the display, so it must be able to detect which type of LCD is attached (resolution, color-depth, orientation) - the device tree comes in very handy here. Why should Linux re-do all such things?

Sure, there are a lot of things that the boot loader can use from the device tree, but I'm not sure if the LCD panel connection fits into the same category as the devices that Mark was thinking of.

Anyway, display controllers are definitely something that needs to be handled in some way, which may or may not be the same way we handle more complex collections of arbitrary devices.

Arnd
Re: RFC: android logger feedback request
On Thursday 22 December 2011, NeilBrown wrote:

If you created a 'logbuf' filesystem that used libfs to provide a single directory in which privileged processes could create files then you wouldn't need the kernel to know the allowed logs: radio, events, main, system. The size could be set by ftruncate() (by a privileged user again) rather than being hardcoded.

You would define 'read' and 'write' much like you currently do to create a list of datagrams in a circular buffer and replace the ioctls by more standard interfaces:

LOGGER_GET_LOG_BUF_SIZE would use 'stat' and the st_blocks field
LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
LOGGER_FLUSH_LOG could use ftruncate

The result would be much the same amount of code, but an interface which has fewer details hard-coded and is generally more versatile and accessible.

I like the idea and was going to suggest something very similar, but I wonder if we could take the approach even further:

* Remove all kernel code for this and use a user space library together with tmpfs
* Prepopulate the tmpfs at boot time with all the log buffers in the right size, and set the maximum file system size so that they cannot grow further.
* Have minimal formatting in the log buffer: A few bytes header (ring buffer start and end)
* Mandate that user space must use mmap and atomic operations to reserve space in the log and write to the files.
* Provide a tool to get the log data out of the buffer again in a race-free way.

Since any program that is allowed to write to the buffer can overwrite all existing information in it anyway, I think we don't actually need any kernel help in maintaining consistency of the contents either -- the reader will simply discard any data.

The main thing we would not be able to guarantee without kernel help is proving the origin of individual messages, but I'm not sure if that is a design goal.

Arnd
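For illustration, the "mmap and atomic operations to reserve space" item in the list above could be sketched roughly as follows. This is only a sketch under assumptions: the `struct logbuf` layout and the names `logbuf_reserve` and `logbuf_write` are hypothetical, and a plain in-memory struct stands in for the mmap()ed tmpfs file. It is not the Android logger interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LOGBUF_SIZE 64u /* hypothetical ring size; must be a power of two */

/* Hypothetical layout: a small header followed by the ring data.
 * In the proposed scheme this would live in an mmap()ed tmpfs file. */
struct logbuf {
	_Atomic uint32_t head;     /* total bytes ever reserved by writers */
	uint8_t data[LOGBUF_SIZE]; /* ring storage, indexed modulo LOGBUF_SIZE */
};

/* Atomically reserve len bytes and return the stream offset where they
 * start. Concurrent writers get disjoint ranges because the fetch-and-add
 * is a single atomic operation. */
static uint32_t logbuf_reserve(struct logbuf *lb, uint32_t len)
{
	assert(len <= LOGBUF_SIZE);
	return atomic_fetch_add(&lb->head, len);
}

/* Copy a message into its reserved range, wrapping at the end of the ring. */
static uint32_t logbuf_write(struct logbuf *lb, const void *msg, uint32_t len)
{
	uint32_t start = logbuf_reserve(lb, len);
	const uint8_t *src = msg;

	for (uint32_t i = 0; i < len; i++)
		lb->data[(start + i) & (LOGBUF_SIZE - 1)] = src[i];
	return start;
}
```

The sketch covers only the writer side. A reader would still need some way to detect entries that were overwritten while it was copying them, which is the part the race-free extraction tool would have to solve.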
Re: dma_unmap_single() lacking cache sync on some archs?
On Tuesday 27 September 2011 09:55:02 Håvard Skinnemoen wrote:

On Tue, Sep 27, 2011 at 5:13 AM, Arvid Brodin arvid.bro...@enea.com wrote:

[Resending with CC to affected parties]

Hi,

I would expect cache synchronization for DMA_TO_DEVICE and DMA_BIDIRECTIONAL when dma_map_single() is called, and for DMA_FROM_DEVICE and DMA_BIDIRECTIONAL when dma_unmap_single() is called. However, on some architectures (at least avr32, blackfin, ...), cache synchronization only happens when dma_map_single() is called (and then irrespective of DMA direction). dma_unmap_single() is a no-op for these archs. See e.g. http://lxr.linux.no/#linux+v3.0.4/arch/avr32/include/asm/dma-mapping.h#L117

Isn't this a bug?

I don't think so. What do other architectures do?

We always need to sync before the transfer because if there is dirty data in the cache, it might get written to RAM during the transfer, which would be bad. Then, since the relevant cache lines are already clean and invalid, and the CPU is not allowed to access the buffer during the transfer, there's no need to sync again when the transfer is complete.

On some architectures, e.g. ARMv6 and higher, a speculative prefetch might cause cache lines to be read again while an inbound DMA is on its way. On those architectures you need to discard cache lines before reading from the buffer. In fact, even for DMA_FROM_DEVICE you need to flush or invalidate the cache for the buffer before the transfer and invalidate the cache again after the transfer.

Most architectures however do not require this.

Arnd
Re: [PATCHv3] UBI: new module ubiblk: block layer on top of UBI
On Friday 09 September 2011, Artem Bityutskiy wrote:

On Thu, 2011-09-08 at 17:26 +0200, Arnd Bergmann wrote:

On Tuesday 06 September 2011, Artem Bityutskiy wrote:

Not sure about the bus approach - David, could you take a look at it please? If we can handle errors there - then we could indeed re-use the UBI control device. We could even re-use the ioctl data structures for UBI volumes creation/removal - we have plenty of space there reserved for future extensions.

I would generally recommend using new ioctl commands. ioctl numbers are cheap, but complexity in data structures is not, because every user who wants to deal with the data structures has to understand them. Also, changing the ABI is always tricky since you have to provide backward and forward compatibility with existing kernels and with existing user space.

Hmm, what do we do if ubiblk module is not loaded, and UBI would have to return an error (because the block device cannot be created), how will UBI know that ubiblk is not there? Any direct call to ubiblk from UBI would be a direct dependency and would require ubiblk to be always loaded, which is bad.

No, the idea of this approach is that the main ubi driver creates a device, which can always succeed. It's just that there won't be a block device node created, because that is part of what the ubiblk driver does.

Compare this to how scsi works: A scsi host driver scans the host controller and adds scsi devices internal to the kernel, each of which has a specific type (disk, tape, ...). If the scsi disk driver is loaded, it will create a blockdev for each disk device. It doesn't matter in which order the drivers are loaded though.

In case of ubiblk, it's similar, except that there is no way for the ubi layer to know if some partition should be a block device or not, so it relies on user space to tell it.

Well, actually, you /could/ encode this somewhere so that the main ubi layer creates different kinds of devices based on what it finds: a ubiblk_device when it finds a partition that was created as a block device or gluebi_device for gluebi or a ubifs volume.

IOW, we need a blocking mechanism to call the upper layer's function (ubiblk) from the lower layer (UBI) which can return an error, and which allows checking if a ubiblk exists at all. Do we have such a mechanism? Actually the fact of invoking upper layers from lower makes me worry.

Yes, you should not do that.

Arnd
Re: [PATCHv3] UBI: new module ubiblk: block layer on top of UBI
On Tuesday 06 September 2011, Artem Bityutskiy wrote:

Not sure about the bus approach - David, could you take a look at it please? If we can handle errors there - then we could indeed re-use the UBI control device. We could even re-use the ioctl data structures for UBI volumes creation/removal - we have plenty of space there reserved for future extensions.

I would generally recommend using new ioctl commands. ioctl numbers are cheap, but complexity in data structures is not, because every user who wants to deal with the data structures has to understand them. Also, changing the ABI is always tricky since you have to provide backward and forward compatibility with existing kernels and with existing user space.

Arnd
Re: [PATCHv3] UBI: new module ubiblk: block layer on top of UBI
On Thursday 25 August 2011, Artem Bityutskiy wrote:

On Wed, 2011-08-24 at 18:23 +0200, Arnd Bergmann wrote:

That should be fine, yes. I would probably put them into the same header file though if they are in the same number space even when you use them on distinct devices.

It does feel a little clumsy to have yet another character device to manage the block devices though. What do you think about one of these alternative approaches:

* When the ubi block device driver gets loaded, create one block device per volume and let the user deal with permissions for the devices instead of having to first create them as well.

I think this is wasteful. Why should I have block devices which I do not need? If I have 4 UBI volumes, and need only one ubiblk, why should I waste my resources for 3 more of them (e.g., I do not want to waste memory for struct inode for each sysfs entry which these useless block devices will add). Also, will this mean 3 more block devices registered? I think it is much uglier to have 3 dummy block devices and confuse users than have one nice control character device. For the sake of not having a separate control chardev?

The cost of a block device node in the kernel is rather low. Nowadays, sysfs does not even permanently use inodes for entries, it has a much more compact internal representation IIRC.

The main advantage of this approach is not having to set up the block device at all, it would just be there, which e.g. makes it possible to put a root file system on it or do something else without requiring a user space tool to issue an ioctl. Evidently you can do everything you need even with that user space tool, but IMHO the complexity of doing that is way bigger than just creating the block devices right away.

* Use the existing UBI control device for the block devices as well and just add two more ioctls to create the devices. You can add a logical bus_type for this so that the ubi block driver gets automatically loaded and matched with the device when one is created using the control device.

This sounds better IMHO, but I am still not sure that adding another dummy bus and exposing it in sysfs and more complexity in the ubiblk code is more elegant and less wasteful than just creating a separate chardev...

It's not a dummy bus, in this approach it would be the bus that gets used by all ubiblk devices, which is a very common concept by itself. It's more like the classic understanding of a 'device class' that Greg wants to see get replaced by bus_types in the kernel.

Arnd
Re: [PATCHv3] UBI: new module ubiblk: block layer on top of UBI
On Monday 22 August 2011, Artem Bityutskiy wrote:

On Wed, 2011-08-17 at 15:17 +0200, david.wag...@free-electrons.com wrote:

Questions:

I wasn't sure what magic ioctl number to use, so I settled to use the same one as a part of UBI: 'O', which was so far only used by UBI but on a higher range and leaving some room for UBI to add ioctls (for now, it uses 'O'/0x00-0x06 and ubiblk uses 'O'/0x10-0x11). Is it ok or should ubiblk use a different number/range?

I think this is OK to share them between UBI and ubiblk, as long as this is documented.

That should be fine, yes. I would probably put them into the same header file though if they are in the same number space even when you use them on distinct devices.

It does feel a little clumsy to have yet another character device to manage the block devices though. What do you think about one of these alternative approaches:

* When the ubi block device driver gets loaded, create one block device per volume and let the user deal with permissions for the devices instead of having to first create them as well.

* Use the existing UBI control device for the block devices as well and just add two more ioctls to create the devices. You can add a logical bus_type for this so that the ubi block driver gets automatically loaded and matched with the device when one is created using the control device.

Arnd
Re: architecture-independent I/o accessors
On Tuesday 18 August 2009 21:07:01 Wolfgang Denk wrote:

Dear Arnd,

Josh Boyer suggested you might provide some insight...

I'm currently looking for a solution how to provide architecture independent I/O accessor functions to U-Boot. In the past, lots of code used direct pointer accesses, relying on the idea that volatile would be sufficient to convince the compiler and the hardware to do what was expected; some architectures (like ARM and others) used readl() / writel(), while others (like PPC) used in_8, in_le16, in_be16, in_le32, in_be32, in_le64, in_be64 etc.

As we like to borrow code from Linux, I'm trying to find out what the big plan for Linux is. My understanding is that in Linux the ioreadX() / iowriteX() / ioreadXbe() / iowriteXbe() functions are supposed to provide architecture independent I/O accessors, and that the plain ioreadX() / iowriteX() functions (without the be) are always guaranteed to be little-endian on all architectures, while the be functions are, well, big-endian. Is this understanding correct?

Yes. Also, these functions are defined so that you can use them both for memory mapped I/O *and* for programmed I/O (aka inl/outl).

If yes, does that mean that in the future we will see more Linux code using ioreadX[be]() / iowriteX[be]()? So far I did not find many hints that support this approach - memory-barriers.txt has only a short sentence about these functions, with basically no explanation.

The most common ones are readl/writel, simply because they are better known. For devices that only have memory mapped I/O, they are by definition equivalent to ioread32/iowrite32. The SATA drivers and others use ioread32/iowrite32 because that lets the driver ignore the difference between PIO and MMIO.

What I liked from the in_[le]X() / out_[le]X() accessors on PPC was that they allowed for type checking - the compiler would raise a warning when you used in_[le]16() to read from a 32 bit wide register. However, ioreadX[be]() / iowriteX[be]() use a void __iomem * cookie, so no type checking can be done.

Hmm, interesting. I was never aware of that difference. We should probably change that in the kernel, to add type checking to all of them.

Another difference on powerpc is that in_le32/out_le32 cannot be used on PCI devices but only on SoC devices, because legacy iSeries and pSeries need some additional magic for PCI accesses.

Basically I have two questions:

1) Can you make a statement which direction Linux is heading to? Will more (new) code use ioreadX() / iowriteX()?

New subsystems will often use ioreadX/iowriteX by default, but I expect existing code to keep using readl/writel and new drivers will also keep using it.

2) What would be your recommendation what we should do in U-Boot? Provide for all architectures in_8, in_le16, in_be16, in_le32, in_be32, in_le64, in_be64 etc. similar to what we have for the Power architecture, well knowing that Linux will not follow that route, or use ioreadX[be]() / iowriteX[be]() which does not provide type checking, and which eventually does not find wider use in Linux either? Or even something else - like ioreadX[be]() / iowriteX[be]() with type checking added?

I think ioread32/iowrite32 and friends with type checking would be the easiest. It would be nice to try adding type checking to the kernel, just to see what breaks ;-)

Arnd
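To make the type-checking point above concrete, here is a minimal user-space sketch of accessors whose pointer type encodes the register width. The names `reg_read_le32` and `reg_write_le32` are hypothetical and are not the kernel or U-Boot API. The sketch assumes the GCC/clang `__BYTE_ORDER__` macros, assumes a little-endian host if they are missing, and leaves out the memory barriers that real accessors need.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the byte order of a 32-bit value. */
static inline uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Convert between CPU order and little-endian (the operation is symmetric). */
static inline uint32_t le32_convert(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return swab32(v);
#else
	return v;
#endif
}

/* Hypothetical typed accessors: because the parameter is a uint32_t pointer
 * rather than a void pointer, passing the address of a uint16_t register
 * draws an incompatible-pointer diagnostic from the compiler. */
static inline uint32_t reg_read_le32(const volatile uint32_t *addr)
{
	assert(((uintptr_t)addr % sizeof(*addr)) == 0);
	return le32_convert(*addr);
}

static inline void reg_write_le32(volatile uint32_t *addr, uint32_t val)
{
	assert(((uintptr_t)addr % sizeof(*addr)) == 0);
	*addr = le32_convert(val);
}
```

With a `void *` cookie the same mistake compiles silently, which is the difference Wolfgang describes.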
Re: [PATCH 06/14] Pramfs: Include files
On Tuesday 23 June 2009, David Woodhouse wrote:

And dd on /dev/mem would work, surely?

Actually, reading from /dev/mem is only valid on real RAM. If the nvram is part of an IO memory mapping, you have to do mmap()+memcpy() rather than read(). So dd won't do it, but it's still easy to read from user space.

I'd definitely recommend making it fixed-endian. Not doing so for JFFS2 was a mistake I frequently regretted.

Right.

Arnd
Re: [PATCH 06/14] Pramfs: Include files
On Monday 22 June 2009, Marco wrote:

Sorry, I meant it's not currently possible. At the moment the only way to use it as rootfs is to copy all the data into an already mounted (empty) ram partition and reboot. However it's not the first item on my todo list because I think that it's possible to use it as rootfs but it isn't the standard use for this fs.

Well, it doesn't have to work right away. What I'm asking is to define the data structures in a way that keeps the layout stable across kernel updates. Since a future version of the file system might support cross-endian image creation, it would be good to define the data structures in a fixed endian mode already, so you don't have to change it in the future.

Arnd
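As an illustration of what "fixed endian" data structures mean in practice, the following user-space sketch stores every multi-byte field as little-endian bytes, so the layout is the same on any CPU. The `struct demo_super` layout and the helper names are hypothetical; they are not taken from the Pramfs patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-media superblock. Fields are byte arrays holding
 * little-endian values, so there is no padding and no host-order data. */
struct demo_super {
	uint8_t magic[4];
	uint8_t block_count[8];
};

_Static_assert(sizeof(struct demo_super) == 12, "layout must not be padded");

/* Store v as 4 little-endian bytes. */
static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load 4 little-endian bytes. */
static uint32_t get_le32(const uint8_t *p)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v |= (uint32_t)p[i] << (8 * i);
	return v;
}

/* Store v as 8 little-endian bytes. */
static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load 8 little-endian bytes. */
static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}
```

In kernel code the same effect is normally achieved with `__le32`/`__le64` fields and the `cpu_to_le32()` family of helpers; the byte-wise version here is only meant to be runnable outside the kernel.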
Re: [PATCH 06/14] Pramfs: Include files
On Monday 22 June 2009, Jörn Engel wrote:

Four loops doing the same increment with different data types: long, u64, we32 (wrong-endian) and we64. Compile with no optimizations. Results on my i386 notebook:

long: 453953 us
we32: 880273 us
u64: 504214 us
we64: 2259953 us
loops: 1 (couldn't resist)

The we64 number is artificially high because the glibc bswap_64 implementation forces the conversion to be done on the stack. Using __builtin_bswap64 makes this look more logical, and makes your point even stronger (on core 2, using -m32):

long: 236792 us
we32: 500827 us
u64: 265990 us
we64: 757380 us
loops: 1

Arnd
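The benchmark source is not part of this thread. The following is a guess at the shape of the loops being compared, with hypothetical function names; it uses the GCC/clang `__builtin_bswap32` and `__builtin_bswap64` builtins mentioned above, and uses `volatile` so the loops survive optimization.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t we32; /* value kept byte-swapped relative to the CPU */
typedef uint64_t we64;

/* Baseline: increment a native 64-bit counter. */
static uint64_t count_u64(uint64_t loops)
{
	volatile uint64_t v = 0;

	for (uint64_t i = 0; i < loops; i++)
		v = v + 1;
	return v;
}

/* Wrong-endian 32-bit counter: swap in, add, swap out on every step. */
static uint32_t count_we32(uint32_t loops)
{
	volatile we32 v = __builtin_bswap32(0);

	for (uint32_t i = 0; i < loops; i++)
		v = __builtin_bswap32(__builtin_bswap32(v) + 1);
	return __builtin_bswap32(v);
}

/* Wrong-endian 64-bit counter. */
static uint64_t count_we64(uint64_t loops)
{
	volatile we64 v = __builtin_bswap64(0);

	assert(loops < UINT64_MAX);
	for (uint64_t i = 0; i < loops; i++)
		v = __builtin_bswap64(__builtin_bswap64(v) + 1);
	return __builtin_bswap64(v);
}
```

Timing each call, for example with `clock_gettime()`, should show the kind of relative cost quoted above, though the absolute numbers will depend on the compiler and CPU.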
Re: [PATCH 04/10] AXFS: axfs_inode.c
On Friday 22 August 2008, Phillip Lougher wrote:

This looks very nice, but could use some comments about how the data is actually stored on disk. It took me some time to figure out that it actually allows tail merging into compressed blocks, which I was about to suggest you implement ;-). Cramfs doesn't have them, and I found that they are the main reason why squashfs compresses better than cramfs, besides the default block size, which you can change on either one.

Squashfs has much larger block sizes than cramfs (last time I looked it was limited to 4K blocks), and it compresses the metadata which helps to get better compression. But tail merging (fragments in Squashfs terminology) is obviously a major reason why Squashfs gets good compression.

The *default* block size in cramfs is smaller than in squashfs, but they both have user selectable block sizes. I found the impact of compressed metadata to be almost zero. I hacked up a mksquashfs to avoid tail merging, and found that the image size for squashfs and cramfs is practically identical if you use the same block size and no tail merging.

The AXFS code is rather obscure but it doesn't look to me that it does tail merging. The following code wouldn't work if the block in question was a tail contained in a larger block. It assumes the block extends to the end of the compressed block (cblk_size - cnode_offset).

Yes, I thought the same thing when I first read that code, and was about to send a lengthy reply about how it should be changed when I saw that it already does exactly that ;-).

Arnd
Re: [PATCH 03/10] AXFS: axfs.h
On Friday 22 August 2008, Jared Hulbert wrote:

This bytetable stuff looks overly complicated, both the data structure and the access method. It seems like you are implementing your own custom Huffman compression with this. Is the reason for the bytetable just to pack numbers efficiently, or do you have a different intention?

It looks more complicated than it is. I need a data structure that is 64bit capable, easily read-in-place (remember this is designed to be an XIP fs), and highly space efficient. Because it's XIP I didn't want something that required a lot of calculation nor something that made you incur a lot of cache misses. So yes I just want to pack numbers in an easily read-in-place fashion.

ok, that makes sense.

If I have an array of u64 numbers tracking small numbers (a[0] = 1; a[1] = 2;) just throwing that on media is a big waste. (0x0001; 0x0002) Having different array types for different images such as arrays of u8,u16,u32,u64 becomes less efficient for 3,5,6 and 7 byte numbers, 3 bytes was a particularly interesting size for me. All I'm doing is removing the totally unnecessary zeros and aligning by bytes.

Take an array of u64 like this:

0x0005 0x1001 0x000a

I strip off the unneeded leading zeros:

0x05 0x001001 0x0a

Then pack them to byte alignment:

0x050010010a

Sure it could be encoded more but that would make it harder to extract the data. This way I can read the data in one, maybe two, cache misses. A couple of shifts to deal with the alignment and endianness and we are done.

So do I understand right that 3 bytes is your minimum size, and going smaller than that would not be helpful? Otherwise I would assume that storing a '5' should only take one byte instead of three.

I don't understand yet why you store the length of each word separate from the word. Most variable-length codes store that implicitly in the data itself, e.g. in the upper three bits, so that for storing 0x5, 0x1001, 0xa, this could e.g. end up as 0x054010014a, which is shorter than what you have, but not harder to decode.

Did you see a significant size benefit over simply storing all metadata as uncompressed data structures like in cramfs?

Yes. For some modest values of significant. In terms of the amount of space required to track the metadata it is more dramatic. For a small rootfs I can fit many of the data structures in an u8 array, while maintaining u64 compatibility. Compared to dumping u64 arrays on media that's an 8X savings. But it's an 8X savings of a smallish percentage of the image size. The difference is more pronounced on a smaller (2MB) filesystem I tested but it was only ~5% if memory serves me correctly.

If you can save 5% on a real-world file system, you have convinced me.

Have you considered storing simple dentry/inode data in node_type==Compressed nodes?

Yes, I thought a lot about that. But I chose against it because I wanted read-in-place data structures for minimum RAM usage in the XIP case and I figure the way I do it would stat() faster.

ok.

Arnd
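As a rough illustration of a variable-length code that keeps the length "in the upper three bits", here is one possible encoding. It is an assumption made for this sketch, not the AXFS bytetable format and not necessarily the exact code meant above: the top three bits of the first byte hold the number of extra bytes (0 to 7), the low five bits hold the most significant payload bits, and the extra bytes follow most-significant first. The names `varlen_put` and `varlen_get` are hypothetical, and values are limited to 61 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v at out and return the number of bytes written (1 to 8).
 * Payload capacity is 5 + 8 * extra bits, so v must be below 2^61. */
static size_t varlen_put(uint8_t *out, uint64_t v)
{
	unsigned extra = 0;

	assert(v < (UINT64_C(1) << 61));
	/* Grow until the value fits into 5 + 8 * extra bits. */
	while (extra < 7 && (v >> (5 + 8 * extra)) != 0)
		extra++;

	out[0] = (uint8_t)((extra << 5) | (v >> (8 * extra)));
	for (unsigned i = 0; i < extra; i++)
		out[1 + i] = (uint8_t)(v >> (8 * (extra - 1 - i)));
	return 1 + extra;
}

/* Decode one value from in, store it in *v, return bytes consumed. */
static size_t varlen_get(const uint8_t *in, uint64_t *v)
{
	unsigned extra = in[0] >> 5;
	uint64_t r = in[0] & 0x1f;

	for (unsigned i = 0; i < extra; i++)
		r = (r << 8) | in[1 + i];
	*v = r;
	return 1 + extra;
}
```

Under this particular scheme the values 0x5, 0x1001 and 0xa take four bytes in total. The trade-off is that entries no longer sit at fixed offsets, so finding the n-th entry means walking the preceding ones, which matters for a table that is meant to be read in place.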
Re: [PATCH 00/10] AXFS: Advanced XIP filesystem
On Friday 22 August 2008, Geert Uytterhoeven wrote:

I gave AxFS a try on PS3 (ppc64, always use big-endian 64-bit for testing new code ;-). When mounting the image, I got the crash below:

| attempt to access beyond end of device
| loop0: rw=0, want=4920, limit=4912
| Unable to handle kernel paging request for data at address 0x0028

Offset 0x28 is buffer_head->b_data, so it seems like sb_bread returns NULL, which it does for out of range block numbers. I guess axfs_copy_block should check for that condition, as it can happen on malicious file system images.

I agree that this is likely caused by an endianness bug. A good help for finding endianness bugs is to use __be64 like data types everywhere and test with sparse -D__CHECK_ENDIAN__.

Arnd
Re: [PATCH 06/10] AXFS: axfs_super.c
On Friday 22 August 2008, Jared Hulbert wrote:

This implies for block devices that the entire filesystem metadata has to be cached in RAM. This severely limits the size of AXFS filesystems when using block devices, or else the memory usage will be excessive.

This is where 64bit squashfs could be a better fit.

Is this the only place where squashfs has a significant advantage? If so, you might want to change it in axfs eventually to make the decision easier for users ;-)

It certainly sounds like something for your medium-term TODO list, although I wouldn't think of it as a show-stopper.

Arnd
Re: [PATCH 05/10] AXFS: axfs_profiling.c
On Friday 22 August 2008, you wrote: You mean to take this off list? No, i replied to your mail that was sent just to me. Putting everyone back on now In 3, you create files with sysfs_create_file, and are fairly limited with how you can use it. A structured file like you have in procfs would not be allowed. File names are fixed, directory names can be used to identify the mounted file systems. You can create symlinks between your directory and other things in sysfs. What do you mean a structured file wouldn't be allowed? What's in them then? sysfs files are meant to have just a single value. Some have a list of values of the same type, but a file that needs a nontrivial parser (even sscanf) is not allowed in sysfs, by convention. There is also the technical limitation of the size to a single page, which makes it hard to write variable size data. In 4, you write a whole file system like debugfs (it's not as hard as it sounds) and are free to do anything in there, but you can't easily symlink to sysfs. Argh. No it might not be too bad to do to do, but it sounds like a maintenance hassle. Sounds like the best option though. Why did we decide debugfs is a bad fit? It's basically the same as debugfs -- actually I once started a patch to make it a single function call to instantiate a debugfs-like file system, but I never finished that patch. debugfs is a bad idea here because it is not meant for stable interfaces but rather ad-hoc stuff. In a distribution kernel, debugfs is supposed to be empty. So where does a page show up in the profile if you have two identical files and both are mapped? In which ever file was actually read. The kernel driver doesn't really know pages are redundant. ok. Will the kernel map them to the same page but count the files separately, or will it show the same count for both? I count faults on pages in mmap() so I don't really care whether a page is mapped twice or just once. 
I'll count it every time you fault it even if it's the same physical page. It's the image builder's job to figure out if there are redundant pages. ok, makes sense. I think there is still another option, which would be to generalize the profiling interface so it can work with arbitrary file systems. I'm sure that other people can benefit from that as well, e.g. for optimizing boot times on disks. For such a general interface, a per-file ioctl would fit best, and then file systems can implement it if they want, or it can be moved into VFS. Arnd
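A rough userspace sketch of the counting rule described above, assuming the counters are keyed by file and page offset rather than by physical page. The names `fault_profile`, `profile_fault` and `profile_count`, and the fixed table sizes, are made up for illustration and are not taken from the AXFS patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILES 4   /* hypothetical limits, for the sketch only */
#define DEMO_MAX_PAGES 8

/* One counter per (file, page offset) pair; physical pages are not tracked. */
struct fault_profile {
	unsigned long count[DEMO_MAX_FILES][DEMO_MAX_PAGES];
};

/* Record one fault against the file that was actually accessed. */
void profile_fault(struct fault_profile *p, size_t file, size_t pgoff)
{
	assert(file < DEMO_MAX_FILES && pgoff < DEMO_MAX_PAGES);
	p->count[file][pgoff]++;
}

/* Read back the number of faults seen for one page of one file. */
unsigned long profile_count(const struct fault_profile *p, size_t file,
			    size_t pgoff)
{
	assert(file < DEMO_MAX_FILES && pgoff < DEMO_MAX_PAGES);
	return p->count[file][pgoff];
}
```

With this layout, two identical files that happen to share data still get separate counts, and spotting the redundancy is left to whatever tool consumes the profile.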
Re: [PATCH 06/10] AXFS: axfs_super.c
On Friday 22 August 2008, Phillip Lougher wrote:
1. Support for > 4GB filesystems. In theory 2^64 bytes.
2. Compressed metadata
3. Inode timestamps
4. Hard-link support, and correct nlink counts
5. Sparse file support
6. Support for . and .. in readdir
7. Indexed directories for fast lookup
8. NFS exporting
9. No need to cache entire metadata in memory
Squashfs has been optimised for block-based rotating media like hard disks, CDROMS. AXFS has been optimised for flash based media. Squashfs will outperform AXFS on rotating media, AXFS will outperform Squashfs on flash based media. Ok, thanks for the list. I'm sure that sparse files are already part of AXFS, and among the other things, I would consider some to be AXFS bugs rather than squashfs features (. in readdir, in particular), but I get the point. Squashfs and AXFS should be seen as complementary filesystems, and there should be room in the Linux kernel for both. I don't see what your problem is here. I think AXFS is an extremely good filesystem and should be merged. But I don't see why this should lead to more Squashfs bashing. Sorry, I didn't mean to be abusive. From first look, it appeared to do everything that squashfs does, with less code, but you've made it clear that there is need for both of them. I would still expect axfs to replace cramfs for all practical purposes, even though that was written by our Emperor Penguin ;-) Arnd
Re: [PATCH 05/10] AXFS: axfs_profiling.c
On Thursday 21 August 2008, Jared Hulbert wrote: 1) same mount point - I don't see how this works without an ioctl. I can't just make up files in my mounted filesystem. You expect the mounted version to match input to the mkfs. I'd not be happy with an ioctl. You can just read it. 2) sysfs - I agree with Carsten, I don't see how this fits in the sysfs hierarchy. 3) debugfs - I don't know diddly about this. Ok, so now yet another suggestion, which may sound a little strange: oprofilefs. I believe you can use the oprofile infrastructure to record data about file accesses, even independent of the file system you are looking at. It's probably a lot of work to get it right, but it would be worth it. Arnd
Re: [PATCH 02/10] AXFS: Kconfig and Makefiles
On Thursday 21 August 2008, Jared Hulbert wrote: The Kconfig edits and Makefiles required for AXFS. Signed-off-by: Jared Hulbert [EMAIL PROTECTED] If you split out this patch separate from the files, please make it the *last* patch so that you cannot get build errors during a later git-bisect through the middle of your series. Arnd
Re: [PATCH 04/10] AXFS: axfs_inode.c
On Thursday 21 August 2008, Jared Hulbert wrote:
+/* functions in other axfs files **/
+int axfs_get_sb(struct file_system_type *, int, const char *, void *,
+		struct vfsmount *);
+void axfs_kill_super(struct super_block *);
+void axfs_profiling_add(struct axfs_super *, unsigned long, unsigned int);
+int axfs_copy_mtd(struct super_block *, void *, u64, u64);
+int axfs_copy_block(struct super_block *, void *, u64, u64);
*Never* put extern declarations into a .c file, that's what headers are for. If you ever change the definition, the compiler doesn't get a chance to warn you otherwise.
+/**/
+static int axfs_readdir(struct file *, void *, filldir_t);
+static int axfs_mmap(struct file *, struct vm_area_struct *);
+static ssize_t axfs_file_read(struct file *, char __user *, size_t, loff_t *);
+static int axfs_readpage(struct file *, struct page *);
+static int axfs_fault(struct vm_area_struct *, struct vm_fault *);
+static struct dentry *axfs_lookup(struct inode *, struct dentry *,
+		struct nameidata *);
+static int axfs_get_xip_mem(struct address_space *, pgoff_t, int, void **,
+		unsigned long *);
For style reasons, also please don't put static forward declarations anywhere, but define the functions in the right order so you don't need them.
Arnd
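To illustrate the ordering point, here is a small standalone sketch, not taken from the AXFS sources: the leaf helper comes first, its caller follows, and the table that references both goes last, so no static forward declarations are needed. All names (`demo_ops`, `demo_length`, `demo_is_empty`, `demo_operations`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table, standing in for something like
 * struct file_operations in a real file system. */
struct demo_ops {
	size_t (*length)(const char *s);
	int (*is_empty)(const char *s);
};

/* Leaf helper first: nothing defined above depends on it. */
static size_t demo_length(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	while (s[n] != '\0')
		n++;
	return n;
}

/* The caller follows its callee, so the compiler has already seen it. */
static int demo_is_empty(const char *s)
{
	return demo_length(s) == 0;
}

/* The table that points at the functions comes last. */
static const struct demo_ops demo_operations = {
	.length = demo_length,
	.is_empty = demo_is_empty,
};
```

Functions shared between files would instead get a single prototype in a common header that both the defining file and its users include, so a mismatch between declaration and definition is caught at compile time.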
Re: [PATCH 05/10] AXFS: axfs_profiling.c
On Thursday 21 August 2008, David Woodhouse wrote: On Thu, 2008-08-21 at 10:44 +0200, Carsten Otte wrote: Exporting profiling data for a file system in another file system (/proc) seems not very straightforward to me. I think it is worth considering to export this information via the same mount point. I would have said sysfs, rather than 'the same mount point'. Let me throw in debugfs as my preferred option. sysfs is for stable interfaces, while profiling generally fits into the debugging category. Arnd
Re: [PATCH 04/10] AXFS: axfs_inode.c
On Thursday 21 August 2008, Jared Hulbert wrote:
+	array_index = AXFS_GET_INODE_ARRAY_INDEX(sbi, ino_number);
+	array_index += page->index;
+
+	node_index = AXFS_GET_NODE_INDEX(sbi, array_index);
+	node_type = AXFS_GET_NODE_TYPE(sbi, array_index);
+
+	if (node_type == Compressed) {
+		/* node is in compressed region */
+		cnode_offset = AXFS_GET_CNODE_OFFSET(sbi, node_index);
+		cnode_index = AXFS_GET_CNODE_INDEX(sbi, node_index);
+		down_write(&sbi->lock);
+		if (cnode_index != sbi->current_cnode_index) {
+			/* uncompress only necessary if different cblock */
+			ofs = AXFS_GET_CBLOCK_OFFSET(sbi, cnode_index);
+			len = AXFS_GET_CBLOCK_OFFSET(sbi, cnode_index + 1);
+			len -= ofs;
+			axfs_copy_data(sb, cblk1, &(sbi->compressed), ofs, len);
+			axfs_uncompress_block(cblk0, cblk_size, cblk1, len);
+			sbi->current_cnode_index = cnode_index;
+		}
+		downgrade_write(&sbi->lock);
+		max_len = cblk_size - cnode_offset;
+		len = max_len > PAGE_CACHE_SIZE ? PAGE_CACHE_SIZE : max_len;
+		src = (void *)((unsigned long)cblk0 + cnode_offset);
+		memcpy(pgdata, src, len);
+		up_read(&sbi->lock);
This looks very nice, but could use some comments about how the data is actually stored on disk. It took me some time to figure out that it actually allows tail merging into compressed blocks, which I was about to suggest you implement ;-). Cramfs doesn't have them, and I found that they are the main reason why squashfs compresses better than cramfs, besides the default block size, which you can change on either one. Have you seen any benefit of the rwsem over a simple mutex? I would guess that you can never even get into the situation where you get concurrent readers since I haven't found a single down_read() in your code, only downgrade_write().
Arnd
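For comparison, a minimal userspace sketch of the simple-mutex variant asked about above, using a pthread mutex as a stand-in for a kernel mutex. It assumes a one-entry cache of the most recently expanded block; `block_cache`, `cache_init`, `cache_read` and the fake `expand_block` are hypothetical names, and no real decompression is performed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK_SIZE 16	/* hypothetical block size for the sketch */

/* One-entry cache holding the most recently expanded block. */
struct block_cache {
	pthread_mutex_t lock;
	long current_index;	/* -1 means nothing is cached yet */
	unsigned int expand_calls;	/* how often a block had to be expanded */
	char data[DEMO_BLOCK_SIZE];
};

void cache_init(struct block_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->current_index = -1;
	c->expand_calls = 0;
	memset(c->data, 0, sizeof(c->data));
}

/* Stand-in for decompression: fill the buffer with a per-block letter. */
static void expand_block(char *dst, long index)
{
	memset(dst, 'a' + (int)(index % 26), DEMO_BLOCK_SIZE);
}

/* Copy len bytes starting at offset within block 'index'. The block is
 * expanded only when it differs from the cached one, and the whole
 * operation runs under a single mutex. */
void cache_read(struct block_cache *c, long index, size_t offset,
		char *out, size_t len)
{
	assert(index >= 0);
	assert(offset <= DEMO_BLOCK_SIZE && len <= DEMO_BLOCK_SIZE - offset);

	pthread_mutex_lock(&c->lock);
	if (index != c->current_index) {
		expand_block(c->data, index);
		c->current_index = index;
		c->expand_calls++;
	}
	memcpy(out, c->data + offset, len);
	pthread_mutex_unlock(&c->lock);
}
```

Since every caller takes the lock exclusively before checking the cached index, the copy-out never runs concurrently with another reader, which is the situation the question above suspects for the quoted code as well.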
Re: [PATCH 21/23] make section names compatible with -ffunction-sections -fdata-sections: v850
On Wednesday 02 July 2008, Denys Vlasenko wrote: This patch fixes v850 architecture. For all I know, v850 has been broken and unmaintained for a few years now. Didn't someone have a patch to remove it entirely? Arnd
Re: [PATCH 21/23] make section names compatible with -ffunction-sections -fdata-sections: v850
On Thursday 03 July 2008, Andi Kleen wrote: Same seems to be true for cris btw. Cris has seen significant updates in 2.6.25 by its maintainer. It's not a very active port, but skipping updates for one kernel version is on a completely different scale from doing nothing at all for over three years as in the v850 case. I don't currently see any architecture (other than v850) in a state that justifies removing it entirely. Arnd