On Wed, Jul 28, 2010 at 11:39 AM, Alex Hornung <ahorn...@gmail.com> wrote:
> I absolutely don't see how forcing the I/O from N different threads
> onto 2 (events are not I/O effectively) is better than having each I/O
> maintain (mostly) its own context. Your particular case may not
> suffer from any performance impact, but I was mostly talking about a
> future-proof solution.

Here is some detail on how a GEOM class can improve performance with
multithreaded I/O:

http://lists.freebsd.org/pipermail/freebsd-geom/2006-June/001290.html
http://retis.sssup.it/~fabio/soc09/downloads/D3.pdf

Also, look at FreeBSD's sys/geom/eli/g_eli.c for an example of a class
that does its work in a dedicated thread so as not to pollute the GEOM
up/down threads (a rough sketch of that pattern is included below).
Disks can only read/write one thing at a time anyway, and if you have a
device backed by multiple providers, requests can be optimized and
split across them.

I don't think it's been shown that GEOM is or isn't "future-proof".
Same with DM. If problems arise in either implementation, I'm sure they
would be worked around.

> I mean every module that needs metadata. From the top of my head I
> could mention the lvm parser and geli, but I'd suspect most others,
> too. So GEOM has a lvm parser that is utterly incomplete and obviously
> offers no management whatsoever. How is that superior to having most
> of the lvm functionality (in userland), easy to keep up to date and
> offering the same tools as on Linux?

You don't address the GPL part, which is a problem for me and maybe
some others. Other BSDs have struggled for a long time to remove GPL
tools. I also don't understand why having the same tools as Linux is
desirable. If that's the goal (e.g. to attract Linux users), IMO it's
misguided.

> Now let's see... you write an I/O scheduler on DragonFly... you simply
> use the dsched framework which fits nicely on top of the disk
> subsystem. As a matter of fact I could even change dm slightly to use
> the disk subsystem, too, and hence allow I/O schedulers, mbr, gpt and
> disklabels on top of dm devices, but I don't think there's much point
> to it at this time.

I wasn't aware of dsched, pretty cool. The point remains, though: I
could list a lot of different modules you couldn't match easily without
GEOM. Also, not all Linux block stuff is done in userland, e.g. DRBD.
As you point out later this is somewhat subjective, and could easily
turn into a pissing match, which is not what I want. What I'm mainly
looking for is input on importing GEOM, and given your experience any
insight would be helpful there.

> My point is that there's no need to learn anything new. Also, this is
> completely subjective. Maybe you prefer GEOM; I'd argue some of the
> Linux counterparts are way more intuitive.
>
> LVM is only one consumer of device mapper as I said before, so there's
> really no point in doing this comparison. LVM was imported strictly
> because of the compatibility with Linux. cryptsetup is another
> consumer of device mapper, which offers a different interface. My
> point here is that it's extremely simple to write userland tools to
> fit anyone's needs. I'm currently working on a mirror target for the
> device mapper; it'll also have its own userland tool, and not be
> dependent on LVM, which you seem to find cumbersome.

GEOM modules can certainly work in userland, e.g. ggate; plus its
easily stackable nature and BSD license are other features I find
attractive. Obviously you and I disagree on the merits of the DM/LVM
stuff, but if GEOM had been implemented you wouldn't have to write the
mirror code. You'd already have well-tested, performant code.
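To make the g_eli point above concrete, here's a minimal sketch of the
dedicated-thread pattern: the class's start method only queues the bio
and wakes a worker kthread, which does the expensive per-request work
and then pushes the request down to the consumer. This is not actual
g_eli.c code; the g_example_* names are made up, the softc/mutex/queue
are assumed to be initialized and the worker started (kproc_create())
when the geom is created, and all error handling and teardown are left
out.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/bio.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <geom/geom.h>

struct g_example_softc {
	struct mtx		sc_mtx;		/* protects sc_queue */
	struct bio_queue_head	sc_queue;	/* bios waiting for the worker */
};

/*
 * .start method, called from the GEOM down thread: do no real work
 * here, just hand the bio to our own thread.
 */
static void
g_example_start(struct bio *bp)
{
	struct g_example_softc *sc = bp->bio_to->geom->softc;

	mtx_lock(&sc->sc_mtx);
	bioq_insert_tail(&sc->sc_queue, bp);
	mtx_unlock(&sc->sc_mtx);
	wakeup(sc);
}

/*
 * Dedicated worker thread (one per geom, started with kproc_create()):
 * the expensive part runs here instead of in the shared up/down threads.
 */
static void
g_example_worker(void *arg)
{
	struct g_geom *gp = arg;
	struct g_example_softc *sc = gp->softc;
	struct bio *bp, *cbp;

	for (;;) {
		mtx_lock(&sc->sc_mtx);
		while ((bp = bioq_takefirst(&sc->sc_queue)) == NULL)
			msleep(sc, &sc->sc_mtx, PRIBIO, "gexwait", 0);
		mtx_unlock(&sc->sc_mtx);

		/* ... per-request work (e.g. encrypt/decrypt bio_data) ... */

		cbp = g_clone_bio(bp);
		if (cbp == NULL) {
			g_io_deliver(bp, ENOMEM);
			continue;
		}
		cbp->bio_done = g_std_done;	/* completes the parent bio */
		g_io_request(cbp, LIST_FIRST(&gp->consumer));
	}
}

The real g_eli.c is considerably more involved (it runs one worker per
CPU, manages crypto sessions, etc.), but the queue-and-wakeup structure
is the basic idea.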
> I still don't get your point. GPT support in the loader is not
> assisted in any way by geom or any other similar mess.

The gpart class is a helper for the loader: it creates a normal GPT
boot partition, unlike the hack that exists in gpt(8).

> Also I don't see any advantage to any softraid implementation which
> requires a full disk sync after a minor glitch, such as someone
> pulling a plug temporarily or a crash/reboot or a misprobe.

The gmirror sync after an unclean disconnect is greatly reduced on
gjournal volumes. I haven't timed it lately, but it's something like
1 to 2 minutes for TB+ sized volumes.

--
Adam Vande More