Re: [zfs-discuss] ZFS raid is very slow???
I am using Solaris Express Community build 67 installed on a 40GB hard drive (UFS filesystem on Solaris), dual boot with Windows XP. I have a ZFS raid with 4 Samsung drives. It is a [EMAIL PROTECTED] with 1GB RAM.

When I copy a 1.3GB file from ZFS pool to ZFS pool, "time cp file file2" gives this output:

bash-3.00# time cp PAGEFILE.SYS pagefil3
real    0m49.719s
user    0m0.004s
sys     0m10.160s

which is about 26MB/sec. When I copy that file from ZFS to UFS I get:

real    0m35.091s
user    0m0.004s
sys     0m15.337s

which is about 37MB/sec. However, in each of the above scenarios the system monitor shows that all RAM is used up and the system begins to swap (about 40MB of swap). My system has never swapped before (Windows swaps immediately upon startup, ha!). CPU utilization is around 50%.

When I copy that file from UFS to UFS I get:

real    1m36.315s
user    0m0.003s
sys     0m11.327s

Here CPU utilization is around 20% and RAM usage never exceeds 600MB - it doesn't use swap. When I copy that file from ZFS to /dev/null I get:

real    0m0.025s
user    0m0.002s
sys     0m0.007s

which can't be correct. Is it wrong of me to use "time cp fil fil2" when measuring disk performance?

I mount NTFS with the packages FSWfsmisc and FSWfspart, by Moinak Ghosh (based on Martin Rosenau's work and part of Moinak's BeleniX work).
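(As a sanity check, the MB/sec figures above are just file size over wall-clock time, and the instant ZFS-to-/dev/null copy suggests the file was served from cache rather than disk; the /tank path below is hypothetical:)

bash-3.00# echo "scale=1; 1300/49.7" | bc
26.1
bash-3.00# echo "scale=1; 1300/35.1" | bc
37.0
# the 0.025s copy to /dev/null is reading from memory (cache), not disk;
# reading a file larger than RAM, or re-reading after a reboot, forces real disk I/O
bash-3.00# dd if=/tank/PAGEFILE.SYS of=/dev/null bs=128k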
[zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log
Hello,

This is a third request to open the materials of the PSARC case 2007/171 ZFS Separate Intent Log. I am not sure why two previous requests were completely ignored (even when seconded by another community member). In any case, that is absolutely unacceptable practice.

On 6/30/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> Hello!
>
> I am adding zfs-discuss as it is directly relevant to this community.
>
> On 6/23/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> > Hi,
> > can the materials of the above be opened for the community?
> > --
> > Regards, Cyril
> --
> Regards, Cyril

--
Regards,
Cyril
Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log
On 7/7/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> Hello,
>
> This is a third request to open the materials of the PSARC case
> 2007/171 ZFS Separate Intent Log. I am not sure why two previous
> requests were completely ignored (even when seconded by another
> community member). In any case, that is absolutely unacceptable practice.

The past week of inactivity is likely related to most of Sun in the US being on mandatory vacation. Sun typically shuts down for the week that contains July 4 and (I think) the week between Christmas and Jan 1.

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log
> On 7/7/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> > Hello,
> >
> > This is a third request to open the materials of the PSARC case
> > 2007/171 ZFS Separate Intent Log. I am not sure why two previous
> > requests were completely ignored (even when seconded by another
> > community member). In any case, that is absolutely unacceptable practice.
>
> The past week of inactivity is likely related to most of Sun in the US
> being on mandatory vacation. Sun typically shuts down for the week that
> contains July 4 and (I think) the week between Christmas and Jan 1.

It was not mandatory this year, though many appear to have arranged for vacation because of the 4th of July, and because they half expected a mandatory shutdown.

Casper
Re: [zfs-discuss] ZFS Compression algorithms - Project Proposal
Nice idea! :)

> We plan to start with the development of a fast implementation of a
> Burrows Wheeler Transform based algorithm (BWT).

Why not start with lzo first? It's already in zfs-fuse on Linux, and it looks like it sits right between lzjb and gzip in terms of performance and compression ratio. It remains to be demonstrated that it behaves similarly on Solaris, though.

See http://www.opensolaris.org/jive/thread.jspa?messageID=111031
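For anyone who wants to compare the existing algorithms while the project gets going, the standard compression property and the compressratio statistic already give a rough read on the trade-off. A sketch (the dataset name and sample data are made up for illustration; gzip requires a build that includes gzip compression support):

bash-3.00# zfs create tank/comptest
bash-3.00# zfs set compression=lzjb tank/comptest
bash-3.00# cp /var/adm/messages* /tank/comptest/
bash-3.00# zfs get compressratio tank/comptest
bash-3.00# zfs set compression=gzip tank/comptest   # only affects newly written data

Repeating the copy after switching algorithms gives a quick per-dataset comparison without any code changes.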
Re: [zfs-discuss] ZFS performance and memory consumption
> When tuning recordsize for things like databases, we try to recommend
> that the customer's recordsize match the I/O size of the database record.

On this filesystem I have:
- file links, and they are rather static
- small files (about 8kB) that keep changing
- big files (1MB - 20MB) used as temporary files (create, write, read, unlink); operations on these files are about 50% of all I/O

I think I need a defragmentation tool. Do you think there will be one? Right now all I can do is copy the filesystems from this zpool to another one. After that operation the new zpool will not be fragmented, but it takes time and I have several zpools like this.
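A sketch of per-dataset tuning along those lines, assuming the small changing files and the large temporary files can be separated into their own datasets (dataset names are hypothetical, and recordsize only affects blocks written after the property is set):

bash-3.00# zfs create tank/smallfiles
bash-3.00# zfs set recordsize=8k tank/smallfiles     # match the ~8kB files that keep changing
bash-3.00# zfs create tank/scratch                   # large create/write/read/unlink temporary files
bash-3.00# zfs get recordsize tank/smallfiles tank/scratch

The large temporary files are written and deleted whole, so the default 128K recordsize on tank/scratch is usually left alone.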
Re: [zfs-discuss] ZFS Performance as a function of Disk Slice
Scott Lovenberg wrote:
> First Post! Sorry, I had to get that out of the way to break the ice...

Welcome!

> I was wondering if it makes sense to zone ZFS pools by disk slice, and if it makes a difference with RAIDZ. As I'm sure we're all aware, the end of a drive is half as fast as the beginning (where the zoning stipulates that the physical outside is the beginning and going towards the spindle increases the hex value).

IMHO, it makes sense to short-stroke if you are looking for the best performance. But raidz (or RAID-5) will not give you the best performance. You'd be better off mirroring for performance.

> I usually short-stroke my drives so that the variable files on the operating system drive are at the beginning, page in the center (so if I'm already thrashing I'm at most 1/2 a platter's width from page), and static files are towards the end. So, applying this methodology to ZFS, I partition a drive into 4 equal-sized quarters, do this to 4 drives (each on a separate SATA channel), and then create 4 pools, each holding one 'ring' of the drives. Will I then have 4 RAIDZ pools, which I can mount according to speed needs? For instance, I always put (in Linux... I'm new to Solaris) '/export/archive' all the way on the slow tracks, since I don't read or write to it often and it is almost never accessed at the same time as anything else that would force long strokes. Ideally, I'd like to do a straight ZFS on the archive track. I move data to archive in chunks, 4 gigs at a time - when I roll it in I burn 2 DVDs, 1 gets cataloged locally and the other offsite, so if I lose the data, I don't care - but ZFS gives me the ability to snapshot to archive (I assume it works across pools?). Then stripe one ring, /usr/local (or its Solaris equivalent), for performance. Then mirror the root slice. Finally, /export would be RAIDZ or RAIDZ2 on the fastest track, holding my source code, large files, and things I want to stream over the LAN. Does this make sense with ZFS? Is spindle count more of a factor than stroke latency? Does ZFS balance these things out on its own via random scattering?

Spindle count almost always wins for performance. Note: bandwidth usually isn't the source of perceived performance problems, latency is. We believe that this has implications for ZFS over time due to COW, but nobody has characterized this yet.

> Reading back over this post, I find it sounds like the ramblings of a madman. I guess I know what I want to say, but I'm not sure of the right questions to ask. I think I'm saying: will my proposed setup afford me the flexibility to zone for performance, since I have a more intimate knowledge of the data going onto the drive, or will brute force by spindle count (I'm planning 4-6 drives - a single drive to a bus) and random placement be sufficient if I just add the whole drive to a single pool?

Yes :-) YMMV.

> I thank you all for your time and patience as I stumble through this, and I welcome any point of view or insights (especially those from experience!) that might help me decide how to configure my storage server.

KISS. There are trade-offs for space, performance, and RAS. We have models to describe these, so you might check out my blogs on the subject.
http://blogs.sun.com/relling
-- richard
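For concreteness, the two layouts being compared might look roughly like this (the c1t0d0..c1t3d0 device names and slice numbers are hypothetical; slice 2 is skipped because it conventionally covers the whole disk):

# one raidz pool per 'ring', built from the same slice of each of the 4 drives
bash-3.00# zpool create fast   raidz c1t0d0s0 c1t1d0s0 c1t2d0s0 c1t3d0s0
bash-3.00# zpool create medium raidz c1t0d0s1 c1t1d0s1 c1t2d0s1 c1t3d0s1
bash-3.00# zpool create slow   raidz c1t0d0s3 c1t1d0s3 c1t2d0s3 c1t3d0s3

# the brute-force alternative: whole disks in a single pool, which also
# lets ZFS enable the drives' write caches and place data itself
bash-3.00# zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0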
Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log
Cyril,

I wrote this case and implemented the project. My problem was that I didn't know what policy (if any) Sun has about publishing ARC cases, and the case materials include a mail log with a gazillion email addresses. I did receive an answer to this in the form of:

http://www.opensolaris.org/os/community/arc/arc-faq/arc-publish-historical-checklist/

Never having done this, it seems somewhat burdensome and will take some time. Sorry for the slow response and lack of feedback. Are there any particular questions you have about separate intent logs that I can answer before I embark on the process?

Neil.

Cyril Plisko wrote:
> Hello,
>
> This is a third request to open the materials of the PSARC case
> 2007/171 ZFS Separate Intent Log. I am not sure why two previous
> requests were completely ignored (even when seconded by another
> community member). In any case, that is absolutely unacceptable practice.
>
> On 6/30/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> > Hello!
> > I am adding zfs-discuss as it is directly relevant to this community.
> >
> > On 6/23/07, Cyril Plisko [EMAIL PROTECTED] wrote:
> > > Hi,
> > > can the materials of the above be opened for the community?
> > > --
> > > Regards, Cyril
> > --
> > Regards, Cyril
>
> --
> Regards, Cyril
Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log
On 7/7/07, Neil Perrin [EMAIL PROTECTED] wrote:
> Cyril,
>
> I wrote this case and implemented the project. My problem was that I
> didn't know what policy (if any) Sun has about publishing ARC cases,
> and the case materials include a mail log with a gazillion email addresses.
> I did receive an answer to this in the form of:
> http://www.opensolaris.org/os/community/arc/arc-faq/arc-publish-historical-checklist/
> Never having done this, it seems somewhat burdensome and will take some time.

Neil,

I am glad the message finally got through. It seems to me that the URL above refers to publishing the materials of *historical* cases. Do you think the case at hand should be considered historical? Anyway, many ZFS-related cases were openly reviewed from moment zero of their life; why was this one an exception?

> Sorry for the slow response and lack of feedback. Are there any
> particular questions you have about separate intent logs that I can
> answer before I embark on the process?

Well, the only question I have now is: what is it all about? It is hard to ask questions without access to the case materials, right?

--
Regards, Cyril
Re: [zfs-discuss] zfs space efficiency
Agreed. While a bitwise check is the only assured way to determine that two blocks are duplicates, if the check were done in a streaming fashion as you suggest, the performance, while a huge impact compared to not checking, would be more than bearable in an environment with large known levels of duplicate data, such as a large central backup zfs send target. The checksum metadata is sent first, then the data; the receiving system checks its db for a possible dupe and, if one is found, reads the data from local disks and compares it to the data as it arrives from the sender. If it gets to the end and hasn't found a difference, it updates the block pointer to point to the duplicate. This won't save any bandwidth during the backup, but it will save on-disk space, and given the application that could be very advantageous.

Thank you for the insightful discussion on this. Within the electronic discovery and records and information management space, data deduplication and policy-based aging are the foremost topics of the day, but that is at the file level, and block-level deduplication would lend no benefit there regardless.

-=dave
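A very rough file-level illustration of the checksum-then-verify idea using stock Solaris tools (a real implementation would live inside zfs receive and work on blocks; the paths here are hypothetical, with digest and cmp standing in for the checksum db lookup and the bitwise compare):

# hash the incoming block and the local candidate; a match only nominates a dupe
bash-3.00# digest -a sha256 /tmp/incoming.blk
bash-3.00# digest -a sha256 /backup/candidate.blk
# the block is treated as a duplicate only after a full byte-for-byte compare
bash-3.00# cmp /tmp/incoming.blk /backup/candidate.blk && echo verified duplicate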
Re: [zfs-discuss] zfs space efficiency
One other thing... the checksums for all the files to send *could* be checked first in batch, with known-unique blocks prioritized and sent first, and the possibly duplicate data sent afterwards to be verified as dupes, thereby reducing the possible data loss for the backup window to levels equivalent to the checksum collision probability.

-=dave
Re: [zfs-discuss] ZFS raid is very slow???
ZFS is a 128-bit file system. The performance on your 32-bit CPU will not be that good; ZFS was designed for a 64-bit CPU. Another GB of RAM might help. There are a bunch of posts in the archive about 32-bit CPUs and performance.

-Sean

Orvar Korvar wrote:
> I am using Solaris Express Community build 67 installed on a 40GB hard drive (UFS filesystem on Solaris), dual boot with Windows XP. I have a ZFS raid with 4 Samsung drives. It is a [EMAIL PROTECTED] with 1GB RAM. When I copy a 1.3GB file from ZFS pool to ZFS pool, "time cp file file2" gives this output:
>
> bash-3.00# time cp PAGEFILE.SYS pagefil3
> real    0m49.719s
> user    0m0.004s
> sys     0m10.160s
>
> which is about 26MB/sec. When I copy that file from ZFS to UFS I get:
>
> real    0m35.091s
> user    0m0.004s
> sys     0m15.337s
>
> which is about 37MB/sec. However, in each of the above scenarios the system monitor shows that all RAM is used up and the system begins to swap (about 40MB of swap). My system has never swapped before (Windows swaps immediately upon startup, ha!). CPU utilization is around 50%. When I copy that file from UFS to UFS I get:
>
> real    1m36.315s
> user    0m0.003s
> sys     0m11.327s
>
> Here CPU utilization is around 20% and RAM usage never exceeds 600MB - it doesn't use swap. When I copy that file from ZFS to /dev/null I get:
>
> real    0m0.025s
> user    0m0.002s
> sys     0m0.007s
>
> which can't be correct. Is it wrong of me to use "time cp fil fil2" when measuring disk performance?
>
> I mount NTFS with the packages FSWfsmisc and FSWfspart, by Moinak Ghosh (based on Martin Rosenau's work and part of Moinak's BeleniX work).
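Whether the box is actually running a 32-bit or 64-bit kernel is easy to confirm with isainfo (the output below is what a 32-bit x86 kernel reports; a 64-bit kernel shows amd64 instead):

bash-3.00# isainfo -kv
32-bit i386 kernel modules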
Re: [zfs-discuss] ZFS raid is very slow???
On Jul 7, 2007, at 06:14, Orvar Korvar wrote:
> When I copy that file from ZFS to /dev/null I get this output:
>
> real    0m0.025s
> user    0m0.002s
> sys     0m0.007s
>
> which can't be correct. Is it wrong of me to use "time cp fil fil2" when
> measuring disk performance?

Well, you're reading and writing to the same disk, so that's going to affect performance, particularly as you're seeking to different areas of the disk both for the files and for the uberblock updates. In the case above it looks like the file is already cached (the buffer cache is probably what is consuming most of your memory here), so you're just looking at a memory-to-memory transfer. If you want a simple write performance test, many people use dd like so:

# timex dd if=/dev/zero of=file bs=128k count=8192

which will give you a measure of an efficient 1GB file write of zeros. Or use an open-source tool like iozone to get a better fix on single-thread vs multi-thread, read/write mix, and block-size differences for your given filesystem and storage layout.

jonathan
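If iozone is available on the system (its presence on a given SXCE install is an assumption here, it is not bundled), a minimal run exercising just the sequential write and read tests might look like this, with the test file placed on the filesystem under test:

bash-3.00# iozone -i 0 -i 1 -r 128k -s 2g -f /tank/iozone.tmp
# -i 0 / -i 1 = sequential write and read tests, -r = record size, -s = file size
# (pick a size well beyond the 1GB of RAM so the read pass isn't served from cache)
# -f = path of the scratch file on the filesystem being measured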