Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup
Not exactly. What is dedup'ed is the stream only, which is in fact not very efficient. Real dedup-aware replication takes the necessary steps to avoid sending a block that already exists on the other storage system.

Mertol Özyöney | Storage Sales
Mobile: +90 533 931 0752 | Email: mertol.ozyo...@oracle.com | http://www.oracle.com/

On 12/8/11 1:39 PM, Darren J Moffat darren.mof...@oracle.com wrote:

On 12/07/11 20:48, Mertol Ozyoney wrote:

Unfortunately the answer is no. Neither the L1 nor the L2 cache is dedup aware. The only vendor I know that can do this is NetApp. In fact, most of our functions, like replication, are not dedup aware. For example, technically it's possible to optimize our replication so that it does not send data chunks when a chunk with the same checksum already exists on the target, without enabling dedup on either target or source.

We already do that with 'zfs send -D':

  -D  Perform dedup processing on the stream. Deduplicated streams
      cannot be received on systems that do not support the stream
      deduplication feature.

--
Darren J Moffat

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
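As a minimal sketch of the stream-dedup option Darren mentions (the pool, snapshot, and host names here are hypothetical placeholders):

```shell
# Snapshot a filesystem and send a deduplicated stream to another host.
# 'tank/data@today' and 'backuphost' are placeholder names.
zfs snapshot tank/data@today
zfs send -D tank/data@today | ssh backuphost zfs receive backup/data
```

Note that, as Mertol points out, this dedups the stream in flight only; it does not consult the receiving pool for blocks that already exist there.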
Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup
I am almost sure that in cache things are still hydrated (i.e. cached without regard to dedup). There is an outstanding RFE for this; while I am not sure, I think this feature will be implemented sooner or later. And in theory there would be little benefit, as most dedup'ed shares are used for archive purposes...

PS: NetApps do have significantly bigger problems in the caching department, like virtually having no L1 cache. However, it's also my duty to know where they have an advantage.

Br
Mertol

Mertol Özyöney | Storage Sales
Mobile: +90 533 931 0752 | Email: mertol.ozyo...@oracle.com | http://www.oracle.com/

On 12/10/11 4:05 PM, Pawel Jakub Dawidek p...@freebsd.org wrote:

On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote:

Unfortunately the answer is no. Neither the L1 nor the L2 cache is dedup aware. The only vendor I know that can do this is NetApp.

And you really work at Oracle? :)

The answer is definitely yes. The ARC caches on-disk blocks, and dedup just references those blocks. When you read, the dedup code is not involved at all. Let me show it to you with a simple test.

Create a file (dedup is on):

# dd if=/dev/random of=/foo/a bs=1m count=1024

Copy this file so that it is deduped:

# dd if=/foo/a of=/foo/b bs=1m

Export the pool so all cache is removed and reimport it:

# zpool export foo
# zpool import foo

Now let's read one file:

# dd if=/foo/a of=/dev/null bs=1m
1073741824 bytes transferred in 10.855750 secs (98909962 bytes/sec)

We read file 'a' and all its blocks are in cache now. The 'b' file shares all the same blocks, so if the ARC caches blocks only once, reading 'b' should be much faster:

# dd if=/foo/b of=/dev/null bs=1m
1073741824 bytes transferred in 0.870501 secs (1233475634 bytes/sec)

Now look at it: 'b' was read 12.5 times faster than 'a', with no disk activity. Magic? :)

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am!
Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup
Unfortunately the answer is no. Neither the L1 nor the L2 cache is dedup aware. The only vendor I know that can do this is NetApp. In fact, most of our functions, like replication, are not dedup aware. However, we have the significant advantage that ZFS keeps checksums regardless of whether dedup is on or off, so in the future we can perhaps make functions more dedup friendly whether or not dedup is enabled. For example, technically it's possible to optimize our replication so that it does not send data chunks when a chunk with the same checksum already exists on the target, without enabling dedup on either target or source.

Best regards
Mertol

Sent from a mobile device
Mertol Ozyoney

On 07 Ara 2011, at 20:46, Brad Diggs brad.di...@oracle.com wrote:

Hello,

I have a hypothetical question regarding ZFS deduplication. Does the L1ARC benefit from deduplication in the sense that the L1ARC will only need to cache one copy of the deduplicated data versus many copies?

Here is an example: Imagine that I have a server with 2TB of RAM and a PB of disk storage. On this server I create a single 1TB data file that is full of unique data. Then I make 9 copies of that file, giving each file a unique name and location within the same ZFS zpool. If I start up 10 application instances, where each application reads all of its own unique copy of the data, will the L1ARC contain only the deduplicated data, or will it cache separate copies of the data from each file? In simpler terms, will the L1ARC require 10TB of RAM or just 1TB of RAM to cache all 10 1TB files' worth of data?

My hope is that since the data only physically occupies 1TB of storage via deduplication, the L1ARC will also only require 1TB of RAM for the data. Note that I know the deduplication table will use the L1ARC as well; however, the focus of my question is on how the L1ARC would benefit from a data caching standpoint.

Thanks in advance!
Brad

Brad Diggs | Principal Sales Consultant
Tech Blog: http://TheZoneManager.com
LinkedIn: http://www.linkedin.com/in/braddiggs
Re: [zfs-discuss] Dedup memory overhead
Sorry for the late answer. It's approximately 150 bytes per individual block, so increasing the block size is a good idea. Also, when the L1 and L2 ARC are not enough, the system will start issuing disk IOPS, and RAID-Z is not very effective for random IOPS, so it's likely that when your DRAM is not enough your performance will suffer. You may choose to use RAID-10 instead, which is a lot better under random loads.

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of erik.ableson
Sent: Thursday, January 21, 2010 6:05 PM
To: zfs-discuss
Subject: [zfs-discuss] Dedup memory overhead

Hi all,

I'm going to be trying out some tests using b130 for dedup on a server with about 1.7TB of usable storage (14x146 in two raidz vdevs of 7 disks). What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage. From what I gather, the dedup hash keys are held in the ARC and L2ARC, and as such are in competition for the available memory. So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys. Better yet would be some kind of algorithm for calculating the overhead, e.g. an average block size of 4K means a hash key for every 4K stored, and a hash occupies 256 bits. An associated question is then: how does the ARC handle competition between hash keys and regular ARC functions?
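As a rough sketch of that arithmetic (the ~150 bytes per block figure above is approximate, and the 1.7 TB pool size and 128K average block size are assumptions for illustration):

```shell
# Rough dedup-table (DDT) memory estimate: unique blocks * per-block overhead.
# All figures are assumptions: ~1.7 TB of data, 128K average block size,
# ~150 bytes of in-core overhead per unique block (per this thread).
pool_bytes=$((1740 * 1024 * 1024 * 1024))   # ~1.7 TB in bytes
block_bytes=$((128 * 1024))                 # 128K average block size
blocks=$((pool_bytes / block_bytes))
ddt_bytes=$((blocks * 150))
echo "$blocks blocks -> DDT ~$((ddt_bytes / 1024 / 1024)) MB"
```

With a 4K average block size instead of 128K, the same pool needs 32x the overhead, which is why the block-size assumption dominates any such estimate.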
Based on these estimations, I think that I should be able to calculate the following:

  1.7 TB = 1740.8 GB = 1,782,579.2 MB = 1,825,361,100.8 KB
  average block size: 4 KB              -> 456,340,275.2 blocks
  hash key size: 256 bits               -> 1.16823E+11 bits of hash key overhead
  = 14,602,888,806.4 bytes = 14,260,633.6 KB = 13,926.4 MB = 13.6 GB

Of course the big question on this will be the average block size - or better yet, being able to analyze an existing datastore to see just how many blocks it uses and what the current distribution of block sizes is. I'm currently playing around with zdb, with mixed success at extracting this kind of data. That's also a worst-case scenario, since it's counting really small blocks and assuming 100% of available storage is used - highly unlikely.

# zdb -ddbb siovale/iphone
Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0

    Object  lvl   iblk   dblk   dsize   lsize   %full  type
         0    7    16K    16K   57.0K     64K   77.34  DMU dnode
         1    1    16K     1K   1.50K      1K  100.00  ZFS master node
         2    1    16K    512   1.50K     512  100.00  ZFS delete queue
         3    2    16K    16K   18.0K     32K  100.00  ZFS directory
         4    3    16K   128K    408M    408M  100.00  ZFS plain file
         5    1    16K    16K   3.00K     16K  100.00  FUID table
         6    1    16K     4K   4.50K      4K  100.00  ZFS plain file
         7    1    16K  6.50K   6.50K   6.50K  100.00  ZFS plain file
         8    3    16K   128K    952M    952M  100.00  ZFS plain file
         9    3    16K   128K    912M    912M  100.00  ZFS plain file
        10    3    16K   128K    695M    695M  100.00  ZFS plain file
        11    3    16K   128K    914M    914M  100.00  ZFS plain file

Now, if I'm understanding this output properly, object 4 is composed of 128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks. Can someone confirm (or correct) that assumption? Also, I note that each object (as far as my limited testing has shown) has a single block size with no internal variation.
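That block-count reading checks out arithmetically (sizes taken from the zdb output above):

```shell
# Object 4: lsize 408M with a 128K data block size (dblk) -> block count.
lsize_kb=$((408 * 1024))   # 408 MB expressed in KB
dblk_kb=128                # 128K dblk
echo $((lsize_kb / dblk_kb))
```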
Interestingly, all of my zvols seem to use fixed-size blocks - that is, there is no variation in the block sizes; they're all the size defined at creation, with no dynamic block sizes being used. I previously thought that the -b option set the maximum size rather than fixing all blocks. Learned something today :-)

# zdb -ddbb siovale/testvol
Dataset siovale/testvol [ZVOL], ID 45, cr_txg 4717890, 23.9K, 2 objects

    Object  lvl   iblk   dblk   dsize   lsize   %full  type
         0    7    16K    16K   21.0K     16K    6.25  DMU dnode
         1    1    16K    64K       0     64K    0.00  zvol object
         2    1    16K    512   1.50K     512  100.00  zvol prop

# zdb -ddbb siovale/tm-media
Dataset siovale/tm-media [ZVOL], ID 706, cr_txg 4426997, 240G, 2 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0

    Object  lvl   iblk   dblk   dsize   lsize   %full  type
         0    7    16K    16K   21.0K     16K    6.25  DMU dnode
         1    5    16K     8K    240G    250G   97.33  zvol object
         2    1    16K    512   1.50K     512  100.00  zvol prop
Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)
We've got 50+ X4500/X4540s running happily with ZFS in the same DC - approximately 2500 drives and growing every day...

Br
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Henrik Johansen
Sent: Friday, January 29, 2010 10:45 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)

On 01/28/10 11:13 PM, Lutz Schumann wrote:

While thinking about ZFS as the next-generation filesystem without limits, I am wondering if the real world is ready for this kind of incredible technology... I'm actually speaking of hardware :) ZFS can handle a lot of devices. Once the import bug (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6761786) is fixed, it should be able to handle a lot of disks.

That was fixed in build 125.

I want to ask the ZFS community and users what large-scale deployments are out there. How many disks? How much capacity? Single pool or many pools on a server? How does resilver work in those environments? How do you back up? What is the experience so far? Major headaches? It would be great if large-scale users would share their setups and experiences with ZFS.

The largest ZFS deployment that we have is currently comprised of 22 Dell MD1000 enclosures (330 750 GB Nearline SAS disks). We have 3 head nodes and use one zpool per node, comprised of rather narrow (5+2) RAIDZ2 vdevs. This setup is exclusively used for storing backup data. Resilver times could be better - I am sure that this will improve once we upgrade from S10u9 to 2010.03. One of the things that I am missing in ZFS is the ability to prioritize background operations like scrub and resilver.
All our disks are idle during the daytime, and I would love to be able to take advantage of this, especially during resilver operations. This setup has been running for about a year with no major issues so far. The only hiccups we've had were all HW related (no fun in firmware-upgrading 200+ disks).

Will you? :)

Thanks, Robert

--
Med venlig hilsen / Best Regards

Henrik Johansen hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
Re: [zfs-discuss] ZFS Cache + ZIL on single SSD
Hi;

I don't think that anyone owns the list, and like anyone else you are very welcome to ask any question.

The L2ARC caches the zpool, so if your iSCSI LUN is a zvol or a file on ZFS, it will be cached. Please use COMSTAR if you need performance. You are correct that you will only need a couple of GB as ZIL. The performance gain from using a portion of an SSD for L2ARC will depend on your pool configuration and your average load mix.

A side note: use RAID-10 if you are using ZFS and iSCSI and need random IOPS.

Best regards
Mertol

Sent from a mobile device
Mertol Ozyoney

On 11.Oca.2010, at 01:05, A. Krijgsman a.krijgs...@draftsman.nl wrote:

Hi all,

Sorry for spamming your mailing list, but since I could not find a direct answer on the internet or in the archives, I'm giving this a try!

I am building a ZFS filesystem to export iSCSI LUNs. Now I was wondering: does the L2ARC have the ability to cache non-filesystem iSCSI LUNs, or does it only work in combination with a mounted ZFS filesystem?

Next to that, I am reading about all kinds of performance benefits from using separate devices for the ZIL (write) and the cache (read). I was wondering if I could share a single SSD between both the ZIL and cache device, or is this not recommended? This is because the ZIL needs 1 GB tops, from what I read, but since SSD is not cheap, I would like to make use of the other GBs on the disk.

Thank you.

Armand
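A minimal sketch of the shared-SSD approach Armand asks about (pool and device names are hypothetical; the SSD is assumed to be pre-partitioned with `format` into a small slice 0 for the log and the remainder as slice 1):

```shell
# Use slice 0 of one SSD as the log (ZIL) device and slice 1 as L2ARC cache.
zpool add tank log c2t0d0s0
zpool add tank cache c2t0d0s1
zpool status tank    # the 'logs' and 'cache' sections should now appear
```

The trade-off is that log writes and cache reads then contend for the same device's bandwidth, which is part of why dedicated devices are usually recommended.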
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
The 7.x FW on the 2500 and 6000 series does not operate the same way as the 6.x FW does, so on some/most loads the "ignore cache sync commands" option may not improve performance as expected.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
Sent: Tuesday, October 13, 2009 6:05 PM
To: Nils Goroll
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)

On Tue, 13 Oct 2009, Nils Goroll wrote:

I am trying to find out some definite answers on what needs to be done on an STK 2540 to set the Ignore Cache Sync option. The best I could find is Bob's "Sun StorageTek 2540 / ZFS Performance Summary" (dated Feb 28, 2008 - thank you, Bob), in which he quotes a posting of Joel Miller.

I should update this paper, since the performance is now radically different and the StorageTek 2540 CAM configurables have changed.

Is this information still current for F/W 07.35.44.10? I suspect that the settings don't work the same as before, but don't know how to prove it. Bonus question: is there a way to determine the setting which is currently active, if I don't know whether the controller has been booted since the NVSRAM potentially got modified?

From what I can tell, the controller does not forget these settings due to a reboot or firmware update. However, new firmware may not provide the same interpretation of the values.

Bob

--
Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
Hi Bob;

In all 2500 and 6000 series arrays you can assign RAID sets to a controller, and that controller becomes the owner of the set. Generally drives are not forced to switch between controllers: one controller always owns a disk, and the other waits in standby. Some arrays use ALUA and re-route traffic arriving at the non-preferred controller over to the preferred controller. While some companies market this as a true active-active setup, it reduces performance significantly if the host is not 100% ALUA aware, although this architecture does solve the problem of setting up MPxIO on hosts. It's likely that sometime in the future Sun may release a FW to enable ALUA on these controllers, but this definitely won't improve performance.

The advantage of the 2540 over its bigger brother (the 6140, which is EOL'ed) and its competitors is that the 2540 uses dedicated data paths for cache mirroring, just like the higher-end units (6180, 6580, 6780), improving write performance significantly.

Splitting load between controllers can increase performance most of the time, but you do not need to split into two equal partitions. Also do not forget that the first tray has dedicated data lines to the controller, so generally it's wise not to mix those drives with drives on other trays.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
Sent: Tuesday, October 13, 2009 10:59 PM
To: Nils Goroll
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)

On Tue, 13 Oct 2009, Nils Goroll wrote:

Regarding my bonus question: I haven't yet found a definite answer on whether there is a way to read the currently active controller setting.
I still assume that the NVSRAM settings, which can be read with

  service -d arrayname -c read -q nvsram region=0xf2 host=0x00

do not necessarily reflect the current configuration, and that the only way to make sure the controller is running with that configuration is to reset it.

I believe that in the STK 2540, the controllers operate Active/Active, except that each controller is Active for half the drives and Standby for the others. Each controller has a copy of the configuration information; whichever one you communicate with is likely required to mirror the changes to the other.

In my setup I load-share the fibre channel traffic by assigning six drives as active on one controller and six drives as active on the other controller, and the drives are individually exported with a LUN per drive. I used CAM to do that. MPxIO sees the changes and does map half the paths down each FC link, for more performance than one FC link offers.

Bob

--
Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] fishworks on x4275?
Hi Trevor;

As can be seen from my email address and the signature below, my answer will be quite biased :)

To be honest, while converting every X-series server, with millions of alternative configurations, to a Fishworks appliance may not be extremely difficult, it would be impossible to support them all. So Sun has to limit the number of configurations that need to be supported to a reasonable number. (Even this limited set of systems and options adds up to unmatched flexibility and choice.) However, I agree that the ability to convert an X4540 to a 7210 would have been nice.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Trevor Pretty
Sent: Sunday, October 18, 2009 11:53 PM
To: Frank Cusack
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] fishworks on x4275?

Frank

I've been looking into:
http://www.nexenta.com/corp/index.php?option=com_content&task=blogsection&id=4&Itemid=128

Only played with a VM so far on my laptop, but it does seem to be an alternative to the Sun product if you don't want to buy an S7000. IMHO: Sun are missing a great opportunity by not offering a reasonable upgrade path from an X to an S7000.

Trevor Pretty | Technical Account Manager | T: +64 9 639 0652 | M: +64 21 666 161
Eagle Technology Group Ltd. Gate D, Alexandra Park, Greenlane West, Epsom Private Bag 93211, Parnell, Auckland

Frank Cusack wrote:

Apologies if this has been covered before; I couldn't find anything in my searching. Can the software which runs on the 7000 series servers be installed on an x4275?
-frank
Re: [zfs-discuss] Sun Flash Accelerator F20
Hi James;

The product will be launched in a very short time; you can learn pricing from Sun. Please keep in mind that Logzilla and the F20 are designed with slightly different tasks in mind. Logzilla is an extremely fast and reliable write device, while the F20 can be used for many different loads (read or write cache, or both at the same time).

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of James Andrewartha
Sent: Thursday, September 24, 2009 10:21 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Sun Flash Accelerator F20

I'm surprised no-one else has posted about this - part of the Sun Oracle Exadata v2 is the Sun Flash Accelerator F20 PCIe card, with 48 or 96 GB of SLC, a built-in SAS controller and a super-capacitor for cache protection. http://www.sun.com/storage/disk_systems/sss/f20/specs.xml There's no pricing on the webpage though - does anyone know how it compares in price to a Logzilla?

--
James Andrewartha
Re: [zfs-discuss] Sun Flash Accelerator F20
Hi Richard;

You are right: ZFS is not a shared FS, so it cannot be used for RAC unless you have a 7000 series disk system. In Exadata, ASM is used for storage management, where the F20 can perform as a cache.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Richard Elling
Sent: Thursday, September 24, 2009 8:10 PM
To: James Andrewartha
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Sun Flash Accelerator F20

On Sep 24, 2009, at 12:20 AM, James Andrewartha wrote:

I'm surprised no-one else has posted about this - part of the Sun Oracle Exadata v2 is the Sun Flash Accelerator F20 PCIe card, with 48 or 96 GB of SLC, a built-in SAS controller and a super-capacitor for cache protection. http://www.sun.com/storage/disk_systems/sss/f20/specs.xml

At the Exadata-2 announcement, Larry kept saying that it wasn't a disk, but there was little else of a technical nature said, though John did have one to show. RAC doesn't work with ZFS directly, so the details of the configuration should prove interesting.

-- richard
Re: [zfs-discuss] ZFS read performance scalability
Hi;

You may be hitting a bottleneck at your HBA. Try using multiple HBAs or drive channels.

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of en...@businessgrade.com
Sent: Monday, August 31, 2009 5:16 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] ZFS read performance scalability

Hi. I've been doing some simple read/write tests using filebench on a mirrored pool. Essentially, I've been scaling up the number of disks in the pool before each test, between 4, 8 and 12. I've noticed that for individual disks, ZFS write performance scales very well between 4, 8 and 12 disks; this may be due to the fact that I'm using an SSD as a logging device. But I'm seeing individual disk performance drop by as much as 14MB per disk between 4 and 12 disks. Across the entire pool, that means I've lost 168MB of raw throughput just by adding two mirror sets. I'm curious to know if there are any dials I can turn to improve this. System details are below:

HW: Dual Quad Core 2.33 Xeon, 8GB RAM
Disks: Seagate Savvio 10K 146GB and LSI 1068e HBA, latest firmware
OS: SCXE snv_121

Thanks in advance.
Re: [zfs-discuss] zfs fragmentation
There is work underway to make NDMP more efficient on highly fragmented file systems with a lot of small files. I am not a development engineer, so I don't know much, and I do not think there is any committed work; however, the ZFS engineers on the forum may comment further.

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ed Spencer
Sent: Sunday, August 09, 2009 12:14 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] zfs fragmentation

On Sat, 2009-08-08 at 15:20, Bob Friesenhahn wrote:

An SSD slog backed by a SAS 15K JBOD array should perform much better than a big iSCSI LUN.

Now... yes. We implemented this pool years ago. I believe that, back then, the server would crash if you had a ZFS drive fail, so we decided to let the NetApp handle the disk redundancy. It's worked out well. I've looked at those really nice Sun products adoringly, and a 7000 series appliance would also be a nice addition to our central NFS service - not to mention more cost effective than expanding our Network Appliance (we have researchers who are quite hungry for storage, and NFS is always our first choice). We now have quite an investment in the current implementation; it's difficult to move away from, and the NetApp is quite a reliable product. We are quite happy with ZFS and our implementation. We just need to address our backup performance and improve it just a little bit!

We were almost lynched this spring because we encountered some pretty severe ZFS bugs. We are still running the IDR named "A wad of ZFS bug fixes" for Solaris 10 Update 6. It took over a month to resolve the issues. I work at a university, and final exams and year end occur at the same time. I don't recommend having email problems during this time! People are intolerant of email problems.
I live in hope that a NetApp OS update, or a Solaris patch, or a ZFS patch, or an iSCSI patch, or something will come along that improves our performance just a bit, so our backup people get off my back!

--
Ed
Re: [zfs-discuss] ZFS nfs performance on ESX4i
Hi Ashley;

A RAID-Z group is OK for throughput, but due to the design the whole RAID-Z group behaves like a single disk, so your max IOPS is around 100. I'd personally use RAID-10 instead. Also, you seem to have no write cache, which can affect performance; try using a log device.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ashley Avileli
Sent: Friday, August 14, 2009 2:21 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] ZFS nfs performance on ESX4i

I have set up a pool called vmstorage and mounted it as NFS storage in ESX4i. The pool in FreeNAS contains 4 SATA2 disks in raidz. I have 6 VMs (5 Linux and 1 Windows) and performance is terrible. Any suggestions on improving the performance of the current setup? I have added vfs.zfs.prefetch_disable=1, which improved the performance slightly.
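A back-of-the-envelope comparison of the random-IOPS claim above (the ~100 IOPS per spindle figure is an assumption for 7200 RPM SATA drives, and this ignores caching entirely):

```shell
# Random-read IOPS estimate for 4 disks, assuming ~100 IOPS per spindle.
per_disk=100
disks=4
raidz_iops=$per_disk                  # one raidz group behaves like one disk for random reads
mirror_iops=$((disks * per_disk))     # striped mirrors can serve reads from every spindle
echo "raidz: ~$raidz_iops IOPS, raid10: ~$mirror_iops IOPS"
```

For random writes the mirror advantage halves (each write hits both sides of a pair), but the raidz group still behaves like a single disk.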
Re: [zfs-discuss] surprisingly poor performance
Hi James,

ZFS SSD usage behaviour depends heavily on the access pattern, and for async ops ZFS will not use the SSDs. I'd suggest you disable the SSDs, create a RAM disk, and use it as the slog device to compare performance. If performance doesn't change, it means that the measurement method has some flaws or you haven't configured the slog correctly. Please note that SSDs are way slower than DRAM-based write caches.

SSDs will show a performance increase when you create load from multiple clients at the same time, as ZFS will be flushing the dirty cache sequentially. So I'd suggest running the test from a lot of clients simultaneously.

Best regards
Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200 Mobile +905339310752 Fax +90212335
Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of James Lever
Sent: Friday, July 03, 2009 10:09 AM
To: Brent Jones
Cc: zfs-discuss; storage-disc...@opensolaris.org
Subject: Re: [zfs-discuss] surprisingly poor performance

On 03/07/2009, at 5:03 PM, Brent Jones wrote:

Are you sure the slog is working right? Try disabling the ZIL to see if that helps with your NFS performance. If your performance increases a hundredfold, I'm suspecting the slog isn't performing well, or even doing its job at all.

The slog appears to be working fine - at ~800 IOPS it wasn't lighting up the light significantly, and when a second was added both activity lights were even more dim. Without the slog, the pool was only providing ~200 IOPS for the NFS metadata test.

Speaking of which, can anybody point me at a good, valid test to measure the IOPS of these SSDs?

cheers,
James
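A minimal sketch of the RAM-disk slog experiment Mertol suggests (Solaris `ramdiskadm`; the pool name, device names, and sizes are hypothetical, and a RAM-disk slog is for benchmarking only - it is volatile and unsafe for real data):

```shell
# Create a 1 GB RAM disk and swap it in as the pool's slog for comparison.
ramdiskadm -a rd-slog 1g
zpool remove tank c3t0d0                 # detach the SSD log device (placeholder name)
zpool add tank log /dev/ramdisk/rd-slog
# ...run the NFS benchmark against the pool, then undo:
zpool remove tank /dev/ramdisk/rd-slog
ramdiskadm -d rd-slog
```

If NFS sync-write performance jumps dramatically with the RAM disk, the SSD slog is the bottleneck; if it doesn't, the slog was never on the write path in the first place.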
Re: [zfs-discuss] Using single SSD for l2arc on multiple pools?
Hi Joseph; You can't share SSDs between pools (at least for today) unless you slice them. Also, it's better to use 2x SSDs for L2ARC, as depending on your system there can be slight limitations to using one SSD. Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mertol.ozyo...@sun.com -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Joseph Mocker Sent: Tuesday, June 16, 2009 10:28 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Using single SSD for l2arc on multiple pools? Hello, I'm curious if it is possible to use a single SSD for the l2arc for multiple pools? I'm guessing that I can break the SSD into multiple slices and assign a slice as a cache device in each pool. That doesn't seem very flexible though, so I was wondering if there is another way to do this? Thanks... --joe ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
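Joseph's slicing approach would look roughly like this; the device and pool names are placeholders, and the slices are assumed to have been laid out with format(1M) first:

```
# format c2t0d0                    # partition the SSD into slices s0 and s1
# zpool add pool1 cache c2t0d0s0   # slice 0 becomes L2ARC for the first pool
# zpool add pool2 cache c2t0d0s1   # slice 1 becomes L2ARC for the second pool
# zpool iostat -v pool1            # the cache device appears under its own "cache" heading
```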
Re: [zfs-discuss] Sun's flash offering(s)
Hi All, Currently the SSDs used in the 7000 series are STECs; the SSDs used inside servers are Intel. Sent from a mobile device Mertol Ozyoney On 20.Nis.2009, at 06:09, Scott Laird sc...@sigkill.org wrote: On Sun, Apr 19, 2009 at 10:20 AM, David Magda dma...@ee.ryerson.ca wrote: Looking at the web site for Sun's SSD storage products, it looks like what's been offered is the so-called Logzilla: http://www.sun.com/storage/flash/specs.jsp You know, those specs look almost *identical* to the Intel X25-E. Is this actually the STEC device, or just a rebranded Intel SSD? Not that there's anything wrong with the Intel or anything, but if you were going to buy it, it'd probably be dramatically cheaper buying it from someone other than Sun, if Sun's service contract, etc., wasn't important to you. Compare the URL above with this one: http://www.intel.com/design/flash/nand/extreme/index.htm Scott ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] j4200 drive carriers
Hi All; As you can read below, I carry a Sun badge, so my opinions could be a little bit biased :) There are a couple of reasons why you may consider a J series JBOD against some other whitebox unit. 1) Dual IO module option 2) Multipath support 3) Zone support [multiple hosts connecting to the same JBOD, or the same set of JBODs connected in series] 4) Better testing with ZFS 5) Very nice SPC-2 results Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mertol.ozyo...@sun.com -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Richard Elling Sent: Sunday, February 01, 2009 10:24 PM To: John-Paul Drawneek Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] j4200 drive carriers John-Paul Drawneek wrote: the J series is far too new to be hitting ebay yet. And a lot of people will not be buying the J series for obvious reasons The obvious reason is that Sun cannot service random disk drives you buy from Fry's (or elsewhere). People who value data tend to value service contracts for disk drives. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] S10U6 and x4500 thumper sata controller
I also need this information. Thanks a lot for keeping me in the loop. Sent from a mobile device Mertol Ozyoney On 31.Eki.2008, at 13:59, Paul B. Henson [EMAIL PROTECTED] wrote: S10U6 was released this morning (whoo-hooo!), and I was wondering if someone in the know could verify that it contains all the fixes/patches/IDRs for the x4500 sata problems? Thanks... -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | [EMAIL PROTECTED] California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Proposed 2540 and ZFS configuration
That's exactly what I said in a private email. The J4200 or J4400 can offer better price/performance. However, the price difference is not as much as you think. Besides, the 2540 has a few functions that cannot be found on the J series, like SAN connectivity, internal redundant RAID controllers [redundancy is good, and you can make use of the controllers when connected to some other hosts like Windows servers], the ability to change stripe size/RAID level and other parameters on the fly, etc. Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: Al Hopper [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 02, 2008 3:53 AM To: [EMAIL PROTECTED] Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Proposed 2540 and ZFS configuration On Mon, Sep 1, 2008 at 5:18 PM, Mertol Ozyoney [EMAIL PROTECTED] wrote: A few quick notes. The 2540's first 12 drives are extremely fast due to the fact that they have direct unshared connections. I do not mean that additional disks are slow; I want to say that the first 12 are extremely fast compared to any other disk system. So although it's a little bit expensive, it could be a lot faster to add a second 2540 than adding a second drive expansion. We generally use a few 2540's with 12 drives running in parallel for extreme performance. Again, with the additional disk tray the 2540 will still perform quite well. For this application the 2540 is overkill and a poor fit. I'd recommend a J4xxx series JBOD array and a matching SAS controller(s). With enough memory in the ZFS host, you don't need hardware RAID with buffer RAM. Spend your dollars where you'll get the best payback - buy more drives and max out the RAM on the ZFS host!!
In fact, if it's not too late, I'd return the 2540 Regards, -- Al Hopper Logical Approach Inc,Plano,TX [EMAIL PROTECTED] Voice: 972.379.2133 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Proposed 2540 and ZFS configuration
A few quick notes. The 2540's first 12 drives are extremely fast due to the fact that they have direct unshared connections. I do not mean that additional disks are slow; I want to say that the first 12 are extremely fast compared to any other disk system. So although it's a little bit expensive, it could be a lot faster to add a second 2540 than adding a second drive expansion. We generally use a few 2540's with 12 drives running in parallel for extreme performance. Again, with the additional disk tray the 2540 will still perform quite well. Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ross Sent: Sunday, August 31, 2008 12:04 PM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Proposed 2540 and ZFS configuration Personally I'd go for an 11 disk raid-z2, with one hot spare. You lose some capacity, but you've got more than enough for your current needs, and with 1TB disks single-parity raid means a lot of time with your data unprotected when one fails. You could split this into two raid-z2 sets if you wanted; that would have a bit better performance, but if you can cope with the speed of a single pool for now I'd be tempted to start with that. It's likely that by Christmas you'll be able to buy flash devices to use as read or write cache with ZFS, at which point the speed of the disks becomes academic for many cases. Adding a further 12 disks sounds fine, just as you suggest. You can add another 11 disk raid-z2 set to your pool very easily. ZFS can't yet restripe your existing data across the new disks, so you'll have some data on the old 12 disk array, some striped across all 24, and some on the new array.
ZFS probably does add some overhead compared to hardware raid, but unless you have a lot of load on that box I wouldn't expect it to be a problem. I don't know the T5220 servers though, so you might want to double check that. I do agree that you don't want to use the hardware raid though; ZFS has plenty of advantages and it's best to let it manage the whole lot. Could you do me a favour though and see how ZFS copes on that array if you just pull a disk while the ZFS pool is running? I've had some problems on a home built box after pulling disks. I suspect a proper raid array will cope fine but haven't been able to get that tested yet. thanks, Ross -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
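Ross's suggested layout — an 11-disk raidz2 with one hot spare, later extended with a second 11-disk raidz2 set — would be created roughly like this (all disk and pool names are placeholders):

```
# zpool create tank \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
           c1t6d0 c1t7d0 c2t0d0 c2t1d0 c2t2d0 \
    spare c2t3d0

  ... later, when the next 12-disk tray arrives ...
# zpool add tank raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 \
                        c3t6d0 c3t7d0 c4t0d0 c4t1d0 c4t2d0
```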
Re: [zfs-discuss] shrinking a zpool - roadmap
Can ADM ease the pain by migrating data from one pool to the other? I know it's not what most of you want, but... Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Will Murnane Sent: Thursday, August 21, 2008 1:57 AM To: Bob Friesenhahn Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] shrinking a zpool - roadmap On Wed, Aug 20, 2008 at 18:40, Bob Friesenhahn [EMAIL PROTECTED] wrote: The errant command which accidentally adds a vdev could just as easily be a command which scrambles up or erases all of the data. True enough---but if there's a way to undo accidentally adding a vdev, there's one source of disastrously bad human error eliminated. If the vdev is removable, then typing zpool evacuate c3t4d5 to fix the problem instead of getting backups up to date, destroying and recreating the pool, then restoring from backups saves quite a bit of the cost associated with human error in this case. Think of it as the analogue of zpool import -D: if you screw up, ZFS has a provision to at least try to help. The recent discussion on accepting partial 'zfs recv' streams is a similar measure. No system is perfectly resilient to human error, but any simple ways in which the resilience (especially of such a large unit as a pool!) can be improved should be considered. Will ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
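Will's reference to zpool import -D as an existing safety net looks like this in practice (the pool name is an example):

```
# zpool destroy tank     # oops
# zpool import -D        # lists destroyed pools that are still recoverable
# zpool import -D tank   # brings the pool back, data intact
```

The hypothetical "zpool evacuate" in his message is the analogous safety net he is asking for on the vdev-add side; it did not exist at the time.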
Re: [zfs-discuss] Thumper, ZFS and performance
We have done several benchmarks on Thumpers. Config 1 is definitely better on most loads. Some RAID-1 configs perform better on certain loads. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Elling Sent: Tuesday, August 12, 2008 11:46 PM To: John Malick Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Thumper, ZFS and performance John Malick wrote: There is a thread quite similar to this, but it did not provide a clear answer to the question, which was worded a bit oddly. I have a Thumper and am trying to determine, for performance, which is the best ZFS configuration of the two shown below. Any issues other than performance that anyone may see to steer me in one direction or another would be helpful as well. Thanks. Do config 1; please do not do config 2. From zpool(1): A raidz group with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand one device failing before data integrity is compromised. The minimum number of devices in a raidz group is one more than the number of parity disks. The recommended number is between 3 and 9.
-- richard

ZFS Config 1:

zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0

versus ZFS Config 2:

zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0

In a nutshell, for performance reasons, is it better to have multiple raidz vdevs in the pool or just one raidz vdev? The number of disks used is the same in either case. Thanks again. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
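For reference, the two layouts above would be created like this. Config 1 stripes across four 6-disk raidz vdevs, so random reads can be serviced by four groups in parallel; config 2 puts all 24 disks in one raidz1, giving the whole pool roughly the random-read IOPS of a single group (and it sits far outside the 3-9 disk guidance from zpool(1)):

```
# Config 1 (recommended): four top-level raidz vdevs
# zpool create rpool \
    raidz c0t1d0 c1t1d0 c4t1d0 c5t1d0 c6t1d0 c7t1d0 \
    raidz c0t2d0 c1t2d0 c4t2d0 c5t2d0 c6t2d0 c7t2d0 \
    raidz c0t3d0 c1t3d0 c4t3d0 c5t3d0 c6t3d0 c7t3d0 \
    raidz c0t4d0 c1t4d0 c4t4d0 c5t4d0 c6t4d0 c7t4d0

# Config 2 (not recommended): one 24-disk raidz1
# zpool create rpool raidz1 \
    c0t1d0 c1t1d0 c4t1d0 c5t1d0 c6t1d0 c7t1d0 \
    c0t2d0 c1t2d0 c4t2d0 c5t2d0 c6t2d0 c7t2d0 \
    c0t3d0 c1t3d0 c4t3d0 c5t3d0 c6t3d0 c7t3d0 \
    c0t4d0 c1t4d0 c4t4d0 c5t4d0 c6t4d0 c7t4d0
```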
[zfs-discuss] iSCSI Lun SnapShot success
Hi, I'd like to share our positive experience from a POC. We created a few iSCSI shares and mounted them on a Windows box. Then we took a snapshot of one of them. In the next step we converted the snapshot into a clone and then tried to mount it on the same Windows server. We all thought it would not work, as the drive IDs are the same and we had never tried to take a snapshot of iSCSI volumes. Surprise: it worked without a problem. We were able to test the volume for a couple of minutes. My question is: is what we have done OK in the long run? Can we use it for production? Best regards Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
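The POC steps can be reconstructed as a command sequence; the dataset names and size are assumptions, and shareiscsi is the old-style ZFS property for exposing a zvol as an iSCSI target:

```
# zfs create -V 100g tank/lun0
# zfs set shareiscsi=on tank/lun0         # export the zvol as an iSCSI target
# zfs snapshot tank/lun0@poc              # instant point-in-time copy
# zfs clone tank/lun0@poc tank/lun0copy   # writable clone backed by the snapshot
# zfs set shareiscsi=on tank/lun0copy     # second target, mounted on the same Windows host
```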
Re: [zfs-discuss] X4540
You are right that the X4500 has a single point of failure, but keeping a spare server module is not that expensive. As there are no cables, replacing it will take a few seconds, and after the boot everything will be OK. Besides, cluster support for JBODs will come shortly; that setup will eliminate the SPOF. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: Monday, July 14, 2008 3:58 AM To: Moore, Joe Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] X4540 On Fri, 11 Jul 2008, Moore, Joe wrote: Bob Friesenhahn I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. Assuming, of course, that the requirements for the more expensive SAN hardware don't include, for example, surviving a controller or motherboard failure (or gracefully surviving a RAM chip failure) without requiring extensive downtime for replacement, or other extended downtime because there's only 1 set of chips that can talk to those disks. I am totally with you here since today I cannot access my storage pool due to a server motherboard failure and I don't know when Sun will successfully fix it. Since I use an external RAID array for my file server, there would not be so much hardship except that I do not have a spare file server available. Bob ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS deduplication
Hi All; Is there any hope for deduplication on ZFS? Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] J4200/J4400 Array
In fact, using NVRAM in a JBOD is less safe, as most of the JBODs that use NVRAM have only one NVRAM module, not mirrored. Therefore if the NVRAM goes bad you are guaranteed inconsistency. ZFS, however, is fine-tuned in every layer for all-or-nothing commit semantics, so ZFS has the internal mechanisms to stay consistent at the time of a device failure. If you put a device between the storage and ZFS that ZFS cannot control, that device should be redundant and should be able to guarantee consistency, and JBOD NVRAM modules are very problematic. I have a customer who had 80 TB on a Lustre system whose system locked up because of a battery problem, and it took them a week to figure out what went wrong. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Albert Chin Sent: Thursday, July 03, 2008 8:17 PM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] J4200/J4400 Array On Thu, Jul 03, 2008 at 01:43:36PM +0300, Mertol Ozyoney wrote: You are right that the J series do not have NVRAM onboard. However most JBODs, like HP's MSA series, have some NVRAM. The ideas behind not using NVRAM in the JBODs are: -) There is no use adding limited RAM to a JBOD, as the disks already have a lot of cache. -) It's easy to design a redundant JBOD without NVRAM. If you have NVRAM and need redundancy, you need to design more complex hardware and more complex firmware. -) Batteries are the first thing to fail. -) Servers already have plenty of RAM. Well, if the server attached to the J series is doing ZFS/NFS, performance will increase with zfs:zfs_nocacheflush=1. But, without battery-backed NVRAM, this really isn't safe. So, for this usage case, unless the server has battery-backed NVRAM, I don't see how the J series is good for ZFS/NFS usage.
-- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
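The zfs:zfs_nocacheflush tuning Albert mentions is set in /etc/system (reboot required); as he says, it is only safe when every device behind the pool has a nonvolatile, battery-backed write cache:

```
* /etc/system fragment -- stop ZFS issuing SYNCHRONIZE CACHE to the array
* only safe when the LUNs sit behind battery-backed NVRAM
set zfs:zfs_nocacheflush = 1
```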
Re: [zfs-discuss] J4200/J4400 Array
Hi; You are right that the J series do not have NVRAM onboard. However most JBODs, like HP's MSA series, have some NVRAM. The ideas behind not using NVRAM in the JBODs are: -) There is no use adding limited RAM to a JBOD, as the disks already have a lot of cache. -) It's easy to design a redundant JBOD without NVRAM. If you have NVRAM and need redundancy, you need to design more complex hardware and more complex firmware. -) Batteries are the first thing to fail. -) Servers already have plenty of RAM. Best regards Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Albert Chin Sent: Wednesday, July 02, 2008 9:04 PM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] J4200/J4400 Array On Wed, Jul 02, 2008 at 04:49:26AM -0700, Ben B. wrote: According to the Sun Handbook, there is a new array : SAS interface 12 disks SAS or SATA ZFS could be used nicely with this box. Doesn't seem to have any NVRAM storage on board, so seems like JBOD. There is another version called J4400 with 24 disks. Doc is here : http://docs.sun.com/app/docs/coll/j4200 -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] J4200/J4400 Array
You should be able to buy them today. GA should be next week. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: Tim [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 02, 2008 9:45 PM To: [EMAIL PROTECTED]; Ben B.; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] J4200/J4400 Array So when are they going to release MSRP? On 7/2/08, Mertol Ozyoney [EMAIL PROTECTED] wrote: Availability may depend on where you are located, but the J4200 and J4400 are available for most regions. This equipment is engineered to go well with Sun open storage components like ZFS. Besides the price advantage, the J4200 and J4400 offer unmatched bandwidth to hosts or to stacked units. You can get the price from your Sun account manager. Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ben B. Sent: Wednesday, July 02, 2008 2:49 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] J4200/J4400 Array Hi, According to the Sun Handbook, there is a new array : SAS interface 12 disks SAS or SATA ZFS could be used nicely with this box. There is another version called J4400 with 24 disks. Doc is here : http://docs.sun.com/app/docs/coll/j4200 Does someone know price and availability for these products ? Best Regards, Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs on top of 6140 FC array
Depends on what benefit you are looking for. If you are looking for ways to improve redundancy, you can still benefit from ZFS: a) ZFS snapshots will give you the ability to withstand soft/user errors. b) ZFS checksums... c) ZFS can mirror (sync or async) a 6140 LUN to another storage system for increased redundancy. d) You can put another level of RAID over the 6140's internal RAID to increase redundancy. e) Use ZFS send/receive to back up data to some other place. Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Justin Vassallo Sent: Wednesday, July 02, 2008 3:23 AM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] zfs on top of 6140 FC array When set up with multi-pathing to dual redundant controllers, is layering zfs on top of the 6140 of any benefit? AFAIK this array does have internal redundant paths up to the disk connection. justin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
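Option (e), send/receive as a backup mechanism, looks like this in outline (pool, dataset, and host names are assumptions):

```
# zfs snapshot tank/data@mon
# zfs send tank/data@mon | ssh backuphost zfs receive backup/data
  ... subsequent runs send only the delta between snapshots ...
# zfs snapshot tank/data@tue
# zfs send -i tank/data@mon tank/data@tue | ssh backuphost zfs receive backup/data
```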
Re: [zfs-discuss] J4200/J4400 Array
Availability may depend on where you are located, but the J4200 and J4400 are available for most regions. This equipment is engineered to go well with Sun open storage components like ZFS. Besides the price advantage, the J4200 and J4400 offer unmatched bandwidth to hosts or to stacked units. You can get the price from your Sun account manager. Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ben B. Sent: Wednesday, July 02, 2008 2:49 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] J4200/J4400 Array Hi, According to the Sun Handbook, there is a new array : SAS interface 12 disks SAS or SATA ZFS could be used nicely with this box. There is another version called J4400 with 24 disks. Doc is here : http://docs.sun.com/app/docs/coll/j4200 Does someone know price and availability for these products ? Best Regards, Ben This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] 2 items on the wish list
Hi all; There are two things that some customers are constantly asking for about ZFS: Active-active clustering support. The ability to mount snapshots somewhere else [this doesn't look easy; perhaps a proxy kind of setup?]. Any hope for these features? Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] raid card vs zfs
Please note that IO speeds exceeding 1 GB/sec can be limited by several components, including the OS or device drivers. The current maximum performance I have seen is around 1.25 GB/sec through dual IB. Anyway, that's great real-world performance considering the price and size of the unit. Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lida Horn Sent: Wednesday, June 25, 2008 11:14 PM To: Tim Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] raid card vs zfs Tim wrote: On Wed, Jun 25, 2008 at 10:44 AM, Bob Friesenhahn [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: I see that the configuration tested in this X4500 writeup only uses the four built-in gigabit ethernet interfaces. This places a natural limit on the amount of data which can stream from the system. For local host access, I am achieving this level of read performance using one StorageTek 2540 (6 mirror pairs) and a single reading process. The X4500 with 48 drives should be capable of far more. The X4500 has two expansion bus slots but they are only 64-bit 133MHz PCI-X, so it seems that the ability to add bandwidth via more interfaces is limited. A logical improvement to the design is to offer PCI-E slots which can support 10Gbit ethernet, Infiniband, or Fiber Channel cards so that more of the internal disk bandwidth is available to power-user type clients. Bob == Bob Friesenhahn [EMAIL PROTECTED] mailto:[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Uhhh... 64bit/133MHz is 17Gbit/sec. I *HIGHLY* doubt that bus will be a limit.
Without some serious offloading, you aren't pushing that amount of bandwidth out the card. Most systems I've seen top out around 6 Gbit/sec with current drivers. Ummm, 133 MHz is just slightly above 1/8 GHz, and 64 bits is 8 x 8 bits. Multiplying yields roughly 8 Gbit/sec, or 1 GByte/sec. So even if you have two PCI-X (64-bit/133MHz) slots that are independent, that would yield at best 2 GB/sec. The SunFire x4500 is capable of doing 3 GB/sec I/O to the disks, so you would still be network-bandwidth limited. Of course if you are using ZFS and/or mirroring, that 3 GB/sec from the disks goes down dramatically, so for practical purposes the 2 GB/sec limit may well be enough. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
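Lida's arithmetic is easy to verify. A quick sanity check of the PCI-X numbers, treating 133 MHz as exactly 133 x 10^6 (real PCI-X clocks at 133.33 MHz, so this slightly understates the bus):

```shell
bus_bits=64
clock_hz=133000000
bits_per_sec=$((bus_bits * clock_hz))               # raw bus bandwidth in bit/s
gbytes_per_sec=$((bits_per_sec / 8 / 1000000000))   # convert to whole GB/s
echo "${bits_per_sec} bit/s is about ${gbytes_per_sec} GB/s per 64-bit/133MHz PCI-X slot"
```

So two independent slots top out near 2 GB/sec before protocol overhead, exactly as Lida says, and well short of Tim's 17 Gbit/sec figure.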
Re: [zfs-discuss] raid card vs zfs
I agree with the other comments. From day 1, ZFS has been fine-tuned for JBODs. While RAID cards are welcome, ZFS will perform better with JBODs. Most RAID cards have limited power and bandwidth to support the platter speeds of the newer drives, and the ZFS code seems to be more intelligent about caching. A few days ago a customer tested a Sun Fire X4500 connected to a network with 4 x 1 Gbit Ethernet. The X4500 has modest CPU power and does not use any RAID card. The unit easily performed at 400 MB/sec on write-from-LAN tests, which was clearly limited by the Ethernet ports. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: Monday, June 23, 2008 5:33 AM To: kevin williams Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] raid card vs zfs On Sun, 22 Jun 2008, kevin williams wrote: The article says that ZFS eliminates the need for a RAID card and is faster because the striping is running on the main cpu rather than an old chipset on a card. My question is, is this true? Can I Ditto what the other guys said. Since ZFS may generate more I/O traffic from the CPU, you will want an adaptor with lots of I/O ports. SATA/SAS with a port per drive is ideal. It is useful to have a NVRAM cache on the card if you will be serving NFS or running a database, although some vendors sell this NVRAM cache as a card which plugs into the backplane and uses a special driver. ZFS is memory-hungry, so 4GB of RAM is a good starting point for a server. Make sure that your CPU and OS are able to run a 64-bit kernel.
Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] memory hog
No, ZFS loves memory, and unlike most other filesystems it can make good use of it. But ZFS will free memory if it recognizes that other applications require it, or you can limit the cache the ARC will use. In my experience ZFS still performs nicely on 1 GB boxes. PS: How much does 4 GB of RAM cost for a desktop?

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Edward Sent: Monday, June 23, 2008 9:32 AM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] memory hog

So does that mean ZFS is not for consumer computers? If ZFS requires 4GB of RAM for operation, that means I will need 8GB+ RAM if I were to use Photoshop or any other memory-intensive application? And it seems ZFS memory usage scales with the amount of HDD space?

This message posted from opensolaris.org
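The "limit the cache ARC will be using" option mentioned above is usually done on Solaris 10 via an `/etc/system` tunable. A minimal sketch follows; the 1 GB value is an illustrative assumption, and the tunable name and units should be verified against your Solaris release before use:

```
* /etc/system - cap the ZFS ARC at 1 GB so desktop applications keep their RAM
* (value is in bytes, written in hex; a reboot is required for this to take effect)
set zfs:zfs_arc_max = 0x40000000
```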
[zfs-discuss] Oracle and ZFS
Hi All ; One of our customers suffered from a filesystem being corrupted after an unattended shutdown due to a power problem. They want to switch to ZFS. From what I have read, ZFS will most probably not be corrupted by the same event. But I am not sure how Oracle will be affected by a sudden power outage when placed on ZFS? Any comments? PS: I am aware of UPSs and similar technologies, but the customer is still asking those "what if ..." questions ... Mertol

Mertol Ozyoney, Sun Microsystems
[zfs-discuss] Adaptec 3085 or similar: support for larger than 2 TB LUNs?
Hi ; Has anyone used the Adaptec 3085 or similar? I'd like to learn whether LUNs larger than 2 TB are supported. Best regards

Mertol Ozyoney, Sun Microsystems
Re: [zfs-discuss] SSD reliability, wear levelling, warranty period
Hi All ; Every NAND-based SSD has some RAM. Consumer-grade products will have smaller, non-battery-protected RAM, a smaller number of NAND chips working in parallel, and a slower CPU to distribute the load. A consumer product will also have fewer spare cells. Enterprise SSDs are generally composed of several NAND devices and a lot of spare cells, controlled by a fast microcontroller which also has some cache and a supercapacitor to protect that cache. Regardless of NAND write-cycle capability, vendors can design a reliable SSD by incorporating more spare cells into the design. Mertol

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard L. Hamilton Sent: Wednesday, June 11, 2008 2:58 PM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] SSD reliability, wear levelling, warranty period

btw: it seems to me that this thread is a little bit OT. I don't think it's OT - because SSDs make perfect sense as ZFS log and/or cache devices. If I did not make that clear in my OP then I failed to communicate clearly. In both these roles (log/cache) reliability is of the utmost importance. Older SSDs (before cheap and relatively high-cycle-limit flash) were RAM cache + battery + hard disk. Surely RAM + battery + flash is also possible; the battery only needs to keep the RAM alive long enough to stage to the flash. That keeps the write count on the flash down, and the speed up (RAM being faster than flash). Such a device would of course cost more, and be less dense (given having to have battery + charging circuits and RAM as well as flash), than a pure flash device.
But with more limited write rates needed, and no moving parts, _provided_ it has full ECC and maybe radiation-hardened flash (if that exists), I can't imagine why such a device couldn't be exceedingly reliable and have quite a long lifetime (with the battery, hopefully replaceable, being more of a limitation than the flash). It could be a matter of paying for how much quality you want... As for reliability, from zpool(1m): log A separate intent log device. If more than one log device is specified, then writes are load-balanced between devices. Log devices can be mirrored. However, raidz and raidz2 are not supported for the intent log. For more information, see the “Intent Log” section. cache A device used to cache storage pool data. A cache device cannot be mirrored or part of a raidz or raidz2 configuration. For more information, see the “Cache Devices” section. [...] Cache Devices Devices can be added to a storage pool as “cache devices.” These devices provide an additional layer of caching between main memory and disk. For read-heavy workloads, where the working set size is much larger than what can be cached in main memory, using cache devices allow much more of this working set to be served from low latency media. Using cache devices provides the greatest performance improvement for random read-workloads of mostly static content. To create a pool with cache devices, specify a “cache” vdev with any number of devices. For example: # zpool create pool c0d0 c1d0 cache c2d0 c3d0 Cache devices cannot be mirrored or part of a raidz configuration. If a read error is encountered on a cache device, that read I/O is reissued to the original storage pool device, which might be part of a mirrored or raidz configuration. The content of the cache devices is considered volatile, as is the case with other system caches. That tells me that the zil can be mirrored and zfs can recover from cache errors. 
I think that means that these devices don't need to be any more reliable than regular disks, just much faster. So... expensive ultra-reliability SSD, or much less expensive SSD plus mirrored ZIL? Given what ZFS can do with cheap SATA, my bet is on the latter...
Re: [zfs-discuss] ZFS conflict with MAID?
Hi ; If you want to use ZFS's special ability to pool all the storage together to supply thin-provisioning-like functionality, this will work against MAID. However, there is always the option to set up ZFS just like any other FS (i.e. one disk - one FS). By the way, if I am not mistaken, MAID-like functionality is built into Solaris. The Solaris gurus should answer this part, but I think there is a command to enable MAID-like functionality on SATA drives. Mertol

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Kunze Sent: Friday, June 06, 2008 7:29 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] ZFS conflict with MAID?

My organization is considering an RFP for MAID storage and we're wondering about potential conflicts between MAID and ZFS. We want MAID's power management benefits but are concerned that what we understand to be ZFS's use of dynamic striping across devices, with filesystem metadata replication and cache syncing, will tend to keep disks spinning that the MAID is trying to spin down. Of course, we like ZFS's large namespace and dynamic memory pool resizing ability. Is it possible to configure ZFS to maximize the benefits of MAID? -John

John A. Kunze [EMAIL PROTECTED], California Digital Library, 415 20th St, #406, Oakland, CA 94612 USA, University of California, Work: +1-510-987-9231, http://dot.ucop.edu/home/jak/
Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08
I strongly agree with most of the comments. I guess I tried to keep it simple, perhaps a little bit too simple. If I am not mistaken, most NAND disks virtualize the underlying cells, so even if you update the same sector, the update will be made somewhere else. So the time to wear out an enterprise-grade (NAND-based) SSD will be quite long, although I wouldn't recommend keeping the swap file or any sort of fast-changing cache on those drives. Say you have a 146 GB SSD, the write-cycle rating is around 100k, and you can write/update data at 10 MB/sec (depending on the I/O pattern it could be a lot slower or a lot faster). It will take about 4 hours, roughly 14,600 seconds, to fully populate the drive. Multiply this by 100k and that is about 46 years. Even if the virtualization algorithms work at only 25% efficiency, this is still 10 years plus. And if I am not mistaken, all enterprise NANDs and most consumer NANDs do read-after-write verify and will mark bad blocks. This also increases the usable lifetime, as you will not be marking a whole device failed, just a cell... Please correct me where I am wrong, as I am not that knowledgeable on this subject. Mertol

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: Tuesday, 27 May 2008 18:55 To: Mertol Ozyoney Cc: 'ZFS Discuss' Subject: Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08

On Mon, 26 May 2008, Mertol Ozyoney wrote: It's true that NAND-based flash wears out under heavy load. Regular consumer-grade NAND drives will wear out the extra cells pretty rapidly (in a year or so). However, enterprise-grade SSD disks are fine-tuned to withstand continuous writes for more than 10 years

It is incorrect to classify wear in terms of years without also specifying update behavior.
NAND FLASH sectors can withstand 100,000 to (sometimes) 1,000,000 write-erase cycles. In normal filesystem use, there are far more reads than writes, and the size of the storage device is much larger than the data re-written. Even in server use, only a small fraction of the data is updated. A device used to cache writes will be written to as often as it is read from (or perhaps more often). If the cache device storage is fully occupied, then wear-leveling algorithms based on statistics do not have much opportunity to work. If the underlying device sectors are good for 100,000 write-erase cycles and the entire device is re-written once per second, then the device is not going to last very long (about 27 hours). Of course the write performance for these devices is quite poor (8-120MB/second), and the write performance seems to be proportional to the total storage size, so it is quite unlikely that you could re-write a suitably performant device once per second. The performance of FLASH SSDs does not seem very appropriate for use as a write cache device. There is a useful guide to these devices at http://www.storagesearch.com/ssd-buyers-guide.html. SRAM-based cache devices which plug into a PCI-X or PCI-Express slot seem far more appropriate for use as a write cache than a slow SATA device. At least 5X or 10X the performance is available by this means. Bob
==
Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
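The endurance arithmetic in the two messages above can be checked directly. This is only a sketch of the back-of-the-envelope numbers (146 GB drive, 10 MB/s sustained writes, 100,000 erase cycles, ideal wear leveling), not a model of real flash behavior:

```python
# Back-of-the-envelope NAND endurance estimates from the thread above.

GB = 10**9
MB = 10**6

capacity = 146 * GB      # drive size assumed in the thread
write_rate = 10 * MB     # sustained write rate, bytes/sec
cycles = 100_000         # write-erase cycles per cell

# Time to overwrite the whole drive once: 14,600 s, about 4 hours.
fill_seconds = capacity / write_rate

# Ideal wear leveling: every cell is written once per full pass,
# so the drive survives `cycles` full passes -- roughly 46 years.
lifetime_years = cycles * fill_seconds / (3600 * 24 * 365)
print(round(fill_seconds), round(lifetime_years, 1))

# Bob's write-cache case: the entire device rewritten once per second
# burns one erase cycle per second -- under 28 hours of life.
cache_lifetime_hours = cycles / 3600
print(round(cache_lifetime_hours, 1))
```

The 25%-efficiency figure quoted in the thread simply scales the ideal lifetime down by four, which still leaves 10+ years for archival-style workloads but does nothing for the rewrite-every-second cache case.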
Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08
By the way, all enterprise SSDs have an internal DRAM-based cache, and some vendors may quote the write performance of that internal RAM device. Normally NAND drives, due to read-after-write operations and several other reasons, will not perform well under a write-based load.

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: Tuesday, 27 May 2008 20:22 To: Tim Cc: ZFS Discuss Subject: Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08

On Tue, 27 May 2008, Tim wrote: You're still concentrating on consumer-level drives. The STEC drives EMC is using, for instance, exhibit none of the behaviors you describe.

How long have you been working for STEC? ;-) Looking at the specifications for STEC SSDs, I see that they are very good at IOPS (probably many times faster than the Solaris I/O stack). Write performance of the fastest product (ZEUS IOPS) is similar to a typical SAS hard drive, with the remaining products being much slower. This is all that STEC has to say about FLASH lifetime in their products: http://www.stec-inc.com/technology/flash_life_support.php. There are no hard facts to be found there. The STEC SSDs are targeted towards being a replacement for a traditional hard drive. There is no mention of lifetime when used as a write-intensive cache device. Bob
[zfs-discuss] ZFS as a shared file system
Hi All ; Does anyone know the status of support for ZFS on active-active clusters? Best regards Mertol

Mertol Ozyoney, Sun Microsystems
Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?
Hi All ; The 2 TB limit on the 6000 series will be removed when we release CAM 6.1 and the Crystal firmware. I can't give an actual date at the moment, but it's pretty close. The same will happen for the 2500 series, but it will take some more time. Mertol

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy Lubel Sent: Tuesday, 20 May 2008 00:30 To: Torrey McMahon; Bob Friesenhahn Cc: zfs-discuss@opensolaris.org; Kenny Subject: Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?

The limitation existed in every Sun-branded Engenio array we tested - 2510, 2530, 2540, 6130, 6540. This limitation is on volumes: you will not be able to present a LUN larger than that magical 1.998TB. I think it is a combination of both CAM and the firmware; can't do it with sscs either... Warm and fuzzy: Sun engineers told me they would have a new release of CAM (and firmware bundle) in late June which would resolve this limitation. Or just do a ZFS (or even SVM) setup like Bob and I did. It's actually pretty nice because the traffic will split to both controllers, giving you theoretically more throughput so long as MPxIO is functioning properly. The only (minor) downside is that parity is being transmitted from the host to the disks rather than living on the controller entirely. -Andy

From: [EMAIL PROTECTED] on behalf of Torrey McMahon Sent: Mon 5/19/2008 1:59 PM To: Bob Friesenhahn Cc: zfs-discuss@opensolaris.org; Kenny Subject: Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?

Bob Friesenhahn wrote: On Mon, 19 May 2008, Kenny wrote: Bob M.- Thanks for the heads-up on the 2 (1.998) TB LUN limit. This has me a little concerned, esp. since I have 1 TB drives being delivered! Also thanks for the SCSI cache flushing heads-up, yet another item to look up!
(grin) I am not sure if this LUN size limit really exists, or if it exists, in which cases it actually applies. On my drive array, I created a 3.6TB RAID-0 pool with all 12 drives included during the testing process. Unfortunately, I don't recall if I created a LUN using all the space. I don't recall ever seeing mention of a 2TB limit in the CAM user interface or in the documentation.

The Solaris LUN limit is gone if you're using Solaris 10 and recent patches. The array limit(s) are tied to the type of array you're using. (Which type is this again?) CAM shouldn't be enforcing any limits of its own, only reporting back when the array complains.
[zfs-discuss] Solaris SAMBA questions
Hi All ; I need help figuring out a solution for customer requirements. We will most probably be using Solaris 10u5 or OpenSolaris 2008.05, so I would be very pleased if you could state your opinion on Solaris 10 + Samba versus OpenSolaris's integrated CIFS-serving capabilities. The system will be accessed by several Windows systems. The first requirement is for the system to have auditing capabilities: the customer wants to be able to see who has done what, and when, to files. The second requirement is about administering file permissions. Here are some questions (sorry, I have absolutely no knowledge about Samba): 1) Can Samba get the user lists from Active Directory? (I guess this is basic functionality and can be done.) 2) Once ownership of a directory is assigned, can the owner set the permissions from a Windows workstation? Thanks for the answers. Best regards

Mertol Ozyoney, Sun Microsystems
[zfs-discuss] ZFS and Linux
Hi All ; What is the status of ZFS on Linux, and which kernels are supported? Regards Mertol

Mertol Ozyoney, Sun Microsystems
[zfs-discuss] Cifs and Solaris
Hi ; I can't remember off the top of my head: does the latest Solaris version support CIFS, or do we need OpenSolaris? Best regards Mertol

Mertol Ozyoney, Sun Microsystems
[zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Hi All ; We are running the latest Solaris 10 on an X4500 Thumper. We defined a test iSCSI LUN. Output below:

Target: AkhanTemp/VM iSCSI Name: iqn.1986-03.com.sun:02:72406bf8-2f5f-635a-f64c-cb664935f3d1 Alias: AkhanTemp/VM Connections: 0 ACL list: TPGT list: LUN information: LUN: 0 GUID: 01144fa709302a0047fa50e6 VID: SUN PID: SOLARIS Type: disk Size: 100G Backing store: /dev/zvol/rdsk/AkhanTemp/VM Status: online

We tried to access the LUN from a Windows laptop, and it worked without any problems. However, a VMware ESX 3.0.2 server is unable to access the LUNs. We checked that the virtual interface can ping the X4500. Sometimes it sees the LUN, but then 200+ LUNs with the same properties are listed and we can't add them as storage; after a rescan they vanish. Any help appreciated. Mertol

Mertol Ozyoney, Sun Microsystems
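For readers following along, a LUN like the one listed above is typically built with the Solaris 10-era zvol/iSCSI workflow. This is only a sketch (the dataset name is taken from the listing; exact syntax should be verified against your Solaris release):

```
# 1. Create a 100 GB zvol to back the LUN
zfs create -V 100g AkhanTemp/VM

# 2. Export it as an iSCSI target
zfs set shareiscsi=on AkhanTemp/VM

# 3. Inspect the target name, GUID and backing store
iscsitadm list target -v
```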
Re: [zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Thanks James ; The problem is nearly identical to mine. When we had 2 LUNs, VMware tried to multipath over them; I think this is a bug inside VMware, as it thinks that two LUN 0s are the same. I think I can fool it by setting up targets with different LUN numbers. After I figured this out, I switched to a single LUN, cleaned a few things and reinitialized. For some reason VMware found 200+ LUNs again, but all with a single active path. I started formatting the first one (100 GB). It's been going on for the last 20 minutes; I hope it will eventually finish. zpool iostat shows only small activity on the disks. I think that VMware 3.0.2 has a severe bug; I will try to open a case at VMware. But there seem to be a lot of people on the web who had no problems with the exact same setup. Mertol

Mertol Ozyoney, Sun Microsystems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, 7 April 2008 20:42 To: [EMAIL PROTECTED] Cc: zfs-discuss@opensolaris.org; [EMAIL PROTECTED] Subject: Re: [zfs-discuss] iSCSI targets mapped to a VMWare ESX server

Mertol Ozyoney wrote: Hi All ;

There are a set of issues being looked at that prevent the VMware ESX server from working with the Solaris iSCSI target: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6597310 At this time there is no target date when these issues will be resolved. Jim

We are running the latest Solaris 10 on an X4500 Thumper. We defined a test iSCSI LUN.
[zfs-discuss] Can not add ZFS LOG devices
Hi All ; I am a newbie on Solaris and ZFS and I am in need of your help. I am setting up a zpool on a Thumper (Sun Fire X4500 with 48 internal SATA drives). Our engineers have set up the latest Solaris 10. When I try to create a zpool with log disks (mirrored) included, I get an error message from which I understand that log devices are not supported. I checked the ZFS version and it's 4 (I know it's low, and I don't know how to upgrade the ZFS software pieces). Can anyone help? Mertol

Mertol Ozyoney, Sun Microsystems
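As a sketch of the usual fix, assuming the pool version (not the OS) is what is back-level — separate intent log devices arrived in a later pool version, I believe version 7 — the pool can be upgraded in place. Device names below are hypothetical:

```
# List supported pool versions and what each adds
zpool upgrade -v

# Upgrade all pools to the newest version the installed OS supports
zpool upgrade -a

# Then attach the mirrored log, e.g.:
zpool add mypool log mirror c4t0d0 c4t1d0
```

Note that `zpool upgrade` only helps if the installed Solaris already supports log devices; a pool at version 4 on an OS whose ZFS tops out at version 4 needs an OS update first.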
Re: [zfs-discuss] ZFS performance lower than expected
Hi Bart; Your setup is composed of a lot of components. I'd suggest the following: 1) check the system with one SAN server and see the performance; 2) check the internal performance of one SAN server; 3) try using Solaris instead of Linux, as the Solaris iSCSI target could offer more performance; 4) for performance over IB I strongly suggest Lustre; 5) check your Ethernet setup. Regards Mertol

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bart Van Assche Sent: Thursday, 20 March 2008 17:33 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] ZFS performance lower than expected

Hello, I just made a setup in our lab which should make ZFS fly, but unfortunately performance is significantly lower than expected: for large sequential data transfers write speed is about 50 MB/s while I was expecting at least 150 MB/s.

Setup - The setup consists of five servers in total: one OpenSolaris ZFS server and four SAN servers. ZFS accesses the SAN servers via iSCSI and IPoIB.

* ZFS Server: Operating system: OpenSolaris build 78. CPU: two Intel Xeon CPUs, eight cores in total. RAM: 16 GB. Disks: not relevant for this test.

* SAN Servers: Operating system: Linux 2.6.22.18 kernel, 64-bit, plus iSCSI Enterprise Target (IET); IET has been configured such that it performs both read and write caching. CPU: Intel Xeon E5310, 1.60GHz, four cores in total. RAM: two servers with 8 GB RAM, one with 4 GB, one with 2 GB. Disks: 16 in total: two with the Linux OS and 14 set up in RAID-0 via LVM. The LVM volume is exported via iSCSI and used by ZFS. These SAN servers give excellent performance results when accessed via Linux's open-iscsi initiator.

* Network: 4x SDR InfiniBand. The raw transfer speed of this network is 8 Gbit/s.
Netperf reports 1.6 Gbit/s between the ZFS server and one SAN server (IPoIB, single-threaded). iSCSI transfer speed between the ZFS server and one SAN server is about 150 MB/s.

Performance test - Software: xdd (see also http://www.ioperformance.com/products.htm). I modified xdd such that the -dio command line option enables O_RSYNC and O_DSYNC in open() instead of calling directio(). Test command: xdd -verbose -processlock -dio -op write -targets 1 testfile -reqsize 1 -blocksize $((2**20)) -mbytes 1000 -passes 3. This test command triggers synchronous writes with a block size of 1 MB (verified with truss). I am using synchronous writes because these give the same performance results as very large buffered writes (large compared to ZFS's cache size). Write performance reported by xdd for synchronous sequential writes: 50 MB/s, which is lower than expected. Any help with improving the performance of this setup is highly appreciated. Bart Van Assche.
[zfs-discuss] ZFS Send and recieve
Hi All ; I am not a Solaris or ZFS expert and I am in need of your help. When I run the following command: zfs send -i [EMAIL PROTECTED] [EMAIL PROTECTED] | ssh 10.10.103.42 zfs receive -F data/data41 — if someone is accessing the data/data41 folder, the system gives the following error message: cannot unmount: device is busy. I assume this is normal. I want to know how I can suspend the user accessing the folder until the send and receive command finishes its job. Best regards Mertol

Mertol Ozyoney, Sun Microsystems
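One common approach to the "cannot unmount" problem above is to keep the replication target unmounted on the receiving side, since `zfs receive -F` must roll the dataset back and cannot do so while it is in use. A sketch follows; the snapshot names (@1, @2) are placeholders for the redacted names in the original command, and behavior should be verified against your Solaris release:

```
# Option A: keep the replication target permanently unmounted on the receiver,
# so nothing can hold it busy during the receive
ssh 10.10.103.42 zfs set canmount=off data/data41

# Option B: unmount on the receiver just before the incremental receive
ssh 10.10.103.42 zfs unmount data/data41
zfs send -i data/data41@1 data/data41@2 | \
    ssh 10.10.103.42 zfs receive -F data/data41
```

With option A, readers who need the replicated data can be pointed at a clone of the latest received snapshot instead of the live target.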
Re: [zfs-discuss] five megabytes per second with Microsoft iSCSI initiator (2.06)
Please also check http://www.microsoft.com/downloads/details.aspx?familyid=12CB3C1A-15D6-4585-B385-BEFD1319F825&displaylang=en Best regards

Mertol Ozyoney, Sun Microsystems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Tracy Sent: Tuesday, 19 February 2008 22:02 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] five megabytes per second with Microsoft iSCSI initiator (2.06)

Hello All- I've been creating iSCSI targets on the following two boxes: - Sun Ultra 40 M2 with eight 10K SATA disks - Sun x2200 M2 with two 15K RPM SAS drives. Both were running build 82. I'm creating a ZFS volume and sharing it with zfs set shareiscsi=on poolname/volume. I can access the iSCSI volume without any problems, but IO is terribly slow, as in five megabytes per second sustained transfers. I've tried creating an iSCSI target stored on a UFS filesystem, and get the same slow IO. I've tried every level of RAID available in ZFS with the same results. The client machines are Windows 2003 Enterprise Edition SP2, running Microsoft iSCSI initiator 2.06, and Windows XP SP2, running MS iSCSI initiator 2.06. I've tried moving some of the client machines to the same physical switch as the target servers, and get the same results. I've tried another switch, and get the same results. I've even physically isolated the computers from my network, and get the same results. I'm not sure where to go from here and what to try next. The network is all gigabit. I normally have the Solaris boxes in an 802.3ad LAG group, tying two physical NICs together, which should give me a max of 2 Gb/s of bandwidth (250 megabytes per second). Of course, I've tried no LAG connections with the same results.
In short, I've tried every combination of everything I know to try, except using a different iSCSI client/server software stack (well, I did try the 2.05 version of MS's iSCSI initiator client--same result). Here is what I'm seeing with performance logs on the Windows side: on any of the boxes, I see the queue length for the hard disk (iSCSI target) go from under 1 to 600+, and then back to under 1, about every four or five seconds. On the Solaris side, I'm running iostat -xtc 1, which shows lots of IO activity on the hard drives associated with my ZFS pool, then about three or four seconds of pause, then lots of activity again for a second or two, then a lull again, and the cycle repeats as long as I'm doing active sustained IO against the iSCSI target. The output of prstat doesn't show any heavy processor/memory usage on the Solaris box. I'm not sure what other monitors to run on either side to get a better picture. Any recommendations on how to proceed? Does anybody else use the Solaris iSCSI target software to export iSCSI targets to initiators running the MS iSCSI initiator? Thank you- John
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Hi Bob; When you have some spare time can you prepare a simple benchmark report in PDF that I can share with my customers to demonstrate the performance of the 2540? Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: 16 Şubat 2008 Cumartesi 19:57 To: Joel Miller Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Performance with Sun StorageTek 2540 On Sat, 16 Feb 2008, Joel Miller wrote: Here is how you can tell the array to ignore cache sync commands and the force unit access bits...(Sorry if it wraps..) Thanks to the kind advice of yourself and Mertol Ozyoney, there is a huge boost in write performance: Was: 154MB/second Now: 279MB/second The average service time for each disk LUN has dropped considerably. The numbers provided by 'zpool iostat' are very close to what is measured by 'iozone'. This is like night and day and gets me very close to my original target write speed of 300MB/second. Thank you very much! Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Hi Tim; The 2540 controller can achieve a maximum of 250 MB/sec on writes with the first 12 drives, so you are pretty close to maximum throughput already. RAID 5 can be a little bit slower. Please try to distribute LUNs between the controllers and try to benchmark with cache mirroring disabled (it's different from disabling the cache). Best regards Mertol http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Sent: 15 Şubat 2008 Cuma 03:13 To: Bob Friesenhahn Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Performance with Sun StorageTek 2540 On 2/14/08, Bob Friesenhahn [EMAIL PROTECTED] wrote: Under Solaris 10 on a 4 core Sun Ultra 40 with 20GB RAM, I am setting up a Sun StorageTek 2540 with 12 300GB 15K RPM SAS drives and connected via load-shared 4Gbit FC links. This week I have tried many different configurations, using firmware managed RAID, ZFS managed RAID, and with the controller cache enabled or disabled. My objective is to obtain the best single-file write performance. Unfortunately, I am hitting some sort of write bottleneck and I am not sure how to solve it. I was hoping for a write speed of 300MB/second. With ZFS on top of a firmware managed RAID 0 across all 12 drives, I hit a peak of 200MB/second. With each drive exported as a LUN and a ZFS pool of 6 pairs, I see a write rate of 154MB/second. The number of drives used has not had much effect on write rate. Information on my pool is shown at the end of this email. I am driving the writes using 'iozone' since 'filebench' does not seem to want to install/work on Solaris 10. 
I am suspecting that the problem is that I am running out of IOPS since the drive array indicates an average IOPS of 214 for one drive even though the peak write speed is only 26MB/second (peak read is 42MB/second). Can someone share with me what they think the write bottleneck might be and how I can surmount it? Thanks, Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

% zpool status
  pool: Sun_2540
 state: ONLINE
 scrub: none requested
config:

        NAME                                   STATE   READ WRITE CKSUM
        Sun_2540                               ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B80003A8A0B096A47B4559Ed0  ONLINE     0     0     0
            c4t600A0B80003A8A0B096E47B456DAd0  ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B80003A8A0B096147B451BEd0  ONLINE     0     0     0
            c4t600A0B80003A8A0B096647B453CEd0  ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B80003A8A0B097347B457D4d0  ONLINE     0     0     0
            c4t600A0B800039C9B50A9C47B4522Dd0  ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B800039C9B50AA047B4529Bd0  ONLINE     0     0     0
            c4t600A0B800039C9B50AA447B4544Fd0  ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B800039C9B50AA847B45605d0  ONLINE     0     0     0
            c4t600A0B800039C9B50AAC47B45739d0  ONLINE     0     0     0
          mirror                               ONLINE     0     0     0
            c4t600A0B800039C9B50AB047B457ADd0  ONLINE     0     0     0
            c4t600A0B800039C9B50AB447B4595Fd0  ONLINE     0     0     0

errors: No known data errors

freddy:~% zpool iostat
            capacity     operations    bandwidth
pool      used  avail   read  write   read  write
--------  ----  -----  -----  -----  -----  -----
Sun_2540  64.0G  1.57T   808    861  99.8M   105M

freddy:~% zpool iostat -v
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  ----  -----  -----  -----  -----  -----
Sun_2540    64.0G  1.57T   809    860   100M   105M
  mirror    10.7G   267G   135    143  16.7M  17.6M
    c4t600A0B80003A8A0B096A47B4559Ed0  -  -  66  141  8.37M  17.6M
    c4t600A0B80003A8A0B096E47B456DAd0  -  -  67  141  8.37M  17.6M
  mirror    10.7G   267G   135    143  16.7M  17.6M
    c4t600A0B80003A8A0B096147B451BEd0  -  -  66  141  8.37M  17.6M
Re: [zfs-discuss] Performance with Sun StorageTek 2540
Yes, it does replicate data between the controllers. Usually it slows things down a lot, especially in write-heavy environments. If you properly tune ZFS you may not need this feature for consistency... Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: Bob Friesenhahn [mailto:[EMAIL PROTECTED] Sent: 16 Şubat 2008 Cumartesi 18:43 To: Mertol Ozyoney Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] Performance with Sun StorageTek 2540 On Sat, 16 Feb 2008, Mertol Ozyoney wrote: Please try to distribute LUNs between the controllers and try to benchmark with cache mirroring disabled (it's different from disabling the cache). By the term disabling cache mirroring are you talking about Write Cache With Replication Enabled in the Common Array Manager? Does this feature maintain a redundant cache (two data copies) between controllers? Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Solaris File Server ZFS and Cifs
Hi; One of my customers is considering a 10 TB NAS box for some Windows boxes. Reliability and high performance are mandatory. So I plan to use 2x clustered servers + some storage and ZFS and Solaris. Here are my questions: 1) Is anybody using clustered Solaris and ZFS for file serving in an active-active configuration? (granularity required between cluster nodes is at share level) 2) Is CIFS support reliable at the moment? 3) Can we implement Ethernet port failover for CIFS services? 4) Can we implement Ethernet failover for iSCSI services? 5) Is it easy and automatic to failover and failback CIFS shares? Very best regards Mertol http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Real time mirroring
I think I have heard something called dirty time logging being implemented in ZFS. Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of J.P. King Sent: 08 Şubat 2008 Cuma 10:26 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Real time mirroring Someone suggested an idea, which the more I think about the less insane it sounds. I thought I would ask the assembled masses to see if anyone had tried anything like this, and how successful they had been. I'll start with the simplest variant of the solution, but there are potentially subtleties which could be applied. Take 3 machines, for the sake of argument 2 x4500s and an x4100 as a head unit. Export the storage from each of the x4500s by making it an iSCSI target. Import the storage onto the x4100 making it an iSCSI initiator. Using ZFS (and I assume this is preferable to Solaris Volume Manager) set up a mirror between the two sets of storage. Assuming that works, one of the two servers can be moved to a different site, and you now have real time, cross site mirroring of data. For added tweaks I believe that I can arrange to have two head units so that I can do something resembling failover of data, if not necessarily instantaneously. The only issue I haven't yet been able to come up with a solution for in this thought experiment is how to recover quickly from one half of the mirror going down. As far as I can tell I need to re-silver the entire half of the mirror, which could take some time. Am I missing some clever trick here? I'm interested in any input as to why this does or doesn't work, and I'm especially interested to hear from anyone that has actually done something like this already. 
Cheers, Julian -- Julian King Computer Officer, University of Cambridge, Unix Support ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
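Julian's thought experiment can be sketched as commands on the head unit, assuming the two x4500s already export their disks as iSCSI targets (the addresses and device names here are hypothetical):

# iscsiadm add discovery-address 10.0.0.1:3260
# iscsiadm add discovery-address 10.0.0.2:3260
# iscsiadm modify discovery --sendtargets enable
# zpool create tank mirror c2t1d0 c3t1d0

The zpool mirror pairs one LUN from each x4500, so every write is committed on both boxes; losing one half leaves the pool running degraded until the remote side returns and resilvers.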
Re: [zfs-discuss] Real time mirroring
What is the procedure for enabling DTL ? PS: I am no unix guru Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: 08 Şubat 2008 Cuma 13:42 To: J.P. King Cc: Mertol Ozyoney; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Real time mirroring J.P. King wrote: I think I have heard something called dirty time logging being implemented in ZFS. Thanks for the pointer. Certainly interesting, but according to the talks/emails I've found a month or so ago ZFS will offer this, so I am guessing it isn't there yet, and certainly not in a released version of Solaris. Knowing that it is (probably) on the way is still useful. It is already there, see here http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/ zfs/sys/vdev_impl.h#130 and try full-text search for dtl in usr/src/uts/common/fs/zfs/ as well hth, victor ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Draft one-pager for zfs-auto-snapshots
Tim; Excellent work. This is one great feature that should have been implemented in ZFS long ago. I also recommend integrating this functionality with the ZFS GUI. And a global manager that manages snapshots of multiple servers would be the dream of a system admin. Keep up the good work. Best regards Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Foster Sent: 04 Şubat 2008 Pazartesi 17:14 To: zfs-discuss Subject: [zfs-discuss] Draft one-pager for zfs-auto-snapshots Hi all, I put together the attached one-pager on the ZFS Automatic Snapshots service which I've been maintaining on my blog to date. I would like to see if this could be integrated into ON and believe that a first step towards this is a project one-pager: so I've attached a draft version. I'm happy to defer judgement to the ZFS team as to whether this would be a suitable addition to OpenSolaris - if the consensus is that it's better for the service to remain in its current un-integrated state and be discovered through BigAdmin or web searches, that's okay by me. [ just thought I'd ask ] cheers, tim -- Tim Foster, Sun Microsystems Inc, Solaris Engineering Ops http://blogs.sun.com/timf ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Case #65841812
Don't take my words as expert advice, as I am a newbie when it comes to ZFS. If I am not mistaken, if you are only using Oracle on the particular zpool, Oracle's checksum offers better protection against data corruption, and you can disable ZFS checksums. Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Macdonald - Sun Microsystem Sent: 01 Şubat 2008 Cuma 15:31 To: zfs-discuss@opensolaris.org; [EMAIL PROTECTED] Subject: [zfs-discuss] Case #65841812 Below is my customer's issue. I am stuck on this one. I would appreciate if someone could help me out on this. Thanks in advance! ZFS Checksum feature: I/O checksum is one of the main ZFS features; however, there is also block checksum done by Oracle. This is good when utilizing UFS since it does not do checksums, but with ZFS it can be a waste of CPU time. Suggestions have been made to change the Oracle db_block_checksum parameter to false which may give significant performance gain on ZFS. What are Sun's stance and/or suggestions on making this change on the ZFS side as well as making the changes on the Oracle side. -- Scott MacDonald - Sun Support Services, Technical Support Engineer, Mon - Fri 8:00am - 4:30pm EST, Ph: 1-800-872-4786 (option 2 case #), email: [EMAIL PROTECTED], alias: [EMAIL PROTECTED], www.sun.com/service/support If you need immediate assistance please call 1-800-USA-4-SUN, option 2 and the case number. If I am unavailable, and you need immediate assistance, please press 0 for more options. To track package delivery, call Logistics at 1(800)USA-1SUN, option 1 Thank you for using SUN. 
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
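If the customer does decide to rely on Oracle's db_block_checksum and drop the ZFS one, the property can be disabled per dataset (tank/oradata is an example name). Bear in mind that without checksums ZFS can no longer detect or self-heal silent corruption on that dataset, so this trades protection for CPU:

# zfs set checksum=off tank/oradata
# zfs get checksum tank/oradata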
Re: [zfs-discuss] x4500 x2
Hi; Why don't you buy one X4500 and one X4500 motherboard as a spare, along with a few cold standby drives? Best regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jorgen Lundman Sent: 31 Ocak 2008 Perşembe 13:13 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] x4500 x2 If we were to get two x4500s, with the idea of keeping one as a passive standby (serious hardware failure) are there any clever solutions in doing so? We can not use ZFS itself, but rather zpool volumes, with UFS on-top. I assume there is no zpool send/recv (although, that would be pretty neat if there was!). Doing full rsyncs all the time would probably be slow. Would it be possible to do a snapshot, then 10 minutes later, another snapshot and only rsync the differences? Any advice will be appreciated. Lund -- Jorgen Lundman | [EMAIL PROTECTED] Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell) Japan | +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
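On builds where sending volumes is supported, 'zfs send -i' does exactly the differences-only copy Jorgen is asking about; the pool, volume and snapshot names below are examples:

# zfs snapshot pool/vol@t2
# zfs send -i pool/vol@t1 pool/vol@t2 | ssh standby zfs recv pool/vol

Only the blocks changed between @t1 and @t2 cross the wire, which should be far cheaper than a full rsync of a UFS filesystem sitting on the zvol.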
[zfs-discuss] ZFS as cluster file system
Hi; I am regularly making ZFS presentations and everybody is asking when they can use ZFS as a cluster file system (active-active, at least 2 nodes). Any idea? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Help needed ZFS vs Veritas Comparison
Hi Everyone; I will soon be making a presentation comparing ZFS against Veritas Storage Foundation. Do we have any document comparing features? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Help needed ZFS vs Veritas Comparison
Good points. I will try to focus on these areas. Very best regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] From: Sengor [mailto:[EMAIL PROTECTED] Sent: Friday, December 28, 2007 4:41 PM To: [EMAIL PROTECTED] Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Help needed ZFS vs Veritas Comparison Perhaps a few that might help: http://www.sun.com/software/whitepapers/solaris10/zfs_veritas.pdf http://www.symantec.com/enterprise/stn/articles/article_detail.jsp?articleid=SF_and_ZFS_whitepaper_44545 http://www.serverwatch.com/tutorials/article.php/3663066 I'm yet to see a side by side features comparison. A real comparison of features should include scenarios such as:
- how ZFS/VxVM compare in BCV-like environments (eg. when volumes are presented back to the same host)
- how they all cope with various multipathing solutions out there
- filesystem vs volume snapshots
- portability within cluster-like environments (SCSI reserves, LUN visibility to multiple synchronous hosts)
- disaster recovery scenarios
- ease/difficulty of data migrations across physical arrays
- boot volumes
- online vs offline attribute/parameter changes
I can't think of more right now, it's way past midnight here ;) On 12/28/07, Mertol Ozyoney [EMAIL PROTECTED] wrote: Hi Everyone; I will soon be making a presentation comparing ZFS against Veritas Storage Foundation. Do we have any document comparing features? 
regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] -- _/ sengork.blogspot.com / ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Trial x4500, zfs with NFS and quotas.
Hello All; While sometimes not possible, a ZFS+Thumper solution is not so far away from replacing NetApp-like equipment that is expensive to buy and own. What people can sometimes forget is that Thumper and Solaris are general purpose products that can be specialized with some effort. We had some cases where we had to fine tune the X4500 and ZFS for more stability or performance. At the end of the day the benefits were well worth the effort. Best regards Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jorgen Lundman Sent: Tuesday, December 11, 2007 4:22 AM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Trial x4500, zfs with NFS and quotas. I don't know... while it will work I'm not sure I would trust it. Maybe just use Solaris Volume Manager with Soft Partitioning + UFS and forget about ZFS in your case? Well, the idea was to see if it could replace the existing NetApps as that was what Jonathan promised it could do, and we do use snapshots on the NetApps, so having zfs snapshots would be attractive, as well as easy to grow the file-system as needed. (Although, perhaps I can growfs with SVM as well.) You may be correct about the trust issue though. copied over a small volume from the netapp: Filesystem size used avail capacity Mounted on 1.0T 8.7G 1005G 1% /export/vol1 NAME SIZE USED AVAIL CAP HEALTH ALTROOT zpool1 20.8T 5.00G 20.8T 0% ONLINE - So copied 8.7Gb, to compressed volume takes up 5Gb. That is quite nice. Enable the same quotas for users, then run quotacheck: [snip] #282759 fixed: files 0 - 4939 blocks 0 - 95888 #282859 fixed: files 0 - 9 blocks 0 - 144 Read from remote host x4500-test: Operation timed out Connection to x4500-test closed. and it has not come back, so not a panic, just a complete hang. I'll have to get NOC staff to go power cycle it. 
We are bending over backwards trying to get the x4500 to work in a simple NAS design, but honestly, the x4500 is not a NAS. Nor can it compete with NetApps. As a Unix server with lots of disks, it is very nice. Perhaps one day it can compete, mind you; it just is not there today. -- Jorgen Lundman | [EMAIL PROTECTED] Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell) Japan | +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
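For what it's worth, the native ZFS answer to per-user quotas at this point is one filesystem per user with a quota property, rather than UFS user quotas on a zvol (the names below are illustrative):

# zfs create zpool1/home/alice
# zfs set quota=10g zpool1/home/alice
# zfs get quota zpool1/home/alice

That avoids quotacheck entirely, though it does mean one mount per user, which has its own scaling considerations on a box with thousands of accounts.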
[zfs-discuss] Simultaneous access to a single ZFS volume
Hi; When will ZFS support multiple servers accessing the same file system? Best regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] running both client and server on the same server
Hello; Which version of Lustre can run both server and client on the same server? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] iSCSI on ZFS with Linux initiator
Hi; Does anyone have experience with iSCSI target volumes on ZFS accessed by Linux clients (Red Hat, SUSE)? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS and clustering - wrong place?
Some answers below, Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ross Sent: Tuesday, November 06, 2007 1:50 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] ZFS and clustering - wrong place? I'm just starting to learn about Solaris, ZFS, etc... It's amazing me how much is possible, but it's just shy of what I'd really, really like to see. I can see there's a fair amount of interest in ZFS and clustering, and it seems Sun are actively looking into this, but I'm wondering if that's the right place to do it? Now I may be missing something obvious here, but it seems to me that for really reliable clustering of data you need to be dealing with it at a higher layer, effectively where iSCSI sits. Instead of making ZFS cluster aware, wouldn't it be easier to add support for things like mirroring, striping (even raid) to the iSCSI protocol? You are missing something obvious here. Looking from the application layer, a file system is at a higher level than a storage protocol. Besides, iSCSI is a protocol and ZFS is a file system, so there is virtually no reason to compare them. What Sun is doing at the moment is trying to support active-active access of cluster nodes to the same ZFS file system, and active-active access is managed at the FS level. Accessing shared storage is another thing; ZFS defines nothing about how you access raw devices (FCP, iSCSI, SATA etc.). You can access your storage with iSCSI and use ZFS over it. That way you get to use ZFS locally with all the benefits that entails (guaranteed data integrity, etc), and you also have a protocol somewhere in the network layer that guarantees data integrity to the client (confirming writes at multiple locations, painless failover, etc...). Essentially doing for iSCSI what ZFS did for disk. 
You'd need support for this in the iSCSI target as it would seem to make sense to store the configuration of the cluster on every target. That way the client can connect to any target and read the information on how it is to connect. But once that's done, your SAN speed is only limited by the internal speed of your switch. If you need fast performance, add half a dozen devices and stripe data across them. If you need reliability mirror them. If you want both, use a raid approach. Who needs expensive fibre channel when you can just stripe a pile of cheap iSCSI devices? It would make disaster recovery and HA a piece of cake. For any network like ourselves with a couple of offices and a fast link between them (any university campus would fit that model), you just have two completely independent servers and configure the clients to stream data to them both. No messy configuration of clustered servers, and support for multicast on the network means you don't even have to slow your clients down. The iSCSI target would probably need to integrate with the file system to cope with disasters. You'd need an intelligent way to re-synchronise machines when they came back online, but that shouldn't be too difficult with ZFS. I reckon you could turn Solaris ZFS into the basis for one of the most flexible SAN solutions out there. What do you all think? Am I off my rocker or would an approach like this work? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS mirroring
I meant continuous data replication between different systems. Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darren J Moffat Sent: Monday, October 22, 2007 12:43 PM To: [EMAIL PROTECTED] Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS mirroring Mertol Ozyoney wrote: Hi; Do any of you know when ZFS remote mirroring will be available? Depends on exactly what you mean by remote mirroring? Please explain exactly what you mean by this because there are many different definitions. Some of which ZFS already meets, for others it depends on your constraints. -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS mirroring
I know I haven't defined my particular needs. However, I am looking for a simple explanation of what is available today and what will be available in the short term. For example: one-to-one async replication is supported, many-to-one sync replication is supported, etc. Regards Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Darren J Moffat Sent: Monday, October 22, 2007 12:48 PM To: [EMAIL PROTECTED] Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS mirroring Mertol Ozyoney wrote: I tried to mean continous data replication between different systems. Still not enough information. How many systems, are all of them expected to be read/write at all times? Is HA cluster software (such as Sun Cluster) involved? Does the replication need to be synchronous (ie a transaction on A will fail if B can't commit it) or is asynchronous sufficient? If asynchronous, what is the time window for replication? Is it a single master with remote copies (that are in standby)? What network/storage infrastructure is available? -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] random or sequential writes and JBOD redundancy
Hi all; Optimizing the array controller for random or sequential IO depends on your read/write ratio. ZFS has the ability to combine random writes into a sequential write; however, reads will still be random. And on JBOD redundancy I have a few comments: 1) You can do a three-way mirror on very critical volumes. Using ZFS saves a lot of money; the cheapest 2540 controllers will still set you back 10k $'s, so a three-way mirror will still be economical. 2) You may write scripts to add spares after a failure; this may not be very flexible, however it will do what you want. My 2 cents http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
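The two suggestions above look like this in practice (the disk names are placeholders):

# zpool create critical mirror c1t0d0 c1t1d0 c1t2d0
# zpool add critical spare c1t3d0

The first command builds a three-way mirror; the second attaches a hot spare that ZFS will pull in automatically when a mirror member fails, which covers much of what a replacement script would do.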
[zfs-discuss] ZFS mirroring
Hi; Do any of you know when ZFS remote mirroring will be available? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS and Lustre
Hi; I noticed that Lustre OSS can be set up on ZFS file systems. http://wiki.lustre.org/index.php?title=Lustre_OSS/MDS_with_ZFS_DMU Is there anyone out there who can share their experience with me? regards http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Distributed ZFS
Hi Ged; At the moment ZFS is neither a shared file system nor a parallel file system. However, Lustre integration, which will take some time, will provide parallel file system abilities. I am unsure whether Lustre at the moment supports redundancy between storage nodes (it was on the roadmap). But ZFS at the moment supports Sun Cluster 3.2 (no parallel access is supported) and new upcoming SAS JBODs will let you implement a cheaper ZFS cluster easily (2 x entry-level Sun servers + 4 x 48-slot JBODs). http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Mounting ZFS Pool to a different server
Hi; One of my customers is using ZFS on IBM DS4800 LUNs. They use one LUN for each ZFS pool, if it matters. They want to take the pool offline from one server and bring it online from another server. In summary, they want to take control of a ZFS pool if the primary server fails for some reason. I know we can do it with Sun Cluster, however this is pretty complex and expensive. How can this be achieved? Regards Mertol http://www.sun.com/ http://www.sun.com/emrkt/sigs/6g_top.gif Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
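Without Sun Cluster, the manual version of this failover is an export on the failing host (when it is still reachable) followed by an import on the survivor; -f forces the import when the primary died without exporting. Note that nothing here fences the LUN; importing a pool that is still actively in use elsewhere will corrupt it, and that fencing is precisely what the cluster framework adds. The pool name is an example:

# zpool export dbpool          (on the primary, if still up)
# zpool import dbpool          (on the standby, normal case)
# zpool import -f dbpool       (on the standby, if the primary died holding the pool)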