Re: [zfs-discuss] ZFS raid is very slow???

2007-07-07 Thread Orvar Korvar
I am using Solaris Express Community build 67 installed on a 40GB harddrive 
(UFS filesystem on Solaris), dual boot with Windows XP. I have a zfsraid with 4 
samsung drives. It is a [EMAIL PROTECTED] and 1GB RAM.




When I copy a 1.3 GB file from the ZFS pool back into the same ZFS pool, the 
command "time cp file file2" gives this output:

bash-3.00# time cp PAGEFILE.SYS pagefil3
real    0m49.719s
user    0m0.004s
sys     0m10.160s

Which gives roughly 26 MB/s. 
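The MB/s figure is just the file size divided by the wall-clock time; assuming 
the file is roughly 1331 MB, a quick check with bc gives about the same number:

# 1.3 GB is roughly 1331 MB; divide by the 49.7 s of real time above
echo "scale=1; 1331 / 49.7" | bc        # prints 26.7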




When I copy that file from ZFS to UFS I get:
real    0m35.091s
user    0m0.004s
sys     0m15.337s

Which gives about 37 MB/s.


However, in each of the above scenarios, the system monitor shows that all 
RAM is used up and the system begins to swap (swap usage reaches about 40 MB). My 
system has never swapped before (Windows swaps immediately upon startup, ha!). The 
CPU utilization is around 50%.





When I copy that file from UFS to UFS I get:
real    1m36.315s
user    0m0.003s
sys     0m11.327s
However, the CPU utilization is around 20% and RAM usage never exceeds 600 MB - 
it doesn't use the swap.






When I copy that file from ZFS to /dev/null I get this output:
real    0m0.025s
user    0m0.002s
sys     0m0.007s
which can't be correct. Is it wrong of me to use "time cp file file2" when 
measuring disk performance?





I mount NTFS with the FSWfsmisc and FSWfspart packages by Moinak Ghosh (based 
on Martin Rosenau's work and part of Moinak's BeleniX work).
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Cyril Plisko
Hello,

This is a third request to open the materials of the PSARC case
2007/171 "ZFS Separate Intent Log".
I am not sure why the two previous requests were completely ignored
(even when seconded by another community member).
In any case, that is an absolutely unacceptable practice.



On 6/30/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hello !

 I am adding zfs-discuss as it is directly relevant to this community.

 On 6/23/07, Cyril Plisko [EMAIL PROTECTED] wrote:
  Hi,
 
  can the materials of the above be opened for the community?
 
  --
  Regards,
  Cyril
 


 --
 Regards,
 Cyril



-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Mike Gerdts
On 7/7/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hello,

 This is a third request to open the materials of the PSARC case
 2007/171 ZFS Separate Intent Log
 I am not sure why two previous requests were completely ignored
 (even when seconded by another community member).
 In any case that is absolutely unaccepted practice.

The past week of inactivity is likely related to most of Sun in the US
being on mandatory vacation.  Sun typically shuts down for the week
that contains July 4 and (I think) the week between Christmas and Jan
1.

Mike
-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Casper . Dik

On 7/7/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hello,

 This is a third request to open the materials of the PSARC case
 2007/171 ZFS Separate Intent Log
 I am not sure why two previous requests were completely ignored
 (even when seconded by another community member).
 In any case that is absolutely unaccepted practice.

The past week of inactivity is likely related to most of Sun in the US
being on mandatory vacation.  Sun typically shuts down for the week
that contains July 4 and (I think) the week between Christmas and Jan
1.



But it was not mandatory this year (though many appear to have arranged 
vacation because of the 4th of July, and because they half expected a
mandatory shutdown).

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Compression algorithms - Project Proposal

2007-07-07 Thread roland
nice idea! :)

We plan to start with the development of a fast implementation of a Burrows 
Wheeler Transform based algorithm (BWT).

Why not start with lzo first? It's already in zfs-fuse on Linux, and it looks 
like it sits in between lzjb and gzip in terms of performance and compression 
ratio. It remains to be demonstrated that it behaves similarly on Solaris.

see http://www.opensolaris.org/jive/thread.jspa?messageID=111031
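
For anyone who wants to compare the existing algorithms on their own data 
today, a rough sketch (pool and dataset name made up for illustration, 
assuming the default mountpoint):

# lzjb and gzip are the compression algorithms available in ZFS today
zfs create tank/comptest
zfs set compression=gzip tank/comptest
# copy in some representative data, then read back the achieved ratio
cp -r /usr/share/man /tank/comptest/
zfs get compressratio tank/comptest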
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS performance and memory consumption

2007-07-07 Thread Łukasz
 When tuning recordsize for things like databases, we
 try to recommend
 that the customer's recordsize match the I/O size of
 the database
 record.

On this filesystem I have:
 - file links, and they are rather static
 - small files (about 8 kB) that keep changing
 - big files (1 MB - 20 MB) used as temporary files (create, write, read, 
unlink), and operations on these files are about 50% of all I/O

I think I need a defragmentation tool. Do you think there will be one?

For now, all I can do is copy the filesystems from this zpool to another. 
After this operation the new zpool will not be fragmented. 
But it takes time, and I have several zpools like this.
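
For example (a sketch; the pool and filesystem names below are just for 
illustration), the copy could be done at the dataset level with zfs 
send/receive rather than with cp:

# snapshot the filesystem, then stream it to the new pool; the receive side
# writes the blocks out fresh on the destination
zfs snapshot tank/data@migrate
zfs send tank/data@migrate | zfs receive tank2/data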
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Performance as a function of Disk Slice

2007-07-07 Thread Richard Elling
Scott Lovenberg wrote:
 First Post!
 Sorry, I had to get that out of the way to break the ice...

Welcome!

 I was wondering if it makes sense to zone ZFS pools by disk slice, and if it 
 makes a difference with RAIDZ.  As I'm sure we're all aware, the end of a 
 drive is half as fast as the beginning (where the zoning stipulates that 
 the physical outside is the beginning and going towards the spindle increases 
 hex value).

IMHO, it makes sense to short-stroke if you are looking for the
best performance.  But raidz (or RAID-5) will not give you the
best performance; you'd be better off mirroring.
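
For example, with the same four disks (hypothetical device names), the two
layouts would look like this; the striped mirrors give better small random
I/O performance, while the raidz gives more usable space:

# two 2-way mirrors striped together (better IOPS, ~50% usable space)
zpool create fasttank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

# one 4-disk raidz (more usable space, one disk of parity, lower random IOPS)
zpool create bigtank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0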

 I usually short stroke my drives so that the variable files on the operating 
 system drive are at the beginning, page in center (so if I'm already in 
 thrashing I'm at most 1/2 a platters width from page), and static files are 
 towards the end.  So, applying this methodology to ZFS, I partition a drive 
 into 4 equal-sized quarters, and do this to 4 drives (each on a separate SATA 
 channel), and then create 4 pools which hold each 'ring' of the drives.  Will 
 I then have 4 RAIDZ pools, which I can mount according to speed needs?  For 
 instance, I always put (in Linux... I'm new to Solaris) '/export/archive' all 
 the way on the slow tracks since I don't read or write to it often and it is 
 almost never accessed at the same time as anything else that would force long 
 strokes.
 
 Ideally, I'd like to do a straight ZFS on the archive track.  I move data to 
 archive in chunks, 4 gigs at a time - when I roll it in I burn 2 DVDs, 1 gets 
 cataloged locally and the other offsite, so if I lose the data, I don't care 
 - but, ZFS gives me the ability to snapshot to archive (I assume it works 
 across pools?).  Then stripe 1 ring  (I guess this is ZFS native?), 
 /usr/local (or its Solaris equivalent) for performance.  Then mirror the root 
 slice.  Finally, /export would be RAIDZ or RAIDZ2 on the fastest track, 
 holding my source code, large files, and things I want to stream over the LAN.
 
 Does this make sense with ZFS?  Is the spindle count more of a factor than 
 stroke latency?  Does ZFS balance these things out on its own via random 
 scattering?

Spindle count almost always wins for performance.
Note: bandwidth usually isn't the source of perceived performance
problems, latency is.  We believe that this has implications for
ZFS over time due to COW, but nobody has characterized this yet.

 Reading back over this post, I've found it sounds like the ramblings of a 
 madman.  I guess I know what I want to say, but I'm not sure the right 
 questions to ask.  I think I'm saying:  Will my proposed setup afford me the 
 flexibility to zone for performance since I have a more intimate knowledge of 
 the data going onto the drive, or will brute force by spindle count (I'm 
 planning 4-6 drives - single drive to  a bus) and random placement be 
 sufficient if I just add the whole drive to a single pool?

Yes :-)  YMMV.

 I thank you all for your time and patience as I stumble through this, and I 
 welcome any point of view or insights (especially those from experience!) 
 that might help me decide how to configure my storage server.

KISS.

There are trade-offs for space, performance, and RAS.  We have models
to describe these, so you might check out my blogs on the subject.
http://blogs.sun.com/relling
  -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Neil Perrin
Cyril,

I wrote this case and implemented the project. My problem was
that I didn't know what policy (if any) Sun has about publishing
ARC cases, or about publishing a mail log with a gazillion email addresses.

I did receive an answer to this, in the form of:

http://www.opensolaris.org/os/community/arc/arc-faq/arc-publish-historical-checklist/

Never having done this before, it seems somewhat burdensome and will take some time.

Sorry for the slow response and lack of feedback. Are there
any particular questions you have about separate intent logs
that I can answer before I embark on the process?

Neil.

Cyril Plisko wrote:
 Hello,
 
 This is a third request to open the materials of the PSARC case
 2007/171 ZFS Separate Intent Log
 I am not sure why two previous requests were completely ignored
 (even when seconded by another community member).
 In any case that is absolutely unaccepted practice.
 
 
 
 On 6/30/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hello !

 I am adding zfs-discuss as it directly relevant to this community.

 On 6/23/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hi,

 can the materials of the above be open for the community ?

 --
 Regards,
 Cyril


 --
 Regards,
 Cyril

 
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Cyril Plisko
On 7/7/07, Neil Perrin [EMAIL PROTECTED] wrote:
 Cyril,

 I wrote this case and implemented the project. My problem was
 that I didn't know what policy (if any) Sun has about publishing
 ARC cases, and a mail log with a gazillion email addresses.

 I did receive an answer to this this in the form:

 http://www.opensolaris.org/os/community/arc/arc-faq/arc-publish-historical-checklist/

 Never having done this it seems somewhat burdensome, and will take some time.

Neil,

I am glad the message finally got through.

It seems to me that the URL above refers to publishing the
materials of *historical* cases. Do you think the case at hand
should be considered historical?

Anyway, many ZFS-related cases were openly reviewed from
moment zero of their life; why was this one an exception?


 Sorry, for the slow response and lack of feedback. Are there
 any particular questions you have about separate intent logs
 that I can answer before I embark on the process?

Well, the only question I have now is: what is it all about?
It is hard to ask questions without access to the case materials,
right?

-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs space efficiency

2007-07-07 Thread -=dave
Agreed.

While a bitwise check is the only assured way to determine whether two blocks 
are duplicates, if the check were done in a streaming fashion as you suggest, 
the performance hit - while huge compared to not checking - would be more than 
bearable in an environment with large known levels of duplicate data, such as 
a large central backup "zfs send" target.  The checksum metadata is sent 
first, then the data; the receiving system checks its db for a possible dupe 
and, if one is found, reads the data from local disks and compares it to the 
data as it comes from the sender.  If it gets to the end and hasn't found a 
difference, it updates the block pointer to point at the duplicate.  This 
won't save any bandwidth during the backup, but it will save on-disk space and, 
given the application, could be very advantageous.
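
The verify step itself is cheap to express; a minimal sketch of the idea with 
standard Solaris tools (the block file names are made up for illustration - the 
checksum match is only a hint, and the byte-for-byte compare is what confirms 
the dupe):

# compare an incoming block against the candidate already on disk
if [ "`digest -a sha256 blockA`" = "`digest -a sha256 blockB`" ]; then
        # checksums match -- still do the full bytewise compare before
        # pointing the new block at the existing copy
        if cmp -s blockA blockB; then
                echo "verified duplicate"
        else
                echo "checksum collision - keep both copies"
        fi
fi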

Thank you for the insightful discussion on this.  Within the electronic 
discovery and records and information management space, data deduplication and 
policy-based aging are the foremost topics of the day, but that is at the file 
level, and block-level deduplication would lend no benefit there regardless.

-=dave
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs space efficiency

2007-07-07 Thread -=dave
One other thing... the checksums for all the files to be sent *could* be 
checked first in batch, with known unique blocks prioritized and sent first, 
and the possibly duplicate data sent afterwards to be verified as dupes, 
thereby decreasing the possible data loss for the backup window to a level 
equivalently low to the checksum collision probability.

-=dave
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS raid is very slow???

2007-07-07 Thread Sean Hafeez
ZFS is a 128-bit file system. The performance on your 32-bit CPU will 
not be that good; ZFS was designed for 64-bit CPUs. Another GB of RAM 
might help. There are a bunch of posts in the archive about 32-bit CPUs 
and performance.
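
You can check which kernel the box is actually running with isainfo:

# prints, e.g., "64-bit amd64 kernel modules" or "32-bit i386 kernel modules"
isainfo -kv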

-Sean




Orvar Korvar wrote:
 I am using Solaris Express Community build 67 installed on a 40GB harddrive 
 (UFS filesystem on Solaris), dual boot with Windows XP. I have a zfsraid with 
 4 samsung drives. It is a [EMAIL PROTECTED] and 1GB RAM.




 When I copy a 1.3G file from ZFSpool to ZFSpool the command time cp file 
 file2 gives this output:

 bash-3.00# time cp PAGEFILE.SYS pagefil3
 real    0m49.719s
 user    0m0.004s
 sys     0m10.160s

 Which gives like 26MB/sec. 




 When I copy that file from ZFS to UFS I get:
 real    0m35.091s
 user    0m0.004s
 sys     0m15.337s

 Which gives 37MB/sec.


 However, in each of the above scenarios, the system monitor shows that all 
 RAM is used up and it begins to swap (the swap uses like 40MB). My system has 
 never swapped before (Windows swaps immediately upon startup, ha!). The cpu 
 utilization is like 50%.





 When I copy that file from UFS to UFS I get:
 real    1m36.315s
 user    0m0.003s
 sys     0m11.327s
 However, the CPU utilization is around 20% and RAM usage never exceeds 600MB 
 - it doesnt use the swap.






 When I copy that file from ZFS to /dev/null I get this output:
 real    0m0.025s
 user    0m0.002s
 sys     0m0.007s
 which can't be correct. Is it wrong of me to use time cp fil fil2 when 
 measuring disk performance?





 I mount NTFS with packages FSWfsmisc and FSWfspart, by Moinak Ghosh (and 
 based on Martin Rosenau's work and part of Moinak's BeleniX work)
  
  
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS raid is very slow???

2007-07-07 Thread Jonathan Edwards


On Jul 7, 2007, at 06:14, Orvar Korvar wrote:


When I copy that file from ZFS to /dev/null I get this output:
real    0m0.025s
user    0m0.002s
sys     0m0.007s
which can't be correct. Is it wrong of me to use time cp fil fil2  
when measuring disk performance?


Well, you're reading and writing to the same disk, so that's going to 
affect performance, particularly as you're seeking to different areas 
of the disk both for the files and for the uberblock updates.  In 
the above case it looks like the file is already cached (the buffer cache 
is probably what's consuming most of your memory here), so 
you're just looking at a memory-to-memory transfer.  If you 
want to see a simple write performance test, many people use dd like so:


# timex dd if=/dev/zero of=file bs=128k count=8192

which will give you a measure of an efficient 1 GB file write of 
zeros, or use an open-source tool like iozone to get a better 
fix on single-thread vs. multi-thread, read/write mix, and block-size 
differences for your given filesystem and storage layout.
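
For example, something like this (a sketch; iozone is not bundled with 
Solaris, and the test-file path below is just for illustration):

# auto mode over a range of record sizes, capping the test file at 2 GB,
# with the test file placed on the pool under test
iozone -a -g 2g -f /tank/iozone.tmp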


jonathan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss