[zfs-discuss] Odp: Re[2]: Re: Re[2]: Re: Re: Re: Snapshots impact on performance

2007-07-27 Thread Łukasz K
On 26-07-2007 at 13:31, Robert Milkowski wrote:
 Hello Victor,
 
 Wednesday, June 27, 2007, 1:19:44 PM, you wrote:
 
 VL Gino wrote:
  Same problem here (snv_60).
  Robert, did you find any solutions?
 
 VL A couple of weeks ago I put together an implementation of space maps
 VL which completely eliminates loops and recursion from the space map alloc
 VL operation, and allows different allocation strategies to be implemented
 VL quite easily (of which I put together 3 more). It looks like it works for
 VL me on a thumper and on my notebook with ZFS root, though I have almost no
 VL time to test it more these days due to year end. I haven't done a SPARC
 VL build yet and I do not have a test case to test against.
 
 VL Also, it comes at a price - I have to spend some more time (logarithmic,
 VL though) during all other operations on space maps, and it is not
 VL optimized now.
 
 Lukasz (cc) - maybe you can test it and even help on tuning it?
 
Yes, I can test it. I'm setting up an environment to compile OpenSolaris
and test ZFS. I will be ready next week.

Victor, can you tell me where to look for your changes?
How do I change the allocation strategy?
I can see that by changing space_map_ops_t
I can declare different callback functions.

Lukas






Re: [zfs-discuss] ZFS forks (Was: LZO compression?)

2007-07-27 Thread Mario Goebbels
Accessibility of the data is also a reason, in dual-boot scenarios. It
doesn't need to be a native Windows driver, but something that still
ties into Explorer. There's still the option of running Solaris in
VMware, but that's a bit heavy-handed.

-mg
 TT You like Windows /that much/? Note Sun isn't doing the OS X port,
 TT Apple is.

 It has nothing to do with whether I like it or not.
 It's a question of whether ZFS could become the default file system on
 the most important platform.

   





Re: [zfs-discuss] ZFS forks (Was: LZO compression?)

2007-07-27 Thread andrewk9
Windows has a user-mode driver framework thingy - I came across it recently in
the list of services on my XP box. Perhaps this could be used to host a ZFS
driver on Windows?

Andrew.
 
 


Re: [zfs-discuss] Read-only (forensic) mounts of ZFS

2007-07-27 Thread Darren J Moffat
Mark Furner wrote:
 Hi 
 
 Sorry for the cross-posting, I'd sent this to zfs-code originally. Wrong 
 forum.

and I've already replied there:

http://mail.opensolaris.org/pipermail/zfs-code/2007-July/000557.html

--
Darren J Moffat


[zfs-discuss] Read-only (forensic) mounts of ZFS

2007-07-27 Thread Mark Furner
Hi 

Sorry for the cross-posting, I'd sent this to zfs-code originally. Wrong 
forum.

I'm looking into forensic aspects of ZFS, in particular ways to use ZFS tools 
to investigate ZFS file systems without writing to the pools. I'm working on 
a test suite of file system images within VTOC partitions. At the moment, 
these only have 1 file system per pool per VTOC partition for simplicity's 
sake, and I'm using Solaris 10 6/06, which may not be the most up-to-date. At 
the bottom are details of the tests.
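
For reference, each image was taken from a raw slice and then verified by
checksum, roughly as follows (whether dd was actually used, and the block
size, are assumptions; the device and path match the transcript below):

dd if=/dev/dsk/c0t1d0s1 of=/export/home/t1_fs1.dd bs=1024k
gsha1sum /dev/dsk/c0t1d0s1 /export/home/t1_fs1.dd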

The problem: I was not able to use a loopback device on a file system image 
(see TEST section). Here are some questions:
* Am I missing a command or something? 
* Is there support for lofiadm in a more recent version of ZFS? 
* Or is there any way to safely mount a file system image?

Thanks for your help.

Regards

Mark

GOOD NEWS
It looks as if the zfs mount options can stop updates of file system metadata
(i.e. mount times, etc.) and file metadata (no writing of file access times).
Quote from man zfs, 25 Apr 2006, p. 11 (Temporary Mount Point Properties):
 ... these options can be set on a per-mount basis using the -o option,
 without affecting the property that is stored on disk. The values
 specified on the command line will override the values stored in the
 dataset. The -nosuid option is an alias for nodevices,nosetuid. These
 properties are reported as temporary by the zfs get command.
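
So, for example, a temporary read-only mount of an already-imported dataset
would look like this (pool/fs1 here is just a placeholder dataset name; note
that zfs mount takes a dataset name, not a device path, which is also why the
/dev/lofi/1 attempts below fail):

zfs umount pool/fs1
zfs mount -o ro,noatime,nodevices,nosetuid pool/fs1
zfs get atime,readonly pool/fs1    # the overridden values show SOURCE "temporary"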


TEST 

26.07.2007
Forensic mounting of ZFS file systems.
A loopback device does not seem to work with ZFS, using either zfs mount or
legacy mount.
However, temporary command-line options can prevent mounts from writing to a
file system.

MAKE COPY
[EMAIL PROTECTED] /export/home# cp t1_fs1.dd t1_fs1.COPY.dd

CHECKSUMS

[EMAIL PROTECTED] /export/home# gsha1sum t1*
5c08a7edfe3d04f5fff6d37c6691e85c3745629f  t1_fs1.COPY.dd
5c08a7edfe3d04f5fff6d37c6691e85c3745629f  t1_fs1.dd

CHECKSUM RAW DEV FOR FS1
[EMAIL PROTECTED] /export/home# gsha1sum /dev/dsk/c0t1d0s1
5c08a7edfe3d04f5fff6d37c6691e85c3745629f  /dev/dsk/c0t1d0s1
[EMAIL PROTECTED] /export/home#


PREPARE LOOPBACK DEVICE
note: the full path to the file is needed

[EMAIL PROTECTED] /export/home# lofiadm -a /export/home/t1_fs1.COPY.dd 
/dev/lofi/1
[EMAIL PROTECTED] /export/home# lofiadm
Block Device File
/dev/lofi/1  /export/home/t1_fs1.COPY.dd
[EMAIL PROTECTED] /export/home#

ZFS MOUNT OF LOOPBACK DEVICE DOESN'T WORK

[EMAIL PROTECTED] /export/home# zfs mount -o 
noexec,nosuid,noatime,nodevices,ro /dev/lofi/1 /fs1
too many arguments
usage:
[...]
[EMAIL PROTECTED] /export/home# zfs mount -o ro,noatime /dev/lofi/1
cannot open '/dev/lofi/1': invalid filesystem name

NOR DOES LEGACY MOUNT

[EMAIL PROTECTED] /export/home# mount -F zfs -o 
noexec,nosuid,noatime,nodevices,ro /dev/lofi/1 /fs1
cannot open '/dev/lofi/1': invalid filesystem name

TRY MOUNT OF NORMAL FS

[EMAIL PROTECTED] /export/home# mount -o noexec,nosuid,noatime,nodevices,ro fs1 
/fs1
[EMAIL PROTECTED] /export/home# ls -lR /fs1
/fs1:
total 520
-rw-r--r--   1 mark staff 234179 Jul 17 20:17 
gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt
drwxr-xr-x   3 root root   5 Jul 26 14:12 level_1

/fs1/level_1:
total 1822
-rwxr-xr-x   1 mark staff 834236 Jul 17 20:16 imgp2219.jpg
-rw-r--r--   1 mark staff   1388 Jul 17 20:15 
imgp2219.jpg.head.tail.xxd
drwxr-xr-x   2 root root   5 Jul 26 14:12 level_2

/fs1/level_1/level_2:
total 1038
-rw-r--r--   1 mark staff 234179 Jul 17 20:17 
gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt
-rw-r--r--   1 mark staff 173713 Jul 17 20:15 imgp2219.small.jpg
-rw-r--r--   1 mark staff   1388 Jul 17 20:15 
imgp2219.small.jpg.head.tail.xxd

MUCK AROUND A BIT

[EMAIL PROTECTED] /export/home# 
file 
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt: 
ascii text
[EMAIL PROTECTED] /export/home#
[EMAIL PROTECTED] /export/home# 
head 
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt
*The Project Gutenberg Etext of A treatise on Good Works*
#2 in our series by Dr. Martin Luther


Copyright laws are changing all over the world, be sure to check
the copyright laws for your country before posting these files!

Please take a look at the important information in this header.
We encourage you to keep this file on your own disk, keeping an
electronic path open for the next readers.  Do not remove this.
[EMAIL PROTECTED] /export/home#
[EMAIL PROTECTED] /export/home# 
rm 
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt
rm: 
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt: 
override protection 644 (yes/no)? y
rm: 
/fs1/gutenberg.org_martin_luther_treatise_on_good_works_with_intro_gwork10.txt 
not removed: Read-only file system
[EMAIL 

Re: [zfs-discuss] Snapshots impact on performance

2007-07-27 Thread Łukasz
 Same problem here (snv_60).
 Robert, did you find any solutions?
 
 gino

check this: http://www.opensolaris.org/jive/thread.jspa?threadID=34423&tstart=0

Check the spa_sync function time
(remember to change POOL_NAME!):

dtrace -q -n fbt::spa_sync:entry'/(char *)(((spa_t*)arg0)->spa_name) == "POOL_NAME"/{ self->t = timestamp; }' -n fbt::spa_sync:return'/self->t/{ @m = max((timestamp - self->t)/100); self->t = 0; }' -n tick-10m'{ printa("%@u\n",@m); exit(0); }'

If you have long spa_sync times, try to check whether you have
problems with finding new blocks in the space map with this script:

#!/usr/sbin/dtrace -s

fbt::space_map_alloc:entry
{
   self->s = arg1;
}

fbt::space_map_alloc:return
/arg1 != -1/
{
  self->s = 0;
}

fbt::space_map_alloc:return
/self->s && (arg1 == -1)/
{
  @s = quantize(self->s);
  self->s = 0;
}

tick-10s
{
  printa(@s);
}

Then change the recordsize: zfs set recordsize=XX POOL_NAME. Make sure that all
filesystems inherit the recordsize:
  # zfs get -r recordsize POOL_NAME
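
For example (8K is just an illustration - pick a value that matches your
workload); the children should then show the value as inherited, not "local":

zfs set recordsize=8K POOL_NAME
zfs get -r recordsize POOL_NAME    # SOURCE: "inherited from POOL_NAME"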

The other thing is the space map size.

Check the map size:

echo '::spa' | mdb -k | grep 'f[0-9]*-[0-9]*' \
  | while read pool_ptr state pool_name
do
  echo "${pool_ptr}::walk metaslab|::print -d struct metaslab ms_smo.smo_objsize" \
    | mdb -k \
    | nawk '{sub("^0t","",$3);sum+=$3}END{print sum}'
done

The value you will get is the space map size on disk. In memory the space map
will take about 4 * size_on_disk. Sometimes, during snapshot removal, the kernel
will have to load all the space maps into memory. For example,
if the space map on disk takes 1GB, then the kernel will:
 - read 1GB from disk (or from cache) in the spa_sync function
 - allocate 4GB for the AVL trees
 - do all operations on the AVL trees
 - save the maps

It is good to have enough free memory for these operations.

You can reduce the space map by copying all filesystems to another pool. I
recommend zfs send.
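
For example, per filesystem (pool and filesystem names here are only
placeholders - repeat or script this for each filesystem in the pool):

zfs snapshot tank/fs@migrate
zfs send tank/fs@migrate | zfs receive newpool/fs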

regards

Lukas
 
 


Re: [zfs-discuss] separate intent log blog

2007-07-27 Thread Adolf Hohl
Hi,

what is necessary to get it working from the Solaris side? Is a driver on
board, or is there no special one needed?
I just got a packaged MM-5425CN with 256M. However, I am lacking a PCI-X 64-bit
connector and am not sure if it is worth the whole effort for my personal purposes.

Any comments are very appreciated

-ah
 
 


Re: [zfs-discuss] Does iSCSI target support SCSI-3 PGR reservation ?

2007-07-27 Thread Jim Dunham

 A quick look through the source would seem to indicate that the
 PERSISTENT RESERVE commands are not supported by the Solaris iSCSI
 target at all.

Correct. There is an RFE outstanding for the iSCSI target to implement
PGR for both raw SCSI-3 devices and block devices.

http://bugs.opensolaris.org/view_bug.do?bug_id=6415440


 http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/ 
 iscsi/iscsitgtd/t10_spc.c



Jim Dunham
Solaris, Storage Software Group

Sun Microsystems, Inc.
1617 Southwood Drive
Nashua, NH 03063
Email: [EMAIL PROTECTED]
http://blogs.sun.com/avs





Re: [zfs-discuss] ZFS forks (Was: LZO compression?)

2007-07-27 Thread Robert Milkowski
Hello Toby,

Thursday, July 26, 2007, 6:18:46 PM, you wrote:

TT On 26-Jul-07, at 1:24 PM, Robert Milkowski wrote:

 Hello Matthew,

 Thursday, July 26, 2007, 2:56:32 PM, you wrote:

 MA Robert Milkowski wrote:
 Hello Matthew,

 Monday, June 18, 2007, 7:28:35 PM, you wrote:

 MA FYI, we're already working with engineers on some other ports to ensure
 MA on-disk compatibility.  Those changes are going smoothly.  So please,
 MA contact us if you want to make (or want us to make) on-disk changes to ZFS
 MA for your port or distro.  We aren't that difficult to work with :-)

 If it's not a secret - can you say which ports you are talking about?

 MA Primarily the MacOS X port.

 Thank you.

 ps. so not windows port? :))) or maybe windows is secondary one :)


TT You like Windows /that much/? Note Sun isn't doing the OS X port,
TT Apple is.

It has nothing to do with whether I like it or not.
It's a question of whether ZFS could become the default file system on
the most important platform.

-- 
Best regards,
 Robert   mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] ZFS with HDS TrueCopy and EMC SRDF [CloseSync]

2007-07-27 Thread Damon Atkins
Date: Thu, 26 Jul 2007 20:39:09 PDT
From: Anton B. Rang [EMAIL PROTECTED]

That said, I'm not sure exactly what this buys you for disk replication.
What's special about files which have been closed? Is the point that
applications might close a file and then notify some other process of the
file's availability for use?

Yes


E.g. 1
A program starts an output job and completes it in the OS cache on Server A.
Server A tells the batch scheduling software on Server B that the job is
complete. Server A crashes; the file no longer exists, or is truncated, because
part of it was still only in the OS cache. Server B schedules the next job on
the assumption that the file created on Server A is OK.

E.g. 2
A program starts an output job and completes it in the OS cache on Server A.
A DB on Server A, running in a different ZFS pool, updates a DB record to
record the fact that the output is complete (the DB uses O_DSYNC). Server A
crashes; the file no longer exists, or is truncated, because part of it was
still only in the OS cache. The Server A DB contains information saying that
the file is complete.

I believe that sync-on-close should be the default. File system integrity
should be more than just being able to read a file which has been truncated due
to a system crash/power failure, etc.

E.g. 3 (a bit cheeky :-)
$ vi <a file>, save the file, the system crashes, you look back at the screen
and you say "thank god, I saved the file in time", because on your screen is
the prompt $ again. This is all happening in the OS cache. When the system
returns, the file does not exist. (I am ignoring vi -r.)
$ vi x
$ <connection lost>
Therefore users should do:
$ vi x
$ sleep 5 ; echo "file x now on disk :-)"
$ echo "add a line" >> x
$ sleep 5; echo "update to x complete"

UFS forcedirectio and VxFS closesync ensure that, whatever happens, your files
will always exist if the program completes. Therefore, with (synchronous) disk
replication, the file exists at the other site at its finished size. When you
introduce DR with disk replication, it generally means you cannot afford to
lose any saved data. UFS forcedirectio has a larger performance hit than VxFS
closesync.
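
For illustration, the mount options in question look like this (device and
mount point names are placeholders; check the mount_ufs and mount_vxfs man
pages for the exact spelling on your release):

mount -F ufs -o forcedirectio /dev/dsk/c0t0d0s6 /data
mount -F vxfs -o mincache=closesync /dev/vx/dsk/datadg/datavol /data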

Cheers





[zfs-discuss] cloning disk with zpool

2007-07-27 Thread work
Hello list,

I thought that it should be easy to do a clone (not in the ZFS sense) of a
disk with zpool. This manipulation is strongly inspired by
http://www.opensolaris.org/jive/thread.jspa?messageID=135038
and
http://www.opensolaris.org/os/community/zfs/boot/

But unfortunately this doesn't work, and we have no clue what could be wrong.

On c1d0 you have a ZFS root.

Create a mirror of this disk:
 zpool attach rootpool c1d0 c2d0
 wait for it to finish resilvering c2d0

zpool offline

# install grub
/usr/sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 c2d0

Shut down.
Remove the disk c1d0 and place the disk from c2d0 in the c1d0 slot.

Boot in failsafe.

Clean up the zpool:
 zpool import rootpool
 zpool detach rootpool c1d0 # (yeah, I know this looks strange, because at that
time zpool status says that the pool is constituted of the two disks c1d0 and
c1d0)


# try to repair the zpool.cache
mkdir /tmp/w_zfs # w stands for write
mount -F lofs /tmp/w_zfs /etc/zfs
touch /etc/zfs/foo # to check if writable
rm /etc/zfs/foo

zpool export rootpool
zpool import rootpool

## okay we can see now that there is a /etc/zfs/zpool.cache

# let's put the new zpool.cache in rootpool/rootfs
mkdir /tmp/rootpool
mount -F zfs rootpool/rootfs /tmp/rootpool
cp /etc/zfs/zpool.cache /tmp/rootpool/etc/zfs/

# update the bootadm
/usr/sbin/bootadm update-archive -v -R /tmp/rootpool





Re: [zfs-discuss] separate intent log blog

2007-07-27 Thread Albert Chin
On Fri, Jul 27, 2007 at 08:32:48AM -0700, Adolf Hohl wrote:
 what is necessary to get it working from the Solaris side? Is a
 driver on board or is there no special one needed?

I'd imagine so.

 I just got a packaged MM-5425CN with 256M. However, I am lacking a
 PCI-X 64-bit connector and am not sure if it is worth the whole effort
 for my personal purposes.

Huh? So your MM-5425CN doesn't fit into a PCI slot?

 Any comments are very appreciated

How did you obtain your card?

-- 
albert chin ([EMAIL PROTECTED])


Re: [zfs-discuss] separate intent log blog

2007-07-27 Thread Neil Perrin
Adolf,

Yes, there was a separate driver that I believe came from Micro
Memory. I installed it from a package, umem_Sol_Drv_Cust_i386_v01_10.pkg.
I just used pkgadd on it and it just worked. Sorry, I don't know if it's
publicly available or will even work for your device.

I gave details of that device for completeness. I was hoping
it would be representative of any NVRAM card. I wasn't
intending to endorse its use, although it does seem fast.
Hardware availability and access to drivers is indeed
an issue.

256MB is not a lot of NVRAM - the device I tested had 1GB.
If you have a lot of synchronous transactions then you could
exceed the 256MB and overflow into the slower main pool.
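
For reference, once the card shows up as a disk, pointing ZFS at it as a
separate intent log is a one-liner (pool and device names below are
placeholders, and it needs a build with separate log device support):

zpool add tank log c2t0d0    # dedicate the NVRAM card to the intent log
zpool status tank            # the device appears under a "logs" section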

Neil.

Adolf Hohl wrote:
 Hi,
 
 what is necessary to get it working from the Solaris side?
 Is a driver on board or is there no special one needed?
 I just got a packaged MM-5425CN with 256M.
 However, I am lacking a PCI-X 64-bit connector and am not sure
 if it is worth the whole effort for my personal purposes.
 
 Any comments are very appreciated
 
 -ah
  
  


Re: [zfs-discuss] Mysterious corruption with raidz2 vdev (1 checksum err on disk, 2 on vdev?)

2007-07-27 Thread Matthew Ahrens
Kevin wrote:
 After a scrub of a pool with 3 raidz2 vdevs (each with 5 disks in them) I see 
 the following status output. Notice that the raidz2 vdev has 2 checksum 
 errors, but only one disk inside the raidz2 vdev has a checksum error. How is 
 this possible? I thought that you would have to have 3 errors in the same 
 'stripe' within a raidz2 vdev in order for the error to become unrecoverable.

A checksum error on a disk indicates that we know for sure that this disk 
gave us wrong data.  With raidz[2], if we are unable to reconstruct the 
block successfully but no disk admitted that it failed, then we have no way 
of knowing which disk(s) are actually incorrect.

So the errors on the raidz2 vdev indeed indicate that at least 3 disks below
it gave the wrong data for those 2 blocks; we just couldn't tell which 3+
disks they were.

It's as if I know that A+B==3, but A is 1 and B is 3.  I can't tell if A is 
wrong or B is wrong (or both!).

The checksum errors on the cXtXdX vdevs didn't result in data loss, because 
we reconstructed the data from the other disks in the raidz group.

--matt


Re: [zfs-discuss] gzip compression throttles system?

2007-07-27 Thread eric kustarz
I've filed:
6586537 async zio taskqs can block out userland commands

to track this issue.

eric
 
 


Re: [zfs-discuss] Mysterious corruption with raidz2 vdev (1 checksum err on disk, 2 on vdev?)

2007-07-27 Thread Marc Bevand
Matthew Ahrens Matthew.Ahrens at sun.com writes:
 
 So the errors on the raidz2 vdev indeed indicate that at least 3 disks below
 it gave the wrong data for those 2 blocks; we just couldn't tell which 3+
 disks they were.

Something must be seriously wrong with this server. This is the first time I
have seen an uncorrectable checksum error in a raidz2 vdev. I would suggest
that Kevin run memtest86 or similar. It is more likely that bad data was
written to the disks in the first place (due to flaky RAM/CPU/mobo/cables)
than that 3+ disks corrupted data in the same stripe!

-marc

