Re: [zfs-discuss] periodic slow responsiveness
On 08/09/2009, at 2:01 AM, Ross Walker wrote:

On Sep 7, 2009, at 1:32 AM, James Lever wrote:

Well, an MD1000 holds 15 drives; a good compromise might be 2 x 7-drive RAIDZ2s with a hot spare... That should provide 320 IOPS instead of 160, a big difference.

The issue is interactive responsiveness and whether there is a way to tune the system to give that while still having good performance for builds when they are run.

Look at the write IOPS of the pool with 'zpool iostat -v' and look at how many are happening on the RAIDZ2 vdev.

I was suggesting that slog writes were possibly starving reads from the l2arc as they were on the same device. This appears not to have been the issue, as the problem has persisted even with the l2arc devices removed from the pool.

The SSD will handle a lot more IOPS than the pool, and L2ARC is a lazy reader; it mostly just holds on to read cache data. It may just be that the pool configuration can't handle the write IOPS needed and reads are starving.

Possible, but hard to tell. Have a look at the iostat results I've posted.

The busy times of the disks while the issue is occurring should let you know.

So it turns out that the problem is that all writes coming via NFS are going through the slog. When that happens, the transfer speed to the device drops to ~70MB/s (the write speed of this SLC SSD) and, until the load drops, all new write requests are blocked, causing a noticeable delay (which has been observed to be up to 20s, but is generally only 2-4s).

I can reproduce this behaviour by copying a large file (hundreds of MB in size) using 'cp src dst' on an NFS (still currently v3) client and observing that all data is pushed through the slog device (a 10GB partition of a Samsung 50GB SSD behind a PERC 6/i w/256MB BBC) rather than going direct to the primary storage disks.

On a related note, I had 2 of these devices (both using just 10GB partitions) connected as log devices (so the pool had 2 separate log devices) and the second one was consistently running significantly slower than the first. Removing the second device improved performance, but did not remove the occasional observed pauses.

I was of the (mis)understanding that only metadata and writes smaller than 64k went via the slog device in the event of an O_SYNC write request?

The clients are (mostly) RHEL5.

Is there a way to tune this on the NFS server or clients such that when I perform a large synchronous write, the data does not go via the slog device? I have investigated using the logbias setting, but that will just kill small-file performance on any filesystem using it and defeat the purpose of having a slog device at all.

cheers,
James

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
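[Editor's note: a minimal sketch of the per-dataset logbias approach mentioned above. The pool and dataset names (tank/bulk, tank/home) are hypothetical; the point is only that logbias is a per-filesystem property, so data that receives large bulk synchronous writes could live on a dataset with logbias=throughput while latency-sensitive small writes elsewhere keep using the slog:

  # zfs create tank/bulk
  # zfs set logbias=throughput tank/bulk     # large sync writes on this dataset bypass the dedicated slog
  # zfs get logbias tank/bulk tank/home      # other datasets stay at the default 'latency'

Whether this helps depends on being able to separate the bulk-write data onto its own filesystem.]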
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Would I just do the following then:

> zpool create -f zone1 c1t1d0s0
> zfs create zone1/test1
> zfs create zone1/test2

Would I then use zfs set quota=xxxG to handle disk usage? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Mike Gerdts wrote: On Wed, Sep 23, 2009 at 7:32 AM, bertram fukuda wrote: Thanks for the info Mike. Just so I'm clear. You suggest 1) create a single zpool from my LUN 2) create a single ZFS filesystem 3) create 2 zones in the ZFS filesystem. Sound right? Correct

Well, I would actually recommend creating a dedicated zfs file system for each zone (which zoneadm should do for you anyway). The reason is that it is then much easier to get information on how much storage each zone is using, you can set a quota or reservation on storage for each zone independently, you can easily clone each zone, snapshot it, etc.

-- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
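[Editor's note: a brief illustration of the per-zone benefits Robert describes; the dataset names below (zonepool/zone1) are illustrative:

  # zfs set quota=50G zonepool/zone1            # cap how much this zone can use
  # zfs set reservation=10G zonepool/zone1      # guarantee it a minimum amount of space
  # zfs snapshot zonepool/zone1@prepatch        # cheap checkpoint before patching
  # zfs list -o name,used,quota,reservation -r zonepool

None of this is possible per zone if both zones share a single filesystem.]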
Re: [zfs-discuss] Checksum property change does not change pre-existing data - right?
On Wed, 23 Sep 2009, Ray Clark wrote: My understanding is that if I "zfs set checksum=" to change the algorithm that this will change the checksum algorithm for all FUTURE data blocks written, but does not in any way change the checksum for previously written data blocks. This is correct. The same applies to blocksize and compression. I need to corroborate this understanding. Could someone please point me to a document that states this? I have searched and searched and cannot find this. Sorry, I am not aware of a document and don't have time to look. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
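[Editor's note: to make Bob's point concrete, a small sketch with a hypothetical dataset name (tank/data). The property change only affects blocks written afterwards; blocks already on disk keep their original checksums until they are rewritten, for example by a send/receive into a new dataset or by copying the files:

  # zfs set checksum=sha256 tank/data       # future writes use sha256
  # zfs get checksum tank/data
  # zfs snapshot tank/data@rewrite
  # zfs send tank/data@rewrite | zfs receive tank/data-new
  # the received copy is freshly written, so its blocks carry whatever
  # checksum (and compression, recordsize) settings the target dataset has

The same "future writes only" behaviour applies to the compression and recordsize properties.]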
[zfs-discuss] Checksum property change does not change pre-existing data - right?
My understanding is that if I use "zfs set checksum=" to change the algorithm, this will change the checksum algorithm for all FUTURE data blocks written, but will not in any way change the checksum for previously written data blocks. I need to corroborate this understanding. Could someone please point me to a document that states this? I have searched and searched and cannot find this. Thank you. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 8:12 PM, David Magda wrote:
> On Sep 23, 2009, at 20:48, bertram fukuda wrote:
>
>> What if we have no plans on moving or cloning the zone, I can get away
>> with only one pool, right?
>
> Sure.
>
>> If I'm doing a separate FS for each zone is it just slice up my LUN,
>> create a FS for each zone and I'm done?
>
> One pool from the LUN ('zpool create zonepool c2t0d0'), but within that
> pool you do a 'zfs create' for each zone (I think zoneadm can do this
> automatically as well):
>
> zfs create zonepool/zone1
> zfs create zonepool/zone2
> zfs create zonepool/zone3
>
> This allows you to do snapshots and rollbacks on the zone for things like
> patching and other major changes.

Agreed. And it allows you to do migrations sometime in the future with

host1# zoneadm -z zone1 detach
host1# zfs snapshot zonepool/zone1@migrate
host1# zfs send -R zonepool/zone1@migrate \
       | ssh host2 zfs receive zones/zone1@migrate
host2# zonecfg -z zone1 create -a /zones/zone1
host2# zoneadm -z zone1 attach
host2# zoneadm -z zone1 boot

-- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Sep 23, 2009, at 20:48, bertram fukuda wrote:

What if we have no plans on moving or cloning the zone, I can get away with only one pool, right?

Sure.

If I'm doing a separate FS for each zone is it just slice up my LUN, create a FS for each zone and I'm done?

One pool from the LUN ('zpool create zonepool c2t0d0'), but within that pool you do a 'zfs create' for each zone (I think zoneadm can do this automatically as well):

zfs create zonepool/zone1
zfs create zonepool/zone2
zfs create zonepool/zone3

This allows you to do snapshots and rollbacks on the zone for things like patching and other major changes. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
David, What if we have no plans on moving or cloning the zone? Can I get away with only one pool then? If I'm doing a separate FS for each zone, do I just slice up my LUN, create a FS for each zone, and I'm done? Thanks, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] You're Invited: OpenSolaris Security Summit
To: Developers and Students You are invited to participate in the first OpenSolaris Security Summit OpenSolaris Security Summit Tuesday, November 3rd, 2009 Baltimore Marriott Waterfront 700 Aliceanna Street Baltimore, Maryland 21202 Join us as we explore the latest trends of OpenSolaris Security technologies, as well as key insights from security community members, technologists, and users. You will also have the unique opportunity to hear from our keynote speaker William Cheswick, Lead Member of the Technical Staff at AT&T labs Bio: Ches is an early innovator in Internet security. He is known for his work in firewalls, proxies, and Internet mapping at Bell Labs and Lumeta Corp. He is best known for the book he co-authored with Steve Bellovin and now Avi Rubin, Firewalls and Internet Security; Repelling the Wily Hacker. Ches is now a member of the technical staff at AT&T Labs - Research in Florham Park, NJ, where he is working on security, visualization, user interfaces, and a variety of other things. Registration is now available http://wikis.sun.com/display/secsummit09/ http://www.usenix.org/events/lisa09/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100
Thanks Richard and Jim,

Your answers helped me to show the customer that there was no issue with ZFS and the HDS. I went onsite to see the problem and, as Jim suggested, the customer just saw the %b and average service time and thought there was a problem.

The server is running an Oracle DB, and the 2 zfs file systems showing a lot of activity were the one with the database files and the one for the redo logs. For the DB file system, the recordsize is set to 8k, and that's why we see around 2000 IOPS with an asvc_t of 10 ms. The redo log file system has the default recordsize of 128k, so we see far fewer IOPS, the same transfer rate, and an asvc_t of 30 ms. Everything is normal on this system.

I showed that the I/O activity while the DB was not responding correctly was no different from the rest of the day. I still don't know where the problem ultimately was, but it seems to have been solved now.

Regards, Javier

Richard Elling wrote: comment below...

On Sep 22, 2009, at 9:57 AM, Jim Mauro wrote:

Cross-posting to zfs-discuss. This does not need to be on the confidential alias. It's a performance query - there's nothing confidential in here. Other folks post performance queries to zfs-discuss.

Forget %b - it's useless. It's not the bandwidth that's hurting you, it's the IOPS. One of the hot devices did 1515.8 reads-per-second, the other did over 500. Is this Oracle? You never actually tell us what the huge performance problem is - what's the workload, what's the delivered level of performance? IO service times in the 22-32 millisecond range are not great, but not the worst I've seen. Do you have any data that connects the delivered performance of the workload to an IO latency issue, or did the customer just run "iostat", see "100 %b", and assume this was the problem?

I need to see zpool stats. Is each of these c3txx devices actually a raid 7+1 (which means 7 data disks and 1 parity disk)?

There's nothing here that tells us there's something that needs to be done on the ZFS side. Not enough data. It looks like a very lopsided IO load distribution problem. You have 8 c3tXX LUN devices, 2 of which are getting slammed with IOPS, while the other 6 are relatively idle.

Thanks, /jim

Javier Conde wrote:

Hello, IHAC with a huge performance problem in a newly installed M8000 configured with a USP1100 and ZFS. From what we can see, 2 disks used in different zpools are 100% busy and the average service time is also quite high (between 30 and 5 ms).

     r/s    w/s     kr/s    kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
     0.0   11.4      0.0   224.1   0.0   0.2     0.0    20.7   0    5  c3t5000C5000F94A607d0
     0.0   11.8      0.0   224.1   0.0   0.3     0.0    24.2   0    6  c3t5000C5000F94E38Fd0
     0.2    0.0     25.6     0.0   0.0   0.0     0.0     7.9   0    0  c3t60060E8015321F01321F0032d0
     0.0    3.6      0.0    20.8   0.0   0.0     0.0     0.5   0    0  c3t60060E8015321F01321F0020d0
     0.2   24.0     25.6   488.0   0.0   0.0     0.0     0.6   0    1  c3t60060E8015321F01321F001Cd0
    11.4    0.8     92.8     8.0   0.0   0.0     0.0     3.9   0    4  c3t60060E8015321F01321F0019d0
   573.4    0.0  73395.5     0.0   0.0  20.6     0.0    36.0   0  100  c3t60060E8015321F01321F000Bd0

avg read size ~128 kBytes... which is good

     0.8    0.8    102.4     8.0   0.0   0.0     0.0    22.8   0    4  c3t60060E8015321F01321F0008d0
  1515.8   10.2  30420.9   148.0   0.0  34.9     0.0    22.9   1  100  c3t60060E8015321F01321F0006d0

avg read size ~20 kBytes... not so good

These look like single-LUN pools. What is the workload?

     0.4    0.4     51.2     1.6   0.0   0.0     0.0     5.1   0    0  c3t60060E8015321F01321F0055d0

The USP1100 is configured with a raid 7+1, which is the default recommendation. Check the starting sector for the partition.
For older OpenSolaris and Solaris 10 installations, the default starting sector is 34, which has the unfortunate effect of misaligning with most hardware RAID arrays. For newer installations, the default starting sector is 256, which has a better chance of aligning with hardware RAID arrays. This will be more pronounced when using RAID-5. To check, look at the partition table in format(1m) or prtvtoc(1m).

BTW, the customer is surely not expecting super database performance from RAID-5, are they?

The data transferred is not very high, between 50 and 150 MB/sec. Is it normal to see the disks busy at 100% all the time and the average time always greater than 30 ms? Is there something we can do from the ZFS side? We have followed the recommendations regarding the block size for the database file systems, we use 4 different zpools for the DB, indexes, redolog and archive logs, the vdev_cache_bshift is set to 13 (8k blocks)...

hmmm... what OS release? The vdev cache should only read metadata, unless you are running on an old OS. In other words, the solution which suggests changing vdev_cache_bshift has been
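[Editor's note: a quick way to check the alignment Richard mentions; the device name below is just one example taken from the iostat output, and the interesting field is the first sector of the slice ZFS is using:

  # prtvtoc /dev/rdsk/c3t60060E8015321F01321F0006d0s2

If the ZFS slice starts at sector 34 (the old EFI default) rather than a value that lines up with the array's RAID-5 segment size, ZFS records can straddle segment boundaries on the array and generate extra back-end I/O.]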
[zfs-discuss] zfs snapshot -r panic on b114
While a resilver was running, we attempted a recursive snapshot which resulted in a kernel panic: panic[cpu1]/thread=ff00104c0c60: assertion failed: 0 == zap_remove_int(mos, next_clones_obj, dsphys->ds_next_snap_obj, tx) (0x0 == 0x2), file: ../../common/ fs/zfs/dsl_dataset.c, line: 1869 ff00104c0960 genunix:assfail3+c1 () ff00104c0a00 zfs:dsl_dataset_snapshot_sync+4a2 () ff00104c0a50 zfs:snapshot_sync+41 () ff00104c0aa0 zfs:dsl_sync_task_group_sync+eb () ff00104c0b10 zfs:dsl_pool_sync+196 () ff00104c0ba0 zfs:spa_sync+32a () ff00104c0c40 zfs:txg_sync_thread+265 () ff00104c0c50 unix:thread_start+8 () System is a X4100M2 running snv_114. Any ideas? -- albert chin (ch...@thewrittenword.com) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] libzfs.h versioning
Richard, I compared the libzfs_jni source code and they're pretty different from what we're doing. libzfs_jni is essentially a jni wrapper to (yet?) another set of zfs-related programs written in C. zfs for Java, on the other hand, is a Java wrapper to the functionality of (and only of) libzfs. I suppose that libzfs_jni capabilities could be implemented on top of zfs for java but the approach is pretty different: the main difference is the purpose of the exposed methods: libzfs is the interface to ZFS and its methods are low level while libzfs_jni exposes a set of operations which are coarse grained and targeted to management. Nevertheless, the functionality provided by libzfs_jni is interesting and I'd like to build something similar by using zfs for java. Personally, I'm doing this for two reasons: having a libzfs wrapper for Java seems like a good thing to have and I'd like to use to build some management interfaces (such as web but not only) instead on having to rely on shell scripting with zfs and zpool commands. I'll keep an eye to libzfs_jni. Now, to return to the original question, I haven't found a way to correlate libzfs.h versions (and dependencies) to Nevada releases. At the moment, I'm willing to extract information from a sysinfo call (any suggestion about a better way?) and the next step, whose logic I'm missing, is how to correlate this information with to a concrete libzfs.h version from openGrok: maybe it's just trivial, but I do not find it. Have you got some information to help me address this problem? Thanks, Enrico On Fri, Sep 11, 2009 at 12:53 AM, Enrico Maria Crisostomo wrote: > On Fri, Sep 11, 2009 at 12:26 AM, Richard Elling > wrote: >> On Sep 10, 2009, at 1:03 PM, Peter Tribble wrote: >>> >>> On Thu, Sep 10, 2009 at 8:52 PM, Richard Elling >>> wrote: Enrico, Could you compare and contrast your effort with the existing libzfs_jni? http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs_jni/common/ >>> >>> Where's the source for the java code that uses that library? >> >> Excellent question! It is used for the BUI ZFS manager webconsole that >> comes with S10 and SXCE. So you might find the zfs.jar as >> /usr/share/webconsole/webapps/zfs/WEB-INF/lib/zfs.jar >> The jar only contains the class files, though. > Yes, that's what I thought when I saw it. Furthermore, the last time I > tried it was still unaligned with the new ZFS capabilites: it crashed > because of an unknown gzip compression type... > >> >> Someone from Sun could comment on the probability that they >> will finally get their act together and have a real BUI framework for >> systems management... they've tried dozens (perhaps hundreds) >> of times, with little to show for the effort :-( > By the way, one of the goals I'd like to reach with such kind of > library is just that: putting the basis for building a java based > management framework for ZFS. Unfortunately wrapping libzfs will > hardly fulfill this goal and the more I dig into the code the more I > realize that we will need to wrap (or reimplement) some of the logic > of the zfs and zpool commands. I'm also confident that building a good > library on top of this wrapper will give us a very powerful tool to > play with from Java. > >> -- richard >> >> > > > > -- > Ελευθερία ή θάνατος > "Programming today is a race between software engineers striving to > build bigger and better idiot-proof programs, and the Universe trying > to produce bigger and better idiots. So far, the Universe is winning." 
> GPG key: 1024D/FD2229AF > -- Ελευθερία ή θάνατος "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." GPG key: 1024D/FD2229AF ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
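[Editor's note: on the question of mapping a running system to an ON build (and hence to a particular libzfs.h), one low-tech illustration is to read the kernel version string, which on Nevada-based systems carries the build number and can then be matched against the corresponding tag in OpenGrok:

  $ uname -r       # 5.11
  $ uname -v       # e.g. snv_111b on a Nevada/OpenSolaris system,
                   # Generic_139556-08 on Solaris 10

The same string is what sysinfo(SI_VERSION) returns, so a JNI or libzfs wrapper could obtain it programmatically rather than shelling out. This is offered only as a possible starting point, not as the mapping mechanism the thread settles on.]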
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Sep 23, 2009, at 08:13, Mike Gerdts wrote: That is, at time t1 I have zones z1 and z2 on host h1. I think that at some time in the future I would like to move z2 to host h2 while leaving z1 on h1. You can have a single pool, but it's probably good to have each zone in its own file system. As mentioned in another message this would allow you to delegate (if so desired), but what it also allows you to do is move the zone later on (also with a 'zfs send / recv'): http://prefetch.net/blog/index.php/2006/09/27/moving-solaris-zones/ http://blogs.sun.com/gz/entry/how_to_move_a_solaris It also allows you to clone zones if you want cookie-cutter configurations (or even have a base set up that's common to multiple zones): http://www.cuddletech.com/blog/pivot/entry.php?id=751 http://www.google.com/search?q=zone+clone+zfs ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
> zfs share -a Ah-ha! Thanks. FYI, I got between 2.5x and 10x improvement in performance, depending on the test. So tempting :) -Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 23 September, 2009 - Scott Meilicke sent me these 0,5K bytes: > Thank you both, much appreciated. > > I ended up having to put the flag into /etc/system. When I disabled > the ZIL and umount/mounted without a reboot, my ESX host would not see > the NFS export, nor could I create a new NFS connection from my ESX > host. I could get into the file system from the host itself of course. zfs share -a /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
Thank you both, much appreciated. I ended up having to put the flag into /etc/system. When I disabled the ZIL and umount/mounted without a reboot, my ESX host would not see the NFS export, nor could I create a new NFS connection from my ESX host. I could get into the file system from the host itself of course. -Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Cindy: AWESOME! Didn't know about that property, I'll make sure I set it :). All I did to replace the drives was to power off the machine (the failed drive had hard-locked the SCSI bus, so I had to anyways). Once the machine was powered off, I pulled the bad drive, inserted the new drive, and powered the machine on. That's when the machine came up showing the pool in a corrupted state. I'm assuming if I had removed the old drive, booted it with the drive missing, let it come up DEGRADED, and then inserted the new drive and did a zpool replace, it would have been fine. So I was going by the guess that zpool didn't know that the disk was replaced, and I was just curious why. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 23 Sep 2009, at 19:07, Neil Perrin wrote:

On 09/23/09 10:59, Scott Meilicke wrote: How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw

- this only temporarily disables the zil until the reboot. In fact it has no effect unless file systems are remounted, as the variable is only looked at on mount.

Scott, just set it; zfs umount xxx; zfs mount xxx; and then run your experiment. Directly compare the fast/incorrect xxx dataset with a slower/correct yyy mount point. No need to reboot.

and then rebooted. However, I do not see any benefits for my NFS workload.

To set zil_disable from boot put the following in /etc/system and reboot: set zfs:zil_disable=1

Actually you need to have these 2 lines or it won't work:

* TEMPORARY zil disable on non-production system; Sept 23 for a test by
set zfs:zil_disable=1

-r

Neil

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Dustin, You didn't describe the process that you used to replace the disk, so it's difficult to comment on what happened. In general, you physically replace the disk and then let ZFS know that the disk is replaced, like this:

# zpool replace pool-name device-name

This process is described here: http://docs.sun.com/app/docs/doc/819-5461/gazgd?a=view

If you want to reduce the steps in the future, you can enable the autoreplace property on the pool, and all you need to do is physically replace the disks in the pool.

Cindy

On 09/23/09 11:23, Dustin Marquess wrote: Tim: I couldn't do a zpool scrub, since the pool was marked as UNAVAIL. Believe me, I tried :) Bob: Ya, I realized that after I clicked send. My brain was a little frazzled, so I completely overlooked it. Solaris 10u7 - Sun E450 ZFS pool version 10 ZFS filesystem version 3 -Dustin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
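[Editor's note: an illustration of the autoreplace property Cindy describes; the pool name "tank" is a placeholder:

  # zpool set autoreplace=on tank
  # zpool get autoreplace tank

With autoreplace=on, a new device found in the same physical location as the failed one is automatically formatted and resilvered, so no explicit 'zpool replace' is needed; with the default autoreplace=off, the 'zpool replace' step is required.]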
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Tim: I couldn't do a zpool scrub, since the pool was marked as UNAVAIL. Believe me, I tried :) Bob: Ya, I realized that after I clicked send. My brain was a little frazzled, so I completely overlooked it. Solaris 10u7 - Sun E450 ZFS pool version 10 ZFS filesystem version 3 -Dustin -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 09/23/09 10:59, Scott Meilicke wrote: How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw - this only temporarily disables the zil until the reboot. In fact it has no effect unless file systems are remounted as the variable is only looked at on mount. and then rebooted. However, I do not see any benefits for my NFS workload. To set zil_disable from boot put the following in /etc/system and reboot: set zfs:zil_disable=1 Neil ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
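[Editor's note: to answer the original "how can I verify" question, the current value of the tunable can be read back from the live kernel with a read-only mdb query (the output shown is illustrative):

  # echo zil_disable/D | mdb -k
  zil_disable:
  zil_disable:    1

A value of 1 means the ZIL is disabled for file systems mounted after the value was set; 0 is the default, with the ZIL enabled.]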
[zfs-discuss] How to verify if the ZIL is disabled
How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw and then rebooted. However, I do not see any benefits for my NFS workload. Thanks, Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Awesome!!! Thanks for your help. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
On Wed, 23 Sep 2009, Dustin Marquess wrote: Okay.. I "fixed" it by powering the server off, removing the new drive, letting the pool come up degraded, and then doing zpool replace. I'm assuming what happened was ZFS saw that the disk was online, tried to use it, and then noticed that the checksums didn't match (of course) and marked the pool as corrupted. The question is why didn't ZFS check the labels on the drive and see that the drive wasn't in the pool and kick it out itself? You never told us what OS and version (OpenSolaris, Solaris 10, FreeBSD, NetBSD, Linux Fuse, OS X zfs preview) you are using. If you are using an older version of zfs, maybe a newer version works as expected? Never report a problem without identifying the software and hardware you are using. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
On Wed, Sep 23, 2009 at 10:57 AM, Dustin Marquess wrote: > Okay.. I "fixed" it by powering the server off, removing the new drive, > letting the pool come up degraded, and then doing zpool replace. > > I'm assuming what happened was ZFS saw that the disk was online, tried to > use it, and then noticed that the checksums didn't match (of course) and > marked the pool as corrupted. The question is why didn't ZFS check the > labels on the drive and see that the drive wasn't in the pool and kick it > out itself? > -- > Did you do a zpool scrub after you replaced the drive? How would zfs know what you wanted done with the drive if you didn't tell it? --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Okay.. I "fixed" it by powering the server off, removing the new drive, letting the pool come up degraded, and then doing zpool replace. I'm assuming what happened was ZFS saw that the disk was online, tried to use it, and then noticed that the checksums didn't match (of course) and marked the pool as corrupted. The question is why didn't ZFS check the labels on the drive and see that the drive wasn't in the pool and kick it out itself? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Real help
Here is my menu.lst. I tried to change the zpool mountpoints, but that did not work either.

menu.lst:

j...@opensolaris:~# more /a/boot/grub/menu.lst
splashimage /boot/grub/splash.xpm.gz
background 215ECA
timeout 30
default 10
#-- ADDED BY BOOTADM - DO NOT EDIT --
title OpenSolaris 2008.11 snv_101b_rc2 X86
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
#-END BOOTADM
# Unknown partition of type 175 found on /dev/rdsk/c4t0d0p0 partition: 1
# It maps to the GRUB device: (hd2,0) .
# Unknown partition of type 175 found on /dev/rdsk/c4t0d0p0 partition: 2
# It maps to the GRUB device: (hd2,1) .
# Unknown partition of type 5 found on /dev/rdsk/c6d0p0 partition: 2
# It maps to the GRUB device: (hd0,1) .
title Ubuntu
root (hd0,4)
kernel$ /boot/vmlinuz-2.6.27-7-generic root=UUID=8dea655b-ce58-4cf6-8097-84f6fa0d44e3 ro quiet splash
initrd /boot/initrd.img-2.6.27-7-generic quite
title OpenSolaris 2008.11 snv_101b_rc2 X86 text boot
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
title opensolaris-1
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-2
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-2
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-3
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-3
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-4
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-4
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-5
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-5
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title be_name
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/be_name
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title be_name-1
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/be_name-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title OSOL_123
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/OSOL_123
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
j...@opensolaris:~#

and here is the zfs list:

j...@opensolaris:~$ zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     66.2G  5.65G    78K  /a
rpool/ROOT                21.7G  5.65G    18K  /a/
rpool/ROOT/OSOL_123       16.9G  5.65G  10.7G  /tmp/tmpd38Ilg
rpool/ROOT/be_name         160M  5.65G  10.5G  /a/
rpool/ROOT/be_name-1      4.23G  5.65G  10.5G  /a/
rpool/ROOT/opensolaris    35.4M  5.65G  5.69G  /
rpool/ROOT/opensolaris-1   188M  5.65G  5.05G  /tmp/tmp2GKUrN
rpool/ROOT/opensolaris-2  43.0M  5.65G  5.60G  /tmp/tmpUbjG28
rpool/ROOT/opensolaris-3  42.8M  5.65G  5.67G  /tmp/tmpKIjHhB
rpool/ROOT/opensolaris-4  32.3M  5.65G  6.49G  /tmp/tmp7JeXnR
rpool/ROOT/opensolaris-5  87.5M  5.65G  10.4G  /tmp/tmpfvpm6V
rpool/dump                1.50G  5.65G  1.50G  -
rpool/export              41.5G  5.65G    21K  /export
rpool/export/home         41.5G  5.65G   195M  /export/home/
rpool/export/home/hazz
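[Editor's note: for comparison, on a standard OpenSolaris 2008.11-style install the root pool datasets normally look roughly like this: rpool/ROOT set to legacy and the active boot environment with mountpoint=/ and canmount=noauto. A heavily simplified sketch of putting that back from the live CD, assuming rpool/ROOT/opensolaris is the BE named in the menu.lst bootfs line being booted (adjust to whichever BE is actually wanted); this is an illustration of the normal layout, not a diagnosis of this particular system:

  # zpool import -f -R /a rpool
  # zfs set mountpoint=legacy rpool/ROOT
  # zfs set canmount=noauto rpool/ROOT/opensolaris
  # zfs set mountpoint=/ rpool/ROOT/opensolaris
  # zpool set bootfs=rpool/ROOT/opensolaris rpool
  # zfs mount rpool/ROOT/opensolaris         # mounts under the /a altroot
  # bootadm update-archive -R /a
  # zpool export rpool
]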
Re: [zfs-discuss] Real help
On Wed, Sep 23, 2009 at 3:32 AM, vattini giacomo wrote: > Hi there i'v been able to restore my zpool on a live cd,reinstall the > grub,but booting from the HD it hangs for a while and than nothing comes up > j...@opensolaris:~# zfs list > NAME USED AVAIL REFER MOUNTPOINT > rpool 66.2G 5.65G78K /a > rpool/ROOT21.7G 5.65G18K / > rpool/ROOT/OSOL_123 16.9G 5.65G 10.7G /tmp/tmpd38Ilg > rpool/ROOT/be_name 160M 5.65G 10.5G / > rpool/ROOT/be_name-1 4.23G 5.65G 10.5G / > rpool/ROOT/opensolaris35.4M 5.65G 5.69G / > rpool/ROOT/opensolaris-1 188M 5.65G 5.05G /tmp/tmp2GKUrN > rpool/ROOT/opensolaris-2 43.0M 5.65G 5.60G /tmp/tmpUbjG28 > rpool/ROOT/opensolaris-3 42.8M 5.65G 5.67G /tmp/tmpKIjHhB > rpool/ROOT/opensolaris-4 32.3M 5.65G 6.49G /tmp/tmp7JeXnR > rpool/ROOT/opensolaris-5 87.5M 5.65G 10.4G /tmp/tmpfvpm6V > rpool/dump1.50G 5.65G 1.50G - > rpool/export 41.5G 5.65G21K /export > rpool/export/home 41.5G 5.65G 195M /export/home/ > rpool/export/home/hazz41.3G 5.65G 41.3G /export/home/hazz/ > rpool/swap1.50G 5.86G 1.29G - > Any clue to get on the rescue? > -- > > What does the grub.conf look like now that you've re-installed grub? --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] RAID-Z2 won't come online after replacing failed disk
I replaced a bad disk in a RAID-Z2 pool, and now the pool won't come online. Status shows nothing helpful at all. I don't understand why this is, since I should be able to lose 2 drives, and I only replaced one!

# zpool status -v pool
  pool: pool
 state: UNAVAIL
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool        UNAVAIL      0     0     0  insufficient replicas
          raidz2    UNAVAIL      0     0     0  corrupted data
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] lots of zil_clean threads
I wonder if a taskq pool does not suffer from a similar effect to the one observed for the nfsd pool:

6467988 Minimize the working set of nfsd threads

Threads created round-robin out of the taskq loop do little work, but wake up at least once every 5 minutes and so are never reaped.

-r

Nils Goroll writes:
> Hi Neil and all,
>
> thank you very much for looking into this:
>
> > So I don't know what's going on. What is the typical call stack for those
> > zil_clean() threads?
>
> I'd say they are all blocking on their respective CVs:
>
> ff0009066c60 fbc2c0300 0 60 ff01d25e1180
>   PC: _resume_from_idle+0xf1    TASKQ: zil_clean
>   stack pointer for thread ff0009066c60: ff0009066b60
>   [ ff0009066b60 _resume_from_idle+0xf1() ]
>   swtch+0x147()
>   cv_wait+0x61()
>   taskq_thread+0x10b()
>   thread_start+8()
>
> I should add that I have quite a lot of datasets:
>
> r...@haggis:~# zfs list -r -t filesystem | wc -l
>   49
> r...@haggis:~# zfs list -r -t volume | wc -l
>   14
> r...@haggis:~# zfs list -r -t snapshot | wc -l
>   6018
>
> Nils
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
> The only thing that jumps out at me is the ARC size - > 53.4GB, or > most of your 64GB of RAM. This in-and-of-itself is > not necessarily > a bad thing - if there are no other memory consumers, > let ZFS cache > data in the ARC. But if something is coming along to > flush dirty > ARC pages periodically The workload is a set of 50 python processes, each receiving a stream of data via TCP/IP. The processes run until they notice something interesting in the stream (sorry I can't be more specific), then they connect to a server via TCP/IP and issue a command or two. Log files are written that takes up about 50M per day per process. It's relatively low-traffic. > I found what looked to be an applicable bug; > CR 6699438 zfs induces crosscall storm under heavy > mapped sequential > read workload > but the stack signature for the above bug is > different than yours, and > it doesn't sound like your workload is doing mmap'd > sequential reads. > That said, I would be curious to know if your > workload used mmap(), > versus read/write? I asked and they couldn't say. It's python so I think it's unlikely. > For the ZFS folks just seeing this, here's the stack > frame; > > unix`xc_do_call+0x8f > unix`xc_wait_sync+0x36 > unix`x86pte_invalidate_pfn+0x135 > unix`hat_pte_unmap+0xa9 > unix`hat_unload_callback+0x109 > unix`hat_unload+0x2a > unix`segkmem_free_vn+0x82 > unix`segkmem_zio_free+0x10 > genunix`vmem_xfree+0xee > genunix`vmem_free+0x28 > genunix`kmem_slab_destroy+0x80 > genunix`kmem_slab_free+0x1be > genunix`kmem_magazine_destroy+0x54 > genunix`kmem_depot_ws_reap+0x4d > genunix`taskq_thread+0xbc > unix`thread_start+0x8 > > Let's see what the fsstat and zpool iostat data looks > like when this > starts happening.. Both are unremarkable, I'm afraid. Here's the fsstat from when it starts happening: new name name attr attr lookup rddir read read write write file remov chng get set ops ops ops bytes ops bytes 0 0 0 75 0 0 0 0 0 10 1.25M zfs 0 0 0 83 0 0 0 0 0 7 896K zfs 0 0 0 78 0 0 0 0 0 13 1.62M zfs 0 0 0 229 0 0 0 0 0 29 3.62M zfs 0 0 0 217 0 0 0 0 0 28 3.37M zfs 0 0 0 212 0 0 0 0 0 26 3.03M zfs 0 0 0 151 0 0 0 0 0 18 2.07M zfs 0 0 0 184 0 0 0 0 0 31 3.41M zfs 0 0 0 187 0 0 0 0 0 32 2.74M zfs 0 0 0 219 0 0 0 0 0 24 2.61M zfs 0 0 0 222 0 0 0 0 0 29 3.29M zfs 0 0 0 206 0 0 0 0 0 29 3.26M zfs 0 0 0 205 0 0 0 0 0 19 2.26M zfs -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
(posted to zfs-discuss) Hmmm...this is nothing in terms of load. So you say that the system becomes sluggish/unresponsive periodically, and you noticed the xcall storm when that happens, correct? Refresh my memory - what is the frequency and duration of the sluggish cycles? Could you capture a kernel profile during a sluggish cycle; #dtrace -n 'profile-997hz / arg0 && curthread->t_pri != -1 / { @[stack()]=count(); } tick-1sec { trunc(@,10); printa(@); clear(@); }' And/or - #lockstat -i997 -kIW -s 10 sleep 30 > lockstat.kprof.out And #lockstat -Cc sleep 30 > lockstat.locks.out Thanks, /jim Jim Leonard wrote: Can you gather some ZFS IO statistics, like "fsstat zfs 1" for a minute or so. Here is a snapshot from when it is exhibiting the behavior: new name name attr attr lookup rddir read read write write file remov chng get setops ops ops bytes ops bytes 0 0 075 0 0 0 0 010 1.25M zfs 0 0 083 0 0 0 0 0 7 896K zfs 0 0 078 0 0 0 0 013 1.62M zfs 0 0 0 229 0 0 0 0 029 3.62M zfs 0 0 0 217 0 0 0 0 028 3.37M zfs 0 0 0 212 0 0 0 0 026 3.03M zfs 0 0 0 151 0 0 0 0 018 2.07M zfs 0 0 0 184 0 0 0 0 031 3.41M zfs 0 0 0 187 0 0 0 0 032 2.74M zfs 0 0 0 219 0 0 0 0 024 2.61M zfs 0 0 0 222 0 0 0 0 029 3.29M zfs 0 0 0 206 0 0 0 0 029 3.26M zfs 0 0 0 205 0 0 0 0 019 2.26M zfs Unless attr_get is ludicrously costly, I can't see any issues...? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
I'm cross-posting to zfs-discuss, as this is now more of a ZFS query than a dtrace query at this point, and I'm not sure if all the ZFS experts are listening on dtrace-discuss (although they probably are... :^). The only thing that jumps out at me is the ARC size - 53.4GB, or most of your 64GB of RAM. This in-and-of-itself is not necessarily a bad thing - if there are no other memory consumers, let ZFS cache data in the ARC. But if something is coming along to flush dirty ARC pages periodically I found what looked to be an applicable bug; CR 6699438 zfs induces crosscall storm under heavy mapped sequential read workload but the stack signature for the above bug is different than yours, and it doesn't sound like your workload is doing mmap'd sequential reads. That said, I would be curious to know if your workload used mmap(), versus read/write? For the ZFS folks just seeing this, here's the stack frame; unix`xc_do_call+0x8f unix`xc_wait_sync+0x36 unix`x86pte_invalidate_pfn+0x135 unix`hat_pte_unmap+0xa9 unix`hat_unload_callback+0x109 unix`hat_unload+0x2a unix`segkmem_free_vn+0x82 unix`segkmem_zio_free+0x10 genunix`vmem_xfree+0xee genunix`vmem_free+0x28 genunix`kmem_slab_destroy+0x80 genunix`kmem_slab_free+0x1be genunix`kmem_magazine_destroy+0x54 genunix`kmem_depot_ws_reap+0x4d genunix`taskq_thread+0xbc unix`thread_start+0x8 Let's see what the fsstat and zpool iostat data looks like when this starts happening.. Thanks, /jim Jim Leonard wrote: It would also be interesting to see some snapshots of the ZFS arc kstats kstat -n arcstats Here you go, although I didn't see anything jump out (massive amounts of cache misses or something). Any immediate trouble spot? # kstat -n arcstats module: zfs instance: 0 name: arcstatsclass:misc c 53490612870 c_max 67636535296 c_min 8454566912 crtime 212.955493179 deleted 7267003 demand_data_hits179708165 demand_data_misses 189797 demand_metadata_hits9959277 demand_metadata_misses 194228 evict_skip 1709 hash_chain_max 9 hash_chains 205513 hash_collisions 9372169 hash_elements 851634 hash_elements_max 886509 hdr_size143082240 hits198822346 l2_abort_lowmem 0 l2_cksum_bad0 l2_evict_lock_retry 0 l2_evict_reading0 l2_feeds0 l2_free_on_write0 l2_hdr_size 0 l2_hits 0 l2_io_error 0 l2_misses 0 l2_rw_clash 0 l2_size 0 l2_writes_done 0 l2_writes_error 0 l2_writes_hdr_miss 0 l2_writes_sent 0 memory_throttle_count 0 mfu_ghost_hits 236508 mfu_hits165895558 misses 388618 mru_ghost_hits 70149 mru_hits24777390 mutex_miss 6094 p 49175731760 prefetch_data_hits 7993639 prefetch_data_misses370 prefetch_metadata_hits 1161265 prefetch_metadata_misses4223 recycle_miss7149 size53490565328 snaptime5759009.53378144 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
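[Editor's note: not something this thread has established as the fix, but since the ARC size is the one thing that stands out, one commonly used way to bound it (and hence the amount of kmem that later has to be reaped) is to cap the ARC in /etc/system; the 32GB value below is purely illustrative:

  * Cap the ZFS ARC at 32GB (value in bytes) - example only
  set zfs:zfs_arc_max = 0x800000000

This takes effect at boot. Whether it would help here depends on whether the xcall storms really correlate with ARC/kmem reaping, which the stack above only suggests.]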
[zfs-discuss] ZFS send-receive between remote machines as non-root user
Hi list, I have a question about setting up zfs send-receive functionality (between remote machine) as non-root user. "server1" - is a server where "zfs send" will be executed "server2" - is a server where "zfs receive" will be executed. I am using the following zfs structure: [server1]$ zfs list -t filesystem -r datapool/data NAME USED AVAIL REFER MOUNTPOINT datapool/data 2.05G 223G 2.05G /opt/data datapool/data/logs 35K 223G19K /opt/data/logs datapool/data/db18K 223G18K /opt/data/db [server1]$ zfs list -t filesystem -r datapool2/data NAME USED AVAIL REFER MOUNTPOINT datapool2/data 72K 6.91G18K /datapool2/data datapool2/data/fastdb 18K 6.91G18K /opt/data/fastdb datapool2/data/fastdblog18K 6.91G18K /opt/data/fastdblog datapool2/data/dblog18K 6.91G18K /opt/data/dblog ZFS delegated permissions setup on the sending machine: [server1]$ zfs allow datapool/data - Local+Descendent permissions on (datapool/data) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - [server1]$ zfs allow datapool2/data - Local+Descendent permissions on (data2/data) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - The idea is to create a snapshot and send it to another machine with zfs using zfs send-receive. So I am creating a snapshot and ... get the following error: [server1]$ zfs list -t snapshot -r datapool/data NAMEUSED AVAIL REFER MOUNTPOINT datapool/d...@rolling-2009092314071448K - 2.05G - datapool/data/l...@rolling-20090923140714 16K -18K - datapool/data/d...@rolling-20090923140714 0 -18K - [server1]$ zfs list -t snapshot -r datapool2/data NAMEUSED AVAIL REFER MOUNTPOINT datapool2/d...@rolling-20090923140714 0 -18K - datapool2/data/fas...@rolling-20090923140714 0 -18K - datapool2/data/fastdb...@rolling-20090923140714 0 -18K - datapool2/data/db...@rolling-20090923140714 0 -18K - To send the snapshot I'm using the following command (for "datapool" datapool): [server1]$ zfs send -R datapool/d...@rolling-20090923140714 | ssh server2 zfs receive -vd datapool/data_backups/`hostname`/datapool receiving full stream of datapool/d...@rolling-20090923140714 into datapool/data_backups/server1/datapool/data @rolling-20090923140714 received 2.06GB stream in 62 seconds (34.0MB/sec) receiving full stream of datapool/data/l...@rolling-20090923140714 into datapool/data_backups/server2/datapool/data/l...@rolling-20090923140714 cannot mount 'datapool/data_backups/server1/datapool/data/logs': Insufficient privileges Seems like user "joe" on the remote server ("server2") can not mount the filesystem: [server2]$ zfs mount datapool/data_backups/server1/datapool/data/logs cannot mount 'datapool/data_backups/server1/datapool/data/logs': Insufficient privileges ZFS delegated permissions on the receiving side look fine for me: [server2]$ zfs allow datapool/data_backups/server1/datapool/data/logs - Local+Descendent permissions on (datapool/data_backups) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - Local+Descendent permissions on (datapool) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot "zfs receive" creates a mountpoint with "root:root" permissions: [server2]$ ls -ld /opt/data_backups/server2/datapool/data/logs/ drwxr-xr-x 2 root root 2 Sep 23 14:02 /opt/data_backups/server1/datapool/data/logs/ I've tried to play with RBAC a bit ..: [server2]$ id uid=750(joe) gid=750(prod) [server2]$ profiles File System Security ZFS File System Management File System Management Service Management Basic Solaris User All ... 
but no luck - I still have zfs mount error while receiving a snapshot: Both servers are running Solaris U7 x86_64, Generic_139556-08. Is there any method to setup zfs send-receive functionality for descending zfs filesystems as non-root user? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
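[Editor's note: one possible workaround, offered only as a sketch since it depends on the zfs version in use. If the installed zfs supports the -u flag to receive, the received file systems are not mounted at all, which sidesteps the mount-privilege problem (mounting still requires privileges beyond what zfs allow can delegate on Solaris 10):

  [server1]$ zfs send -R datapool/data@rolling-20090923140714 | \
      ssh server2 zfs receive -u -d datapool/data_backups/server1/datapool

If -u is not available, the receive side may simply need to run with elevated privileges (for example via pfexec or a dedicated role), since it is the mount step that requires them.]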
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 7:32 AM, bertram fukuda wrote: > Thanks for the info Mike. > > Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) create > a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound right? Correct -- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
2009/9/23 bertram fukuda > Thanks for the info Mike. > > Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) > create a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound > right? > > You can create zfs filesystems for each zone and you also can delegate zfs filesystems to be managed by zones. I usually put each zone's root on its own filesystem atleast. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [indiana-discuss] Boot failure with snv_122 and snv_123
Cross-posted to ZFS-Discuss per Vikram's suggestion. Summary: I upgraded to snv_123 and the system hangs on boot. snv_121, and earlier are working fine. Booting with -kv, the system still hung, but after a few minutes, the system continued, spit out more text (referring to disks, but I could not capture the text). Here is what was left on the screen once the debugger kicked in: PCI Express-device: i...@0, ata0 ata0 is /p...@0,0/pci-...@14,1/i...@0 PCI Express-device: pci1002,5...@4, pcieb1 pcieb1 is /p...@0,0/pci1002,5...@4 PCI Express-device: pci1002,5...@7 pcieb3 is /p...@0,0/pci1002,5...@7 UltraDMA mode 4 selected sd4 at ata0: target 0 lun 0 sd4 is /p...@0,0/pci-...@14,1/i...@0/s...@0,0 NOTICE: Can not read the pool label from '/p...@0,0/pci1043,8...@12/d...@0,0:a' NOTICE: spa_import_rootpool: error 5 Cannot mount root on /p...@0,0/pci1043,8...@12/d...@0,0:a fstype zfs panic[cpu0]/thread=fbc2efe0: vfs_mountroot: cannot mount root fbc50ce0 genunix:vfs_mountroot+350 () fbc50d10 genunix:main+e7 () fbc50d20 unix:_locore_start+92 () panic: entering debugger (do dump device, continue to reboot) Then I ran ::stack and ::status: [0]> ::stack kmdb_enter+0xb() debug_enter+0x38(fb934340) panicsys+0x41c(fbb89070, fbc50c70, fbc58e80, 1) vpanic+0x15c() panic+0x94() vfs_mountroot+0x350() main+0xe7() _locore_start+0x92() [0]> ::status debugging live kernel (64-bit) on (not set) operating system: 5.11 snv_123 (i86pc) CPU-specific support: AMD DTrace state: inactive stopped on: debugger entry trap To clarify, when build 122 was announced, I tried upgrading. The new BE would not boot, hanging in the same way that snv_123 does. I later deleted the snv_122 BE. Also, I checked my grub config, and nothing seems out of line there (though I have edited the boot entries to remove the splashimage, foreground, background, and console=graphics). Thanks, Charles > Hi, > > A problem with your root pool - something went wrong > when you upgraded > which explains why snv_122 no longer works as well. > One of the ZFS > experts on this list could help you - I suspect > others may have run into > similar issues before. > > Vikram > > Charles Menser wrote: > > Vikram, > > > > Thank you for the prompt reply! > > > > I have made no BIOS changes. The last time I > changed the BIOS was before reinstalling OpenSolaris > 2009.06 after changing my SATA controller to AHCI > mode. This was some time ago, and I have been using > the /dev repo and installed several development > builds since then (the latest that worked was > snv_121). > > > > I switched to a USB keyboard and mdb was happy. I > am curious why a PS/AUX keyboard works with the > system normally, but not MDB. > > > > Here is what I have from MDB so far: > > > > I rebooted with -kv, and after a few minutes, the > system continued, spit out more text (referring to > disks, but I could not capture the text). 
Here is > what was left on the screen once the debugger kicked > in: > > > > PCI Express-device: i...@0, ata0 > > ata0 is /p...@0,0/pci-...@14,1/i...@0 > > PCI Express-device: pci1002,5...@4, pcieb1 > > pcieb1 is /p...@0,0/pci1002,5...@4 > > PCI Express-device: pci1002,5...@7 > > pcieb3 is /p...@0,0/pci1002,5...@7 > > UltraDMA mode 4 selected > > sd4 at ata0: target 0 lun 0 > > sd4 is /p...@0,0/pci-...@14,1/i...@0/s...@0,0 > > NOTICE: Can not read the pool label from > '/p...@0,0/pci1043,8...@12/d...@0,0:a' > > NOTICE: spa_import_rootpool: error 5 > > Cannot mount root on > /p...@0,0/pci1043,8...@12/d...@0,0:a fstype zfs > > > > panic[cpu0]/thread=fbc2efe0: vfs_mountroot: > cannot mount root > > > > fbc50ce0 genunix:vfs_mountroot+350 () > > fbc50d10 genunix:main+e7 () > > fbc50d20 unix:_locore_start+92 () > > > > panic: entering debugger (do dump device, continue > to reboot) > > > > [again, the above is hand transcribed, and may > contain typos] > > > > Then I ran ::stack and ::status: > > > > [0]> ::stack > > kmdb_enter+0xb() > > debug_enter+0x38(fb934340) > > panicsys+0x41c(fbb89070, fbc50c70, > fbc58e80, 1) > > vpanic+0x15c() > > panic+0x94() > > vfs_mountroot+0x350() > > main+0xe7() > > _locore_start+0x92() > > > > [0]> ::status > > debugging live kernel (64-bit) on (not set) > > operating system: 5.11 snv_123 (i86pc) > > CPU-specific support: AMD > > DTrace state: inactive > > stopped on: debugger entry trap > > > > The motherboard is an ASUS M3A32-MVP, with BIOS Rev > 1705. > > > > There are four 500G SATA drives connected to the > on-board SATA controller. > > > > There is only one pool (rpool), setup as a > three-way mirror: > > > > char...@carbon-box:~$ zpool status > > pool: rpool > > state: ONLINE > > status: The pool is formatted using an older > on-disk format. The pool can > > still be used, but some features are > unavailable. > > action: Upgrade the pool using 'zpool upgrade'. > Once this is
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Thanks for the info Mike. Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) create a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound right? Thanks again, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 7:04 AM, bertram fukuda wrote:
> I have a 1TB LUN being presented to me from our storage team. I need to
> create 2 zones and share the storage between them. Would it be best to
> repartition the LUNs (2 500Gb slices), create 2 separate storage pools then
> assign them separately to each zone? If not, what would be the recommended
> way.
> I've read a ton of documentation but end up getting more confused than
> anything.

The only time that I would create multiple storage pools for zones is if I intend to migrate them to other hosts independently. That is, at time t1 I have zones z1 and z2 on host h1. I think that at some time in the future I would like to move z2 to host h2 while leaving z1 on h1. Since you only have one LUN, you are not able to move the zones independently via reassigning a LUN to another host. That is, it is impossible to split the LUN and unsafe to share the LUN to multiple hosts.

In your situation, I would create one pool and put both zones on it. When you decide you need more zones, put them in it too.

As an aside, I am rarely convinced by the reasoning "I have X amount of space and Y things to put into it, so I will give each thing X/Y space." This is because it is quite likely that someone will do the operation Y++ and there are very few storage technologies that allow you to shrink the amount of space allocated to each item.

-- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] New to ZFS: One LUN, multiple zones
I have a 1TB LUN being presented to me from our storage team. I need to create 2 zones and share the storage between them. Would it be best to repartition the LUNs (2 500Gb slices), create 2 separate storage pools then assign them separately to each zone? If not, what would be the recommended way. I've read a ton of documentation but end up getting more confused than anything. Thanks, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Real help
Hi there, I've been able to restore my zpool from a live CD and reinstall GRUB, but booting from the HD it hangs for a while and then nothing comes up.

j...@opensolaris:~# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     66.2G  5.65G    78K  /a
rpool/ROOT                21.7G  5.65G    18K  /
rpool/ROOT/OSOL_123       16.9G  5.65G  10.7G  /tmp/tmpd38Ilg
rpool/ROOT/be_name         160M  5.65G  10.5G  /
rpool/ROOT/be_name-1      4.23G  5.65G  10.5G  /
rpool/ROOT/opensolaris    35.4M  5.65G  5.69G  /
rpool/ROOT/opensolaris-1   188M  5.65G  5.05G  /tmp/tmp2GKUrN
rpool/ROOT/opensolaris-2  43.0M  5.65G  5.60G  /tmp/tmpUbjG28
rpool/ROOT/opensolaris-3  42.8M  5.65G  5.67G  /tmp/tmpKIjHhB
rpool/ROOT/opensolaris-4  32.3M  5.65G  6.49G  /tmp/tmp7JeXnR
rpool/ROOT/opensolaris-5  87.5M  5.65G  10.4G  /tmp/tmpfvpm6V
rpool/dump                1.50G  5.65G  1.50G  -
rpool/export              41.5G  5.65G    21K  /export
rpool/export/home         41.5G  5.65G   195M  /export/home/
rpool/export/home/hazz    41.3G  5.65G  41.3G  /export/home/hazz/
rpool/swap                1.50G  5.86G  1.29G  -

Any clue on how to rescue this? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss