Re: [zfs-discuss] System started crashing hard after zpool reconfigure and OI upgrade
On Wed, Mar 20, 2013 at 08:50:40AM -0700, Peter Wood wrote:
> I'm sorry. I should have mentioned that I can't find any errors in the logs. The last entry in /var/adm/messages is that I removed the keyboard after the last reboot, and then it shows the new boot-up messages when I boot the system after the crash. The BIOS log is empty. I'm not sure how to check the IPMI, but IPMI is not configured and I'm not using it.

You definitely should! Plug a cable into the dedicated network port and configure it (the easiest way for you is probably to jump into the BIOS and assign the appropriate IP address etc.). Then, for a quick look, point your browser at the given IP on port 80 (default login is ADMIN/ADMIN). You may also want to configure some other details there (accounts/passwords/roles).

To track the problem, either write a script which polls the parameters in question periodically, or just install the latest ipmiViewer and use it to monitor your sensors ad hoc, see ftp://ftp.supermicro.com/utility/IPMIView/

> Just another observation - the crashes are more intense the more data the system serves (NFS). I'm looking into firmware upgrades for the LSI now.

The latest LSI FW should be P15; for this MB, type 217 (2.17), MB-BIOS C28 (1.0b). However, I doubt that your problem has anything to do with the SAS controller, OI or ZFS. My guess is that either your MB is broken (we had an X9DRH-iF which instantly disappeared as soon as it got some real load) or you have a heat problem (watch your CPU temperature, e.g. via ipmiViewer). With 2GHz that's not very likely, but worth a try (socket placement on this board is not really smart, IMHO).
To test quickly:
- disable all additional, unneeded services in OI which may put some load on the machine (like the NFS service, http and so on), and perhaps even export unneeded pools (just to be sure)
- fire up your ipmiViewer and look at the sensors (set the update interval to 10s), or refresh manually often
- start 'openssl speed -multi 32' and keep watching your CPU temperature sensors (with 2GHz I guess it takes ~12min)

I guess your machine disappears before the CPUs get really hot (broken MB). If the CPUs switch off (usually CPU2 first and CPU1 a little bit later), you have a cooling problem. If nothing happens, well, then it could be an OI or ZFS problem ;-)

Have fun, jel.
--
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science
Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany
Tel: +49 391 67 52768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Changing rpool device paths/drivers
On Thu, Oct 04, 2012 at 07:57:34PM -0500, Jerry Kemp wrote:
> I remember a similar video that was up on YouTube, done by some of the Sun guys employed in Germany. They built a big array from USB drives, then exported the pool. Once the system was down, they re-arranged all the drives in random order and ZFS was able to figure out how to put the raid all back together. I need to go find that video.

http://constantin.glez.de/blog/2011/01/how-save-world-zfs-and-12-usb-sticks-4th-anniversary-video-re-release-edition ?

Have fun, jel.
Re: [zfs-discuss] Is there an actual newsgroup for zfs-discuss?
On Tue, Jun 12, 2012 at 08:12:48AM +1000, Alan Hargreaves wrote:
> There is a ZFS Community on the Oracle Communities that was just kicked off this month - https://communities.oracle.com/portal/server.pt/community/oracle_solaris_zfs_file_system/526

Ohh, another censored forum/crappy thing - no thanks!

Regards, jel.
Re: [zfs-discuss] bad seagate drive?
On Mon, Sep 12, 2011 at 12:52:42AM +0100, Matt Harrison wrote:
> On 11/09/2011 18:32, Krunal Desai wrote:
>> On Sep 11, 2011, at 13:01, Richard Elling wrote:
>>> The removed state can be the result of a transport issue. If this is a Solaris-based OS, then look at fmadm faulty for a diagnosis leading to a removal. If none, then look at fmdump -eV for errors relating to the disk. Last, check the zpool history to make sure one of those little imps didn't issue a zpool remove command.
>> Definitely check your cabling; a few of my drives disappeared like this as 'REMOVED', turned out to be some loose SATA cables on my backplane. --khd
> Thanks guys, I reinstalled the drive after testing on the Windows machine and it looks fine now. By the time I'd got on to the console it had already started resilvering. All done now and hopefully it will stay like that for a while.

Hmmm, at least if S11x, a ZFS mirror, ICH10 and the cmdk (IDE) driver are involved, I'm 99.9% confident that "a while" turns out to be some days or weeks only - no matter what Platinum-Enterprise-HDDs you use ;-)

Regards, jel.
Re: [zfs-discuss] bad seagate drive?
On Sun, Sep 11, 2011 at 11:41:32AM +0100, Matt Harrison wrote:
> Hi, I've got a system with 3 WD and 3 Seagate drives. Today I got an email that zpool status indicated one of the Seagate drives as REMOVED. I've tried clearing the error but the pool becomes faulted again. I've taken out the offending drive and plugged it into a Windows box with SeaTools installed. Unfortunately SeaTools finds nothing wrong with the drive.

Wondering: which OS version, driver and which controller? Also, is this always the 2nd drive of a 2-way mirror?

Regards, jel.
Re: [zfs-discuss] Gen-ATA read sector errors
On Thu, Jul 28, 2011 at 01:55:27PM +0200, Koopmann, Jan-Peter wrote:
> Hi, my system is running oi148 on a Super Micro X8SIL-F board. I have two pools (2-disc mirror, 4-disc RAIDZ) with RAID-level SATA drives (Hitachi HUA72205 and Samsung HE103UJ). The system runs as expected, however every few days (sometimes weeks) the system comes to a halt due to these errors:
>
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0 (Disk1):
> Dec 3 13:51:20 nasjpk Error for command 'read sector' Error Level: Fatal
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Requested Block 5503936, Error Block: 5503936
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Sense Key: uncorrectable data error
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Vendor 'Gen-ATA ' error code: XX7
>
> It is not related to this one disk. It happens on all disks. Sometimes several are listed before the system crashes, sometimes just one. I cannot

I tend to agree that the IDE driver seems to have a problem: e.g. on our machines (HP Z400 with a 0B4Ch-D MB with an 82801JI (ICH10 Family) controller, using an rpool 2-way mirror of WDC WD5000AAKS HDDs) we also sometimes see that one drive gets disabled due to too many errors. zpool clear revives the pool (i.e. the HDD gets resilvered very quickly without any problem) 'til it occurs again (i.e. after some days, weeks, or months). Unfortunately we couldn't find a procedure to reproduce the problem (e.g. like for the Marvell ctrl in the early days).

Regards, jel.
Re: [zfs-discuss] SATA disk perf question
On Wed, Jun 01, 2011 at 06:17:08PM -0700, Erik Trimble wrote:
> On Wed, 2011-06-01 at 12:54 -0400, Paul Kraus wrote:
> Here's how you calculate (average) how long a random IOPS takes: seek time + ((60 / RPMs) / 2). A truly sequential IOPS is: (60 / RPMs) / 2. For that series of drives, seek time averages 8.5ms (per Seagate). So you get: 1 random IOPS takes [8.5ms + 4.13ms] = 12.6ms, which translates to 78 IOPS; 1 sequential IOPS takes 4.13ms, which gives 120 IOPS. Note that due to averaging, the above numbers may be slightly higher or lower for any actual workload.

Nahh, shouldn't that read "numbers may be _significantly_ higher or lower ..."? ;-)

Regards, jel.
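A quick sanity check of the quoted rule of thumb (my own arithmetic, assuming a 7200 RPM drive, whose half rotation of (60/7200)/2 ~ 4.17 ms is close to the quoted 4.13 ms; note the quoted 120 IOPS figure actually matches one *full* rotation of ~8.33 ms, not the 4.13 ms half rotation):

```shell
# Random IOPS: each I/O pays an average seek plus half a rotation.
# Sequential IOPS here: one full rotation per I/O (matches the 120 figure).
awk -v seek_ms=8.5 -v rpm=7200 'BEGIN {
    half_rot_ms = (60 / rpm) / 2 * 1000
    printf "random:     %.0f IOPS\n", 1000 / (seek_ms + half_rot_ms)
    printf "sequential: %.0f IOPS\n", 1000 / ((60 / rpm) * 1000)
}'
```

prints "random: 79 IOPS" and "sequential: 120 IOPS" - close to the 78/120 quoted above.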
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, Apr 08, 2011 at 08:29:31PM +1200, Ian Collins wrote:
> On 04/ 8/11 08:08 PM, Mark Sandrock wrote:
>> ... I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? ...
> No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a failover box (both data and applications) for the other.

Same thing here, plus several zones (source code repositories, documentation, even a real Samba server to avoid the MS crap, an install server, shared installs (i.e. relocatable packages shared via NFS, e.g. as /local/usr ...)). So yes, the 7xxx is a no-go for us as well. If there are no X45xx anymore, we'll find alternatives from other companies ...

>> Guess I'm slow. :-)

Maybe - flexibility/dependencies are some of the keywords ;-)

Regards, jel.
Re: [zfs-discuss] ZFS and TRIM
On Fri, Feb 04, 2011 at 03:30:45PM +0100, Pawel Jakub Dawidek wrote:
> On Sat, Jan 29, 2011 at 11:31:59AM -0500, Edward Ned Harvey wrote:
>> What is the status of ZFS support for TRIM? [...]
> My initial idea was to implement 100% reliable TRIM, so that I can implement secure delete using it, e.g. if ZFS is placed on top of disk

Hmmm - IIRC, ZFS load-balances ZIL ops over the available devices. Furthermore I guess almost everyone who considers SSDs for the ZIL will use at least 2 devices (i.e. since there is no big benefit in having a mirrored device, many people will probably use one SSD per device; some more paranoid people will choose to have an N-way mirror per device). So why not turn the 2nd device temporarily off (freeze), do what you want (trim/reformat/etc.) and then turn it on again? I assume, of course, that the load balancer recognizes when a device goes offline/online and automatically uses only the available ones ... If one doesn't have at least 2 devices, don't care about this home-optimized setup ;-)

Regards, jel.
Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?
On Sat, Nov 27, 2010 at 03:04:27AM -0800, Erik Trimble wrote:
> Hi, I haven't had a chance to test a Vertex 2 PRO against my 2 EX, and I'd be interested if anyone else has. The EX is SLC-based, and the PRO is MLC-based, but the claimed performance numbers are similar. If the PRO works well, it's less than half the cost, and would be a nice solution for most users who don't need ultra-super-performance from their ZIL.

Well, we'll get some toys to play with in the next couple of weeks (i.e. before x-mas), which probably allows us to mimic your env and do some testing before they go into production (probably in March 2011). So if you or anybody else has a special setup to test, feel free to send me a note ...

HW details: a new X4540 + 9x OCZSSD2-2VTXP50G + 3x OCZSSD2-2VTXP50G + an SM server with an LSI 620J and 24x 15K SAS2 drives (see http://iws.cs.uni-magdeburg.de/~elkner/supermicro/server.html) as well as a bunch of HP Z400 Xeon W3680 based WS. Estimated delivery of the 10G components is end of January 2011 (Nexus 5010 + some 3560X-nT-Ls).

> The DDRdrive is still the way to go for the ultimate ZIL acceleration,

Well, not for us, since full-height cards are a no-go for us, and PCIe 1.x x1 ...

Regards, jel.
Re: [zfs-discuss] Running on Dell hardware?
On Tue, Oct 26, 2010 at 08:06:53AM +1300, Ian Collins wrote:
> On 10/26/10 01:38 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian Collins
>>> Sun hardware? Then you get all your support from one vendor.
>> +1 Sun hardware costs more, but it's worth it, if you want to simply assume your stuff will work. In my case, I'd say the Sun hardware was approx 50% to 2x higher cost than the equivalent Dell setup.
> I find that claim odd. Whenever we bought kit down here in NZ, Sun has been the best on price. Maybe that's changed under the new order.

Add about 50% to the last price list from Sun and you will get the price it costs now ...

Have fun, jel.
Re: [zfs-discuss] Legality and the future of zfs...
On Mon, Jul 12, 2010 at 05:05:41PM +0100, Andrew Gabriel wrote:
> Linder, Doug wrote:
>> Out of sheer curiosity - and I'm not disagreeing with you, just wondering - how does ZFS make money for Oracle when they don't charge for it? Do you think it's such an important feature that it's a big factor in customers picking Solaris over other platforms?
> Yes, it is one of many significant factors in customers choosing Solaris over other OS's. Having chosen Solaris, customers then tend to buy Sun/Oracle systems to run it on.

Hit the nail on the head, twice. But only if one doesn't have to sell one's kingdom to get recommended/security patches. Otherwise the windooze nerds take over ...

Regards, jel.
Re: [zfs-discuss] Homegrown Hybrid Storage
On Sun, Jun 06, 2010 at 09:16:56PM -0700, Ken wrote:
> I'm looking at VMware, ESXi 4, but I'll take any advice offered. ... I'm looking to build a virtualized web hosting server environment accessing files on a hybrid storage SAN. I was looking at using the Sun X-Fire X4540 with the following configuration:

IMHO Solaris zones with LOFS-mounted ZFS filesystems give you the highest flexibility in all directions, probably the best performance and least resource consumption, fine-grained resource management (CPU, memory, storage space), less maintenance stress, etc. ...

Have fun, jel.
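As an illustration of the zones + LOFS approach (the zone and dataset names below are made up; syntax per zonecfg(1M)): a dataset managed in the global zone, say mounted at /tank/web, can be loopback-mounted into a zone like this:

```shell
# Hypothetical names: zone "webzone", host-side dataset mounted at /tank/web.
zonecfg -z webzone <<'EOF'
add fs
set dir=/export/web
set special=/tank/web
set type=lofs
end
commit
EOF
```

The zone then sees the host's /tank/web at /export/web, while quotas, snapshots etc. stay managed via ZFS in the global zone.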
Re: [zfs-discuss] Sun X4500 disk drives
On Wed, May 12, 2010 at 09:34:28AM -0700, Doug wrote:
> We have a 2006 Sun X4500 with Hitachi 500G disk drives. It's been running for over four years and just now fmadm/zpool reports a disk has failed. No data was lost (RAIDZ2 + hot spares worked as expected.) But, the server is out of warranty and we have no hardware support on it.

Well - had the same thing here (X4500, Q1 2007) 2-3 times a couple of months ago. The 'too many errors' msg rang some bells: do you remember the race condition problems in the marvell driver (IIRC especially late u3, u4) which caused many 'bad ...' errors in the logs? So I simply checked the drive in question (quick and dirty: 2x dd over the whole disk, checking whether an error occurred). Since there was not a single error nor bad performance, I put it back, and no wonder, it is still working ;-). Your situation might be different, but checking may not hurt - your disk might be the victim of an SW aka ZFS error counter ...

Have fun, jel.
Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)
On Fri, Feb 26, 2010 at 09:25:57PM -0700, Eric D. Mudama wrote:
> ... I agree with the above, but the best practices guide: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_file_service_for_SMB_.28CIFS.29_or_SAMBA states in the SAMBA section that "Beware that mounting 1000s of file systems will impact your boot time." I'd say going from a 2-3 minute boot time to a 4+ hour boot time is more than just "impact". That's getting hit by a train.

At least on S10u8 it's not that bad. Last time I patched and rebooted an X4500 with ~350 ZFS filesystems, it took about 10min to come up; an X4600 with a 3510 and ~2350 ZFS filesystems took about 20min (almost all are shared via NFS). Shutting down/unsharing them takes roughly the same time ... On the X4600, creating|destroying a single ZFS (no matter on which pool or how many ZFS belong to the same pool!) takes about 20 sec, renaming about 40 sec - that's really a pain ...

Regards, jel.
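A back-of-the-envelope check of those numbers (my own arithmetic, not measured) gives the average per-filesystem mount/share cost at boot:

```shell
# ~350 filesystems in ~10 min vs. ~2350 filesystems in ~20 min
awk 'BEGIN {
    printf "X4500: %.1f s per filesystem\n", 10 * 60 / 350
    printf "X4600: %.2f s per filesystem\n", 20 * 60 / 2350
}'
```

So the bigger box actually spends less time per filesystem, but either way the cost grows linearly with the number of filesystems, which is why thousands of them hurt.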
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 10:29:18AM -0500, Frank Cusack wrote:
> On February 3, 2010 12:04:07 PM +0200 Henu henrik.he...@tut.fi wrote:
>> Is there a possibility to get a list of changed files between two snapshots?
> Great timing, as I just looked this up last night; I wanted to verify that an install program was only changing the files on disk that it claimed to be changing. So I have to say, come on. It took me but one google search and the answer was one of the top 3 hits. http://forums.freebsd.org/showthread.php?p=65632
>
> # newer files
> find /file/system -newer /file/system/.zfs/snapshot/snapname -type f
>
> # deleted files
> cd /file/system/.zfs/snapshot/snapname
> find . -type f -exec sh -c 'test -f "/file/system/$1" || echo "$1"' sh {} \;
>
> The above requires GNU find (for -newer), and obviously it only finds files. If you need symlinks or directory names, modify as appropriate. The above is also obviously to compare a snapshot to the current filesystem. To compare two snapshots, make the obvious modifications.

Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ wrt. dir2dir cmp may help as well (should be faster).

Have fun, jel.
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 12:19:50PM -0500, Frank Cusack wrote:
> On February 3, 2010 6:02:52 PM +0100 Jens Elkner jel+...@cs.uni-magdeburg.de wrote:
>> On Wed, Feb 03, 2010 at 10:29:18AM -0500, Frank Cusack wrote:
>>> # newer files
>>> find /file/system -newer /file/system/.zfs/snapshot/snapname -type f
>>> # deleted files
>>> cd /file/system/.zfs/snapshot/snapname
>>> find . -type f -exec test -f /file/system/{} || echo {} \;
>>> The above requires GNU find (for -newer), and obviously it only finds files. If you need symlinks or directory names, modify as appropriate. The above is also obviously to compare a snapshot to the current filesystem. To compare two snapshots, make the obvious modifications.
>> Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ wrt. dir2dir cmp may help as well (should be faster).
> If you don't need to know about deleted files, it wouldn't be. It's hard to be faster than walking through a single directory tree if ddiff has to walk through 2 directory trees.

Yepp, but I guess the 'test ...' invocation for each file alone is much more time consuming, and IIRC the test -f path has to do several stats as well 'til it reaches its final target. So a lot of overhead again. However, just finding newer files via 'find' is probably unbeatable ;-)

> If you do need to know about deleted files, the find method still may be faster depending on how ddiff determines whether or not to do a file diff. The docs don't explain the heuristics so I wouldn't want to guess on that.

ddiff is a single process and basically travels recursively through the directories via a DirectoryStream (side by side) and stops at the point where no more information is required to make the final decision (depends on the cmd line options). So for very deep dirs with a lot of entries it needs [much] more memory than find, yes. Not sure how DirectoryStream is implemented, but I guess it gets mapped to readdir(3C) and friends ...

Regards, jel.
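The side-by-side walk described above can also be approximated with standard tools; a rough sketch (mine, not ddiff's actual algorithm) that compares two trees by path names only, ignores timestamps, and reports added and deleted files:

```shell
# Names-only diff of two directory trees; content comparison omitted.
snapshot_diff() {
    old=$1 new=$2
    ( cd "$old" && find . -type f | sort ) > /tmp/sdiff_old.$$
    ( cd "$new" && find . -type f | sort ) > /tmp/sdiff_new.$$
    echo "added:";   comm -13 /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
    echo "deleted:"; comm -23 /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
    rm -f /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
}
# e.g.: snapshot_diff /tank/home/.zfs/snapshot/monday /tank/home
```

Unlike the per-file 'test -f' one-liner, this forks no extra process per file; the trade-off is holding both sorted name lists in /tmp, roughly the memory behaviour discussed for ddiff.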
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 06:46:57PM -0500, Ross Walker wrote:
> On Feb 3, 2010, at 12:35 PM, Frank Cusack frank+lists/z...@linetwo.net wrote:
>> On February 3, 2010 12:19:50 PM -0500 Frank Cusack frank+lists/z...@linetwo.net wrote:
>>> If you do need to know about deleted files, the find method still may be faster depending on how ddiff determines whether or not to do a file diff. The docs don't explain the heuristics so I wouldn't want to guess on that.
>> An improvement on finding deleted files with the find method would be to not limit your find criteria to files. Directories with deleted files will be newer than in the snapshot, so you only need to look at those directories. I think this would be faster than ddiff in most cases.
> So was there a final consensus on the best way to find the difference between two snapshots (files/directories added, files/directories deleted and files/directories changed)? Find won't do it, ddiff won't do it,

ddiff does exactly this. However, it never looks at any timestamp, since that is the most unimportant/unreliable path-component tag wrt. what has been changed, and it also does not take file permissions and xattrs into account. So ddiff is all about path names, types and content. Not more, but also not less ;-)

> I think the only real option is rsync. Of course you can zfs send the snap to another system and do the rsync there against a local previous version.

Probably the worst of all suggested alternatives ...

Have fun, jel.
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Mon, Dec 14, 2009 at 01:29:50PM +0300, Andrey Kuzmin wrote:
> On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner jel+...@cs.uni-magdeburg.de wrote:
>> ... Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice ...
> Flash-based read cache should help here by minimizing (metadata) read latency, and a flash-based log would bring down write latency.

Hmmm, not yet sure - I think writing via NFS is the biggest problem. Anyway, I have almost finished the work on a 'generic collector' and data visualizer, which allows us to better correlate the numbers to each other on the fly (i.e. no rrd pain) and hopefully understand them a little bit better ;-).

> The only drawback of using a single F20 is that you're trying to minimize both with the same device.

Yepp. But would that scenario change much when one puts 4 SSDs into HDD slots instead? I guess not really, or it would even be worse, because it disturbs the data path from/to the HDD controllers. Anyway, I'll try that out next year, when those neat toys are officially supported (and the budget for this got its final approval, of course).

Regards, jel.
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 03:28:29PM +0000, Robert Milkowski wrote:
> Jens Elkner wrote:
>> Hi Robert, just got a quote from our campus reseller, that readzilla and logzilla are not available for the X4540 - hmm, strange ... Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs? If so, is it possible to partition the F20, e.g. into a 36 GB logzilla and a 60 GB readzilla (also interesting for other X servers)?
> IIRC the card presents 4x LUNs, so you could use each of them for a different purpose. You could also use different slices.

Oh, cool - IMHO this would be sufficient for our purposes (see next posting).

>> ... it doesn't matter, whether the log cache is protected for a short time or not. Is this correct?
> It still does. The capacitor is not for flushing data to the disk drives! The card has a small amount of DRAM memory on it which is being flushed to FLASH. The capacitor is to make sure that actually happens if the power is lost.

Yepp - found the specs. (BTW: was probably too late to think about the term "Flash Accelerator" having DRAM Prestoserve in mind ;-)).

Thanks, jel.
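For the slice variant, the commands would look roughly like this (a sketch only: the pool name and device names are made up, and the ~36/~60 GB slices would have to be laid out on the F20 LUN beforehand, e.g. with format(1M)):

```shell
# s0 (~36 GB) as separate intent log ("logzilla"),
# s1 (~60 GB) as L2ARC read cache ("readzilla")
zpool add tank log /dev/dsk/c4t0d0s0
zpool add tank cache /dev/dsk/c4t0d0s1
```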
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 04:23:21PM +0000, Andrey Kuzmin wrote:
> As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-)

Hmm - good point. What I'm trying to accomplish: actually our current prototype thumper setup is:

- root pool (1x 2-way mirror SATA)
- hotspare (2x SATA, shared)
- pool1 (12x 2-way mirror SATA) ~25% used - user homes
- pool2 (10x 2-way mirror SATA) ~25% used - mm files, archives, ISOs

So pool2 is not really a problem - it delivers about 600MB/s uncached, about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8GB ISO) and is not continuously stressed. However, sync write is ~200 MB/s, i.e. 20 MB/s per mirror, only. The problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice, usually via NFS and a little bit via Samba - a lot of more or less small files, probably widely spread over the platters. E.g. checking out a project from a svn|* repository into a home takes hours. Also, having one's workspace on NFS isn't fun (compared to a local Linux xfs-driven soft 2-way mirror).

Data currently come in/go out via 1Gbps aggregated NICs; for the X4540 we plan to use one 10 Gbps NIC (and may experiment with two some time later). So max. 2 GB/s read and write. This still leaves 2GB/s in and out for the last PCIe x8 slot - the F20. Since the IO55 is bound with 4GB/s bidirectional HT to Mezzanine Connector 1, in theory those 2 GB/s to and from the F20 should be possible. So IMHO wrt. bandwidth it basically makes no real difference whether one puts 4 SSDs into HDD slots or uses the 4 flash modules on the F20 (even when distributing the SSDs over the IO55(2) and MCP55). However, having it on a separate HT link than the HDDs might be an advantage. Also, one would be much more flexible and able to scale immediately, i.e. one doesn't need to re-organize the pools because of now-unavailable slots and is still able to use all HDD slots with normal HDDs. (We are certainly going to upgrade the X4500 to an X4540 next year ...)

(And if Sun makes an F40 - dropping the SAS ports and putting 4 more flash modules on it - or is able to get flash modules with double the speed, one could probably really get ~1.2 GB/s write and ~2 GB/s read.)

So this seems to be a really interesting thing, and I expect at least wrt. user homes a real improvement, no matter how the final configuration will look. Maybe the experts at the source are able to do some 4x SSD vs. 1x F20 benchmarks? I guess, at least if they turn out to be good enough, it wouldn't hurt ;-)

> Jens Elkner wrote:
>> ... whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs?

Regards, jel.
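One caveat on the link-budget arithmetic above (my own numbers; raw line rates, protocol overhead ignored): a single 10 Gbps NIC tops out around 1.25 GB/s per direction, so the "max. 2 GB/s" figure matches the PCIe 1.x x8 slot limit, or two NICs, rather than one NIC:

```shell
awk 'BEGIN {
    printf "10 GbE:      %.2f GB/s per direction\n", 10 / 8
    printf "PCIe 1.x x8: %.1f GB/s per direction\n", 8 * 0.25   # ~250 MB/s/lane
}'
```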
[zfs-discuss] X4540 + SFA F20 PCIe?
Hi, just got a quote from our campus reseller that readzilla and logzilla are not available for the X4540 - hmm, strange ... Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs? If so, is it possible to partition the F20, e.g. into a 36 GB logzilla and a 60 GB readzilla (also interesting for other X servers)?

Wrt. supercapacitors: I would guess that, at least wrt. the X4540, it doesn't give one more protection, since if power is lost, the HDDs do not respond anymore, and thus it doesn't matter whether the log cache is protected for a short time or not. Is this correct?

Regards, jel.
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
On Wed, Oct 28, 2009 at 01:55:57AM -0700, Ben Middleton wrote:
> Hi,
> $ ludelete 10_05-09
> System has findroot enabled GRUB
> Checking if last BE on any disk...
> ERROR: cannot mount '/.alt.10_05-09/var': directory is not empty
> ERROR: cannot mount mount point /.alt.10_05-09/var device rpool/ROOT/s10x_u7wos_08/var
> ERROR: failed to mount file system rpool/ROOT/s10x_u7wos_08/var on /.alt.10_05-09/var
> ERROR: unmounting partially mounted boot environment file systems
> ...
> rpool/ROOT/s10x_u7wos_08      17.4M 4.26G 4.10G /.alt.10_05-09
> rpool/ROOT/s10x_u7wos_08/var  9.05M 4.26G 2.11G /.alt.10_05-09/var

luumount /.alt.10_05-09
mount -p | grep /.alt.10_05-09
# if it lists something (e.g. tmp, swap, etc.), reboot first, and then:
zfs set mountpoint=/mnt rpool/ROOT/s10x_u7wos_08
zfs mount rpool/ROOT/s10x_u7wos_08
rm -rf /mnt/var/* /mnt/var/.???*
zfs umount /mnt
# now that should work:
lumount 10_05-09 /mnt
luumount /mnt
# if not, send the output of: mount -p | grep ' /mnt'

Have fun, jel.
[zfs-discuss] strange results ...
Hmmm, wondering about IMHO strange ZFS results ...

X4440: 4x6 2.8GHz cores (Opteron 8439 SE), 64 GB RAM, 6x Sun STK RAID INT V1.0 (Hitachi H103012SCSUN146G SAS), Nevada b124.

Started with a simple test using zfs on c1t0d0s0:

cd /var/tmp
(1) time sh -c 'mkfile 32g bla ; sync'
    0.16u 19.88s 5:04.15 6.5%
(2) time sh -c 'mkfile 32g blabla ; sync'
    0.13u 46.41s 5:22.65 14.4%
(3) time sh -c 'mkfile 32g blablabla ; sync'
    0.19u 26.88s 5:38.07 8.0%
chmod 644 b*
(4) time dd if=bla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.26u 25.34s 6:06.16 6.9%
(5) time dd if=blabla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.15u 26.67s 4:46.63 9.3%
(6) time dd if=blablabla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.10u 20.56s 0:20.68 99.9%

So 1-3 is more or less as expected (~97..108 MB/s write). However, 4-6 looks strange: 89, 114 and 1585 MB/s read! Since the ARC size is ~55+-2 GB (at least arcstat.pl says so), I guess (6) reads from memory completely. Hmm - maybe. However, I would expect that when repeating 5-6, 'blablabla' gets replaced in the cache by 'bla' or 'blabla'. But the numbers say that 'blablabla' is kept in the cache, since I get almost the same results as in the first run (and zpool iostat/arcstat.pl show almost no activity at all for blablabla). So is this a ZFS bug? Or does the OS do some magic here?

2nd) Never had a Sun STK RAID INT before. Actually my intention was to create a zpool mirror of sd0 and sd1 for boot and logs, and a 2x2-way zpool mirror with the 4 remaining disks. However, the controller does not seem to support JBODs :( - which is also bad, since we can't simply put those disks into another machine with a different controller without data loss, because the controller seems to use its own format under the hood. Also the 256 MB battery-backed cache seems to be a little bit small for the ZIL, even if one knew how to configure it for that ...

So what would you recommend?
Creating 2 appropriate STK INT arrays and using each as a single zpool device, i.e. without ZFS mirror devs and 2nd copies? The intended workload is MySQL DBs + VBox images wrt. the 4-disk mirror, and logs + OS for the 2-disk mirror; the machine should also act as a sunray server (user homes and additional apps are coming from another server via NFS). Any hints?

Regards, jel.
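The write rates quoted above can be sanity-checked from the mkfile wall-clock times: 32 GiB (= 32*1024 MiB) divided by the three elapsed times (5:04.15, 5:22.65, 5:38.07 converted to seconds). A quick sketch:

```shell
# Back-of-envelope check of the mkfile write numbers above:
# 32 GiB over each run's wall-clock time, printed as MB/s.
for t in 304.15 322.65 338.07; do
    awk -v sec="$t" 'BEGIN { printf "%.0f MB/s\n", 32 * 1024 / sec }'
done
# prints: 108 MB/s, 102 MB/s, 97 MB/s
```

That reproduces the "~97..108 MB/s write" range the post mentions; the same arithmetic on the dd read times gives the 89/114/1585 MB/s figures.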
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
On Fri, Oct 16, 2009 at 07:36:04PM -0700, Paul B. Henson wrote:
> I used live upgrade to update a U6+lots'o'patches system to vanilla U8. I ran across CR 6884728, which results in extraneous lines in vfstab preventing successful boot. I logged in with maintenance mode and deleted

Having a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/solaris-upgrade.txt shouldn't hurt ;-)

> those lines, and the U8 BE came up ok. I wasn't sure if there were any other problems from that, so I tried to activate and boot back into my previous U6 BE. That now fails with this error:
>
> This device is not bootable!
> It is either offlined or detached or faulted.
> Please try to boot from a different device.
>
> NOTICE: spa_import_rootpool: error 22
> Cannot mount root on /p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@0,0:a fstype zfs
> panic[cpu0]/thread=fbc283a0: vfs_mountroot: cannot mount root
>
> I can still boot fine into the new U8 BE, but so far have found no way to recover and boot into my previously existing U6 BE.

Hmm - haven't done thumper upgrades yet, but on sparc there is no problem booting into the old BE as long as the zpool hasn't been upgraded to U8's v15. So the first thing to check is whether the pool is still at <= v10 (U7 used v10, not sure about U6).

Regards, jel.
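The version check suggested above can be sketched as a plain-sh guard. The values here are placeholders for what `zpool upgrade` would actually report on the box - U7's v10 and U8's v15 are from the post, everything else is assumption:

```shell
#!/bin/sh
# Hypothetical guard before activating an older BE: make sure the pool
# was not upgraded past the highest version the old release understands.
pool_ver=10     # placeholder: would be parsed from `zpool upgrade` output
old_be_max=10   # U7 shipped zpool v10 (per the post); U8 is v15

if [ "$pool_ver" -le "$old_be_max" ]; then
    echo "old BE can still import the pool"
else
    echo "pool upgraded beyond the old BE - do not activate it"
fi
```

If the pool is already at v15, the old BE's kernel cannot import it and you get exactly the spa_import_rootpool panic quoted above.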
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, Oct 13, 2009 at 10:59:37PM -0600, Drew Balfour wrote:
> ... For Opensolaris, Solaris CIFS != samba. Solaris now has a native in-kernel CIFS server which has nothing to do with samba. Apart from having its commands start with smb, which can be confusing. http://www.opensolaris.org/os/project/cifs-server/

Ah ok. Thanx for the clarification!

Regards, jel.
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, Oct 13, 2009 at 09:20:23AM -0700, Paul B. Henson wrote:
> We're currently using the Sun bundled Samba to provide CIFS access to our ZFS user/group directories. ... Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE.

Haven't tested the bundled samba stuff for a long time, since I don't trust it: the bundled stuff didn't work when tested; the packages are IMHO awfully assembled; problems are not understood by the involved engineers (or they are not willing to understand them); and the team seems to follow the dogma "fix the symptoms, not the root cause". So at least if the bundled stuff is modified according to their RFEs on bugzilla, don't be surprised if your environment gets screwed up - especially when you have a mixed user group, i.e. Windows and *ix based users, which use workgroup directories for sharing their stuff.

So we still use the original samba and it causes no headaches. Once we had a problem when switching some desktops to Vista and MS Office 2007, due to the new win strategy "save changes to a tmp file, then rename to the original file" - wrong ACLs. However, this was fixed within ONE DAY: just did some code scanning, talked to Jeremy Allison via the smb IRC channel, and voilà, he came up with a fix pretty fast. So I didn't need to waste my time explaining the problem again and again to SUN support, creating explorer archives (which usually hang the NFS services - which couldn't be fixed without a reboot!), and waiting several months to get it fixed (BTW: IIRC, I opened a case for this via sun support, so if it hasn't been silently closed, it's probably still open ...).

Since we guess that CIFS gets screwed up by the same team, we don't use it either (well, and can't, because we've no ADS ;-)).

My 10¢. Regards, jel.
Re: [zfs-discuss] alternative hardware configurations for zfs
On Sat, Sep 12, 2009 at 02:37:35PM -0500, Tim Cook wrote:
> On Sat, Sep 12, 2009 at 10:17 AM, Damjan Perenic ...
> > I shopped for 1TB 7200rpm drives recently and I noticed Seagate Barracuda ES.2 has a 1TB version with SATA and SAS interface.
> On the flip side, according to storage review, the SATA version trumps the SAS version in pretty much everything but throughput (which is still negligible).
> [5]http://www.storagereview.com/php/benchmark/suite_v4.php?typeID=10&testbedID=4&osID=6&raidconfigID=1&numDrives=1&devID_0=354&devID_1=362&devCnt=2
> --Tim

Just in case you are interested in SATA, perhaps this helps (made on an almost idle system):

elkner.sol /pool2 uname -a
SunOS sol 5.11 snv_98 i86pc i386 i86xpv

elkner.sol /rpool prtdiag
System Configuration: Intel S5000PAL
BIOS Configuration: Intel Corporation S5000.86B.10.00.0091.081520081046 08/15/2008
BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)
Processor Sockets
Version Location Tag
--
Intel(R) Xeon(R) CPU E5440 @ 2.83GHz CPU1
Intel(R) Xeon(R) CPU E5440 @ 2.83GHz CPU2
...
elkner.sol /pool2 + /usr/X11/bin/scanpci | grep -i sata
Intel Corporation 631xESB/632xESB SATA AHCI Controller

elkner.sol ~ iostat -E | \
  awk '/^sd/ { print $1; getline; print; getline; print }'
sd0
Vendor: ATA Product: ST3250310NS Revision: SN05 Serial No:
Size: 250.06GB 250059350016 bytes
sd1
Vendor: ATA Product: ST3250310NS Revision: SN04 Serial No:
Size: 250.06GB 250059350016 bytes
sd2
Vendor: ATA Product: ST3250310NS Revision: SN04 Serial No:
Size: 250.06GB 250059350016 bytes
sd3
Vendor: ATA Product: ST3250310NS Revision: SN05 Serial No:
Size: 250.06GB 250059350016 bytes
sd5
Vendor: ATA Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB 1000204886016 bytes
sd6
Vendor: ATA Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB 1000204886016 bytes

elkner.sol ~ zpool status | grep ONLINE
 state: ONLINE
        pool1         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t2d0    ONLINE       0     0     0
            c1t3d0    ONLINE       0     0     0
 state: ONLINE
        pool2         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t4d0    ONLINE       0     0     0
            c1t5d0    ONLINE       0     0     0
 state: ONLINE
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

elkner.sol /pool2 + time sh -c 'mkfile 4g xx; sync; echo ST31000340NS'
ST31000340NS
real 3:55.2
user 0.0
sys 1.9

elkner.sol ~ iostat -zmnx c1t4d0 c1t5d0 5 | grep -v device
0.0 154.2 0.0 19739.4 3.0 32.0 19.4 207.5 100 100 c1t4d0
0.0 125.8 0.0 16103.9 3.0 32.0 23.8 254.3 100 100 c1t5d0
0.0 133.0 0.0 16366.9 2.4 25.9 17.9 194.4  80  82 c1t4d0
0.0 158.0 0.0 19592.5 2.8 30.3 17.6 191.7  93  96 c1t5d0
0.0 159.4 0.0 20054.8 2.8 30.3 17.7 190.2  94  95 c1t4d0
0.0 140.2 0.0 17597.2 2.8 30.3 20.1 216.4  94  95 c1t5d0
0.0 134.8 0.0 16298.7 2.0 23.0 15.2 170.8  68  76 c1t4d0
0.0 154.4 0.0 18807.5 2.7 29.3 17.3 189.9  89  94 c1t5d0
0.0 188.4 0.0 24115.5 3.0 32.0 15.9 169.8 100 100 c1t4d0
0.0 159.8 0.0 20454.6 3.0 32.0 18.8 200.2 100 100 c1t5d0
0.0 120.0 0.0 14328.3 2.0 22.2 16.4 184.9  66  71 c1t4d0
0.0 143.2 0.0 17169.9 2.6 28.2 18.0 197.1  86  93 c1t5d0
0.0 157.0 0.0 19140.9 2.6 29.3 16.5 186.9  87  96 c1t4d0
0.0 169.2 0.0 20676.9 2.2 24.8 13.2 146.6  75  79 c1t5d0
0.0 156.2 0.0 19993.8 3.0 32.0 19.2 204.8 100 100 c1t4d0
0.0 140.4 0.0 17971.3 3.0 32.0 21.3 227.9 100 100 c1t5d0
0.0 138.8 0.0 16759.6 2.6 29.3 18.4 210.9  86  94 c1t4d0
0.0 146.6 0.0 17809.2 2.7 29.6 18.4 201.7  90  94 c1t5d0
0.0 133.8 0.0 16196.8 2.5 28.0 18.9 209.3  85  90 c1t4d0
0.0 134.0 0.0 16222.4 2.6 28.7 19.5 214.3  87  94 c1t5d0
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device

elkner.sol /pool1 + time sh -c 'mkfile 4g xx; sync; echo ST3250310NS'
ST3250310NS
real 1:33.5
user 0.0
sys 2.0

elkner.sol ~ iostat -zmnx c1t2d0 c1t3d0 5 | grep -v device
0.2 408.6  1.6 49336.8 25.7 0.8 62.8 1.9  79  79 c1t3d0
0.2 432.6  1.6 53284.4 29.9 0.9 69.0 2.1  89  89 c1t2d0
0.2 456.0  1.6 56280.0 28.6 0.9 62.6 1.9  86  86 c1t3d0
0.8 389.8 17.6 45360.7 25.8 0.8 66.0 2.1  81  80 c1t2d0
0.4 368.6  3.2 42698.0 21.1 0.6 57.3 1.8  65  65 c1t3d0
1.0 432.4  8.0 52615.8 30.2 0.9 69.6 2.1  91  91 c1t2d0
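To compare the two drive types from raw iostat output like the above, a short awk pass can average the kw/s column (field 4) per device name (last field). A sketch, fed with three sample lines copied from the ST31000340NS run:

```shell
# Average kw/s per device from `iostat -xn`-style lines.
awk '{ kw[$NF] += $4; n[$NF]++ }
     END { for (d in kw) printf "%s: %.0f kw/s avg\n", d, kw[d] / n[d] }' <<'EOF'
0.0 154.2 0.0 19739.4 3.0 32.0 19.4 207.5 100 100 c1t4d0
0.0 125.8 0.0 16103.9 3.0 32.0 23.8 254.3 100 100 c1t5d0
0.0 133.0 0.0 16366.9 2.4 25.9 17.9 194.4 80 82 c1t4d0
EOF
```

In practice one would pipe `iostat -zmnx <disks> 5 | grep -v device` straight into the awk instead of a here-document.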
Re: [zfs-discuss] live upgrade with lots of zfs filesystems
On Thu, Aug 27, 2009 at 10:59:16PM -0700, Paul B. Henson wrote:
> On Thu, 27 Aug 2009, Paul B. Henson wrote:
> > However, I went to create a new boot environment to install the patches into, and so far that's been running for about an hour and a half :(, which was not expected or planned for. [...] I don't think I'm going to make my downtime window :(, and will probably need to reschedule the patching. I never considered I might have to start the patch process six hours before the window.
> Well, so far lucreate took 3.5 hours, lumount took 1.5 hours, applying the patches took all of 10 minutes, luumount took about 20 minutes, and luactivate has been running for about 45 minutes. I'm assuming it will

Have a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch or http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.11.patch ...

So first install the most recent LU patches and then one of the above. Since I'm still on vacation (for ~8 weeks), I haven't checked whether there are new LU patches out there and whether the patches still match (usually they do). If not, adjusting the files manually shouldn't be a problem ;-)

There are also versions for pre snv_b107 and pre 121430-36,121431-37: see http://iws.cs.uni-magdeburg.de/~elkner/

More info: http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html#luslow

Have fun, jel.
Re: [zfs-discuss] Boot error
On Thu, Aug 27, 2009 at 04:05:15PM -0700, Grant Lowe wrote:
> I've got a 240z with Solaris 10 Update 7, all the latest patches from Sunsolve. I've installed a boot drive with ZFS. I mirrored the drive with zpool. I installed the boot block. The system had been working just fine. But for some reason, when I try to boot, I get the error:
>
> {1} ok boot -s
> Boot device: /p...@1c,60/s...@2/d...@0,0 File and args: -s
> SunOS Release 5.10 Version Generic_141414-08 64-bit
> Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
> Use is subject to license terms.
> Division by Zero
> {1} ok

My guess: s0 was too small when updating the boot archive. So booting from a jumpstart dir/CD, mounting s0 (e.g. to /a) and running

bootadm update-archive -R /a

should fix the problem. If you are low on space on /, manually

rm -f /a/platform/sun4u/boot_archive

before doing the update-archive. If there is still not enough space, try to move some other stuff temporarily away, e.g. /core, /etc/mail/cf ...

Good luck, jel.
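A quick way to see whether the "not enough space" scenario applies before re-running update-archive is to check free space on the target slice. A minimal sketch; the 150 MB threshold is an illustrative guess for a boot_archive, not an official number:

```shell
#!/bin/sh
# Rough pre-check before `bootadm update-archive -R /a`:
# is there room for a fresh boot_archive on the slice?
need_kb=150000   # illustrative threshold, ~150 MB
avail_kb=$(df -k / | awk 'NR==2 { print $4 }')

if [ "$avail_kb" -lt "$need_kb" ]; then
    echo "low space - move /core, /etc/mail/cf etc. aside first"
else
    echo "enough space for update-archive"
fi
```

Point the `df -k` at the mounted slice (e.g. /a) when run from the jumpstart/CD environment.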
[zfs-discuss] bug access
Hi,

> This CR has been marked as incomplete by User 1-UM-1502 for the reason Need More Info. Please update the CR providing the information requested in the Evaluation and/or Comments field.

hmmm - wondering how to find out what 'more info' means and how to provide this info. There is no URL in the bug response, and it even seems to be impossible to obtain the current state via http://bugs.opensolaris.org/view_bug.do?bug_id= (the Bug Database Search is even more bogus - in general it doesn't find any bugs by ID).

Should I continue to ignore these responses and mark those bugs internally as 'gets probably never fixed'?

Regards, jel.
[zfs-discuss] v440 - root mirror lost after LU
Hmmm, just upgraded some servers to U7. Unfortunately one server's primary disk died during the upgrade, so luactivate was not able to activate the s10u7 BE ("Unable to determine the configuration ..."). Since the rpool is a 2-way mirror, boot-device=/p...@1f,70/s...@2/d...@1,0:a was simply set to /p...@1f,70/s...@2/d...@0,0:a, and it was checked whether the machine still reboots unattended. As expected - no problem.

In the evening the faulty disk was replaced and the mirror resilvered via 'zpool replace rpool c1t1d0s0' (see below). Since there was no error and everything was stated to be healthy, the s10u7 BE was luactivated (no error message here either) and 'init 6' issued. Unfortunately, now the server was gone and no known recipe helped to revive it (I guess LU damaged the zpool.cache?).

Any hints how to get the rpool back?

Regards, jel.

What has been tried 'til now:

{3} ok boot
Boot device: /p...@1f,70/s...@2/d...@1,0:a File and args:
Bad magic number in disk label
Can't open disk label package
Can't open boot device
{3} ok
{3} ok boot /p...@1f,70/s...@2/d...@0,0:a
Boot device: /p...@1f,70/s...@2/d...@0,0:a File and args:
SunOS Release 5.10 Version Generic_13-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
NOTICE: spa_import_rootpool: error 22
Cannot mount root on /p...@1f,70/s...@2/d...@0,0:a fstype zfs
panic[cpu3]/thread=180e000: vfs_mountroot: cannot mount root
0180b950 genunix:vfs_mountroot+358 (800, 200, 0, 1875c00, 189f800, 18ca000)
  %l0-3: 010ba000 010ba208 0187bba8 011e8400
  %l4-7: 011e8400 018cc400 0600 0200
0180ba10 genunix:main+a0 (1815178, 180c000, 18397b0, 18c6800, 181b578, 1815000)
  %l0-3: 01015400 0001 70002000
  %l4-7: 0183ec00 0003 0180c000
skipping system dump - no dump device configured
rebooting...

SC Alert: Host System has Reset

{3} ok boot net -s
Boot device: /p...@1c,60/netw...@2 File and args: -s
1000 Mbps FDX Link up
Timeout waiting for ARP/RARP packet
3a000 1000 Mbps FDX Link up
SunOS Release 5.10 Version Generic_137137-09 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Booting to milestone milestone/single-user:default.
Configuring devices.
Using RPC Bootparams for network configuration information.
Attempting to configure interface ce1... Skipped interface ce1
Attempting to configure interface ce0... Configured interface ce0
Requesting System Maintenance Mode
SINGLE USER MODE

# mount -F zfs /dev/dsk/c1t1d0s0 /mnt
cannot open '/dev/dsk/c1t1d0s0': invalid dataset name
# mount -F zfs /dev/dsk/c1t0d0s0 /mnt
cannot open '/dev/dsk/c1t0d0s0': invalid dataset name
# zpool import
  pool: pool1
    id: 5088500955966129017
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
        pool1       ONLINE
          mirror    ONLINE
            c1t2d0  ONLINE
            c1t3d0  ONLINE
  pool: rpool
    id: 5910200402071733373
 state: UNAVAIL
action: The pool cannot be imported due to damaged devices or data.
config:
        rpool         UNAVAIL  insufficient replicas
          mirror      UNAVAIL  corrupted data
            c1t1d0s0  ONLINE
            c1t0d0s0  ONLINE
# dd if=/dev/rdsk/c1t1d0s0 of=/tmp/bb bs=1b iseek=1 count=15
15+0 records in
15+0 records out
# dd if=/dev/rdsk/c1t1d0s0 of=/tmp/bb bs=1b iseek=1024 oseek=15 count=16
16+0 records in
16+0 records out
# cmp /tmp/bb /usr/platform/`uname -i`/lib/fs/zfs/bootblk
# echo $?
0
# dd if=/dev/rdsk/c1t0d0s0 of=/tmp/ab bs=1b iseek=1 count=15
15+0 records in
15+0 records out
# dd if=/dev/rdsk/c1t0d0s0 of=/tmp/ab bs=1b iseek=1024 oseek=15 count=16
16+0 records in
16+0 records out
# cmp /tmp/ab /usr/platform/`uname -i`/lib/fs/zfs/bootblk
# echo $?
0

# pre-history:
admin.tpol ~ # zpool status -xv
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h28m, 98.10% done, 0h0m to go
config:
        NAME               STATE     READ WRITE CKSUM
        rpool              DEGRADED     0     0     0
          mirror           DEGRADED     0     0     0
            replacing      DEGRADED     0     0     0
              c1t1d0s0/old FAULTED      0     0     0  corrupted data
              c1t1d0s0     ONLINE       0     0     0
            c1t0d0s0       ONLINE       0     0     0
errors: No known data errors
admin.tpol ~ # zpool status -xv
all pools are healthy
admin.tpol ~ # zpool status
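The dd/cmp sequence above stitches the two halves of the on-disk bootblk (sectors 1-15 and 1024-1039 of the slice) back together and compares the result against the reference bootblk. A self-contained replay of the same seek arithmetic on throwaway files (using the GNU dd spelling skip=/seek= where Solaris dd says iseek=/oseek=):

```shell
#!/bin/sh
# Rebuild and verify a two-part "bootblk" the way the post's dd/cmp check does.
ref=$(mktemp); disk=$(mktemp); got=$(mktemp)

dd if=/dev/urandom of="$ref" bs=512 count=31 2>/dev/null    # fake bootblk: 15+16 blocks
dd if=/dev/zero of="$disk" bs=512 count=1040 2>/dev/null    # fake disk slice

# "install" it: part 1 at block 1, part 2 at block 1024
dd if="$ref" of="$disk" bs=512 count=15 seek=1 conv=notrunc 2>/dev/null
dd if="$ref" of="$disk" bs=512 skip=15 seek=1024 conv=notrunc 2>/dev/null

# the check itself: reassemble both parts and compare with the reference
dd if="$disk" of="$got" bs=512 skip=1 count=15 2>/dev/null
dd if="$disk" of="$got" bs=512 skip=1024 seek=15 count=16 conv=notrunc 2>/dev/null
cmp "$got" "$ref" && echo "bootblk intact"

rm -f "$ref" "$disk" "$got"
```

An exit status of 0 from cmp (as in the post) means the installed boot blocks match the reference copy byte for byte, so the boot failure was not a damaged bootblk.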
Re: [zfs-discuss] v440 - root mirror lost after LU
On Tue, Jun 16, 2009 at 05:58:00PM -0600, Lori Alt wrote:

First: Thanx a lot, Lori, for the quick help!!!

> On 06/16/09 16:32, Jens Elkner wrote:
> > At the evening the faulty disk was replaced and the mirror resilvered via 'zpool replace rpool c1t1d0s0' (see below). Since there was no error and everything stated to be healthy, the s10u7 BE was luactivated (no error message here as well) and 'init 6'. Unfortunately, now the server was gone and no known recipe helped to revive it (I guess, LU damaged the zpool.cache?) :
> The other suggestion I have is to remove the /p...@1f,70/s...@2/d...@1,0:a

Indeed, physically removing the new c1t1d0 was the key to solving the problem (a netboot of s10u7 gave the same import error as s10u6). Just in case somebody is interested in the details:

Since 'cfgadm -c unconfigure c1::dsk/c1t1d0' didn't work (no blue LED), the machine was 'poweroff'ed, the disk removed and 'poweron'ed. Really strange: it came back with the s10u6 BE instead of the s10u7 BE and took quite a while until it gave up trying to get HDD1:

WARNING: /p...@1d,70/s...@2,1 (mpt3): Disconnected command timeout for Target 1
Hardware watchdog enabled
...

Now cfgadm did not show c1::dsk/c1t1d0. So I re-inserted HDD1, which was properly logged ('SC Alert: DISK @ HDD1 has been inserted.'). Unfortunately cfgadm didn't show it, and 'cfgadm -c configure c1::dsk/c1t1d0' answered: cfgadm: Attachment point not found. However, 'format -e':

AVAILABLE DISK SELECTIONS:
0. c1t0d0 SUN36G cyl 24620 alt 2 hd 27 sec 107 /p...@1f,70/s...@2/s...@0,0
1. c1t1d0 HITACHI-DK32EJ36NSUN36G-PQ08-33.92GB /p...@1f,70/s...@2/s...@1,0
2. c1t2d0 SEAGATE-ST3146707LC-0005-136.73GB /p...@1f,70/s...@2/s...@2,0
3. c1t3d0 SEAGATE-ST3146707LC-0005-136.73GB /p...@1f,70/s...@2/s...@3,0
Specify disk (enter its number): 1
selecting c1t1d0
[disk formatted]

and now the machine seemed to be stalled. Neither ssh nor login via the console was possible.
So I 'poweroff'ed again, removed the disk, 'poweron'ed, and this time the first thing done was 'zpool detach rpool c1t1d0' and scrubbing the pool (completed after 0h22m with 0 errors). After that, a 'cfgadm -x insert_device c1' got c1::dsk/c1t1d0 back, and 'format -e' worked as expected (1p showed an EFI partition table)! So the rest was trivial: SMI label, repartition, label, reattach c1t1d0 to the rpool (resilver took about 31m), installboot, verify boot 'ok' into s10u6, luactivate s10u7 and finally verify boot 'ok' again.

Once again, thanx a lot Lori for your quick help!!!

Regards, jel.
Re: [zfs-discuss] zfs reliability under xen
On Wed, May 20, 2009 at 12:06:49AM +0300, Ahmed Kamal wrote:
> Is anyone even using ZFS under Xen in production in some form. If so, what's your impression of reliability ?

Hmm, somebody needs to out himself. Short answer: yes.

Details: Well, I've installed an Intel server (2x QuadCore E5440, 2.8 GHz) with 4x 300GB SATA (2x 2-way mirrors) at a small company (one of the ebay-rated Top10 sellers in Germany) in October 2008, running snv_b98 as dom0 ;-). domU1 is a win2003 32bit small business server running an MS SQL server; domU2 is a win2008 32bit standard server, which is used as a terminal server (<=10 users), basically to run several Sage[.de] products. Both domUs are using the latest PV driver, which - depending on the recordsize to write - reaches about the same xfer rates as in dom0 (see http://iws.cs.uni-magdeburg.de/~elkner/xVM/ for more info: the benchmarks were made on an X4600M, however the Intel server produces about the same numbers).

pool1/win2003sbs.dsk volsize 48G -
pool1/win2003sbs.dsk volblocksize 8K -
pool1/win2008ss.dsk volsize 24G -
pool1/win2008ss.dsk volblocksize 8K -

It has been in production since ~ February 2009 and is as stable as expected. Sometimes, when the win2003 domU is idle for too long, it doesn't wake up anymore. No problem: I wrote a simple CGI script so that the users have a simple UI to check the state of the domUs (basically a ping), to pull the cable (virsh destroy) as well as start/suspend/resume them. Usually they use this bookmark ~2-3 times a week when they start working. Users are quite happy with it, since clicking on a link is not much more work than switching on their own PC - so not annoying/painful at all.

Initially I gave win2003 2x 16 GB partitions (C:, D: for SQL data) and win2008 a single 16 GB partition. When the Sage products were installed, it turned out that the 16 GB was almost filled, so the volsize of pool1/win2008ss.dsk was increased to 24 GB and the C: partition dynamically extended in win2008 - no problem.
Last month the SQL server filled up its D: partition (16 GB), so it started to 'reboot' several times a day. However, raising the pool1/win2003sbs.dsk volsize to 48 GB (i.e. D: to 32 GB) solved that problem. The only little hurdle here was that one could not extend the partition via the win partition manager's context menu: one had to use the command line tool ...

One thing which is a little bit annoying is the ZFS send command. It takes pretty long, but since it usually runs at night, it is not a real problem.

BTW: The server was installed from remote, misusing a company-internal linux server (jumpstart of course). Since the serial port was not connected, I had to ask an on-site user to initiate the PXE boot, but this was not a problem either.

So the summary: all people (incl. admins) are happy. However, if you need to decide whether to use Xen: test your setup before going into production, and ask your boss whether he can live with "innovative ..." solutions ;-)

Regards, jel.
Re: [zfs-discuss] LU snv_93 - snv_101a (ZFS - ZFS )
On Thu, May 21, 2009 at 02:51:53PM -0700, Nandini Mocherla wrote:
> Here is the short story about my Live Upgrade problem. This is not ...
> # mount -F zfs /dev/dsk/c1t2d0s0 /mnt
> cannot open '/dev/dsk/c1t2d0s0': invalid dataset name

Have seen this when LUing from b110 to b114 on a V240 (well-known Fcode error). The mount command didn't work either. The fix was to boot back into b110 (see 'boot -L'), call luactivate once more and 'init 6'.

Regards, jel.
[zfs-discuss] swat java source?
Hi,

does anybody know whether it is possible to get the java source for swat (where/how)?

Thanx, jel.
Re: [zfs-discuss] How recoverable is an 'unrecoverable error'?
On Wed, Apr 15, 2009 at 10:32:13PM +0800, Uwe Dippel wrote:
> status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
> ...
> errors: No known data errors
> Now I wonder where that error came from. It was just a single checksum

Hmmm, about 2 weeks ago I also had a curious thing with a StorEdge 3510 (2x 2Gbps FC MP, 1 controller, 2x 6 HDDs mirrored and exported as a single device, no ZIL etc. tricks) connected to an X4600: since grill party time has started, the 3510 decided at a room temp of 33°C to go offline and take part in the party ;-). The result was that during the offline time everything blocked (i.e. didn't get a timeout or error) which tried to access a ZFS on that pool - from this POV more or less expected. After the 3510 came back, a 'zpool status ..' showed something like this:

NAME                            STATE    READ  WRITE CKSUM
pool2                           FAULTED  289K  4.03M     0
  c4t600C0FF0099C790E0144EC00d0 FAULTED  289K  4.03M     0  too many errors
errors: Permanent errors have been detected in the following files:
pool2/home/stud/inf/foobar:0x0

Still everything was blocking. After a 'zpool clear', all ZFS (~2300 on that pool) except the listed one were accessible, but the status message stayed unchanged. Curious - I thought that blocking/waiting for the device to come back and the ZFS transaction stuff is actually made for a situation like this, aka re-committing un-ACKed actions ... Anyway, finally scrubbing the pool brought it back to normal ONLINE state without any errors. To be sure, I compared the ZFS in question with the backup from some hours before - no difference. So, same question as in the subject.

BTW: Some days later we had an even bigger grill party (~38°C) - this time the X4xxx machines in this room decided to go offline and take part as well (the v4xx's kept running ;-)). So first the 3510, and some time later the X4600.
This time, after going back online, the pool was in DEGRADED state and had some more errors like the one above, plus:

metadata:0x103
metadata:0x4007
...

Clearing and scrubbing it again brought it back to normal ONLINE state without any errors. Spot checks on the files noted with errors showed no damage ... Everything nice (wrt. data loss), but curious ...

Regards, jel.
Re: [zfs-discuss] Error ZFS-8000-9P
On Fri, Apr 03, 2009 at 10:41:40AM -0700, Joe S wrote:
> Today, I noticed this: ... According to http://www.sun.com/msg/ZFS-8000-9P: The Message ID: ZFS-8000-9P indicates a device has exceeded the acceptable limit of errors allowed by the system. See document 203768 for additional information. ...

I had the same on a thumper with S10u6 1-2 months ago. Since the logs did not show any disk error/warning for the last 6 months, I just cleared the pool, finally scrubbed it, and put the 'tmp hotspare' that had been pulled in back into the hot spare pool. No errors or warnings since then for that disk, so it was obviously a false/brain-damaged alarm ...

Regards, jel.
[zfs-discuss] strange 'too many errors' msg
Hi,

just found on an X4500 with S10u6:

fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-GH, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Feb 11 16:03:26 CET 2009
PLATFORM: Sun Fire X4500, CSN: 00:14:4F:20:E0:2C , HOSTNAME: peng
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 74e6f0ec-b1e7-e49b-8d71-dc1c9b68ad2b
DESC: The number of checksum errors associated with a ZFS device exceeded acceptable levels. Refer to http://sun.com/msg/ZFS-8000-GH for more information.
AUTO-RESPONSE: The device has been marked as degraded. An attempt will be made to activate a hot spare if available.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

zpool status -x
...
    mirror      DEGRADED     0     0     0
      spare     DEGRADED     0     0     0
        c6t6d0  DEGRADED     0     0     0  too many errors
        c4t0d0  ONLINE       0     0     0
      c7t6d0    ONLINE       0     0     0
...
    spares
      c4t0d0    INUSE     currently in use
      c4t4d0    AVAIL

The strange thing is that for more than 3 months there was no single error logged for any drive. IIRC, before u4 I occasionally saw a bad checksum error message, but that was obviously the result of the well-known race condition in the marvell driver when heavy writes took place. So I tend to interpret this as a false alarm and think about a 'zpool ... clear c6t6d0'. What do you think? Is this a good idea?

Regards, jel.

BTW: the zpool status -x message refers to http://www.sun.com/msg/ZFS-8000-9P, the event to http://sun.com/msg/ZFS-8000-GH - a little bit inconsistent, I think.
Re: [zfs-discuss] s10u6 ludelete issues with zones on zfs root
On Fri, Jan 16, 2009 at 02:08:09PM -0500, amy.r...@tufts.edu wrote: I've installed an s10u6 machine with no UFS partitions at all. I've created a dataset for zones and one for a zone named default. I then do an lucreate and luactivate and a subsequent boot off the new BE. All of that appears to go just fine (though I've found that I MUST call the zone dataset zoneds for some reason, or it will rename it to that for me). When I try to delete the old BE, it fails with the following message: It's a LU bug. Have a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html The following patch fixes it and provides an opportunity to speed up lucreate/lumount/luactivate and friends dramatically on a machine with lots of LU-unrelated filesystems (e.g. user homes). http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch or http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.11.patch Have fun, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Dec 02, 2008 at 12:22:49PM -0800, Vincent Fox wrote: Reviving this thread. We have a Solaris 10u4 system recently patched with 137137-09. Unfortunately the patch was applied from multi-user mode, I wonder if this may have been the original poster's problem as well? Anyhow we are now stuck No - in my case it was a 'not enough space' on / problem, not the multi-user mode ;-). Regards, jel.
Re: [zfs-discuss] `zfs list` doesn't show my snapshot
On Tue, Nov 25, 2008 at 06:34:47PM -0500, Richard Morris - Sun Microsystems - Burlington United States wrote: option to list all datasets. So 6734907 added -t all which produces the same output as -t filesystem,volume,snapshot. 1. http://bugs.opensolaris.org/view_bug.do?bug_id=6734907 Hmmm - very strange: when I run 'zfs list -t all' on b101 it says: invalid type 'all' ... But the bug report says: Fixed In snv_99 Release Fixed solaris_nevada(snv_99) So, what do those fields really mean? Regards, jel.
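A workaround sketch for builds in this situation: per the CR text quoted above, `-t all` is only shorthand for the spelled-out type list, so the long form should give the same output on builds that still reject 'all' (echoed only; needs a live system):

```shell
#!/bin/sh
# 'zfs list -t all' was added as shorthand for the full type list
# (CR 6734907). On builds that reject 'all' (like b101 above), the
# long form below should be equivalent. Echoed only.
TYPES=filesystem,volume,snapshot
echo "zfs list -t $TYPES"
echo "zfs list -t all    # equivalent on builds that have the fix"
```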
Re: [zfs-discuss] `zfs list` doesn't show my snapshot
On Fri, Nov 21, 2008 at 03:42:17PM -0800, David Pacheco wrote: Pawel Tecza wrote: But I still don't understand why `zfs list` doesn't display snapshots by default. I saw it on the Net many times in examples of zfs usage. This was PSARC/2008/469 - excluding snapshot info from 'zfs list' http://opensolaris.org/os/community/on/flag-days/pages/2008091003/ The incomplete one - where is the '-t all' option? It's really annoying, error-prone and time-consuming to type stories on the command line ... Does anybody remember the keep-it-small-and-simple thing? Regards, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Nov 18, 2008 at 07:44:56PM -0800, Ed Clark wrote: Hi Ed, messages from the underlying pkging commands are captured in the /var/sadm/patch/PID/log file messages from patchadd itself and patch level scripts (prepatch, postpatch, etc) go to stdout/stderr these are two distinct sets of messages -- not really optimal, just the way patchadd has always been Yepp - thanks for making that clear. So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (incl. creating the appropriate links) was sufficient to get it to work. nice trick, but unfortunately it won't do -- officially you must _never_ make such changes that alter the type of system files ; if you do, the changes are at your own risk and completely unsupported Yes, I was considering it for a temporary change only. However, pkgtools are usually robust enough (as long as the dir is not empty) to handle such relocations properly - that's why I really like pkgtools (giving me the freedom I need ;-)). the basic reason for this is that patching can not be guaranteed to behave in a deterministic manner when it encounters such changes -- this does cause real problems too, ie. a case where changing sendmail.cf to a symlink caused a kernel patch to only half apply, leading to long term outages Oops - that's a big bummer [OT: and probably the result of the bad packaging strategy of Solaris (i.e. not software-oriented, aka merging different sw into one sol package ...)]. Good to know that this has already happened... removing the old corrupt boot archive is a simple and safe way to free up some space, best part is on reboot the system will automatically rebuild it Hmmm - maybe I'm wrong, but IMHO if there is not enough space for a new boot_archive, bootadm should not corrupt anything but leave the old one in place - I would guess, in 95% of all cases one gets away with it, since very often updates are not really required ... hmm ...
something of a double-edged sword, at least the way it works currently we know with certainty when there was a problem building the archive and can go about rectifying it ; the problem with keeping the old boot archive is that the system may have the appearance of booting and possibly even running ok, but there is absolutely no guarantee of nominal operation, could be very confusing Yes, I understand your point of view. However, I didn't mean to silently ignore the 'unable to update boot archive' condition, but to give the user a simple way to fix the problem. So I would prefer to keep the old archive as long as it cannot be updated, but issue big warnings on reboot/activation so one gets informed that a fix is needed. At least in my case the system would have been offline for at most 30min, but because of the bug it was offline for several days, and without your help probably several weeks/months (i.e. my experience wrt. German Sun support) ... Anyway, thanks a lot again, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Sun, Nov 16, 2008 at 09:27:32AM -0800, Ed Clark wrote: Hi Ed, 1. a copy of the 137137-09 patchadd log if you have http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ thanks for info - what you provided here is the patch pkg installation log, Yes, actually the only one I have/could find. what i was actually after was the patchadd log (ie. the messages output to terminal) Up to now I thought that stderr and stdout are redirected from patchadd to the patchlog, but I never checked in detail, since the log always had the info I needed ... -- both the patchadd log and the console log on reboot should have shown errors which would have provided hints as to what the problem was Haven't seen anything unusual. But maybe I've overlooked it :( now the df/prtvtoc output was most useful : 137137-09 delivers sparc newboot, and the problem here appears to be that a root fs slice of 256M falls well below the minimum size required for sparc newboot to operate nominally -- due to the lack of space in /, i suspect that 137137-09 postpatch failed to copy the ~180MB failsafe archive (/platform/sun4u/failsafe) to your system, and that the ~80M boot archive (/platform/sun4u/boot_archive) was not created correctly on the reboot after applying 137137-09 the 'seek failed' error message you see on boot is coming from the ufs bootblk fcode, which i suspect is due to not being able to load the corrupt boot_archive Yes - that makes sense. you should be able to get your system to boot by doing the following 1. net/CD/DVD boot the system using a recent update release, u5/u6 should work, not sure about u4 or earlier 2. mount the root fs slice, cd to root-fs-mount-point 3. ls -l platform/sun4u 4. rm -f platform/sun4u/boot_archive 5.
sbin/bootadm -a update_all Yepp - now I can see the problem: Creating boot_archive for /a updating /a/platform/sun4u/boot_archive 15+0 records in 15+0 records out cat: write error: No space left on device bootadm: write to file failed: /a/boot/solaris/filestat.ramdisk.tmp: No space left on device So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (incl. creating the appropriate links) was sufficient to get it to work. 6. ls -l platform/sun4u total 136770 -rw-r--r-- 1 root root 68716544 Nov 18 03:22 boot_archive -rw-r--r-- 1 root sys 71808 Oct 3 23:28 bootlst -rw-r--r-- 1 root sys 79976 Oct 3 23:34 cprboot drwxr-xr-x 11 root sys 512 Mar 19 2007 kernel drwxr-xr-x 4 root bin 512 Nov 12 22:17 lib drwxr-xr-x 2 root bin 512 Mar 19 2007 sbin -rw-r--r-- 1 root sys 1084048 Oct 3 23:28 wanboot Filesystem 1024-blocks Used Available Capacity Mounted on /dev/dsk/c0t0d0s0 245947 205343 16010 93% / boot_archive corruption will be a recurrent problem on your configuration, every time the system determines that boot_archive needs to be rebuilt on reboot -- a very inelegant workaround would be to 'rm -f /platform/sun4u/boot_archive' every time before rebooting the system Hmmm - maybe I'm wrong, but IMHO if there is not enough space for a new boot_archive, bootadm should not corrupt anything but leave the old one in place - I would guess, in 95% of all cases one gets away with it, since very often updates are not really required ... better option would be to reinstall the system, choosing a disk layout adequate for newboot Well, the 2nd exercise is to test zfs boot (all systems have at least a 2nd HDD). If this works, just converting to zfs is probably the better option ... Anyway, thanks a lot for your help! Regards, jel.
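The recovery steps from this exchange, collected into one sketch (the mount point /a is the one used above; the commands are echoed rather than run, since they must be executed from a net/CD/DVD failsafe boot):

```shell
#!/bin/sh
# Recovery sketch for a corrupt boot_archive, following the steps above.
# Echoed only: these must be run from a net/CD/DVD boot environment with
# the root fs slice of the broken system mounted at $ROOT.
ROOT=/a                          # where the root fs slice is mounted
echo "ls -l $ROOT/platform/sun4u"                    # inspect the archive first
echo "rm -f $ROOT/platform/sun4u/boot_archive"       # drop the corrupt archive
echo "$ROOT/sbin/bootadm -a update_all"              # rebuild it, as quoted above
```

Note the rebuild will fail again with "No space left on device" unless space is freed on / first, which is why the thread ends up recommending a larger root slice (or ZFS root) as the real fix.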
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 04:54:57PM -0800, Gerry Haskins wrote: Jens, http://www.sun.com/bigadmin/patches/firmware/release_history.jsp on the Big Admin Patching center, http://www.sun.com/bigadmin/patches/ list firmware revisions. Thanks a lot. Dug around there and found that 121683-06 aka OBP 4.22.33 seems to be the most recent one for the V240. So in theory it should be ok in my case. If it's the same as a V490, then I think the current firmware version is 121689-04, http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&queryKey5=121689&toDocument=yes OK - so the OBPs are all the latest ones on my machines. Unfortunately I don't have a 2nd V490 to test whether the problem occurs there as well - so I'll better postpone its upgrade :( Anyway, thanks a lot Gerry, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Fri, Nov 14, 2008 at 01:07:29PM -0800, Ed Clark wrote: hi, is the system still in the same state initially reported ? Yes. ie. you have not manually run any commands (ie. installboot) that would have altered the slice containing the root fs where 137137-09 was applied could you please provide the following 1. a copy of the 137137-09 patchadd log if you have one available cp it to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ Can't spot anything unusual. 2. an indication of anything particular about the system configuration, ie. mirrored root No mirrors/raid: # format Searching for disks...done AVAILABLE DISK SELECTIONS: 0. c0t0d0 FUJITSU-MAN3367MC-0109 cyl 24343 alt 2 hd 4 sec 737 /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 1. c0t1d0 SEAGATE-ST336737LC-0102 cyl 29773 alt 2 hd 4 sec 606 /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 2. c0t2d0 FUJITSU-MAT3073N SUN72G-0602-68.37GB /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 3. c0t3d0 FUJITSU-MAT3073N SUN72G-0602-68.37GB /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 3. output from the following commands run against root fs where 137137-09 was applied ls -l usr/platform/sun4u/lib/fs/*/bootblk ls -l platform/sun4u/lib/fs/*/bootblk sum usr/platform/sun4u/lib/fs/*/bootblk sum platform/sun4u/lib/fs/*/bootblk dd if=/dev/rdsk/rootdsk of=/tmp/bb bs=1b iseek=1 count=15 cmp /tmp/bb usr/platform/sun4u/lib/fs/ufs/bootblk cmp /tmp/bb platform/sun4u/lib/fs/ufs/bootblk prtvtoc /dev/rdsk/rootdsk also cp to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ Seems to be ok, too. Regards, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 10:50:02AM -0800, Enda wrote: What hardware are you on, and what firmware are you at. Issue is coming from firmware. Sun Fire V240 with OpenBoot 4.22.23 Tried to find out, whether there is an OBP patch available, but haven't found anything wrt. V240, V440 and V490 :( Regards, jel.
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Hi, in preparation to try zfs boot on sparc I installed all recent patches incl. feature patches coming from s10s_u3wos_10 and after reboot finally 137137-09 (still having everything on UFS). Now it doesn't boot anymore: ### Sun Fire V240, No Keyboard Copyright 2006 Sun Microsystems, Inc. All rights reserved. OpenBoot 4.22.23, 2048 MB memory installed, Serial #63729301. Ethernet address 0:3:ba:cc:6e:95, Host ID: 83cc6e95. Rebooting with command: boot Boot device: /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a File and args: | seek failed Warning: Fcode sequence resulted in a net stack depth change of 1 Evaluating: Evaluating: The file just loaded does not appear to be executable. |1} ok | ### fsck /dev/rdsk/c0t0d0s0 doesn't find any problems. So I mounted this slice on /tmp/a, and # find /tmp/a/boot /tmp/a/boot /tmp/a/boot/solaris /tmp/a/boot/solaris/bin /tmp/a/boot/solaris/bin/extract_boot_filelist /tmp/a/boot/solaris/bin/create_ramdisk /tmp/a/boot/solaris/bin/root_archive /tmp/a/boot/solaris/filelist.ramdisk /tmp/a/boot/solaris/filelist.safe /tmp/a/boot/solaris/filestat.ramdisk # cat /tmp/a/boot/solaris/filelist.ramdisk etc/cluster/nodeid etc/dacf.conf etc/mach kernel platform It looks different than on x86 (no kernels), so is it possible that the patch didn't install all required files, or is it simply broken? Or did somebody forget to mention that an OBP update is required before installing this patch? Any hints? Regards, jel.
Re: [zfs-discuss] FYI - proposing storage pm project
On Mon, Nov 03, 2008 at 02:54:10PM -0800, Yuan Chu wrote: Hi, a disk may take seconds or even tens of seconds to come online if it needs to be powered up and spin up. Yes - I really hate this on my U40 and tried to disable PM for the HDDs completely. However, I haven't found a way to do this (I thought /etc/power.conf was the right place, but either it doesn't work as explained or it is not the right place). The HDDs are HITACHI HDS7225S Revision: A9CA Any hints on how to switch off PM for this HDD? Regards, jel.
Re: [zfs-discuss] [Fwd: Re: ZSF Solaris]
On Tue, Oct 07, 2008 at 11:35:47AM +0530, Pramod Batni wrote: The reason why the (implicit) truncation could be taking long might be due to 6723423 [6]UFS slow following large file deletion with fix for 6513858 installed To overcome this problem for S10, the offending patch 127866-03 can be removed. It is not yet fixed in snv. A fix is being developed, not sure which build it would be available in. OK - thanx for your answer. Since the fixes in 03-05 seem to be important, I'll try to initiate an escalation of the case - would that help to get the fix in a little bit earlier? Regards, jel.
Re: [zfs-discuss] zpool imports are slow when importing multiple storage pools
On Mon, Oct 06, 2008 at 05:08:13PM -0700, Richard Elling wrote: Scott Williamson wrote: Speaking of this, is there a list anywhere that details what we can expect to see for (zfs) updates in S10U6? The official release name is Solaris 10 10/08 Ooops - no beta this time? Regards, jel.
Re: [zfs-discuss] [Fwd: Re: ZSF Solaris]
On Mon, Oct 06, 2008 at 08:01:39PM +0530, Pramod Batni wrote: On Tue, Sep 30, 2008 at 09:44:21PM -0500, Al Hopper wrote: This behavior is common to tmpfs, UFS and I tested it on early ZFS releases. I have no idea why - I have not made the time to figure it out. What I have observed is that all operations on your (victim) test directory will max out (100% utilization) one CPU or one CPU core - and all directory operations become single-threaded and limited by the performance of one CPU (or core). And sometimes it's just a little bug: E.g. with a recent version of Solaris (i.e. >= snv_95 || >= S10U5) on UFS: SunOS graf 5.10 Generic_137112-07 i86pc i386 i86pc (X4600, S10U5) = admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.78s 0:29.42 33.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 293.37s 5:13.67 93.5% SunOS q 5.11 snv_98 i86pc i386 i86pc (U40, S11b98) = elkner.q /var/tmp time mkfile 2g xx 0.05u 3.63s 0:42.91 8.5% elkner.q /var/tmp time mkfile 2g xx 0.04u 315.15s 5:54.12 89.0% The reason why the (implicit) truncation could be taking so long might be due to 6723423 [6]UFS slow following large file deletion with fix for 6513858 installed To overcome this problem for S10, the offending patch 127866-03 can be removed. Yes - removing 127867-05 (x86, i.e. going back to 127867-02) resolved the problem. On sparc, removing 127866-05 brought me back to 127866-01, which didn't seem to solve the problem (maybe because I didn't init 6 before). However, installing 127866-02 and init 6 fixed it on sparc as well. Any hints in which snv release it is fixed? Thanx a lot, jel.
Re: [zfs-discuss] ZSF Solaris
On Tue, Sep 30, 2008 at 09:44:21PM -0500, Al Hopper wrote: This behavior is common to tmpfs, UFS and I tested it on early ZFS releases. I have no idea why - I have not made the time to figure it out. What I have observed is that all operations on your (victim) test directory will max out (100% utilization) one CPU or one CPU core - and all directory operations become single-threaded and limited by the performance of one CPU (or core). And sometimes it's just a little bug: E.g. with a recent version of Solaris (i.e. >= snv_95 || >= S10U5) on UFS: SunOS graf 5.10 Generic_137112-07 i86pc i386 i86pc (X4600, S10U5) = admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.78s 0:29.42 33.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 293.37s 5:13.67 93.5% admin.graf /var/tmp rm xx admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.92s 0:31.75 31.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 305.15s 5:28.67 92.8% admin.graf /var/tmp time dd if=/dev/zero of=xx bs=1k count=2048 2048+0 records in 2048+0 records out 0.00u 298.40s 4:58.46 99.9% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 394.06s 6:52.79 95.4% SunOS kaiser 5.10 Generic_137111-07 sun4u sparc SUNW,Sun-Fire-V440 (S10, U5) = admin.kaiser /var/tmp time mkfile 1g xx 0.14u 5.24s 0:26.72 20.1% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 64.23s 1:25.67 75.1% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 68.36s 1:30.12 75.9% admin.kaiser /var/tmp rm xx admin.kaiser /var/tmp time mkfile 1g xx 0.14u 5.79s 0:29.93 19.8% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 66.37s 1:28.06 75.5% SunOS q 5.11 snv_98 i86pc i386 i86pc (U40, S11b98) = elkner.q /var/tmp time mkfile 2g xx 0.05u 3.63s 0:42.91 8.5% elkner.q /var/tmp time mkfile 2g xx 0.04u 315.15s 5:54.12 89.0% SunOS dax 5.11 snv_79a i86pc i386 i86pc (U40, S11b79) = elkner.dax /var/tmp time mkfile 2g xx 0.05u 3.09s 0:43.09 7.2% elkner.dax /var/tmp time mkfile 2g xx 0.05u 4.95s 0:43.62 11.4% Regards, jel.
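The timings above boil down to one pattern: the first mkfile over a fresh path is fast, while the second mkfile over an existing file (which triggers an implicit truncation) is an order of magnitude slower on affected patch levels. A sketch of the reproduction (mkfile is Solaris-only, so the steps are only echoed here; the size is illustrative):

```shell
#!/bin/sh
# Reproduction sketch for the rewrite slowdown (CR 6723423): time the
# second mkfile over an existing file and compare against the first.
# mkfile is Solaris-only, so the steps are echoed, not executed.
F=/var/tmp/xx
for run in 1 2; do
  echo "run $run: time sh -c 'mkfile 2g $F ; sync'"
done
echo "rm $F   # cleanup; the next run 1 will be fast again"
```

On an affected system, run 2 should show the multi-minute sys time seen in the transcripts; on a fixed build, both runs should take roughly the same wall time.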
Re: [zfs-discuss] Greenbytes/Cypress
On Tue, Sep 23, 2008 at 01:04:34PM -0500, Bob Friesenhahn wrote: On Tue, 23 Sep 2008, Eric Schrock wrote: http://www.opensolaris.org/jive/thread.jspa?threadID=73740&tstart=0 I must apologize for annoying everyone. When Richard Elling posted the GreenBytes link without saying what it was I completely ignored it. I assumed that it would be Windows-centric content that I can not view since of course I am a dedicated Solaris user. I see that someone else mentioned that the content does not work for Solaris users. As a result I ignored the entire discussion as being about some silly animation of gumballs. Don't apologize - it's not your fault! BTW: I have exactly the same problem/assumption ... Have fun, jel.
Re: [zfs-discuss] ZFS hangs/freezes after disk failure,
On Mon, Aug 25, 2008 at 08:17:55PM +1200, Ian Collins wrote: John Sonnenschein wrote: Look, yanking the drives like that can seriously damage the drives or your motherboard. Solaris doesn't let you do it ... Haven't seen an android/universal soldier shipping with Solaris ... ;-) and assumes that something's gone seriously wrong if you try it. That Linux ignores the behavior and lets you do it sounds more like a bug in linux than anything else. Not sure whether everything that can't be understood is necessarily a bug - maybe it is more forgiving and tries its best to solve the problem without taking you out of business (see below), even if it requires some hacks not in line with specifications ... One point that's been overlooked in all the chest thumping - PCs vibrate and cables fall out. I had this happen with a SCSI connector. Luckily Yes - and a colleague told me that he'd had the same problem once. Also he managed a Siemens-Fujitsu server where the SCSI-controller card had a tiny hairline crack: very odd behavior, usually not reproducible; IIRC, the 4th service engineer finally replaced the card ... So pulling a drive is a possible, if rare, failure mode. Definitely! And tolerating strange controller (or in general hardware) behavior is possibly a big + for an OS which targets SMEs and home users as well (everybody knows about far-east and other cheap HW producers, which sometimes seem to say, let's ship it, later we build a special driver for MS Windows which works around the bug/problem ...). Similar story: ~ 2000+ we had a WG server with 4 PATA IDE channels, one HDD on each. HDD0 on CH0 mirrored to HDD2 on CH2, HDD1 on CH1 mirrored to HDD3 on CH3, using the Linux Softraid driver. We found out that when HDD1 on CH1 got on the blink, for some reason the controller got on the blink as well, i.e. took CH0 and vice versa down too.
After reboot, we were able to force the md raid to re-take the drives marked as bad and even found out that the problem started when a certain part of a partition was accessed (which made the ops on that raid really slow for some minutes - but after the driver marked the drive(s) as bad, performance was back). Thus disabling the partition gave us the time to get a new drive... During all these ops nobody (except the sysadmins) realized that we had a problem - thanx to the md raid1 (with xfs btw.). And also we did not have any data corruption (at least, nobody has complained about it ;-)). Wrt. what I've experienced and read on the zfs-discuss etc. lists, I have the __feeling__ that we would have gotten into real trouble using Solaris (even the most recent one) on that system ... So if one asks me whether to run Solaris+ZFS on a production system, I usually say: definitely, but only if it is a Sun server ... My 2¢ ;-) Regards, jel. PS: And yes, all the vendor-specific workarounds/hacks are a problem for the Linux kernel folks as well - at least discouraged on Torvalds' side, IIRC ...
[zfs-discuss] Jumpstart + ZFS boot: profile?
Hi, I wanna try to set up a machine via jumpstart with ZFS boot using snv_95. Usually (UFS) I use a profile like this for it:

install_type    initial_install
system_type     standalone
usedisk         c1t0d0
partitioning    explicit
filesys         c1t0d0s0    256     /
filesys         c1t0d0s1    16384   swap
filesys         c1t0d0s3    4096    /var
filesys         c1t0d0s4    8192    /usr
filesys         c1t0d0s5    4096    /opt
filesys         c1t0d0s7    free    /joker
...
# c1t0 gets replaced by the real bootdisk path via begin script

Is something similar now possible wrt. ZFS, i.e. something like

zpool create boot c1t0d0 swap=16G
zfs create boot/root ; zfs set reservation=512M boot/root
zfs create boot/var  ; zfs set reservation=4G boot/var
zfs create boot/usr  ; zfs set reservation=8G boot/usr
zfs create boot/opt  ; zfs set reservation=4G boot/opt

??? Regards, jel.
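For comparison, the ZFS-root JumpStart syntax that the docs describe uses a single `pool` keyword (pool name, pool size, swap size, dump size, vdev list) plus `bootenv installbe`, rather than per-dataset reservations - a hedged sketch with illustrative names and sizes (separate /usr and /opt datasets are not expressible this way):

```
install_type  initial_install
pool          rpool  auto  16g  auto  c1t0d0s0
bootenv       installbe  bename  s10be
```

So the UFS-style fine-grained slice layout above has no direct ZFS-profile equivalent.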
Re: [zfs-discuss] Jumpstart + ZFS boot: profile?
On Thu, Aug 14, 2008 at 02:33:19PM -0700, Richard Elling wrote: There is a section on jumpstart for root ZFS in the ZFS Administration Guide. http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf Ah ok - thanx for the link. Seems to be almost the same as on the web pages (which I thought were out of date ...) Is something similar now possible wrt. ZFS, i.e. something like zpool create boot c1t0d0 swap=16G zfs create boot/root ; zfs set reservation=512M boot/root zfs create boot/var ; zfs set reservation=4G boot/var zfs create boot/usr ; zfs set reservation=8G boot/usr zfs create boot/opt ; zfs set reservation=4G boot/opt ??? So the answer is NO :( Regards, jel.
Re: [zfs-discuss] Jumpstart + ZFS boot: profile?
On Thu, Aug 14, 2008 at 10:49:54PM -0400, Ellis, Mike wrote: You can break out just var, not the others. Yepp - and that's not sufficient :( Regards, jel.
Re: [zfs-discuss] ZFS ACLs/Samba integration
On Thu, Mar 13, 2008 at 11:33:57AM +0000, Darren J Moffat wrote: Paul B. Henson wrote: I'm currently prototyping a Solaris file server that will dish out user home directories and group project directories via NFSv4 and Samba. Why not the in-kernel CIFS server ? E.g., how would one mimic:

[office]
    comment = office
    path = /export/vol1/office
    valid users = @office
    force group = office
    create mode = 660
    directory mode = 770
...

We already lost this functionality with the introduction of the NFSv4 ACL crap on ZFS and earned a lot of 'we hate you' feedback. Anyway, most users and staff have switched/are switching over to Windows (we do not support Linux yet and Solaris is wrt. desktop at least 5 years behind the scene), so the last 5% of *x users need to live with it. However, if we switched to Solaris CIFS (which AFAIK cannot accomplish what is required) we would have no friends anymore ... Regards, jel.
Re: [zfs-discuss] ZFS array NVRAM cache?
On Tue, Sep 25, 2007 at 10:14:57AM -0700, Vincent Fox wrote: Where is ZFS with regards to the NVRAM cache present on arrays? I have a pile of 3310 with 512 megs cache, and even some 3510FC with 1-gig cache. It seems silly that it's going to waste. These are dual-controller units so I have no worry about loss of cache information. It looks like OpenSolaris has a way to force arguably correct behavior, but Solaris 10u3/4 do not. I see some threads from early this year about it, and nothing since. Made some simple tests wrt. continuous sequential writes/reads for a 3510 (single controller), single host (V490) with 2 FC-HBAs - so, yes, I'm now running ZFS single-disk over HW RAID10 (10 disks) ... Haven't had the time to test all combinations or mixed-load cases; however, in case you wanna check what I got: http://iws.cs.uni-magdeburg.de/~elkner/3510.txt Have fun, jel.
Re: [zfs-discuss] ZFS problems in dCache
On Wed, Aug 01, 2007 at 09:49:26AM -0700, Sergey Chechelnitskiy wrote: Hi Sergey, I have a flat directory with a lot of small files inside. And I have a Java application that reads all these files when it starts. If this directory is located on ZFS, the application starts fast (15 mins) when the number of files is around 300,000 and starts very slowly (more than 24 hours) when the number of files is around 400,000. The question is why? Let's set aside the question of why this application is designed this way. I still needed to run this application. So I installed a Linux box with XFS, mounted this XFS directory on the Solaris box, and moved my flat directory there. Then my application started fast (< 30 mins) even when the number of files (in the Linux-hosted XFS directory mounted through NFS on the Solaris box) was 400,000 or more. Basically, what I want to do is run this application on a Solaris box. Right now I cannot. Just a rough guess - this might be a Solaris threading problem. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6518490 So perhaps starting the app with -XX:-UseThreadPriorities may help ... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
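For the record, the suggested workaround goes on the java command line; a hypothetical invocation could look like this (the jar name and path are placeholders, not from the original post — only the -XX:-UseThreadPriorities flag comes from the bug report above):

```shell
# Hypothetical launcher: disable JVM use of thread priorities
# (workaround for Solaris threading bug 6518490).
java -XX:-UseThreadPriorities -jar app.jar /path/to/flat-directory
```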
Re: [zfs-discuss] New german white paper on ZFS
On Tue, Jun 19, 2007 at 05:19:05PM +0200, Constantin Gonzalez wrote: Hi, http://blogs.sun.com/constantin/entry/new_zfs_white_paper_in Excellent!!! I think it would be a pretty good idea to put the links for the paper and slides on the ZFS Documentation page, aka http://www.opensolaris.org/os/community/zfs/docs/ Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Slashdot Article: Does ZFS Obsolete Expensive NAS/SANs?
On Mon, Jun 04, 2007 at 12:10:18PM -0700, eric kustarz wrote: There's going to be some very good stuff for ZFS in s10u4, can you please update the issues *and* features when it comes out? Yes, and don't forget to add that the POSIX ACLs have been dropped/replaced by the braindamaged NFSv4 ACLs. Now (when one migrates/cp's UFS to ZFS) one needs to

    chmod -R A- workgroup_dir

and

    find workgroup_dir -type d -exec chmod 0770 {} +

and similarly

    find workgroup_dir -type f -exec chmod 0660 {} +

to make files accessible again. Last but not least, one needs to teach the *x nerds to revert to the same procedures as 20 years ago: create/copy files/dirs to workgroup_dir and then chmod g+rw etc. Yes, the Windows people are laughing now about the *x nerds, and the nerds start hating admins who use ZFS for workgroup dirs and get bad marks because they keep forgetting the chmod etc. thing ... My 2 ¢. Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
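The reset steps above can be tried end-to-end on a throw-away tree; a minimal sketch (the directory name is a demo placeholder, and the `chmod -R A-` step, which strips NFSv4 ACL entries, only exists on Solaris, hence the uname guard):

```shell
# Demo: reset modes after a UFS -> ZFS copy, on a scratch tree.
dir=/tmp/workgroup_dir.demo
mkdir -p "$dir/sub" && touch "$dir/sub/report.txt"
# Strip NFSv4 ACL entries first (Solaris-only chmod syntax).
if [ "$(uname)" = SunOS ]; then chmod -R A- "$dir"; fi
# Restore group-writable modes for dirs and files.
find "$dir" -type d -exec chmod 0770 {} +
find "$dir" -type f -exec chmod 0660 {} +
```

In practice, substitute the real workgroup directory for the demo path.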
Re: [zfs-discuss] setup_install_server, cpio and zfs : fix needed ?
On Sun, May 13, 2007 at 12:24:48PM -0500, Gael wrote: As I saw the same issue with the previous release, I'm going to post this one here and not on the U4 beta forum. I'm trying to create a miniroot image for wanboot from the S10u4 media (the same issue occurred with the S10u3 media) on a ZFS filesystem on a U3 server. ... If checking in parallel, I can see a cpio running but not doing a lot... 30 mins without activity yet... I have the same problem on a thumper with u3: creating a boot image there takes about 41 min; on a U40 with b55b, 2.5 min.

    + /boot/solaris/bin/root_archive packmedia $DST $SCRATCH

On both systems $SCRATCH is local UFS. $DST is ZFS on the thumper, and the same directory NFS-mounted on the U40.

    21537 /bin/ksh /boot/solaris/bin/root_archive packmedia /home/elkner/tmp/miniroot/Sol
    21576 newfs /dev/rlofi/1
    21577 sh -c mkfs -F ufs /dev/rlofi/1 385800 600 1 8192 1024 16 10 120 2048 t 0 -1 8 1
    21578 mkfs -F ufs /dev/rlofi/1 385800 600 1 8192 1024 16 10 120 2048 t 0 -1 8 1 n

Here mkfs seems to take the most time, even though prstat says load 0.02 and less (mkfs is doing nothing) ... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] learn to quote
On Sat, Apr 28, 2007 at 01:20:45PM -0400, Christine Tran wrote: Jens Elkner wrote: So please: http://learn.to/quote We apparently need to learn German as well. -CT Not really. There is also a Dutch version ;-) For your convenience (and all the people who can't find the English | Dutch link right below "Revision" at the top of the page): http://www.netmeister.org/news/learn2quote.html http://www.briachons.org/art/quote/ Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] learn to quote
Hi, why is it so time-consuming being on an opensolaris discuss mailing list? Obviously because many people never learned (or forgot) how to quote. So please: http://learn.to/quote Thanx, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] migration/acl4 problem
On Thu, Mar 22, 2007 at 01:34:15PM -0600, Mark Shellenbaum wrote: There is one big difference which you see here. ZFS always honors the user's umask, and that is why the file was created with 644 permissions rather than 664 as UFS did. ZFS has to always apply the user's umask because of POSIX. Wow, that's a big show stopper! If I tell the users that after the transition they have to toggle their umask before/after writing to certain directories, or need to do a chmod, I'm sure they'll want to hang me from the next tree and get their OS changed to Linux/Windooze... Only if your goal is to ignore a user's intent on what permissions their files should be created with. Think about users who set their umask to 077. They will be upset when their files are created with a more permissive mode. The ZFS way is much more secure. Nope - you're talking about a different thing. I did not say that these ACLs would be set on every possible fs|directory on the system! We and several companies I worked for use it to have a shared data dir - you might think of it as a kind of workgroup-based CVS, where the members of the owning workgroup are in the role of committers. The rationale for this is obvious and actually the same as for CVS: the only thing that counts is what one can find in /data/$workgroup/**. So there is no need to waste time asking who finally has the latest version of a document, or which version should be used wrt. communication with non-internal entities, etc., and furthermore it allows one to reduce the huge pile of redundant data extremely... We have used this pattern/policy successfully for more than 10 years: for Windows users it was achieved easily by using Samba, on Linux servers using XFS ACLs, and on Solaris servers using UFS ACLs. ZFS breaks it. And since Solaris has no smbmnt, we can't even get a workaround which makes more or less sense... What is your real desired goal?
Are you just wanting anybody in a specific group to be able to read/write all files in a certain directory tree? If so, then there are other ways to achieve this, with file and directory inheritance. Maybe I didn't use the right settings, but I played around with it before sending the original posting (zfs aclmode intentionally set to passthrough and fd flags added), but this didn't work either. So a working example/demo would be helpful ... Isn't there a flag/property for zfs to get back the old behavior, or to enable POSIX ACLs instead of ZFS ACLs? A force_directory_create_mode=0770,force_file_create_mode=0660 property (like for Samba shares) would be even better - no need to fight with ACLs... That would be bad. That would mean that every file in a file system would be forced to be created with a forced set of permissions. And that's exactly the business requirement. And even more a practical experience: assume users always have to change their umask before writing to /data/workgroup/**. Since people are usually a little bit lazy and focused on getting the job done, it doesn't take very long until they have added umask 007 to their .login/.profile or whatever. But now anybody in the same workgroup is also able to read the user's private data in their $HOME, e.g. $HOME/Mail/* ... So in theory you might be right, but in practice it turns out that you achieve exactly the opposite... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
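For reference, the inheritance-based setup alluded to above would presumably be configured along these lines on Solaris (dataset, directory, and property values are assumptions for illustration; whether this actually survives the umask handling discussed in the thread is exactly the open question):

```shell
# Sketch (Solaris only): inheritable group-rw ACEs on a workgroup directory.
# Let local ACLs/ACEs pass through instead of being masked by mode bits.
zfs set aclmode=passthrough pool1/data
zfs set aclinherit=passthrough pool1/data
# A= replaces the ACL; f = inherit to new files, d = inherit to new dirs.
chmod A=owner@:rwxpdDaARWcCos:fd:allow,group@:rwxpdDaARWcCos:fd:allow /pool1/data/office
```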
[zfs-discuss] migration/acl4 problem
Hi, S10U3: It seems that UFS POSIX ACLs are not properly translated to ZFS ACL4 entries when one transfers a directory tree from UFS to ZFS. Test case: assume one has users A and B, both belonging to group G and having their umask set to 022:

1) On UFS - as user A do:

    mkdir /dir
    chmod 0775 /dir
    setfacl -m d:u::rwx,d:g::rwx,d:o:r-x,d:m:rwx /dir
    # samba would say: force create mask = 0664; directory mode = 0775

- as user B do:

    cd /dir
    touch x
    ls -alv

- as user A do:

    cd /dir
    echo bla > x

- results in:

    drwxrwxr-x+ 3 A G 512 Mar 22 01:20 .
        0:user::rwx
        1:group::rwx #effective:rwx
        2:mask:rwx
        3:other:r-x
        4:default:user::rwx
        5:default:group::rwx
        6:default:mask:rwx
        7:default:other:r-x
    ...
    -rw-rw-r-- 1 B G 4 Mar 22 01:22 x
        0:user::rw-
        1:group::rw- #effective:rw-
        2:mask:rw-
        3:other:r--

2) On ZFS - e.g. as root do:

    cp -P -r -p /dir /pool1/zfsdir    # cp: Insufficient memory to save acl entry
    cp -r -p /dir /pool1/zfsdir       # cp: Insufficient memory to save acl entry
    find dir | cpio -puvmdP /pool1/docs/

- as user B do:

    cd /pool1/zfsdir/dir
    touch y

- as user A do:

    cd /pool1/zfsdir/dir
    echo bla > y    # y: Permission denied.

- result:

    drwxrwxr-x+ 2 A G 4 Mar 22 01:36 .
        owner@:--:fdi---:deny
        owner@:--:--:deny
        owner@:rwxp---A-W-Co-:fdi---:allow
        owner@:---A-W-Co-:--:allow
        group@:--:fdi---:deny
        group@:--:--:deny
        group@:rwxp---A-W-Co-:fdi---:allow
        group@:---A-W-Co-:--:allow
        everyone@:-w-p---A-W-Co-:fdi---:deny
        everyone@:---A-W-Co-:--:deny
        everyone@:r-x---a-R-c--s:fdi---:allow
        everyone@:--a-R-c--s:--:allow
        owner@:--:--:deny
        owner@:rwxp---A-W-Co-:--:allow
        group@:--:--:deny
        group@:rwxp--:--:allow
        everyone@:-w-p---A-W-Co-:--:deny
        everyone@:r-x---a-R-c--s:--:allow
    ...
    -rw-r--r--+ 1 B G 0 Mar 22 01:36 y
        owner@:--:--:deny
        owner@:---A-W-Co-:--:allow
        group@:--:--:deny
        group@:---A-W-Co-:--:allow
        everyone@:---A-W-Co-:--:deny
        everyone@:--a-R-c--s:--:allow
        owner@:--x---:--:deny
        owner@:rw-p---A-W-Co-:--:allow
        group@:-wxp--:--:deny

So, has anybody a clue how one is able to migrate directories from UFS to ZFS without losing functionality? I've read that it is always possible to translate POSIX ACLs to ACL4, but it doesn't seem to work. So I have a big migration problem ... :((( Also I haven't found anything which explains how ACL4 really works on Solaris, i.e. how the rules are applied. Yes, in order, and only the entries whose 'who' matches. But what does 'who matches' mean, what purpose do the 'owner@:--:--:deny' entries have, and what takes precedence (allow | deny | first match | last match)? Also I remember hearing that once an allow has matched, everything else is ignored - but then I'm asking why the order of the ACEs is important. Last but not least, what purpose do the standard perms, e.g. 0644, have - completely ignored if ACEs are present? Or used as a fallback if no ACE matches, or if ACEs match but nowhere set e.g. the r bit? Any hints? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Wed, Feb 28, 2007 at 11:45:35AM +0100, Roch - PAE wrote: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460622 Any estimations when we'll see a [feature] fix for U3? Should I open a call, to perhaps raise the priority of the fix? The bug applies to checksum as well. Although the fix now in the gate only addresses compression. There is a per-pool limit on throughput due to checksumming. Multiple pools may help. Yepp. However, pooling the disks also means limiting the I/O for a single task to max. #disksOfPool*IOperDisk/2. So pooling would make sense to me if one has a lot of tasks and is able to force them to a dedicated pool... So my conclusion is: the more pools, the more aggregate bandwidth (if one is able to distribute the work properly over all disks), but the less bandwidth for a single task :(( This performance feature was fixed in Nevada last week. Workaround is to create multiple pools with fewer disks. Does this make sense for mirrors only as well? Yep. OK, since I can't get out more than ~1GB/s (only one PCI-X slot left for a 10Gbps NIC), I decided to split into 2m*12 + 2m*10 + s*2 (see below). But I do not wanna rise the write perf limit: it already dropped to an average of ~345 MB/s :((( http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6415647 is degrading the perf a bit (guesstimate of anywhere up to 10-20%). I would guess even up to 45% ... Check out iostat 1 and you will see the '0s': not good. Yes - saw even 5 consecutive 0s ... :( Here the layout (incl. test results) I actually have in mind for production:

1. pool for big files (source tarballs, multimedia, ISO images):

    zpool create -n pool1 \
        mirror c0t0d0 c1t0d0  mirror c6t0d0 c7t0d0  mirror c4t1d0 c5t1d0 \
        mirror c0t2d0 c1t2d0  mirror c6t2d0 c7t2d0  mirror c4t3d0 c5t3d0 \
        mirror c0t4d0 c1t4d0  mirror c6t4d0 c7t4d0  mirror c4t5d0 c5t5d0 \
        mirror c0t6d0 c1t6d0  mirror c6t6d0 c7t6d0  mirror c4t7d0 c5t7d0 \
        spare c4t0d0 c4t4d0

    (2x 256G) write(min/max/aver): 0 674 343.7

2. pool for mixed stuff (homes, apps):

    zpool create -n pool2 \
        mirror c0t1d0 c1t1d0  mirror c6t1d0 c7t1d0  mirror c4t2d0 c5t2d0 \
        mirror c0t3d0 c1t3d0  mirror c6t3d0 c7t3d0 \
        mirror c0t5d0 c1t5d0  mirror c6t5d0 c7t5d0  mirror c4t6d0 c5t6d0 \
        mirror c0t7d0 c1t7d0  mirror c6t7d0 c7t7d0 \
        spare c4t0d0 c4t4d0

    (2x 256G) write(min/max/aver): 0 600 386.0

1. + 2. (2x 256G) write(min/max/aver): 0   1440 637.9
1. + 2. (4x 128G) write(min/max/aver): 3.5 1268 709.5 (381+328.5)

Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Mon, Feb 26, 2007 at 06:36:47PM -0800, Richard Elling wrote: Jens Elkner wrote: Currently I'm trying to figure out the best zfs layout for a thumper wrt. read AND write performance. First things first. What is the expected workload? Random, sequential, lots of little files, few big files, 1-byte iops, synchronous data, constantly changing access times, ??? Mixed. I.e.: 1) as a home server for students' and staff's ~, so small and big files (BTW: what is small and what is big?) as well as compressed/text files (you know, the more space people have, the messier they get ...) - targeted at Samba and NFS. 2) app server in the sense of shared NFS space, where applications get installed once and can be used everywhere, e.g. eclipse, soffice, jdk*, TeX, Pro Engineer, Studio 11 and the like. Later I wanna have the same functionality for firefox, thunderbird, etc. for Windows clients via Samba, but this requires a little bit more tweaking to get it to work, aka time I do not have right now ... Anyway, when ~30 students start their monster apps like eclipse, oxygen, soffice at once (which happens in seminars quite frequently), I would be lucky to get the same performance via NFS as from a local HDD ... 3) Video streaming, i.e. capturing as well as broadcasting/editing via smb/nfs. In general, striped mirror is the best bet for good performance with redundancy. Yes - thought about doing a

    mirror c0t0d0 c1t0d0  mirror c4t0d0 c6t0d0  mirror c7t0d0 c0t4d0 \
    mirror c0t1d0 c1t1d0  mirror c4t1d0 c5t1d0  mirror c6t1d0 c7t1d0 \
    mirror c0t2d0 c1t2d0  mirror c4t2d0 c5t2d0  mirror c6t2d0 c7t2d0 \
    mirror c0t3d0 c1t3d0  mirror c4t3d0 c5t3d0  mirror c6t3d0 c7t3d0 \
    mirror c1t4d0 c7t4d0  mirror c4t4d0 c6t4d0 \
    mirror c0t5d0 c1t5d0  mirror c4t5d0 c5t5d0  mirror c6t5d0 c7t5d0 \
    mirror c0t6d0 c1t6d0  mirror c4t6d0 c5t6d0  mirror c6t6d0 c7t6d0 \
    mirror c0t7d0 c1t7d0  mirror c4t7d0 c5t7d0  mirror c6t7d0 c7t7d0

(probably removing the 5th line and using those drives as hot spares). But perhaps it might be better to split the mirrors into 3 different pools (though I'm not sure why: my brain says no, my belly says yes ;-)). I did some simple mkfile 512G tests and found out that on average ~500 MB/s seems to be the maximum one can reach (tried the initial default setup, all 46 HDDs as R0, etc.). How many threads? One mkfile thread may be CPU bound. Very good point! Using 2 x mkfile 256G I got (min/max/aver) 473/750/630 MB/s (via zpool iostat 10) with the layout shown above and no compression enabled. Just to prove it: with 4 x mkfile 128G I got 407/815/588, with 3 x mkfile 170G 401/788/525; 1 x mkfile 512G was 397/557/476. Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
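The multi-writer experiment above can be reproduced with plain dd standing in for Solaris' mkfile; a minimal sketch (sizes scaled way down so it is cheap to run — on the thumper it was e.g. 2 x 256G):

```shell
# Sketch: N parallel sequential writers, as in the 2x/4x mkfile runs above.
N=2
for i in $(seq 1 $N); do
    # each writer streams zeros into its own file in the background
    dd if=/dev/zero of=/tmp/seqwrite.$i bs=1M count=8 2>/dev/null &
done
wait                       # block until all writers finish
ls -l /tmp/seqwrite.*
```

While this runs against a real pool, `zpool iostat 10` in a second terminal shows the aggregate write bandwidth, as in the numbers quoted above.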
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Tue, Feb 27, 2007 at 11:35:37AM +0100, Roch - PAE wrote: That might be a per-pool limitation due to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460622 Not sure - did not use the compression feature... This performance feature was fixed in Nevada last week. Workaround is to create multiple pools with fewer disks. Does this make sense for mirrors only as well? Also this http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6415647 is degrading the perf a bit (guesstimate of anywhere up to 10-20%). Hmm - sounds similar (zpool iostat 10):

                 capacity     operations    bandwidth
    pool        used  avail   read  write   read  write
    pool1      4.36G  10.4T      0  5.04K      0   636M
    pool1      9.18G  10.4T      0  4.71K    204   591M
    pool1      17.8G  10.4T      0  5.21K      0   650M
    pool1      24.0G  10.4T      0  5.65K      0   710M
    pool1      30.9G  10.4T      0  6.26K      0   786M
    pool1      36.3G  10.4T      0  2.74K      0   339M
    pool1      41.5G  10.4T      0  4.27K  1.60K   533M
    pool1      46.7G  10.4T      0  4.19K      0   527M
    pool1      46.7G  10.4T      0  2.28K  1.60K   290M
    pool1      55.7G  10.4T      0  5.18K      0   644M
    pool1      59.9G  10.4T      0  6.17K      0   781M
    pool1      68.8G  10.4T      0  5.63K      0   702M
    pool1      73.8G  10.3T      0  3.93K      0   492M
    pool1      78.7G  10.3T      0  2.96K      0   366M
    pool1      83.2G  10.3T      0  5.58K      0   706M
    pool1      91.5G  10.3T      4  6.09K  6.54K   762M
    pool1      96.4G  10.3T      0  2.74K      0   338M
    pool1       101G  10.3T      0  3.88K  1.75K   485M
    pool1       106G  10.3T      0  3.85K      0   484M
    pool1       106G  10.3T      0  2.79K  1.60K   355M
    pool1       110G  10.3T      0  2.97K      0   369M
    pool1       119G  10.3T      0  5.20K      0   647M
    pool1       124G  10.3T      0  3.64K  1.80K   455M
    pool1       124G  10.3T      0  3.54K      0   453M
    pool1       128G  10.3T      0  2.77K      0   343M
    pool1       133G  10.3T      0  3.92K    102   491M
    pool1       137G  10.3T      0  2.43K      0   300M
    pool1       141G  10.3T      0  3.26K      0   407M
    pool1       148G  10.3T      0  5.35K      0   669M
    pool1       152G  10.3T      0  3.14K      0   392M
    pool1       156G  10.3T      0  3.01K      0   374M
    pool1       160G  10.3T      0  4.47K      0   562M
    pool1       164G  10.3T      0  3.04K      0   379M
    pool1       168G  10.3T      0  3.39K      0   424M
    pool1       172G  10.3T      0  3.67K      0   459M
    pool1       176G  10.2T      0  3.91K      0   490M
    pool1       183G  10.2T      4  5.58K  6.34K   699M
    pool1       187G  10.2T      0  3.30K  1.65K   406M
    pool1       195G  10.2T      0  3.24K      0   401M
    pool1       198G  10.2T      0  3.21K      0   401M
    pool1       203G  10.2T      0  3.87K      0   486M
    pool1       206G  10.2T      0  4.92K      0   623M
    pool1       214G  10.2T      0  5.13K      0   642M
    pool1       222G  10.2T      0  5.02K      0   624M
    pool1       225G  10.2T      0  4.19K      0   530M
    pool1       234G  10.2T      0  5.62K      0   700M
    pool1       238G  10.2T      0  6.21K      0   787M
    pool1       247G  10.2T      0  5.47K      0   681M
    pool1       254G  10.2T      0  3.94K      0   488M
    pool1       258G  10.2T      0  3.54K      0   442M
    pool1       262G  10.2T      0  3.53K      0   442M
    pool1       267G  10.2T      0  4.01K      0   504M
    pool1       274G  10.2T      0  5.32K      0   664M
    pool1       274G  10.2T      4  3.42K  6.69K   438M
    pool1       278G  10.2T      0  3.44K  1.70K   428M
    pool1       282G  10.1T      0  3.44K      0   429M
    pool1       289G  10.1T      0  5.43K      0   680M
    pool1       293G  10.1T      0  3.36K      0   419M
    pool1       297G  10.1T      0  3.39K    306   423M
    pool1       301G  10.1T      0  3.33K      0   416M
    pool1       308G  10.1T      0  5.48K      0   685M
    pool1       312G  10.1T      0  2.89K      0   360M
    pool1       316G  10.1T      0  3.65K      0   457M
    pool1       320G  10.1T      0  3.10K      0   386M
    pool1       327G  10.1T      0  5.48K      0   686M
    pool1       334G  10.1T      0  3.31K      0   406M
    pool1       337G  10.1T      0  5.28K      0   669M
    pool1       345G  10.1T      0  3.30K      0   402M
    pool1       349G  10.1T      0  3.48K  1.60K   437M
    pool1       349G  10.1T      0  3.42K      0   436M
    pool1       353G  10.1T      0  3.05K      0   379M
    pool1       358G  10.1T      0  3.81K      0   477M
    pool1       362G  10.1T      0  3.40K      0   425M
    pool1       366G  10.1T      4  3.23K  6.59K   401M
    pool1       370G  10.1T      0  3.47K  1.65K   432M
    pool1       376G  10.1T      0  4.98K      0   623M
    pool1       380G  10.1T      0  2.97K      0   369M
    pool1       384G  10.0T      0  3.52K    409   439M
    pool1       390G  10.0T      0  5.00K      0   626M
    pool1       398G  10.0T      0  3.38K      0   414M
    pool1       404G  10.0T      0  5.09K      0   637M
    pool1       408G  10.0T      0  3.18K      0   397M
    pool1       412G  10.0T      0  3.19K      0   397M
[zfs-discuss] Re: zfs bogus (10 u3)?
Hi Wire ;-), What's the output of

    zpool list
    zfs list

? Ooops, already destroyed the pool. Anyway, slept a night over it and found a possible explanation: the files were created with mkfile, and mkfile has an option -n. It was not used to create the files; however, I interrupted mkfile (^C). So I guess mkfile always creates a sparse file first, and when no '-n' is given, it then starts to allocate the blocks ... Regards, jel. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
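The sparse-file theory is easy to check with dd as a portable stand-in for an interrupted mkfile: seeking past EOF before writing sets the logical size without allocating data blocks, which is exactly the state an interrupted mkfile (without -n) would leave behind.

```shell
# Create a ~100MB-sized file that occupies almost no disk blocks (sparse).
dd if=/dev/zero of=/tmp/sparse.demo bs=1 count=1 seek=104857600 2>/dev/null
ls -l /tmp/sparse.demo   # logical size: ~100 MB
du -k /tmp/sparse.demo   # allocated size: a few KB at most
```

The gap between `ls -l` and `du` output is the tell-tale sign of a sparse file.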
[zfs-discuss] understanding zfs/thumper bottlenecks?
Currently I'm trying to figure out the best zfs layout for a thumper wrt. read AND write performance. I did some simple mkfile 512G tests and found out that on average ~500 MB/s seems to be the maximum one can reach (tried the initial default setup, all 46 HDDs as R0, etc.). According to http://www.amd.com/us-en/assets/content_type/DownloadableAssets/ArchitectureWP_062806.pdf I would assume that much more, and in theory a max. of ~2.5 GB/s, should be possible with R0 (assuming the throughput of a single thumper HDD is ~54 MB/s)... Is somebody able to enlighten me? Thanx, jel. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
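The ~2.5 GB/s figure is pure back-of-the-envelope arithmetic, assuming all 46 disks stream sequentially at once:

```shell
# 46 disks x ~54 MB/s sequential each => theoretical aggregate write rate
echo "$((46 * 54)) MB/s"   # 2484 MB/s, i.e. roughly 2.5 GB/s
```

The gap between that and the measured ~500 MB/s is what the thread is about.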
[zfs-discuss] zone with lofs zfs - why legacy
I've created a zone which should mount the /pool1/flexlm.ulo zfs via lofs:

    + zfs create pool1/flexlm.ulo
    + zfs set atime=off pool1/flexlm.ulo
    + zfs set sharenfs=off pool1/flexlm.ulo
    + zonecfg -z flexlm
    ...
    add fs
    set dir=/usr/local
    set special=/pool1/flexlm.ulo
    set type=lofs
    add options rw,nodevices
    end
    ...

This seems to work. However, the manual zfs(5) says that the mountpoint property has to be set to legacy.
1) Why?
2) If one sets the zfs property to legacy, the manual does not say how an /etc/vfstab entry should look (and a mount_zfs man page doesn't exist)...
3) The /pool1/flexlm.ulo property is set to atime=off. Do I need to specify this option or something similar when creating the zone?
4) Wrt. best performance only, what should one prefer: add fs:dir or add fs:dataset?
This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
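Regarding question 2): from what I can tell, a legacy-mounted dataset is listed in /etc/vfstab with fstype zfs and dashes for the fields that don't apply — a sketch from memory, not an authoritative answer:

```shell
# Switch the dataset to legacy mounting (Solaris), so that mount(1M)
# and /etc/vfstab control it instead of the zfs mountpoint property:
zfs set mountpoint=legacy pool1/flexlm.ulo
# Matching /etc/vfstab entry (device-to-fsck and fsck-pass don't apply):
#
#   pool1/flexlm.ulo  -  /usr/local  zfs  -  yes  -
```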
[zfs-discuss] Re: zone with lofs zfs - why legacy
'Robert Milkowski wrote:' Hi Robert, Monday, October 23, 2006, 7:15:39 PM, you wrote: JE 3) the /pool1/flexlm.ulo property is set to atime=off. Do I need JE to specify this option or something similar, when creating the zone? no, you don't. OK. JE 4) Wrt. best performance only, what should one prefer: add fs:dir or add fs:dataset? The performance should be the same in both cases - only a difference in features. ps. of course you do realize that mounting a filesystem over lofs will degrade performance Yes, I guessed that, but hopefully not that much ... Thinking about it, it would suggest to me (if I need abs. max. perf) that the best thing to do is to create a pool inside the zone and use zfs on it? Regards, jens. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: zone with lofs zfs - why legacy
Using a ZFS filesystem within a zone will go just as fast as in the global zone, so there's no need to create multiple pools. So Robert is actually wrong (at least in theory): using a zfs via add fs:dir=...,type=lofs probably gives less performance than using it via add dataset:name. Correct? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
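If raw performance is the goal, delegating the dataset to the zone (rather than lofs-mounting it) would presumably look like this — a sketch, with the zone and dataset names taken from the earlier posting:

```shell
# Sketch (Solaris): delegate the ZFS dataset to the zone instead of
# lofs-mounting it; the zone then manages the dataset itself.
zonecfg -z flexlm <<EOF
add dataset
set name=pool1/flexlm.ulo
end
commit
EOF
```

With a delegated dataset the zone administrator can set properties (atime, quota, etc.) from inside the zone, which also answers question 3) of the original posting.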