Re: [zfs-discuss] Replacement for X25-E
I don't think the 311 has any over-provisioning (other than the 7% from the GB to GiB conversion). I believe it is an X25-E with only 5 channels populated. The upcoming enterprise models are MLC based and have greater over-provisioning AFAIK. The 20GB 311 only costs ~ $100 though. The 100GB Intel 710 costs ~ $650. The 311 is a good choice for home or budget users, and it seems that the 710 is much bigger than it needs to be for slog devices. I think the 311 looks like a suitable replacement, in the sense that for the price of two X25-Es you can put in four 311's as slog; going to test it out. Thanks to you all. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
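A minimal sketch of what putting four 311s in as slog could look like (device names here are made up; whether you mirror or stripe the log devices is a separate trade-off):
# zpool add tank log mirror c4t0d0 c4t1d0
# zpool add tank log mirror c4t2d0 c4t3d0
# zpool status tank
Mirroring the log devices in pairs keeps the slog redundant; adding them without the mirror keyword would instead stripe synchronous writes across all four devices.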
Re: [zfs-discuss] Replacement for X25-E
Can you rank your priorities:
+ cost/IOPS
+ cost
+ latency
+ predictable latency
+ HA-cluster capable
There are quite a number of devices available now, at widely varying costs, applications, and performance. -- richard
I'd say a price range around the same as the X25-E was, the main priorities being predictable latency and performance. Also, write wear shouldn't become an issue when writing 150MB/s 24/7, 365 days a year. Thanks Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
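As a rough back-of-envelope check of that write load (my own arithmetic, not figures from the thread):
150 MB/s * 86400 s/day = 12,960,000 MB/day, roughly 12.4 TiB written per day
12.4 TiB/day * 365 days is roughly 4.4 PiB per year
That yearly figure (divided across however many log devices share the load) is the number to compare against a candidate drive's rated write endurance.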
[zfs-discuss] Replacement for X25-E
Hi, I was wondering, do you guys have any recommendations for a replacement for the Intel X25-E, as it is being EOL'd? Mainly for use as a log device. With kind regards Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about write balancing
To me it seems that writes are not directed properly to the devices that have most free space - almost exactly the opposite. The writes seem to go to the devices that have _least_ free space, instead of the devices that have most free space. The same effect that can be seen in these 60s averages can also be observed in a shorter timespan, like a second or so. Is there something obvious I'm missing? Not sure how OI should behave; I've managed to even out space usage between vdevs by bringing a device offline in the vdev you don't want writes to end up in. If you have a degraded vdev in your pool, zfs will try not to write there, and this may be the case here as well, since I don't see zpool status output. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
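A minimal sketch of the offline trick described above (pool and device names are hypothetical; remember that an offlined device leaves its vdev degraded and unredundant for the duration):
# zpool offline tank c3t5d0    # steer new writes away from this vdev
# ... let new writes land on the emptier vdevs ...
# zpool online tank c3t5d0     # the device resilvers the writes it missed
# zpool status tank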
Re: [zfs-discuss] Wired write performance problem
Have you determined this is not 7000208? It sounds much like it. You could run
/usr/sbin/lockstat -HcwP -n 10 -x aggrate=10hz -D 20 -s 40 sleep 2
/usr/sbin/lockstat -CcwP -n 10 -x aggrate=10hz -D 20 -s 40 sleep 2
to find out the hottest callers (space_map_load, kmem_cache_free) while the issue is occurring. Yours Markus Kovero
-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Donald Stahl Sent: 9 June 2011 6:27 To: Ding Honghui Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Wired write performance problem
There is a snapshot of the metaslab layout; the last 51 metaslabs have 64G free space. After we added all the disks to our system we had lots of free metaslabs - but that didn't seem to matter. I don't know if perhaps the system was attempting to balance the writes across more of our devices, but whatever the reason - the percentage didn't seem to matter. All that mattered was changing the size of the min_alloc tunable. You seem to have gotten a lot deeper into some of this analysis than I did, so I'm not sure if I can really add anything. Since 10u8 doesn't support that tunable I'm not really sure where to go from there. If you can take the pool offline, you might try connecting it to a b148 box and see if that tunable makes a difference. Beyond that I don't really have any suggestions. Your problem description, including the return of performance when freeing space, is _identical_ to the problem we had. After checking every single piece of hardware, replacing countless pieces, removing COMSTAR and other pieces from the puzzle - the only change that helped was changing that tunable. I wish I could be of more help but I have not had the time to dive into the ZFS code with any gusto. -Don ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Wired write performance problem
Hi, also see; http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg45408.html We hit this with Sol11 though, not sure if it's possible with Sol10. Yours Markus Kovero
-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ding Honghui Sent: 8 June 2011 6:07 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Wired write performance problem
Hi, I've got a weird write performance problem and need your help. One day, the write performance of zfs degraded. The write performance decreased from 60MB/s to about 6MB/s for sequential writes. Command: date;dd if=/dev/zero of=block bs=1024*128 count=1;date
The hardware configuration is 1 Dell MD3000 and 1 MD1000 with 30 disks. The OS is Solaris 10U8, zpool version 15 and zfs version 4. I ran DTrace to trace the write performance:
fbt:zfs:zfs_write:entry { self->ts = timestamp; }
fbt:zfs:zfs_write:return /self->ts/ { @time = quantize(timestamp - self->ts); self->ts = 0; }
It shows the following distribution (value / count):
8192 0
16384 16
32768 3270
65536 898
131072 985
262144 33
524288 1
1048576 1
2097152 3
4194304 0
8388608 180
16777216 33
33554432 0
67108864 0
134217728 0
268435456 1
536870912 1
1073741824 2
2147483648 0
4294967296 0
8589934592 0
17179869184 2
34359738368 3
68719476736 0
Compared to a storage system that is working well (1 MD3000), where the max zfs_write time falls in the 4294967296 bucket, that system is about 10 times faster. Any suggestions? Thanks Ding ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No write coalescing after upgrade to Solaris 11 Express
failed: space_map_load(sm, zfs_metaslab_ops, SM_FREE, smo, spa->spa_meta_objset) == 0, file ../zdb.c, line 571, function dump_metaslab Is this something I should worry about? uname -a: SunOS E55000 5.11 oi_148 i86pc i386 i86pc Solaris I thought we were talking about Solaris 11 Express, not OI? Anyway, I have no idea how OpenIndiana should or shouldn't behave here. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No write coalescing after upgrade to Solaris 11 Express
Sync was disabled on the main pool and then left to inherit to everything else. The reason for disabling this in the first place was to fix bad NFS write performance (even with a ZIL on an X25-E SSD it was under 1MB/s). I've also tried setting the logbias to throughput and latency but they both perform around the same level. Thanks -Matt
I believe you're hitting bug 7000208: space map thrashing affects NFS write throughput. We did too, and it impacted iscsi as well. If you have enough RAM you can try enabling metaslab debug (which makes the problem vanish):
# echo metaslab_debug/W1 | mdb -kw
And calculating the amount of RAM needed:
# /usr/sbin/amd64/zdb -mm poolname > /tmp/zdb-mm.out
# awk '/segments/ {s+=$2} END {printf("sum=%d\n", s)}' /tmp/zdb-mm.out
93373117 = sum of segments
16 VDEVs * 116 metaslabs = 1856 metaslabs in total
93373117 / 1856 = 50308 average number of segments per metaslab
50308 * 1856 * 64 = 5975785472
5975785472 / 1024 / 1024 / 1024 = 5.56 GB
Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] What drives?
So, does anyone know which drives to choose for the next setup? Hitachis look good so far, perhaps also Seagates, but right now, I'm dubious about the Blacks. Hi! I'd go for the WD RE edition. Blacks and Greens are for desktop use and therefore lack proper TLER settings, and they have useless power saving features that could induce errors and mysterious slowness. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Very bad ZFS write performance. Ok Read.
I noticed recently that write rate has dropped off and through testing now I am getting 35MB/sec writes. The pool is around 50-60% full. I am getting a CONSTANT 30-35% kernel cpu utilisation, even if the machine is idle. I do not know if this was the case when the write performance was better. I have tried reading from the server to a HDD on a windows client and I get 50+MB/sec which is probably the max that that HDD can sustain on a write. Hi, do you have your zfs prefetch turned on or off? Turning prefetch off makes comstar iscsi shares unusable in Solaris 11 Express while it might work fine in osol. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
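If it helps with the question above, this is one way to check and flip the prefetch tunable on a live system (a sketch using the usual mdb idiom seen elsewhere in this list; 1 disables prefetch, 0 re-enables it):
# echo zfs_prefetch_disable/D | mdb -k      # show current value
# echo zfs_prefetch_disable/W0t1 | mdb -kw  # disable prefetch
# echo zfs_prefetch_disable/W0t0 | mdb -kw  # enable prefetch again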
Re: [zfs-discuss] Very bad ZFS write performance. Ok Read.
On the other hand, that will only matter for reads. And the complaint is writes. Actually, it also affects writes (due to checksum reads?). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool does not like iSCSI ?
Do you know if these bugs are fixed in Solaris 11 Express? It says it was fixed in snv_140, and S11E is based on snv_151a, so it should be in: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6907687 I can confirm it works; iscsi zpools seem to work very happily now. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z/mirror hybrid allocator
2. If you have an existing RAIDZ pool and upgrade to b151a, you would need to upgrade the pool version to use this feature. In this case, newly written metadata would be mirrored. Hi, And if one creates a raidz3 pool, would metadata be a 3-way mirror as well? Also, how is it determined onto which devices the metadata is mirrored? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] problem adding second MD1000 enclosure to LSI 9200-16e
Any suggestions? Is there some sort of boot procedure, in order to get the system to recognize the second enclosure without locking up? Is there a special way to configure one of these LSI boards? It should just work; make sure you connect it the right way and that neither JBOD is in split mode (which does not allow daisy chaining). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] RAID-Z/mirror hybrid allocator
Hi, I'm referring to; http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6977913 It should be in Solaris 11 Express, has anyone tried this? How is this supposed to work? Any documentation available? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express
Does Oracle support Solaris 11 Express in production systems? -- richard Yes, you need a Premier support plan from Oracle for that. AFAIK, Solaris 11 Express is production ready, is going to be updated to the real Solaris 11, and is supported even on non-Oracle hardware if you have the money (and a certified system). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] is opensolaris support ended?
Thanks for your help. I would check this out. Hi, yes. No new support plans have been available for a while. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Maximum zfs send/receive throughput
I'm wondering if #6975124 could be the cause of my problem, too. There are several zfs send (and receive) related issues with 111b. You might seriously want to consider upgrading to a more recent OpenSolaris (134) or OpenIndiana. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool does not like iSCSI ?
Again: you may encounter this case only if... ... you're running some recent kernel patch level (in our case 142909-17) ... *and* you have placed zpools on both iscsi and non-iscsi devices. I witnessed the same behavior with osol_134, but it seems to be fixed in 147 at least. No idea about Solaris though. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
Add about 50% to the last price list from Sun and you will get the price it costs now ... It seems Oracle does not really want to sell its hardware: several-month delays getting prices out of sales reps, and pricing nowhere close to its competitors'. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
That's precisely what I'm experiencing. System still responds to ping. Anything that was already running in memory via network stays alive (cron jobs continue to run) but remote access is impossible (ssh, vnc, even local physical console...) And eventually the system will stop completely. Hi, the Broadcom issues show up as loss of network connectivity, i.e. the system stops responding to ping. This is a different issue; it's as if the system runs out of memory or loses its system disks (which we have seen lately). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
You are asking for a world of hurt. You may luck out, and it may work great, thus saving you money. Take my example for example ... I took the safe approach (as far as any non-sun hardware is concerned.) I bought an officially supported dell server, with all dell blessed and solaris supported components, with support contracts on both the hardware and software, fully patched and updated on all fronts, and I am getting system failures approx once per week. I have support tickets open with both dell and oracle right now ... Have no idea how it's all going to turn out. But if you have a problem like mine, using unsupported hardware, you have no alternative. You're up a tree full of bees, naked, with a hunter on the ground trying to shoot you. And IMHO, I think the probability of having a problem like mine is higher when you use the unsupported hardware. But of course there's no definable way to quantify that belief. My advice to you is: buy the supported hardware, and the support contracts for both the hardware and software. But of course, that's all just a calculated risk, and I doubt you're going to take my advice. ;-) Are there any other feasible alternatives to Dell hardware? I'm wondering whether these issues are mostly related to Nehalem architectural problems, e.g. C-states. If so, is there anything to gain by switching hardware vendor? HP anyone? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
I have a Dell R710 which has been flaky for some time. It crashes about once per week. I have literally replaced every piece of hardware in it, and reinstalled Sol 10u9 fresh and clean. I am wondering if other people out there are using Dell hardware, with what degree of success, and in what configuration? The failure seems to be related to the perc 6i. For some period around the time of crash, the system still responds to ping, and anything currently in memory or running from remote storage continues to function fine. But new processes that require the local storage ... Such as inbound ssh etc, or even physical login at the console ... those are all hosed. And eventually the system stops responding to ping. As soon as the problem starts, the only recourse is power cycle. I can't seem to reproduce the problem reliably, but it does happen regularly. Yesterday it happened several times in one day, but sometimes it will go 2 weeks without a problem. Again, just wondering what other people are using, and experiencing. To see if any more clues can be found to identify the cause. Hi, we've been running opensolaris on Dell R710s with mixed results; some work better than others, and we've been struggling with the same issue as you with the latest servers. I suspect some kind of power-saving issue gone wrong: the system disks go to sleep and never wake up, or something similar. Personally, I cannot recommend using them with Solaris; support is not even close to what it should be. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
How consistent are your problems? If you change something and things get better or worse, will you be able to notice? Right now, I think I have improved matters by changing the Perc to WriteThrough instead of WriteBack. Yesterday the system crashed several times before I changed that, and afterward, I can't get it to crash at all. But as I said before ... Sometimes the system goes 2 weeks without a problem. Do you have all your disks configured as individual disks? Do you have any SSD? WriteBack or WriteThrough? I believe the issues are not related to the PERC, as we use a SAS 6/iR with the system disks, and the disks show up as individual disks. The system has been crashing with and without (I/O) load; so far it's been running best with all extra PCIe cards removed (10Gbps NIC, SAS 5/E controllers), uptime almost two days. There's no apparent trigger for the crash; it crashed very frequently during one day and now it seems more stable. (sunspots anyone?) We had SSDs at the start, but removed them during testing, no effect there. Somehow, all this is starting to remind me of the Broadcom NIC issues. A different (not fully supported) hardware revision causing issues? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dedup status
Hi! Hi all I just tested dedup on this test box running OpenIndiana (147) storing bacula backups, and did some more testing on some datasets with ISO images. The results so far show that removing 30GB deduped datasets is done in a matter of minutes, which is not the case with 134 (which may take hours). The tests also show that the write speed to the pool is low, very low, if dedup is enabled. This is a box with a 3GHz core2duo, 8 gigs of RAM, eight 2TB drives and an 80GB x25-m for the SLOG (4 gigs) and L2ARC (the rest of it). So far I conclude that dedup should be useful if storage capacity is crucial, but not if performance is taken into consideration. Mind, this is not a high-end box, but still, I think the numbers show something. Hi, it is probably because you have quite a low amount of RAM. I have a similar setup, a 10TB dataset that can handle 100MB/s writes easily; the system has 24GB of RAM. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pools inside pools
Yes. But what is enough reserved free memory? If you need 1Mb for a normal configuration you might need 2Mb when you are doing ZFS on ZFS. (I am just guessing). This is the same problem as mounting an NFS server on itself via NFS. Also not supported. The system has shrinkable caches and so on, but that space will sometimes run out. All of it. There is also swap to use, but if that is on ZFS... These things are also very hard to test. I was able to watch opensolaris snv_134 become unresponsive due to lack of memory with a nested pool configuration today. It took around 12 hours of issuing writes at around 1.2-1.5GB/s on a system that had 48GB of RAM. Anyway, setting zfs_arc_max in /etc/system seemed to do the trick; it seems to behave as expected even under heavier load. Performance is actually pretty good. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
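For reference, capping the ARC as described above is normally done with a line like the following in /etc/system, followed by a reboot (the 16 GiB value here is only an illustrative figure, not the one used on that box):
set zfs:zfs_arc_max = 0x400000000
0x400000000 is 16 GiB; pick a value that leaves enough memory for the inner pool's own caching and for applications.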
Re: [zfs-discuss] dedup testing?
On Sat, Sep 25, 2010 at 10:19 AM, Piotr Jasiukajtis est...@gmail.com wrote: AFAIK that part of dedup code is not changed in b147. I think I remember seeing that there was a change made in 142 that helps, though I'm not sure to what extent. -B OI seemed to behave much better than 134 in a low disk space situation with dedup turned on, after the server crashed during a (terabytes of data) snapshot destroy. The import took some time, but it did not block IO, and the most time consuming part was mounting datasets; already mounted datasets could be used during the import too. Also, performance is a lot better. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pools inside pools
Isn't this a matter of not keeping enough free memory as a workspace? By free memory, I am referring to unallocated memory and also recoverable main memory used for shrinkable read caches (shrinkable by discarding cached data). If the system keeps enough free and recoverable memory around for workspace, why should the deadlock case ever arise? Slowness and page swapping might be expected to arise (as a result of a shrinking read cache and high memory pressure), but deadlocks too? It sounds like deadlocks from the described scenario indicate the memory allocation and caching algorithms do not perform gracefully in the face of high memory pressure. If the deadlocks do not occur when different memory pools are involved (by using a second computer), that tells me that memory allocation decisions are playing a role. Additional data should not be accepted for writes when the system determines memory pressure is so high that it may not be able to flush everything to disk. Here is one article about memory pressure (on Windows, but the issues apply cross-OS): http://blogs.msdn.com/b/slavao/archive/2005/02/01/364523.aspx (How does virtualization fit into this picture? If both OpenSolaris systems are actually running inside of different virtual machines, on top of the same host, have we isolated them enough to allow pools inside pools without risk of deadlocks? ) I haven't noticed any deadlock issues so far in low memory conditions when doing nested pools (in a replicated configuration), at least in snv_134. Maybe I haven't tried hard enough; anyway, wouldn't a log device in the inner pool help in this situation? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pools inside pools
What is an example of where a checksummed outside pool would not be able to protect a non-checksummed inside pool? Would an intermittent RAM/motherboard/CPU failure that only corrupted the inner pool's block before it was passed to the outer pool (and did not corrupt the outer pool's block) be a valid example? If checksums are desirable in this scenario, then redundancy would also be needed to recover from checksum failures. That is an excellent point also: what is the point of checksumming if you cannot recover from it? In this kind of configuration one would benefit performance-wise from not having to calculate checksums again. Checksums in the outer pool effectively protect from disk issues; if hardware fails so that data is corrupted, isn't the outer pool's redundancy going to handle it for the inner pool as well? The only thing that comes to mind is that IF something happens to the outer pool, the inner pool is no longer aware of possibly broken data, which can lead to issues. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Pools inside pools
Hi, I'm asking for opinions here: any possible disasters or performance issues related to the setup described below? The point is to create a large pool and smaller pools within it, where you can easily monitor iops and bandwidth usage without using dtrace or similar techniques.
1. Create a pool:
# zpool create testpool mirror c1t1d0 c1t2d0
2. Create a volume inside the pool we just created:
# zfs create -V 500g testpool/testvolume
3. Create a pool from the volume we just made:
# zpool create anotherpool /dev/zvol/dsk/testpool/testvolume
After this, anotherpool can be monitored nicely via zpool iostat, and compression can be used in testpool to save resources without compression having an effect in anotherpool. zpool export/import seems to work, although the -d flag needs to be used. Are there any caveats in this setup? How are writes handled? Is it safe to create a pool consisting of several SSDs and use volumes from it as log devices? Is it even supported? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
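As a sketch of the export/import step mentioned above (same hypothetical pool names as in the recipe): because the inner pool lives on a zvol rather than on devices under /dev/dsk, the import has to be pointed at the zvol directory with -d:
# zpool export anotherpool
# zpool import -d /dev/zvol/dsk/testpool anotherpool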
Re: [zfs-discuss] Pools inside pools
Such configuration was known to cause deadlocks. Even if it works now (which I don't expect to be the case) it will make your data be cached twice. The CPU utilization will also be much higher, etc. All in all I strongly recommend against such setup. -- Pawel Jakub Dawidek http://www.wheelsystems.com p...@freebsd.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! Well, CPU utilization can be tuned down by disabling checksums in the inner pools, as checksumming is done in the main pool. I'd be interested in bug IDs for the deadlock issues and anything related. Caching twice is not an issue; prefetching could be, and it can be disabled. I don't understand what makes it difficult for zfs to handle this kind of setup. The main pool (testpool) should just allow any writes/reads to/from the volume, not caring what they are or where they go, whereas anotherpool would just work like any other pool consisting of any other devices. This is quite a similar setup to an iscsi-replicated mirror pool, where you have a redundant pool created from iscsi volumes locally and remotely. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pools inside pools
Actually, the mechanics of local pools inside pools is significantly different than using remote volumes (potentially exported ZFS volumes) to build a local pool from. I don't see how; I'm referring to a method where hostA shares a local iscsi volume to hostB, where the volume is mirrored with zfs against its local volume that is shared back through iscsi, resulting in a synchronously mirrored pool. And, no, you WOULDN'T want to turn off the inside pool's checksums. You're assuming that this would be taken care of by the outside pool, but that's a faulty assumption, since the only way this would happen would be if the pools somehow understood they were being nested, and thus could bypass much of the caching and I/O infrastructure related to the inner pool. Good point. Checksums it is then. Caching is also a huge issue, since ZFS isn't known for being memory-slim, and as caching is done (currently) on a per-pool level, nested pools will consume significantly more RAM. Without caching the inner pool, performance is going to suck (even if some blocks are cached in the outer pool, that pool has no way to do look-ahead, nor other actions). The nature of delayed writes can also wreak havoc with caching at both pool levels. Well, again, I don't see how a nested pool would consume more RAM than an individual pool created from dedicated disks. Read caching takes place twice, but I don't see that as much of a problem nowadays, just double the RAM (of course, depending on workload). Look-ahead (prefetch?) hasn't worked very well anyway, so it's going to be disabled; the cache hit rate isn't great (worth it) on any workload. Also, write caching needs to be benchmarked, but I'd say, if it works like it should, there are no issues there; I'll have to test it thoroughly though. Stupid filesystems have no issues with nesting, as they're not doing anything besides (essentially) direct I/O to the underlying devices. UFS doesn't have its own I/O subsystem, nor do things like ext* or xfs. However, I've yet to see any modern filesystem do well with nesting itself - there's simply too much going on under the hood, and without being nested-aware (i.e. specifically coding the filesystem to understand when it's being nested), much of these backend optimizations are a recipe for conflict. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Thanks for your thoughts; if the issues are performance related, they can be dealt with to some extent. I'm more worried about whether there are still deadlock issues or other general stability issues to consider; I haven't found anything useful in the bug database yet though. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pools inside pools
If you write to a zvol on a different host (via iSCSI) those writes use memory in a different memory pool (on the other computer). No deadlock. I would expect in a usual configuration that one side of a mirrored iSCSI-based pool would be on the same host as its underlying zvol's pool. That's what I was after. Would using a log device in the inner pool make things different then? The presumed workload is e.g. serving NFS. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] resilver that never finishes
Hi, The drives and the chassis are fine; what I am questioning is how it can be resilvering more data to a device than the capacity of the device? If data on the pool has changed during the resilver, the resilver counter will not update accordingly, and it will show resilvering at 100% for as long as it needs to catch up. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] performance leakage when copy huge data
On Sep 9, 2010, at 8:27 AM, Fei Xu twinse...@hotmail.com wrote: This might be the dreaded WD TLER issue. Basically the drive keeps retrying a read operation over and over after a bit error, trying to recover from the read error itself. With ZFS one really needs to disable this and have the drives fail immediately. Check your drives to see if they have this feature; if so, think about replacing the drives in the source pool that have long service times, and make sure this feature is disabled on the destination pool drives. -Ross It might be due to TLER issues, but I'd try to pin the Greens down to SATA1 mode (use a jumper, or force it via the controller). It might help a bit with these disks, although they are not really suitable disks for use in any raid configuration because of the TLER issue, which cannot be disabled in later firmware versions. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dedup status
Hi, it's getting better; I believe it's no longer single threaded after 135 (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6922161), but we're still waiting for a major bug fix: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924824 It should be fixed before release AFAIK. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
- Poweroff with USB drive connected or removed: Solaris will not boot unless the USB drive is connected, and in some cases it needs to be attached to the exact same USB port it was last attached to. Is this a bug? Possibly hitting this? http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6923585 Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
Hi, It definitely seems like a hardware-related issue, as panics from common tools like format aren't to be expected. Anyhow, you might want to start by getting all your disks to show up in iostat / cfgadm before trying to import the pool. You should replace the controller if you have not already done so, and the RAM is all OK, I guess? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
... I have identified the culprit is the Western Digital drive WD2002FYPS-01U1B0. It's not clear if they can fix it in firmware, but Western Digital is replacing my drives.
Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /p...@0,0/pci15ad,7...@15/pci1000,3...@0 (mpt_sas0):
Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Hi, do you have the disks connected as SATA1 or SATA2? With WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0 these timeouts are to be expected if the disk is in SATA2 mode; we got rid of them after forcing the disks into SATA1 mode with jumpers, and now they only appear when a disk is having real issues and needs to be replaced. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
No, why are they to be expected with SATA2 mode? Is the defect specific to the SATA2 circuitry? I guess it could be a temporary workaround provided they would eventually fix the problem in firmware, but I'm getting new drives, so I guess I can't complain :-) Your new disks will probably do this too. I really don't know what is wrong with their flaky SATA2, but I'd be quite sure it would fix your issues. The performance drop is not even noticeable, so it's worth a try. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Are there (non-Sun/Oracle) vendors selling OpenSolaris/ZFS based NAS Hardware?
Seems like this issue only occurs when MSI-X interrupts are enabled for the BCM5709 chips, or am I reading it wrong? If I type 'echo ::interrupts | mdb -k' and isolate the network-related bits, I get the following output:
IRQ  Vect  IPL  Bus  Trg  Type   CPU  Share  APIC/INT#  ISR(s)
36   0x60  6    PCI  Lvl  Fixed  3    1      0x1/0x4    bnx_intr_1lvl
48   0x61  6    PCI  Lvl  Fixed  2    1      0x1/0x10   bnx_intr_1lvl
Does this imply that my system is not in a vulnerable configuration? Supposedly I'm losing some performance without MSI-X, but I'm not sure in which environments or workloads we would notice, since the load on this server is relatively low and the L2ARC serves data at greater than 100MB/s (wire speed) without stressing much of anything. The BIOS settings in our T610 are exactly as they arrived from Dell when we bought it over a year ago. Thoughts? --eric Unfortunately I see interrupt type Fixed on a system that suffers from the network issues with bnx. But yes, according to the Redhat material this has something to do with Nehalem C-states (power saving etc.) and/or MSI. If your system has been running for a year or so, I wouldn't expect this issue to come up; we have noted this issue mostly with R410/R710 machines manufactured in Q4/2009-Q1/2010 (different hw revisions?). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Are there (non-Sun/Oracle) vendors selling OpenSolaris/ZFS based NAS Hardware?
Install nexenta on a dell poweredge ? or one of these http://www.pogolinux.com/products/storage_director FYI: more recent PowerEdges (R410, R710, possibly blades too, those with integrated Broadcom chips) are not working very well with opensolaris due to Broadcom network issues: hang-ups, packet loss etc. And as opensolaris is not a supported OS, Dell is not interested in fixing these issues. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Are there (non-Sun/Oracle) vendors selling OpenSolaris/ZFS based NAS Hardware?
Our Dell T610 is and has been working just fine for the last year and a half, without a single network problem. Do you know if they're using the same integrated part? --eric Hi, as I should have mentioned, the integrated NICs that cause issues use the Broadcom BCM5709 chipset, and these connectivity issues have been quite widespread among linux people too. Redhat tries to fix this: http://kbase.redhat.com/faq/docs/DOC-26837 but I believe it's messed up in firmware somehow, as our tests show the 4.6.8-series firmware seems to be more stable. And as for workarounds, disabling MSI is bad if it creates latency for network/disk controllers, and disabling C-states on Nehalem processors is just stupid (no turbo, no power saving etc). Definitely a no-go for storage imo. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pool vdev imbalance - getting worse?
This system has since been upgraded, but the imbalance is getting worse:
# zpool iostat -v tank | grep raid
raidz2  3.60T  28.5G  166  41  6.97M  764K
raidz2  3.59T  33.3G  170  35  7.35M  709K
raidz2  3.60T  26.1G  173  35  7.36M  658K
raidz2  1.69T  1.93T  129  46  6.70M  610K
raidz2  2.25T  1.38T  124  54  5.77M  967K
(columns: alloc, free, read ops, write ops, read bandwidth, write bandwidth)
Is there any way to determine how this is happening? I may have to resort to destroying and recreating some large filesystems, but there's no way to determine which ones to target... -- Ian. Hi, if you have had faulted disks in some raidsets, that would explain the imbalance, as zfs avoids writing to them while they are in a faulted state. I've encountered similar imbalance, but that was due to later changes in the pool configuration, where vdevs were added after the first ones got too full. Anyway, this is an issue, as your writes will definitely get slower once the first raidsets get more full, as mine did; writes went from 1.2GB/s to 40-50KB/s, and freeing up some space made the problem go away (total pool usage was around 60%). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] snv_133 mpt0 freezing machine
-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bruno Sousa Sent: 5 March 2010 10:34 To: ZFS filesystem discussion list Subject: [zfs-discuss] snv_133 mpt0 freezing machine Hi all, Recently I got myself a new machine (Dell R710) with 1 internal Dell SAS 6/iR and 2 Sun HBAs (non-raid). From time to time this system just freezes, and I noticed that it always freezes after this message (shown in /var/adm/messages): scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@4/pci1028,1...@0 (mpt0): Does anyone have any tips on how to start tracing the problem? Best regards, Bruno I'm not sure about this issue, but I just have to say that Dell supplies the SAS 5/E controller, which is basically a Dell OEM'd LSI controller similar to the Sun HBA. These controllers seem to work well enough with the R710 (just be sure to downgrade the BIOS and NIC firmware to 1.1.4 and 4.x; more recent firmware causes network issues). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] snv_133 mpt0 freezing machine
-Original Message- From: Bruno Sousa [mailto:bso...@epinfante.com] Sent: 5 March 2010 13:04 To: Markus Kovero Cc: ZFS filesystem discussion list Subject: Re: [zfs-discuss] snv_133 mpt0 freezing machine Hi Markus, Thanks for your input. Regarding the broadcom fw, I already hit that issue and have downgraded it. However, for the Dell BIOS I couldn't find anything older than 1.2.6. Do you by any chance have the URL for getting BIOS 1.1.4 like you say? Bruno Hi, you can downgrade the BIOS and NIC firmware quite easily using the USC and a DVD downloadable from here (.001 and .002; use dd or copy to make an .iso out of them); http://support.dell.com/support/downloads/format.aspx?c=usl=ens=genSystemID=pwe_r710servicetag=os=WNETosl=endeviceid=16823libid=36dateid=-1typeid=-1formatid=-1catid=-1impid=-1typecnt=0vercnt=5releaseid=R236931 Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Disk controllers changing the names of disks
I am curious how admins are dealing with controllers like the Dell Perc 5 and 6 that can change the device name on a disk if a disk fails and the machine reboots. These controllers are not nicely behaved in that they happily fill in the device numbers for the physical drive that is missing. In that case, how can you recover the zpool that was on the disk? I understand if the pool was exported, you can then re-import it. However, what happens if the machine completely dies and you have no chance to export the pool? -- Terry -- You can still import it, although you might lose some in-flight data that was being written during the crash, and the import can take a while to finish transactions; anyway, it will be fine. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
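A sketch of that recovery path (pool name is hypothetical): ZFS identifies pool members by their on-disk labels, so the renamed devices are found by scanning rather than by the old device paths:
# zpool import          # scan attached devices and list importable pools
# zpool import -f tank  # -f because the pool was never exported on the dead machine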
Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance
No one has said if they're using dsk, rdsk, or file-backed COMSTAR LUNs yet. I'm using file-backed COMSTAR LUNs, with ZIL currently disabled. I can get between 100-200MB/sec, depending on random/sequential and block sizes. Using dsk/rdsk, I was not able to see that level of performance at all. -- Brent Jones br...@servuhome.net Hi, I find COMSTAR performance very low when using zvols under dsk; somehow using them under rdsk and letting COMSTAR handle the caching makes performance really good (disks/NICs become the limiting factor). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
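For reference, the rdsk-backed variant described above would be set up roughly like this (pool and volume names are made up; the only difference from the dsk case is the /dev/zvol/rdsk path):
# zfs create -V 100g tank/lun0
# sbdadm create-lu /dev/zvol/rdsk/tank/lun0
# stmfadm add-view 600144f0...   # use the LU GUID printed by sbdadm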
Re: [zfs-discuss] Reading ZFS config for an extended period
The other thing I've noticed with all of the "destroyed a large dataset with dedup enabled and it's taking forever to import/destroy/<insert function here>" questions is that the process runs so so so much faster with 8+ GiB of RAM. Almost to a man, everyone who reports these 3, 4, or more day destroys has less than 8 GiB of RAM on the storage server. I've witnessed destroys that take several days on systems with 24GB+ of RAM (dataset over 30TB). I guess it's just a matter of dataset size vs. how much RAM. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] I/O Read starvation
Hi, it seems you might have some kind of hardware issue there; I have no way of reproducing this. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of bank kus Sent: 10 January 2010 7:21 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] I/O Read starvation Btw FWIW if I redo the dd + 2 cp experiment on /tmp the result is far more disastrous. The GUI stops moving, caps lock stops responding for large intervals, no clue why. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Clearing a directory with more than 60 million files
Hi, while not providing a complete solution, I'd suggest turning atime off so that find/rm does not update access times, and possibly destroying unnecessary snapshots before removing the files; that should be quicker. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Mikko Lammi Sent: 5 January 2010 12:35 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] Clearing a directory with more than 60 million files Hello, As a result of one badly designed application running loose for some time, we now seem to have over 60 million files in one directory. Good thing about ZFS is that it allows it without any issues. Unfortunately, now that we need to get rid of them (because they eat 80% of disk space) it seems to be quite challenging. Traditional approaches like find ./ -exec rm {} \; seem to take forever - after running several days, the directory size still stays the same. The only way I've been able to remove anything has been by running rm -rf on the problematic directory from the parent level. Running this command shows the directory size decreasing by 10,000 files/hour, but this would still mean close to ten months (over 250 days) to delete everything! I also tried to use the unlink command on the directory as root, as the user who created the directory, and by changing the directory's owner to root and so forth, but all attempts gave a Not owner error. Any commands like ls -f or find will run for hours (or days) without actually listing anything from the directory, so I'm beginning to suspect that maybe the directory's data structure is somewhat damaged. Is there some diagnostic that I can run with e.g. zdb to investigate and hopefully fix a single directory within a zfs dataset? To make things even more difficult, this directory is located in the root fs, so dropping the zfs filesystem would basically mean reinstalling the entire system, which is something that we really wouldn't wish to do. OS is Solaris 10, zpool version is 10 (rather old, I know, but is there an easy upgrade path that might solve this problem?) and the zpool consists of two 146 GB SAS drives in a mirror setup. Any help would be appreciated. Thanks, Mikko -- Mikko Lammi | l...@lmmz.net | http://www.lmmz.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
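A sketch of the atime suggestion above (the dataset name is hypothetical; this only stops the deletes from generating extra metadata writes for access-time updates, it does not speed up the unlinks themselves):
# zfs set atime=off rpool   # or whichever dataset holds the directory
# zfs get atime rpool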
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
If the pool isn't rpool, you might want to boot into single-user mode (add -s to the kernel parameters at boot), remove /etc/zfs/zpool.cache and then reboot. After that you can simply ssh into the box and watch iostat while the pool imports. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
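Roughly, the sequence being suggested (pool name is hypothetical): without zpool.cache the pool is no longer imported automatically at boot, so the system comes up cleanly and the long import can be started and watched by hand:
# rm /etc/zfs/zpool.cache
# reboot
# ... after the box is up again ...
# zpool import tank &
# iostat -xn 5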
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
Hey Markus, Thanks for the suggestion, but as stated in the thread, I am booting using -s -kv -m verbose, and deleting the cache file was one of the first troubleshooting steps we and the others affected did. The other problem is that we were all starting an iostat at the console and ssh'ing in during multiuser mode and starting the import, but the eventual hang starts hanging iostat as well and kills the ssh. Seems like this issue is affecting more users than just me, judging from this and the other threads I've been watching. Update on the other stuff: this is day 3 of my import and still no joy. Thanks, ~Bryan Oh, my bad, I didn't go through the thread that closely. Anyway, it seems a bit odd that it's blocking I/O completely; have you tried reading from the pool's member disks with dd before the import and checking the iostat error counters for hw/transport errors? Did you try a different set of RAM or another server? Faulty RAM could do this as well. And is your swap device okay? If it happens to swap during the import of the faulty pool/device, that might cause interesting behavior as well. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
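A sketch of the two checks suggested above (the device name is made up): the dd read exercises the raw disk outside of ZFS, and iostat -En shows per-device hard/soft/transport error counters:
# dd if=/dev/rdsk/c1t2d0s0 of=/dev/null bs=1024k count=1000
# iostat -En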
Re: [zfs-discuss] ZFS write bursts cause short app stalls
Hi, try adding a flow for the traffic you want prioritized; I noticed that opensolaris tends to drop network connectivity without priority flows defined, and I believe this is a feature introduced by Crossbow itself. flowadm is your friend, that is. I found this particularly annoying if you monitor servers with icmp ping, since high load causes the checks to fail and therefore triggers unnecessary alarms. Yours Markus Kovero
-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Saso Kiselkov Sent: 28 December 2009 15:25 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS write bursts cause short app stalls
I progressed with testing a bit further and found that I was hitting another scheduling bottleneck - the network. While the write burst was running and ZFS was committing data to disk, the server was dropping incoming UDP packets (netstat -s | grep udpInOverflows grew by about 1000-2000 packets during every write burst). To work around that I had to boost the scheduling priority of recorder processes to the real-time class, and I also had to lower zfs_txg_timeout=1 (there was still minor packet drop after just doing priocntl on the processes) to even out the CPU load. Any ideas on why ZFS should completely thrash the network layer and make it drop incoming packets? Regards, - -- Saso
Robert Milkowski wrote: On 26/12/2009 12:22, Saso Kiselkov wrote: Thank you, the post you mentioned helped me move a bit forward. I tried putting: zfs:zfs_txg_timeout = 1 btw: you can tune it on a live system without a need to do reboots.
mi...@r600:~# echo zfs_txg_timeout/D | mdb -k
zfs_txg_timeout: 30
mi...@r600:~# echo zfs_txg_timeout/W0t1 | mdb -kw
zfs_txg_timeout: 0x1e = 0x1
mi...@r600:~# echo zfs_txg_timeout/D | mdb -k
zfs_txg_timeout: 1
mi...@r600:~# echo zfs_txg_timeout/W0t30 | mdb -kw
zfs_txg_timeout: 0x1 = 0x1e
mi...@r600:~# echo zfs_txg_timeout/D | mdb -k
zfs_txg_timeout: 30
mi...@r600:~#
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
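A minimal sketch of the flowadm suggestion above (the interface name, UDP port and flow name are all made up for illustration): create a flow for the incoming UDP stream and raise its priority so it is not starved during the write bursts:
# flowadm add-flow -l e1000g0 -a transport=udp,local_port=5004 udpflow
# flowadm set-flowprop -p priority=high udpflow
# flowadm show-flow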
Re: [zfs-discuss] Troubleshooting dedup performance
Hi, I threw 24GB of RAM and a couple of the latest Nehalems at it, and dedup=on seemed to cripple performance without actually using much CPU or RAM. It's quite unusable like this. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] snv_129 dedup panic
Hi, I encountered panic and spontaneous reboot after canceling zfs send from another server. It took around 2-3hrs to remove 2TB data server had sent and then: Dec 15 16:54:05 foo ^Mpanic[cpu2]/thread=ff0916724560: Dec 15 16:54:05 foo genunix: [ID 683410 kern.notice] BAD TRAP: type=0 (#de Divide error) rp=ff003db82910 addr=ff003db82a10 Dec 15 16:54:05 foo unix: [ID 10 kern.notice] Dec 15 16:54:05 foo unix: [ID 839527 kern.notice] zpool: Dec 15 16:54:05 foo unix: [ID 753105 kern.notice] #de Divide error Dec 15 16:54:05 foo unix: [ID 358286 kern.notice] addr=0xff003db82a10 Dec 15 16:54:05 foo unix: [ID 243837 kern.notice] pid=15520, pc=0xf794310a, sp=0xff003db82a00, eflags=0x10246 Dec 15 16:54:05 foo unix: [ID 211416 kern.notice] cr0: 80050033pg,wp,ne,et,mp,pe cr4: 6f8xmme,fxsr,pge,mce,pae,pse,de Dec 15 16:54:05 foo unix: [ID 624947 kern.notice] cr2: 80a7000 Dec 15 16:54:05 foo unix: [ID 625075 kern.notice] cr3: 4721dc000 Dec 15 16:54:05 foo unix: [ID 625715 kern.notice] cr8: c Dec 15 16:54:05 foo unix: [ID 10 kern.notice] Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] rdi: ff129712b578 rsi: rdx:0 Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] rcx:1 r8:173724e00 r9:0 Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] rax:173724e00 rbx:8 rbp: ff003db82a90 Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] r10: afd231db9a85b86e r11: 3fc244aaa90 r12:0 Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] r13: ff12fed0e9d0 r14: ff092953d000 r15: ff003db82a10 Dec 15 16:54:05 foo unix: [ID 592667 kern.notice] fsb:0 gsb: ff09128e9000 ds: 4b Dec 15 16:54:05 foo unix: [ID 592667 kern.notice]es: 4b fs:0 gs: 1c3 Dec 15 16:54:06 foo unix: [ID 592667 kern.notice] trp:0 err:0 rip: f794310a Dec 15 16:54:06 foo unix: [ID 592667 kern.notice]cs: 30 rfl:10246 rsp: ff003db82a00 Dec 15 16:54:06 foo unix: [ID 266532 kern.notice]ss: 38 Dec 15 16:54:06 foo unix: [ID 10 kern.notice] Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db827f0 unix:die+10f () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82900 unix:trap+1558 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82910 unix:cmntrap+e6 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82a90 zfs:ddt_get_dedup_object_stats+152 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82b00 zfs:spa_config_generate+2d9 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82b90 zfs:spa_open_common+1c2 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82c00 zfs:spa_get_stats+50 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82c40 zfs:zfs_ioc_pool_stats+32 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82cc0 zfs:zfsdev_ioctl+175 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82d00 genunix:cdev_ioctl+45 () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82d40 specfs:spec_ioctl+5a () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82dc0 genunix:fop_ioctl+7b () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82ec0 genunix:ioctl+18e () Dec 15 16:54:06 foo genunix: [ID 655072 kern.notice] ff003db82f10 unix:brand_sys_syscall32+19d () Dec 15 16:54:06 foo unix: [ID 10 kern.notice] Dec 15 16:54:06 foo genunix: [ID 672855 kern.notice] syncing file systems... 
Dec 15 16:54:06 foo genunix: [ID 904073 kern.notice] done Dec 15 16:54:07 foo genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel Dec 15 16:55:07 foo genunix: [ID 10 kern.notice] Dec 15 16:55:07 foo genunix: [ID 665016 kern.notice] ^M 64% done: 1881224 pages dumped, Dec 15 16:55:07 foo genunix: [ID 495082 kern.notice] dump failed: error 28 Is it just me or everlasting Monday again. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Space not freed?
Hi, if someone running 129 could try this out: turn off compression in your pool, mkfile 10g /pool/file123, check the used space, then remove the file and see if the used space becomes available again. I'm having trouble with this; it reminds me of a similar bug that occurred in the 111 release. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
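Spelled out, the test being asked for looks roughly like this (pool name is hypothetical; give the pool a little while after the rm, since space from deleted files is reclaimed asynchronously):
# zfs set compression=off tank
# mkfile 10g /tank/file123
# zfs list tank
# rm /tank/file123
# sync; sleep 60
# zfs list tank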
Re: [zfs-discuss] Space not freed?
Hi, if someone running 129 could try this out, turn off compression in your pool, mkfile 10g /pool/file123, see used space and then remove the file and see if it makes used space available again. I'm having trouble with this, reminds me of similar bug that occurred in 111-release. Any automatically created snapshots, perhaps? Casper Nope, no snapshots. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
How can you set up these values in FMA? Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of R.G. Keen Sent: 14 December 2009 20:14 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL FMA (not ZFS, directly) looks for a number of failures over a period of time. By default that is 10 failures in 10 minutes. If you have an error that trips on TLER, the best it can see is 2-3 failures in 10 minutes. The symptom you will see is that when these long timeouts happen, they take a long time because, by default, the drive will be reset and the I/O retried after 60 seconds. That's very good news. I'm trying to get the stuff together to set up my zfs server, and I'm also perfectly willing to trade slower operation and more disks to get zfs' scrubbing and other operations. The recent discovery that WD has decided to up its prices in a back-door manner by making sure that the DIY RAID folks can't modify TLER on cheaper drives was a real slap in the face, potentially more than doubling the price of storage. I've dealt with the MBA mentality before, and I don't like it. :-| This discovery was bad enough to almost put me off building a server entirely, with the apparent options of paying 100% more for the disks or having the array suffer 100% data loss on any significant read/write error. So let me be sure I understand. If I'm using solaris/zfs, I can use FMA to set the level of retries/time to be waited if I get a disk error before taking a disk out of the array. Is that correct? If it is, and that can be set to allow an array of disks to tolerate most instances of read/write errors without corrupting an entire array, then I'm back on with the server scheme. The whole point of going to solaris/zfs is background scrubbing for me. I'm willing for it to be slow - however slow it is, it's much faster than finding the backup DVDs in the closet, pilfering through them to find the right one, then finding out the DVD set has bit-rot too. I apologize for the baby-simple questions. I'm reading documentation as hard as I can, but there's a world of difference between reading documentation and understanding and using the tools described. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
Hi, are you sure zfs isn't just working through transactions after the forcibly stopped zfs destroy? Sometimes (always, it seems) zfs/zpool commands just hang if you destroy larger filesets; in reality zfs is just doing its job, and if you reboot the server during a dataset destroy it will take some time to come up. So how long have you waited? Have you tried removing /etc/zfs/zpool.cache, booting into snv_128, doing the import, and watching the disks with iostat to see whether there is any activity? Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jack Kielsmeier Sent: 8. joulukuuta 2009 6:08 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled Howdy, I upgraded to snv_128a from snv_125. I wanted to do some de-dup testing :). I have two zfs pools: rpool and vault. I upgraded my vault zpool version and turned on dedup on datastore vault/shared_storage. I also turned on gzip compression on this dataset as well. Before I turned on dedup, I made a new datastore and copied all data to vault/shared_storage_temp (just in case something crazy happened to my dedup'd datastore, since dedup is new). I removed all data on my dedup'd datastore and copied all data from my temp datastore. After I realized my space savings wasn't going to be that great, I decided to delete the vault/shared_storage dataset. zfs destroy vault/shared_storage This hung, and couldn't be killed. I force rebooted my system, and I couldn't boot into Solaris. It hung at reading zfs config. I then booted into single user mode (multiple times) and any zfs or zpool commands froze. I then rebooted to my snv_125 environment. As it should, it ignored my vault zpool, as its version is higher than it can understand. I forced a zpool export of vault and rebooted; I could then boot back into snv_128 and zpool import listed the vault pool. However, I cannot import via name or identifier, the command hangs, as well as any additional zfs or zpool commands. I cannot kill or kill -9 the processes. Is there anything I can do to get my pool imported? I haven't done much troubleshooting at all on OpenSolaris; I'd be happy to run any suggested commands and provide output. Thank you for the assistance. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
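A rough sketch of the steps suggested above (this assumes you can still reach a shell, e.g. in single-user mode, and that the import may run for a long time while the interrupted destroy is replayed):

# keep the pool from being auto-imported at boot
mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak
reboot

# after booting into snv_128, start the import and leave it alone
zpool import vault

# in another terminal, confirm the disks are actually busy
iostat -xen 5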
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
From what I've noticed, if one destroys dataset that is say 50-70TB and reboots before destroy is finished, it can take up to several _days_ before it's back up again. So, nowadays I'm doing rm -fr BEFORE issuing zfs destroy whenever possible. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Michael Herf Sent: 9. joulukuuta 2009 9:38 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled Am in the same boat, exactly. Destroyed a large set and rebooted, with a scrub running on the same pool. My reboot stuck on Reading ZFS Config: * for several hours (disks were active). I cleared the zpool.cache from single-user and am doing an import (can boot again). I wasn't able to boot my 123 build (kernel panic), even though my rpool is an older version. zpool import is pegging all 4 disks in my RAIDZ-1. Can't touch zpool/zfs commands during the import or they hang...but regular iostat is ok for watching what's going on. I didn't limit ARC memory (box has 6GB), we'll see if that's ok. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
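A minimal sketch of that workaround (dataset name is a placeholder; this only helps if no snapshots still reference the data):

# remove the file data first; rm can be interrupted and resumed safely
rm -rf /tank/bigdataset/*

# the destroy itself then has far less work left to do
zfs destroy tank/bigdataset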
Re: [zfs-discuss] mpt errors on snv 127
We actually tried this, although using sol10-version of mpt-driver. Surprisingly it didn't work :-) Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Mark Johnson Sent: 1. joulukuuta 2009 15:57 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] mpt errors on snv 127 Mark Johnson wrote: Chad Cantwell wrote: Hi, I was using for quite awhile OpenSolaris 2009.06 with the opensolaris-provided mpt driver to operate a zfs raidz2 pool of about ~20T and this worked perfectly fine (no issues or device errors logged for several months, no hanging). A few days ago I decided to reinstall with the latest OpenSolaris in order to take advantage of raidz3. Just to be clear... The same setup was working fine on osol2009.06, you upgraded to b127 and it started failing? Did you keep the osol2009.06 be around so you can reboot back to it? If so, have you tried the osol2009.06 mpt driver in the BE with the latest bits (make sure you make a backup copy of the mpt driver)? What's the earliest build someone has seen this problem? i.e. if we binary chop, has anyone seen it in b118? I have no idea if the old mpt drivers will work on a new kernel... But if someone wants to try... Something like the following should work... # first, I would work out of a test BE in case you # mess something up. beadm create test-be beadm activate test-be reboot # assuming your lasted BE is call snv127, mount it and backup # the stock mpt driver and conf file. beadm mount snv127 /mnt cp /mnt/kernel/drv/mpt.conf /mnt/kernel/drv/mpt.conf.orig cp /mnt/kernel/drv/amd64/mpt /mnt/kernel/drv/amd64/mpt.orig # see what builds are out there... pkg search /kernel/drv/amd64/mpt # There's probably an easier way to do this... # grab an older mpt. This will take a while since it's # not in it's own package and ckr has some dependencies # so it will pull in a bunch of other packages. # change out 118 with the build you want to grab. mkdir /tmp/mpt pkg image-create -f -F -a opensolaris.org=http://pkg.opensolaris.org/dev /tmp/mpt pkg -R /tmp/mpt/ install sunw...@0.5.11-0.118 cp /tmp/mpt/kernel/drv/mpt.conf /mnt/kernel/drv/mpt.conf cp /tmp/mpt/kernel/drv/amd64/mpt /mnt/kernel/drv/amd64/mpt rm -rf /tmp/mpt/ bootadm update-archive -R /mnt MRJ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Hi, you could try LSI itmpt driver as well, it seems to handle this better, although I think it only supports 8 devices at once or so. You could also try more recent version of opensolaris (123 or even 126), as there seems to be a lot fixes regarding mpt-driver (which still seems to have issues). Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of M P Sent: 11. marraskuuta 2009 18:08 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding Server using [b]Sun StorageTek 8-port external SAS PCIe HBA [/b](mpt driver) connected to external JBOD array with 12 disks. Here is link to the exact SAS (Sun) adapter: http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf (LSI SAS3801) When running IO intensive operations (zpool scrub) for couple of hours, the server locks with the following repeating messages: Nov 10 16:31:45 sunserver scsi: [ID 365881 kern.info] /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:31:45 sunserver Log info 0x3114 received for target 17. Nov 10 16:31:45 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:32:55 sunserver Disconnected command timeout for Target 19 Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:32:56 sunserver Log info 0x3114 received for target 19. Nov 10 16:32:56 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:34:16 sunserver Disconnected command timeout for Target 21 I tested this on two servers: - [b]Sun Fire X2200[/b] using [b]Sun Storage J4200 JBOD[/b] array and - [b]Dell R410 Server[/b] with [b]Promise VTJ-310SS JBOD array[/b] They both are showing the same repeating messages and locking after couple of hours of zpool scrub. Solaris appears to be more stable (than OpenSolaris) - it doesn't lock when scrubbing, but still locks after 5-6 hours reading from the JBOD array - 10TB size. So at this point this looks like an issue with the MPT driver or these SAS cards (I tested two) when under heavy load. I put the latest firmware for the SAS card from LSI's web site - v1.29.00 without any changes, server still locks. Any ideas, suggestions how to fix or workaround this issue? The adapter is suppose to be enterprise-class. Here is more detailed log info: Sun Fire X2200 and Sun Storage J4200 JBOD array SAS card: Sun StorageTek 8-port external SAS PCIe HBA http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf (LSI SAS3801) Operation System: SunOS sunserver 5.11 snv_111b i86pc i386 i86pc Solaris Nov 10 16:30:33 sunserver scsi: [ID 365881 kern.info] /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:30:33 sunserver Log info 0x3114 received for target 0. 
Nov 10 16:30:33 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc Nov 10 16:31:43 sunserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:31:43 sunserver Disconnected command timeout for Target 17 Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:32:55 sunserver Disconnected command timeout for Target 19 Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:32:56 sunserver Log info 0x3114 received for target 19. Nov 10 16:32:56 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0): Nov 10 16:34:16 sunserver Disconnected command timeout for Target 21 Dell R410 Server and Promise VTJ-310SS JBOD array SAS card: Sun StorageTek 8-port external SAS PCIe HBA Operating System: SunOS dellserver 5.10 Generic_141445-09 i86pc i386 i86pc Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@3/pci1028,1...@0 (mpt0): Nov 11 00:18:22 dellserver Disconnected command timeout for Target 0 Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@3/pci1028,1...@0/s...@0,0 (sd13): Nov 11 00:18:22 dellserver Error for Command: read(10) Error Level: Retryable Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Requested Block: 276886498 Error Block: 276886498 Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Vendor: Dell Serial Number: Dell Interna
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Have you tried another SAS-cable? Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of M P Sent: 11. marraskuuta 2009 21:05 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding I already changed some of the drives, no difference. The target drive seem to have random character - most likely not from the drives. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SNV_125 MPT warning in logfile
How do you estimate needed queue depth if one has say 64 to 128 disks sitting behind LSI? Is it bad idea having queuedepth 1? Yours Markus Kovero Lähettäjä: zfs-discuss-boun...@opensolaris.org [zfs-discuss-boun...@opensolaris.org] k#228;ytt#228;j#228;n Richard Elling [richard.ell...@gmail.com] puolesta Lähetetty: 24. lokakuuta 2009 7:36 Vastaanottaja: Adam Cheal Kopio: zfs-discuss@opensolaris.org Aihe: Re: [zfs-discuss] SNV_125 MPT warning in logfile ok, see below... On Oct 23, 2009, at 8:14 PM, Adam Cheal wrote: Here is example of the pool config we use: # zpool status pool: pool002 state: ONLINE scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52 2009 config: NAME STATE READ WRITE CKSUM pool002 ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c9t18d0 ONLINE 0 0 0 c9t17d0 ONLINE 0 0 0 c9t55d0 ONLINE 0 0 0 c9t13d0 ONLINE 0 0 0 c9t15d0 ONLINE 0 0 0 c9t16d0 ONLINE 0 0 0 c9t11d0 ONLINE 0 0 0 c9t12d0 ONLINE 0 0 0 c9t14d0 ONLINE 0 0 0 c9t9d0 ONLINE 0 0 0 c9t8d0 ONLINE 0 0 0 c9t10d0 ONLINE 0 0 0 c9t29d0 ONLINE 0 0 0 c9t28d0 ONLINE 0 0 0 c9t27d0 ONLINE 0 0 0 c9t23d0 ONLINE 0 0 0 c9t25d0 ONLINE 0 0 0 c9t26d0 ONLINE 0 0 0 c9t21d0 ONLINE 0 0 0 c9t22d0 ONLINE 0 0 0 c9t24d0 ONLINE 0 0 0 c9t19d0 ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c9t30d0 ONLINE 0 0 0 c9t31d0 ONLINE 0 0 0 c9t32d0 ONLINE 0 0 0 c9t33d0 ONLINE 0 0 0 c9t34d0 ONLINE 0 0 0 c9t35d0 ONLINE 0 0 0 c9t36d0 ONLINE 0 0 0 c9t37d0 ONLINE 0 0 0 c9t38d0 ONLINE 0 0 0 c9t39d0 ONLINE 0 0 0 c9t40d0 ONLINE 0 0 0 c9t41d0 ONLINE 0 0 0 c9t42d0 ONLINE 0 0 0 c9t44d0 ONLINE 0 0 0 c9t45d0 ONLINE 0 0 0 c9t46d0 ONLINE 0 0 0 c9t47d0 ONLINE 0 0 0 c9t48d0 ONLINE 0 0 0 c9t49d0 ONLINE 0 0 0 c9t50d0 ONLINE 0 0 0 c9t51d0 ONLINE 0 0 0 c9t52d0 ONLINE 0 0 0 cache c8t2d0 ONLINE 0 0 0 c8t3d0 ONLINE 0 0 0 spares c9t20d0AVAIL c9t43d0AVAIL errors: No known data errors pool: rpool state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM rpool ONLINE 0 0 0 mirror ONLINE 0 0 0 c8t0d0s0 ONLINE 0 0 0 c8t1d0s0 ONLINE 0 0 0 errors: No known data errors ...and here is a snapshot of the system using iostat -indexC 5 during a scrub of pool002 (c8 is onboard AHCI controller, c9 is LSI SAS 3801E): extended device statistics errors --- r/sw/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device 0.00.00.00.0 0.0 0.00.00.0 0 0 0 0 0 0 c8 0.00.00.00.0 0.0 0.00.00.0 0 0 0 0 0 0 c8t0d0 0.00.00.00.0 0.0 0.00.00.0 0 0 0 0 0 0 c8t1d0 0.00.00.00.0 0.0 0.00.00.0 0 0 0 0 0 0 c8t2d0 0.00.00.00.0 0.0 0.00.00.0 0 0 0 0 0 0 c8t3d0 8738.70.0 555346.10.0 0.1 345.00.0 39.5 0 3875 0 1 1 2 c9 You see 345 entries in the active queue. If the controller rolls over at 511 active entries, then it would explain why it would soon begin to have difficulty. Meanwhile, it is providing 8,738 IOPS and 555 MB/sec, which is quite respectable. 194.80.0 11936.90.0 0.0 7.90.0 40.3 0 87 0 0 0 0 c9t8d0 These disks are doing almost 200 read IOPS, but are not 100% busy. Average I/O size is 66 KB, which is not bad, lots of little I/Os could be worse, but at only 11.9 MB/s, you are not near the media bandwidth. Average service time is 40.3 milliseconds, which
Re: [zfs-discuss] SNV_125 MPT warning in logfile
We actually hit similar issues with LSI, but within workload not scrub, result is same but it seems to choke on writes rather than reads with suboptimal performance. http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6891413 Anyway, we haven't experienced this _at all_ with RE3-version of Western Digital disks.. Issues seem to pop up with 750GB seagate and 1TB WD black-series, so far 2TB green WDs seem unaffected too, so might it be related to disks firmware due how they chat with LSI? Also, we noticed more severe (even RE3 and 2TBWD green) timeouts if disks are not forced into SATA1-mode, I believe this is known issue with newer 2TB disks and some other disk controllers and may be caused by bad cabling or connectivity. We have never witnessed this behaviour with SAS (fujitsu,ibm..) also. All this happens with snv 118,122,123 and 125. Yours Markus Kovero Lähettäjä: zfs-discuss-boun...@opensolaris.org [zfs-discuss-boun...@opensolaris.org] k#228;ytt#228;j#228;n Adam Cheal [ach...@pnimedia.com] puolesta Lähetetty: 24. lokakuuta 2009 12:49 Vastaanottaja: zfs-discuss@opensolaris.org Aihe: Re: [zfs-discuss] SNV_125 MPT warning in logfile The iostat I posted previously was from a system we had already tuned the zfs:zfs_vdev_max_pending depth down to 10 (as visible by the max of about 10 in actv per disk). I reset this value in /etc/system to 7, rebooted, and started a scrub. iostat output showed busier disks (%b is higher, which seemed odd) but a cap of about 7 queue items per disk, proving the tuning was effective. iostat at a high-water mark during the test looked like this: extended device statistics r/sw/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.00.00.00.0 0.0 0.00.00.0 0 0 c8 0.00.00.00.0 0.0 0.00.00.0 0 0 c8t0d0 0.00.00.00.0 0.0 0.00.00.0 0 0 c8t1d0 0.00.00.00.0 0.0 0.00.00.0 0 0 c8t2d0 0.00.00.00.0 0.0 0.00.00.0 0 0 c8t3d0 8344.50.0 359640.40.0 0.1 300.50.0 36.0 0 4362 c9 190.00.0 6800.40.0 0.0 6.60.0 34.8 0 99 c9t8d0 185.00.0 6917.10.0 0.0 6.10.0 32.9 0 94 c9t9d0 187.00.0 6640.90.0 0.0 6.50.0 34.6 0 98 c9t10d0 186.50.0 6543.40.0 0.0 7.00.0 37.5 0 100 c9t11d0 180.50.0 7203.10.0 0.0 6.70.0 37.2 0 100 c9t12d0 195.50.0 7352.40.0 0.0 7.00.0 35.8 0 100 c9t13d0 188.00.0 6884.90.0 0.0 6.60.0 35.2 0 99 c9t14d0 204.00.0 6990.10.0 0.0 7.00.0 34.3 0 100 c9t15d0 199.00.0 7336.70.0 0.0 7.00.0 35.2 0 100 c9t16d0 180.50.0 6837.90.0 0.0 7.00.0 38.8 0 100 c9t17d0 198.00.0 7668.90.0 0.0 7.00.0 35.3 0 100 c9t18d0 203.00.0 7983.20.0 0.0 7.00.0 34.5 0 100 c9t19d0 0.00.00.00.0 0.0 0.00.00.0 0 0 c9t20d0 195.50.0 7096.40.0 0.0 6.70.0 34.1 0 98 c9t21d0 189.50.0 7757.20.0 0.0 6.40.0 33.9 0 97 c9t22d0 195.50.0 7645.90.0 0.0 6.60.0 33.8 0 99 c9t23d0 194.50.0 7925.90.0 0.0 7.00.0 36.0 0 100 c9t24d0 188.50.0 6725.60.0 0.0 6.20.0 32.8 0 94 c9t25d0 188.50.0 7199.60.0 0.0 6.50.0 34.6 0 98 c9t26d0 196.00.0 .90.0 0.0 6.30.0 32.1 0 95 c9t27d0 193.50.0 7455.40.0 0.0 6.20.0 32.0 0 95 c9t28d0 189.00.0 7400.90.0 0.0 6.30.0 33.2 0 96 c9t29d0 182.50.0 9397.00.0 0.0 7.00.0 38.3 0 100 c9t30d0 192.50.0 9179.50.0 0.0 7.00.0 36.3 0 100 c9t31d0 189.50.0 9431.80.0 0.0 7.00.0 36.9 0 100 c9t32d0 187.50.0 9082.00.0 0.0 7.00.0 37.3 0 100 c9t33d0 188.50.0 9368.80.0 0.0 7.00.0 37.1 0 100 c9t34d0 180.50.0 9332.80.0 0.0 7.00.0 38.8 0 100 c9t35d0 183.00.0 9690.30.0 0.0 7.00.0 38.2 0 100 c9t36d0 186.00.0 9193.80.0 0.0 7.00.0 37.6 0 100 c9t37d0 180.50.0 8233.40.0 0.0 7.00.0 38.8 0 100 c9t38d0 175.50.0 9085.20.0 0.0 7.00.0 39.9 0 100 c9t39d0 177.00.0 9340.00.0 0.0 7.00.0 39.5 0 100 c9t40d0 175.50.0 8831.00.0 0.0 7.00.0 39.9 0 100 
c9t41d0 190.50.0 9177.80.0 0.0 7.00.0 36.7 0 100 c9t42d0 0.00.00.00.0 0.0 0.00.00.0 0 0 c9t43d0 196.00.0 9180.50.0 0.0 7.00.0 35.7 0 100 c9t44d0 193.50.0 9496.80.0 0.0 7.00.0 36.2 0 100 c9t45d0 187.00.0 8699.50.0 0.0 7.00.0 37.4 0 100 c9t46d0 198.50.0 9277.00.0 0.0 7.00.0 35.2 0 100 c9t47d0 185.50.0 9778.30.0 0.0 7.00.0
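For reference, a minimal sketch of how the zfs_vdev_max_pending tunable discussed above is typically set (7 is just the value used in this thread; the right number is workload- and controller-dependent):

# persistent, in /etc/system (takes effect at the next boot)
set zfs:zfs_vdev_max_pending = 7

# or live on a running system, without a reboot
echo "zfs_vdev_max_pending/W0t7" | mdb -kw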
[zfs-discuss] Numbered vdevs
Hi, I just noticed this on snv_125: is there an upcoming feature that allows the use of numbered vdevs, or what are these for? (raidz2-N)

  pool: tank
 state: ONLINE
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c8t40d0   ONLINE       0     0     0
            c8t36d0   ONLINE       0     0     0
            c8t38d0   ONLINE       0     0     0
            c8t39d0   ONLINE       0     0     0
            c8t41d0   ONLINE       0     0     0
            c8t42d0   ONLINE       0     0     0
            c8t43d0   ONLINE       0     0     0
          raidz2-1    ONLINE       0     0     0
            c8t44d0   ONLINE       0     0     0
            c8t45d0   ONLINE       0     0     0
            c8t46d0   ONLINE       0     0     0
            c8t47d0   ONLINE       0     0     0
            c8t48d0   ONLINE       0     0     0
            c8t49d0   ONLINE       0     0     0
            c8t50d0   ONLINE       0     0     0
          raidz2-2    ONLINE       0     0     0
            c8t51d0   ONLINE       0     0     0
            c8t86d0   ONLINE       0     0     0
            c8t87d0   ONLINE       0     0     0
            c8t149d0  ONLINE       0     0     0
            c8t91d0   ONLINE       0     0     0
            c8t94d0   ONLINE       0     0     0
            c8t95d0   ONLINE       0     0     0

Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Unusual latency issues
Hi, this may not be the correct mailing list for this, but I'd like to share it with you: I noticed weird network behavior with osol snv_123. ICMP to the host lags randomly between 500 ms and 5000 ms and ssh sessions seem to stall; I guess this could affect iSCSI/NFS as well. What was most interesting is that I found a workaround: running snoop with promiscuous mode disabled on the interfaces suffering the lag makes the interruptions go away. Is this some kind of CPU/IRQ scheduling issue? The behaviour was noticed on two different platforms and with two different NICs (bge and e1000). Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
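For reference, the workaround described above amounts to something like this (interface name is a placeholder; -P keeps snoop out of promiscuous mode):

# leave running in the background and discard the capture; only the side effect matters
snoop -P -d e1000g0 > /dev/null &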
[zfs-discuss] Migrate from iscsitgt to comstar?
Is it possible to migrate data from iscsitgt to a COMSTAR iSCSI target? I guess COMSTAR wants its metadata at the beginning of the volume, and this makes things difficult? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
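For reference, a hedged sketch of what exposing an existing zvol through COMSTAR looks like (pool/volume names and the LU GUID are placeholders, and the service name is the one used by builds with the COMSTAR iSCSI target). The metadata concern raised above applies to sbdadm create-lu, so test against a copy of the data rather than the live iscsitgt volume:

svcadm enable stmf
sbdadm create-lu /dev/zvol/rdsk/tank/iscsivol       # prints the LU GUID
stmfadm add-view <lu-guid>                          # expose to all hosts (or restrict with host groups)
svcadm enable -r svc:/network/iscsi/target:default
itadm create-target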
Re: [zfs-discuss] sync replication easy way?
Hi, I managed to test this out, it seems iscsitgt performance is suboptimal with this setup but somehow comstar maxes out gige easily, no performance issues there. Yours Markus Kovero -Original Message- From: Maurice Volaski [mailto:maurice.vola...@einstein.yu.edu] Sent: 11. syyskuuta 2009 20:40 To: Markus Kovero; zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] sync replication easy way? At 8:25 PM +0300 9/11/09, Markus Kovero wrote: I believe failover is best to be done manually just to be sure active node is really dead before importing it on another node, otherwise there could be serious issues I think. I believe there are many users of Linux-HA, aka heartbeat, who do failover automatically on Linux systems. You can configure a stonith device to shoot the other node in the head. I had heartbeat running on OpenSolaris, though I never tested failover. Did you get decent performance when you tested? -- Maurice Volaski, maurice.vola...@einstein.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAIDZ versus mirrroed
It's possible to do 3-way (or more) mirrors too, so you may achieve better redundancy than raidz2/3 Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Marty Scholes Sent: 16. syyskuuta 2009 19:38 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] RAIDZ versus mirrroed Generally speaking, striping mirrors will be faster than raidz or raidz2, but it will require a higher number of disks and therefore higher cost to The main reason to use raidz or raidz2 instead of striping mirrors would be to keep the cost down, or to get higher usable space out of a fixed number of drives. While it has been a while since I have done storage management for critical systems, the advantage I see with RAIDZN is better fault tolerance: any N drives may fail before the set goes critical. With straight mirroring, failure of the wrong two drives will invalidate the whole pool. The advantage of striped mirrors is that it offers a better chance of higher iops (assuming the I/O is distributed correctly). Also, it might be easier to expand a mirror by upgrading only two drives with larger drives. With RAID, the entire stripe of drives would need to be upgraded. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
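For reference, a quick sketch of both ways to end up with a 3-way mirror (device names are placeholders):

# build the pool as 3-way mirrors from the start
zpool create tank mirror c0t0d0 c0t1d0 c0t2d0

# or attach a third disk to an existing 2-way mirror
zpool attach tank c0t0d0 c0t2d0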
Re: [zfs-discuss] alternative hardware configurations for zfs
We've been using caviar black 1TB with disk configurations consisting 64 disks or more. They are working just fine. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Eugen Leitl Sent: 11. syyskuuta 2009 9:51 To: Eric Sproul; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] alternative hardware configurations for zfs On Thu, Sep 10, 2009 at 01:11:49PM -0400, Eric Sproul wrote: I would not use the Caviar Black drives, regardless of TLER settings. The RE3 or RE4 drives would be a better choice, since they also have better vibration tolerance. This will be a significant factor in a chassis with 20 spinning drives. Yes, I'm aware of the issue, and am using 16x RE4 drives in my current box right now (which I unfortunately had to convert to CentOS 5.3 for Oracle/ custom software compatibility reasons). I've made very bad experiences with Seagate 7200.11 in RAID in the past. Thanks for your advice against Caviar Black. Do you think above is a sensible choice? All your other choices seem good. I've used a lot of Supermicro gear with good results. The very leading-edge hardware is sometimes not supported, but I've been using http://www.supermicro.com/products/motherboard/QPI/5500/X8DAi.cfm in above box. anything that's been out for a while should work fine. I presume you're going for an Intel Xeon solution-- the peripherals on those boards a a bit better supported than the AMD stuff, but even the AMD boards work well. Yes, dual-socket quadcore Xeon. -- Eugen* Leitl a href=http://leitl.org;leitl/a http://leitl.org __ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] alternative hardware configurations for zfs
Couple months, nope. I guess there is this DOS utility provided by WD that allows you change TLER settings having TLER disabled can be problem, faulty disks timeout randomly and zfs doesn't always want to mark them as failed, sometimes it does though. Yours Markus Kovero -Original Message- From: Tristan Ball [mailto:tristan.b...@leica-microsystems.com] Sent: 11. syyskuuta 2009 10:04 To: Markus Kovero; zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] alternative hardware configurations for zfs How long have you had them in production? Were you able to adjust the TLER settings from within solaris? Thanks, Tristan. -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Markus Kovero Sent: Friday, 11 September 2009 5:00 PM To: Eugen Leitl; Eric Sproul; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] alternative hardware configurations for zfs We've been using caviar black 1TB with disk configurations consisting 64 disks or more. They are working just fine. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Eugen Leitl Sent: 11. syyskuuta 2009 9:51 To: Eric Sproul; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] alternative hardware configurations for zfs On Thu, Sep 10, 2009 at 01:11:49PM -0400, Eric Sproul wrote: I would not use the Caviar Black drives, regardless of TLER settings. The RE3 or RE4 drives would be a better choice, since they also have better vibration tolerance. This will be a significant factor in a chassis with 20 spinning drives. Yes, I'm aware of the issue, and am using 16x RE4 drives in my current box right now (which I unfortunately had to convert to CentOS 5.3 for Oracle/ custom software compatibility reasons). I've made very bad experiences with Seagate 7200.11 in RAID in the past. Thanks for your advice against Caviar Black. Do you think above is a sensible choice? All your other choices seem good. I've used a lot of Supermicro gear with good results. The very leading-edge hardware is sometimes not supported, but I've been using http://www.supermicro.com/products/motherboard/QPI/5500/X8DAi.cfm in above box. anything that's been out for a while should work fine. I presume you're going for an Intel Xeon solution-- the peripherals on those boards a a bit better supported than the AMD stuff, but even the AMD boards work well. Yes, dual-socket quadcore Xeon. -- Eugen* Leitl a href=http://leitl.org;leitl/a http://leitl.org __ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] sync replication easy way?
Hi, I was just wondering about the following idea; I guess somebody mentioned something similar and I'd like some thoughts on this.

1. create an iscsi volume on Node-A and mount it locally with iscsiadm
2. create a pool with this local iscsi share
3. create an iscsi volume on Node-B and share it to Node-A
4. create a mirror from both disks on Node-A; zpool attach foopool localiscsivolume remotevolume

Why not? After a quick test it seems to fail and resilver like it should when nodes fail. Actual failover needs to be done manually though, but am I missing something relevant here? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
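A rough command sketch of those steps, assuming the legacy iscsitgt/shareiscsi stack described above (pool and volume names, sizes and the Node-B address are placeholders):

# Node-A: backing volume, shared over iSCSI and mounted back locally
zfs create -V 100g tank/locallun
zfs set shareiscsi=on tank/locallun
iscsiadm add discovery-address 127.0.0.1
iscsiadm modify discovery --sendtargets enable

# Node-B: matching volume, shared to Node-A
zfs create -V 100g tank/remotelun
zfs set shareiscsi=on tank/remotelun

# Node-A: discover Node-B's target, then build the pool as a mirror
iscsiadm add discovery-address <node-b-ip>
zpool create foopool <local-iscsi-disk>
zpool attach foopool <local-iscsi-disk> <remote-iscsi-disk>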
Re: [zfs-discuss] sync replication easy way?
This also makes failover more easy, as volumes are already shared via iscsi on both nodes. I have to poke it next week to see performance numbers, I could imagine it plays within expected iscsi performance, or it should atleast. Yours Markus Kovero -Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 11. syyskuuta 2009 19:53 To: Markus Kovero Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] sync replication easy way? On Sep 11, 2009, at 5:05 AM, Markus Kovero wrote: Hi, I was just wondering following idea, I guess somebody mentioned something similar and I'd like some thoughts on this. 1. create iscsi volume on Node-A and mount it locally with iscsiadm 2. create pool with this local iscsi-share 3. create iscsi volume on Node-B and share it to Node-A 4. create mirror from both disks on Node-A; zpool attach foopool localiscsivolume remotevolume Why not? After quick test it seems to fail and resilver like it should when nodes fail. Actual failover needs to be done manually though, but am I missing something relevant here? This is more complicated than the more commonly used, simpler method: 1. create iscsi volume on Node-B, share to Node-A 2. zpool create mypool mirror local-vdev iscsi-vdev -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sync replication easy way?
I believe failover is best to be done manually just to be sure active node is really dead before importing it on another node, otherwise there could be serious issues I think. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Maurice Volaski Sent: 11. syyskuuta 2009 19:24 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] sync replication easy way? This method also allows one to nest mirroring or some RAID-z level with mirroring. When I tested it with a older build a while back, I found performance really poor, about 1-2 MB/second, but my environment was also constrained. A major showstopper had been the infamous 3 minute iSCSI timeout, which was recently fixed, http://bugs.opensolaris.org/view_bug.do?bug_id=649. How is your performance? Also, why do you think failover has to be done manually? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] This is the scrub that never ends...
Hi, I noticed that counters will not get updated if data amount increases during scrub/resilver, so if application has written new data during scrub, counter will not give realistic estimate. This happens with resilvering and scrub, somebody could fix this? Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Will Murnane Sent: 7. syyskuuta 2009 16:42 To: ZFS Mailing List Subject: [zfs-discuss] This is the scrub that never ends... I have a pool composed of a single raidz2 vdev, which is currently degraded (missing a disk): config: NAME STATE READ WRITE CKSUM pool DEGRADED 0 0 0 raidz2 DEGRADED 0 0 0 c8d1 ONLINE 0 0 0 c8d0 ONLINE 0 0 0 c12t4d0 ONLINE 0 0 0 c12t3d0 ONLINE 0 0 0 c12t2d0 ONLINE 0 0 0 c12t0d0 OFFLINE 0 0 0 logs c10d0 ONLINE 0 0 0 errors: No known data errors I have it scheduled for periodic scrubs, via root's crontab: 20 2 1 * * /usr/sbin/zpool scrub pool but this scrub was kicked off manually. Last night I checked its status and saw: scrub: scrub in progress for 20h32m, 100.00% done, 0h0m to go This morning I see: scrub: scrub in progress for 31h10m, 100.00% done, 0h0m to go It's 100% done, but yet hasn't finished in 10 hours! zpool iostat -v pool 10 shows it's doing between 50 and 120 MB/s of reads, when the userspace applications are only doing a few megabytes per second of I/O, as measured by the DTraceToolkit script rwtop (app_r: 4469 KB, app_w: 4579 KB). What can cause this kind of behavior, and how can I make my pool finish scrubbing? Will ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool
Please check iostat -xen to see whether there are transport or hardware errors generated by, say, device timeouts or bad cables. Consumer disks usually just time out from time to time under load, whereas the RE versions usually report an error. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Simon Breden Sent: 2. syyskuuta 2009 17:34 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool I too see checksum errors occurring for the first time using OpenSolaris 2009.06 on the /dev package repository at version snv_121. I see the problem occur within a mirrored boot pool (rpool) using SSDs. Hardware is AMD BE-2350 (ECC) processor with 4GB ECC memory on MCP55 chipset, although SATA is using the mpt driver on a SuperMicro AOC-USAS-L8i controller card. More here: http://breden.org.uk/2009/09/02/home-fileserver-handling-pool-errors/ So I'm going to check my other boot environments to see if a rollback makes sense (< snv_121). Cheers, Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
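For example (the trailing s/w, h/w, trn and tot columns of iostat -xen are the soft, hard, transport and total error counters to watch):

iostat -xen 5     # per-device I/O statistics plus error counters, 5-second samples
iostat -En        # per-device error summary with vendor/serial details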
[zfs-discuss] possible resilver bugs
Hi, I don't have the means to replicate this issue nor to file a bug about it, so I'd like your opinion about these issues, or perhaps someone can make a bug report if necessary. In a scenario with, say, three raidz2 groups consisting of several disks each, two disks fail in different raidz groups. You have a degraded pool and two degraded raidz2 groups. Now, one replaces the first disk and starts resilvering; it takes a day, two days, three days - the counter says 100% resilvered but new data is still being written to the disk being replaced. The counter SHOULD update if the amount of data in the group increases. Before that first disk is resilvered, the second failed disk in the second group is replaced, resulting in BOTH resilver processes starting from the beginning, making the pool rather unusable due to the two resilvers and leaving the pool compromised for several days to come. Replacing a disk in the other raidz2 group should not interfere with the ongoing resilvering of another disk set. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08
btw, there's coming new Intel X25-M (G2) next month that will offer better random read/writes than E-series and seriously cheap pricetag, worth for a try I'd say. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jorgen Lundman Sent: 30. heinäkuuta 2009 9:55 To: ZFS Discussions Subject: Re: [zfs-discuss] [n/zfs-discuss] Strange speeds with x4500, Solaris 10 10/08 Bob Friesenhahn wrote: Something to be aware of is that not all SSDs are the same. In fact, some faster SSDs may use a RAM write cache (they all do) and then ignore a cache sync request while not including hardware/firmware support to ensure that the data is persisted if there is power loss. Perhaps your fast CF device does that. If so, that would be really bad for zfs if your server was to spontaneously reboot or lose power. This is why you really want a true enterprise-capable SSD device for your slog. Naturally, we just wanted to try the various technologies to see how they compared. Store-bought CF card took 26s, store-bought SSD 48s. We have not found a PCI NVRam card yet. When talking to our Sun vendor, they have no solutions, which is annoying. X25-E would be good, but some pools have no spares, and since you can't remove vdevs, we'd have to move all customers off the x4500 before we can use it. CF card need reboot to see the cards, but 6 servers are x4500, not x4540, so not really a global solution. PCI NVRam cards need a reboot, but should work in both x4500 and x4540 without zpool rebuilding. But can't actually find any with Solaris drivers. Peculiar. Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool import hungs up forever...
I recently noticed that importing larger pools that are occupied by large amounts of data can do zpool import for several hours while zpool iostat only showing some random reads now and then and iostat -xen showing quite busy disk usage, It's almost it goes thru every bit in pool before it goes thru. Somebody said that zpool import got faster on snv118, but I don't have real information on that yet. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Victor Latushkin Sent: 29. heinäkuuta 2009 14:05 To: Pavel Kovalenko Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zpool import hungs up forever... On 29.07.09 14:42, Pavel Kovalenko wrote: fortunately, after several hours terminal went back -- # zdb -e data1 Uberblock magic = 00bab10c version = 6 txg = 2682808 guid_sum = 14250651627001887594 timestamp = 1247866318 UTC = Sat Jul 18 01:31:58 2009 Dataset mos [META], ID 0, cr_txg 4, 27.1M, 3050 objects Dataset data1 [ZPL], ID 5, cr_txg 4, 5.74T, 52987 objects capacity operations bandwidth errors descriptionused avail read write read write read write cksum data1 5.74T 6.99T 772 0 96.0M 0 0 0 91 /dev/dsk/c14t0d05.74T 6.99T 772 0 96.0M 0 0 0 223 # So we know that there are some checksum errors there but at least zdb was able to open pool in read-only mode. i've tried to run zdb -e -t 2682807 data1 and #echo 0t::pid2proc|::walk thread|::findstack -v | mdb -k This is wrong - you need to put PID of the 'zpool import data1' process right after '0t'. and #fmdump -eV shows checksum errors, such as Jul 28 2009 11:17:35.386268381 ereport.fs.zfs.checksum nvlist version: 0 class = ereport.fs.zfs.checksum ena = 0x1baa23c52ce01c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x578154df5f3260c0 vdev = 0x6e4327476e17daaa (end detector) pool = data1 pool_guid = 0x578154df5f3260c0 pool_context = 2 pool_failmode = wait vdev_guid = 0x6e4327476e17daaa vdev_type = disk vdev_path = /dev/dsk/c14t0d0p0 vdev_devid = id1,s...@n2661000612646364/q parent_guid = 0x578154df5f3260c0 parent_type = root zio_err = 50 zio_offset = 0x2313d58000 zio_size = 0x4000 zio_objset = 0x0 zio_object = 0xc zio_level = 0 zio_blkid = 0x0 __ttl = 0x1 __tod = 0x4a6ea60f 0x1705fcdd This tells us that object 0xc in metabjset (objset 0x0) is corrupted. So to get more details you can do the following: zdb -e - data1 zdb -e -bbcs data1 victor ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs destroy slow?
Hi, how come zfs destroy is so slow? E.g. destroying a 6TB dataset renders the zfs admin commands useless for the time being, in this case for hours. (running osol 111b with latest patches.) Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs destroy slow?
Oh well, the whole system seems to be deadlocked. Nice. A little too keen on keeping data safe :-P Yours Markus Kovero From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Markus Kovero Sent: 27. heinäkuuta 2009 13:39 To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] zfs destroy slow? Hi, how come zfs destroy is so slow? E.g. destroying a 6TB dataset renders the zfs admin commands useless for the time being, in this case for hours. (running osol 111b with latest patches.) Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] No files but pool is full?
During our tests we noticed very disturbing behavior; what could be causing this? The system is running the latest stable OpenSolaris. Are there any other means of removing these ghost files than destroying the pool and restoring from backups?

r...@~# zpool status testpool
  pool: testpool
 state: ONLINE
 scrub: scrub completed after 0h26m with 0 errors on Fri Jul 24 10:32:09 2009
config:

        NAME                       STATE     READ WRITE CKSUM
        testpool                   ONLINE       0     0     0
          raidz2                   ONLINE       0     0     0
            c0t5000C5000505C31Bd0  ONLINE       0     0     0
            c0t5000C5000498A9D3d0  ONLINE       0     0     0
            c0t5000C5000505B523d0  ONLINE       0     0     0
            c0t5000C5000505BB83d0  ONLINE       0     0     0
            c0t5000C5000505B727d0  ONLINE       0     0     0
            c0t5000C50004987B6Bd0  ONLINE       0     0     0

errors: No known data errors

r...@~# zpool list testpool
NAME       SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
testpool   408G   402G  6.37G   98%  ONLINE  -

r...@~# ls -lasht /testpool/
total 4.0K
1.5K drwxr-xr-x 29 root root 30 2009-07-24 09:56 ..
2.5K drwxr-xr-x  2 root root  2 2009-07-23 18:23 .

r...@~# df /testpool
Filesystem   1K-blocks       Used  Available Use% Mounted on
testpool     280481377  280481377          0 100% /testpool

r...@~# df -i /testpool
Filesystem     Inodes  IUsed  IFree IUse% Mounted on
testpool            7      7      0  100% /testpool

r...@~# zdb - testpool
...
    Object  lvl   iblk   dblk  lsize  asize  type
         6    5    16K   128K  1000G   262G  ZFS plain file
                                 264  bonus  ZFS znode
        path    ???<object#6>
        uid     0
        gid     0
        atime   Thu Jul 23 17:23:19 2009
        mtime   Thu Jul 23 17:50:17 2009
        ctime   Thu Jul 23 17:50:17 2009
        crtime  Thu Jul 23 17:23:19 2009
        gen     19
        mode    100600
        size    1073741824000
        parent  3
        links   0
        xattr   0
        rdev    0x

Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No files but pool is full?
r...@~# zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT rpool/ROOT/opensola...@install 146M - 2.82G - r...@~# -Original Message- From: pantz...@gmail.com [mailto:pantz...@gmail.com] On Behalf Of Mattias Pantzare Sent: 24. heinäkuuta 2009 10:56 To: Markus Kovero Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] No files but pool is full? On Fri, Jul 24, 2009 at 09:33, Markus Koveromarkus.kov...@nebula.fi wrote: During our tests we noticed very disturbing behavior, what would be causing this? System is running latest stable opensolaris. Any other means to remove ghost files rather than destroying pool and restoring from backups? You may have snapshots, try: zfs list -t snapshot ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No files but pool is full?
Yes, the server has been rebooted several times and there is no available space. Is it possible to somehow delete the ghosts that zdb sees? How can this happen? Yours Markus Kovero -Original Message- From: pantz...@gmail.com [mailto:pantz...@gmail.com] On Behalf Of Mattias Pantzare Sent: 24. heinäkuuta 2009 11:22 To: Markus Kovero Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] No files but pool is full? On Fri, Jul 24, 2009 at 09:57, Markus Kovero markus.kov...@nebula.fi wrote: r...@~# zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT rpool/ROOT/opensola...@install 146M - 2.82G - r...@~# Then it is probably some process that has a deleted file open. You can find those with: fuser -c /testpool But if you can't find the space after a reboot something is not right... -Original Message- From: pantz...@gmail.com [mailto:pantz...@gmail.com] On Behalf Of Mattias Pantzare Sent: 24. heinäkuuta 2009 10:56 To: Markus Kovero Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] No files but pool is full? On Fri, Jul 24, 2009 at 09:33, Markus Kovero markus.kov...@nebula.fi wrote: During our tests we noticed very disturbing behavior, what would be causing this? System is running latest stable opensolaris. Any other means to remove ghost files rather than destroying pool and restoring from backups? You may have snapshots, try: zfs list -t snapshot ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No files but pool is full?
Hi, thanks for pointing out issue, we haven't run updates on server yet. Yours Markus Kovero -Original Message- From: Henrik Johansson [mailto:henr...@henkis.net] Sent: 24. heinäkuuta 2009 12:26 To: Markus Kovero Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] No files but pool is full? On 24 jul 2009, at 09.33, Markus Kovero markus.kov...@nebula.fi wrote: During our tests we noticed very disturbing behavior, what would be causing this? System is running latest stable opensolaris. Any other means to remove ghost files rather than destroying pool and restoring from backups? This looks like bug i filed a while ago, CR 6792701 removing large holey files does bot free space. The only solution I found to clean the pool when isolating the bug was to recreate it. The fix was integrated inbuild post OSOL 2009.06. Mkfile of a certain size will trigger this. Henrik http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work
I would be interested in how to roll back to certain txg points in case of disaster; that was what Russel was after anyway. Yours Markus Kovero -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Miles Nordin Sent: 19. heinäkuuta 2009 11:24 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work bj == Brent Jones br...@servuhome.net writes: bj many levels of fail here, pft. Virtualbox isn't unstable in any of my experience. It doesn't by default pass cache flushes from guest to host unless you set VBoxManage setextradata VMNAME VBoxInternal/Devices/piix3ide/0/LUN#[x]/Config/IgnoreFlush 0 however OP does not mention the _host_ crashing, so this questionable ``optimization'' should not matter. Yanking the guest's virtual cord is something ZFS is supposed to tolerate: remember the ``crash-consistent backup'' concept (not to mention the ``always consistent on disk'' claim, but really any filesystem even without that claim should tolerate having the guest's virtual cord yanked, or the guest's kernel crashing, without losing all its contents---the claim only means no time-consuming fsck after reboot). bj to blame ZFS seems misplaced, -1 The fact that it's a known problem doesn't make it not a problem. bj the subject on this thread especially inflammatory. so what? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
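For reference, builds after the one discussed here grew a limited answer to this. A hedged sketch, assuming a build with the pool-recovery support (roughly snv_128 and later) and a pool named tank as a placeholder; it rewinds to a recent consistent txg rather than an arbitrary one:

# dry run: report what a rewind would discard, without importing
zpool import -nF tank

# actually discard the last few transactions and import the pool
zpool import -F tank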