Re: [zfs-discuss] number of blocks changes

2012-08-06 Thread Justin Stringfellow


> Can you check whether this happens from /dev/urandom as well?

It does:

finsdb137@root dd if=/dev/urandom of=oub bs=128k count=1 && while true
 do
 ls -s oub
 sleep 1
 done
0+1 records in
0+1 records out
   1 oub
   1 oub
   1 oub
   1 oub
   1 oub
   4 oub
   4 oub
   4 oub
   4 oub
   4 oub

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] number of blocks changes

2012-08-06 Thread Justin Stringfellow


> I think for the cleanness of the experiment, you should also include
> sync after the dd's, to actually commit your file to the pool.

OK that 'fixes' it:

finsdb137@root dd if=/dev/random of=ob bs=128k count=1 && sync && while true
 do
 ls -s ob
 sleep 1
 done
0+1 records in
0+1 records out
   4 ob
   4 ob
   4 ob
.. etc.

I guess I knew this had something to do with stuff being flushed to disk; I
don't know why I didn't think of it myself.
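
For what it's worth, the jump from 1 to 4 blocks after a few seconds even
without an explicit sync is just the next ZFS transaction group committing on
its own. A minimal sketch of a way to watch that happen (the second window and
the file name oub2 are additions for illustration, not part of the original
test):

# in one window; note the first line zpool iostat prints is the since-boot average
zpool iostat rpool 2

# in another window, repeat the experiment without sync
dd if=/dev/urandom of=oub2 bs=128k count=1 && while true
 do
 ls -s oub2
 sleep 1
 done

The ls -s figure should change in the same interval that zpool iostat shows a
burst of writes to the pool.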

> What is the pool's redundancy setting?
copies=1. Full zfs get below, but in short, it's a basic mirrored root with 
default settings. Hmm, maybe I should mirror root with copies=2.    ;)
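
For anyone who actually wanted to do that, it would just be the one-liner
below (a sketch; copies only applies to blocks written after the property is
set, so existing data is unaffected):

zfs set copies=2 rpool/ROOT/s10s_u9wos_14a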

> I am not sure what ls -s actually accounts for as a file's FS-block
> usage, but I wonder if it might include metadata (relevant pieces of
> the block pointer tree individual to the file). Also check if the
> disk usage reported by du -k ob varies similarly, for the fun of it?

Yes, it varies too.

finsdb137@root dd if=/dev/random of=ob bs=128k count=1 && while true
 do
 ls -s ob
 du -k ob
 sleep 1
 done
0+1 records in
0+1 records out
   1 ob
0   ob
   1 ob
0   ob
   1 ob
0   ob
   1 ob
0   ob
   4 ob
2   ob
   4 ob
2   ob
   4 ob
2   ob
   4 ob
2   ob
   4 ob
2   ob






finsdb137@root zfs get all rpool/ROOT/s10s_u9wos_14a
NAME   PROPERTY  VALUE  SOURCE
rpool/ROOT/s10s_u9wos_14a  type  filesystem -
rpool/ROOT/s10s_u9wos_14a  creation  Tue Mar  1 15:09 2011  -
rpool/ROOT/s10s_u9wos_14a  used  20.6G  -
rpool/ROOT/s10s_u9wos_14a  available 37.0G  -
rpool/ROOT/s10s_u9wos_14a  referenced    20.6G  -
rpool/ROOT/s10s_u9wos_14a  compressratio 1.00x  -
rpool/ROOT/s10s_u9wos_14a  mounted   yes    -
rpool/ROOT/s10s_u9wos_14a  quota none   default
rpool/ROOT/s10s_u9wos_14a  reservation   none   default
rpool/ROOT/s10s_u9wos_14a  recordsize    128K   default
rpool/ROOT/s10s_u9wos_14a  mountpoint    /  local
rpool/ROOT/s10s_u9wos_14a  sharenfs  off    default
rpool/ROOT/s10s_u9wos_14a  checksum  on default
rpool/ROOT/s10s_u9wos_14a  compression   off    default
rpool/ROOT/s10s_u9wos_14a  atime on default
rpool/ROOT/s10s_u9wos_14a  devices   on default
rpool/ROOT/s10s_u9wos_14a  exec  on default
rpool/ROOT/s10s_u9wos_14a  setuid    on default
rpool/ROOT/s10s_u9wos_14a  readonly  off    default
rpool/ROOT/s10s_u9wos_14a  zoned off    default
rpool/ROOT/s10s_u9wos_14a  snapdir   hidden default
rpool/ROOT/s10s_u9wos_14a  aclmode   groupmask  default
rpool/ROOT/s10s_u9wos_14a  aclinherit    restricted default
rpool/ROOT/s10s_u9wos_14a  canmount  noauto local
rpool/ROOT/s10s_u9wos_14a  shareiscsi    off    default
rpool/ROOT/s10s_u9wos_14a  xattr on default
rpool/ROOT/s10s_u9wos_14a  copies    1  default
rpool/ROOT/s10s_u9wos_14a  version   3  -
rpool/ROOT/s10s_u9wos_14a  utf8only  off    -
rpool/ROOT/s10s_u9wos_14a  normalization none   -
rpool/ROOT/s10s_u9wos_14a  casesensitivity   sensitive  -
rpool/ROOT/s10s_u9wos_14a  vscan off    default
rpool/ROOT/s10s_u9wos_14a  nbmand    off    default
rpool/ROOT/s10s_u9wos_14a  sharesmb  off    default
rpool/ROOT/s10s_u9wos_14a  refquota  none   default
rpool/ROOT/s10s_u9wos_14a  refreservation    none   default
rpool/ROOT/s10s_u9wos_14a  primarycache  all    default
rpool/ROOT/s10s_u9wos_14a  secondarycache    all    default
rpool/ROOT/s10s_u9wos_14a  usedbysnapshots   0  -
rpool/ROOT/s10s_u9wos_14a  usedbydataset 20.6G  -
rpool/ROOT/s10s_u9wos_14a  usedbychildren    0  -
rpool/ROOT/s10s_u9wos_14a  usedbyrefreservation  0  -
rpool/ROOT/s10s_u9wos_14a  logbias   latency    default
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Matt Breitbach
STEC ZeusRAM for slog - it's expensive and small, but it's the best out
there.  OCZ Talos C for L2ARC.

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
Sent: Friday, August 03, 2012 8:40 PM
To: Karl Rossing
Cc: ZFS filesystem discussion list
Subject: Re: [zfs-discuss] what have you been buying for slog and l2arc?

On Fri, 3 Aug 2012, Karl Rossing wrote:

> I'm looking at
> http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-ssd.html
> wondering what I should get.
>
> Are people getting intel 330's for l2arc and 520's for slog?

For the slog, you should look for an SLC technology SSD which saves unwritten
data on power failure.  In Intel-speak, this is called Enhanced Power Loss
Data Protection.  I am not running across any Intel SSDs which claim to
match these requirements.

Extreme write IOPS claims in consumer SSDs are normally based on large write
caches which can lose even more data if there is a power failure.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Missing Disk Space

2012-08-06 Thread Burt Hailey
Thanks for the responses.  I read the docs that Cindy suggested and they
were educational but I still don't understand where the missing disk space
is.  I used the zfs list command and added up all space used.  If I'm
reading it right, I have 250GB of snapshots.   Zpool list shows that the
pool size (localpool) is 1.81TB, of which 1.68TB shows as allocated.  The
filesystem that I am concerned about is localhome, and a du -sk shows that it
is ~650GB in size.  This corresponds to the output from df -lk.  This is
also in the neighborhood of what I see in the REFER column in zfs list.  So
my question remains:

I have 1.68TB of space allocated.  Of that, there is ~650GB of actual
filesystem data and 250GB of snapshots.  That leaves almost 800GB of space
unaccounted for.  I would like to understand if my logic, or method, is
flawed.  If not, how can I go about determining what happened to the 800GB?
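
One way to get that breakdown straight from ZFS rather than adding up zfs list
by hand (a sketch, assuming the pool is new enough to have the usedby* space
accounting properties):

# per-dataset split into snapshots, live data, children and refreservation
zfs list -r -o space localpool

# or query the individual properties
zfs get -r usedbysnapshots,usedbydataset,usedbychildren,usedbyrefreservation localpool

In particular, usedbychildren and usedbyrefreservation on localpool and
localpool/localhome would show whether the unaccounted space is tied up in a
reservation or in child datasets rather than in files or snapshots.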

I am including the output from the zfs list and zpool list commands.

Zpool list:
NAME        SIZE  ALLOC   FREE   CAP  HEALTH  ALTROOT
localpool  1.81T  1.68T   137G   92%  ONLINE  -

Zfs list:
NAME  USED  AVAIL  REFER  MOUNTPOINT
localpool  1.68T  109G  24K  /localpool
localpool/backup  21K  109G  21K  /localpool/backup
localpool/localhome  1.68T  109G  624G  /localpool/localhome
localpool/localhome@weekly00310-date2011-11-06-hour00  10.1G  -  754G  -
localpool/localhome@weekly00338-date2011-12-04-hour00  4.63G  -  847G  -
localpool/localhome@weekly001-date2012-01-01-hour00  5.13G  -  938G  -
localpool/localhome@weekly0036-date2012-02-05-hour00  10.3G  -  1.06T  -
localpool/localhome@weekly0064-date2012-03-04-hour00  84.1G  -  1.22T  -
localpool/localhome@weekly0092-date2012-04-01-hour00  8.43G  -  709G  -
localpool/localhome@weekly00127-date2012-05-06-hour00  11.1G  -  722G  -
localpool/localhome@weekly00155-date2012-06-03-hour00  20.5G  -  737G  -
localpool/localhome@weekly00183-date2012-07-01-hour00  10.9G  -  672G  -
localpool/localhome@weekly00190-date2012-07-08-hour00  11.0G  -  696G  -
localpool/localhome@weekly00197-date2012-07-15-hour00  7.92G  -  662G  -
localpool/localhome@weekly00204-date2012-07-22-hour00  13.5G  -  691G  -
localpool/localhome@weekly00211-date2012-07-29-hour00  7.88G  -  697G  -
localpool/localhome@12217-date2012-08-04-hour12  248M  -  620G  -
localpool/localhome@13217-date2012-08-04-hour13  201M  -  620G  -
localpool/localhome@14217-date2012-08-04-hour14  151M  -  620G  -
localpool/localhome@15217-date2012-08-04-hour15  143M  -  620G  -
localpool/localhome@16217-date2012-08-04-hour16  166M  -  621G  -
localpool/localhome@17217-date2012-08-04-hour17  157M  -  620G  -
localpool/localhome@18217-date2012-08-04-hour18  136M  -  620G  -
localpool/localhome@19217-date2012-08-04-hour19  178M  -  620G  -
localpool/localhome@20217-date2012-08-04-hour20  152M  -  620G  -
localpool/localhome@21217-date2012-08-04-hour21  117M  -  620G  -
localpool/localhome@22217-date2012-08-04-hour22  108M  -  620G  -
localpool/localhome@23217-date2012-08-04-hour23  156M  -  620G  -
localpool/localhome@weekly00218-date2012-08-05-hour00  34.7M  -  620G  -
localpool/localhome@00218-date2012-08-05-hour00  35.3M  -  620G  -
localpool/localhome@01218-date2012-08-05-hour01  153M  -  620G  -
localpool/localhome@02218-date2012-08-05-hour02  126M  -  620G  -
localpool/localhome@03218-date2012-08-05-hour03  98.0M  -  620G  -
localpool/localhome@04218-date2012-08-05-hour04  318M  -  620G  -
localpool/localhome@05218-date2012-08-05-hour05  4.31G  -  624G  -
localpool/localhome@06218-date2012-08-05-hour06  587M  -  621G  -
localpool/localhome@07218-date2012-08-05-hour07  200M  -  621G  -
localpool/localhome@08218-date2012-08-05-hour08  119M  -  621G  -
localpool/localhome@09218-date2012-08-05-hour09  141M  -  621G  -
localpool/localhome@10218-date2012-08-05-hour10  189M  -  621G  -
localpool/localhome@11218-date2012-08-05-hour11  243M  -  621G  -
localpool/localhome@12218-date2012-08-05-hour12  256M  -  621G  -
localpool/localhome@13218-date2012-08-05-hour13  221M  -  621G  -
localpool/localhome@14218-date2012-08-05-hour14  168M  -  621G  -
localpool/localhome@15218-date2012-08-05-hour15  156M  -  621G  -
localpool/localhome@16218-date2012-08-05-hour16  147M  -  621G  -
localpool/localhome@17218-date2012-08-05-hour17  118M  -  621G  -
localpool/localhome@18218-date2012-08-05-hour18  151M  -  621G  -
localpool/localhome@19218-date2012-08-05-hour19  252M  -  621G

Re: [zfs-discuss] Missing Disk Space

2012-08-06 Thread Stefan Ring
Have you not seen my answer?

http://mail.opensolaris.org/pipermail/zfs-discuss/2012-August/052170.html
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Christopher George

> Are people getting intel 330's for l2arc and 520's for slog?


Unfortunately, the Intel 520 does *not* power protect its
on-board volatile cache (unlike the Intel 320/710 SSD).

Intel has an eye-opening technology brief, describing the
benefits of power-loss data protection at:

http://www.intel.com/content/www/us/en/solid-state-drives/ssd-320-series-power-loss-data-protection-brief.html

Intel's brief also clears up a prior controversy of what types of
data are actually cached, per the brief it's both user and system
data!

Best regards,

Christopher George
www.ddrdrive.com


*** The Intel 311 (SLC NAND) also fails to support on-board
power protection. 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Stefan Ring
> Unfortunately, the Intel 520 does *not* power protect its
> on-board volatile cache (unlike the Intel 320/710 SSD).
>
> Intel has an eye-opening technology brief, describing the
> benefits of power-loss data protection at:
>
> http://www.intel.com/content/www/us/en/solid-state-drives/ssd-320-series-power-loss-data-protection-brief.html
>
> Intel's brief also clears up a prior controversy of what types of
> data are actually cached, per the brief it's both user and system
> data!

So you're saying that SSDs don't generally flush data to stable medium
when instructed to? So data written before an fsync is not guaranteed
to be seen after a power-down?

If that -- ignoring cache flush requests -- is the whole reason why
SSDs are so fast, I'm glad I haven't got one yet.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Brandon High
On Mon, Aug 6, 2012 at 2:15 PM, Stefan Ring stefan...@gmail.com wrote:
> So you're saying that SSDs don't generally flush data to stable medium
> when instructed to? So data written before an fsync is not guaranteed
> to be seen after a power-down?

It depends on the model. Consumer models are less likely to
immediately flush. My understanding is that this is done in part to do
some write coalescing and reduce the number of P/E cycles. Enterprise
models should either flush, or contain a supercapacitor that provides
enough power for the drive to complete writing any data in its buffer.

> If that -- ignoring cache flush requests -- is the whole reason why
> SSDs are so fast, I'm glad I haven't got one yet.

They're fast for random reads and writes because they don't have seek
latency. They're fast for sequential IO because they aren't limited by
spindle speed.

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Bob Friesenhahn

On Mon, 6 Aug 2012, Christopher George wrote:


> Intel's brief also clears up a prior controversy of what types of
> data are actually cached, per the brief it's both user and system
> data!


I am glad to hear that both user AND system data is stored.  That is 
rather reassuring. :-)


Is your DDRDrive product still supported and moving?  Is it well 
supported for Illumos?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Bob Friesenhahn

On Mon, 6 Aug 2012, Stefan Ring wrote:


>> Intel's brief also clears up a prior controversy of what types of
>> data are actually cached, per the brief it's both user and system
>> data!
>
> So you're saying that SSDs don't generally flush data to stable medium
> when instructed to? So data written before an fsync is not guaranteed
> to be seen after a power-down?
>
> If that -- ignoring cache flush requests -- is the whole reason why
> SSDs are so fast, I'm glad I haven't got one yet.


Testing has shown that many SSDs do not flush the data prior to 
claiming that they have done so.  The flush request may hasten the 
time until the next actual cache flush.
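
That sort of test can be improvised; a crude sketch of the idea (not a
rigorous tool, just an illustration of the principle): run it against a
throwaway dataset on the device under test, log the output somewhere that
survives the power cut, then pull the plug mid-run.

i=0
while true
 do
 i=`expr $i + 1`
 echo $i > marker && sync     # once sync returns, $i should be on stable storage
 echo "last synced: $i"
 done

After power is restored, the recovered marker file should never hold a value
older than the last "last synced" number that was reported; if it does,
something in the path acknowledged a flush it had not actually performed.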


As far as I am aware, Intel does not sell any enterprise-class SSDs 
even though they have sold some models with 'E' in the name.  True 
enterprise SSDs can cost 5-10X the price of larger consumer models.


A battery-backed RAM cache with Flash backup can be a whole lot faster 
and still satisfy many users.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Christopher George

> Is your DDRdrive product still supported and moving?


Yes, we now exclusively target ZIL acceleration.

We will be at the upcoming OpenStorage Summit 2012,
and encourage those attending to stop by our booth and
say hello :-)

http://www.openstoragesummit.org/


> Is it well supported for Illumos?


Yes!  Customers using Illumos-derived distros make up a
good portion of our customer base.

Thanks,

Christopher George
www.ddrdrive.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Sašo Kiselkov
On 08/07/2012 12:12 AM, Christopher George wrote:
>> Is your DDRdrive product still supported and moving?
>
> Yes, we now exclusively target ZIL acceleration.
>
> We will be at the upcoming OpenStorage Summit 2012,
> and encourage those attending to stop by our booth and
> say hello :-)
>
> http://www.openstoragesummit.org/
>
>> Is it well supported for Illumos?
>
> Yes!  Customers using Illumos-derived distros make up a
> good portion of our customer base.

How come I haven't seen new products coming from you guys? I mean, the
X1 is past 3 years old and some improvements would be sort of expected
in that timeframe. Off the top of my head, I'd welcome things such as:

 *) Increased capacity for high-volume applications.

 *) Remove the requirement to have an external UPS (couple of
supercaps? microbattery?)

 *) Use cheaper MLC flash to lower cost - it's only written to in case
of a power outage anyway, so lower write cycles aren't an issue, and
modern MLC is almost as fast as SLC at sequential IO (within 10%
usually).

 *) PCI Express 3.0 interface (perhaps even x4)

 *) Soldered-on DRAM to create a true low-profile card (the current DIMM
slots look like a weird dirty hack).

 *) At least updated benchmarks on your site to compare against modern
flash-based competition (not the Intel X25-E, which is seriously
stone age by now...)

 *) Lower price, lower price, lower price.
I can get 3-4 200GB OCZ Talos-Rs for $2k FFS. That means I could
equip my machine with one to two mirrored slogs and nearly 800GB
worth of L2ARC for the price of a single X1.

I mean this as constructive criticism, not as angry bickering. I totally
respect you guys doing your own thing.

Cheers,
--
Saso
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Christopher George
> I am glad to hear that both user AND system data is stored.  That is
> rather reassuring. :-)


I agree!

---
[Excerpt from the linked Intel Technology Brief]

What Type of Data is Protected:
During an unsafe shutdown, firmware routines in the 
Intel SSD 320 Series respond to power loss interrupt 
and make sure both user data and system data in the 
temporary buffers are transferred to the NAND media.

---

I was taking "user data" to indicate actual txg data and
"system data" to mean the SSD's internal metadata...


I'm curious, any other interpretations?

Thanks,
Chris


Christopher George
cgeorge at ddrdrive.com
http://www.ddrdrive.com/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] what have you been buying for slog and l2arc?

2012-08-06 Thread Erik Trimble

On 8/6/2012 2:53 PM, Bob Friesenhahn wrote:

> On Mon, 6 Aug 2012, Stefan Ring wrote:
>
>>> Intel's brief also clears up a prior controversy of what types of
>>> data are actually cached, per the brief it's both user and system
>>> data!
>>
>> So you're saying that SSDs don't generally flush data to stable medium
>> when instructed to? So data written before an fsync is not guaranteed
>> to be seen after a power-down?
>>
>> If that -- ignoring cache flush requests -- is the whole reason why
>> SSDs are so fast, I'm glad I haven't got one yet.
>
> Testing has shown that many SSDs do not flush the data prior to
> claiming that they have done so.  The flush request may hasten the
> time until the next actual cache flush.


Honestly, I don't think this last point can be emphasized enough. SSDs 
of all flavors and manufacturers have a track record of *consistently* 
lying when returning from a cache flush command. There might exist 
somebody out there who actually does it across all products, but I've 
tested and used enough of the variety (both Consumer and Enterprise) to 
NOT trust any SSD that tells you it actually flushed out its local cache.


ALWAYS insist on some form of power protection, whether it be a 
supercap, battery, or external power-supply.   That way, even if they 
lie to you, you're covered from a power loss.


-Erik

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss