On 26 May 2014, at 9:36:02 CEST, Filip Marvan <filip.mar...@aira.cz> wrote:
>Hello,
>
>just for information: after two weeks, the number of ARC accesses came back to
>the same high numbers as before the deletion of data (you can see that in the
>screenshot).
>
>I also tried to delete the same amount of data on a different storage server,
>and the accesses to the ARC dropped in the same way as on the first pool.
>
>Interesting.
>
>Filip Marvan
>
>From: Richard Elling [mailto:richard.ell...@richardelling.com] 
>Sent: Thursday, May 08, 2014 12:47 AM
>To: Filip Marvan
>Cc: omnios-discuss@lists.omniti.com
>Subject: Re: [OmniOS-discuss] Strange ARC reads numbers
>
>On May 7, 2014, at 1:44 AM, Filip Marvan <filip.mar...@aira.cz> wrote:
>
>Hi Richard,
>
>thank you for your reply.
>
>1. The workload is still the same or very similar. The zvols which we deleted
>from our pool had been disconnected from the KVM server a few days before, so
>the only change was that we deleted those zvols with all their snapshots.
>
>2. As you wrote, our customers are fine for now :) We have monitoring of all
>our virtual servers running from that storage server, and there is no
>noticeable change in workload or latencies.
>
>good, then there might not be an actual problem, just a puzzle :-)
>
>3. That could be the reason, of course. But the graph shows only data from the
>arcstat.pl script. We can see that arcstat is reporting heavy read accesses
>every 5 seconds (probably some update of the ARC after ZFS writes data to disk
>from the ZIL? All of them are marked as "cache hits" by the arcstat script),
>with only a few ARC accesses between those 5-second periods. Before we deleted
>those zvols (about 0.7 TB of data from a 10 TB pool, which has 5 TB of free
>space) there were about 40k accesses every 5 seconds; now there are no more
>than 2k accesses every 5 seconds.
>
>This is expected behaviour for older ZFS releases that used a txg_timeout of
>5 seconds. You should see a burst of write activity around that timeout, and it
>can include reads for zvols. Unfortunately, the zvol code is not very efficient
>and you will see a lot more reads than you expect.
> -- richard
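
[ A side note from me: on an illumos kernel you can check the effective txg
timeout with mdb, assuming mdb is available and the tunable keeps this name in
your release; something like

    # print the zfs_txg_timeout tunable (in seconds) from the live kernel
    echo 'zfs_txg_timeout/D' | mdb -k

typically prints 5, which would match the 5-second bursts described above. ]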
>
>Most of our zvols have an 8K volblocksize (including the deleted zvols); only a
>few have 64K. Unfortunately I have no data about the size of the reads before
>that change. But we have two more storage servers with similarly high ARC read
>accesses every 5 seconds, as on the first pool before the deletion. Maybe I
>should try to delete some data on those pools and see what happens, with more
>detailed monitoring.
>
>Thank you,
>Filip
>
>From: Richard Elling [mailto:richard.ell...@richardelling.com] 
>Sent: Wednesday, May 07, 2014 3:56 AM
>To: Filip Marvan
>Cc: omnios-discuss@lists.omniti.com
>Subject: Re: [OmniOS-discuss] Strange ARC reads numbers
>
>Hi Filip,
>
>There are two primary reasons for a reduction in the number of ARC reads.
>            1. the workload isn't reading as much as it used to
>            2. the latency of reads has increased
>            3. your measurement is b0rken
>there are three reasons...
>
>The data you shared clearly shows a reduction in reads, but doesn't contain the
>answers to the cause. Usually, if #2 is the case, then the phone will be
>ringing with angry customers on the other end.
>
>If the above 3 are not the case, then perhaps it is something more subtle. The
>arcstat read counters do not record the size of the read. Getting the read size
>for zvols is a little tricky; you can infer it from the pool statistics in
>iostat. The subtlety here is that if the volblocksize is different between the
>old and new zvols, then the number of (block) reads will be different for the
>same workload.
> -- richard
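
[ For example, assuming the pool is named "tank" (substitute your own pool
name), something like

    # per-vdev operations and bandwidth, sampled every 5 seconds;
    # bandwidth divided by operations gives a rough average I/O size
    zpool iostat -v tank 5

lets you estimate the average read size and compare it against the
volblocksize of the zvols involved. ]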
>
>--
>richard.ell...@richardelling.com
>+1-760-896-4422
>

Reads from L2ARC suggest that this data is being read, but is not hot enough to 
stick in the main RAM ARC. Deleting the datasets seemingly caused this data to 
no longer be read, perhaps because those blocks are no longer referenced by the 
pool. IIRC, in your first post you wrote that this happens to your older 
snapshots, and now it seems the situation repeats as your system has 
accumulated new snapshots since that mass deletion a few weeks back.
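
To see where those periodic reads are actually served from, it may help to 
watch the ARC and L2ARC counters side by side. A minimal sketch, assuming the 
commonly used arcstat.pl and that your copy supports the -f field list (field 
names can differ slightly between arcstat versions):

    # sample every 5 seconds: reads, ARC hit rate, L2ARC reads/hit rate, cache sizes
    arcstat.pl -f time,read,hit%,l2read,l2hit%,arcsz,l2size 5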

To me it sums up to: "somebody mass-reads your available snapshots". Do you 
have 'snapdir=visible' set, so that $dataset/.zfs directories are always 
visible (not only upon direct request), and do you have daemons or cron jobs or 
something of that kind (possibly a slocate/mlocate updatedb job, or an rsync 
backup) that reads your POSIX filesystem structure?
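
A quick way to check both, assuming your pool is named "tank" (substitute your 
own pool name):

    # is the .zfs snapshot directory always visible on any filesystem dataset?
    zfs get -r -t filesystem snapdir tank | grep visible

    # anything scheduled that could be walking the filesystems regularly?
    ls /var/spool/cron/crontabs/
    crontab -l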

Since it does not seem that your whole datasets are being re-read (the exact 
guess depends on the amount of unique data relative to the L2ARC size, of 
course, and on the measurable presence of reads from the main pool devices), 
regular accesses to just the FS metadata might explain the symptoms. Though 
backups that do read the file data (perhaps "rsync -c", or tar, or a zfs send 
of any dataset type redone over and over for some reason), combined with a 
sufficiently small amount of unique data in the snapshots, might also fit this 
explanation.
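
If you want to see exactly which processes issue those reads during one of the 
5-second bursts, a DTrace one-liner along these lines (a sketch; run as root) 
should name the likely culprits:

    # count read(2) syscalls per process name over 30 seconds
    dtrace -n 'syscall::read:entry { @[execname] = count(); } tick-30s { exit(0); }'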

HTH,
//Jim Klimov
--
Typos courtesy of K-9 Mail on my Samsung Android
_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
