Mostly replying to Nick. Based on this thread, I checked our primaries, and
interestingly enough, 1 of the 3 kiddos had 36 days' worth of junk when the
retention period was 5 days (the other 2 were just fine/normal). We opened a
call w/MS and got it back to normal. So instead of looking at just the raw
count of rows in that table, use something like this:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO
SELECT COUNT(*)         AS number_commits,
       MIN(commit_time) AS minimum_commit_time,
       MAX(commit_time) AS maximum_commit_time
FROM sys.dm_tran_commit_table;
GO
It might take a few minutes to run, depending on the number of rows you have.
Then, what you look at is the "minimum_commit_time" value. If that's around
5-6 days ago, it's probably just fine, because the retention period is 5 days.
For us, in the "just fine" case where minimum_commit_time is ~5 days ago, the
row counts on our sites range from 44 million to 56 million. On the "36 days
of stuff" site, before we cleaned it up w/MS help, it was 234 million rows.
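As a rough sanity check (my arithmetic, not from the original post), dividing those healthy row counts by the 5-day retention window gives the implied daily commit volume per site:

```sql
-- Rows retained over a 5-day window imply this many commits per day.
SELECT 44000000 / 5 AS low_site_commits_per_day,   -- 8,800,000
       56000000 / 5 AS high_site_commits_per_day;  -- 11,200,000
```

That puts the healthy sites comfortably under the cleanup rate Jason describes below.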
If you DO have a minimum_commit_time of "holy smokes, that's old", open a call
w/MS; don't try to 'fix it' on your own. It was a little tricky to do.
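If you want to see how the backlog is distributed before you call, here's a read-only sketch (same view as the query above, just grouped by day) that shows how far back the rows stretch:

```sql
-- Diagnosis only: bucket change-tracking commits by day to see backlog age.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO
SELECT CAST(commit_time AS date) AS commit_day,
       COUNT(*)                  AS commits_that_day
FROM sys.dm_tran_commit_table
GROUP BY CAST(commit_time AS date)
ORDER BY commit_day;
GO
```

Anything much older than the retention period is the junk that cleanup should have removed.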
On Wednesday, November 11, 2015 8:06 AM, Nick <[email protected]>
wrote:
Please define “big scary number”. With over 100,000 clients, everything’s
a big scary number to me. select count (*) from sys.dm_tran_commit_table = ~5M
From: [email protected]
[mailto:[email protected]] On Behalf Of Jason Wallace
Sent: Wednesday, November 11, 2015 7:24 AM
To: [email protected]
Subject: [mssms] Change Tracking and DRS

Hi there folks, I wanted to send you a very quick note about something that we
have seen in a customer with whom I work. This is a large organisation with a
number of primary sites. It is likely that this issue won’t ever affect the
vast majority of us, but it’s interesting all the same.

So, if you deal with DRS (and my suggestion is that if you can scale up and
scale out a standalone primary site to meet your needs, you consider this
before ever uttering the letters CAS) you will know that there is replication
zooming around the enterprise – global data going both ways and site data
going upwards to be consolidated at the CAS. You’ll know that the data is
arranged into publications and that there are a number of replication groups
associated with this.

In the environment where we were working, there was one primary site which
just was “not right”. Most of the time it was fine, but then replication links
would go degraded, we’d have slow collection evaluation, TEMPDB would rise and
we’d see a call being raised.
Doing the spDiagDRS thing would show replication was running very slowly
indeed – but just on this site. We then started offloading some of the client
activity to MP replicas, and we started to see more clearly the busy SQL
activity that was affecting back-end services. What struck us was that some of
the busiest code was not sProc related but plain SQL, and when we looked into
that SQL we saw lots of “Change Tracking” activity.

When we DRS-replicate data we need to track the data that is going to be
replicated, so whenever there is a row change this data is written into
sys.syscommittab. We can see this by looking at the sys.dm_tran_commit_table
view. These records have to hang around in this table for the change tracking
retention period that you have configured (by default 5 days), but after that
they should be cleared out. When we queried this view we got back a LOT of
rows – many of which were a lot older than the 5 days that we expected.

So, we started looking at what process clears down change tracking data and
found the answer here:
http://www.brentozar.com/archive/2014/06/performance-tuning-sql-server-change-tracking/
– every 62 seconds, up to 10,000 rows are deleted from change tracking. If you
have a busy primary site with a lot of stuff going on, it’s quite possible
that you’ll exceed this running average of 14.4 million rows in a day. Doing
so from time to time is no problem, as things will revert to normal, but if
you are exceeding this value day in, day out for some time, then slowly but
surely you have a problem building.

Fixing this is not so pretty – it involves establishing a DAC connection to
SQL and clearing things down in SQL, all of which can be done while the site
is live but which may take some time to run.
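A quick sanity check on that cleanup ceiling (my arithmetic, assuming the 10,000-rows-per-62-seconds figures from the article above; the 14.4 million figure corresponds to one sweep per minute, so 62-second sweeps come out slightly lower):

```sql
-- Cleanup ceiling: up to 10,000 rows deleted per sweep, one sweep every 62 s.
-- (24 * 60 * 60) / 62 = ~1,393 sweeps per day.
SELECT CAST(10000 * (24.0 * 60 * 60) / 62 AS bigint) AS max_rows_cleaned_per_day;
-- ~13,935,483 rows/day; it would be 14,400,000 with a sweep every 60 seconds
```

If your site commits more rows than that day after day, the backlog can only grow.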
So if you do a select count(*) from sys.dm_tran_commit_table and see a big
scary number, log a call with Microsoft.

HTH
Jason