Mostly replying to Nick. Based on this thread, I checked our primaries, and
interestingly enough, 1 of the 3 kiddos had 36 days' worth of junk when the
retention period was 5 days (the other 2 were just fine/normal). We opened a
call w/MS and got it back to normal. So instead of looking at just the raw
count of rows in that table, use something like this:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO
SELECT COUNT(*)         AS number_commits,
       MIN(commit_time) AS minimum_commit_time,
       MAX(commit_time) AS maximum_commit_time
FROM sys.dm_tran_commit_table;
GO
It might take a few minutes to run, depending on the number of rows you have.
Then, what you look at is the "minimum_commit_time" value. If that's around
5-6 days ago, it's probably just fine, because the retention period is 5 days.
For us, in the "just fine" case where minimum_commit_time is ~5 days ago, the
row counts on our sites range from 44 million to 56 million. On the "36 days
of stuff" site, before we cleaned it up w/MS help, it was 234 million rows.
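As a rough sanity check (my arithmetic, not from the original post), dividing those healthy row counts by the 5-day retention window gives the implied daily commit volume per site:

```sql
-- Rows retained over a 5-day window imply this many commits per day.
SELECT 44000000 / 5 AS low_site_commits_per_day,   -- 8,800,000
       56000000 / 5 AS high_site_commits_per_day;  -- 11,200,000
```

That puts the healthy sites comfortably under the cleanup rate Jason describes below.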
If you DO have a minimum_commit_time of "holy smokes, that's old", open a call
w/MS; don't try to 'fix it' on your own. It was a little tricky to do.
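If you want to see how the backlog is distributed before you call, here's a read-only sketch (same view as the query above, just grouped by day) that shows how far back the rows stretch:

```sql
-- Diagnosis only: bucket change-tracking commits by day to see backlog age.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO
SELECT CAST(commit_time AS date) AS commit_day,
       COUNT(*)                  AS commits_that_day
FROM sys.dm_tran_commit_table
GROUP BY CAST(commit_time AS date)
ORDER BY commit_day;
GO
```

Anything much older than the retention period is the junk that cleanup should have removed.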
On Wednesday, November 11, 2015 8:06 AM, Nick <[email protected]>
wrote:
Please define “big scary number”. With over 100,000 clients, everything’s
a big scary number to me. select count (*) from sys.dm_tran_commit_table = ~5M
From: [email protected]
[mailto:[email protected]] On Behalf Of Jason Wallace
Sent: Wednesday, November 11, 2015 7:24 AM
To: [email protected]
Subject: [mssms] Change Tracking and DRS

Hi there folks, I wanted to send you a very quick note about something that we
have seen in a customer with whom I work. This is a large organisation with a
number of primary sites. It is likely that this issue won’t ever affect the
vast majority of us, but it’s interesting all the same.

So, if you deal with DRS (and my suggestion is that if you can scale up and
scale out a standalone primary site to meet your needs, you consider this
before ever uttering the letters CAS) you will know that there is replication
zooming around the enterprise – global data going both ways and site data
going upwards to be consolidated at the CAS. You’ll know that the data is
arranged into publications and that there are a number of replication groups
associated with this.

In the environment where we were working, there was one primary site which
just was “not right”. Most of the time it was fine, but then replication links
would go degraded, we’d have slow collection evaluation, TEMPDB would rise and
we’d see a call being raised.
Doing the spDiagDRS thing would show replication was running very slowly
indeed – but just on this site. We then started offloading some of the client
activity to MP replicas, and we started to see more clearly the busy SQL
activity that was affecting back-end services. What struck us was that some of
the busiest code was not sProc related but plain SQL, and when we looked into
that SQL we saw lots of “Change Tracking” activity.

When we DRS-replicate data we need to track the data that is going to be
replicated, so whenever there is a row change this data is written into
sys.syscommittab. We can see this by looking at the sys.dm_tran_commit_table
view. These records have to hang around in this table for the change tracking
retention period that you have configured (by default 5 days), but after that
they should be cleared out. When we queried this view we got back a LOT of
rows – many of which were a lot older than the 5 days that we expected.

So, we started looking at what process clears down change tracking data and
found the answer here:
http://www.brentozar.com/archive/2014/06/performance-tuning-sql-server-change-tracking/
– every 62 seconds, up to 10,000 rows are deleted from change tracking. If you
have a busy primary site with a lot of stuff going on, it’s quite possible
that you’ll exceed this running average of 14.4 million rows in a day. Doing
so from time to time is no problem, as things will revert to normal, but if
you are exceeding this value day in, day out for some time, then slowly but
surely you have a problem building.

Fixing this is not so pretty – it involves establishing a DAC connection to
SQL and clearing things down in SQL, all of which can be done while the site
is live but which may take some time to run.
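A quick sanity check on that cleanup ceiling (my arithmetic, assuming the 10,000-rows-per-62-seconds figures from the article above; the 14.4 million figure corresponds to one sweep per minute, so 62-second sweeps come out slightly lower):

```sql
-- Cleanup ceiling: up to 10,000 rows deleted per sweep, one sweep every 62 s.
-- (24 * 60 * 60) / 62 = ~1,393 sweeps per day.
SELECT CAST(10000 * (24.0 * 60 * 60) / 62 AS bigint) AS max_rows_cleaned_per_day;
-- ~13,935,483 rows/day; it would be 14,400,000 with a sweep every 60 seconds
```

If your site commits more rows than that day after day, the backlog can only grow.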
So if you do a select count(*) from sys.dm_tran_commit_table and see a big
scary number, log a call with Microsoft.

HTH
Jason