Although I've said much of this already, I'm making a separate post in its
own topic in the hope that Valve will give it some serious consideration
and, more importantly, fix the problem.  I am thankful that Valve staff do
pay attention to this mailing list.

The subject of this post is a problem that has been around for a long
time.  I have been operating public gameservers off and on since 2006 and
have seen this occur with some update releases pretty much since then.  I
am not the first person to say this happens, and I am also pretty sure
this is not the first time I have had something to say about it here.

In short, the problem is incomplete updates.  The (gamedir)/steam.inf file
seems to be the file most often left out of updates, but sometimes other
updated or new files are involved.  Sometimes, apparently, the missed
files are ones that an "updated" server will crash without.

SCOPE OF PROBLEM
I should point out that I am specifically referring only to update
scenarios where the updater for a particular gameserver "tree" initiates
an update, pulls in files, AND RECEIVES A SUCCESSFUL "HLDS Installation Is
Up to Date" MESSAGE FROM THAT MASTER AND THE STEAM BINARY RESPONSIBLE FOR
UPDATING THE SERVER FILES EXITS WITH A 0 RETURN CODE when there are
actually files missing and the server cannot run properly afterwards as a
result.

The scenario in question also involves NOT using -verify_all among the
arguments passed to the steam binary to perform the update.

Whether the server in question is just using srcds_run with -autoupdate
among the command line arguments, or using nemrun or any other
properly-written script isn't important.

While it is possible to have incorrectly-written scripts to run and/or
update our servers, that isn't what I'm talking about here.  Nemrun and
other properly-written update scripts do not run "./steam -command update
-game tf -dir ." (or whatever) only once and ASSUME the update was
successful.  We can't do that because it is fairly common to get a
connection reset from whichever master we are pulling the update from
before it is complete.  The script has to check the return code from the
steam binary doing the update, and if it isn't 0, it has to repeat the
update until it IS 0.
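
For anyone who hasn't looked at how these scripts behave, here is a rough
Python sketch of that retry loop.  It is not nemrun's actual code.  The
command line is the one quoted above; the function names, the attempt
limit and the delay are my own illustrative assumptions.

```python
import subprocess
import time

# The command line quoted in the post; adjust -game and -dir for your server.
UPDATE_CMD = ["./steam", "-command", "update", "-game", "tf", "-dir", "."]


def run_update(cmd=UPDATE_CMD):
    """Run the updater once and return its exit code."""
    return subprocess.run(cmd).returncode


def update_until_success(run=run_update, max_attempts=10, delay_seconds=30):
    """Call run() until it returns 0; return the number of attempts used.

    max_attempts and delay_seconds are illustrative values, not from any
    real script.  Raises RuntimeError if every attempt returns non-zero.
    """
    for attempt in range(1, max_attempts + 1):
        if run() == 0:
            return attempt
        if attempt < max_attempts:
            # e.g. the master reset the connection; wait, then try again
            time.sleep(delay_seconds)
    raise RuntimeError(f"update still failing after {max_attempts} attempts")
```

Note that a loop like this can only be as good as the exit code it is
given.  If the steam binary returns 0 on an incomplete update, the loop
stops and has no way of knowing anything is wrong.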

Also, anyone at Valve investigating this problem needs to understand that
none of the scripts I'm talking about (nemrun, any other script, or your
own srcds_run when used with -autoupdate) does anything more than call the
steam binary with "-command update".  Nemrun in
updatedaemon mode may have its own implementation of checking in with
steam to see if an update is needed, but it does comply with your
protocols, and even when it does start an update, it only does it because
your masters said a required update was available.  And it calls the same
steam binary to do the actual updating that srcds_run does.

Guess why the steam.inf file is so important?  This isn't specific to
nemrun, by the way.  The file tells the dedicated server which version to
report to the steam masters.  So its contents (or its absence) mean
everything to that dedicated server when it checks in with the masters.
You could be 100% in sync with the latest dedicated server files, edit
steam.inf to show an older version, and the next time your server checks
in, the masters are going to tell it that it needs to update.  And until
your steam.inf file has the same patch level/version that the masters
think is the latest required version, your server will keep telling you to
restart for the latest update.  The same is true if the file isn't there
at all.  So it's not just nemrun that needs this file.  Nemrun needs it
for the same reasons the gameserver itself does.
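
To make that concrete, here is a minimal sketch of the comparison
described above.  I am assuming steam.inf is a plain key=value text file
and that the version sits under a PatchVersion key; the function names are
made up for illustration and this is not how the server or nemrun
literally implements it.

```python
from pathlib import Path


def read_steam_inf(game_dir):
    """Return the key=value pairs in <game_dir>/steam.inf, or None if missing."""
    path = Path(game_dir) / "steam.inf"
    if not path.is_file():
        return None
    fields = {}
    for line in path.read_text(errors="replace").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not key=value
            fields[key.strip()] = value.strip()
    return fields


def needs_update(game_dir, required_version, key="PatchVersion"):
    """True if steam.inf is missing or reports a different version.

    Assumption: the version is stored under the PatchVersion key.
    """
    fields = read_steam_inf(game_dir)
    if fields is None:
        return True  # no file at all looks the same as being out of date
    return fields.get(key) != required_version
```

With this logic, a server whose other files are all current still counts
as out of date whenever steam.inf is stale or gone, which is exactly the
restart loop people end up in.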

The nemrun scripts are out there for Valve and anybody else to look at. 
Please look at them before you say nemrun is the problem.  It isn't.

So what IS going on?  As far as I'm concerned we have a couple of
different issues here:

1. The steam binary can remove the existing steam.inf file while updating
a server, even if it doesn't have an updated one to replace it with.

2. The steam binary will exit with a 0 return code and claim that the HLDS
Installation is Up To Date when the server hasn't been fully updated.

Anyone seeing a pattern yet?
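
Both issues can at least be detected from the outside.  Here is a hedged
sketch of a check a script could do around an update: snapshot steam.inf
before calling the steam binary, snapshot it again afterwards, and compare.
The function names and result labels are my own, and the "unchanged" case
only means something when the masters have said a required update is out.

```python
from pathlib import Path


def steam_inf_state(game_dir):
    """Return the bytes of <game_dir>/steam.inf, or None if the file is absent."""
    path = Path(game_dir) / "steam.inf"
    return path.read_bytes() if path.is_file() else None


def classify_update(before, after, exit_code):
    """Label an update run using steam.inf snapshots taken before and after it.

    Only meaningful when a required update was actually announced.
    """
    if exit_code != 0:
        return "failed"  # the updater itself reported a problem
    if after is None:
        return "steam.inf missing"  # issue 1: removed and not replaced
    if after == before:
        return "steam.inf unchanged"  # issue 2: exit code 0, same version
    return "ok"
```

A script could call steam_inf_state() on the game directory on either side
of the update and treat anything other than "ok" as a reason to retry.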

STEAM BINARY

Now, let's talk about distributing updates among your masters.  Here are a
couple of questions anyone thinking about this rationally should ask:

1. Why does a master server tell you that an update is available if it
doesn't have ALL of the updated files yet?  If it doesn't have all the
updated files, it should not be telling you to restart your server UNTIL
IT DOES.

2. Why does the steam binary exit with a 0 RETURN CODE (successful) AND
EVEN TELL YOU 'HLDS Installation is Up To Date' if it doesn't have all the
files that got updated?


SUGGESTIONS
I don't have visibility into the inner workings of Valve's content
distribution system, and I am not aware of whatever policies and
procedures you may have for releasing updates (and I am not sure I would
want to be, heh).  So I have to admit I am making some guesses here.  But
I have a few suggestions that I think would be relevant.

1. Look at the "protocol" used by your masters, or maybe just the policies
and procedures for deploying updates onto them, which control when they
start issuing Server Out of Date messages to dedicated servers.  A master
should absolutely not, under any circumstance, start telling the dedicated
servers heartbeating in that they need to restart/update UNTIL IT HAS
*ALL* OF THE UPDATED FILES READY TO SERVE.

2. If files and directories that need to be pushed out as updated content
have to be "flagged" by the folks preparing the updates, so that the
masters know which files they are supposed to push out to the dedicated
servers for an update, HAVE SOMEONE QA CHECK THE FLAGGED FILES/DIRS LIST
before the
update release is approved (and the master servers start using it) to make
sure files weren't left off the list.  ESPECIALLY THE STEAM.INF FILE.  The
steam.inf file is ALWAYS updated for mandatory updates.  So checking the
updated files list should ALWAYS require making sure steam.inf is marked
as a file to push out.

3. It is possible that nemrun or other scripts like it are talking
indirectly to a master (through an API) that isn't "in sync" with
whichever master server the steam binary will talk to to pull in the
update.  This is still not nemrun's fault.  The nemrun script now uses
SteamAPI (https://api.steampowered.com) to see if an update is available.
If this API is telling "anyone that asks" that a required update is out
before EACH AND EVERY ONE of the master servers the steam binary will
actually pull the update in from has the complete set of update files
ready to serve, then this needs to be fixed.  My
suggestion would be that api.steampowered.com be the LAST thing you guys
push to - ie, only after ALL of the steam masters that will be serving
updated content have 100% of the update and are ready to serve it.  So the
API host won't say there is an update until it's ready to be served.
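
Since I can't see your release tooling, the following is only a toy sketch
of the gate these three suggestions add up to.  Everything in it is
hypothetical: the manifest format, the per-master file sets, the function
names and the "tf" default are invented for illustration.

```python
def missing_on_master(manifest, master_files):
    """Return the manifest entries a master cannot serve yet, sorted."""
    return sorted(set(manifest) - set(master_files))


def ready_to_announce(manifest, masters, game_dir="tf"):
    """True only if steam.inf is in the manifest and every master has every file.

    manifest: iterable of file paths in the update (hypothetical format).
    masters: dict mapping a master's name to the set of files it can serve.
    """
    if f"{game_dir}/steam.inf" not in set(manifest):
        return False  # suggestion 2: steam.inf must always be flagged
    if not masters:
        return False  # nothing can serve the update yet
    # suggestions 1 and 3: no announcement until ALL masters are complete
    return all(not missing_on_master(manifest, files) for files in masters.values())
```

The point of the sketch is the ordering: the "update available" answer,
whether it comes from a master or from the API host, is the last thing to
flip, and it flips only when this kind of check passes.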

CONCLUSION
Rather than just telling people "-verify_all" is the only recommended or
supported way to update your gameserver, find and fix the problems that
make a "./steam -command update" fail sometimes without -verify_all added
-- even if only to help yourselves out with the load on the masters when
updates get released.  I suspect the answer lies in one or more of the
items noted above.

People don't like to use -verify_all to get updates because it takes
FOREVER to finish.  Doing an update without -verify_all usually only takes
a minute or so - whether doing that gets you a complete updated server or
not depends entirely on what the steam binary pulls in.  Nothing else. 
You can't blame nemrun or something else when the steam binary exits with
a 0 return code and says "HLDS Installation Up To Date".

Cheers.
