A few quick thoughts, and a few useful resources.
* Develop against the technology you're going to use in production. It
seems absurd to have to say, but it's amazing how many apps out
there are tuned for something like Microsoft Jet or SQL Express
because that's what the dev was running on their local machine. If
you're going to run on MySQL in production, develop against it.
Different databases handle queries in different ways; what's fast on
one isn't necessarily fast on the other. And for pity's sake, make
sure it's the same *version* too. MySQL 5.5 performs a lot better
than MySQL 5.1, for example, and 5.6 is faster than 5.5. Matching
down to the minor version is good as well: 5.5.12 (IIRC) had an
interesting performance bug that bit us.
* Also aim to develop against data of at least reasonable scale. It's
nice that your function runs quickly against 10,000 rows in your dev
database, but what if it typically runs against a million in
production? I worked with a dev whose stored function ran in <1s
against the dev database but would typically take over a minute and
a half against production in real-world scenarios.
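A minimal sketch of the idea, using Python's built-in sqlite3 as a
stand-in for whatever database you actually run (the table and column
names here are made up for illustration): pad the dev database toward
production-like row counts so query plans and timings resemble the
real thing.

```python
import sqlite3
import random
import string

# In-memory stand-in for a dev database; swap for your real driver/DSN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

def seed(n):
    """Bulk-insert n rows of throwaway data so the dev DB has
    production-like volume, not a 10,000-row demo set."""
    rows = ((''.join(random.choices(string.ascii_lowercase, k=8)),)
            for _ in range(n))
    conn.executemany("INSERT INTO users (name) VALUES (?)", rows)
    conn.commit()

seed(100_000)  # aim for the scale production actually sees
print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])
```

Now an `EXPLAIN` or a timing run in dev tells you something about how
the same query will behave against production-sized tables.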
* Write your apps with split master/slave SQL from the outset, even if
you're pointing them at a single database to start with. That way
it's a lot easier to bring in master/slave down the line, rather
than having to go back through a bunch of code later to point
queries at the appropriate servers. Even if you end up with a
database architecture that doesn't use master/slave, the amount of
work required at the outset is fairly minimal. (This is an ongoing
disagreement I have with a few of our developers.) Should you find
yourself needing to scale rapidly, it'll be a lot easier to do so,
buying yourself time to consider how best to scale your app further
(e.g. would sharding be preferable?).
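One hedged sketch of what "split from the outset" can look like in
application code, again using sqlite3 purely as a placeholder driver
(the class and method names are invented for illustration): both DSNs
point at the same database today, and only configuration changes when
a real replica appears.

```python
import sqlite3  # stand-in; swap for your production driver

class RoutedDB:
    """Route reads to a replica and writes to the master.

    With no replica configured, reads simply fall back to the
    master, so a single-database deployment works unchanged.
    """
    def __init__(self, master_dsn, replica_dsn=None):
        self.master = sqlite3.connect(master_dsn)
        self.replica = (sqlite3.connect(replica_dsn)
                        if replica_dsn else self.master)

    def read(self, sql, params=()):
        # All SELECTs go through here, so retargeting them later
        # means changing one line, not auditing the whole codebase.
        return self.replica.execute(sql, params).fetchall()

    def write(self, sql, params=()):
        cur = self.master.execute(sql, params)
        self.master.commit()
        return cur.rowcount

db = RoutedDB(":memory:")  # single database for now
db.write("CREATE TABLE users (name TEXT)")
db.write("INSERT INTO users VALUES (?)", ("trey",))
print(db.read("SELECT name FROM users"))
```

The point is only that every query already declares whether it is a
read or a write; that is the cheap part to get right early.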
* Cache, cache, cache, cache, cache. There's no point wasting
CPU/database time answering the same frequent queries. Cache
wherever possible, as close to the edge as feasible, but be
conscious of security: you don't want to cache secure content and
return it to the wrong person. Stick in a memcached layer or
similar.
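A tiny in-process sketch of both halves of that advice (the decorator
and function names are hypothetical; in production the dict would be a
memcached client, but the keying idea is the same):

```python
import time
import functools

def cached(ttl_seconds=60):
    """Minimal TTL cache decorator; a memcached layer would replace
    the in-process dict, the key scheme carries over unchanged."""
    store = {}
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            # Include every argument in the key -- e.g. the user id --
            # so one user's cached content never leaks to another.
            key = (fn.__name__,) + args
            hit = store.get(key)
            now = time.monotonic()
            if hit is not None and now - hit[1] < ttl_seconds:
                return hit[0]
            value = fn(*args)
            store[key] = (value, now)
            return value
        return wrapper
    return decorator

calls = []

@cached(ttl_seconds=300)
def expensive_profile(user_id):
    calls.append(user_id)          # stands in for a slow query
    return {"user": user_id}

expensive_profile(1)
expensive_profile(1)               # served from cache
expensive_profile(2)               # different key, cache miss
print(len(calls))
```

Keying on the user keeps per-user content safe; shared, non-sensitive
content can use a coarser key and be pushed further toward the edge.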
* Make static whatever can be static. There's no point generating an
entire page from code if the content stays the same for hours at a
time.
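One simple pattern for that, sketched in Python with invented names
(`render_page`, the cache directory, and the TTL are all placeholder
assumptions): render once, write the result to disk, and serve the
file until it goes stale.

```python
import os
import time
import tempfile

# Hypothetical cache location and freshness window.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "page_cache")
TTL = 3600  # regenerate at most once an hour

def render_page(slug):
    # Stand-in for the real, expensive page-generation code.
    return "<html><body>Page: %s</body></html>" % slug

def get_page(slug):
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, slug + ".html")
    # Serve the static file while it's fresh...
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < TTL:
        with open(path) as f:
            return f.read()
    # ...otherwise regenerate once and write it out for everyone else.
    html = render_page(slug)
    with open(path, "w") as f:
        f.write(html)
    return html

print(get_page("about"))
```

In practice you'd let the web server (nginx, Apache) serve the cached
file directly and only fall through to the app on a miss.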
* Use config management: Puppet, Chef, CFEngine, whatever. If you're
lucky enough to have your site go viral, you're going to need to
scale very quickly, and you don't want to be building and
configuring every server from scratch when config management can do
it for you in minutes.
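To make that concrete, here's a hedged sketch of what a manifest for
this might look like in Puppet (one of the tools named above; the
class name and file source are made up): every web node converges to
the same state, so a fresh server comes up production-ready.

```puppet
# Hypothetical minimal Puppet class for a web node.
class webserver {
  package { 'nginx':
    ensure => installed,
  }
  service { 'nginx':
    ensure  => running,
    enable  => true,
    require => Package['nginx'],
  }
  file { '/etc/nginx/nginx.conf':
    source => 'puppet:///modules/webserver/nginx.conf',
    notify => Service['nginx'],
  }
}
```

Assign the class to a node and the tool handles install, config, and
service state; adding server number fifty is the same as adding
server number two.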
* On that front, if you're running your own physical servers, don't
forget the cloud can provide temporary breathing space. A former
colleague of mine works for a start-up that much prefers physical
hardware. Their monitoring system tracks overall site usage and
performance metrics. Should they start to reach a defined threshold,
the monitoring system automatically triggers provisioning of a new
virtual server and tells the config management about it, so the
machine is set up and added into the loop without a finger being
lifted. The tech team is e-mailed at the same time so they can
investigate if necessary, and if it happens frequently enough they
can consider buying more hardware.
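The control loop in that story is simple enough to sketch. This is a
hypothetical outline, not their actual system: `provision` and
`notify` here stand in for whatever cloud API and mailer you use.

```python
# Fraction of capacity in use that triggers scaling (made-up number).
THRESHOLD = 0.8

def check_and_scale(current_load, provision, notify):
    """Poll-loop body: provision a server and tell the humans when
    load crosses the threshold. Returns True if we scaled."""
    if current_load < THRESHOLD:
        return False
    provision()   # e.g. boot a VM and register it with config mgmt
    notify("load %.2f >= %.2f, provisioning a server"
           % (current_load, THRESHOLD))
    return True

events = []
scaled = check_and_scale(
    0.9,
    lambda: events.append("provisioned"),
    lambda msg: events.append("mailed: " + msg),
)
print(scaled, events)
```

The interesting part is the hand-off: provisioning only works hands-
free because config management (previous bullet) can finish building
the box without a human in the loop.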
* http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
Netflix's Chaos Monkey. Build your platform to keep working as well
as possible even when certain services are down.
For scalability stuff, http://highscalability.com/ is worth a look.
They'll talk about different architectures and approaches taken by
various large sites.
Paul
On 10/31/2011 4:26 AM, Trey Darley wrote:
Hey there, LOPSA hivemind! Greets from Sunny Brussels.
I routinely bump into people trying to cook a web startup. Fairly
typical is one guy with an idea and a bit of technical ability trying to
get a product to market without much capital expenditure.
I find myself having this conversation repeatedly: someone doesn't
have customer one *yet*, but they're already trying to think about
scaling to $really_large_number. I think, really, there is some
low-hanging fruit for such cases. Premature optimization is the root of
all evil, true, but on the other hand one can also make early
architectural choices that impose otherwise avoidable scalability limits.
Why am I writing? Two things:
0) Does a sort of ten commandments, dos and don'ts of scalability for
tech startups exist? (I don't want to reinvent the wheel.)
1) Given the above, wouldn't this be a valuable addition to the LOPSA
site? (Rather like the old tools database, which sadly seems to have
disappeared.) If we put our brains together and thrash out a useful 'ten
commandments of scalability' page we could post/upvote it on HN and
thereby garner some nice publicity for LOPSA.
Cheers,
- --Trey
P.S. Speaking of which, lopsa.org seems rather unresponsive. Is it just
me? Perhaps this ought to be addressed before (hypothetically) driving a
bunch of HN traffic to the site. :-/
++----------------------------------------------------------------------------++
Trey Darley - Brussels
mobile: +32/494.766.080
twitter: @treyka
++----------------------------------------------------------------------------++
Quis custodiet ipsos custodes?
++----------------------------------------------------------------------------++
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/