I've just taken a squiz at an S3-based website we have, and via the S3 URL it
is a CNAME with a 60-second TTL pointing at a set of A records with 5-second
TTLs.
Any one dig returns the CNAME and a single IP address:
    dig our-domain.s3-website-ap-southeast-2.amazonaws.com.
    our-domain.s3-website-ap-southeast-2.amazonaws.com. 14 IN CNAME s3-website-ap-southeast-2.amazonaws.com.
    s3-website-ap-southeast-2.amazonaws.com. 5 IN A 52.95.134.145
If the query is repeated, the returned IP address changes roughly every five
seconds, which matches the 5-second TTL on the A record.
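
For anyone who wants to watch the rotation without running dig by hand, here
is a minimal Python sketch. It is only an illustration: the function names
(resolve_ipv4, sample_addresses) and the default round count and pause are my
own assumptions, and it goes through the system resolver, so a local caching
resolver may hide some of the churn that dig shows.

```python
import socket
import time
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def sample_addresses(
    hostname: str,
    rounds: int = 6,
    pause: float = 5.0,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Resolve hostname `rounds` times, `pause` seconds apart.

    Returns every distinct address seen, in order of first appearance.
    The resolver and sleep hooks are hypothetical conveniences so the
    function can be exercised without touching the network.
    """
    seen: List[str] = []
    for i in range(rounds):
        for addr in resolver(hostname):
            if addr not in seen:
                seen.append(addr)
        if i < rounds - 1:
            sleep(pause)
    return seen


# Example call (needs network access, so it is left commented out):
# print(sample_addresses("our-domain.s3-website-ap-southeast-2.amazonaws.com"))
```

A pause of about five seconds lines up with the A record TTL above; a shorter
pause would mostly return the same cached answer.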
What's interesting is the name attached to the A records, which does not
include "our-domain". It seems to be a record pointing to ALL S3 websites in
the region. And all of the addresses I saw reverse-resolve to that one name. So
there is definitely some under-the-bonnet magic discrimination going on.
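
One plausible mechanism for that discrimination (an assumption on my part,
not something the dig output proves) is ordinary HTTP virtual hosting: every
bucket's website name resolves to the same shared addresses, and the endpoint
picks the bucket from the Host header of the request. The sketch below only
builds the request bytes to show where the bucket-specific name would travel.
build_get_request is a hypothetical helper, not part of any AWS SDK.

```python
def build_get_request(bucket_hostname: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request for a name-based virtual host.

    The IP address the request is sent to can be shared by many sites;
    the Host header is what names the site the client actually wants.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {bucket_hostname}",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("our-domain.s3-website-ap-southeast-2.amazonaws.com")
```

Sending the same bytes to any of the rotating addresses should, under this
assumption, reach the same bucket, since nothing in the request depends on
which IP was picked.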
In Route53 the picture is very different, with the published website host name
(think "our-domain.com.au") resolving to four IP addresses that are all
returned in the response to a single dig query. There is an A-ALIAS (a
non-standard AWS record type) that points to a CloudFront distribution that has
the relevant S3 bucket as its origin.
Using the CNAME bypasses the CloudFront distribution unless steps are taken to
forbid direct access to the bucket. It would be usual to use (and enforce)
access via CloudFront, if for no other reason than to provide for HTTPS access.
---
So, depending on what query you make, you get very different answers. For
example, if you try s3.amazon.com you get a CNAME to rewrite.amazon.com, which
seems a reasonable way to handle any subdomain request that they have a better
response for.
I don't remember the details, but they may be moving to deterministic
subdomains as you've shown above, with only "legacy" uses going to
s3.amazonaws.com. I remember hearing a big uproar about it. Perhaps an AWS
person will chime in with some color on this.
So a deterministic subdomain pointing to a group of relatively deterministic
endpoints, even round-robin, makes sense to me, as in "usual in the practice
of the art," even if those systems end up being load balancers for other
systems behind them.
s3.amazonaws.com is different from that. I'm guessing that no one else uses
this sort of single-IP-from-a-pool trick, and therefore it's not standard.
Further, given that AWS appears to be moving *back* to the traditional way of
doing things, there must be undesirable limitations to this model.
[just spitballing here]
Deepak