I've just taken a squiz at an S3-based website we have, and via the S3 URL it 
is a CNAME with a 60-second TTL pointing at a set of A records with 5-second 
TTLs.

Any one dig returns the CNAME and a single IP address:

dig our-domain.s3-website-ap-southeast-2.amazonaws.com.
our-domain.s3-website-ap-southeast-2.amazonaws.com. 14 IN CNAME s3-website-ap-southeast-2.amazonaws.com.
s3-website-ap-southeast-2.amazonaws.com. 5 IN A 52.95.134.145

If the query is repeated several times, the returned IP address changes 
roughly every five seconds.
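
To watch the rotation without re-running dig by hand, a rough Python sketch along these lines could poll the name and collect the distinct addresses. The function names, attempt count and delay are my own choices for illustration, not anything AWS documents. It also goes through the system resolver, so local caching may hide some of the changes that dig shows.

```python
import socket
import time
from typing import Callable, List


def resolve_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the system resolver gives for `name`."""
    _canonical, _aliases, addresses = socket.gethostbyname_ex(name)
    return addresses


def observe_rotation(
    name: str,
    attempts: int = 6,
    delay: float = 5.0,  # assumed to match the 5-second A record TTL seen above
    resolve: Callable[[str], List[str]] = resolve_ipv4,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Resolve `name` repeatedly and return distinct addresses in first-seen order."""
    seen: List[str] = []
    for attempt in range(attempts):
        for address in resolve(name):
            if address not in seen:
                seen.append(address)
        if attempt < attempts - 1:
            sleep(delay)  # wait roughly one TTL before asking again
    return seen


# Example call (needs network access), using the name from the dig output:
#   observe_rotation("s3-website-ap-southeast-2.amazonaws.com")
```

The resolver and sleep function are passed in as parameters only so the loop can be exercised without touching the network.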

What's interesting is the name attached to the A records, which does not 
include "our-domain". It seems to be a record pointing to ALL S3 websites in 
the region. And all of the addresses I saw reverse-resolve to that one name. So 
there is definitely some under-the-bonnet magic discrimination going on.
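
My guess at the "magic" (an assumption on my part, not something I have verified against these endpoints here) is ordinary name-based virtual hosting: the client connects to whichever shared IP address it was given, and the HTTP Host header tells the server which site is wanted. A minimal sketch of what differs between two such requests, where "other-domain" is a made-up bucket name used only for illustration:

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request that names `host` in the Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",       # the only part that identifies the site
        "Connection: close",
        "",                    # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Both requests could be sent to the same IP address; only the Host line differs.
ours = build_get_request("our-domain.s3-website-ap-southeast-2.amazonaws.com")
# "other-domain" is hypothetical, purely for comparison.
theirs = build_get_request("other-domain.s3-website-ap-southeast-2.amazonaws.com")
```

If that guess is right, the shared A record name and the shared reverse-resolution are exactly what you would expect to see.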

In Route53 the picture is very different, with the published website host name 
(think "our-domain.com.au") resolving to four IP addresses that are all 
returned in the response to a single dig query. There is an A-ALIAS (a 
non-standard AWS record type) that points to a CloudFront distribution that has 
the relevant S3 bucket as its origin.

Using the CNAME bypasses the CloudFront distribution unless steps are taken to 
forbid direct access to the bucket. It would be usual to use (and enforce) 
access via CloudFront, if for no other reason than to provide for HTTPS access. 

---

So, depending on what query you make, you get very different answers. For 
example, if you try s3.amazon.com you get a CNAME to rewrite.amazon.com, which 
seems reasonable for any subdomain request that they would have a better 
response for. 

I don't remember the details, but they may be moving to deterministic 
subdomains as you've shown above, with only "legacy" uses going to 
s3.amazonaws.com. I remember hearing 
a big uproar about it. Perhaps an AWS person will chime in with some color on 
this.

So a deterministic subdomain pointing to a group of relatively deterministic 
endpoints, even round-robin, makes sense to me, as in "usual in the practice of the 
art." Even if those systems end up being load balancers for other systems 
behind them.

The s3.amazonaws.com setup is different from that. I'm guessing that no one (else) 
uses this sort of single IP from a pool trick and therefore it's not standard. 
Further, given that AWS appears to be moving *back* to the traditional way of 
doing things, there must be undesirable limitations to this model.

[just spitballing here]

Deepak
