Re: William was raided for running a Tor exit node. Please help if you can.

2012-11-30 Thread Rayson Ho
On Fri, Nov 30, 2012 at 4:46 PM, Jimmy Hess mysi...@gmail.com wrote:
 If they had a qualified technician,  they probably wouldn't be raiding
 a TOR exit node in the first place;   they would have investigated the
 matter  more thoroughly, and saved precious time.

And what if the TOR exit node was in the cloud? Are they going to
confiscate millions of servers just because a few of them were hosting
child pornography??

(I am a believer in Cloud Computing, and in fact earlier this month we
had a 10,000-node Grid Engine HPC cluster running in Amazon EC2:
http://blogs.scalablelogic.com/2012/11/running-1-node-grid-engine-cluster.html
)

I believe most Cloud providers (Google, Amazon, IBM, etc.) have some
sort of disclaimer clause... but then one can get a VPN account easily
too (there are many free ones as well)! So how could VPN providers,
local coffee shops, and cloud providers protect themselves from this
kind of nonsense??

Rayson

==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/



 --
 -JH




Re: FYI Netflix is down

2012-07-09 Thread Rayson Ho
On Sun, Jul 8, 2012 at 8:27 PM, steve pirk [egrep] st...@pirk.com wrote:
 I am pretty sure Netflix and others were trying to do it right, as they
 all had graceful fail-over to a secondary AWS zone defined.
 It looks to me like Amazon uses DNS round-robin to load balance the zones,
 because they mention returning a list of addresses for DNS queries, and
 explains the failure of the services to shunt over to other zones in their
 postmortem.

There are also bugs from the Netflix side uncovered by the AWS outage:

Lessons Netflix Learned from the AWS Storm

http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html

For an infrastructure this large, whether you are running your own
datacenter or using the cloud, it is certain that the code is not bug
free. Another thing: if everything is too automated, then a failure in
one component can trigger bugs in areas that no one has ever thought
of...

Rayson

==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/






 Elastic Load Balancers (ELBs) allow web traffic directed at a single IP
 address to be spread across many EC2 instances. They are a tool for high
 availability as traffic to a single end-point can be handled by many
 redundant servers. ELBs live in individual Availability Zones and front EC2
 instances in those same zones or in other Availability Zones.



 ELBs can also be deployed in multiple Availability Zones. In this
 configuration, each Availability Zone’s end-point will have a separate IP
 address. A single Domain Name will point to all of the end-points’ IP
 addresses. When a client, such as a web browser, queries DNS with a Domain
 Name, it receives the IP address (“A”) records of all of the ELBs in random
 order. While some clients only process a single IP address, many (such as
 newer versions of web-browsers) will retry the subsequent IP addresses if
 they fail to connect to the first. A large number of non-browser clients
 only operate with a single IP address.
 During the disruption this past Friday night, the control plane (which
 encompasses calls to add a new ELB, scale an ELB, add EC2 instances to an
 ELB, and remove traffic from ELBs) began performing traffic shifts to
 account for the loss of load balancers in the affected Availability Zone.
 As the power and systems returned, a large number of ELBs came up in a
 state which triggered a bug we hadn’t seen before. The bug caused the ELB
 control plane to attempt to scale these ELBs to larger ELB instance sizes.
 This resulted in a sudden flood of requests which began to backlog the
 control plane. At the same time, customers began launching new EC2
 instances to replace capacity lost in the impacted Availability Zone,
 requesting the instances be added to existing load balancers in the other
 zones. These requests further increased the ELB control plane backlog.
 Because the ELB control plane currently manages requests for the US East-1
 Region through a shared queue, it fell increasingly behind in processing
 these requests; and pretty soon, these requests started taking a very long
 time to complete.

  http://aws.amazon.com/message/67457/
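
The client-side behaviour described in the excerpt above (a name resolving to several A records, with some clients retrying the next address and others giving up after the first) can be sketched roughly as below. This is only an illustration under stated assumptions: the function name connect_with_fallback and its resolve/connect parameters are hypothetical, added so the loop can be exercised without a network. Python's own socket.create_connection already performs a similar loop internally.

```python
import socket


def connect_with_fallback(host, port, timeout=3.0,
                          resolve=socket.getaddrinfo,
                          connect=socket.create_connection):
    """Try every address returned for `host` until one accepts a connection.

    `resolve` and `connect` are hypothetical injection points for this
    sketch; by default they are the standard library calls.
    """
    last_error = None
    # getaddrinfo returns one entry per resolved address (A/AAAA record).
    for _family, _type, _proto, _name, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        try:
            # sockaddr is (ip, port) for IPv4 and (ip, port, flow, scope)
            # for IPv6; the first two fields are enough to connect.
            return connect(sockaddr[:2], timeout)
        except OSError as exc:
            # This end-point is unreachable; remember why and move on.
            last_error = exc
    if last_error is None:
        raise OSError("no addresses returned for %s" % host)
    raise last_error
```

A client that only ever uses the first returned address behaves as if this loop stopped after one iteration, which is why it sees an outage when that one end-point sits in the failed zone.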


 *In reality, though, Amazon data centers have outages all the time. In
 fact, Amazon tells its customers to plan for this to happen, and to be
 ready to roll over to a new data center whenever there’s an outage.*

 *That’s what was supposed to happen at Netflix Friday night. But it
 didn’t work out that way. According to Twitter messages from Netflix
 Director of Cloud Architecture Adrian Cockcroft and Instagram Engineer Rick
 Branson, it looks like an Amazon Elastic Load Balancing service, designed
 to spread Netflix’s processing loads across data centers, failed during the
 outage. Without that ELB service working properly, the Netflix and Pinterest
 services hosted by Amazon crashed.*

  http://www.wired.com/wiredenterprise/2012/06/real-clouds-crush-amazon/

 I am a big believer in using hardware to load balance data centers, and not
 leave it up to software in the data center which might fail.

 Speaking of services like RightScale, Google announced Compute Engine at
 Google I/O this year. BuildFax was an early adopter, and they gave it great
 reviews...
 http://www.youtube.com/watch?v=LCjSJ778tGU

 It looks like Google has entered into the VPS market. 'bout time... ;-]
 http://cloud.google.com/products/compute-engine.html

 --steve pirk



Re: FYI Netflix is down

2012-06-30 Thread Rayson Ho
If I recall correctly, availability zone (AZ) mappings are specific to
an AWS account, and in fact there is no way to know if you are running
in the same AZ as another AWS account:

http://aws.amazon.com/ec2/faqs/#How_can_I_make_sure_that_I_am_in_the_same_Availability_Zone_as_another_developer


Also, AWS Elastic Load Balancer (and/or CloudWatch) should be able to
detect that some instances are not reachable, and thus can start new
instances and remap DNS entries automatically:
http://aws.amazon.com/elasticloadbalancing/
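
As a rough illustration of the "detect that some instances are not reachable" part, a load balancer style health check usually flips an instance's state only after several consecutive probe results disagree with the current state. The sketch below is not AWS code; the class name HealthTracker and the default thresholds are assumptions made for the example.

```python
class HealthTracker:
    """Track one instance's health from a stream of probe results.

    Hypothetical sketch: the state changes only after enough consecutive
    probes contradict it, so a single dropped probe does not evict a
    working instance.
    """

    def __init__(self, unhealthy_threshold=2, healthy_threshold=3):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True   # instances start out in service
        self._streak = 0      # consecutive probes contradicting the state

    def record(self, probe_ok):
        """Feed one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state: reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = (self.healthy_threshold if probe_ok
                 else self.unhealthy_threshold)
        if self._streak >= limit:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy
```

With the defaults, two failed probes in a row mark the instance out of service and three successful ones bring it back. Launching replacements and remapping DNS would be separate steps driven by that state.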


This time only 1 AZ is affected by the power outage, so sites with
fault tolerance built into their AWS infrastructure should be able to
handle the issues relatively easily.

Rayson

==
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/



On Fri, Jun 29, 2012 at 11:44 PM, Grant Ridder shortdudey...@gmail.com wrote:
 I have an instance in zone C and it is up and fine, so it must be A, B, or
 D that is down.

 On Fri, Jun 29, 2012 at 10:42 PM, James Laszko jam...@mythostech.com wrote:

 To further expand:

 8:21 PM PDT We are investigating connectivity issues for a number of
 instances in the US-EAST-1 Region.

  8:31 PM PDT We are investigating elevated errors rates for APIs in the
 US-EAST-1 (Northern Virginia) region, as well as connectivity issues to
 instances in a single availability zone.

  8:40 PM PDT We can confirm that a large number of instances in a single
 Availability Zone have lost power due to electrical storms in the area. We
 are actively working to restore power.

 -Original Message-
 From: Grant Ridder [mailto:shortdudey...@gmail.com]
 Sent: Friday, June 29, 2012 8:42 PM
 To: Jason Baugher
 Cc: nanog@nanog.org
 Subject: Re: FYI Netflix is down

 From Amazon

 Amazon Elastic Compute Cloud (N. Virginia)  (http://status.aws.amazon.com/
 )
 8:21 PM PDT We are investigating connectivity issues for a number of
 instances in the US-EAST-1 Region.
 8:31 PM PDT We are investigating elevated errors rates for APIs in the
 US-EAST-1 (Northern Virginia) region, as well as connectivity issues to
 instances in a single availability zone.

 -Grant

 On Fri, Jun 29, 2012 at 10:40 PM, Jason Baugher ja...@thebaughers.com
 wrote:

  Seeing some reports of Pinterest and Instagram down as well. Amazon
  cloud services being implicated.
 
 
  On 6/29/2012 10:22 PM, Joe Blanchard wrote:
 
  Seems that they are unreachable at the moment. Called and theres a
  recorded message stating they are aware of an issue, no details.
 
  -Joe