Great to know about all this... Thanks!!

On Sun, Jun 20, 2010 at 10:13 AM, Bipin Gautam <[email protected]> wrote:

> Exploring the software behind Facebook, the world’s largest site
> Posted in Main on June 18th, 2010 by Pingdom
>
>
> At the scale that Facebook operates, a lot of traditional approaches
> to serving web content break down or simply aren’t practical. The
> challenge for Facebook’s engineers has been to keep the site up and
> running smoothly in spite of handling close to half a billion active
> users. This article takes a look at some of the software and
> techniques they use to accomplish that.
>
>
> Facebook’s scaling challenge
> Before we get into the details, here are a few factoids to give you an
> idea of the scaling challenge that Facebook has to deal with:
>
>    * Facebook serves 570 billion page views per month (according to
> Google Ad Planner).
>    * There are more photos on Facebook than all other photo sites
> combined (including sites like Flickr).
>    * More than 3 billion photos are uploaded every month.
>    * Facebook’s systems serve 1.2 million photos per second. This
> doesn’t include the images served by Facebook’s CDN.
>    * More than 25 billion pieces of content (status updates,
> comments, etc) are shared every month.
>    * Facebook has more than 30,000 servers (and this number is from last
> year!)
>
>
> Software that helps Facebook scale
> In some ways Facebook is still a LAMP site (kind of), but it has had
> to change and extend its operation to incorporate a lot of other
> elements and services, and modify the approach to existing ones.
>
> For example:
>
>    * Facebook still uses PHP, but it has built a compiler for it so
> it can be turned into native code on its web servers, thus boosting
> performance.
>    * Facebook uses Linux, but has optimized it for its own purposes
> (especially in terms of network throughput).
>    * Facebook uses MySQL, but primarily as a persistent key-value
> store, moving joins and logic onto the web servers since
> optimizations are easier to perform there (on the “other side” of the
> Memcached layer).
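
To make the MySQL point above concrete, here is a rough sketch of what "joins on the web servers" can look like. It is only an illustration: the two dictionaries are hypothetical stand-ins for MySQL tables read by primary key, and the names `USERS`, `STATUS_BY_USER`, `get_many` and `statuses_with_names` are made up for this example, not taken from Facebook's code.

```python
# Hypothetical in-memory stand-ins for two MySQL tables that are only
# ever read by primary key (the "key-value" usage described above).
USERS = {1: {"name": "alice"}, 2: {"name": "bob"}}
STATUS_BY_USER = {1: ["hello world"], 2: ["first post", "second post"]}


def get_many(table, keys):
    """Key-value style lookup: return the rows for the keys that exist."""
    return {k: table[k] for k in keys if k in table}


def statuses_with_names(user_ids):
    """Join users to their statuses in application code instead of SQL."""
    users = get_many(USERS, user_ids)
    statuses = get_many(STATUS_BY_USER, user_ids)
    return [
        (users[uid]["name"], text)
        for uid in user_ids
        if uid in users  # unknown users are skipped
        for text in statuses.get(uid, [])
    ]


if __name__ == "__main__":
    for name, text in statuses_with_names([2, 1, 3]):
        print(name, text)
```

The database only answers simple lookups by key, and the web tier combines the results, which is the part that can then sit behind a cache.
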
>
>
> Then there are the custom-written systems, like Haystack, a highly
> scalable object store used to serve Facebook’s immense amount of
> photos, or Scribe, a logging system that can operate at the scale of
> Facebook (which is far from trivial).
>
> But enough of that. Let’s present (some of) the software that Facebook
> uses to provide us all with the world’s largest social network site.
>
>
> Memcached
> Memcached is by now one of the most famous pieces of software on the
> internet. It’s a distributed memory caching system which Facebook (and
> a ton of other sites) uses as a caching layer between the web servers
> and MySQL servers (since database access is relatively slow). Through
> the years, Facebook has made a ton of optimizations to Memcached and
> the surrounding software (like optimizing the network stack).
>
> Facebook runs thousands of Memcached servers with tens of terabytes of
> cached data at any one point in time. It is likely the world’s largest
> Memcached installation.
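
The caching-layer idea described above is often implemented as a "cache-aside" read path. The sketch below is a minimal illustration under stated assumptions: a plain dictionary stands in for a Memcached client, `load_from_db` is a hypothetical function standing in for a MySQL query, and the class name `CacheAside` is invented for this example. It is not how Facebook's code is written.

```python
class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._cache = {}           # stand-in for a Memcached client
        self._load = load_from_db  # hypothetical slow database lookup
        self.db_reads = 0          # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # cache hit: no database access
        self.db_reads += 1
        value = self._load(key)      # cache miss: go to the database
        self._cache[key] = value     # populate the cache for the next reader
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    cache = CacheAside(lambda key: key.upper())
    cache.get("profile:1")
    cache.get("profile:1")
    print(cache.db_reads)
```

Repeated reads of the same key touch the database once; a write would call `invalidate` so the next read refreshes the cached value.
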
>
>
> HipHop for PHP
> PHP, being a scripting language, is relatively slow when
> compared to code that runs natively on a server. HipHop converts PHP
> into C++ code which can then be compiled for better performance. This
> has allowed Facebook to get much more out of its web servers since
> Facebook relies heavily on PHP to serve content.
>
> A small team of engineers (initially just three of them) at Facebook
> spent 18 months developing HipHop, and it is now live in production.
>
>
> Haystack
> Haystack is Facebook’s high-performance photo storage/retrieval system
> (strictly speaking, Haystack is an object store, so it doesn’t
> necessarily have to store photos). It has a ton of work to do; there
> are more than 20 billion uploaded photos on Facebook, and each one is
> saved in four different resolutions, resulting in more than 80 billion
> photos.
>
> And it’s not just about being able to handle billions of photos;
> performance is critical. As we mentioned previously, Facebook serves
> around 1.2 million photos per second, a number which doesn’t include
> images served by Facebook’s CDN. That’s a staggering number.
>
>
> BigPipe
> BigPipe is a dynamic web page serving system that Facebook has
> developed. Facebook uses it to serve each web page in sections (called
> “pagelets”) for optimal performance.
>
> For example, the chat window is retrieved separately, the news feed is
> retrieved separately, and so on. These pagelets can be retrieved in
> parallel, which is where the performance gain comes in, and it also
> gives users a site that works even if some part of it is
> deactivated or broken.
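
The pagelet idea above can be sketched in a few lines. This only illustrates two properties the article mentions, rendering sections in parallel and keeping the page usable when one section fails; it does not show how pagelets are delivered to the browser. The function name `render_page` and the placeholder comment format are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(pagelets):
    """Render each pagelet in parallel; a failing pagelet yields a placeholder.

    `pagelets` maps a pagelet name to a zero-argument render function.
    """
    def safe_render(item):
        name, render = item
        try:
            return name, render()
        except Exception:
            # One broken section must not take the whole page down.
            return name, "<!-- %s unavailable -->" % name

    workers = max(1, len(pagelets))  # ThreadPoolExecutor needs at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(safe_render, pagelets.items()))


def broken_feed():
    raise RuntimeError("feed backend is down")


if __name__ == "__main__":
    page = render_page({"chat": lambda: "<div>chat</div>", "feed": broken_feed})
    print(page)
```

Here the chat pagelet still renders even though the feed pagelet raises an error.
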
>
>
> Cassandra
> Cassandra is a distributed storage system with no single point of
> failure. It’s one of the poster children for the NoSQL movement and
> has been made open source (it’s even become an Apache project).
> Facebook uses it for its Inbox search.
>
> Other than Facebook, a number of other services use it, for example
> Digg. We’re even considering some uses for it here at Pingdom.
>
>
> Scribe
> Scribe is a flexible logging system that Facebook uses for a multitude
> of purposes internally. It’s been built to be able to handle logging
> at the scale of Facebook, and automatically handles new logging
> categories as they show up (Facebook has hundreds).
>
>
> Hadoop and Hive
> Hadoop is an open source map-reduce implementation that makes it
> possible to perform calculations on massive amounts of data. Facebook
> uses this for data analysis (and as we all know, Facebook has massive
> amounts of data). Hive originated from within Facebook, and makes it
> possible to use SQL queries against Hadoop, making it easier for
> non-programmers to use.
>
> Both Hadoop and Hive are open source (Apache projects) and are used by
> a number of big services, for example Yahoo and Twitter.
>
>
> Thrift
> Facebook uses several different languages for its different services.
> PHP is used for the front-end, Erlang is used for Chat, Java and C++
> are also used in several places (and perhaps other languages as well).
> Thrift is an internally developed cross-language framework that ties
> all of these different languages together, making it possible for them
> to talk to each other. This has made it much easier for Facebook to
> keep up its cross-language development.
>
> Facebook has made Thrift open source and support for even more
> languages has been added.
>
>
> Varnish
> Varnish is an HTTP accelerator which can act as a load balancer and
> also cache content which can then be served lightning-fast.
>
> Facebook uses Varnish to serve photos and profile pictures, handling
> billions of requests every day. Like almost everything Facebook uses,
> Varnish is open source.
>
>
> Other things that help Facebook run smoothly
> We have mentioned some of the software that makes up Facebook’s
> system(s) and helps the service scale properly. But handling such a
> large system is a complex task, so we thought we would list a few more
> things that Facebook does to keep its service running smoothly.
>
>
> Gradual releases and dark launches
> Facebook has a system called Gatekeeper that lets them run
> different code for different sets of users (it basically introduces
> different conditions in the code base). This lets Facebook do gradual
> releases of new features, A/B testing, activate certain features only
> for Facebook employees, etc.
>
> Gatekeeper also lets Facebook do something called “dark launches”,
> which is to activate elements of a certain feature behind the scenes
> before it goes live (without users noticing since there will be no
> corresponding UI elements). This acts as a real-world stress test and
> helps expose bottlenecks and other problem areas before a feature is
> officially launched. Dark launches are usually done two weeks before
> the actual launch.
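
Gatekeeper itself is not public, so the following is only a generic sketch of the kind of check a gradual release needs: a deterministic percentage rollout plus an employee override. The function names `bucket` and `is_enabled`, the hashing scheme, and the feature name are all assumptions made for illustration.

```python
import hashlib


def bucket(feature, user_id):
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature, user_id, rollout_percent, employees=frozenset()):
    """Decide whether this user runs the new code path for a feature."""
    if user_id in employees:
        return True  # employees-only access before any public rollout
    return bucket(feature, user_id) < rollout_percent


if __name__ == "__main__":
    staff = frozenset({1})
    for uid in (1, 2, 3):
        print(uid, is_enabled("new_inbox", uid, 10, staff))
```

Because the bucket is derived from a hash rather than a random draw, a given user stays in the same group across requests, and raising `rollout_percent` only ever adds users.
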
>
>
> Profiling of the live system
> Facebook carefully monitors its systems (something we here at Pingdom
> of course approve of), and interestingly enough it also monitors the
> performance of every single PHP function in the live production
> environment. This profiling of the live PHP environment is done using
> an open source tool called XHProf.
>
>
> Gradual feature disabling for added performance
> If Facebook runs into performance issues, there are a large number of
> levers that let them gradually disable less important features to
> boost performance of Facebook’s core features.
>
>
> The things we didn’t mention.
> We didn’t go much into the hardware side in this article, but of
> course that is also an important aspect when it comes to scalability.
> For example, like many other big sites, Facebook uses a CDN to help
> serve static content. And then of course there is the huge data center
> Facebook is building in Oregon to help it scale out with even more
> servers.
>
> And aside from what we have already mentioned, there is of course a
> ton of other software involved. However, we hope we were able to
> highlight some of the more interesting choices Facebook has made.
>
>
> Facebook’s love affair with open source
> We can’t complete this article without mentioning how much Facebook
> likes open source. Or perhaps we should say, “loves”.
>
> Not only is Facebook using (and contributing to) open source software
> such as Linux, Memcached, MySQL, Hadoop, and many others, it has also
> made much of its internally developed software available as open
> source.
>
> Examples of open source projects that originated from inside Facebook
> include HipHop, Cassandra, Thrift and Scribe. Facebook has also
> open-sourced Tornado, a high-performance web server framework
> developed by the team behind FriendFeed (which Facebook bought in
> August 2009).
>
> (A list of open source software that Facebook is involved with can be
> found on Facebook’s Open Source page.)
>
> ...
> (Read More:
> http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/ )
>
>
> -----------------------------------
> You received this message because you are subscribed to the Google
> Groups "NepSecure (Nepali computer security and hacking community )"
> group.
>
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/nepsecure?hl=en.
>
> --
> FOSS Nepal mailing list: [email protected]
> http://groups.google.com/group/foss-nepal
> To unsubscribe, e-mail: [email protected]
>
> Mailing List Guidelines:
> http://wiki.fossnepal.org/index.php?title=Mailing_List_Guidelines
> Community website: http://www.fossnepal.org/