Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "BristolHadoopWorkshopSpring2010" page has been changed by SteveLoughran.
The comment on this change is: BBC.
http://wiki.apache.org/hadoop/BristolHadoopWorkshopSpring2010?action=diff&rev1=2&rev2=3

--------------------------------------------------

  HDFS has been used as a filestore in some of the US CMS Tier-2 sites; the new work that James discussed was that of actually treating physics problems as MapReduce jobs.

  They are bringing up a cluster of machines with storage for this, but would also like to use idle CPU time on other machines in the datacentre. There was some discussion on how to do this; MAPREDUCE-1603 is now a feature request asking for the assessment of availability to become a feature that supports plugins. This would allow someone to write something that looked at the non-Hadoop workload of machines and reduced the number of Hadoop slots reported as available when they are busy with other work.

+ == Leo Simons: The BBC ==
+
+ Leo spoke about their CouchDB back end for the BBC web site.
+
+  * [[http://vis.cs.ucdavis.edu/~ogawa/codeswarm/|Codeswarm]]: live graphics of their repository work.
+  * There's a new BBC homepage: [[http://www.live.bbc.co.uk]]
+  * The web page is integrated with iPlayer.
+  * Friday afternoons are busy iPlayer times. People either skive off work or watch TV from their desk.
+  * Lets you change your prefs with no need to log in; the preferences are just bound to cookies.
+  * Uses a hash of JSON to drive the CouchDB lookup; this lets them stay with 4M docs rather than 60M docs.
+  * They reach consistency in 40 ms or so; there is no need for microsecond consistency, as the rate of change of the homepage is below that.
+  * Compaction reduced the status display to "blue" rather than green, which had everyone panicking, but there was no visible change in behaviour. Moral: use light green instead.
+
+ Lots of fun with incomplete resharding causing intermittent replication failures. When an app saw a 404, it created a new doc, as it expected this, and kept going; this created extra load and resulted in a 7h replication.
+
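
The "hash of JSON" point can be read as content-addressed storage: if the document ID is a hash of the preferences themselves, every visitor with the same preferences shares one document, and the cookie only needs to carry the hash. The sketch below illustrates that reading under stated assumptions. It is not the BBC's code; the field names, the choice of SHA-256, and the in-memory `store` dict standing in for CouchDB are all hypothetical.

```python
import hashlib
import json


def prefs_doc_id(prefs):
    """Return a deterministic document ID for a set of preferences."""
    # Canonical form: sorted keys and fixed separators, so that key order
    # and whitespace do not change the hash. (Assumption: SHA-256; the
    # talk notes do not say which hash was used.)
    canonical = json.dumps(prefs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the document store: doc ID -> document.
store = {}


def save_prefs(prefs):
    """Store the preferences once per distinct value and return the ID."""
    doc_id = prefs_doc_id(prefs)
    # Identical preference sets map to the same ID, so no duplicate
    # document is created for a second user with the same choices.
    store.setdefault(doc_id, prefs)
    return doc_id


def load_prefs(doc_id):
    """Look up preferences by the hash carried in the user's cookie."""
    return store.get(doc_id)
```

With this layout the number of documents grows with the number of distinct preference sets rather than the number of users, which is one way a store could hold millions of documents instead of tens of millions.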
