Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "PoweredBy" page has been changed by prosch: http://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=358&rev2=359 * We also use Hadoop for executing long-running offline [[http://en.wikipedia.org/wiki/SPARQL|SPARQL]] queries for clients. * We use Amazon S3 and Cassandra to store input RDF datasets and output files. * We've developed [[http://rdfgrid.rubyforge.org/|RDFgrid]], a Ruby framework for map/reduce-based processing of RDF data. - * We primarily use Ruby, [[http://rdf.rubyforge.org/|RDF.rb]] and RDFgrid to process RDF data with Hadoop Streaming. + * We primarily use Ruby, and RDFgrid to process RDF data with Hadoop Streaming. - * We primarily run Hadoop jobs on Amazon Elastic MapReduce, with cluster sizes of 1 to 20 nodes depending on the size of the dataset (hundreds of millions to billions of RDF statements). + * We primarily run Hadoop jobs on Amazon Elastic MapReduce, with cluster sizes of 1 to 20 nodes depending on the size of the dataset (hundreds of millions to billions of RDF statements). * [[http://www.deepdyve.com|Deepdyve]] * Elastic cluster with 5-80 nodes
