At Veoh, the recommendation data amounts to many billions of triples of roughly this kind, and this approach works very well indeed, even on tiny development clusters.
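
For readers following the thread, here is a minimal sketch of the idea in the message quoted below: keep raw triples as flat records and answer a query with chained map-reduce passes instead of resolving it in memory. This is an illustration only, not Metaweb's planner or Veoh's pipeline. The sample triples, the `run_job` helper, and the mapper/reducer names are all hypothetical, and the shuffle is simulated in-process rather than run on Hadoop.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample data: one (subject, predicate, object) triple per record,
# standing in for lines of a flat file.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "likes", "jazz"),
    ("carol", "likes", "rock"),
    ("alice", "likes", "jazz"),
]


def run_job(records, mapper, reducer):
    """Simulate one map-reduce job: map, group by key (shuffle), then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: join the patterns (?x knows ?y) and (?y likes ?z) on ?y.
def join_mapper(triple):
    s, p, o = triple
    if p == "knows":
        yield o, ("knows", s)  # key on ?y, carry ?x
    elif p == "likes":
        yield s, ("likes", o)  # key on ?y, carry ?z


def join_reducer(y, values):
    xs = [v for tag, v in values if tag == "knows"]
    zs = [v for tag, v in values if tag == "likes"]
    for x, z in product(xs, zs):
        yield (x, y, z)


# Job 2: consume job 1's output and group the ?x values by ?z.
def group_mapper(row):
    x, _y, z = row
    yield z, x


def group_reducer(z, xs):
    yield (z, sorted(set(xs)))


# Chain the jobs: the output of the join feeds the grouping step.
joined = run_job(TRIPLES, join_mapper, join_reducer)
summary = run_job(joined, group_mapper, group_reducer)
print(joined)
print(summary)
```

Each job only ever holds one key's values at a time in its reducer, which is why this style trades latency (one pass per join) for the ability to handle very large inputs and result sets.
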

On Mon, Oct 20, 2008 at 6:23 PM, Colin Evans <[EMAIL PROTECTED]> wrote:

> Hi Edward,
>
> At Metaweb, we're experimenting with storing raw triples in HDFS flat
> files, and have written a simple query language and planner that executes
> the queries with chained map-reduce jobs. This approach works well for
> warehousing triple data, and doesn't require HBase. Queries may take a few
> minutes to execute, but the system scales for very large datasets and result
> sets because it doesn't try to resolve queries in memory. We're currently
> testing with more than 150MM triples and have been happy with the results.
>
> -Colin
>
> Edward J. Yoon wrote:
>
>> Hi all,
>>
>> This RDF proposal is a good long time ago. Now we'd like to settle
>> down to research again. I attached our proposal, We'd love to hear
>> your feedback & stories!!
>>
>> Thanks.

--
ted
