So, any such data store is way too slow to use with a real-time recommender.

But a distributed algorithm? Sure. As you say, the distributed version
runs on Hadoop, and you can transfer between HDFS and Cassandra. Not
sure whether to call that integration -- there's nothing the project
would meaningfully do here, since it reads off HDFS.

It could be changed to read from Cassandra somehow but my sense is
there would be no real benefit. Cassandra is designed to provide
reasonably fast access to huge data, but that's not the model that the
distributed algorithm needs at all. And it's not suitable for the
non-distributed versions (not even an RDBMS really is).

But sure, integrating at arm's length -- pushing output to Cassandra and
pulling it in -- makes sense. It's probably something *Cassandra*
supports better than this project, since it's more a question of
Cassandra importing/exporting.
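
To make the arm's-length idea concrete, here is a rough sketch in Python,
not anything the project ships. It assumes the batch job writes one line
per user in the form "userID<TAB>[itemID:score,itemID:score]", and it uses
a hypothetical in-memory KeyValueStore class as a stand-in for Cassandra,
so no real Cassandra client API is shown. All names below are made up for
illustration.

```python
from typing import Dict, Iterable, List, Tuple

Recommendation = Tuple[int, float]  # (item ID, estimated preference)


def parse_line(line: str) -> Tuple[int, List[Recommendation]]:
    """Parse one line of the ASSUMED form 'userID<TAB>[item:score,item:score]'."""
    user_field, recs_field = line.rstrip("\n").split("\t", 1)
    body = recs_field.strip().strip("[]")
    recs: List[Recommendation] = []
    # An empty list "[]" yields a single empty string, which filter() drops.
    for pair in filter(None, body.split(",")):
        item, score = pair.split(":")
        recs.append((int(item), float(score)))
    return int(user_field), recs


class KeyValueStore:
    """Hypothetical stand-in for a keyed store (e.g. one row per user)."""

    def __init__(self) -> None:
        self._rows: Dict[int, List[Recommendation]] = {}

    def put(self, key: int, value: List[Recommendation]) -> None:
        # Overwrite the whole row: each batch run replaces the previous one.
        self._rows[key] = list(value)

    def get(self, key: int) -> List[Recommendation]:
        return list(self._rows.get(key, []))


def load_batch_output(lines: Iterable[str], store: KeyValueStore) -> int:
    """Push precomputed recommendations into the store; return rows written."""
    count = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user_id, recs = parse_line(line)
        store.put(user_id, recs)
        count += 1
    return count


def recommend(store: KeyValueStore, user_id: int, how_many: int) -> List[Recommendation]:
    """Serve a query by lookup only; nothing is computed at request time."""
    return store.get(user_id)[:how_many]


if __name__ == "__main__":
    store = KeyValueStore()
    load_batch_output(["1\t[101:4.5,102:3.0]\n", "2\t[104:2.5]\n"], store)
    print(recommend(store, 1, 1))
```

The point of the sketch is that the query side is a plain keyed lookup of
results computed elsewhere, which is why it looks more like an
import/export job than like a recommender DataModel.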

On Mon, May 31, 2010 at 4:23 PM, Florent Empis <[email protected]> wrote:
> Hi,
> Wouldn't it make sense for Taste to be able to use a datamodel based on a
> NoSQL of some sort?
> I was thinking of Cassandra since :
>
>   - Output of a Hadoop job can be inserted into Cassandra
>   - It's classified as an eventual consistency keystore by WP NoSQL page
>   - It's an Apache project
>
>
> I am by no means a NoSQL expert, but I figured it could be a nice project
> for me to learn a few things about Taste, and about Cassandra.
> The idea I have is basically:
>
>   - Recommendation queries (via, say, TasteWeb) are answered via a
>     Cassandra-based datamodel for recommendation
>   - Regular updates are computed (frequency depends on the volume of data
>     and the available processing power) and pushed into Cassandra
>
> Since immediate consistency does not matter (for a given item/user,
> recommendations won't drastically change between two runs of the
> recommendation algorithm. If they do change, the last proposal probably
> wasn't "wrong" anyway, the new one is just "better"), an eventually
> consistent datastore seems a good choice, since content replication is built
> in, thus allowing a strong resilience for the query system...
> Let me know if you think it would add some value to Taste....
>
> BR,
> Florent
>
