Sean,

The use case we're looking at is a bit more complicated than that.
Briefly, this is what we want to do:

1. We get a large amount of data (say, blog posts from various sources), which
we index in Solr and store in Riak in JSON format.

2. Once the data is in Riak, we need to run several analyses on selected
groups of records. The scripts that do this analysis are written in PHP and
Python. The idea is to run MapReduce on a batch of records and update Solr
with the results of the analysis. In Riak, the results will be written to a
different bucket, with links to the original records.


3. At the serving end, it's going to be just key-value pair retrievals, or 
simple MapReduce.
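
To make step 2 concrete, here is a rough sketch of the batch flow. Plain dicts stand in for the two Riak buckets and for the Solr index, and every name in it (posts, analysis, solr_index, analyze, run_batch) is made up for illustration; it does not use the Riak or Solr client APIs, and analyze() is only a placeholder for our real Python/PHP analysis.

```python
import json

# In-memory stand-ins (assumptions): dicts keyed by record key play the role
# of two Riak buckets and of the Solr index.
posts = {
    "post-1": json.dumps({"title": "Hello Riak", "body": "Riak stores JSON blobs"}),
    "post-2": json.dumps({"title": "Solr notes", "body": "Solr indexes text"}),
    "post-3": json.dumps({"title": "Misc", "body": "Unrelated"}),
}
analysis = {}    # separate bucket holding analysis results
solr_index = {}  # key -> indexed fields


def analyze(record):
    """Placeholder for the existing analysis scripts (hypothetical)."""
    return {"word_count": len(record["body"].split())}


def run_batch(keys):
    """Analyze only the selected keys and store results apart from the originals."""
    for key in keys:
        record = json.loads(posts[key])
        result = analyze(record)
        # Keep a link back to the original record alongside the result.
        result["links"] = [{"bucket": "posts", "key": key}]
        analysis[key] = json.dumps(result)
        # Push the derived field to the search index as well.
        solr_index.setdefault(key, {}).update(word_count=result["word_count"])


# Only a subset of the stored records is analyzed.
run_batch(["post-1", "post-2"])
```

The point of the sketch is that the analysis touches only the selected keys, the originals are never rewritten, and each result carries a link back to its source record.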

Pre-processing the data is not an option as we won't be running this analysis 
on all the records. It will be run only on a subset of data.

Given this use case, what do you suggest is the best way to use Riak?


--
Thanks,
Ishwar




----- Original Message -----
> From:Sean Cribbs <[email protected]>
> To:Ishwar <[email protected]>
> Cc:"[email protected]" <[email protected]>
> Sent:Monday, March 14, 2011 8:57 PM
> Subject:Re: Riak n00b questions
> 
> 
> >> It is not currently, but we are looking into the feasibility of
> >> supporting other languages.  However, I might say that if you're already
> >> doing Python and PHP, it would be worth your while (and not difficult) to
> >> learn JavaScript.
> >
> > We already have a whole bunch of processing on the data written in Python
> > and PHP, and porting them to Javascript is (1) very tedious, and (2)
> > Javascript does not support the required functionality. For example, we do
> > a bunch of NLP analysis on the data.
> >
> > Given these, is it advisable if I expose these processes as webservices and
> > call them from javascript/erlang?
> >
>
> The other option of course, is to pre-process your data and just insert
> multiple copies in different formats, which is a pretty common pattern.  The
> tradeoff is whether you want to pay the cost at query time or at write time.
> If you can pay that cost up-front, reads will likely be key-value or very
> simple MapReduce and thus very fast.
> 
> Sean Cribbs <[email protected]>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
