On Wed, May 11, 2011 at 11:19 AM, Ben Scholl brsch...@gmail.com wrote:
I keep reading that Hadoop/Brisk is not suitable for online querying, only
for offline/batch processing. What exactly are the reasons it is unsuitable?
My use case is a fairly high query load, and each query ideally would return
within about 20 seconds. The queries will use indexes to narrow down the
result set first, but they also need to support text search on one of the
fields. I was thinking of simulating the SQL LIKE operator by running each
query as a MapReduce job so that the text search gets distributed between
nodes.
I know the recommended approach is to keep a separate full-text index, but
that could be quite space-intensive, and also means you can only search on
complete words. Any thoughts on this approach?
Thanks,
Ben
Brisk was made to be a tight integration of Cassandra, Hadoop, and Hive.
If you are looking to do full-text searches, you should look at Solandra,
https://github.com/tjake/Solandra, which is a Cassandra backend for
the Solr/Lucene indexes.
Edward
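
For readers following the thread, here is a rough sketch of the idea in the question: narrow the rows with an indexed field first, then run a LIKE-style substring match as a map step over each partition. It is plain Python with a thread pool standing in for the cluster, not Hadoop, Brisk, or Solandra code. The sample rows, the field names, and the functions map_partition, like_query, and whole_word_query are all hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows, partitioned as if they were spread across nodes.
PARTITIONS = [
    [{"id": 1, "category": "db", "body": "Cassandra ring topology"},
     {"id": 2, "category": "web", "body": "Sandbox deployment notes"}],
    [{"id": 3, "category": "db", "body": "Hadoop and Cassandra integration"},
     {"id": 4, "category": "db", "body": "Hive query planning"}],
]


def map_partition(rows, category, needle):
    """Map step: filter on the indexed field, then match like LIKE '%needle%'."""
    needle = needle.lower()
    return [row["id"] for row in rows
            if row["category"] == category and needle in row["body"].lower()]


def like_query(partitions, category, needle):
    """Run the map step on every partition in parallel, then merge the ids."""
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(
            lambda rows: map_partition(rows, category, needle), partitions))
    return sorted(row_id for part in parts for row_id in part)


def whole_word_query(partitions, category, word):
    """Whole-token matching, roughly what a simple word-based index answers."""
    word = word.lower()
    return sorted(row["id"] for rows in partitions for row in rows
                  if row["category"] == category
                  and word in row["body"].lower().split())


print(like_query(PARTITIONS, "db", "sand"))        # substring match
print(whole_word_query(PARTITIONS, "db", "sand"))  # whole words only
```

The substring query finds "sand" inside "Cassandra", while the whole-word lookup does not, which is the limitation raised in the question. The sketch says nothing about latency: in a real MapReduce job each query also pays job startup and scheduling costs that a thread pool does not model.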