On Wed, May 11, 2011 at 11:19 AM, Ben Scholl <brsch...@gmail.com> wrote:
> I keep reading that Hadoop/Brisk is not suitable for online querying, only
> for offline/batch processing. What exactly are the reasons it is unsuitable?
> My use case is a fairly high query load, and each query ideally would return
> within about 20 seconds. The queries will use indexes to narrow down the
> result set first, but they also need to support text search on one of the
> fields. I was thinking of simulating the SQL LIKE statement, by running each
> query as a MapReduce job so that the text search gets distributed between
> nodes.
> I know the recommended approach is to keep a separate full-text index, but
> that could be quite space-intensive, and also means you can only search on
> complete words. Any thoughts on this approach?
> Thanks,
> Ben

Brisk was made to be a tight integration of Cassandra, Hadoop, and Hive. If you are looking to do full-text searches, you should look at Solandra (https://github.com/tjake/Solandra), which is a Cassandra backend for Solr/Lucene indexes.

Edward