Hi, I attended a talk from Cloudera about their search solution. One thing that stood out was their NRT indexing. They have multiple integration points (Flume, HBase) that enable them to index data as it is written to HDFS, in addition to MapReduce-based ad hoc batch indexing. One thing that was not clear was how (if at all) they store metadata about indexes (analogous to our TableDescriptor).
Also, when I struck up a conversation afterwards, I was told that they had collaborated with the Blur team, which I was not aware of.
- Rahul
