Hi,

We are presently evaluating Blur for our distributed search platform, and I have a few questions about the Blur architecture that I would like to clarify.
1. During table creation we specify the shard count (see the sketch after the questions). Can this count be changed at a later point in time, or is it fixed?
2. Since the shards are self-sufficient indexes, is there any routing support for directing keys to specific shards during indexing and during search?
3. Do you have any write-up on how Lucene segments get merged in HDFS and how the block cache gets updated after a segment merge?
4. How does Blur handle fail-over? If a shard server dies, how do the replica shards take over? I understand ZooKeeper comes in here, but do you have a write-up on this?
5. Do you have any benchmarks for Blur MTTR (mean time to recover)?
6. From my initial understanding, Blur has an issue similar to HBase with data locality on fail-over: if all the files for a given shard are replicated across HDFS, data locality is lost when the shard fails over. Is that correct?
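
For reference on question 1, here is a minimal sketch of the table creation call as we understand it, assuming the Thrift client API shown in the Blur documentation; the controller address, table name, shard count, and HDFS URI below are only placeholders.

import org.apache.blur.thrift.BlurClient;
import org.apache.blur.thrift.generated.Blur.Iface;
import org.apache.blur.thrift.generated.TableDescriptor;

public class CreateTableSketch {
  public static void main(String[] args) throws Exception {
    // Connect through a controller; the address is a placeholder.
    Iface client = BlurClient.getClient("controller1:40010");

    // The shard count is supplied up front in the TableDescriptor,
    // which is what prompts the question about changing it later.
    TableDescriptor td = new TableDescriptor();
    td.setName("table1");
    td.setShardCount(16);
    td.setTableUri("hdfs://namenode:8020/blur/tables/table1");

    client.createTable(td);
  }
}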

Regards,
Dibyendu