I like the idea. I'm trying to figure out, in broad strokes, the
overarching goals. Forgive me if this is obvious; I just want to be
clear.

The major goal is scale, right? A distributed server provides more oomph
than a single-node server can.

There are a number of dimensions to scale.

You mention replication of indexes, so scalability of search volume is
in scope, right?

You mention partitioning of indexes, though mostly around delete. What
about scalability of corpus size? Would partitioning be effective for
that, too?

What about scalability of ingest rate?

What are you thinking in terms of size? Is this a 10-node thing? A
1000-node thing? More? Bigger is cool, but raises a lot of issues. How
dynamic is it? Can nodes come and go? Are you going to assume
homogeneity of nodes?

What about the latency between an add/modify/delete and its visibility
in search? Closer to batch (once a day) or real time?

I think it's definitely something people want. Actually, I think we
could answer these questions in different ways, and for every answer
we'd find people who would want it. But they would probably be
different people.
