>
> What I mean here is: right now, when we send a batch of documents to Solr,
> we still process it as discrete, unrelated documents, indexing them one by
> one. If indexing the fifth document causes an error, that won't affect the
> 4 already indexed documents. Using this model we can index the batch in
> The biggest problem I have with this is that the client doesn't know
about indexing problems without awkward callbacks later to see if something
went wrong. Even for simple stuff like a schema problem (e.g. an undefined
field). It's a useful *option*, anyway.
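The per-document model described above (a failure on the fifth document not affecting the first four) can be sketched roughly as follows. This is pure illustration; `index_one` and the returned per-document statuses are placeholders, not Solr APIs:

```python
def index_batch(docs, index_one):
    """Index docs one by one; a failure on one document does not
    roll back documents that were already indexed."""
    indexed, errors = [], []
    for i, doc in enumerate(docs):
        try:
            index_one(doc)
            indexed.append(doc["id"])
        except Exception as e:
            errors.append((i, doc["id"], str(e)))
    return indexed, errors

# Toy "index" that rejects documents containing an unknown field,
# standing in for a schema error such as an undefined field.
store = {}
def index_one(doc):
    if "bad_field" in doc:
        raise ValueError("undefined field: bad_field")
    store[doc["id"]] = doc

docs = [{"id": str(n)} for n in range(4)] + [{"id": "4", "bad_field": 1}]
indexed, errors = index_batch(docs, index_one)
print(indexed)  # ['0', '1', '2', '3']
print(errors)   # one entry describing the failure on document 4
```

The point of contention in the reply above is visible here: the per-document errors only exist in the return value, so the client must inspect it (or poll later, in an async model) to learn that document 4 was rejected.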
> Currently we guarantee that if
This is interesting, though it opens a few cans of worms IMHO.
1. Currently we guarantee that if Solr sends you an OK response, the
document WILL eventually become searchable without further action.
Maintaining that guarantee becomes impossible if we haven't verified that
the
On Thu, Oct 8, 2020 at 10:21 AM Cao Mạnh Đạt wrote:
> Hi guys,
>
> First of all, it seems that I have used the term async a lot recently :D.
> Recently I have been thinking a lot about changing the current indexing
> model of Solr from the synchronous way we have now (a user submits an
> update request, waiting
I like the idea.
Two (main) points are not clear to me:
- Order of updates: if the current leader fails (its tlog becoming
inaccessible) and another leader is elected and indexes some more,
what happens when the first leader comes back? What does it do with
its tlog, and how does it know which part
Thank you, Tomas
> Atomic updates, can those be supported? I guess yes if we can guarantee
> that messages are read once and only once.
It won't be straightforward, since we have multiple consumers on the tlog
queue, but it is possible with appropriate locking.
> I'm guessing we'd need to read messages
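A minimal sketch of the "appropriate locking" point above: with multiple consumers draining the tlog queue, an atomic update is a read-modify-write, so a lock per document id keeps two consumers from interleaving on the same document. Everything here (the in-memory `index` dict, `apply_atomic_inc`) is illustrative, not Solr code:

```python
import threading

# Toy in-memory "index" and one lock per document id. In a real
# system the lock table itself would need concurrent-safe creation.
index = {"doc1": {"views": 0}}
doc_locks = {"doc1": threading.Lock()}

def apply_atomic_inc(doc_id, field, amount):
    # The lock makes the read-modify-write atomic per document,
    # even with several consumer threads applying updates.
    with doc_locks[doc_id]:
        current = index[doc_id].get(field, 0)   # read
        index[doc_id][field] = current + amount # modify + write

def consumer():
    for _ in range(1000):
        apply_atomic_inc("doc1", "views", 1)

threads = [threading.Thread(target=consumer) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(index["doc1"]["views"])  # 4000
```

Without the lock, concurrent increments can interleave and lose updates; with it, 4 consumers x 1000 increments always yields 4000.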
Interesting idea Đạt. The first questions/comments that come to my mind
would be:
* Atomic updates, can those be supported? I guess yes if we can guarantee
that messages are read once and only once.
* I'm guessing we'd need to read messages in an ordered way, so it'd be a
single Kafka partition
> Can there be a situation where the index writer fails after the document
> was added to the tlog and a success is sent to the user? I think we want
> to avoid such a situation, don't we?
> I suppose failures would be returned to the client on the async response?
To make things more clear, the response
I think this model has a lot of potential.
I'd like to add another wrinkle to this, which is to store the information
about each batch as a record in the index. Each batch record would contain
a fingerprint for the batch. This solves lots of problems and allows us to
confirm the integrity of the
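One way to read the batch-record idea above: each batch gets its own record holding a fingerprint computed over the batch's documents, which can be recomputed later to confirm integrity. A hedged sketch; SHA-256, the hashing scheme, and the record layout are my assumptions, not a proposed format:

```python
import hashlib

def doc_hash(doc: dict) -> str:
    # Canonicalize the document (sorted fields) before hashing.
    return hashlib.sha256(repr(sorted(doc.items())).encode()).hexdigest()

def batch_fingerprint(docs) -> str:
    # Fingerprint of the batch = hash over the per-document hashes.
    h = hashlib.sha256()
    for d in docs:
        h.update(doc_hash(d).encode())
    return h.hexdigest()

batch = [{"id": "1", "title": "a"}, {"id": "2", "title": "b"}]
# Stored alongside the documents as its own record in the index:
batch_record = {"batch_id": "batch-001",
                "fingerprint": batch_fingerprint(batch)}

# Later, recompute over what the index actually holds and compare.
tampered = [{"id": "1", "title": "a"}, {"id": "2", "title": "X"}]
assert batch_fingerprint(batch) == batch_record["fingerprint"]
assert batch_fingerprint(tampered) != batch_record["fingerprint"]
print("fingerprint check passed")
```

A mismatch between the stored fingerprint and the recomputed one would signal that the batch was only partially applied or has diverged.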
I suppose failures would be returned to the client on the async response?
How would one keep the tlog from growing forever if the actual indexing
took a long time?
I'm guessing that this would be optional.
On Thu, Oct 8, 2020, 11:14 Ishan Chattopadhyaya wrote:
> Can there be a situation
Can there be a situation where the index writer fails after the document
was added to the tlog and a success is sent to the user? I think we want
to avoid such a situation, don't we?
On Thu, 8 Oct, 2020, 8:25 pm Cao Mạnh Đạt, wrote:
> > Can you explain a little more on how this would impact
> Can you explain a little more on how this would impact durability of
> updates?
Since we persist updates into the tlog, I do not think this will be an
issue.
> What does a failure look like, and how does that information get
> propagated back to the client app?
I have not been able to do much research yet, but
Interesting idea! Can you explain a little more on how this would impact
durability of updates? What does a failure look like, and how does that
information get propagated back to the client app?
Mike
On Thu, Oct 8, 2020 at 9:21 AM Cao Mạnh Đạt wrote:
> Hi guys,
>
> First of all it seems that