Thanks, I think I understand better now. I deleted my previous post so that I could explain this more clearly. The transaction log is just a backup mechanism for durability. When you index a document, it eventually goes into a segment (in memory). When you update it, the old document is marked as deleted and a new one is indexed into a segment. If no flush/commit has happened yet, the documents/segments are still in memory, and each operation is also recorded in the transaction log (one entry for the first index, another for the update, and so on). When you do a flush, the in-memory segments are written to disk and the transaction log is emptied (since it is no longer needed as a "backup" at that point). If, on the other hand, you only do a refresh, the "new" segments in memory are made searchable (even though they are not necessarily written to disk yet) and no flush to disk happens. In that case, the transaction log still contains everything recorded in it so far.
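To make the difference between index, refresh, and flush concrete, here is a small toy model in Python. It is only a sketch of the behaviour as I described it above, not Elasticsearch code: the class ToyShard and all of its attributes are made-up names, and having flush() refresh first is a simplification I am assuming for the model.

```python
class ToyShard:
    """Toy model of index/refresh/flush; hypothetical, not Elasticsearch internals."""

    def __init__(self):
        self.buffer = {}      # indexed in memory, not yet searchable
        self.searchable = {}  # searchable after a refresh, not necessarily on disk
        self.on_disk = {}     # written out by a flush
        self.translog = []    # one entry per operation since the last flush

    def index(self, doc_id, source):
        # An update is just another index operation: the new version
        # replaces the old one, and the operation is logged again.
        self.translog.append(("index", doc_id, source))
        self.buffer[doc_id] = source

    def refresh(self):
        # Make new documents searchable; the translog is left untouched.
        self.searchable.update(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Assumption for this model: flush also makes pending docs visible.
        self.refresh()
        # Persist everything, then empty the translog (no longer needed).
        self.on_disk = dict(self.searchable)
        self.translog.clear()

    def search(self, doc_id):
        return self.searchable.get(doc_id)


shard = ToyShard()
shard.index("1", {"v": 1})   # first index
shard.index("1", {"v": 2})   # update = second index operation
print(len(shard.translog))   # two entries, one per operation
shard.refresh()
print(shard.search("1"), len(shard.translog))  # searchable, translog unchanged
shard.flush()
print(shard.on_disk, shard.translog)           # on disk, translog emptied
```

The point of the model is that the update shows up as a second entry in the translog, that refresh changes only what is searchable, and that only flush empties the translog.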
So to answer your question, each update will require a new document to be indexed (no way around it). And the transaction log is probably not something that would matter in your scenario. I hope that helps. :)
