Re: question about NRT(soft commit) and Transaction Log in trunk

2012-05-06 Thread Michael McCandless
This is a good question...

I don't know much about how Solr's transaction log works, but, peeking
in the code, I do see it fsync'ing (look in TransactionLog.java, in
the finish method), but only if the SyncLevel is FSYNC.

If the default is really flush, I don't see how the transaction log
helps on recovery...?

Should we change the default to FSYNC?
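
To make the flush-versus-fsync distinction concrete, here is a minimal sketch in plain Java NIO. It is not Solr's TransactionLog code: the class SyncLevelSketch, the SyncLevel enum, and the methods append and writeDemo are all hypothetical names invented for illustration. The assumption is that a write only hands bytes to the operating system, while FileChannel.force asks the OS to push them to the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not Solr's TransactionLog.
class SyncLevelSketch {
    // Hypothetical enum; the names mirror the levels discussed in this thread.
    enum SyncLevel { NONE, FLUSH, FSYNC }

    // Appends one record and applies the requested durability level.
    static void append(FileChannel log, String record, SyncLevel level) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            log.write(buf); // hands the bytes to the OS; they may still sit in the page cache
        }
        // FileChannel has no user-space buffer, so FLUSH needs no extra step in this sketch.
        if (level == SyncLevel.FSYNC) {
            log.force(true); // asks the OS to write data and metadata to the storage device
        }
    }

    // Writes two records at the given level and returns the resulting file size in bytes.
    static long writeDemo(Path file, SyncLevel level) throws IOException {
        try (FileChannel log = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            append(log, "add doc1", level);
            append(log, "add doc2", level);
            return log.size();
        }
    }
}
```

In this sketch, bytes written without force are already outside the JVM, so they would normally survive a JVM crash, but they are not guaranteed to survive an OS crash or power loss. Whether that trade-off is what the flush default intends is exactly the open question here.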

Mike McCandless

http://blog.mikemccandless.com


On Sat, Apr 28, 2012 at 7:11 AM, Li Li fancye...@gmail.com wrote:
 hi
   I checked out the trunk and played with its new soft commit
 feature. It's cool, but I've got a few questions about it.
   From reading some introductory articles and the wiki, and from a hasty
 read of the code, my understanding of its implementation is:
   For a normal commit (hard commit), we flush everything to disk and
 commit it. The flush is not very time consuming because of the
 OS-level cache. The most time-consuming part is the sync in the commit process.
   A soft commit just flushes postings and pending deletions to disk
 and generates new segments. Then Solr can open a
 new searcher to read the latest index, warm it up, and register it.
   If there is no hard commit and the JVM crashes, the new data may be lost.
   If my understanding is correct, then why do we need the transaction log?
   I found that in DirectUpdateHandler2, every time a command is executed,
 TransactionLog records a line in the log. But the default
 sync level in RunUpdateProcessorFactory is flush, which means it will
 not sync the log file. Does this make sense?
   In database implementations, we usually write a log and modify data
 in memory because the log is smaller than the real data. If the system crashes,
 we can redo the unfinished log entries and make the data correct. Will Solr
 use this log in the same way? If so, why is it not synced?


question about NRT(soft commit) and Transaction Log in trunk

2012-04-28 Thread Li Li
hi
   I checked out the trunk and played with its new soft commit
feature. It's cool, but I've got a few questions about it.
   From reading some introductory articles and the wiki, and from a hasty
read of the code, my understanding of its implementation is:
   For a normal commit (hard commit), we flush everything to disk and
commit it. The flush is not very time consuming because of the
OS-level cache. The most time-consuming part is the sync in the commit process.
   A soft commit just flushes postings and pending deletions to disk
and generates new segments. Then Solr can open a
new searcher to read the latest index, warm it up, and register it.
   If there is no hard commit and the JVM crashes, the new data may be lost.
   If my understanding is correct, then why do we need the transaction log?
   I found that in DirectUpdateHandler2, every time a command is executed,
TransactionLog records a line in the log. But the default
sync level in RunUpdateProcessorFactory is flush, which means it will
not sync the log file. Does this make sense?
   In database implementations, we usually write a log and modify data
in memory because the log is smaller than the real data. If the system crashes,
we can redo the unfinished log entries and make the data correct. Will Solr
use this log in the same way? If so, why is it not synced?
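
The database-style redo idea described above can be sketched roughly as follows. This is a toy model, not Solr's recovery code: the class RedoLogSketch and its methods are hypothetical, the "durable" snapshot and log are plain in-memory collections standing in for files, and the sketch assumes every log entry reached stable storage before the crash (which is precisely what an unsynced log cannot promise).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of database-style redo logging; not Solr's recovery code.
class RedoLogSketch {
    // Stand-ins for durable state: the last hard-committed snapshot
    // and the log entries written since that commit.
    final Map<String, String> committed = new HashMap<>();
    final List<String> log = new ArrayList<>();
    // Volatile state: discarded when the process crashes.
    Map<String, String> inMemory = new HashMap<>();

    // Logs the update first, then applies it in memory.
    void add(String id, String doc) {
        log.add("ADD\t" + id + "\t" + doc);
        inMemory.put(id, doc);
    }

    void delete(String id) {
        log.add("DEL\t" + id);
        inMemory.remove(id);
    }

    // A hard commit persists the current state, so the older log entries are no longer needed.
    void hardCommit() {
        committed.clear();
        committed.putAll(inMemory);
        log.clear();
    }

    // Simulates a crash followed by recovery: start from the last commit and redo the log in order.
    void crashAndRecover() {
        inMemory = new HashMap<>(committed);
        for (String entry : log) {
            String[] parts = entry.split("\t", 3);
            if (parts[0].equals("ADD")) {
                inMemory.put(parts[1], parts[2]);
            } else if (parts[0].equals("DEL")) {
                inMemory.remove(parts[1]);
            }
        }
    }
}
```

For example, after add("1", "a"), hardCommit(), add("2", "b"), delete("1"), and crashAndRecover(), the recovered state holds only document "2". If the tail of the log had never been synced, the replay could silently miss the last updates.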