[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100406#comment-13100406
 ] 

Yonik Seeley commented on SOLR-2748:


My 10M doc performance testing now shows the correct number of commits.
I also tried turning on soft autocommits at 1 sec, and that also resulted in 
the correct number of soft commits being done.

Oddly enough, the autoCommit + softAutoCommit test ran in 2:35, while the 
autoCommit-only test ran in 3:33.
One explanation could be that DWPT doesn't necessarily seem optimal for older 
(non-SSD) drives (Erick reported seeing trunk as slower than 3x on his system 
with a spinning-magnets type drive), and the smaller segments avoided some of 
this.
The other explanation (and this one actually makes more sense to me) is that 
the CSV loader used is single-threaded.  Adding the first 1000 documents to a 
small segment is probably more efficient than adding the last 1000 to a larger 
segment.  Doing more soft commits means creating smaller segments and doing 
more work in background merging using other CPU cores (basically, it increased 
the parallelism).


 autocommit commits too many times
 -

 Key: SOLR-2748
 URL: https://issues.apache.org/jira/browse/SOLR-2748
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Attachments: SOLR-2748.patch, SOLR-2748.patch


 autocommit seems to commit more frequently than configured.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100435#comment-13100435
 ] 

Yonik Seeley commented on SOLR-2748:


I re-ran the 10M doc test with soft autocommit set to 10ms (obviously too low, 
but I just wanted to make sure that things didn't blow up).  Things went fine, 
no exceptions, etc, and it did manage to commit at a rate of 73 commits/sec 
while indexing.  Should be even higher if logging is turned off.




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100556#comment-13100556
 ] 

Yonik Seeley commented on SOLR-2748:


I've been through a few hundred iterations of AutoCommitTest with no failures 
(with logging turned on).
I think it's time to clean up the debugging logs and commit!




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100623#comment-13100623
 ] 

Yonik Seeley commented on SOLR-2748:


Committed.  Will backport to 3x next.




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099598#comment-13099598
 ] 

Yonik Seeley commented on SOLR-2748:


So I think the culprit might be CommitTracker.run().
At the end (after doing the commit) it has this code:
{code}
// check if docs have been submitted since the commit started
if (lastAddedTime > started) {
  if (docsUpperBound > 0 && docsSinceCommit.get() > docsUpperBound) {
    pending = scheduler.schedule(this, 100, TimeUnit.MILLISECONDS);
  } else if (timeUpperBound > 0) {
    pending = scheduler.schedule(this, timeUpperBound,
        TimeUnit.MILLISECONDS);
  }
}
{code}

This seems to blindly schedule another commit (one that should already have 
been scheduled?).  So now we have 2 commits scheduled for the next round, where 
there should have been only one.  It seems like those two commits now have the 
potential to turn into 4 for the next round, and so on.
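
To make the doubling concrete, here is a minimal, deterministic toy model in 
Java.  It is not Solr's CommitTracker: the class DoubleScheduleSketch, the 
method queuedPerRound and the checkPending flag are hypothetical names, and the 
model assumes that exactly one document arrives while every commit is 
executing.  It only illustrates the suspected mechanism, not the actual fix.

```java
/**
 * Toy model of the suspected double scheduling. This is NOT Solr's
 * CommitTracker; every name here is hypothetical.
 */
class DoubleScheduleSketch {

    /**
     * Returns how many commit tasks are queued at the start of each round,
     * assuming one document arrives while every commit is executing.
     *
     * @param rounds       number of rounds to simulate
     * @param checkPending if true, the end-of-run reschedule is skipped when
     *                     the add path has already queued a commit
     */
    static int[] queuedPerRound(int rounds, boolean checkPending) {
        int[] counts = new int[rounds];
        int queued = 1; // one commit is scheduled initially
        for (int r = 0; r < rounds; r++) {
            counts[r] = queued;
            int next = 0; // tasks queued for the following round
            for (int task = 0; task < queued; task++) {
                boolean pending = false; // run() clears the pending handle

                // A document is added while the commit executes; the add
                // path sees nothing pending and schedules a commit.
                if (!pending) {
                    next++;
                    pending = true;
                }

                // End of run(): documents arrived since the commit started,
                // so the unguarded version schedules a second commit.
                if (!checkPending || !pending) {
                    next++;
                }
            }
            queued = next;
        }
        return counts;
    }
}
```

Under these assumptions, queuedPerRound(4, false) gives 1, 2, 4, 8, while 
queuedPerRound(4, true) stays at 1, 1, 1, 1.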




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099613#comment-13099613
 ] 

Yonik Seeley commented on SOLR-2748:


Hmmm, I'm not quite understanding the logic of this class.
CommitTracker.didCommit() (which is called by DUH2 after a commit finishes) 
tries to cancel any pending scheduled operations and resets the doc counter to 
0.  But that seems like a bug since documents may have been added during the 
commit, and a new commit may have been scheduled while the old commit was 
executing.  Of course we are going to lose track of that since run() sets 
pending to null.




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099675#comment-13099675
 ] 

Yonik Seeley commented on SOLR-2748:


Another interesting thing I ran into reviewing this code is that the 
CommitTracker.run() method is synchronized, and so is _scheduleCommitWithin(), 
meaning (I think) that a long running commit will block anything calling that 
method (and an autoCommit by time or an add with a commitWithin specified would 
qualify).
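
The blocking effect can be shown with a small Java sketch.  This is a stand-in, 
not the real class: SynchronizedTrackerSketch, runCommit and demo are 
hypothetical names, and the "commit" is just a latch that keeps the monitor 
held.  The only point is that two synchronized methods on the same object 
cannot overlap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical stand-in for a tracker whose commit and scheduling methods
 * are both synchronized on the same object. Not Solr's CommitTracker.
 */
class SynchronizedTrackerSketch {
    private final CountDownLatch commitStarted = new CountDownLatch(1);
    private final CountDownLatch finishCommit = new CountDownLatch(1);
    private final AtomicBoolean scheduled = new AtomicBoolean(false);

    /** Simulates a long-running commit: holds the monitor until released. */
    synchronized void runCommit() {
        commitStarted.countDown();
        awaitQuietly(finishCommit);
    }

    /** Stands in for _scheduleCommitWithin(); needs the same monitor. */
    synchronized void scheduleCommitWithin() {
        scheduled.set(true);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns {scheduledWhileCommitRunning, scheduledAfterCommitFinished}. */
    static boolean[] demo() {
        SynchronizedTrackerSketch tracker = new SynchronizedTrackerSketch();
        Thread committer = new Thread(tracker::runCommit);
        Thread adder = new Thread(tracker::scheduleCommitWithin);
        try {
            committer.start();
            tracker.commitStarted.await();    // commit now holds the monitor
            adder.start();
            adder.join(200);                  // adder cannot get in yet
            boolean during = tracker.scheduled.get();
            tracker.finishCommit.countDown(); // commit ends, monitor released
            committer.join();
            adder.join();
            return new boolean[] {during, tracker.scheduled.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the adder thread only gets through once the simulated commit 
has released the monitor, which is the behavior described above for a long 
running commit.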




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099712#comment-13099712
 ] 

Yonik Seeley commented on SOLR-2748:


After adding a bunch of prints, I *think* this is a test bug.
Just because a newSearcher callback has been issued (triggered) does *not* mean 
that a new searcher has been registered yet.




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099722#comment-13099722
 ] 

Jason Rutherglen commented on SOLR-2748:


Seeing all of the bugs related to the Solr NRT code, I can't help but wonder 
why the 4.x version of the project needs to be backward compatible.  

I also wonder why it's not using IndexReaderWarmer, which was ostensibly 
created precisely for Solr's use (it is not used in Solr and never has been).




[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099726#comment-13099726
 ] 

Yonik Seeley commented on SOLR-2748:


bq. Seeing all of the bugs related to the Solr NRT code, I can't help but 
wonder why the 4.x version of the project needs to be backward compatible.

It's not really related to NRT since the autoCommit (CommitTracker) code has 
been around for a very long time (way before NRT).






[jira] [Commented] (SOLR-2748) autocommit commits too many times

2011-09-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099790#comment-13099790
 ] 

Yonik Seeley commented on SOLR-2748:


Found another race in AutoCommitTest that was causing a failure... the 
autoCommitCount is incremented *after* the commit returns, so depending on 
thread scheduling, the test can be triggered to continue after the commit, but 
before autoCommitCount is incremented.
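
A small Java sketch of this kind of race, with the unlucky interleaving forced 
by latches so it is reproducible.  Nothing here is taken from AutoCommitTest: 
CommitCountRaceSketch, countSeenByTest and the incrementBeforeSignal flag are 
hypothetical, and incrementing before signalling is just one possible way to 
close the window, not necessarily what the patch does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a counter updated after the commit signal. */
class CommitCountRaceSketch {

    /**
     * Returns the commit count the "test" thread reads right after it is
     * told that the commit has returned.
     */
    static int countSeenByTest(boolean incrementBeforeSignal) {
        AtomicInteger autoCommitCount = new AtomicInteger();
        CountDownLatch commitReturned = new CountDownLatch(1);
        CountDownLatch testHasRead = new CountDownLatch(1);

        Thread committer = new Thread(() -> {
            if (incrementBeforeSignal) {
                autoCommitCount.incrementAndGet();
            }
            commitReturned.countDown(); // the test is free to continue
            if (!incrementBeforeSignal) {
                // Simulate being descheduled between the commit returning
                // and the counter update.
                awaitQuietly(testHasRead);
                autoCommitCount.incrementAndGet();
            }
        });
        committer.start();

        awaitQuietly(commitReturned);     // test waits for the commit
        int seen = autoCommitCount.get(); // then checks the counter
        testHasRead.countDown();
        try {
            committer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With the increment after the signal, the test thread reads 0 even though a 
commit has happened; with the increment before the signal it reads 1.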
