[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636727#comment-13636727 ]

Shawn Heisey commented on SOLR-3954:

The experimentation mentioned in my last comment was a success. There is still a performance impact, but it is smaller, and tlog sizes are under control. I still think a fix for this issue would be a good idea for general performance reasons, especially with DIH full-import.

Option to have updateHandler and DIH skip updateLog
---

Key: SOLR-3954
URL: https://issues.apache.org/jira/browse/SOLR-3954
Project: Solr
Issue Type: Improvement
Components: update
Affects Versions: 4.0
Reporter: Shawn Heisey
Fix For: 4.3

The updateLog feature makes updates take longer, likely because of the I/O time required to write the additional information to disk. It may take as much as three times as long for the indexing portion of the process. I'm not sure whether it affects the time to commit, but I would imagine that the difference there is small or zero.

When doing incremental updates/deletes on an existing index, the time lag is probably very small and unimportant. When doing a full reindex (which may happen via DIH), especially if this is done in a build core that is then swapped with a live core, this performance hit is unacceptable. It seems to make the import take about three times as long.

An option to have an update skip the updateLog would be very useful for these situations. It should have a method in SolrJ and be exposed in DIH as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540761#comment-13540761 ]

Shawn Heisey commented on SOLR-3954:

I am currently experimenting with updateLog turned on full time and using autoCommit to keep the size of the tlog directory under control. Unconfirmed testing suggests that the overall slowdown using this method is not as extreme as it is when my entire dataimport happens without commits. It's still my opinion that a fix for this issue would be a good idea, but I do not think it should hold up the 4.1 release.
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477106#comment-13477106 ]

Shawn Heisey commented on SOLR-3954:

I was unsure what to put for the priority. Minor seems slightly too low and Major seems too high.
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477196#comment-13477196 ]

Mark Miller commented on SOLR-3954:

What config are you using? The updatelog should not normally have this kind of performance penalty. In any case, I don't think we would add an option to skip the update log - you can remove it if the performance is unacceptable.
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477283#comment-13477283 ]

Shawn Heisey commented on SOLR-3954:

Which specific configuration bits would you like to see? My solrconfig.xml file is heavily split into separate files and uses xinclude. I will go ahead and paste my best guesses now.

{code}
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">4</int>
    <int name="maxThreadCount">4</int>
  </mergeScheduler>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <maxFieldLength>32768</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>1</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>0</maxDocs>
    <maxTime>0</maxTime>
  </autoCommit>
  <!-- <updateLog /> -->
</updateHandler>
{code}

My schema has 47 fields defined. Not every field will be present in a typical document, but at least half of them usually are. I use the ICU classes for lowercasing, and most of the text fieldTypes use WordDelimiterFilter.

{code}
<fields>
  <field name="catchall" type="genText" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  <field name="doc_date" type="tdate" indexed="true" stored="true"/>
  <field name="pd" type="tdate" indexed="true" stored="true"/>
  <field name="ft_text" type="ignored"/>
  <field name="mime_type" type="mimeText" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="ft_dname" type="genText" indexed="true" stored="true"/>
  <field name="ft_subject" type="genText" indexed="true" stored="true"/>
  <field name="action" type="keyText" indexed="true" stored="true"/>
  <field name="attribute" type="keyText" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="category" type="keyText" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="caption_writer" type="keyText" indexed="true" stored="true"/>
  <field name="doc_id" type="keyText" indexed="true" stored="true"/>
  <field name="ft_owner" type="keyText" indexed="true" stored="true"/>
  <field name="location" type="keyText" indexed="true" stored="true"/>
  <field name="special" type="keyText" indexed="true" stored="true"/>
  <field name="special_cats" type="keyText" indexed="true" stored="true"/>
  <field name="selector" type="keyText" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="scode" type="keyText" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="byline" type="sourceText" indexed="true" stored="true"/>
  <field name="credit" type="sourceText" indexed="true" stored="false"/>
  <field name="keywords" type="sourceText" indexed="true" stored="true"/>
  <field name="source" type="sourceText" indexed="true" stored="true"/>
  <field name="sg" type="lcsemi" indexed="true" stored="false" omitTermFreqAndPositions="true"/>
  <field name="aimcode" type="lowercase" indexed="true" stored="false" omitTermFreqAndPositions="true"/>
  <field name="nc_lang" type="lowercase" indexed="true" stored="false" omitTermFreqAndPositions="true"/>
  <field name="tag_id" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="collection" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="feature" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="ip" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="longdim" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="webtable" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="set_name" type="lowercase" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  <field name="did" type="long" indexed="true" stored="true" postingsFormat="BloomFilter"/>
  <field name="doc_size" type="long" indexed="true" stored="true"/>
  <field name="post_date" type="tlong" indexed="true" stored="true"/>
  <field name="post_hour" type="tlong" indexed="true" stored="true"/>
  <field name="set_count" type="int" indexed="false" stored="true"/>
  <field name="set_lead" type="boolean" indexed="true" stored="true" default="true"/>
  <field name="format" type="string" indexed="false" stored="true"/>
  <field name="ft_sfname" type="string" indexed="false" stored="true"/>
  <field name="text_preview" type="string" indexed="false" stored="true"/>
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="headline" type="keyText" indexed="true" stored="true"/>
  <field name="mood" type="keyText" indexed="true" stored="true"/>
  <field name="object" type="keyText" indexed="true" stored="true"/>
  <field name="personality" type="keyText" indexed="true" stored="true"/>
  <field name="poster" type="keyText" indexed="true" stored="true"/>
</fields>
{code}
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477289#comment-13477289 ]

Shawn Heisey commented on SOLR-3954:

You'll notice that one field has postingsFormat. This was for another bug that I filed. It's not causing any difference in the config. I will set up my import again so I can illustrate the performance impact from updateLog.
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477293#comment-13477293 ]

Shawn Heisey commented on SOLR-3954:

This is my most intense fieldType definition:

{code}
<fieldType name="genText" class="solr.TextField" sortMissingLast="true" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
      replacement="$2"
      allowempty="false"
    />
    <filter class="solr.WordDelimiterFilterFactory"
      splitOnCaseChange="1"
      splitOnNumerics="1"
      stemEnglishPossessive="1"
      generateWordParts="1"
      generateNumberParts="1"
      catenateWords="1"
      catenateNumbers="1"
      catenateAll="0"
      preserveOriginal="1"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
      replacement="$2"
      allowempty="false"
    />
    <filter class="solr.WordDelimiterFilterFactory"
      splitOnCaseChange="1"
      splitOnNumerics="1"
      stemEnglishPossessive="1"
      generateWordParts="1"
      generateNumberParts="1"
      catenateWords="0"
      catenateNumbers="0"
      catenateAll="0"
      preserveOriginal="1"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="512"/>
  </analyzer>
</fieldType>
{code}
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477326#comment-13477326 ]

Shawn Heisey commented on SOLR-3954:

A completed import with updateLog turned off:

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dih-config.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">12947488</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2012-10-16 07:46:01</str>
    <str name="">Indexing completed. Added/Updated: 12947488 documents. Deleted 0 documents.</str>
    <str name="Committed">2012-10-16 11:17:48</str>
    <str name="Total Documents Processed">12947488</str>
    <str name="Time taken">3:31:47.508</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
{code}
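For reference, the average indexing rate implied by that completed import can be worked out from the "Time taken" and "Total Documents Processed" values. This is only a sketch: `parse_elapsed` and `docs_per_second` are illustrative helper names, not part of Solr or DIH, and the numbers are copied from the status response above.

```python
def parse_elapsed(text):
    """Convert a DIH-style 'H:M:S.mmm' duration string into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def docs_per_second(documents, elapsed):
    """Average documents indexed per second over the whole import."""
    return documents / parse_elapsed(elapsed)


# Figures reported by the completed full-import (updateLog turned off).
full_import_seconds = parse_elapsed("3:31:47.508")
full_import_rate = docs_per_second(12947488, "3:31:47.508")

print(f"elapsed: {full_import_seconds:.3f} s")
print(f"average rate: {full_import_rate:.1f} docs/s")
```

The result is an average over the whole run, including the final commit, so it says nothing about how the rate varied during the import.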
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477333#comment-13477333 ]

David Smiley commented on SOLR-3954:

FWIW I've seen the updateLog grow to huge sizes for my bulk import. I commit at the end (of course), with no soft commits or auto commits in between. The updateLog is a hindrance during bulk imports.
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477445#comment-13477445 ]

Shawn Heisey commented on SOLR-3954:

Here's a direct comparison on the same hardware. It might be important to know that when my import gets kicked off, there are actually four imports running. One of them is small -- during the second test (updateLog off), it imported 687765 rows in 10 minutes and 08 seconds. I did not check how long it took during the first test. The other three imports are all nearly 13 million records each. A du on the completed index directory with 12.9 million records shows 23520900 KB.

I ran the first test and grabbed stats after an hour. Then I killed Solr, commented out updateLog, started it up again, kicked off the full-import, and again grabbed stats after an hour. Comparing the two shows that it is about twice as fast with updateLog turned off.

With updateLog turned on:

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dih-config.xml</str>
    </lst>
  </lst>
  <str name="status">busy</str>
  <str name="importResponse">A command is still running...</str>
  <lst name="statusMessages">
    <str name="Time Elapsed">1:0:1.762</str>
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">2052096</str>
    <str name="Total Documents Processed">2052095</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2012-10-16 14:59:01</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
{code}

With updateLog turned off:

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dih-config.xml</str>
    </lst>
  </lst>
  <str name="status">busy</str>
  <str name="importResponse">A command is still running...</str>
  <lst name="statusMessages">
    <str name="Time Elapsed">1:0:0.434</str>
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">4167525</str>
    <str name="Total Documents Processed">4167524</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2012-10-16 16:05:01</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
{code}
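The "about twice as fast" figure can be checked directly from the two one-hour snapshots. The sketch below is illustrative only: the helper and variable names are made up for this example, the inputs are the "Time Elapsed" and "Total Documents Processed" values from the two responses, and the projection at the end assumes the first hour's rate would hold for the whole import, which was not measured.

```python
def parse_elapsed(text):
    """Convert a DIH-style 'H:M:S.mmm' duration string into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# (documents processed, time elapsed) from the two status snapshots.
snapshots = {
    "updateLog on": (2052095, "1:0:1.762"),
    "updateLog off": (4167524, "1:0:0.434"),
}

# Average documents per second for each run.
rates = {
    label: docs / parse_elapsed(elapsed)
    for label, (docs, elapsed) in snapshots.items()
}

# How many times faster the run without updateLog was.
speedup = rates["updateLog off"] / rates["updateLog on"]

# Assumption: extrapolate the updateLog-on rate to the full 12947488
# documents of one large import. This is a projection, not a measurement.
projected_on_seconds = 12947488 / rates["updateLog on"]

for label, rate in rates.items():
    print(f"{label}: {rate:.1f} docs/s")
print(f"speedup with updateLog off: {speedup:.2f}x")
print(f"projected full import with updateLog on: {projected_on_seconds / 3600:.1f} h")
```

Both snapshots were taken while four imports ran at once, so the per-import rates are only comparable to each other, not to a single import running alone.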
[jira] [Commented] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477462#comment-13477462 ]

Shawn Heisey commented on SOLR-3954:

bq. In any case, I don't think we would add an option to skip the update log - you can remove it if the performance is unacceptable.

When I revamp my SolrJ application, I plan to use soft commit on a very short interval (maybe 10 seconds) but only do a hard commit every five minutes, possibly even less often. If I understand the updateLog functionality right, and I don't claim that I do, it would mean that my SolrJ code would not need to keep separate track of which updates succeeded with soft commit and which ones succeeded with hard commit. If the server went down four minutes and 55 seconds after the last hard commit, I would have a reasonable expectation that when it came back up, all those soft commits would get properly applied to my index.

Assuming my understanding above is correct, I want the updateLog for my incremental updates. It makes the bulk import take at least twice as long, and I do not need it there because if that fails, I will just start it over. If I am going to benefit from updateLog, I need to be able to turn it off for bulk indexing. Is there a way to create a second updateHandler that does not have updateLog enabled and tell DIH to use that handler?