Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote:

Grant,

I should note, however, that the speed difference you are seeing may not be as pronounced as it appears. If I recall, during ApacheCon I commented on how long it takes to shut down your Solr instance when exiting it. That time is in fact Solr doing the work that was put off by not committing earlier and having all those deletes pile up.

I am confused about work that was put off vs. committing. My script was doing a commit right after the CSV import, and you are right about the massive times required to shut Tomcat down. But in my tests the time taken to do the commit was under a second, yet I had to allow 300 secs for Tomcat shutdown. Also, I don't have any duplicates. So what sort of work was being done at shutdown that was not being done by a commit? Optimise!

The work being done is addressing the deletes, AIUI, but of course there are other things happening during shutdown, too.

There are no deletes to do. It was a clean index to begin with and there were no duplicates.

How long is the shutdown if you do a commit first and then a shutdown?

Still very long, sometimes 300 sec. My script always did a commit!

At any rate, I don't know that there is a satisfying answer to the larger issue due to things like the fsync stuff, which is an overall win for Lucene/Solr despite it being slower.

Have you tried running the tests on other machines (non-Mac)?

Nope. Although next week I will have a real PC running Vista, so I could try it there.

I think we should knock this on the head and move on. I rarely need to index this content and I can take the performance hit, and of course your workaround provides a good speedup.

Regards Fergus.

--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
The work being done is addressing the deletes, AIUI, but of course there are other things happening during shutdown, too.

There are no deletes to do. It was a clean index to begin with and there were no duplicates.

I have not followed this thread, so forgive me if this has already been suggested.

If you know that there are no duplicates, have you tried indexing with allowDups=true? It will not change the fsync cost, but it may reduce some other checking times.

ryan
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote:

Grant,

Redoing the work with your patch applied does not seem to make a difference! Is this the expected result?

No, I didn't expect SOLR-1095 to fix the problem. Overwrite = false + 1095 does, however, AFAICT by your last line, right?

I did run it again using the full file, this time using my iMac:

  Revision   Time         Date
  643465     22min 14sec  2008-04-01
  734796     73min 58sec  2009-01-15
  758795     70min 55sec  2009-03-26

Again using only the first 1M records, with commit=false&overwrite=true:

  Revision   Time         Date
  643465     2m51.516s    2008-04-01
  734796     7m29.326s    2009-01-15
  758795     8m18.403s    2009-03-26
  SOLR-1095  7m41.699s

This time with commit=true&overwrite=true:

  Revision   Time         Date
  643465     2m49.200s    2008-04-01
  734796     8m27.414s    2009-01-15
  758795     9m32.459s    2009-03-26
  SOLR-1095  7m58.825s

This time with commit=false&overwrite=false:

  Revision   Time         Date
  643465     2m46.149s    2008-04-01
  734796     3m29.909s    2009-01-15
  758795     3m26.248s    2009-03-26
  SOLR-1095  2m49.997s

Grant,

Hmmm, the big difference is made by overwrite=false. But can you explain why overwrite=false makes such a difference? I am starting off with an empty index and I have checked the content: there are no duplicates in the uniqueKey field.

I guess if overwrite=false then a few checks can be removed from the indexing process, and if I am confident that my content contains no duplicates then this is a good speedup.

http://wiki.apache.org/solr/UpdateCSV says that if overwrite is true (the default) then overwrite documents based on the uniqueKey. However, what will Solr/Lucene do if the uniqueKey is not unique and overwrite=false?

fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | wc -l
100
fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | sort -u | wc -l
100
fergus: /usr/bin/head geonames.txt
RC UFI UNI LAT LONG DMS_LAT DMS_LONG MGRS JOG FC DSG PC CC1 ADM1 ADM2 POP ELEV CC2 NT LC SHORT_FORM GENERIC SORT_NAME FULL_NAME FULL_NAME_ND MODIFY_DATE
1 -1307828 60524 12.47 -69.9 122800 -695400 19PDP0219578323 ND19-14 T MT AA 00 PALUMARGA Palu Marga Palu Marga 1995-03-23
1 -1307756 -1891720 12.5 -70.016667 123000 -700100 19PCP8952982056 ND19-14 P PPLX

PS. Do you want me to do some kind of chop through the different versions to see where the slowdown happened, or are you happy you have nailed it?

--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
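The two perl one-liners above compare the total number of values in the third column with the number of distinct values. A minimal Python sketch of the same check follows. It is an illustration only: the function name `duplicate_keys` is hypothetical, the column index 2 is taken from `$F[2]` in the one-liners, and unlike the perl version it skips the header row.

```python
import csv
import io
from collections import Counter


def duplicate_keys(tsv_text, key_column=2):
    """Return {value: count} for values seen more than once in key_column.

    Hypothetical helper: key_column=2 mirrors $F[2] in the perl one-liners.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row (the perl version counts it)
    counts = Counter(row[key_column] for row in reader
                     if len(row) > key_column)
    return {value: n for value, n in counts.items() if n > 1}
```

An empty result means every value in that column is distinct, which is the condition under which overwrite=false cannot create duplicate documents.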
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 2, 2009, at 4:02 AM, Fergus McMenemie wrote:

Grant,

Hmmm, the big difference is made by overwrite=false. But can you explain why overwrite=false makes such a difference? I am starting off with an empty index and I have checked the content: there are no duplicates in the uniqueKey field.

I guess if overwrite=false then a few checks can be removed from the indexing process, and if I am confident that my content contains no duplicates then this is a good speedup.

http://wiki.apache.org/solr/UpdateCSV says that if overwrite is true (the default) then overwrite documents based on the uniqueKey. However, what will Solr/Lucene do if the uniqueKey is not unique and overwrite=false?

overwrite=false means Solr does not issue deletes first, meaning if you have a doc w/ that id already, you will now have two docs with that id. Unique id is enforced by Solr, not by Lucene.

Even if you can't guarantee uniqueness, you can still do overwrite = false as a workaround using the suggestion I gave you in a prior email:

1. Add a new field that is unique for your data source, but is the same for all records in that data source, i.e. type = geonames.txt
2. Before updating, issue a delete by query for the value of that type, which will delete all records with that term
3. Do your indexing with overwrite = false

I should note, however, that the speed difference you are seeing may not be as pronounced as it appears. If I recall, during ApacheCon I commented on how long it takes to shut down your Solr instance when exiting it. That time is in fact Solr doing the work that was put off by not committing earlier and having all those deletes pile up. Thus, while it is likely that your older version is still faster due to the new fsync stuff in Lucene, it may not be that much faster. I think you could see this by actually doing commit = true, but I'm not 100% sure.

fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | wc -l
100
fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | sort -u | wc -l
100
fergus: /usr/bin/head geonames.txt
RC UFI UNI LAT LONG DMS_LAT DMS_LONG MGRS JOG FC DSG PC CC1 ADM1 ADM2 POP ELEV CC2 NT LC SHORT_FORM GENERIC SORT_NAME FULL_NAME FULL_NAME_ND MODIFY_DATE
1 -1307828 60524 12.47 -69.9 122800 -695400 19PDP0219578323 ND19-14 T MT AA 00 PALUMARGA Palu Marga Palu Marga 1995-03-23
1 -1307756 -1891720 12.5 -70.016667 123000 -700100 19PCP8952982056 ND19-14 P PPLX

PS. Do you want me to do some kind of chop through the different versions to see where the slowdown happened, or are you happy you have nailed it?

--
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
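The three-step workaround in this message (tag every record with a type, delete by query on that type, then index with overwrite=false) can be sketched as the two request payloads it needs. This is a rough sketch under stated assumptions, not a tested recipe: the base URL `http://localhost:8983/solr/update`, the helper names, and the use of `stream.file` (which needs remote streaming enabled) are illustrative assumptions; the field name `type` and value `geonames.txt` come from the message. Step 1, adding the `type` column to the data and schema, is assumed to have been done already.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Assumed location of the update handler; adjust for your installation.
SOLR_UPDATE = "http://localhost:8983/solr/update"


def delete_by_query_body(field, value):
    """Step 2: XML body to POST to the update handler (delete by query)."""
    return "<delete><query>%s:%s</query></delete>" % (escape(field),
                                                       escape(value))


def csv_update_url(csv_path, base=SOLR_UPDATE):
    """Step 3: CSV update URL with overwrite=false and a tab separator."""
    params = {
        "stream.file": csv_path,   # assumes remote streaming is enabled
        "separator": "\t",
        "overwrite": "false",
        "commit": "true",
    }
    return base + "/csv?" + urlencode(params)
```

For example, `delete_by_query_body("type", "geonames.txt")` gives the body for the batch delete, and `csv_update_url("/tmp/geonames.txt")` (a hypothetical path) gives the URL for the batch add that follows it.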
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote:

Grant,

I should note, however, that the speed difference you are seeing may not be as pronounced as it appears. If I recall, during ApacheCon I commented on how long it takes to shut down your Solr instance when exiting it. That time is in fact Solr doing the work that was put off by not committing earlier and having all those deletes pile up.

I am confused about work that was put off vs. committing. My script was doing a commit right after the CSV import, and you are right about the massive times required to shut Tomcat down. But in my tests the time taken to do the commit was under a second, yet I had to allow 300 secs for Tomcat shutdown. Also, I don't have any duplicates. So what sort of work was being done at shutdown that was not being done by a commit? Optimise!

The work being done is addressing the deletes, AIUI, but of course there are other things happening during shutdown, too.

How long is the shutdown if you do a commit first and then a shutdown?

At any rate, I don't know that there is a satisfying answer to the larger issue due to things like the fsync stuff, which is an overall win for Lucene/Solr despite it being slower.

Have you tried running the tests on other machines (non-Mac)?
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Grant,

Redoing the work with your patch applied does not seem to make a difference! Is this the expected result?

I did run it again using the full file, this time using my iMac:

  Revision   Time         Date
  643465     22min 14sec  2008-04-01
  734796     73min 58sec  2009-01-15
  758795     70min 55sec  2009-03-26

Again using only the first 1M records, with commit=false&overwrite=true:

  Revision   Time         Date
  643465     2m51.516s    2008-04-01
  734796     7m29.326s    2009-01-15
  758795     8m18.403s    2009-03-26
  SOLR-1095  7m41.699s

This time with commit=true&overwrite=true:

  Revision   Time         Date
  643465     2m49.200s    2008-04-01
  734796     8m27.414s    2009-01-15
  758795     9m32.459s    2009-03-26
  SOLR-1095  7m58.825s

This time with commit=false&overwrite=false:

  Revision   Time         Date
  643465     2m46.149s    2008-04-01
  734796     3m29.909s    2009-01-15
  758795     3m26.248s    2009-03-26
  SOLR-1095  2m49.997s

--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
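To put the "3x" in the subject line next to these figures, the timings can be converted to seconds and divided by the rev 643465 baseline. The sketch below is only arithmetic on the full-file numbers quoted in this message; the helper names `to_seconds` and `slowdown` are hypothetical.

```python
import re


def to_seconds(text):
    """Parse durations such as '22min 14sec' or '2m51.516s' into seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*m(?:in)?\s*([\d.]+)\s*s(?:ec)?\s*", text)
    if match is None:
        raise ValueError("unrecognised duration: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


# Full-file timings as reported in the message above.
FULL_FILE = {
    "643465": "22min 14sec",
    "734796": "73min 58sec",
    "758795": "70min 55sec",
}


def slowdown(timings, baseline="643465"):
    """Return each revision's time as a multiple of the baseline revision."""
    base = to_seconds(timings[baseline])
    return {rev: to_seconds(t) / base for rev, t in timings.items()}
```

Calling `slowdown(FULL_FILE)` gives a factor a little above 3 for both later revisions; the same function can be applied to the 1M-record tables to compare the overwrite=true and overwrite=false runs.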
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote:

Grant,

Redoing the work with your patch applied does not seem to make a difference! Is this the expected result?

No, I didn't expect SOLR-1095 to fix the problem. Overwrite = false + 1095 does, however, AFAICT by your last line, right?

I did run it again using the full file, this time using my iMac:

  Revision   Time         Date
  643465     22min 14sec  2008-04-01
  734796     73min 58sec  2009-01-15
  758795     70min 55sec  2009-03-26

Again using only the first 1M records, with commit=false&overwrite=true:

  Revision   Time         Date
  643465     2m51.516s    2008-04-01
  734796     7m29.326s    2009-01-15
  758795     8m18.403s    2009-03-26
  SOLR-1095  7m41.699s

This time with commit=true&overwrite=true:

  Revision   Time         Date
  643465     2m49.200s    2008-04-01
  734796     8m27.414s    2009-01-15
  758795     9m32.459s    2009-03-26
  SOLR-1095  7m58.825s

This time with commit=false&overwrite=false:

  Revision   Time         Date
  643465     2m46.149s    2008-04-01
  734796     3m29.909s    2009-01-15
  758795     3m26.248s    2009-03-26
  SOLR-1095  2m49.997s

--
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
svn co -r REV_NUM https://svn.apache.org/repos/asf/lucene/solr/trunk solr-REV_NUM

-Grant

On Mar 30, 2009, at 4:55 PM, Fergus McMenemie wrote:

Can you verify that rev 701485 still performs reasonably well? This is from October 2008 and I get similar results to the earlier rev. Am now trying some other versions between October and when you first reported the issue in November.

OK. Can you tell me how to get hold of revision 701485? What is the magic svn line?

On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:

Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.
One oddity with rev 643465 is: On the old version, there is an exception during startup:

Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.
-Grant
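With the `svn co -r REV_NUM` line above, the "chop through the different versions" that Fergus offers can be organised as a binary search between a known-fast and a known-slow revision. The following is a sketch only: `first_slow_revision` and `is_slow` are hypothetical names, `is_slow` stands in for "check out, build, run the import and compare the time", and the search assumes that once a revision is slow every later one is slow too.

```python
TRUNK_URL = "https://svn.apache.org/repos/asf/lucene/solr/trunk"


def svn_checkout_command(rev):
    """Argument list for checking out trunk as of revision rev."""
    return ["svn", "co", "-r", str(rev), TRUNK_URL, "solr-%d" % rev]


def first_slow_revision(fast_rev, slow_rev, is_slow):
    """Binary search for the earliest slow revision in (fast_rev, slow_rev].

    Assumes is_slow(fast_rev) is False, is_slow(slow_rev) is True, and
    that slowness never goes away again between the two.
    """
    lo, hi = fast_rev, slow_rev
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(mid):
            hi = mid      # slowdown was introduced at or before mid
        else:
            lo = mid      # mid is still fast, look later
    return hi
```

Between 643465 and 758795 this needs about 17 timing runs rather than one per revision. Not every number in that range touches Solr trunk, but a checkout at any repository revision still returns trunk as it stood at that point.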
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Can you try adding overwrite=false and running against the latest version? My current working theory is that Solr/Lucene has changed how deletes are handled such that work that was deferred before is now not deferred as often. In fact, you are not seeing this cost paid (or at least not noticing it) because you are not committing, but I believe you do see it when you are closing down Solr, which is why it takes so long to exit.

I also think that Lucene adding fsync() into the equation may cause some slowdown, but that is a penalty we are willing to pay as it gives us higher data integrity.

So, depending on how you have your data, I think a workaround is to:

Add a field that contains a single term identifying the data type for this particular CSV file, i.e. something like field: type, value: fergs-csv

Then, before indexing, you can issue a Delete By Query: type:fergs-csv and then add your CSV file using overwrite=false. This amounts to a batch delete followed by a batch add, but without the add having to issue deletes for each add.

In the meantime, I'm trying to see if I can pinpoint a specific change and see if there is anything that might help it perform better.

-Grant

On Mar 30, 2009, at 4:52 PM, Fergus McMenemie wrote:

Grant,

After all my playing about at boot camp, I gave things a rest. It was not till months later that I got back to looking at Solr again. So after 643465 (2008-Apr-01) the next version I tried was 694377 (2008-Sep-11). Nothing in between.

Yep, so 643465 is the latest version I tried that still performs. Every later revision is slower.

However, I need to repeat the tests using 643465, 694377 and whatever is the latest version. On my MacBook I am only seeing a 2x slowdown of 643465 vs. today, whereas I had been seeing a 3x slowdown using my iMac.

Fergus

Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Grant,

I am messing with the script, and with your tip I expect I can make it recurse over as many releases as needed.

I did run it again using the full file, this time using my iMac:

  Revision   Time         Date
  643465     22min 14sec  2008-04-01
  734796     73min 58sec  2009-01-15
  758795     70min 55sec  2009-03-26

I then ran it again using only the first 1M records:

  Revision   Time         Date
  643465     2m51.516s    2008-04-01
  734796     7m29.326s    2009-01-15
  758795     8m18.403s    2009-03-26

This time with commit=true:

  Revision   Time         Date
  643465     2m49.200s    2008-04-01
  734796     8m27.414s    2009-01-15
  758795     9m32.459s    2009-03-26

This time with commit=false&overwrite=false:

  Revision   Time         Date
  643465     2m46.149s    2008-04-01
  734796     3m29.909s    2009-01-15
  758795     3m26.248s    2009-03-26

Just read your latest post. I will apply the patches and retest the above.

Can you try adding overwrite=false and running against the latest version? My current working theory is that Solr/Lucene has changed how deletes are handled such that work that was deferred before is now not deferred as often. In fact, you are not seeing this cost paid (or at least not noticing it) because you are not committing, but I believe you do see it when you are closing down Solr, which is why it takes so long to exit.

It can take ages! (15 min to get Tomcat to quit.) Also, my script does have the separate commit step, which does not take any time!

I also think that Lucene adding fsync() into the equation may cause some slowdown, but that is a penalty we are willing to pay as it gives us higher data integrity.

Data integrity is always good. However, if performance seems unreasonable, users/customers tend to take things into their own hands and kill the process or machine. This tends to be very bad for data integrity.

So, depending on how you have your data, I think a workaround is to:

Add a field that contains a single term identifying the data type for this particular CSV file, i.e. something like field: type, value: fergs-csv

Then, before indexing, you can issue a Delete By Query: type:fergs-csv and then add your CSV file using overwrite=false. This amounts to a batch delete followed by a batch add, but without the add having to issue deletes for each add.

Ok.. but... for these test cases I am starting off with an empty index. The script does a rm -rf solr/data before Tomcat is launched. So I do not understand how the above helps, UNLESS there are duplicate gaz entries.

In the meantime, I'm trying to see if I can pinpoint a specific change and see if there is anything that might help it perform better.

-Grant

--
===
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021
Unix/Mac/Intranets             Analyst Programmer
===
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is: On the old version, there is an exception during startup:

Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is: On the old version, there is an exception during startup:

Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.
-Grant
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Can you verify that rev 701485 still performs reasonably well? This is from October 2008 and I get similar results to the earlier rev. Am now trying some other versions between October and when you first reported the issue in November.

-Grant

On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:

Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Grant,

After all my playing about at boot camp, I gave things a rest. It was not till months later that I got back to looking at Solr again. So after 643465 (2008-Apr-01) the next version I tried was 694377 (2008-Sep-11). Nothing in between.

Yep, so 643465 is the latest version I tried that still performs. Every later revision is slower.

However, I need to repeat the tests using 643465, 694377 and whatever is the latest version. On my MacBook I am only seeing a 2x slowdown of 643465 vs. today, whereas I had been seeing a 3x slowdown using my iMac.

Fergus

Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.
Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Can you verify that rev 701485 still performs reasonably well? This is from October 2008 and I get similar results to the earlier rev. Am now trying some other versions between October and when you first reported the issue in November.

OK. Can you tell me how to get hold of revision 701485? What is the magic svn line?

On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:

Fergus,

Is rev 643465 the absolute latest you tried that still performs? i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

Fergus,

I think the problem may actually be due to something that was introduced by a change to Solr's StopFilterFactory and the way it loads the stop words set. See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.
One oddity with rev 643465 is: On the old version, there is an exception during startup:

Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
    at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant

--
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search

--
===
Fergus McMenemie Email:fer...@twig.me.uk
Techmore Ltd Phone:(UK) 07721 376021
Unix/Mac/Intranets Analyst Programmer
===
RE: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown
Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown. One oddity with rev 643465 is: On the old version, there is an exception during startup:

Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
    at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant
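
[Editorial sketch, not part of the original messages] The revision hunt in this thread (one revision that still performs well, such as rev 643465 or rev 701485, and some later revision that is slow) amounts to a bisection over svn revision numbers. The Python below is a minimal illustration of that idea only; nobody in the thread posted it. The function name, the is_fast callback, and the revision numbers 760000 and 712345 are hypothetical. It also assumes that once a revision is slow, every later revision is slow as well.

```python
from typing import Callable


def first_slow_revision(good: int, bad: int, is_fast: Callable[[int], bool]) -> int:
    """Return the earliest revision in (good, bad] for which is_fast() is False.

    Assumes `good` is fast, `bad` is slow, and slowness never goes away
    once introduced (the usual bisection precondition).
    """
    if good >= bad:
        raise ValueError("good revision must precede bad revision")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_fast(mid):
            good = mid  # the regression was introduced after mid
        else:
            bad = mid   # the regression was introduced at or before mid
    return bad


if __name__ == "__main__":
    # Hypothetical stand-in for "check out revision N, rebuild, time the import".
    PRETEND_REGRESSION = 712345  # made-up value, for demonstration only

    def is_fast(rev: int) -> bool:
        return rev < PRETEND_REGRESSION

    # 701485 is the known-good revision named in the thread;
    # 760000 is a made-up placeholder for a known-slow revision.
    print(first_slow_revision(701485, 760000, is_fast))
```

In practice is_fast would wrap whatever timing script is already in use, so each step costs one checkout, build, and import run; a gap of tens of thousands of revisions needs only around sixteen such runs.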