Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-15 Thread Fergus McMenemie
On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote:

 Grant,



 I should note, however, that the speed difference you are seeing may
 not be as pronounced as it appears.  If I recall during ApacheCon, I
 commented on how long it takes to shutdown your Solr instance when
 exiting it.  That time it takes is in fact Solr doing the work that
 was put off by not committing earlier and having all those deletes
 pile up.

 I am confused about work that was put off vs committing. My script
 was doing a commit right after the CSV import, and you are right
 about the massive times required to shut tomcat down. But in my tests
 the time taken to do the commit was under a second, yet I had to allow
 300 secs for tomcat shutdown. Also I don't have any duplicates. So
 what sort of work was being done at shutdown that was not being done
 by a commit? Optimise!


The work being done is addressing the deletes, AIUI, but of course  
there are other things happening during shutdown, too.
There are no deletes to do. It was a clean index to begin with
and there were no duplicates.

How long is the shutdown if you do a commit first and then a shutdown?
Still very long, sometimes 300sec. My script always did a commit!

At any rate, I don't know that there is a satisfying answer to the  
larger issue due to the things like the fsync stuff, which is an  
overall win for Lucene/Solr despite it being slower.  Have you
tried running the tests on other machines (non-Mac)?
Nope. Although next week I will have a real PC running Vista, so
I could try it there.

I think we should knock this on the head and move on. I rarely
need to index this content and I can take the performance hit,
and of course your work around provides a good speed up. 

Regards Fergus.
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-15 Thread Ryan McKinley


The work being done is addressing the deletes, AIUI, but of course
there are other things happening during shutdown, too.

There are no deletes to do. It was a clean index to begin with
and there were no duplicates.



I have not followed this thread, so forgive me if this has already  
been suggested


If you know that there are not any duplicates, have you tried indexing  
with allowDups=true?


It will not change the fsync cost, but it may reduce some other  
checking times.


ryan


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Fergus McMenemie
On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote:

 Grant,

 Redoing the work with your patch applied does not seem to
 make a difference! Is this the expected result?

No, I didn't expect Solr 1095 to fix the problem. Overwrite = false +  
1095, does, however, AFAICT by your last line, right?



 I did run it again using the full file, this time using my iMac:-
  rev        took         rev date
  643465     22min 14sec  2008-04-01
  734796     73min 58sec  2009-01-15
  758795     70min 55sec  2009-03-26
 Again using only the first 1M records with commit=false&overwrite=true:-
  643465     2m51.516s    2008-04-01
  734796     7m29.326s    2009-01-15
  758795     8m18.403s    2009-03-26
  SOLR-1095  7m41.699s
 this time with commit=true&overwrite=true.
  643465     2m49.200s    2008-04-01
  734796     8m27.414s    2009-01-15
  758795     9m32.459s    2009-03-26
  SOLR-1095  7m58.825s
 this time with commit=false&overwrite=false.
  643465     2m46.149s    2008-04-01
  734796     3m29.909s    2009-01-15
  758795     3m26.248s    2009-03-26
  SOLR-1095  2m49.997s

Grant,

Hmmm, the big difference is made by overwrite=false. But
can you explain why overwrite=false makes such a difference?
I am starting off with an empty index and I have checked the
content: there are no duplicates in the uniqueKey field.

I guess if overwrite=false then a few checks can be removed
from the indexing process, and if I am confident that my content
contains no duplicates then this is a good speed up. 

http://wiki.apache.org/solr/UpdateCSV says that if overwrite 
is true (the default) then overwrite documents based on the
uniqueKey. However what will solr/lucene do if the uniqueKey
is not unique and overwrite=false?  

fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | wc -l
 100
fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | sort -u | wc -l
 100
fergus: /usr/bin/head geonames.txt
RC  UFI UNI LAT LONGDMS_LAT DMS_LONGMGRSJOG 
FC  DSG PC  CC1 ADM1ADM2POP ELEVCC2 NT  
LC  SHORT_FORM  GENERIC SORT_NAME   FULL_NAME   FULL_NAME_ND
MODIFY_DATE
1   -130782860524   12.47   -69.9   122800  -695400 
19PDP0219578323 ND19-14 T   MT  AA  00  
PALUMARGA   Palu Marga  Palu Marga  1995-03-23
1   -1307756-189172012.5-70.016667  123000  -700100 
19PCP8952982056 ND19-14 P   PPLX
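
The two perl pipelines above compare the total line count with the count of distinct values in the third tab-separated column. A minimal Python sketch of the same check is given here; it goes one step further and reports which values repeat. The function name and the three-row sample are made up for illustration, and the real input would be the lines of geonames.txt.

```python
import io
from collections import Counter


def duplicate_keys(lines, column=2, delimiter="\t"):
    """Return {value: count} for every value that occurs more than once
    in the given zero-based column (column 2 matches perl's $F[2])."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) > column:
            counts[fields[column]] += 1
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample with one repeated key; not taken from the real file.
sample = io.StringIO(
    "1\t-1307828\t60524\n"
    "1\t-1307756\t-1891720\n"
    "1\t-1307700\t60524\n"
)
dupes = duplicate_keys(sample)
print(dupes)
```

An empty result corresponds to the case in the thread where both counts agree.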

PS. do you want me to do some kind of chop through the
different versions to see where the slow down happened
or are you happy you have nailed it?
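
The "chop through the different versions" idea can be done as a binary search over revisions, so that only about log2(N) builds need timing. The following is a rough sketch, not the script used in the thread: `is_slow` stands in for "check out, build, run the import and compare against a threshold", and the intermediate revision numbers and the predicate in the example are invented.

```python
def first_slow_revision(revisions, is_slow):
    """Binary search for the first slow revision.

    Assumes `revisions` is sorted ascending, the first entry is known to be
    fast, the last is known to be slow, and slowness never reverts.
    """
    lo, hi = 0, len(revisions) - 1  # revisions[lo] is fast, revisions[hi] is slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# 643465 (fast) and 694377 (slow) come from the thread; the values in
# between and the predicate are placeholders for illustration only.
candidates = [643465, 650000, 660000, 670000, 680000, 694377]
culprit = first_slow_revision(candidates, lambda rev: rev >= 670000)
print(culprit)
```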
-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Grant Ingersoll


On Apr 2, 2009, at 4:02 AM, Fergus McMenemie wrote:

Grant,

Hmmm, the big difference is made by overwrite=false. But
can you explain why overwrite=false makes such a difference?
I am starting off with an empty index and I have checked the
content: there are no duplicates in the uniqueKey field.

I guess if overwrite=false then a few checks can be removed
from the indexing process, and if I am confident that my content
contains no duplicates then this is a good speed up.

http://wiki.apache.org/solr/UpdateCSV says that if overwrite
is true (the default) then overwrite documents based on the
uniqueKey. However what will solr/lucene do if the uniqueKey
is not unique and overwrite=false?


overwrite=false means Solr does not issue deletes first, meaning if  
you have a doc w/ that id already, you will now have two docs with  
that id.   unique Id is enforced by Solr, not by Lucene.
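
The behaviour described above can be pictured with a toy model. This is only an in-memory illustration of the semantics, assuming a list of dicts as the "index" and a field called `id` as the unique key; it is not how Solr or Lucene are implemented.

```python
def add_doc(index, doc, overwrite=True, key="id"):
    """Toy model of an add: with overwrite, earlier docs that share the
    unique key are deleted first; without it, the doc is simply appended."""
    if overwrite:
        index[:] = [d for d in index if d[key] != doc[key]]
    index.append(doc)


index = []
add_doc(index, {"id": "60524", "name": "Palu Marga"})
add_doc(index, {"id": "60524", "name": "Palu Marga"}, overwrite=False)
duplicates_after_no_overwrite = len(index)   # two docs now share one id

add_doc(index, {"id": "60524", "name": "Palu Marga"}, overwrite=True)
docs_after_overwrite = len(index)            # earlier copies were removed
print(duplicates_after_no_overwrite, docs_after_overwrite)
```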


Even if you can't guarantee uniqueness, you can still do overwrite =  
false as a workaround using the suggestion I gave you in a prior email:
1. Add a new field that is unique for your data source, but is the  
same for all records in that data source.  i.e. type = geonames.txt
2. Before updating, issue a delete by query for the value of that  
type, which will delete all records with that term

3. Do your indexing with overwrite = false
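
The three steps above could be scripted roughly as follows. This sketch only builds the request body and URL and sends nothing. Several things are assumptions rather than facts from the thread: the base URL, the file path, a CSV that already carries a `type` column, and the `stream.file` and `separator` parameters of the CSV update handler (which also require remote streaming to be enabled).

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape


def delete_by_query_body(field, value):
    """XML body to POST to /update to delete every doc with field:value."""
    return "<delete><query>%s:%s</query></delete>" % (escape(field), escape(value))


def csv_update_url(base_url, csv_path, overwrite=False, commit=True):
    """URL for the CSV update handler; parameter choices are assumptions."""
    params = {
        "stream.file": csv_path,      # assumed: server-side file streaming
        "separator": "\t",            # assumed: tab-separated input
        "overwrite": str(overwrite).lower(),
        "commit": str(commit).lower(),
    }
    return base_url.rstrip("/") + "/update/csv?" + urlencode(params)


# Hypothetical host, port and path, for illustration only.
body = delete_by_query_body("type", "geonames.txt")
url = csv_update_url("http://localhost:8080/solr", "/tmp/geonames.txt")
print(body)
print(url)
```

The delete body would be posted first and committed, then the CSV request issued, so the add never has to look up existing documents by id.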

I should note, however, that the speed difference you are seeing may  
not be as pronounced as it appears.  If I recall during ApacheCon, I  
commented on how long it takes to shutdown your Solr instance when  
exiting it.  That time it takes is in fact Solr doing the work that  
was put off by not committing earlier and having all those deletes  
pile up.


Thus, while it is likely that your older version is still faster due  
to the new fsync stuff in Lucene, it may not be that much faster.  I  
think you could see this by actually doing commit = true, but I'm not  
100% sure.






fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | wc -l
100
fergus: perl -nlaF\t -e 'print $F[2];' geonames.txt | sort -u | wc -l
100
fergus: /usr/bin/head geonames.txt
RC	UFI	UNI	LAT	LONG	DMS_LAT	DMS_LONG	MGRS	JOG	FC	DSG	PC	CC1	ADM1	 
ADM2	POP	ELEV	CC2	NT	LC	SHORT_FORM	GENERIC	SORT_NAME	FULL_NAME	 
FULL_NAME_ND	MODIFY_DATE
1	-1307828	60524	12.47	-69.9	122800	-695400	19PDP0219578323	 
ND19-14	T	MT		AA	00	PALUMARGA	Palu Marga	Palu Marga	1995-03-23
1	-1307756	-1891720	12.5	-70.016667	123000	-700100	19PCP8952982056	 
ND19-14	P	PPLX	


PS. do you want me to do some kind of chop through the
different versions to see where the slow down happened
or are you happy you have nailed it?

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-02 Thread Grant Ingersoll


On Apr 2, 2009, at 9:23 AM, Fergus McMenemie wrote:


Grant,




I should note, however, that the speed difference you are seeing may
not be as pronounced as it appears.  If I recall during ApacheCon, I
commented on how long it takes to shutdown your Solr instance when
exiting it.  That time it takes is in fact Solr doing the work that
was put off by not committing earlier and having all those deletes
pile up.


I am confused about work that was put off vs committing. My script
was doing a commit right after the CSV import, and you are right
about the massive times required to shut tomcat down. But in my tests
the time taken to do the commit was under a second, yet I had to allow
300 secs for tomcat shutdown. Also I don't have any duplicates. So
what sort of work was being done at shutdown that was not being done
by a commit? Optimise!



The work being done is addressing the deletes, AIUI, but of course  
there are other things happening during shutdown, too.


How long is the shutdown if you do a commit first and then a shutdown?

At any rate, I don't know that there is a satisfying answer to the  
larger issue due to the things like the fsync stuff, which is an  
overall win for Lucene/Solr despite it being slower.  Have you
tried running the tests on other machines (non-Mac)?


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-01 Thread Fergus McMenemie
Grant,

Redoing the work with your patch applied does not seem to 
make a difference! Is this the expected result?

I did run it again using the full file, this time using my iMac:-
rev        took         rev date
643465     22min 14sec  2008-04-01
734796     73min 58sec  2009-01-15
758795     70min 55sec  2009-03-26
Again using only the first 1M records with commit=false&overwrite=true:-
643465     2m51.516s    2008-04-01
734796     7m29.326s    2009-01-15
758795     8m18.403s    2009-03-26
SOLR-1095  7m41.699s
this time with commit=true&overwrite=true.
643465     2m49.200s    2008-04-01
734796     8m27.414s    2009-01-15
758795     9m32.459s    2009-03-26
SOLR-1095  7m58.825s
this time with commit=false&overwrite=false.
643465     2m46.149s    2008-04-01
734796     3m29.909s    2009-01-15
758795     3m26.248s    2009-03-26
SOLR-1095  2m49.997s
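
To turn timings like these into slowdown factors, something along these lines may help. It is a small sketch with made-up helper names; it parses the two time formats used in the figures and divides by the rev 643465 baseline. The sample data is the full-file run quoted above.

```python
import re


def to_seconds(text):
    """Parse strings such as '22min 14sec' or '2m51.516s' into seconds."""
    match = re.fullmatch(r"(\d+)\s*m(?:in)?\s*([\d.]+)\s*s(?:ec)?", text.strip())
    if match is None:
        raise ValueError("unrecognised duration: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(baseline, candidate):
    """How many times longer `candidate` took than `baseline`."""
    return to_seconds(candidate) / to_seconds(baseline)


# Full-file timings from the thread, keyed by revision.
full_file = {"643465": "22min 14sec", "734796": "73min 58sec", "758795": "70min 55sec"}
for rev, took in full_file.items():
    print(rev, round(slowdown(full_file["643465"], took), 2))
```

On the full file both later revisions come out a little over 3x the baseline, which matches the subject line of the thread.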


-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-04-01 Thread Grant Ingersoll


On Apr 1, 2009, at 9:39 AM, Fergus McMenemie wrote:


Grant,

Redoing the work with your patch applied does not seem to
make a difference! Is this the expected result?


No, I didn't expect Solr 1095 to fix the problem. Overwrite = false +  
1095, does, however, AFAICT by your last line, right?





I did run it again using the full file, this time using my iMac:-
rev        took         rev date
643465     22min 14sec  2008-04-01
734796     73min 58sec  2009-01-15
758795     70min 55sec  2009-03-26
Again using only the first 1M records with commit=false&overwrite=true:-
643465     2m51.516s    2008-04-01
734796     7m29.326s    2009-01-15
758795     8m18.403s    2009-03-26
SOLR-1095  7m41.699s
this time with commit=true&overwrite=true.
643465     2m49.200s    2008-04-01
734796     8m27.414s    2009-01-15
758795     9m32.459s    2009-03-26
SOLR-1095  7m58.825s
this time with commit=false&overwrite=false.
643465     2m46.149s    2008-04-01
734796     3m29.909s    2009-01-15
758795     3m26.248s    2009-03-26
SOLR-1095  2m49.997s


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-31 Thread Grant Ingersoll
svn co -r REV_NUM  https://svn.apache.org/repos/asf/lucene/solr/trunk  
solr-REV_NUM
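
To fetch several revisions in one go, the command above can be generated per revision. This is a minimal sketch that only builds the command lines and does not run them; the repository URL is copied from the message as written and may no longer resolve, and the helper name is made up.

```python
SOLR_TRUNK = "https://svn.apache.org/repos/asf/lucene/solr/trunk"


def checkout_command(rev, url=SOLR_TRUNK):
    """Argument list for: svn co -r REV URL solr-REV"""
    return ["svn", "co", "-r", str(rev), url, "solr-%d" % rev]


# Revisions mentioned in the thread.
commands = [checkout_command(rev) for rev in (643465, 701485, 758795)]
for command in commands:
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` once the URL has been confirmed.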


-Grant


On Mar 30, 2009, at 4:55 PM, Fergus McMenemie wrote:


Can you verify that rev 701485 still performs reasonably well?  This
is from October 2008 and I get similar results to the earlier rev.
Am now trying some other versions between October and when you first
reported the issue in November.


OK. Can you tell me how to get a hold of revision 701485. What is the
magic svn line?



On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:


Fergus,

Is rev 643465 the absolute latest you tried that still performs?
i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:


Fergus,

I think the problem may actually be due to something that was
introduced by a change to Solr's StopFilterFactory and the way it
loads the stop words set.  See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:


Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not  
sure:

1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-31 Thread Grant Ingersoll
Can you try adding overwrite=false and running against the latest  
version?  My current working theory is that Solr/Lucene has changed  
how deletes are handled such that work that was deferred before is now  
not deferred as often.  In fact, you are not seeing this cost paid (or  
at least not noticing it) because you are not committing, but I  
believe you do see it when you are closing down Solr, which is why it  
takes so long to exit.  I also think that Lucene adding fsync() into  
the equation may cause some slow down, but that is a penalty we are  
willing to pay as it gives us higher data integrity.


So, depending on how you have your data, I think a workaround is to:
Add a field that contains a single term identifying the data type for  
this particular CSV file, i.e. something like field: type, value:  
fergs-csv
Then, before indexing, you can issue a Delete By Query: type:fergs-csv  
and then add your CSV file using overwrite=false.  This amounts to a  
batch delete followed by a batch add, but without the add having to  
issue deletes for each add.


In the meantime, I'm trying to see if I can pin down a specific
change and see if there is anything that might help it perform better.


-Grant

On Mar 30, 2009, at 4:52 PM, Fergus McMenemie wrote:


Grant,

After all my playing about at boot camp, I gave things a rest. It
was not till months later that I got back to looking at solr again.
So after 643465 (2008-Apr-01) the next version I tried was 694377
(2008-Sep-11). Nothing in between. Yep, so 643465 is the latest
version I tried that still performs. Every later revision is slower.

However I need to repeat the tests using 643465, 694377 and whatever
is the latest version. On my MacBook I am only seeing a 2x slowdown
from 643465 to today's version, whereas I had been seeing a 3x
slowdown using my iMac.

Fergus



Fergus,

Is rev 643465 the absolute latest you tried that still performs?
i.e. every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:


Fergus,

I think the problem may actually be due to something that was
introduced by a change to Solr's StopFilterFactory and the way it
loads the stop words set.  See https://issues.apache.org/jira/browse/SOLR-1095

I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:


Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
 at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
 at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
 at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-31 Thread Fergus McMenemie
Grant,

I am messing with the script, and with your tip I expect I can
make it recurse over as many releases as needed.

I did run it again using the full file, this time using my iMac:-
rev        took         rev date
643465     22min 14sec  2008-04-01
734796     73min 58sec  2009-01-15
758795     70min 55sec  2009-03-26
I then ran it again using only the first 1M records:-
643465     2m51.516s    2008-04-01
734796     7m29.326s    2009-01-15
758795     8m18.403s    2009-03-26
this time with commit=true.
643465     2m49.200s    2008-04-01
734796     8m27.414s    2009-01-15
758795     9m32.459s    2009-03-26
this time with commit=false&overwrite=false.
643465     2m46.149s    2008-04-01
734796     3m29.909s    2009-01-15
758795     3m26.248s    2009-03-26

Just read your latest post. I will apply the patches and retest
the above.

Can you try adding overwrite=false and running against the latest  
version?  My current working theory is that Solr/Lucene has changed  
how deletes are handled such that work that was deferred before is now  
not deferred as often.  In fact, you are not seeing this cost paid (or  
at least not noticing it) because you are not committing, but I  
believe you do see it when you are closing down Solr, which is why it  
takes so long to exit.
It can take ages! (15min to get tomcat to quit). Also my script does
have the separate commit step, which does not take any time!

I also think that Lucene adding fsync() into  
the equation may cause some slow down, but that is a penalty we are  
willing to pay as it gives us higher data integrity.
Data integrity is always good. However if performance seems
unreasonable, user/customers tend to take things into their
own hands and kill the process or machine. This tends to be
very bad for data integrity.

So, depending on how you have your data, I think a workaround is to:
Add a field that contains a single term identifying the data type for  
this particular CSV file, i.e. something like field: type, value:  
fergs-csv
Then, before indexing, you can issue a Delete By Query: type:fergs-csv  
and then add your CSV file using overwrite=false.  This amounts to a  
batch delete followed by a batch add, but without the add having to  
issue deletes for each add.
Ok.. but... for these test cases I am starting off with an empty
index. The script does a rm -rf solr/data before tomcat is launched.
So I do not understand how the above helps. UNLESS there are duplicate
gaz entries.

In the meantime, I'm trying to see if I can pin down a specific
change and see if there is anything that might help it perform better.

-Grant


-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Grant Ingersoll

Fergus,

I think the problem may actually be due to something that was  
introduced by a change to Solr's StopFilterFactory and the way it  
loads the stop words set.  See https://issues.apache.org/jira/browse/SOLR-1095


I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:


Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
   at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Grant Ingersoll

Fergus,

Is rev 643465 the absolute latest you tried that still performs?  i.e.  
every revision after is slower?


-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:


Fergus,

I think the problem may actually be due to something that was  
introduced by a change to Solr's StopFilterFactory and the way it  
loads the stop words set.  See https://issues.apache.org/jira/browse/SOLR-1095


I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:


Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
  at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
  at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
  at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Grant Ingersoll
Can you verify that rev 701485 still performs reasonably well?  This  
is from October 2008 and I get similar results to the earlier rev. 
Am now trying some other versions between October and when you first  
reported the issue in November.


-Grant

On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:


Fergus,

Is rev 643465 the absolute latest you tried that still performs?   
i.e. every revision after is slower?


-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:


Fergus,

I think the problem may actually be due to something that was  
introduced by a change to Solr's StopFilterFactory and the way it  
loads the stop words set.  See https://issues.apache.org/jira/browse/SOLR-1095


I am in the process of testing it out and will let you know.

-Grant

On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:


Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
 at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
 at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
 at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Fergus McMenemie
Grant,

After all my playing about at boot camp, I gave things a rest. It
was not till months later that I got back to looking at solr again.
So after 643465 (2008-Apr-01) the next version I tried was 694377
(2008-Sep-11). Nothing in between. Yep, so 643465 is the latest
version I tried that still performs. Every later revision is slower.

However I need to repeat the tests using 643465, 694377 and whatever
is the latest version. On my MacBook I am only seeing a 2x slowdown
from 643465 to today's version, whereas I had been seeing a 3x
slowdown using my iMac.

Fergus


Fergus,

Is rev 643465 the absolute latest you tried that still performs?  i.e.  
every revision after is slower?

-Grant

On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

 Fergus,

 I think the problem may actually be due to something that was  
 introduced by a change to Solr's StopFilterFactory and the way it  
 loads the stop words set.  See 
 https://issues.apache.org/jira/browse/SOLR-1095

 I am in the process of testing it out and will let you know.

 -Grant

 On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

 Hey Fergus,

 Finally got a chance to run your scripts, etc. per the thread:
 http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

 I can reproduce your slowdown.

 One oddity with rev 643465 is:

 On the old version, there is an exception during startup:
 Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
 SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
    at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:637)

 I see two things in CHANGES.txt that might apply, but I'm not sure:
 1. I think commons-csv was upgraded
 2. The CSV loader stuff was refactored to share common code

 I'm still investigating.

 -Grant

 --
 Grant Ingersoll
 http://www.lucidimagination.com/

 Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
 using Solr/Lucene:
 http://www.lucidimagination.com/search

-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


Re: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-30 Thread Fergus McMenemie
Can you verify that rev 701485 still performs reasonably well?  This  
is from October 2008 and I get similar results to the earlier rev. 
Am now trying some other versions between October and when you first  
reported the issue in November.

OK. Can you tell me how to get hold of revision 701485? What is the
magic svn line?
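
For reference, Subversion's general form for fetching a specific
revision is "svn checkout -r REV URL DIR". A small sketch in Python
that builds that line. The repository URL below is an assumption about
where Solr trunk lived at the time and is not confirmed anywhere in
this thread; the destination directory name is likewise made up for
illustration.

```python
import subprocess

# ASSUMPTION: location of Solr trunk in the ASF repository; substitute
# whatever URL your existing working copy reports via "svn info".
SOLR_TRUNK_URL = "http://svn.apache.org/repos/asf/lucene/solr/trunk"


def checkout_revision(revision, url=SOLR_TRUNK_URL, dest=None, dry_run=True):
    """Build (and optionally run) an svn checkout pinned to one revision."""
    if dest is None:
        dest = f"solr-r{revision}"  # hypothetical directory naming scheme
    cmd = ["svn", "checkout", "-r", str(revision), url, dest]
    if not dry_run:
        # Needs the svn client on PATH and network access to the repository.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: only print the command line instead of contacting the server.
    print(" ".join(checkout_revision(701485)))
```

An existing working copy can also be moved to that revision in place
with "svn update -r 701485".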


On Mar 30, 2009, at 3:37 PM, Grant Ingersoll wrote:

 Fergus,

 Is rev 643465 the absolute latest you tried that still performs?   
 i.e. every revision after is slower?

 -Grant

 On Mar 30, 2009, at 12:45 PM, Grant Ingersoll wrote:

 Fergus,

 I think the problem may actually be due to something that was  
 introduced by a change to Solr's StopFilterFactory and the way it  
 loads the stop words set.  See 
 https://issues.apache.org/jira/browse/SOLR-1095

 I am in the process of testing it out and will let you know.

 -Grant

 On Mar 28, 2009, at 11:00 AM, Grant Ingersoll wrote:

 Hey Fergus,

 Finally got a chance to run your scripts, etc. per the thread:
 http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

 I can reproduce your slowdown.

 One oddity with rev 643465 is:

 On the old version, there is an exception during startup:
 Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
 SEVERE: java.lang.NullPointerException
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
  at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
  at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
  at java.lang.Thread.run(Thread.java:637)

 I see two things in CHANGES.txt that might apply, but I'm not sure:
 1. I think commons-csv was upgraded
 2. The CSV loader stuff was refactored to share common code

 I'm still investigating.

 -Grant

 --
 Grant Ingersoll
 http://www.lucidimagination.com/

 Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
 using Solr/Lucene:
 http://www.lucidimagination.com/search



-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk
Techmore Ltd   Phone:(UK) 07721 376021

Unix/Mac/Intranets Analyst Programmer
===


RE: [solr-user] Upgrade from 1.2 to 1.3 gives 3x slowdown

2009-03-28 Thread Grant Ingersoll

Hey Fergus,

Finally got a chance to run your scripts, etc. per the thread:
http://www.lucidimagination.com/search/document/5c3de15a4e61095c/upgrade_from_1_2_to_1_3_gives_3x_slowdown_script#8324a98d8840c623

I can reproduce your slowdown.

One oddity with rev 643465 is:

On the old version, there is an exception during startup:
Mar 28, 2009 10:44:31 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:129)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:953)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:968)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:50)
    at org.apache.solr.core.SolrCore$3.call(SolrCore.java:797)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:637)

I see two things in CHANGES.txt that might apply, but I'm not sure:
1. I think commons-csv was upgraded
2. The CSV loader stuff was refactored to share common code

I'm still investigating.

-Grant