Re: Replication in Solr 1.4 - redirecting update handlers?

2009-06-19 Thread Doug Steigerwald
We had a similar issue using acts_as_solr.  We already had lighttpd
running on some servers, so we just proxied all requests for
/solr/CORE/update to the master and /solr/CORE/select to a
load-balanced IP for our slaves.
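
For reference, something along these lines in lighttpd.conf (a sketch,
not our exact config - the IPs and port are placeholders):

  server.modules += ( "mod_proxy" )

  # updates go to the master
  $HTTP["url"] =~ "^/solr/[^/]+/update" {
    proxy.server = ( "" => ( ( "host" => "10.0.0.10", "port" => 8983 ) ) )
  }
  # selects go to the load-balanced slave VIP
  $HTTP["url"] =~ "^/solr/[^/]+/select" {
    proxy.server = ( "" => ( ( "host" => "10.0.0.20", "port" => 8983 ) ) )
  }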


Doug

On Jun 19, 2009, at 11:42 AM, Mark A. Matienzo wrote:


I'm trying to figure out the best solution to the following issue.
We've got three boxes in our replication setup - one master and two
load-balanced slaves, all of which serve Solr using Tomcat.  Given
this setup, we're also using the Drupal apachesolr module, which
currently supports only one Solr host in its configuration.  What is
the best way to make this transparent to the Drupal module?  Is it
possible to have some sort of phony update handler to redirect the
update requests to the master box from within Solr, or is this
something that would be more properly implemented in the Tomcat
configuration?

Mark A. Matienzo
Applications Developer, Digital Experience Group
The New York Public Library




Re: Issue with AND/OR Operator in Dismax Request

2009-05-20 Thread Doug Steigerwald

http://issues.apache.org/jira/browse/SOLR-405 ?

It's quite old and may not be exactly what you want, but I think it
might be the JIRA ticket that Otis mentioned.  Using a filter query
was what we really needed.  I'm also not really sure why you need a
dismax query at all.  You're not querying for the same thing in
multiple fields.
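
For example, something like this keeps dismax for the q boosting but
moves the Boolean logic into fq (a sketch - the q value is a
placeholder; fq is parsed by the standard Lucene query parser, so
AND/OR behave normally there):

  ?q=<user's search terms>
  &qt=dismaxrequest
  &fq=(intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[3 TO *])
      OR (intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[0 TO 3])
      OR ageFrom_product_s:Adult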


Doug

On May 20, 2009, at 1:18 PM, dabboo wrote:



Hi,

Yeah, you are right.  Can you please tell me the URL of the JIRA issue?

Thanks,
Amit



Otis Gospodnetic wrote:



Amit,

That's the same question as the other day, right?
Yes, DisMax doesn't play well with Boolean operators.  Check JIRA, it
has a search box, so you may be able to find related patches.
I think the patch I was thinking about is actually for something
else - allowing field names to be specified in the query string and
DisMax handling that correctly.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----

From: dabboo ag...@sapient.com
To: solr-user@lucene.apache.org
Sent: Wednesday, May 20, 2009 1:35:00 AM
Subject: Issue with AND/OR Operator in Dismax Request


Hi,

I am not getting correct results with a query which has multiple
AND | OR operators.

Query Format: q=((A AND B) OR (C OR D) OR E)

?q=((intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[3+TO+*])+OR+(intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[0+TO+3])+OR+(ageFrom_product_s:Adult))&qt=dismaxrequest



The query returns correct results without dismaxrequest, but incorrect
results with dismaxrequest.

I have to use dismaxrequest because I need boosting of search results.

According to some posts there are issues with the AND | OR operators
with dismaxrequest.  Please let me know if anyone has faced the same
problem and if there is any way to make the query work with
dismaxrequest.

I also believe that there is some patch available for this in one of
the JIRA issues.  I would appreciate it if somebody can let me know
the URL, so that I can take a look at the patch.

Thanks for the help.

Thanks,
Amit Garg











MoreLikeThis filtering

2009-03-04 Thread Doug Steigerwald
Is it possible to filter similarities found by the MLT
component/handler?  Something like mlt.fq=site_id:86?


We have 32 cores in our Solr install, and some of those cores have up
to 8 sites indexed in them.  Typically those cores will have one very
large site with a few hundred thousand indexed documents, and lots of
small sites with significantly fewer documents indexed.


We're looking to implement a MLT component for our sites but want the  
similar stories to be only for a specific site (not all sites in the  
core).


Is there a way to do something like this, or will we have to make mods
(I'm not seeing anything jump out at me in the Solr 1.3.0 or Lucene
2.4.0 code)?

/solr/dsteiger/mlt?q=story_id:188665+AND+site_id:86&mlt.fq=site_id:86

(We have all of our other defaults set up in the handler config.)

Thanks.
---
Doug Steigerwald
Software Developer
McClatchy Interactive
dsteigerw...@mcclatchyinteractive.com



Re: MoreLikeThis filtering

2009-03-04 Thread Doug Steigerwald
'fq' seems to only work for finding the documents with your original
query, not for filtering the similar documents.


Doug

On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:



Doug,

does the good old 'fq' not work with MLT?  It should...


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch







Re: MoreLikeThis filtering

2009-03-04 Thread Doug Steigerwald
Hm.  I checked out a clean Solr 1.3.0, indexed the example docs, and
set up a simple MLT handler; the example queries on the Wiki work fine
(fq can filter out docs).  Our build has a slight change to
QueryComponent so another query isn't done when we use localsolr +
field collapsing, but that change doesn't look like it would make a
difference.  It just conditionally calls rb.setNeedDocSet() with true
or false.


Will run some tests on a clean fresh build of Solr to see if it's our  
build.


Doug

On Mar 4, 2009, at 9:28 AM, Otis Gospodnetic wrote:



Doug,

does the good old 'fq' not work with MLT?  It should...


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch







Re: MoreLikeThis filtering

2009-03-04 Thread Doug Steigerwald
Sorry.  The examples on the wiki aren't working with 'fq' to filter
the similarities.  It just filters the actual query.


http://localhost:8983/solr/mlt?q=id:SP2514N&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fq=popularity:6&mlt.displayTerms=details&mlt=true

The popularity of the doc found is 6, and trying to use
'fq=popularity:6' brings back similarities with a popularity other
than 6.


Doug





Re: MoreLikeThis filtering

2009-03-04 Thread Doug Steigerwald

Hah.  Sorry, I'm really out of it today.

The MoreLikeThisComponent doesn't seem to work for filtering using fq,  
but the MoreLikeThisHandler does.


Problem solved, we'll just use the handler instead of a component.
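
For anyone hitting the same thing, the working setup looks roughly
like this (a sketch based on the example queries above, not our
production config):

  <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
    <lst name="defaults">
      <str name="mlt.fl">manu,cat</str>
      <int name="mlt.mindf">1</int>
      <int name="mlt.mintf">1</int>
    </lst>
  </requestHandler>

and the fq on the request then filters the similar docs as expected:

  http://localhost:8983/solr/mlt?q=id:SP2514N&fq=popularity:6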

Doug





Re: Applying Field Collapsing Patch

2008-12-11 Thread Doug Steigerwald
Have you tried just checking out (or exporting) the source from SVN  
and applying the patch?  Works fine for me that way.


$ svn co http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.3.0 solr-1.3.0
$ cd solr-1.3.0 ; patch -p0 < ~/Downloads/collapsing-patch-to-1.3.0-ivan_2.patch


Doug

On Dec 11, 2008, at 3:50 PM, John Martyniak wrote:

It was a completely clean install.  I downloaded it from one of the
mirrors right before applying the patch to it.

Very troubling.  Any other suggestions or ideas?

I am running it on Mac OS.  Maybe I will try looking for some answers
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only ivan patch 2 (and before, only ivan patch 1).  My sense was
that these patches were meant to be used in isolation (there were no
notes saying to apply any other patches first).

Are you using patches for any other purpose (non-SOLR-236)?  Maybe you
need to apply this one first, then those patches.  For me, using any
patch makes me nervous (we have a pretty strict policy about using
beta code anywhere); I'm only doing it this once because it's
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


thanks for the advice.

I just downloaded a completely clean version, haven't even tried  
to build it yet.


Applied the patch the same way, and I received exactly the same results.

Do you only apply the ivan patch 2?  What version of patch are you  
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source?  Every time I've
applied his patch I grab a fresh copy of the tarball and run the exact
same command, and it always works for me.


Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3 (not a
nightly), and it continuously fails.  I am using the following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write on all
directories and files.


I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/solr/search/TestDocSet.java.rej
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSet.java.rej
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John










Re: Problems with SOLR-236 (field collapsing)

2008-12-10 Thread Doug Steigerwald
The first output is from the query component.  You might just need to
make the collapse component first and remove the query component
completely.
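
Something like this in the handler's components list, for example (a
sketch - keep whatever other components you actually use):

  <arr name="components">
    <!-- collapse replaces query as the first component -->
    <str>collapse</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>debug</str>
  </arr>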


We perform geographic searching with localsolr first (if we need to),
and then try to collapse those results (if collapse=true).  If we
don't have any results yet, that's the only time we use the standard
query component.  I'm making sure we set builder.setNeedDocSet(false),
and then I modified the query component to only execute when
builder.isNeedDocSet() is true.


In the field collapsing patch that I'm using, I've got code to remove  
a previous 'response' from the builder.rsp so we don't have duplicates.


Now, if I could get field collapsing to work properly with a
docSet/docList from localsolr and also have faceting work, I'd be
golden.


Doug

On Dec 9, 2008, at 9:37 PM, Stephen Weiss wrote:


Hi Tracy,

Well, I managed to get it working (I think) but the weird thing is,  
in the XML output it gives both recordsets (the filtered and  
unfiltered - filtered second).  In the JSON (the one I actually use  
anyway, at least) I only get the filtered results (as expected).


In my core's solrconfig.xml, I added:

  <searchComponent name="collapse"
      class="org.apache.solr.handler.component.CollapseComponent" />

(I'm not sure if it's supposed to go anywhere in particular, but for
me it's right before the StandardRequestHandler.)


and then within the StandardRequestHandler:

  <requestHandler name="standard" class="solr.StandardRequestHandler">
    <!-- default values for query parameters -->
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <!--
      <int name="rows">10</int>
      <str name="fl">*</str>
      <str name="version">2.1</str>
      -->
    </lst>
    <arr name="components">
      <str>query</str>
      <str>facet</str>
      <str>mlt</str>
      <str>highlight</str>
      <str>debug</str>
      <str>collapse</str>
    </arr>
  </requestHandler>


Which is basically all the default values plus collapse.  Not sure if
this was needed for prior versions; I don't see it in any patch files
(I just got a vague idea from looking at a comment from someone else
who said it wasn't working for them).  It would kinda be nice if
someone working on the code might throw us a bone and say explicitly
what the right options to put in the config file are (if there are
even supposed to be any - for all I know, this is just a bandaid over
a larger problem).  I know it's not done yet though... just a pointer
for this patch might be handy; it's a really useful feature if it
works (I was kinda shocked this wasn't part of the standard
distribution, since it's something I had to do so often with MySQL -
kinda lucky, I guess, that it only came up now).


Another issue I'm having now is that the faceting doesn't seem to
change - even if I set the collapse.facet option to 'after'...  I
should really try 'before' and see what happens.


Of course, I just realized the integrity of my collapse field is not  
so great so I have to go back and redo the data :-)


Best of luck.

--
Steve

On Dec 9, 2008, at 7:49 PM, Tracy Flynn (SOLR) wrote:


Steve,

I need this too. As my previous posting said, I adapted the 1.2  
field collapsing back at the beginning of the year, so I'm somewhat  
familiar.


I'll try and get a look this weekend.  It's the earliest I'm likely to
get spare cycles.  I'll post any results.


Tracy

On Dec 9, 2008, at 4:18 PM, Stephen Weiss wrote:


Hi,

I'm trying to use field collapsing with our SOLR but I just can't  
seem to get it to do anything.


I've downloaded a dist copy of solr 1.3 and applied Ivan de  
Prado's patch - reading through the source code, the patch  
definitely was applied successfully (all the changes are in the  
right places, I've checked every single one).


I've run ant clean, ant compile, and ant dist to produce the war  
file in the dist/ folder, and then put the war file in place and  
restarted jetty.  According to the logs, jetty is definitely  
loading the right war file.  If I expand the war file and grep  
through the files, it would appear the collapsing code is there.


However, when I add any sort of collapse parameters (I've tried any
combination of collapse=true, collapse.field=link_id,
collapse.threshold=1, collapse.type=normal, collapse.info.doc=true),
the result set is no different from a normal query, and there is no
collapse data returned in the XML.


I'm not a java developer, this is my first time using ant period,  
and I'm just following basic directions I found on google.



Here is the output of the compilation process:



I really need this patch to work for a project...  Can someone  
please tell me what I'm missing to get this to work?  I can't  
really find any documentation beyond adding the collapse options  
to the query string, so it's hard to tell - is there an option in  
solrconfig.xml or in the core configuration that needs to be set?   
Am I going about this entirely the wrong way?


Thanks for any advice, I 

Re: snappuller issue with multicore

2008-12-10 Thread Doug Steigerwald
Try using the -d option with snappuller so you can specify the path to
the directory holding index data on the local machine.
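
Something like this, based on the commands below (a guess - I'm
assuming the master's per-core data dirs mirror the slave's):

  system("solr/multicore/hotel/bin/snappuller -P 18983 -M masterServer -D /solr/data/hotel -d /solr/data/hotel")
  system("solr/multicore/location/bin/snappuller -P 18983 -M masterServer -D /solr/data/location -d /solr/data/location")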


Doug

On Dec 10, 2008, at 10:20 AM, Kashyap, Raghu wrote:


Bill,

  Yes, I do have a scripts.conf for each core.  However, all the
options needed for snappuller are specified on the command line itself
(-D, -S, etc...)

-Raghu

-Original Message-
From: Bill Au [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 10, 2008 9:17 AM
To: solr-user@lucene.apache.org
Subject: Re: snappuller issue with multicore

I noticed that you are using the same rsyncd port for both cores.  Do
you have a scripts.conf for each core?

Bill

On Tue, Dec 9, 2008 at 11:40 PM, Kashyap, Raghu
[EMAIL PROTECTED]wrote:


Hi,



We are seeing a strange behavior with snappuller



We have 2 cores: Hotel & Location



Here are the steps we perform



1.  index hotel on master server
2.  index location on master server
3.  execute snapshooter for hotel core on master server
4.  execute snapshooter for location core on master server
5.  execute snappuller from slave machines (once for hotel core & once
for location core)



However, the hotel core snapshot is pulled into the location data  
dir.




Here are the commands that we execute in our ruby scripts



system('solr/multicore/hotel/bin/snappuller -P 18983 -S /solr/data -M masterServer -D /solr/data/hotel')

system('solr/multicore/location/bin/snappuller -P 18983 -M masterServer -S /solr/data -D /solr/data/location')



Thanks,

Raghu






Re: IndexOutOfBoundsException

2008-08-15 Thread Doug Steigerwald
We actually have this same exact issue on 5 of our cores.  We're just
going to wipe the index and reindex soon, but it isn't actually
causing any problems for us.  We can update the index just fine;
there's just no merging going on.


Ours happened when I reloaded all of our cores for a schema change.  I  
don't do that any more ;).


Doug

On Aug 14, 2008, at 11:08 PM, Yonik Seeley wrote:


Since this looks like more of a lucene issue, I've replied in
[EMAIL PROTECTED]

-Yonik

On Thu, Aug 14, 2008 at 10:18 PM, Ian Connor [EMAIL PROTECTED]  
wrote:

I seem to be able to reproduce this very easily and the data is
medline (so I am sure I can share it if needed with a quick email to
check).

- I am using Fedora:
% uname -a
Linux ghetto5.projectlounge.com 2.6.23.1-42.fc8 #1 SMP Tue Oct 30 13:18:33 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
% java -version
java version "1.7.0"
IcedTea Runtime Environment (build 1.7.0-b21)
IcedTea 64-Bit Server VM (build 1.7.0-b21, mixed mode)
- single core (will use shards, but each machine just has one HDD so I
didn't see how cores would help - but I am new at this)
- next run I will keep the output to check for earlier errors
- very, and I can share code + data if that will help

On Thu, Aug 14, 2008 at 4:23 PM, Yonik Seeley [EMAIL PROTECTED]  
wrote:

Yikes... not good.  This shouldn't be due to anything you did wrong,
Ian... it looks like a Lucene bug.

Some questions:
- what platform are you running on, and what JVM?
- are you using multicore? (I fixed some index locking bugs recently)

- are there any exceptions in the log before this?
- how reproducible is this?

-Yonik

On Thu, Aug 14, 2008 at 2:47 PM, Ian Connor [EMAIL PROTECTED]  
wrote:

Hi,

I have rebuilt my index a few times (it should get up to about 4
million, but around 1 million it starts to fall apart).

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:323)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:300)
Caused by: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
    at java.util.ArrayList.rangeCheck(ArrayList.java:572)
    at java.util.ArrayList.get(ArrayList.java:350)
    at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:188)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:349)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3998)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3650)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:214)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:269)



When this happens, the disk usage goes right up and the indexing
really starts to slow down.  I am using a Solr build from about a week
ago - so my Lucene is at 2.4 according to the war files.

Has anyone seen this error before?  Is it possible to tell which Array
is too large?  Would it be an Array I am sending in or another
internal one?

Regards,
Ian Connor







--
Regards,

Ian Connor





Re: spellcheck collation

2008-08-14 Thread Doug Steigerwald

I'd try, but the build is failing from (guessing) Ryan's last commit:

compile:
    [mkdir] Created dir: /Users/dsteiger/Desktop/java/solr/build/core
    [javac] Compiling 337 source files to /Users/dsteiger/Desktop/java/solr/build/core
    [javac] /Users/dsteiger/Desktop/java/solr/client/java/solrj/src/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java:129: cannot find symbol
    [javac] symbol  : method isEnabled()
    [javac] location: class org.apache.solr.core.CoreContainer
    [javac]   multicore.isEnabled() ) {

Doug

On Aug 14, 2008, at 2:24 PM, Grant Ingersoll wrote:

I believe I just fixed this on SOLR-606 (thanks to Stefan's patch).
Give it a try and let us know.


-Grant







Re: spellcheck collation

2008-08-14 Thread Doug Steigerwald
Right before I sent the message.  Did a 'svn up src/; ant clean; ant
dist' and it failed.  Seems to work fine now.


On Aug 14, 2008, at 2:38 PM, Ryan McKinley wrote:


have you updated recently?

isEnabled() was removed last night...










WordGramFilterFactory

2008-08-13 Thread Doug Steigerwald
Just checked out Solr trunk from SVN and ran 'ant dist && ant
example'.  Running the example throws errors because there is no
WordGramFilterFactory class.


We don't need it here, but is that something waiting to be committed?

Doug

--Snippet from schema--

<fieldType name="grams" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LengthFilterFactory" min="3" max="15" />
    <filter class="solr.WordGramFilterFactory" minLength="1" maxLength="3" sep=" " />
  </analyzer>
</fieldType>


multicore /solr/update

2008-08-13 Thread Doug Steigerwald
I've got two cores (core{0|1}) both using the provided example schema  
(example/solr/conf/schema.xml).


Posting to http://localhost:8983/solr/update added the example docs to  
the last core loaded (core1).  Shouldn't this give you a 400?
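
For example (curl is just for illustration; the doc is a throwaway):

  curl 'http://localhost:8983/solr/update' -H 'Content-type: text/xml' \
       --data-binary '<add><doc><field name="id">1</field></doc></add>'

With no core name in the path, that add still lands in core1 instead
of being rejected.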


Doug


Re: multicore /solr/update

2008-08-13 Thread Doug Steigerwald
Yeah, that's the problem.  Not having the core in the URL you're  
posting to shouldn't update any core, but it does.


Doug

On Aug 13, 2008, at 2:10 PM, Alok K. Dhir wrote:


you need to add the core to your call -- post to 
http://localhost:8983/solr/coreX/update




---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
[EMAIL PROTECTED]






spellcheck collation

2008-08-13 Thread Doug Steigerwald
I've noticed a few things with the new spellcheck component that seem  
a little strange.


Here's my document:

<doc>
  <field name="id">5</field>
  <field name="spell">wii blackberry blackjack creative labs zen ipod video nano</field>
</doc>

Some sample queries:

http://localhost:8983/solr/core1/spellCheckCompRH?q=blackberri+wi&spellcheck=true&spellcheck.collate=true

http://localhost:8983/solr/core1/spellCheckCompRH?q=blackberr+wi&spellcheck=true&spellcheck.collate=true

http://localhost:8983/solr/core1/spellCheckCompRH?q=blackber+wi&spellcheck=true&spellcheck.collate=true

When spellchecking 'blackberri wi', the collation returned is
'blackberry wii'.  When spellchecking 'blackberr wi', the collation
returned is 'blackberrywii'.  'blackber wi' returns 'blackberrwiiwi'.


Doug


more multicore fun

2008-08-13 Thread Doug Steigerwald
OK.  Last question for a while (hopefully), but something else with  
multicore seems to be wrong.


<solr persistent="true">
  <cores adminPath="/admin/multicore">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>

$ java -jar start.jar
...
INFO: [core0] Opening new SolrCore at solr/core0/, dataDir=./solr/data/
...
INFO: [core1] Opening new SolrCore at solr/core1/, dataDir=./solr/data/
...

The instanceDir seems to be fine, but the dataDir isn't being set  
correctly.  The dataDir is actually example/solr/data instead of  
example/solr/core{0|1}/data.
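
One likely culprit (my assumption - I haven't confirmed it yet): the
example solrconfig.xml pins the data directory, which overrides the
per-core default of <instanceDir>/data for every core that shares the
config:

  <dataDir>${solr.data.dir:./solr/data}</dataDir>

Removing that element, or giving each core its own value, should let
each core fall back to its own data directory.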


http://localhost:8983/solr/admin/multicore shows the exact same path  
to the index for both cores.  Am I missing something that the example  
multicore config doesn't use?


Thanks.
Doug


Re: Solr stops responding

2008-07-17 Thread Doug Steigerwald
not. Not-predictable. I minimized all caches; it still happens even
with 8192M.  CPU usage is 375%-400% (two double-core Opterons), SUN
Java 5.  Moved to BEA JRockit 5 yesterday, looks 30 times faster (25%
CPU load with 4096M RAM); no problem yet, let's see...

Strange: Tomcat simply hangs instead of exit(...)

There are some posts related to OutOfMemoryError in the solr-user list.

==
http://www.linkedin.com/in/liferay





Re: Solr stops responding

2008-07-17 Thread Doug Steigerwald
jdk1.6.0_02 (AMD Opteron, 64bit, SLES 10 SP1, Tomcat 5.5.26).  100k
queries a day...


Jul 17, 2008 11:08:07 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: allocLargeObjectOrArray - Object size: 3149016, Num elements: 393625
    at org.apache.solr.util.OpenBitSet.<init>(OpenBitSet.java:86)
    at org.apache.solr.search.DocSetHitCollector.collect(DocSetHitCollector.java:63)
    at org.apache.solr.search.SolrIndexSearcher$9.collect(SolrIndexSearcher.java:1072)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:320)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
    at org.apache.lucene.search.Searcher.search(Searcher.java:118)
    at org.apache.lucene.search.Searcher.search(Searcher.java:97)
    at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1069)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:804)
    at org.apache.solr.search.SolrIndexSearcher.getDocListAndSet(SolrIndexSearcher.java:1245)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:96)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:902)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:834)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:640)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1286)
    at java.lang.Thread.run(Thread.java:619)


P.P.S.
I'll send thread dump in separate Email



Quoting Doug Steigerwald [EMAIL PROTECTED]:


It happened again last night.  I cronned a script that ran jstack on
the process every 5 minutes just to see what was going on.  Here's a
snippet:

"btpool0-2668" prio=10 tid=0x2aac3a905800 nid=0x76ed waiting for monitor entry [0x5e584000..0x5e585a10]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.search.LRUCache.get(LRUCache.java:129)
    - waiting to lock <0x2aaabcdd9450> (a org.apache.solr.search.LRUCache$1)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:730)
    at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:693)
    at org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:137)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:97)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:148)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:117)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:942)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:280)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:237)

During this log, there were 547 threads active (going by occurrences
of Thread.State in the log).

Here's some more:

"btpool0-2051" prio=10 tid=0x2aac39144c00 nid=0x4012 waiting for monitor entry [0x45bfc000..0x45bfdd90]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.Vector.size(Unknown Source)
    - waiting to lock <0x2aaac0af0ea0> (a java.util.Vector)
    at java.util.AbstractList.listIterator(Unknown Source)
    at java.util.AbstractList.listIterator(Unknown Source)
    at java.util.AbstractList.equals(Unknown Source)
    at java.util.Vector.equals(Unknown Source)
    - locked <0x2aaac0ae8d30> (a java.util.Vector

Solr stops responding

2008-07-15 Thread Doug Steigerwald
Since we pushed Solr out to production a few weeks ago, we've seen a  
few issues with Solr not responding to requests (searches or admin  
pages).  There doesn't seem to be any reason for it from what we can  
tell.  We haven't seen it in QA or development.


We're running Solr with basically the example Solr setup with Jetty  
(6.1.3).  We package our Solr install by using 'ant example' and  
replacing configs/etc.  Whenever Solr stops responding, there are no  
messages in the logs, nothing.  Requests just time out.


We have also only seen this on our slaves.  The master doesn't seem to  
be hitting this issue.  All the boxes are the same, version of java is  
the same, etc.


We don't have a stack trace and no JMX set up.  Once we see this  
issue, our support folks just stop and start Solr on that machine.


Has anyone else run into anything like this with Solr?

Thanks.
Doug


Re: Solr stops responding

2008-07-15 Thread Doug Steigerwald
We haven't seen an OutOfMemoryError.  The load on the server doesn't  
go up either (hovers around 1-2).  We're on Java 1.6.0_03-b05.   
4x3.8GHz Xeons, 8GB RAM.


Doug

On Jul 15, 2008, at 11:29 AM, Fuad Efendi wrote:

I constantly have the same problem; sometimes I have OutOfMemoryError
in the logs, sometimes not.  Not predictable.  I minimized all caches;
it still happens even with 8192M.  CPU usage is 375%-400% (two
double-core Opterons), SUN Java 5.  Moved to BEA JRockit 5 yesterday,
looks 30 times faster (25% CPU load with 4096M RAM); no problem yet,
let's see...


Strange: Tomcat simply hangs instead of exit(...)

There are some posts related to OutOfMemoryError in solr-user list.


==
http://www.linkedin.com/in/liferay








MergeException

2008-07-02 Thread Doug Steigerwald
What exactly does this error mean, and how can we fix it?  As far as I
can tell, all of our 30+ cores seem to be updating and autocommitting
fine.  By fine I mean our autocommit hook is firing for all cores,
which leads me to believe that the commit is happening, but segments
can't be merged.  Are we going to have to rebuild whatever core this
happens to be (if I can figure it out)?


Exception in thread "Thread-704" org.apache.lucene.index.MergePolicy$MergeException: java.lang.IndexOutOfBoundsException: Index: 43, Size: 43
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: java.lang.IndexOutOfBoundsException: Index: 43, Size: 43
    at java.util.ArrayList.RangeCheck(Unknown Source)
    at java.util.ArrayList.get(Unknown Source)
    at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:154)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:319)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:133)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)

Thanks.
Doug


Re: MergeException

2008-07-02 Thread Doug Steigerwald

We're using Lucene 2.3.0.  I'll try upgrading to 2.3.2 at some point.

All of our cores are updating fine, so not a huge rush.

Thanks.
Doug

On Jul 2, 2008, at 9:42 AM, Yonik Seeley wrote:


Doug, it looks like it might be this Lucene bug:
https://issues.apache.org/jira/browse/LUCENE-1262

What version of Lucene is in the Solr you are running?  You might want
to try either one of the latest Solr nightly builds, or at least
upgrading your Lucene version in Solr if it's not the latest patch
release.

-Yonik






High load when updating many cores

2008-07-02 Thread Doug Steigerwald
We're experiencing some high load on our Solr master server.  It  
currently has 30 cores and processes over 3 million updates per day.   
During most of the day the load on the master is low (0.5 to 2), but  
sometimes we get spikes in excess of 12 for hours at a time.


The only reason I can figure this is happening is that we're updating
almost all of our cores during those times.  Usually during the day
our sites update pretty randomly, but it seems like many of them send
updates at the same time.


Over a 3 hour period where the load was ~12, we had only 156k updates.
Usually that's a pretty light load when updating a single core through
just a few producers.  It seems as though we're just getting updates
from nearly all of our 30 cores at once, and something in the
background is slowing down.


Here's some stats about our setup.

4x3.2GHz Xeon.  8GB RAM.  RHEL 5.1.  4GB max heap size for Solr.  Our  
build is a trunk build from January (using Lucene 2.3.0).  Java  
1.6.0_03-b05 (64bit).


Using Jetty, started as: 'java -server -Xms1024m -Xmx4096m -jar start.jar'


We never query the master, but we do have caching enabled (same  
configs on master and slave).  autowarmCount is set to 0 for each core  
(they all use the same configs).  We autocommit every 5 seconds.


Any ideas what might cause the load to spike?  Could it be our caching  
even though we have autowarmCount set to 0?  Could it be that Solr is  
trying to merge a lot of indexes at once?


Maybe some garbage collection stuff?
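
If it is GC, one way to check - just a thought; the flags below are
for Sun Java 6 - would be to start Jetty with GC logging and see
whether the load spikes line up with long collections:

  java -server -Xms1024m -Xmx4096m -verbose:gc -XX:+PrintGCDetails \
       -XX:+PrintGCTimeStamps -Xloggc:gc.log -jar start.jar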

Thanks.
Doug


java.io.FileNotFoundException?

2008-04-02 Thread Doug Steigerwald

We just started hitting a FileNotFoundException for no real apparent
reason for both our regular index and our spellchecker index, only a
few minutes after we restarted Solr.  I did some searching and didn't
find much that helped.


We started to do some load testing, and after about 10 minutes we started 
getting these errors.

We hit the spellchecker every request through a SpellcheckComponent that we created (ie, code ripped 
out of SpellCheckRequestHandler for now).  It runs essentially the same code as the spellcheck 
request handler when we specify a parameter (spellcheck=true).


We have 34 cores.  All but two cores are fully optimized (haven't been updated in 2 months).  Only 
two cores are actively updated.  We started Solr around 11:45am, not much happened until 12:27 when 
we started load testing (just a few queries, maybe 100 updates).


find /home/dsteiger/local/solr/cores/*/data/index | wc -l  =  414
find /home/dsteiger/local/solr/cores/*/data/spell | wc -l  =  6 (only
the two 'active' cores use the spell checker).  So, not many files are
open.


Anyone have any idea what might cause the two below errors to happen?  When I restarted Solr around 
11:45am it was to test a new patch that set the mergeFactor in the lucene spellchecker to 2 instead 
of 300 because we kept running into 'too many files open' errors when rebuilding more than one spell 
index at a time.  The spell indexes were rebuilt manually using the mergeFactor of 300, solr 
restarted, and any subsequent rebuild of the spell index would use a mergeFactor of 2.


After we hit this error, I rebuilt the spell indexes with the new code replicated them to the slave, 
restarted Solr, and all has been well.  We ran the load testing for more than an hour and the issue 
hasn't returned.


Could the old spell indexes that were created using the high mergeFactor cause an issue like this 
somehow?  Could the opening and closing of searchers so fast cause this?  I don't have the slightest 
idea.  All of our search queries hit the slave, and the master just handles updates.  The master had 
no issues through all of this.


Caused by: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qaa/data/spell: list() returned null
    at org.apache.lucene.index.SegmentInfos.getCurrentSegmentGeneration(SegmentInfos.java:115)
    at org.apache.lucene.index.IndexReader.indexExists(IndexReader.java:506)
    at org.apache.lucene.search.spell.SpellChecker.setSpellIndex(SpellChecker.java:102)
    at org.apache.lucene.search.spell.SpellChecker.<init>(SpellChecker.java:89)


And this happened I believe when running the snapinstaller (done through 
cron)...

Caused by: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qab/data/index: files: null
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:587)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:706)

We're running r614955.

Thanks.
Doug



Re: java.io.FileNotFoundException?

2008-04-02 Thread Doug Steigerwald
The user that runs our apps is configured to allow 65536 open files in
limits.conf.  We shouldn't even come close to that number.  Solr is
the only app we have running on these machines as our app user.
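
For what it's worth, a quick way to sanity-check that during a load
test (assuming Linux with lsof installed; the pgrep pattern is a guess
at how Solr was started):

  ulimit -n                              # effective limit for this user
  lsof -p $(pgrep -f start.jar) | wc -l  # files the Solr JVM has open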


We hit the same type of issue when we had our mergeFactor set to 40 for all of our indexes.  We 
lowered it to 5 and have been fine since.


No errors in the snappuller for either core.  The spellcheck index is rebuilt once a night around 
midnight and copied to the slave afterwards.  I had even rebuilt the spell index manually for the 
two cores, pulled them, installed them, and tested to make sure it was working with a few queries 
before the load testing started (this was before we released the patch to lower the spell index 
mergeFactor).


We were even getting errors trying to run our postCommit script on the
slave (it doesn't end up doing anything since it's the slave).


SEVERE: java.io.IOException: Cannot run program "./solr/bin/snapctl": java.io.IOException: error=24, Too many open files
    at java.lang.ProcessBuilder.start(Unknown Source)
    at java.lang.Runtime.exec(Unknown Source)

And a correction from my previous email.  The errors started 10 -seconds- after load testing 
started.  This was about 40 minutes after Solr started, and less than 30 queries had been run on the 
server before load testing started.


Load testing has been fine since I restarted Solr and rebuilt the spellcheck indexes with the 
lowered mergeFactor.


Doug

Otis Gospodnetic wrote:

Hi Doug,

Sounds fishy, especially increasing/decreasing mergeFactor to funny values 
(try changing your OS setting instead).

My guess is this is happening only with the 2 indices that are being modified 
and I'll guess that the FNFE is due to a bad/incomplete rsync from the master.  
Do snappuller logs mention any errors?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


logging in 24hour time

2008-03-27 Thread Doug Steigerwald
Is there any way to get the logs to stderr/stdout to be in 24-hour time?

Thanks.

Doug


Re: field collapsing

2008-03-14 Thread Doug Steigerwald
The latest one won't apply to the trunk because it's too old.  It
hasn't been updated to match changes made to Solr since mid-February.
One of the things I know has to change is that in CollapseComponent's
prepare/process, the parameters need to change to just accept a
ResponseBuilder.

Other than that, I'm not sure what will have to be changed.  I'm not
planning on updating our Solr build until 1.3 is released.


Doug

muddassir hasan wrote:

Hi,

I have unsuccessfully tried to apply the solr field collapsing patches
available at

https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

One of the patches could be applied to the trunk, but it could not be
compiled.

Please let me know which of the available field collapsing patches can
be applied to solr trunk or release 1.2.0.

Thanks.

M.Hasan

   


Admin ping

2008-03-07 Thread Doug Steigerwald

Came in this morning to find some alerts that the admin interface had
basically died.  Everything was fine until about 4am.  No updates or
queries were going on at that time (this is a QA machine).  Anyone
know why it might die like this?


Solr 1.3 trunk build from Jan 23rd, 4GB heap size, 4x3.2GHz Xeon, 8GB RAM 
total, RHEL 5.1, 64bit.

Mar 7, 2008 5:42:46 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.jasper.JasperException: PWC6117: File "/admin/ping.jsp" not found
    at org.apache.jasper.compiler.DefaultErrorHandler.jspError(DefaultErrorHandler.java:60)
    at org.apache.jasper.compiler.ErrorDispatcher.dispatch(ErrorDispatcher.java:346)
    at org.apache.jasper.compiler.ErrorDispatcher.jspError(ErrorDispatcher.java:140)
    at org.apache.jasper.compiler.JspUtil.getInputStream(JspUtil.java:881)
    at org.apache.jasper.xmlparser.XMLEncodingDetector.getEncoding(XMLEncodingDetector.java:114)
    at org.apache.jasper.compiler.ParserController.determineSyntaxAndEncoding(ParserController.java:347)
    at org.apache.jasper.compiler.ParserController.doParse(ParserController.java:181)
    at org.apache.jasper.compiler.ParserController.parse(ParserController.java:111)
    at org.apache.jasper.compiler.Compiler.generateJava(Compiler.java:169)
    at org.apache.jasper.compiler.Compiler.compile(Compiler.java:387)
    at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:579)
    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:344)
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:464)
    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:358)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)

This happened a few weeks ago, but someone just restarted Solr to get the admin interface back. 
They said that updates and queries were still working fine.


Thanks.
Doug


JSONRequestWriter

2008-03-05 Thread Doug Steigerwald
We're using localsolr and the RubyResponseWriter.  When we do a
request with the localsolr component in our requestHandler, we're
seeing issues with the display of a multivalued field when it only has
one value.


'class'=>['showtime']'showtime',   <--
'genre'=>['Drama',
 'Suspsense/Triller'],

With no localsolr component it works fine.

Looks like the issue is with the JSONRequestWriter.writeSolrDocument().  Here's the small patch for 
it that seems to fix it.


Index: src/java/org/apache/solr/request/JSONResponseWriter.java
===================================================================
--- src/java/org/apache/solr/request/JSONResponseWriter.java  (revision 614955)
+++ src/java/org/apache/solr/request/JSONResponseWriter.java  (working copy)
@@ -416,7 +416,7 @@
       writeVal(fname, val);
       writeArrayCloser();
     }
-    writeVal(fname, val);
+    else writeVal(fname, val);
   }


We're running solr trunk r614955 (Jan 23rd), and r75 of localsolr.

Result snippet with the patch:

'class'=>['showtime'],
'genre'=>['Drama',
 'Suspsense/Triller'],

Has anyone come across an issue like this?  Is this fixed in a newer build of Solr?   It looks like 
we'd still need this patch even in a build of the solr trunk from yesterday, but maybe not.


--
Doug Steigerwald
Software Developer
McClatchy Interactive
[EMAIL PROTECTED]
919.861.1287
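
To make the failure mode concrete, here is a small self-contained illustration 
of the control-flow shape the one-line patch fixes.  This is not Solr code -- 
the writer methods are stand-ins -- just the same if-without-else pattern:

import java.util.Arrays;
import java.util.Collection;

// Without the 'else', a multivalued value is emitted once inside the
// array branch and then appended again bare, doubling the output.
public class MissingElseDemo {
    static String write(Object val, boolean patched) {
        StringBuilder sb = new StringBuilder();
        if (val instanceof Collection) {
            sb.append('[');
            for (Object v : (Collection<?>) val) sb.append(v);
            sb.append(']');
            if (patched) return sb.toString();  // 'else' semantics: stop here
        }
        sb.append(val);  // unpatched: runs even after the array was written
        return sb.toString();
    }

    public static void main(String[] args) {
        Collection<String> single = Arrays.asList("showtime");
        System.out.println(write(single, false)); // [showtime][showtime] -- doubled
        System.out.println(write(single, true));  // [showtime]
    }
}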


Re: JSONRequestWriter

2008-03-05 Thread Doug Steigerwald

Sweet.  Thanks.

Doug

Yonik Seeley wrote:

Thanks Doug, I just checked in your fix.
This was a recent bug... writing of SolrDocument was recently added
and is not touched by normal code paths, except for distributed
search.

-Yonik

On Wed, Mar 5, 2008 at 9:29 AM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

We're using localsolr and the RubyResponseWriter.  When we do a request with 
the localsolr component
 in our requestHandler we're seeing issues with the display of a multivalued 
field when it only has
 one value.

 'class'=>['showtime']'showtime',  <--
 'genre'=>['Drama',
  'Suspense/Thriller'],

 With no localsolr component it works fine.

 Looks like the issue is with the JSONRequestWriter.writeSolrDocument().  
Here's the small patch for
 it that seems to fix it.

 Index: src/java/org/apache/solr/request/JSONResponseWriter.java
 ===
 --- src/java/org/apache/solr/request/JSONResponseWriter.java  (revision 614955)
 +++ src/java/org/apache/solr/request/JSONResponseWriter.java  (working copy)
 @@ -416,7 +416,7 @@
    writeVal(fname, val);
    writeArrayCloser();
  }
 -writeVal(fname, val);
 +else writeVal(fname, val);
    }

    if (pseudoFields != null && pseudoFields.size() > 0) {


 We're running solr trunk r614955 (Jan 23rd), and r75 of localsolr.

 Result snippet with the patch:

 'class'=>['showtime'],
 'genre'=>['Drama',
  'Suspense/Thriller'],

 Has anyone come across an issue like this?  Is this fixed in a newer build of 
Solr?   It looks like
 we'd still need this patch even in a build of the solr trunk from yesterday, 
but maybe not.

 --
 Doug Steigerwald
 Software Developer
 McClatchy Interactive
 [EMAIL PROTECTED]
 919.861.1287



Re: JSONRequestWriter

2008-03-05 Thread Doug Steigerwald

Note that we now have to add a default param to the requestHandler:

<requestHandler name="/search" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="json.nl">map</str>
  </lst>
  <arr name="components">
    <str>collapse</str>
    <str>localsolr</str>
    <str>facet</str>
  </arr>
</requestHandler>

If you don't add json.nl=map to your params, then you can't eval() what you get back in Ruby 
(can't convert String into Integer).  Not sure if this can be put into the RubyResponseWriter as a 
default.  Also not sure if this is an issue with the python writer (since I don't use python).


Doug

Yonik Seeley wrote:

Thanks Doug, I just checked in your fix.
This was a recent bug... writing of SolrDocument was recently added
and is not touched by normal code paths, except for distributed
search.

-Yonik

On Wed, Mar 5, 2008 at 9:29 AM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

We're using localsolr and the RubyResponseWriter.  When we do a request with 
the localsolr component
 in our requestHandler we're seeing issues with the display of a multivalued 
field when it only has
 one value.

 'class'=>['showtime']'showtime',  <--
 'genre'=>['Drama',
  'Suspense/Thriller'],

 With no localsolr component it works fine.

 Looks like the issue is with the JSONRequestWriter.writeSolrDocument().  
Here's the small patch for
 it that seems to fix it.

 Index: src/java/org/apache/solr/request/JSONResponseWriter.java
 ===
 --- src/java/org/apache/solr/request/JSONResponseWriter.java  (revision 614955)
 +++ src/java/org/apache/solr/request/JSONResponseWriter.java  (working copy)
 @@ -416,7 +416,7 @@
    writeVal(fname, val);
    writeArrayCloser();
  }
 -writeVal(fname, val);
 +else writeVal(fname, val);
    }

    if (pseudoFields != null && pseudoFields.size() > 0) {


 We're running solr trunk r614955 (Jan 23rd), and r75 of localsolr.

 Result snippet with the patch:

 'class'=>['showtime'],
 'genre'=>['Drama',
  'Suspense/Thriller'],

 Has anyone come across an issue like this?  Is this fixed in a newer build of 
Solr?   It looks like
 we'd still need this patch even in a build of the solr trunk from yesterday, 
but maybe not.

 --
 Doug Steigerwald
 Software Developer
 McClatchy Interactive
 [EMAIL PROTECTED]
 919.861.1287



Re: JSONRequestWriter

2008-03-05 Thread Doug Steigerwald

Sure.

The default (json.nl=flat):

'response',{'numFound'=>41,'start'=>0,

Adding json.nl=map makes output correct:

'response'=>{'numFound'=>41,'start'=>0,

This also changes facet output (which was evaluating fine):

FLAT:

 'facet_counts',{
  'facet_queries'=>{},
  'facet_fields'=>{
'movies_movie_genre_facet'=>[
 'Drama',22,
 'Action/Adventure',11,
 'Comedy',11,
 'Suspense/Thriller',11,
 'SciFi/Fantasy',5,
 'Animation',4,
 'Documentary',4,
 'Family',3,
 'Horror',3,
 'Musical',2,
 'Romance',2,
 'Concert',1,
 'War',1]},
  'facet_dates'=>{}}

MAP:

 'facet_counts'=>{
  'facet_queries'=>{},
  'facet_fields'=>{
'movies_movie_genre_facet'=>{
 'Drama'=>22,
 'Action/Adventure'=>11,
 'Comedy'=>11,
 'Suspense/Thriller'=>11,
 'SciFi/Fantasy'=>5,
 'Animation'=>4,
 'Documentary'=>4,
 'Family'=>3,
 'Horror'=>3,
 'Musical'=>2,
 'Romance'=>2,
 'Concert'=>1,
 'War'=>1}},
  'facet_dates'=>{}}

Doug

Yonik Seeley wrote:

On Wed, Mar 5, 2008 at 11:25 AM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

 If you don't add the json.nl=map to your params, then you can't eval() what 
you get back in Ruby
 (can't convert String into Integer).


Can you show what the problematic ruby output is?

json.nl=map isn't the default because some things need to be ordered,
and eval of a map in python & ruby loses that order.

-Yonik


Re: JSONRequestWriter

2008-03-05 Thread Doug Steigerwald

Looks like it's only happening when we use the LocalSolrQueryComponent from 
localsolr.

rsp.add("response", sdoclist);

sdoclist is a SolrDocumentList.  Could that be causing an issue instead of it 
being just a DocList?

Doug

Yonik Seeley wrote:

The output you showed is indeed incorrect, but I can't reproduce that
with stock solr.
Here is a example of what I get:

{
 'responseHeader'=>{
  'status'=>0,
  'QTime'=>16,
  'params'=>{
'wt'=>'ruby',
'indent'=>'true',
'q'=>'*:*',
'facet'=>'true',
'highlight'=>'true'}},
 'response'=>{'numFound'=>0,'start'=>0,'docs'=>[]
 },
 'facet_counts'=>{
  'facet_queries'=>{},
  'facet_fields'=>{},
  'facet_dates'=>{}}}


-Yonik

On Wed, Mar 5, 2008 at 12:00 PM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

Sure.

 The default (json.nl=flat):

 'response',{'numFound'=>41,'start'=>0,

 Adding json.nl=map makes output correct:

 'response'=>{'numFound'=>41,'start'=>0,

 This also changes facet output (which was evaluating fine):

 FLAT:

  'facet_counts',{
   'facet_queries'=>{},
   'facet_fields'=>{
'movies_movie_genre_facet'=>[
 'Drama',22,
 'Action/Adventure',11,
 'Comedy',11,
 'Suspense/Thriller',11,
 'SciFi/Fantasy',5,
 'Animation',4,
 'Documentary',4,
 'Family',3,
 'Horror',3,
 'Musical',2,
 'Romance',2,
 'Concert',1,
 'War',1]},
   'facet_dates'=>{}}

 MAP:

  'facet_counts'=>{
   'facet_queries'=>{},
   'facet_fields'=>{
'movies_movie_genre_facet'=>{
 'Drama'=>22,
 'Action/Adventure'=>11,
 'Comedy'=>11,
 'Suspense/Thriller'=>11,
 'SciFi/Fantasy'=>5,
 'Animation'=>4,
 'Documentary'=>4,
 'Family'=>3,
 'Horror'=>3,
 'Musical'=>2,
 'Romance'=>2,
 'Concert'=>1,
 'War'=>1}},
   'facet_dates'=>{}}

 Doug



 Yonik Seeley wrote:
  On Wed, Mar 5, 2008 at 11:25 AM, Doug Steigerwald
  [EMAIL PROTECTED] wrote:
   If you don't add the json.nl=map to your params, then you can't eval() 
what you get back in Ruby
   (can't convert String into Integer).
 
  Can you show what the problematic ruby output is?
 
  json.nl=map isn't the default because some things need to be ordered,
  and eval of a map in python & ruby loses that order.
 
  -Yonik



Re: Integrated Spellchecking

2008-02-20 Thread Doug Steigerwald

Sure.  I'll try to post it today or tomorrow.

Doug Steigerwald
Software Developer
McClatchy Interactive
[EMAIL PROTECTED]
919.861.1287

Otis Gospodnetic wrote:

Hey Doug,

You have multicore/spellcheck replication going already?  We have been working 
on replication for multicore.  Sounds like we are replicating each 
other's work.  When will you be able to attach your stuff to the JIRA issue? 
https://issues.apache.org/jira/browse/SOLR-433
 
Thanks,

Otis

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message 

From: Doug Steigerwald [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, February 15, 2008 12:45:08 PM
Subject: Re: Integrated Spellchecking

That unfortunately got pushed aside to work on some of our higher priority solr 
work since we 
already had it working one way.


Hoping to revisit this after we push to production and start working on new 
features and share what 
I've done for this and multicore/spellcheck replication (which we have working 
quite well in QA 
right now).


Doug Steigerwald
Software Developer
McClatchy Interactive
[EMAIL PROTECTED]
919.861.1287


oleg_gnatovskiy wrote:


dsteiger wrote:

I've got a couple search components for automatic spell correction that
I've been working on.

I've converted most of the SpellCheckerRequestHandler to a search
component (hopefully will throw a 
patch out soon for this).  Then another search component that will do auto
correction for a query if 
the search returns zero results.


We're hoping to see some performance improvements out of handling this in
Solr instead of our Rails 
service.


doug


Ryan McKinley wrote:

Yes -- this is what search components are for!

Depending on where you put it in the chain, it could only return spell 
checked results if there are too few results (or the top score is below 
some threshold)


ryan


Grant Ingersoll wrote:
Is it feasible to submit a query to any of the various handlers and 
have it bring back results and spelling suggestions all in one 
response?  Is this something the query components piece would handle, 
assuming one exists for the spell checker?


Thanks,
Grant



So have you succeeded in implementing this patch? I'd definitely like to use
this functionality as a search suggestion.




Re: Integrated Spellchecking

2008-02-15 Thread Doug Steigerwald
That unfortunately got pushed aside to work on some of our higher priority solr work since we 
already had it working one way.


Hoping to revisit this after we push to production and start working on new features and share what 
I've done for this and multicore/spellcheck replication (which we have working quite well in QA 
right now).


Doug Steigerwald
Software Developer
McClatchy Interactive
[EMAIL PROTECTED]
919.861.1287


oleg_gnatovskiy wrote:



dsteiger wrote:

I've got a couple search components for automatic spell correction that
I've been working on.

I've converted most of the SpellCheckerRequestHandler to a search
component (hopefully will throw a 
patch out soon for this).  Then another search component that will do auto
correction for a query if 
the search returns zero results.


We're hoping to see some performance improvements out of handling this in
Solr instead of our Rails 
service.


doug


Ryan McKinley wrote:

Yes -- this is what search components are for!

Depending on where you put it in the chain, it could only return spell 
checked results if there are too few results (or the top score is below 
some threshold)


ryan


Grant Ingersoll wrote:
Is it feasible to submit a query to any of the various handlers and 
have it bring back results and spelling suggestions all in one 
response?  Is this something the query components piece would handle, 
assuming one exists for the spell checker?


Thanks,
Grant






So have you succeeded in implementing this patch? I'd definitely like to use
this functionality as a search suggestion.

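A rough sketch of the zero-results auto-correction idea from this thread is 
below, written against the trunk-era SearchComponent API.  The spellChecker 
field, how it gets initialized, and the way the suggestion is handed back are 
all assumptions for illustration -- this is not the actual component Doug 
describes:

import java.io.IOException;

import org.apache.lucene.search.spell.SpellChecker;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

// Sketch only: when the main search finds nothing, attach a spelling
// suggestion so the caller can re-run the query with the corrected term.
public class AutoCorrectComponent extends SearchComponent {
    // Assumed to be initialized elsewhere from a spell index.
    private SpellChecker spellChecker;

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        // Nothing to do before the main query runs.
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
        if (rb.getResults() == null || rb.getResults().docList.matches() > 0) {
            return;  // the original query found something; leave it alone
        }
        // Real code would tokenize q and correct it term by term;
        // suggestSimilar() only handles a single word.
        String q = rb.req.getParams().get("q");
        String[] suggestions = spellChecker.suggestSimilar(q, 1);
        if (suggestions.length > 0) {
            rb.rsp.add("autocorrect", suggestions[0]);
        }
    }

    @Override
    public String getDescription() { return "zero-results auto-correction (sketch)"; }

    @Override
    public String getVersion() { return "sketch"; }

    @Override
    public String getSourceId() { return "sketch"; }

    @Override
    public String getSource() { return "sketch"; }
}
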

DisMax and Search Components

2008-01-21 Thread Doug Steigerwald
Is there any support for DisMax (or any search request handlers) in search components, or is that 
something that still needs to be done?  It seems like it isn't supported at the moment.


We want to be able to use a field collapsing component 
(https://issues.apache.org/jira/browse/SOLR-236), but still be able to use our DisMax handlers.


Right now it's one or the other, and we -need- both.

Thanks.
doug


Re: DisMax and Search Components

2008-01-21 Thread Doug Steigerwald

We've found a way to work around it.  In our search components, we're doing 
something like:

  defType = defType == null ? DisMaxQParserPlugin.NAME : defType;

If you add defType=dismax to the query string, it'll use the 
DisMaxQParserPlugin.

Unfortunately, I haven't been able to figure out an easy way to access the config for the different 
dismax handlers defined in solrconfig.xml.  So on our service side (a Rails app), we're going to 
keep a configuration with all the params we need to pass (qf, pf, fl, etc.) and send them based on 
the parameters coming into the service that tell us which dismax handler to use.


This may not be the best way to do it, but it will work fine for us until we can dedicate more time 
to it (we roll out Solr and our search service to QA next week).


Doug

Charles Hornberger wrote:

On Jan 21, 2008 10:23 AM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

Is there any support for DisMax (or any search request handlers) in search 
components, or is that
something that still needs to be done?  It seems like it isn't supported at the 
moment.


I was curious about this, too ... If it *is* something that needs to
be done, am happy to help w/ the coding. But I would need some
advice/guidance up front --  I'm new enough to Solr that the design
behind the SearchComponents refactoring is not immediately obvious to
me, either from the Jira comments or the code itself.

-Charlie


Re: DisMax and Search Components

2008-01-21 Thread Doug Steigerwald

We don't always want to use the dismax handler in our setup.

Doug

Yonik Seeley wrote:

On Jan 21, 2008 9:06 PM, Doug Steigerwald
[EMAIL PROTECTED] wrote:

We've found a way to work around it.  In our search components, we're doing 
something like:

   defType = defType == null ? DisMaxQParserPlugin.NAME : defType;


Would it be easier to just add it as a default parameter in the request handler?

-Yonik
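
Yonik's suggestion amounts to something like the snippet below in 
solrconfig.xml -- the handler name and qf value here are illustrative, not 
from anyone's actual config.  With defType in the defaults, every query 
through this handler is parsed as dismax unless the request overrides it:

<requestHandler name="/search" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">headline^2 body</str>
  </lst>
</requestHandler>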


Re: Spell checker index rebuild

2008-01-17 Thread Doug Steigerwald

It's in the index.  Can see it with a query: q=word:blackjack

And in Luke:

<lst name="topTerms">
  <int name="blackjack">29</int>

The actual index data seems to disappear.

First rebuild:
$ ls  spell/
_2.cfs  segments.gen  segments_i

Second rebuild:
$ ls spell
segments_2z  segments.gen

doug

Otis Gospodnetic wrote:

Do you trust the spellchecker 100%? (Not looking at its source now.)  I'd peek 
at the index with Luke (Luke I trust :)) and see if that term is really there 
first.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message 
From: Doug Steigerwald [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 2:56:35 PM
Subject: Spell checker index rebuild

Having another weird spell checker index issue.  Starting from a clean index 
and spell check index, I'll index everything in example/exampledocs.  After the 
first rebuild of the spellchecker index using the query below, the word 
'blackjack' exists in the spellchecker index.  Great, no problems.

Rebuild it again and the word 'blackjack' does not exist any more.

http://localhost:8983/solr/core0/select?q=blackjack&qt=spellchecker&cmd=rebuild

Any ideas?  This is with a Solr trunk build from yesterday.

doug




Spellchecker index rebuild error

2008-01-14 Thread Doug Steigerwald
Lately I've been having issues with the spellchecker failing to properly rebuild my spell index.  I 
used to be able to delete the spell directory and reload the core and build the index fine if it 
ever crapped out, but now I can't even build it.


java.io.FileNotFoundException: /home/dsteiger/solr/data/spell/_8c.cfs (No such 
file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:70)
at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:181)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:167)
...

Here's the query: /solr/dsteiger/select/?q=test&qt=spellchecker&cmd=rebuild

Here's my config snippet:

<requestHandler name="spellchecker" class="solr.SpellCheckerRequestHandler" startup="lazy">
  <lst name="defaults">
    <int name="suggestionCount">1</int>
    <float name="accuracy">0.5</float>
  </lst>
  <str name="spellcheckerIndexDir">spell</str>
  <str name="termSourceField">spell</str>
</requestHandler>

Anyone have any ideas?

Doug


Re: Field collapsing

2008-01-03 Thread Doug Steigerwald
I finally took more than 30 minutes to try and apply the patch and got it to (mostly) work.  Will 
try to submit it tomorrow for review if there's interest.


Doug

Ryan McKinley wrote:
I think the last patch is pre-QueryComponent infrastructure -- it 
needs to be transformed into a QueryComponent to work.


I don't think anyone has tackled that yet...

ryan


Doug Steigerwald wrote:
I'm modifying the patch so it applies.  StandardRequestHandler and 
DisMaxRequestHandler were changed a lot in mid-November, and I've been 
having a hard time figuring out where the changes should be reapplied.


Doug

Grant Ingersoll wrote:

Hi Doug,

Is the problem in applying the patch or getting it to work once it is 
applied?


-Grant

On Jan 3, 2008, at 8:52 AM, Doug Steigerwald wrote:

Being able to collapse multiple documents into one result with Solr 
is a big deal for us here.  Has anyone been able to get field 
collapsing (http://issues.apache.org/jira/browse/SOLR-236) to patch 
to a recent checkout of Solr?  I've been unsuccessful so far in 
trying to modify the latest patch to work.


Thanks.
Doug




Re: Continue posting after an error

2007-09-25 Thread Doug Steigerwald
Thanks.  We're probably not going to be sending huge batches of documents very often, so I'll just 
try a persistent connection and hopefully performance won't be an issue.  With our document size, I 
was posting around 300+ docs/s, so anything reasonably close to that will be good.  Historically 
we've been processing 335k document updates per hour, so we're way under the max docs/s we've seen 
with Solr.


Doug

Chris Hostetter wrote:

: Sometimes there's a field that shouldn't be multiValued, but the data comes in
: with multiple fields of the same name in a single document.
: 
: Is there any way to continue processing other documents in a file even if one

: document errors out? It seems like whenever we hit one of these cases, it
: stops processing the file completely.

I believe you are correct, the UpdateRequestHandler aborts as soon as a bad 
doc is found.  It might be possible to make it skip bad docs and continue 
processing, but what mechanism could it use to report which doc had 
failed? not all schemas have uniqueKey fields, and even if they do - the 
uniqueKey field may have been the problem.


This is one of the reasons why i personally recommend only sending one doc 
at a time -- if you use persistent HTTP connections, there really 
shouldn't be much performance difference (and if there is, we can probably 
optimize that)



-Hoss
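
A minimal sketch of the one-doc-at-a-time approach Hoss recommends is below.  
The URL, document fields, and error handling are illustrative; the only real 
point is that java.net.HttpURLConnection quietly reuses the keep-alive socket 
as long as each response is drained:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Post one <add><doc> per request; a bad doc now only fails its own
// request instead of aborting a whole batch.
public class SingleDocPoster {
    public static void main(String[] args) throws Exception {
        URL update = new URL("http://localhost:8983/solr/update");
        String[] ids = { "1", "2", "3" };  // stand-in for real documents
        for (String id : ids) {
            String doc = "<add><doc><field name=\"id\">" + id
                       + "</field></doc></add>";
            HttpURLConnection conn = (HttpURLConnection) update.openConnection();
            conn.setDoOutput(true);  // implies POST
            conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
            OutputStream out = conn.getOutputStream();
            out.write(doc.getBytes("UTF-8"));
            out.close();
            int status = conn.getResponseCode();
            if (status == 200) {
                conn.getInputStream().close();  // drain so the socket is reused
            } else {
                System.err.println("doc " + id + " failed: HTTP " + status);
                if (conn.getErrorStream() != null) conn.getErrorStream().close();
            }
        }
    }
}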


Geographic searching in solr

2007-09-12 Thread Doug Steigerwald

Not sure if this got through earlier, pine messed up...

Has anyone implemented any sort of geographic searching for Solr?  I've
found Local Lucene
(http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm) by
Patrick O'Leary and there is another project in his CVS called Local Solr
(http://www.nsshutdown.com/viewcvs/viewcvs.cgi/localsolr/).

I've gotten Local Solr and Local Lucene compiled, but dropping the plugin
into the Solr lib folder and defining the custom FieldTypes in
my schema results in errors (see below).

<fieldType name="longitude" class="com.pjaol.search.solr.LngField" />

Has anyone gotten Local Lucene/Solr to work for geographic searches, or
implemented anything like this?

I can't actually find any other plugins for Solr to look at and try to
resolve my issues with Local Solr.  Any help would be appreciated.

I've tried this with Solr 1.2 and with Solr compiled from the trunk, on Java 
1.6.


Thanks.

Doug

---error---
Sep 12, 2007 8:22:50 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NoClassDefFoundError: org/apache/solr/schema/FieldType
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
at
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
...
...
...