schema.xml changes and the impact on Solr?

2007-08-13 Thread Kevin Holmes
We're planning to make changes to our schema.xml file... I need to ask a
few questions

 

1 - If we add fields / remove fields to be indexed, how will this affect
our current indexes?  Will we need to completely recreate millions of
indexes (or is it indices)?

 

Scenario 1a :: we've been injecting field1... but not indexed.  Now,
we just want to add an index.  Once the xml is changed, how do we sanely
reindex?

 

Scenario 1b :: we add field2 (with indexed="true"), which was not
previously used as a field at all.  Do our indexes need to be completely
recreated, or is there a way to update these indexes individually?  I
still have the original data in a DB and can do that if necessary.

 

Scenario 1c :: we remove a few of the fields in the schema.xml (but add
nothing).  Reindex required?

2 - Question about the structure of the injected xml file... does it
need to exactly match the data in solr?  I know it makes sense that
we're only injecting the fields that solr needs and not excluding fields
that it needs... but how fussy is solr when it comes to matching the xml
in injection?
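
To make the matching question concrete: as far as I know, Solr rejects a document that contains a field name the schema does not declare (and that no dynamicField pattern matches), while declared fields that are not required can simply be left out. Here is a minimal Python sketch that drops undeclared fields before building the add message. SCHEMA_FIELDS, the row dict, and build_add_xml are all hypothetical names for illustration, not anything from our setup.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical: the field names declared in schema.xml (an assumption for this sketch).
SCHEMA_FIELDS = {"id", "field1", "field2"}


def build_add_xml(row, schema_fields=SCHEMA_FIELDS):
    """Build a Solr <add> message from one row, keeping only declared, non-NULL fields."""
    parts = ["<add><doc>"]
    for name in sorted(row):
        value = row[name]
        if name not in schema_fields or value is None:
            continue  # skip columns the schema does not declare, and NULL columns
        parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(str(value))))
    parts.append("</doc></add>")
    return "".join(parts)


if __name__ == "__main__":
    # "internal_flag" is not in SCHEMA_FIELDS, so it is left out of the message.
    print(build_add_xml({"id": 42, "field1": "a & b", "internal_flag": 1}))
```

Filtering on the client side like this means a column added to the DB later does not break the import before schema.xml is updated.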

So really, four questions.  I look forward to your wisdom!

 

-KH



Too many open files

2007-08-09 Thread Kevin Holmes
<result status="1">java.io.FileNotFoundException:
/usr/local/bin/apache-solr/enr/solr/data/index/_16ik.tii (Too many open
files)

 

When I'm importing, this is the error I get.  I know it's vague and
obscure.  Can someone suggest where to start?  I'll buy a bag of M&Ms
(not peanut) for anyone who can help me solve this*

 

*limit one bag per successful solution for a total maximum of 1 bag to
be given
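
For anyone hitting the same error: "Too many open files" is the operating system saying the process has reached its per-process file descriptor limit. A minimal sketch (Unix only, Python standard library) of a first check is below. It prints the limit and counts the files in the index directory. The helper names are made up for illustration, and the limit is only meaningful if the script runs as the same user, from the same kind of shell, as Solr.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_index_files(index_dir):
    """Count the regular files in a Lucene index directory."""
    return sum(1 for entry in os.scandir(index_dir) if entry.is_file())


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # Directory taken from the error message above; adjust for your install.
    index_dir = "/usr/local/bin/apache-solr/enr/solr/data/index"
    if os.path.isdir(index_dir):
        print("files in index:", count_index_files(index_dir))
```

If the file count is anywhere near the soft limit, that is a strong hint about where to look next.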



RE: Too many open files

2007-08-09 Thread Kevin Holmes
You're a gentleman and a scholar.  I will donate the M&Ms to myself :).
Can you tell me from this snippet of my solrconfig.xml what I might
tweak to make this more betterer?

-KH

  <indexDefaults>
   <!-- Values here affect all index writers and act as a default unless
overridden. -->
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>1</maxFieldLength>
    <writeLockTimeout>1000</writeLockTimeout>
    <commitLockTimeout>1</commitLockTimeout>
  </indexDefaults>


Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Kevin Holmes
I inherited an existing (working) solr indexing script that runs like
this:

 

Python script queries the mysql DB then calls bash script

Bash script performs a curl POST submit to solr

 

We're injecting about 1000 records / minute (constantly), frequently
pushing the edge of our CPU / RAM limitations.

 

I'm in the process of building a Perl script to use DBI and
lwp::simple::post that will perform this all from a single script
(instead of 3).

 

Two specific questions

1: Does anyone have a clever (or better) way to perform this process
efficiently?

 

2: Is there a way to inject into solr without using POST / curl / http?

 

Admittedly, I'm no solr expert - I'm starting from someone else's setup,
trying to reverse-engineer my way out.  Any input would be greatly
appreciated.
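
One common way to cut the per-record overhead, still over HTTP, is to send many documents in a single add message and commit once at the end rather than once per record. A rough Python sketch of that idea follows. The update URL is an assumption (adjust host, port, and path to your install), the function names are hypothetical, and the batch size of 500 is an arbitrary starting point, not a tuned value.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumed update endpoint; not taken from the existing scripts.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def chunks(rows, size):
    """Yield successive lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def docs_to_add_xml(rows):
    """Render a list of dict rows as one Solr <add> message."""
    docs = []
    for row in rows:
        fields = "".join(
            "<field name=%s>%s</field>" % (quoteattr(key), escape(str(value)))
            for key, value in row.items()
        )
        docs.append("<doc>%s</doc>" % fields)
    return "<add>%s</add>" % "".join(docs)


def post_xml(xml, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the response body."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def index_rows(rows, batch_size=500):
    """Send rows in batches, then issue a single commit."""
    for batch in chunks(rows, batch_size):
        post_xml(docs_to_add_xml(batch))
    post_xml("<commit/>")
```

Here index_rows would be fed the rows coming back from the DB query, so the whole pipeline lives in one script and makes one HTTP request per batch instead of one per record.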



RE: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Kevin Holmes
Is this a native feature, or do we need to get creative with scp from
one server to the other?


> If it's a contention between search and indexing, separate them
> via a query-slave and an index-master.
>
> --cw


Heap size vs memory allocation

2007-08-06 Thread Kevin Holmes
I'm searching through the mail archives and a few old letters make it
sound as if heap size is a different setting completely than memory
allocation on execute.  The box has 4gb RAM, and I'm executing solr with
2500 max / 1000 min like this:

 

java -Xmx2500M -Xms1000M -jar start.jar

 

Is this the same as heap size or not?  If there's a separate setting for
heap size, where do I find that? :)
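
For reference, -Xmx and -Xms are the JVM heap settings themselves: -Xmx is the maximum heap size and -Xms is the initial heap size. A small Python sketch that pulls both values out of a java command line and converts them to bytes (heap_settings is a made-up helper name, and it only handles the plain number plus k/m/g suffix form):

```python
import re

# Multipliers for the size suffixes accepted after -Xms / -Xmx.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_settings(command):
    """Extract -Xms / -Xmx values (in bytes) from a java command line."""
    found = {}
    for flag, number, unit in re.findall(r"-(Xms|Xmx)(\d+)([kKmMgG]?)", command):
        found[flag] = int(number) * _UNITS[unit.lower()]
    return found


if __name__ == "__main__":
    settings = heap_settings("java -Xmx2500M -Xms1000M -jar start.jar")
    for flag, size in sorted(settings.items()):
        print("-%s = %d MiB" % (flag, size // 1024 ** 2))
```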


RE: Error when starting Solr using Tomcat V6 (tomcat6-6.0.13)

2007-08-02 Thread Kevin Holmes
What's the command you are using to start solr?



RE: Please help! Solr 1.1 HTTP server stops responding

2007-07-30 Thread Kevin Holmes
Just got this:



Jul 30, 2007 3:02:14 PM org.apache.solr.core.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:02:30 PM org.apache.solr.core.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space




 
Kevin Holmes
eNR Services, Inc.
20 Glover Ave. 2nd Floor
Norwalk, CT. 06851
203-849-7248
[EMAIL PROTECTED]


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Monday, July 30, 2007 2:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Please help! Solr 1.1 HTTP server stops responding

On 7/30/07, David Whalen [EMAIL PROTECTED] wrote:
> We increased the heap size to 1500M and that didn't seem to help.
> In fact, the crashes seem to occur more now than ever.  We're
> constantly restarting solr just to get a response.
>
> I don't know enough to know where the log files are to answer
> your question

Me neither ;-)
Solr's example app that uses Jetty just has logging going to stdout
(the console) to make it clear and visible to new users when an error
happens.  Hopefully you've configured Jetty to log to files, or at
least redirected Jetty's stdout/stderr to a file.
You need to look around and try and find those log files.
If you find them, one thing to look for would be "WARNING" in the log
files.  Another thing to look for would be "Exception" or "Memory".
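
A rough Python sketch of that log check, for whoever ends up doing it: it returns every line that mentions one of the three keywords, with its line number. The function names are hypothetical, and the log file path has to come from your own Jetty setup.

```python
import re

# Keywords suggested above; the match is case-sensitive on purpose.
PATTERN = re.compile(r"WARNING|Exception|Memory")


def suspicious_lines(lines):
    """Return (line_number, text) for lines mentioning WARNING, Exception or Memory."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, 1)
        if PATTERN.search(line)
    ]


def scan_file(path):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

A line such as "SEVERE: java.lang.OutOfMemoryError: Java heap space" is caught by the "Memory" keyword.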

> So maybe it's actually Jetty that's messing me up?  How can I
> make sure of that?

Perhaps point your browser at http://localhost:8983/ and see if you
get any response at all.

-Yonik
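
The same "does it respond at all" check can be scripted, which is handy for running it from cron while the server is misbehaving. A minimal Python sketch, where check_http is a hypothetical helper and the 5 second timeout is an arbitrary choice:

```python
import http.client
import urllib.error
import urllib.request


def check_http(url, timeout=5.0):
    """Return the HTTP status code, or None if no usable response arrived in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        return None  # refused, timed out, unreachable, or a broken response


if __name__ == "__main__":
    print(check_http("http://localhost:8983/"))
```

A return value of None means nothing answered on that port, which points at Jetty or the JVM. An error status means Jetty is up and the problem is further in.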


RE: Please help! Solr 1.1 HTTP server stops responding

2007-07-30 Thread Kevin Holmes
These might be relevant too:



Jul 30, 2007 3:05:03 PM org.apache.solr.core.SolrException log
SEVERE: java.io.IOException: Lock obtain timed out:
SimpleFSLock@/tmp/lucene-f4cca35f5bee7bbcd8238c7ef8697193-write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:69)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:258)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:208)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:66)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:119)
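
The trace shows the index writer timing out while waiting for a lock file under /tmp. One thing worth checking is whether that lock file has been sitting there for a long time, for example left behind by a process that died. That is only a guess about the cause here. The Python sketch below lists lock files matching the name pattern in the trace along with their age. It deliberately deletes nothing, and find_write_locks is a made-up name.

```python
import glob
import os
import time


def find_write_locks(directory="/tmp"):
    """Return (path, age_in_seconds) for Lucene write-lock files in `directory`."""
    now = time.time()
    pattern = os.path.join(directory, "lucene-*-write.lock")
    return [
        (path, now - os.path.getmtime(path))
        for path in sorted(glob.glob(pattern))
    ]


if __name__ == "__main__":
    for path, age in find_write_locks():
        print("%s (last modified %.0f s ago)" % (path, age))
```

A lock should only be removed by hand when it is certain that no indexing process is still running.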



 


RE: Please help! Solr 1.1 HTTP server stops responding

2007-07-30 Thread Kevin Holmes
debiandos:~# curl -i http://localhost:8983/solr/select/?q=superduperobscuretestingstring
HTTP/1.1 200 OK
Date: Mon, 30 Jul 2007 19:20:40 GMT
Server: Jetty/5.1.11RC0 (Linux/2.6.18-4-686 i386 java/1.5.0_11
Content-Type: text/xml; charset=UTF-8
Content-Length: 272

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int
name="QTime">121</int><lst name="params"><str
name="q">superduperobscuretestingstring</str></lst></lst><result
name="response" numFound="0" start="0"/>
</response>


> Next, how is Jetty being started?
cd /home/jason/code/apache-solr-1.1.0-incubating/enrsolr;
java -Xmx1500m -jar start.jar


> Where is its jetty.xml
/home/jason/code/apache-solr-1.1.0-incubating/enrsolr/etc/jetty.xml

> configuration file? What does that file specify for RequestLog?

<!-- Uncomment for request logging.
  <Set name="RequestLog">
    <New class="org.mortbay.http.NCSARequestLog">
      <Arg><SystemProperty name="jetty.home"
default="."/>/logs/_mm_dd.request.log</Arg>
      <Set name="retainDays">90</Set>
      <Set name="append">true</Set>
      <Set name="extended">false</Set>
      <Set name="LogTimeZone">GMT</Set>
    </New>
  </Set>
  -->






> My point is that I can't predict how it's started on your machine.
> You need to find out yourself.
> On Linux:
> - ps -ef | grep java

root 10175 10174 11 15:17 pts/1 00:00:56 java -Xmx1500m -jar start.jar



RE: Please help! Solr 1.1 HTTP server stops responding

2007-07-30 Thread Kevin Holmes
This might be relevant too?


Jul 30, 2007 3:05:22 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:05:25 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:05:27 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:05:30 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:05:33 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space

Jul 30, 2007 3:05:36 PM org.apache.solr.core.SolrException log
SEVERE: Error during auto-warming of
key:[EMAIL PROTECTED]:java.lang.OutOfMemoryError: Java heap space