RE: EmbeddedSolr class from Wiki

2007-05-01 Thread Fuad Efendi
Thank you Hoss, this is exactly what I need! Currently I perform reindexing
once a month, and it takes a few days... Very slow... Over 2 million
documents (not too much; 300 MB in files), database & SOLR on the same box,
and SOLR uses about 60-80% CPU. I will implement real-time updates via direct
Java calls (as soon as data gets changed).

About Compass, I noticed some messages. I tried to use it (before SOLR)
because of the advertised "transactional" Lucene updates; that claim is not
true, and performance was really bad.

-Original Message-
From: Chris Hostetter

postCommit and postOptimize hooks can be subclasses of SolrEventListener so
you can trigger arbitrary Java code if you want to write your own (use JMS,
or make an HTTP call, whatever)

the RunExecutableListener that ships with Solr would be the easiest thing
to do ... just have it execute the "commit" command line script on your
slave (which will make it reopen the index you just modified)



Re: Ranking ApacheCon proposals

2007-05-01 Thread Erik Hatcher


On May 1, 2007, at 7:42 PM, ericp wrote:

Cool, I noticed a ruby-Flare-Solr presentation too. Who is giving that?


I proposed that one.

Erik




Re: Ranking ApacheCon proposals

2007-05-01 Thread ericp
Cool, I noticed a ruby-Flare-Solr presentation too. Who is giving that?

ERIC

Chris Hostetter wrote:
> I have no idea if they did this for the impending ApacheCon EU, but I just
> noticed that for ApacheCon US, they have a "Would you attend this
> session?" ranking for people to give feedback on the abstracts that
> have been submitted before the schedule is made.
> 
> I would never dream of shilling my own session proposals, but I will
> happily encourage people who are interested in seeing Solr/Lucene well
> represented in the ApacheCon sessions to go to the ApacheCon website,
> create an account, and click the "Rate the session proposals" link after
> you login...
> 
>    http://apachecon.com/html/login.html
> 
> If you are *not* interested in seeing Solr/Lucene well represented in the
> ApacheCon sessions, please disregard this email.  :)
> 
> 
> -Hoss
> 
> 


Wondering about results from PhraseQuer

2007-05-01 Thread solruser

Hi Everyone,

Pardon me if this question has been asked on the mailing list before; I
looked but could not find an answer. I am querying an index with a phrase
query, and although I can see my terms' occurrences in the debug results,
the overall score is "0". The scenario: a user searches titles using very
common terms such as "how do I update" (each of these words appears
thousands of times in the index) together with "prison", which appears no
more than once or twice across the index. If I run a phrase query I get
zero results, and if I run a boolean query across all of the terms I get too
many results to be meaningful. How should I arrange the query so that I get
relevant results? Here are the debug results for my search query:
=

subject_t:"how do I prison"
subject_t:"how do I prison"
PhraseQuery(subject_t:"how do i prison")
subject_t:"how do i prison"
standard

0.0 = fieldWeight(subject_t:"how do i prison" in 9268), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.5 = fieldNorm(field=subject_t, doc=9268)

0.0 = fieldWeight(subject_t:"how do i prison" in 10424), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.5 = fieldNorm(field=subject_t, doc=10424)

0.0 = fieldWeight(subject_t:"how do i prison" in 12163), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.625 = fieldNorm(field=subject_t, doc=12163)

0.0 = fieldWeight(subject_t:"how do i prison" in 9289), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.625 = fieldNorm(field=subject_t, doc=9289)

0.0 = fieldWeight(subject_t:"how do i prison" in 14700), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.4375 = fieldNorm(field=subject_t, doc=14700)

0.0 = fieldWeight(subject_t:"how do i prison" in 11920), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.625 = fieldNorm(field=subject_t, doc=11920)

0.0 = fieldWeight(subject_t:"how do i prison" in 1278), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.375 = fieldNorm(field=subject_t, doc=1278)

0.0 = fieldWeight(subject_t:"how do i prison" in 3868), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.3125 = fieldNorm(field=subject_t, doc=3868)

0.0 = fieldWeight(subject_t:"how do i prison" in 3893), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.5 = fieldNorm(field=subject_t, doc=3893)

0.0 = fieldWeight(subject_t:"how do i prison" in 19024), product of:
  0.0 = tf(phraseFreq=0.0)
  18.508762 = idf(subject_t: how=2225 do=3359 i=4918 prison=4)
  0.5 = fieldNorm(field=subject_t, doc=19024)


=
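
Each explain entry above is a product of three factors, so a phrase frequency of zero forces the score to zero no matter how large the idf is. A minimal Python sketch of that arithmetic follows. The function name `field_weight` is made up for illustration, and the square-root tf is an assumption based on Lucene's classic default similarity; it is not taken from the poster's setup.

```python
import math

def field_weight(phrase_freq, idf, field_norm):
    """Product shown in the explain output: tf * idf * fieldNorm."""
    tf = math.sqrt(phrase_freq)  # assumption: classic sqrt(freq) term-frequency factor
    return tf * idf * field_norm

# Values copied from the explain entry for doc 9268.
score = field_weight(phrase_freq=0.0, idf=18.508762, field_norm=0.5)
print(score)  # 0.0, because the exact phrase was never found in that document

# Hypothetical comparison: the same factors if the phrase had occurred once.
one_hit = field_weight(phrase_freq=1.0, idf=18.508762, field_norm=0.5)
print(one_hit)
```

The high idf (driven mostly by the rare term "prison") only matters once the phrase itself matches at least once.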

Thanks 
-- 
View this message in context: 
http://www.nabble.com/Wondering-about-results-from-PhraseQuer-tf3677924.html#a10277926
Sent from the Solr - User mailing list archive at Nabble.com.


Ranking ApacheCon proposals

2007-05-01 Thread Chris Hostetter

I have no idea if they did this for the impending ApacheCon EU, but I just
noticed that for ApacheCon US, they have a "Would you attend this
session?" ranking for people to give feedback on the abstracts that
have been submitted before the schedule is made.

I would never dream of shilling my own session proposals, but I will
happily encourage people who are interested in seeing Solr/Lucene well
represented in the ApacheCon sessions to go to the ApacheCon website,
create an account, and click the "Rate the session proposals" link after
you login...

   http://apachecon.com/html/login.html

If you are *not* interested in seeing Solr/Lucene well represented in the
ApacheCon sessions, please disregard this email.  :)


-Hoss



RE: NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
I went with the first approach which got me up and running. Your other
example config (using ./snapshooter) made me realize how foolish my
original problem was!

Anyway, I've got the whole thing up and running and it looks pretty
awesome! 

One quick question, though. As stated in the wiki, one of the benefits
of distributing the indexes is load balancing the queries. Is there a
built-in Solr mechanism for performing this query load balancing? I
suspect there is not, and I haven't seen anything about it in the
wiki, but I wanted to check because I know I'm going to be asked.

Thanks,
Charlie

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 01, 2007 3:20 PM
To: solr-user@lucene.apache.org
Subject: RE: NullPointerException (not schema related)


: 
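: <listener event="postCommit" class="solr.RunExecutableListener">
:   <str name="exe">snapshooter</str>
:   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
:   <bool name="wait">true</bool>
: </listener>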
: 

: the directory. However, when I committed data to the index, I was
: getting "No such file or directory" errors from the Runtime.exec call. I
: verified all of the permissions, etc, with the user I was trying to use.
: In the end, I wrote up a little test program to see if it was a problem
: with the Runtime.exec call and I think it is. I'm running this on CentOS
: 4.4 and Runtime.exec seems to have a hard time directly executing bash
: scripts. For example, if I called Runtime.exec with a command of
: "test_program" (which is a bash script), it failed. If I called
: Runtime.exec with a command of "/bin/bash test_program" it worked.

this initial problem you were having may be a result of path issues.  dir
doesn't need to be the directory where your script lives, it's the
directory where you want your script to run (the "cwd" of the process).
it's possible that the error you were getting was because "." isn't in the
PATH that was being used, you should try something like this...

 
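 <listener event="postCommit" class="solr.RunExecutableListener">
   <str name="exe">/usr/local/Production/solr/solr/bin/snapshooter</str>
   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
   <bool name="wait">true</bool>
 </listener>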
 

...or maybe even...

 
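 <listener event="postCommit" class="solr.RunExecutableListener">
   <str name="exe">./snapshooter</str>
   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
   <bool name="wait">true</bool>
 </listener>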
 

-Hoss



Re: NullPointerException (not schema related)

2007-05-01 Thread Mike Klaas

On 5/1/07, Charlie Jackson <[EMAIL PROTECTED]> wrote:

This is what came in the solrconfig.xml file with just a minor tweak to
the directory. However, when I committed data to the index, I was
getting "No such file or directory" errors from the Runtime.exec call. I
verified all of the permissions, etc, with the user I was trying to use.
In the end, I wrote up a little test program to see if it was a problem
with the Runtime.exec call and I think it is. I'm running this on CentOS
4.4 and Runtime.exec seems to have a hard time directly executing bash
scripts. For example, if I called Runtime.exec with a command of
"test_program" (which is a bash script), it failed. If I called
Runtime.exec with a command of "/bin/bash test_program" it worked.


Yes, Runtime.exec does not invoke a shell automatically, so shebang
lines, shell built-ins, io redirection, etc. cannot be used directly.
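
The redirection part of this is easy to see outside of Java, since any exec-style launch behaves the same way. The sketch below is a Python analogue, not Solr or Java code. It uses Python's `subprocess` module (which also runs programs without a shell by default) and assumes a POSIX system with `echo` and `/bin/sh` available.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # No shell involved: ">" and "out.txt" reach echo as ordinary arguments.
    direct = subprocess.run(
        ["echo", "hi", ">", "out.txt"],
        cwd=tmp, capture_output=True, text=True,
    )
    direct_made_file = os.path.exists(os.path.join(tmp, "out.txt"))

    # Explicit shell: /bin/sh parses the command line and performs the redirect.
    subprocess.run(["/bin/sh", "-c", "echo hi > out.txt"], cwd=tmp, check=True)
    shell_made_file = os.path.exists(os.path.join(tmp, "out.txt"))

print(repr(direct.stdout))                # the ">" shows up in echo's output
print(direct_made_file, shell_made_file)  # only the shell run creates the file
```

This is the same reason the "/bin/bash test_program" form works: the shell is named explicitly as the program to execute.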

-Mike


Re: Specifying no-ops...

2007-05-01 Thread Brian Whitman
When we use solr in a javascript / ajax.request context we often want  
to 'tag' requests with the user id or item number or something that  
will not normally appear in the solr results, because in an  
asynchronous request handler you won't otherwise know who or what the  
query is about. To do this, we make sure all of our requestHandlers in  
solrconfig.xml have "echoParams = explicit" set.


Then you can do select?q=dogs&userid=XR30010&itemid=TR30120

And solr will not complain about those extra params and will also  
echo them back in the response XML/json, which your client can parse.
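
As a rough client-side illustration, the tagged request URL can be assembled like this. It is only a sketch: the helper name `tagged_select_url` and the base URL `http://localhost:8983/solr` are assumptions for the example, while `userid` and `itemid` are the tag names used above.

```python
from urllib.parse import urlencode

def tagged_select_url(base, query, **tags):
    """Build a select URL whose extra params merely tag the request."""
    params = {"q": query, **tags}
    return f"{base}/select?{urlencode(params)}"

url = tagged_select_url(
    "http://localhost:8983/solr",  # assumed host and port
    "dogs",
    userid="XR30010",
    itemid="TR30120",
)
print(url)
```

With echoParams set to explicit, the client can then read the same `userid` and `itemid` values back out of the response it receives.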





On May 1, 2007, at 2:22 AM, escher2k wrote:



I want to capture information about the user who is executing a particular
search. Is there a way to specify in Solr that certain fields should just be
treated as pass through and not processed? This way I can use arbitrary
params to do better logging.

Thanks.
--
View this message in context: http://www.nabble.com/Specifying-no-ops...-tf3673559.html#a10265041

Sent from the Solr - User mailing list archive at Nabble.com.



--
http://variogr.am/
[EMAIL PROTECTED]





RE: Unicode characters

2007-05-01 Thread HUYLEBROECK Jeremy RD-ILAB-SSF

Thanks a lot for the time you spent understanding my problem and
checking for a solution in Neko!
It helps a lot.


-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 27, 2007 4:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Unicode characters 


: -fetch a web page
: -decode entities and unicode characters (such as &#149;) using Neko
: library
: -get a unicode String in Java
: -Sent it to SOLR through XML created by SAX, with the right encoding
: (UTF-8) specified everywhere (writer, header etc...)
: -it apparently arrives clean on the SOLR side (verified in our logs).
: -In the query output from SOLR (XML message), the character is not
: encoded as an entity (not &#149;) but the character itself is used
: (character 149=95 hexadecimal).

Just because someone uses an html entity to display a character in a web
page doesn't mean it needs to be "escaped" in XML ... i think that in
theory we could use numeric entities to escape *every* character but
that would make the XML responses a lot bigger ... so in general Solr
only escapes the characters that need to be escaped to have a valid
UTF-8 XML response.

You may also be having some additional problems since 149 (hex 95) is
not a printable UTF-8 character, it's a control character
(MESSAGE_WAITING) ... it sounds like you're dealing with HTML where
people were using the numeric value from the "Windows-1252" charset.

you may want to modify your parsing code to do some mappings between
"control" characters that you know aren't meant to be control characters
before you ever send them to solr.  a quick search for "Neko
windows-1252" indicates that enough people have had problems with this
that it is a built in feature...
http://people.apache.org/~andyc/neko/doc/html/settings.html
"http://cyberneko.org/html/features/scanner/fix-mswindows-refs
 Specifies whether to fix character entity references for Microsoft
 Windows characters as described at
 http://www.cs.tut.fi/~jkorpela/www/windows-chars.html."

(I've run into this a number of times over the years when dealing with
content created by windows users, as you can see from my one and only
thread on "JavaJunkies" ...
  http://www.javajunkies.org/index.pl?node_id=3436
)
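
For parsers without such a feature, the mapping itself is small. The following Python sketch (the thread's code is Java, so this is only an illustration, and `fix_mswindows_refs` is a made-up name borrowed from the Neko setting) replaces C1 control code points with the characters Windows-1252 assigns to the same byte values, relying on Python's built-in `cp1252` codec.

```python
def fix_mswindows_refs(text):
    """Map U+0080..U+009F to the Windows-1252 character for the same byte."""
    out = []
    for ch in text:
        code_point = ord(ch)
        if 0x80 <= code_point <= 0x9F:
            try:
                ch = bytes([code_point]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # byte has no Windows-1252 assignment; keep it unchanged
        out.append(ch)
    return "".join(out)

fixed = fix_mswindows_refs("item \x95 one")  # \x95 is character 149
print(fixed == "item \u2022 one")            # True: 149 becomes the bullet U+2022
```

Running this before sending text to Solr means the index holds the real bullet character rather than a control character.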


-Hoss



RE: NullPointerException (not schema related)

2007-05-01 Thread Chris Hostetter

: 
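: <listener event="postCommit" class="solr.RunExecutableListener">
:   <str name="exe">snapshooter</str>
:   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
:   <bool name="wait">true</bool>
: </listener>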
: 

: the directory. However, when I committed data to the index, I was
: getting "No such file or directory" errors from the Runtime.exec call. I
: verified all of the permissions, etc, with the user I was trying to use.
: In the end, I wrote up a little test program to see if it was a problem
: with the Runtime.exec call and I think it is. I'm running this on CentOS
: 4.4 and Runtime.exec seems to have a hard time directly executing bash
: scripts. For example, if I called Runtime.exec with a command of
: "test_program" (which is a bash script), it failed. If I called
: Runtime.exec with a command of "/bin/bash test_program" it worked.

this initial problem you were having may be a result of path issues.  dir
doesn't need to be the directory where your script lives, it's the
directory where you want your script to run (the "cwd" of the process).
it's possible that the error you were getting was because "." isn't in the
PATH that was being used, you should try something like this...

 
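 <listener event="postCommit" class="solr.RunExecutableListener">
   <str name="exe">/usr/local/Production/solr/solr/bin/snapshooter</str>
   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
   <bool name="wait">true</bool>
 </listener>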
 

...or maybe even...

 
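 <listener event="postCommit" class="solr.RunExecutableListener">
   <str name="exe">./snapshooter</str>
   <str name="dir">/usr/local/Production/solr/solr/bin/</str>
   <bool name="wait">true</bool>
 </listener>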
 

-Hoss
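
The lookup rule described above (a bare command name is searched for on PATH, not in the working directory) can be reproduced outside Solr. The sketch below is a Python analogue using `subprocess`, not the Java code Solr runs. The `snapshooter` script it creates is a stand-in, and the PATH value is an assumption chosen so that "." is not on it. It needs a POSIX system where the temporary directory allows executables.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as bin_dir:
    script = os.path.join(bin_dir, "snapshooter")  # stand-in, not Solr's script
    with open(script, "w") as fh:
        fh.write("#!/bin/sh\necho snapshot taken\n")
    os.chmod(script, 0o755)

    env = {"PATH": "/usr/bin:/bin"}  # assumed PATH without "."

    # Bare name: searched on PATH only, so the working directory does not help.
    try:
        subprocess.run(["snapshooter"], cwd=bin_dir, env=env)
        bare_name_found = True
    except FileNotFoundError:
        bare_name_found = False

    # Relative path: resolved against the working directory ("dir").
    relative = subprocess.run(
        ["./snapshooter"], cwd=bin_dir, env=env, capture_output=True, text=True
    )
    # Absolute path: works regardless of PATH or working directory.
    absolute = subprocess.run([script], env=env, capture_output=True, text=True)

print(bare_name_found)                  # False
print(relative.stdout, absolute.stdout)
```

The second and third calls correspond to the two configurations suggested above.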



RE: NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
Nevermind this...looks like my problem was tagging the "args" as an
 node instead of an  node. Thanks anyway!

Charlie

-Original Message-
From: Charlie Jackson [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 01, 2007 12:02 PM
To: solr-user@lucene.apache.org
Subject: NullPointerException (not schema related)

Hello,

 

I'm evaluating solr for potential use in an application I'm working on,
and it sounds like a really great fit. I'm having trouble getting the
Collection Distribution part set up, though. Initially, I had problems
setting up the postCommit listener. I first used this xml to configure
the listener:

 



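<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/usr/local/Production/solr/solr/bin/</str>
  <bool name="wait">true</bool>
</listener>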



 

This is what came in the solrconfig.xml file with just a minor tweak to
the directory. However, when I committed data to the index, I was
getting "No such file or directory" errors from the Runtime.exec call. I
verified all of the permissions, etc, with the user I was trying to use.
In the end, I wrote up a little test program to see if it was a problem
with the Runtime.exec call and I think it is. I'm running this on CentOS
4.4 and Runtime.exec seems to have a hard time directly executing bash
scripts. For example, if I called Runtime.exec with a command of
"test_program" (which is a bash script), it failed. If I called
Runtime.exec with a command of "/bin/bash test_program" it worked. 

 

So, with this knowledge in hand, I modified the solrconfig.xml file
again to this:



  /bin/bash

  /usr/local/Production/solr/solr/bin/

  true

   snapshooter 



 

When I commit data now, however, I get a NullPointerException. I'm
including the stack trace here:

SEVERE: java.lang.NullPointerException
    at org.apache.solr.core.SolrCore.update(SolrCore.java:716)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:53)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
    at java.lang.Thread.run(Thread.java:619)

 

I know this has something to do with my config change (the problem goes
away if I turn off the postCommit listener) but I don't know what!

 

BTW I'm using solr-1.1.0-incubating. 

 

Thanks in advance for any help!

 

Charlie

 



Re: Specifying no-ops...

2007-05-01 Thread Chris Hostetter

: I want to capture information about the user who is executing a particular
: search. Is there a way to specify in Solr that certain fields should just be
: treated as pass through and not processed ? This way I can use arbitrary
: params to do better logging.

fields are different from query params ... it sounds like you are asking
about query params (which will be in the URL and recorded in your
appserver logs).  Any param Solr doesn't know about is already ignored...

http://localhost:8983/solr/select/?q=ipod&some_random_param_that_is_ignored=hoss+is+being_ignored





-Hoss



Re: Faceted count syntax (exclude zeros)...

2007-05-01 Thread Chris Hostetter

: to exclude 0s. The URL below
: doesn't seem to be excluding zeros.
: 
http://localhost:12002/solr/select/?qt=dismax&q=Y&qf=show_all_flag&fl=load_id&facet=true&facet.limit=-1&facet.field=load_id&facet.mincount=1&rows=0

which version of Solr are you using?  facet.mincount was added after Solr
1.1, but you can use "facet.zeros=false" to get the desired results.
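
For anyone scripting this switch, here is a small Python sketch that rewrites the URL from the question. The helper name `use_facet_zeros` is made up for illustration, and the rewrite only applies to `facet.mincount=1`, the one case where `facet.zeros=false` asks for the same thing.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def use_facet_zeros(url):
    """Swap facet.mincount=1 for facet.zeros=false, leaving other params alone."""
    parts = urlsplit(url)
    params = [
        ("facet.zeros", "false") if (key, value) == ("facet.mincount", "1") else (key, value)
        for key, value in parse_qsl(parts.query)
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))

original = (
    "http://localhost:12002/solr/select/?qt=dismax&q=Y&qf=show_all_flag"
    "&fl=load_id&facet=true&facet.limit=-1&facet.field=load_id"
    "&facet.mincount=1&rows=0"
)
rewritten = use_facet_zeros(original)
print(rewritten)
```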



-Hoss



NullPointerException (not schema related)

2007-05-01 Thread Charlie Jackson
Hello,

 

I'm evaluating solr for potential use in an application I'm working on,
and it sounds like a really great fit. I'm having trouble getting the
Collection Distribution part set up, though. Initially, I had problems
setting up the postCommit listener. I first used this xml to configure
the listener:

 



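<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/usr/local/Production/solr/solr/bin/</str>
  <bool name="wait">true</bool>
</listener>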



 

This is what came in the solrconfig.xml file with just a minor tweak to
the directory. However, when I committed data to the index, I was
getting "No such file or directory" errors from the Runtime.exec call. I
verified all of the permissions, etc, with the user I was trying to use.
In the end, I wrote up a little test program to see if it was a problem
with the Runtime.exec call and I think it is. I'm running this on CentOS
4.4 and Runtime.exec seems to have a hard time directly executing bash
scripts. For example, if I called Runtime.exec with a command of
"test_program" (which is a bash script), it failed. If I called
Runtime.exec with a command of "/bin/bash test_program" it worked. 

 

So, with this knowledge in hand, I modified the solrconfig.xml file
again to this:



  /bin/bash

  /usr/local/Production/solr/solr/bin/

  true

   snapshooter 



 

When I commit data now, however, I get a NullPointerException. I'm
including the stack trace here:

SEVERE: java.lang.NullPointerException
    at org.apache.solr.core.SolrCore.update(SolrCore.java:716)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:53)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
    at java.lang.Thread.run(Thread.java:619)

 

I know this has something to do with my config change (the problem goes
away if I turn off the postCommit listener) but I don't know what!

 

BTW I'm using solr-1.1.0-incubating. 

 

Thanks in advance for any help!

 

Charlie

 



Re: i wanna find one crawl that can crawl with defined urls and defined data

2007-05-01 Thread James liu

2007/4/30, Graeme Merrall <[EMAIL PROTECTED]>:


> i wanna crawl http://www.amazone.com/  and just wanna product title ,
> product information, writer, publisher.
>
> and other data i wanna ignore.

How about
http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html




i read it before this mail.


for example,


i wanna crawl http://www.amazone.com/  and just wanna product title ,
product information, writer, publisher.

and other data i wanna ignore.



or if you're prepared to wait or help out there's

http://svn.apache.org/repos/asf/labs/droids/README.TXT





--
regards
jl


RE: Faceted count syntax (exclude zeros)...

2007-05-01 Thread Ge, Yao \(Y.\)
There is a bug related to "facet.mincount" in the incubating version.
http://www.mail-archive.com/solr-user@lucene.apache.org/msg03269.html
-Yao 

-Original Message-
From: escher2k [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 01, 2007 2:00 AM
To: solr-user@lucene.apache.org
Subject: Faceted count syntax (exclude zeros)...


I am trying to execute a faceted count on a field called "load_id" and want
to exclude 0s. The URL below doesn't seem to be excluding zeros.
http://localhost:12002/solr/select/?qt=dismax&q=Y&qf=show_all_flag&fl=load_id&facet=true&facet.limit=-1&facet.field=load_id&facet.mincount=1&rows=0

Result (relevant part of XML):


   0
   0
   80
   81
   77
   62
   31061
  


Thanks.
-- 
View this message in context:
http://www.nabble.com/Faceted-count-syntax-%28exclude-zeros%29...-tf3673535.html#a10264961
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Delete from Solr index...

2007-05-01 Thread Erik Hatcher
If you want to do this as a single delete-by-query, you could OR all  
the clauses together:


  <delete><query>load_id:(20070424150841 OR 20070425145301)</query></delete>
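
If the list of ids is generated by a program, the combined command can be built rather than typed. The Python sketch below does this with the standard library's ElementTree. The function name `delete_by_query_xml` is hypothetical, and posting the result to Solr's update handler is left out.

```python
import xml.etree.ElementTree as ET

def delete_by_query_xml(field, values):
    """Return one <delete> command whose query ORs all the values together."""
    delete = ET.Element("delete")
    query = ET.SubElement(delete, "query")
    query.text = f"{field}:({' OR '.join(values)})"
    return ET.tostring(delete, encoding="unicode")

command = delete_by_query_xml("load_id", ["20070424150841", "20070425145301"])
print(command)
# <delete><query>load_id:(20070424150841 OR 20070425145301)</query></delete>
```

Because the whole range is expressed as one query, only a single delete command is sent.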


Erik


On May 1, 2007, at 2:14 AM, Ryan McKinley wrote:


escher2k wrote:
I am trying to remove documents from my index using "delete by query".
However when I did this, the deleted
items seem to remain. This is the format of the XML file I am using -
load_id:20070424150841
load_id:20070425145301
load_id:20070426145301
load_id:20070427145302
load_id:20070428145301
load_id:20070429145301
When I do the deletes individually, it seems to work (i.e. create each of
the above in a separate file). Does this
mean that each delete query request has to be executed separately ?


correct, delete (unlike <add>) only accepts one command.

Just to note, if "load_id" is your unique key, you could also use:
 <delete><id>20070424150841</id></delete>

This will give you better performance and does not commit the
changes until you explicitly send <commit/>




Re: Delete from Solr index...

2007-05-01 Thread Ryan McKinley

escher2k wrote:

Thanks Ryan. I need to use query since I am deleting a range of documents.
From your comment, I wasn't sure if one doesn't need to do an explicit commit
when using delete by query.
Does delete by query not need an explicit commit?



delete by query causes a commit *before* it executes...  I think you 
still need one after.  From the javadoc:


http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/update/DirectUpdateHandler2.java

"deleteByQuery causes a commit to happen (close current index writer, 
open new index reader) before it can be processed.  If deleteByQuery 
functionality is needed, it's best if they can be batched and executed 
together so they may share the same index reader."


I don't quite know what "batched" means since it only reads one command...



Thanks.


ryan mckinley wrote:

escher2k wrote:

I am trying to remove documents from my index using "delete by query".
However when I did this, the deleted
items seem to remain. This is the format of the XML file I am using -

load_id:20070424150841
load_id:20070425145301
load_id:20070426145301
load_id:20070427145302
load_id:20070428145301
load_id:20070429145301

When I do the deletes individually, it seems to work (i.e. create each of
the above in a separate file). Does this
mean that each delete query request has to be executed separately ?


correct, delete (unlike <add>) only accepts one command.

Just to note, if "load_id" is your unique key, you could also use:
  <delete><id>20070424150841</id></delete>

This will give you better performance and does not commit the changes
until you explicitly send <commit/>