How to get a field that starts with a minus?

2011-02-20 Thread Paul Tomblin
I have a field in my database, id, which is the unique key.  The id
is generated as an MD5 hash of some of the other data in the record,
and unfortunately the way I converted it to hex meant that sometimes I
get a negative value.  I'm having a real hard time figuring out the
right combination of quotes and escapes so I can actually query these
things using SolrJ.  On the Solr web interface, I can just put in
id:"-3f66fdfb1ef5f8719f65a7403e93cc9d"
which results in a URL like:
http://localhost:8080/solrChunk/nutch/select/?q=id:%22-3f66fdfb1ef5f8719f65a7403e93cc9d%22&version=2.2&start=0&rows=10&indent=on
but when I try that using SolrJ

SolrQuery query = new SolrQuery();
query.setQuery(key + ":" + value);
query.setRows(NUM_AT_A_TIME);

int start = 0;
while (true)
{
    query.setStart(start);
    QueryResponse resp = solrChunkServer.query(query);
    SolrDocumentList docs = resp.getResults();
    LOG.debug("got " + docs.getNumFound() + " documents (or "
        + docs.size() + " if you prefer)");
    if (docs.size() == 0)
        break;

    for (SolrDocument doc : docs)
    {
        retCode.add(doc);
    }
    // A short page means this was the last one.
    if (docs.size() < NUM_AT_A_TIME)
        break;

    start += NUM_AT_A_TIME;
}

I call that using key = id and value =
-3f66fdfb1ef5f8719f65a7403e93cc9d and I get an exception.  If I
change value to \-3f66fdfb1ef5f8719f65a7403e93cc9d\, I get no
results.  If I change value to \-3f66fdfb1ef5f8719f65a7403e93cc9d, I
get no results.  If I change value to
\\-3f66fdfb1ef5f8719f65a7403e93cc9d\, I get no results.
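
For reference, here is a minimal sketch of escaping a value before it goes into a field query, so that a leading minus is read as part of the term rather than as the NOT operator. The class and method names (QueryEscaper, escape, fieldQuery) are hypothetical and the character list is an assumption based on the Lucene query syntax; SolrJ ships its own helper, ClientUtils.escapeQueryChars, which would normally be used instead.

```java
// Hypothetical helper, not part of SolrJ: backslash-escapes characters
// that the Lucene query parser treats as syntax.
final class QueryEscaper {
    // Assumed set of special characters (see the Lucene query syntax docs).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');   // prefix each special character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds "field:escapedValue" for use with SolrQuery.setQuery(...).
    static String fieldQuery(String field, String value) {
        return field + ":" + escape(value);
    }
}
```

With key = "id" and the value above, fieldQuery would produce id:\-3f66fdfb1ef5f8719f65a7403e93cc9d as the query string actually sent to Solr (in Java source the backslash itself is written as "\\").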

If I had hair, I'd be tearing it out right now.

-- 
http://www.linkedin.com/in/paultomblin
http://careers.stackoverflow.com/ptomblin


Re: How to get a field that starts with a minus?

2011-02-20 Thread Paul Tomblin
Yes, it's string:
<fieldType name="string" class="solr.StrField"
           sortMissingLast="true" omitNorms="true"/>
<field name="id" type="string" stored="true" indexed="true"/>

Is there a better field definition I should be using?

On Sun, Feb 20, 2011 at 10:37 AM, Ahmet Arslan iori...@yahoo.com wrote:

 What is the field type of the id field? Is it string?

 What happens when you do:

 q={!raw f=id}-3f66fdfb1ef5f8719f65a7403e93cc9d

 query.setQuery("{!raw f=id}-3f66fdfb1ef5f8719f65a7403e93cc9d");





Re: How to get a field that starts with a minus?

2011-02-20 Thread Paul Tomblin
On Sun, Feb 20, 2011 at 10:15 AM, Paul Tomblin ptomb...@xcski.com wrote:
 I have a field in my database, id, which is the unique key.  The id
 is generated as an MD5 hash of some of the other data in the record,
 and unfortunately the way I converted it to hex meant that sometimes I
 get a negative value.  I'm having a real hard time figuring out the

It turns out that the problem isn't the minus sign; the problem is
that I keep expecting Solr to act like a relational database.  In a
relational database, if you do inserts, your queries will find those
records even if you haven't committed them.  It's only *other*
database connections that won't find the records until you commit.
But evidently in Solr, even the original connection/thread that
inserted the records doesn't see them in a query until you commit them
- I assume that's because a web connection is stateless.  I added a
commit before my query, and now I can find them.
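
The visibility rule described above can be illustrated with a toy model. This is not Solr code: ToyIndex and its methods are hypothetical names, and the only point is that added documents sit in a pending buffer and are invisible to every searcher, including the one that added them, until a commit publishes them. In SolrJ the corresponding step is calling commit() on the server object after adding documents and before querying.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of commit visibility; an illustration only, not how Solr is implemented.
final class ToyIndex {
    private final List<String> pending = new ArrayList<>();     // added, not yet committed
    private final List<String> searchable = new ArrayList<>();  // visible to queries

    void add(String id) {
        pending.add(id);
    }

    // Publishes everything added so far to searches.
    void commit() {
        searchable.addAll(pending);
        pending.clear();
    }

    // Queries only ever see committed documents.
    boolean contains(String id) {
        return searchable.contains(id);
    }
}
```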



Re: How to get a field that starts with a minus?

2011-02-20 Thread Paul Tomblin
As I said in my original post, I'd already tried various methods of escaping:
:call that using key = id and value =
:-3f66fdfb1ef5f8719f65a7403e93cc9d and I get an exception.  If I
:change value to \-3f66fdfb1ef5f8719f65a7403e93cc9d\, I get no
:results.  If I change value to \-3f66fdfb1ef5f8719f65a7403e93cc9d, I
:get no results.  If I change value to
:\\-3f66fdfb1ef5f8719f65a7403e93cc9d\, I get no results.

But it turned out that the values I was querying for had been added
but not committed.

On Sun, Feb 20, 2011 at 11:17 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 He could also just escape it or am i missing something?

 --- On Sun, 2/20/11, Paul Tomblin ptomb...@xcski.com wrote:
  From: Paul Tomblin ptomb...@xcski.com
  Subject: Re: How to get a field that starts with a minus?
  To: solr-user@lucene.apache.org
  Date: Sunday, February 20, 2011, 5:53 PM
  Yes, it's string:
  <fieldType name="string" class="solr.StrField"
             sortMissingLast="true" omitNorms="true"/>
  <field name="id" type="string" stored="true" indexed="true"/>

 No, string is OK. In this case it is better to use the raw or field query
 parser.

 SolrQuery.setQuery("{!raw f=id}-3f66fdfb1ef5f8719f65a7403e93cc9d");






Re: Can't delete from curl

2010-03-09 Thread Paul Tomblin
On Mon, Mar 8, 2010 at 9:39 PM, Lance Norskog goks...@gmail.com wrote:

 ... curl http://xen1.xcski.com:8080/solrChunk/nutch/select

 that should be /update, not /select


Ah, that seems to have fixed it.  Thanks.





Re: Can't delete from curl

2010-03-07 Thread Paul Tomblin
On Tue, Mar 2, 2010 at 1:22 AM, Lance Norskog goks...@gmail.com wrote:

 On Mon, Mar 1, 2010 at 4:02 PM, Paul Tomblin ptomb...@xcski.com wrote:
  I have a schema with a field named category (<field name="category"
  type="string" stored="true" indexed="true"/>).  I'm trying to delete
  everything with a certain value of category with curl:

  I send:

  curl http://localhost:8080/solrChunk/nutch/update -H "Content-Type: text/xml" --data-binary '<delete><query>category:Banks</query></delete>'

  Response is:

  <?xml version="1.0" encoding="UTF-8"?>
  <response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">23</int></lst>
  </response>

  I send:

  curl http://localhost:8080/solrChunk/nutch/update -H "Content-Type: text/xml" --data-binary '<commit/>'

  Response is:

  <?xml version="1.0" encoding="UTF-8"?>
  <response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">1914</int></lst>
  </response>
 
  but when I go back and query, it shows all the same results as before.
 
  Why isn't it deleting?

 Do you query with curl also? If you use a web browser, Solr by default
 uses http caching, so your browser will show you the old result of the
 query.


I think you're right about that.  I tried using curl, and it did go to zero.
 But now I've got a different problem: sometimes when I try to commit, I get
a NullPointerException:


curl http://xen1.xcski.com:8080/solrChunk/nutch/select -H "Content-Type: text/xml" --data-binary '<commit/>'

Apache Tomcat/6.0.20 - Error report
HTTP Status 500 - null

java.lang.NullPointerException
    at java.io.StringReader.<init>(StringReader.java:33)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:173)
    at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:78)
    at org.apache.solr.search.QParser.getQuery(QParser.java:131)
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    at java.lang.Thread.run(Thread.java:619)



Can't delete from curl

2010-03-01 Thread Paul Tomblin
I have a schema with a field named category (<field name="category"
type="string" stored="true" indexed="true"/>).  I'm trying to delete
everything with a certain value of category with curl:

I send:

curl http://localhost:8080/solrChunk/nutch/update -H "Content-Type: text/xml" --data-binary '<delete><query>category:Banks</query></delete>'

Response is:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">23</int></lst>
</response>

I send:

curl http://localhost:8080/solrChunk/nutch/update -H "Content-Type: text/xml" --data-binary '<commit/>'

Response is:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1914</int></lst>
</response>

but when I go back and query, it shows all the same results as before.

Why isn't it deleting?



What does this error mean?

2009-11-27 Thread Paul Tomblin
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
Nov 27, 2009 3:45:35 AM
org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {} 0 634
Nov 27, 2009 3:45:35 AM org.apache.solr.core.SolrCore getSearcher
WARNING: [nutch] Error opening new searcher. exceeded limit of
maxWarmingSearchers=2, try again later.
Nov 27, 2009 3:45:35 AM
org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {} 0 635
Nov 27, 2009 3:45:35 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error opening new
searcher. exceeded limit of maxWarmingSearchers=2, try again later.
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1029)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:418)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:48)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Thread.java:619)

Nov 27, 2009 3:45:35 AM org.apache.solr.core.SolrCore execute
INFO: [nutch] webapp=/solrChunk path=/update
params={waitSearcher=true&commit=true&wt=javabin&waitFlush=true&version=1}
status=503 QTime=634
Nov 27, 2009 3:45:35 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error opening new
searcher. exceeded limit of maxWarmingSearchers=2, try again later.
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1029)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:418)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:48)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
at java.lang.Thread.run(Thread.java:619)

Nov 27, 2009 3:45:35 AM org.apache.solr.core.SolrCore execute
INFO: 

Re: What does this error mean?

2009-11-27 Thread Paul Tomblin
What's a warming query, and how would I know if I'm doing one?  Does
this mean the web server restarted or something?

On Fri, Nov 27, 2009 at 3:25 PM, Matthew Runo mr...@zappos.com wrote:
 It means that there were two warming searchers, and then a commit came in and
 caused a third to try to warm up at the same time. Do you use any warming
 queries, or have large caches?

 Thanks for your time!

 Matthew Runo
 Software Engineer, Zappos.com
 mr...@zappos.com - 702-943-7833


SolrJ looping until I get all the results

2009-11-02 Thread Paul Tomblin
If I want to do a query and only return X number of rows at a time,
but I want to keep querying until I get all the rows, how do I do that?
Can I just keep advancing query.setStart(...) and then checking whether
server.query(query) returns any rows?  Or is there a better way?

Here's what I'm thinking:

final static int MAX_ROWS = 100;
int start = 0;
query.setRows(MAX_ROWS);
while (true)
{
    QueryResponse resp = solrChunkServer.query(query);
    SolrDocumentList docs = resp.getResults();
    if (docs.size() == 0)
        break;

    start += MAX_ROWS;
    query.setStart(start);
}
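
The same start/rows loop can be sketched independently of SolrJ. In the sketch below, PageSource and Pager are hypothetical names: PageSource stands in for "run the query with this start and this row count", and each page is handed to a callback and then dropped, so only one page is held in memory at a time. Stopping on a short page saves the final empty request; it assumes the index is not changing while the loop runs.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a paged query (e.g. setStart/setRows plus query()).
interface PageSource<T> {
    List<T> fetch(int start, int rows);
}

final class Pager {
    // Fetches pages of `rows` items until a short or empty page is seen.
    // Returns the total number of items handed to the handler.
    static <T> int forEachPage(PageSource<T> source, int rows, Consumer<List<T>> handler) {
        int start = 0;
        int total = 0;
        while (true) {
            List<T> page = source.fetch(start, rows);
            if (!page.isEmpty()) {
                handler.accept(page);   // process, then let the page be collected
                total += page.size();
            }
            if (page.size() < rows) {
                break;                  // short page: nothing left to fetch
            }
            start += rows;
        }
        return total;
    }
}
```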





Re: SolrJ looping until I get all the results

2009-11-02 Thread Paul Tomblin
On Mon, Nov 2, 2009 at 8:47 PM, Avlesh Singh avl...@gmail.com wrote:

 I was doing it that way, but what I'm doing with the documents is do
 some manipulation and put the new classes into a different list.
 Because I basically have two times the number of documents in lists,
 I'm running out of memory.  So I figured if I do it 1000 documents at
 a time, the SolrDocumentList will get garbage collected at least.

 You are right w.r.t. all that, but I am surprised that you would need ALL
 the documents from the index for a search requirement.

This isn't a search, this is a search and destroy.  Basically I need
the file names of all the documents that I've indexed in Solr so that
I can delete them.



Re: Scoring algorithm?

2009-10-31 Thread Paul Tomblin
If I change the schema this way, do I need to re-submit all the
documents to Solr?  And if I have them all sitting on disk as XML
files that look like
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<doc>
<field name=".."></field>
<field name=".."></field>
</doc>
is there a quick way to submit them all to Solr?

On Sat, Oct 31, 2009 at 10:04 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Sat, Oct 31, 2009 at 8:48 AM, Paul Tomblin ptomb...@xcski.com wrote:
 Am I right in thinking that a document whose sortable field is only
 two sentences long and contains the search term once will score higher
 than one that is 50 sentences long and contains the search term 4
 times?

 Yep.  Assuming 15 tokens per sentence, doc1 will have
 lengthNorm = 1/(2*15)**.5 or 0.18 with  tf=1**.5 or 1
 doc2 will have
 lengthNorm  = 1/(50*15)**.5 or 0.04 with tf=4**.5 or 2

 Or if you don't want length normalization at all, simply use
 omitNorms="true" in the schema for this field.

  Is there a way to change it to score higher based only on
 number of hits?

 Yes, simply use omitNorms="true" in the schema.xml for this field.

 If you still wanted a lengthNorm, you could change the balance by
 creating a custom similarity and overriding either lengthNorm() or
 tf()

 -Yonik
 http://www.lucidimagination.com
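
The arithmetic quoted above can be reproduced with a few lines of Java. This is only a sketch of the two factors mentioned (lengthNorm = 1/sqrt(number of tokens), tf = sqrt(term frequency)); the class and method names are hypothetical, the 15-tokens-per-sentence figure is the assumption from the quoted message, and the other factors of the real scoring formula are ignored.

```java
// Hypothetical sketch of the two scoring factors discussed above.
final class ScoreSketch {
    // Length normalization: shorter fields get a larger factor.
    static double lengthNorm(int numTokens) {
        return 1.0 / Math.sqrt(numTokens);
    }

    // Term-frequency factor: grows with the square root of the hit count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Product of the two factors for one field of one document.
    static double contribution(int freq, int numTokens) {
        return tf(freq) * lengthNorm(numTokens);
    }
}
```

For the two documents in the example, contribution(1, 2 * 15) is about 0.18 and contribution(4, 50 * 15) is about 0.07, so the short document wins unless norms are omitted.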






Ok, that didn't work

2009-10-31 Thread Paul Tomblin
I was looking at the script in example/exampledocs to feed documents
to the server.

Just to see if it was possible, I took one of the documents that I've
previously indexed using SolrJ, and I tried to feed it directly to the
Solr server using the following command:

curl http://localhost:8697/solrChunk/nutch/update --data-binary
@filename.xml -H 'Content-type:text/xml; charset=utf-8'

And this is what I got back:

Apache Tomcat/6.0.10 - Error report
HTTP Status 500 - null

java.lang.NullPointerException
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:228)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:216)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:634)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:445)
    at java.lang.Thread.run(Thread.java:619)

Re: Ok, that didn't work

2009-10-31 Thread Paul Tomblin
The <add> tag isn't part of the document.  Is there a way to feed the
actual documents without adding tags that aren't part of the schema to
them?
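
One possible workaround, sketched under assumptions, is to leave the files on disk untouched and wrap each one in <add>...</add> only at the moment it is sent. AddWrapper and wrapInAdd are hypothetical names; the sketch assumes each file holds a single <doc> element, optionally preceded by an XML declaration, which is dropped so that it does not end up inside the wrapper.

```java
// Hypothetical helper: wraps the text of a <doc> file in <add>...</add>
// so it can be posted to the XML update handler as-is.
final class AddWrapper {
    static String wrapInAdd(String docXml) {
        String body = docXml.trim();
        // Drop a leading <?xml ... ?> declaration, if present.
        if (body.startsWith("<?xml")) {
            int end = body.indexOf("?>");
            if (end >= 0) {
                body = body.substring(end + 2).trim();
            }
        }
        return "<add>" + body + "</add>";
    }
}
```

The returned string would then be the request body for the /update URL, in place of the raw file contents.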

On Sat, Oct 31, 2009 at 10:43 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
 Hmmm... perhaps you're missing the <add> tag around the <doc>?

 -Yonik
 http://www.lucidimagination.com



 On Sat, Oct 31, 2009 at 10:37 AM, Paul Tomblin ptomb...@xcski.com wrote:
 I was looking at the script in example/exampledocs to feed documents
 to the server.

 Just to see if it was possible, I took one of the documents that I've
 previously indexed using SolrJ, and I tried to feed it directly to the
 Solr server using the following command:

 curl http://localhost:8697/solrChunk/nutch/update --data-binary
 @filename.xml -H 'Content-type:text/xml; charset=utf-8'

 And this is what I got back:

 htmlheadtitleApache Tomcat/6.0.10 - Error
 report/titlestyle!--H1
 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
 H2 
 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
 H3 
 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
 BODY 
 {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}
 B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;}
 P 
 {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A
 {color : black;}A.name {color : black;}HR {color :
 #525D76;}--/style /headbodyh1HTTP Status 500 - null

 java.lang.NullPointerException
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:138)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:228)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:216)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:634)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:445)
        at java.lang.Thread.run(Thread.java:619)
 type: Status report
 message: null, followed by the same java.lang.NullPointerException stack trace as above

Re: Ok, that didn't work

2009-10-31 Thread Paul Tomblin
On Sat, Oct 31, 2009 at 11:08 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
 I personally think it would be cleaner to allow a post of just a <doc>
 (or multiple with a surrounding <docs> tag), esp now that we can put
 modifiers in the URL.

Exactly.  The action should be in the url.

 For now, just use shell scripting I guess:

 $ (echo '<add>'; cat file.xml; echo '</add>') | curl $URL --data-binary @- -H 'Content-type:text/xml; charset=utf-8'


Well, that's cleaner than what I was going to do, which was to use sed
to add the <add></add> tags and then remove them afterwards.  Thanks.
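
For anyone who would rather do the wrapping in Java than in the shell, here is a minimal sketch of the same idea. The class and method names (AddWrapper, wrapInAdd) are made up for illustration, and it assumes the file holds bare <doc> elements with no XML declaration at the top.

```java
public class AddWrapper {
    /**
     * Wraps one or more bare <doc> elements in an <add> envelope.
     * Assumes docXml has no leading <?xml ...?> declaration.
     */
    public static String wrapInAdd(String docXml) {
        return "<add>\n" + docXml.trim() + "\n</add>\n";
    }

    public static void main(String[] args) {
        String doc = "<doc><field name=\"id\">1</field></doc>";
        // Prints the document surrounded by <add> and </add> lines.
        System.out.print(wrapInAdd(doc));
    }
}
```

The resulting string would then be posted to the update URL in place of the raw file.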

-- 
http://www.linkedin.com/in/paultomblin
http://careers.stackoverflow.com/ptomblin


Another question about omitNorms

2009-10-31 Thread Paul Tomblin
In an earlier message, Yonik suggested that I use omitNorms="true" if
I wanted the length of the document to not be counted in the scoring.
The documentation also mentions that it omits index-time boosting.
What does that mean?

-- 
http://www.linkedin.com/in/paultomblin
http://careers.stackoverflow.com/ptomblin


Re: How to access the information from SolrJ

2009-10-02 Thread Paul Tomblin
Nope, that just gets you the number of results returned, not how many
there could be.  Like I said, if you look at the XML returned, you'll
see something like
<result name='response' numFound='1251' start='0'>
but only 10 <doc> elements returned.  getNumFound returns 10 in that case, not 1251.


2009/10/2 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 QueryResponse#getResults()#getNumFound()

 On Thu, Oct 1, 2009 at 11:49 PM, Paul Tomblin ptomb...@xcski.com wrote:
 When I do a query directly from the web, the XML of the response
 includes how many results would have been returned if it hadn't
 restricted itself to the first 10 rows:

 For instance, the query:
 http://localhost:8080/solrChunk/nutch/select/?q=*:*&fq=category:mysites
 returns:
 <response>
 <lst name='responseHeader'>
 <int name='status'>0</int>
 <int name='QTime'>0</int>
 <lst name='params'>
 <str name='q'>*:*</str>
 <str name='fq'>category:mysites</str>
 </lst>
 </lst>
 <result name='response' numFound='1251' start='0'>
 <doc>
 <str name='category'>mysites</str>
 <long name='chunkNum'>0</long>
 <str name='chunkUrl'>http://localhost/Chunks/mysites/0-http___xcski.com_.xml</str>
 <str name='concept'>Anatomy</str>
 ...

 The value I'm talking about is in the numFound attribute of the <result>
 tag.

 I don't see any way to retrieve it through SolrJ - it's not in the
 QueryResponse.getHeader(), for instance.  Can I retrieve it somewhere?

 --
 http://www.linkedin.com/in/paultomblin




 --
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com




-- 
http://www.linkedin.com/in/paultomblin


Re: How to access the information from SolrJ

2009-10-02 Thread Paul Tomblin
On Fri, Oct 2, 2009 at 3:13 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 On Fri, Oct 2, 2009 at 8:11 PM, Paul Tomblin ptomb...@xcski.com wrote:

 Nope, that just gets you the number of results returned, not how many
 there could be.  Like I said, if you look at the XML returned, you'll
 see something like
 <result name='response' numFound='1251' start='0'>
 but only 10 <doc> elements returned.  getNumFound returns 10 in that case, not 1251.



 Nope. Check again. getNumFound will definitely give you 1251.
 SolrDocumentList#size() will give you 10.

I don't have to check again.  I put this log into my query code:
QueryResponse resp = solrChunkServer.query(query);
SolrDocumentList docs = resp.getResults();
LOG.debug("got " + docs.getNumFound() + " documents (or "
+ docs.size() + " if you prefer)");
and I got exactly the same number in both places every single time.  I
can verify from the URL line that the following query:

http://test.xcski.com:8080/solrChunk/nutch/select/?q=test&fq=category:pharma&fq=concept:Discovery&rows=5

has a <result name='response' numFound='95' start='0'> but when I do
the same in SolrJ, docs.getNumFound() returns 5.

144652 [http-8080-14] DEBUG com.lucidityworks.solr.Solr  - got 5
documents (or 5 if you prefer)


-- 
http://www.linkedin.com/in/paultomblin


Re: How to access the information from SolrJ

2009-10-02 Thread Paul Tomblin
LucidityWorks.com is my client.  The similarity to lucid is purely coincidental 
- the client didn't even know I was going to choose Solr.  I am using Solr 
trunk, last updated and compiled a few weeks ago.

-- Sent from my Palm Prē
Shalin Shekhar Mangar wrote:

On Sat, Oct 3, 2009 at 1:09 AM, Paul Tomblin ptomb...@xcski.com wrote:

  Nope. Check again. getNumFound will definitely give you 1251.
  SolrDocumentList#size() will give you 10.

 I don't have to check again.  I put this log into my query code:
 QueryResponse resp = solrChunkServer.query(query);
 SolrDocumentList docs = resp.getResults();
 LOG.debug("got " + docs.getNumFound() + " documents (or "
 + docs.size() + " if you prefer)");
 and I got exactly the same number in both places every single time.  I
 can verify from the URL line that the following query:

 http://test.xcski.com:8080/solrChunk/nutch/select/?q=test&fq=category:pharma&fq=concept:Discovery&rows=5

 has a <result name='response' numFound='95' start='0'> but when I do
 the same in SolrJ, docs.getNumFound() returns 5.

 144652 [http-8080-14] DEBUG com.lucidityworks.solr.Solr  - got 5
 documents (or 5 if you prefer)

I can tell you for sure that this is not a bug in Solr 1.3 or trunk. I
checked the code and it is being set correctly. Moreover, I'm using both in
production.

The class com.lucidityworks.solr.Solr suggests that you are using Lucid's
solr build. Perhaps that has a bug? Can you try this with the Solrj client
in the official 1.3 release or even trunk?

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to access the information from SolrJ

2009-10-02 Thread Paul Tomblin
On Fri, Oct 2, 2009 at 5:04 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 Can you try this with the Solrj client
 in the official 1.3 release or even trunk?

I did a svn update to 821188 and that seems to have fixed the problem.
 (The jar files changed from -1.3.0 to -1.4-dev)  I guess it's been
longer since I did an update than I thought.

logs/catalina.out:318916 [http-8080-1] DEBUG
com.lucidityworks.solr.Solr  - got 138 documents (or 15 if you prefer)

Thanks very much.
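
With getNumFound() reporting the total and size() reporting only the current page, fetching everything comes down to stepping the start offset until the total is covered. A minimal sketch of that arithmetic follows; PageOffsets and startOffsets are hypothetical names, and the real query calls (setStart, setRows) are only mentioned in the comments.

```java
import java.util.ArrayList;
import java.util.List;

public class PageOffsets {
    /**
     * Returns the start offsets needed to page through numFound hits,
     * rows at a time. Each offset would be passed to query.setStart(...)
     * with query.setRows(rows) left unchanged.
     */
    public static List<Long> startOffsets(long numFound, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        List<Long> starts = new ArrayList<>();
        for (long start = 0; start < numFound; start += rows) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Numbers taken from the log line above: 138 found, 15 per page.
        System.out.println(startOffsets(138, 15));
    }
}
```

In practice the total can change between requests, so a loop would normally re-read getNumFound() on each response rather than trust the first value.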


-- 
http://www.linkedin.com/in/paultomblin


What to set in query.setMaxRows()?

2009-10-01 Thread Paul Tomblin
Sorry about asking this here, but I can't reach wiki.apache.org right now.
 What do I set in query.setMaxRows() to get all the rows?

-- 
http://www.linkedin.com/in/paultomblin


Correction: query.setRows

2009-10-01 Thread Paul Tomblin
Sorry, in my last question I meant setRows, not setMaxRows.  What do I pass to 
setRows to get all matches, not just the first 10?

-- Sent from my Palm Prē



How to access the information from SolrJ

2009-10-01 Thread Paul Tomblin
When I do a query directly from the web, the XML of the response
includes how many results would have been returned if it hadn't
restricted itself to the first 10 rows:

For instance, the query:
http://localhost:8080/solrChunk/nutch/select/?q=*:*&fq=category:mysites
returns:
<response>
<lst name='responseHeader'>
<int name='status'>0</int>
<int name='QTime'>0</int>
<lst name='params'>
<str name='q'>*:*</str>
<str name='fq'>category:mysites</str>
</lst>
</lst>
<result name='response' numFound='1251' start='0'>
<doc>
<str name='category'>mysites</str>
<long name='chunkNum'>0</long>
<str name='chunkUrl'>http://localhost/Chunks/mysites/0-http___xcski.com_.xml</str>
<str name='concept'>Anatomy</str>
...

The value I'm talking about is in the numFound attribute of the <result> tag.

I don't see any way to retrieve it through SolrJ - it's not in the
QueryResponse.getHeader(), for instance.  Can I retrieve it somewhere?

--
http://www.linkedin.com/in/paultomblin


Solr highlighting doesn't respect quotes

2009-09-24 Thread Paul Tomblin
If I do a query for a couple of words in quotes, Solr correctly returns only
pages where those words appear exactly as quoted.  But the
highlighting acts as if the words were given separately, and stems them and
everything.  For example, if I search for "knee pain", it returns a document
that has the phrase "knee pain", and doesn't return documents that have "knee"
and "pain" with other words between them.  However, with highlighting
turned on, the highlighted field will have "knee", "knees", "pain" and
"pains" highlighted even when they aren't next to each other.
For instance:
<response><lst name='responseHeader'><int name='status'>0</int>
<int name='QTime'>45</int>
<lst name='params'><str name='explainOther'/>
<str name='fl'>*,score</str>
<str name='indent'>on</str>
<str name='start'>0</str>
<str name='q'>"knee pain"</str>
<str name='hl.fl'>text</str>
<str name='qt'>standard</str>
<str name='wt'>standard</str>
<str name='hl'>on</str>
<str name='rows'>10</str>
<str name='version'>2.2</str>
</lst>
</lst>

<lst name='2:http://news.prnewswire.com/DisplayReleaseContent.aspx?ACCT=ind_focus.story&amp;STORY=/www/story/09-24-2009/0005100306&amp;EDATE='>
<arr name='text'><str>I had one injection in each &lt;em&gt;knee&lt;/em&gt; and
my doctor said it could relieve my &lt;em&gt;knee&lt;/em&gt; &lt;em&gt;pain&lt;/em&gt;
for up to six</str>
</arr>
</lst>

-- 
http://www.linkedin.com/in/paultomblin


Re: Highlighting in SolrJ?

2009-09-13 Thread Paul Tomblin
Thanks to Jay, I have my code doing what I need it to do.  If anybody
cares, this is my code:

SolrQuery query = new SolrQuery();
query.setQuery(searchTerm);
query.addFilterQuery(Chunk.SOLR_KEY_CONCEPT + ":" + concept);
query.addFilterQuery(Chunk.SOLR_KEY_CATEGORY + ":" + category);
if (maxChunks > 0)
    query.setRows(maxChunks);

// Set highlighting fields
query.setHighlight(true);
query.setHighlightFragsize(0);
query.addHighlightField(Chunk.SOLR_KEY_TEXT);
query.setHighlightSnippets(1);
query.setHighlightSimplePre("<b>");
query.setHighlightSimplePost("</b>");

QueryResponse resp = solrChunkServer.query(query);
SolrDocumentList docs = resp.getResults();
retCode = new ArrayList<Chunk>(docs.size());
for (SolrDocument doc : docs)
{
    LOG.debug("got doc " + doc);
    Chunk chunk = new Chunk(doc);

    // retrieve highlighting
    List<String> highlights =
        resp.getHighlighting().get(chunk.getId()).get(Chunk.SOLR_KEY_TEXT);
    if (highlights != null && highlights.size() > 0)
        chunk.setHighlighted(highlights.get(0));

    retCode.add(chunk);
}
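
One thing to watch in code like the above: the highlighting comes back as a map from document id to a map from field name to snippets, and a document may have no entry at all, in which case the chained get() calls would throw. A minimal null-safe lookup, written against plain java.util maps so it runs without SolrJ (HighlightLookup and firstSnippet are hypothetical names):

```java
import java.util.List;
import java.util.Map;

public class HighlightLookup {
    /**
     * Returns the first highlighted snippet for the given document and field,
     * or null if the document, the field, or the snippet list is missing.
     * The map shape matches what QueryResponse.getHighlighting() returns.
     */
    public static String firstSnippet(Map<String, Map<String, List<String>>> highlighting,
                                      String docId, String field) {
        if (highlighting == null) {
            return null;
        }
        Map<String, List<String>> perDoc = highlighting.get(docId);
        if (perDoc == null) {
            return null;
        }
        List<String> snippets = perDoc.get(field);
        if (snippets == null || snippets.isEmpty()) {
            return null;
        }
        return snippets.get(0);
    }
}
```

A caller would pass resp.getHighlighting(), the document's unique key, and the highlighted field name, and fall back to the stored field value when the result is null.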



-- 
http://www.linkedin.com/in/paultomblin


Re: Highlighting in SolrJ?

2009-09-11 Thread Paul Tomblin
What I want is the whole text of that field with every instance of the
search term highlighted, even if the search term only occurs in the
first line of a 300 page field.  I'm not sure if mergeContiguous will
do that, or if it will miss everything after the last line that
contains the search term.

On Fri, Sep 11, 2009 at 10:42 AM, Jay Hill jayallenh...@gmail.com wrote:
 It's really just a matter of what your intentions are. There are an awful
 lot of highlighting params and so highlighting is very flexible and
 customizable. Regarding snippets, as an example Google presents two snippets
 in results, which is fairly common. I'd recommend doing a lot of
 experimenting by changing the params on the query string to get what you
 want, and then setting them up in SolrJ. The example I sent was intended to
 be a generic starting point and mostly just to show how to set highlighting
 params and how to get back a List of highlighting results.

 -Jay
 http://www.lucidimagination.com


 On Thu, Sep 10, 2009 at 5:40 PM, Paul Tomblin ptomb...@xcski.com wrote:

 If I set snippets to 9 and mergeContiguous to true, will I get
 the entire contents of the field with all the search terms replaced?
 I don't see what good it would be just getting one line out of the
 whole field as a snippet.

 On Thu, Sep 10, 2009 at 7:45 PM, Jay Hill jayallenh...@gmail.com wrote:
  Set up the query like this to highlight a field named "content":
 
     SolrQuery query = new SolrQuery();
     query.setQuery("foo");
 
     query.setHighlight(true).setHighlightSnippets(1); //set other params as needed
     query.setParam("hl.fl", "content");
 
     QueryResponse queryResponse = getSolrServer().query(query);
 
  Then to get back the highlight results you need something like this:
 
     Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
 
     while (iter.hasNext()) {
       SolrDocument resultDoc = iter.next();
 
       String content = (String) resultDoc.getFieldValue("content");
       String id = (String) resultDoc.getFieldValue("id"); //id is the uniqueKey field
 
       if (queryResponse.getHighlighting().get(id) != null) {
         List<String> highlightSnippets =
             queryResponse.getHighlighting().get(id).get("content");
       }
     }
 
  Hope that gets you what you need.
 
  -Jay
  http://www.lucidimagination.com
 
  On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin ptomb...@xcski.com
 wrote:
 
  Can somebody point me to some sample code for using highlighting in
  SolrJ?  I understand the highlighted versions of the field comes in a
  separate NamedList?  How does that work?
 
  --
  http://www.linkedin.com/in/paultomblin
 
 



 --
 http://www.linkedin.com/in/paultomblin





-- 
http://www.linkedin.com/in/paultomblin


Highlighting in SolrJ?

2009-09-10 Thread Paul Tomblin
Can somebody point me to some sample code for using highlighting in
SolrJ?  I understand the highlighted versions of the field comes in a
separate NamedList?  How does that work?

-- 
http://www.linkedin.com/in/paultomblin


Re: Highlighting in SolrJ?

2009-09-10 Thread Paul Tomblin
If I set snippets to 9 and mergeContiguous to true, will I get
the entire contents of the field with all the search terms replaced?
I don't see what good it would be just getting one line out of the
whole field as a snippet.

On Thu, Sep 10, 2009 at 7:45 PM, Jay Hill jayallenh...@gmail.com wrote:
 Set up the query like this to highlight a field named "content":

    SolrQuery query = new SolrQuery();
    query.setQuery("foo");

    query.setHighlight(true).setHighlightSnippets(1); //set other params as needed
    query.setParam("hl.fl", "content");

    QueryResponse queryResponse = getSolrServer().query(query);

 Then to get back the highlight results you need something like this:

    Iterator<SolrDocument> iter = queryResponse.getResults().iterator();

    while (iter.hasNext()) {
      SolrDocument resultDoc = iter.next();

      String content = (String) resultDoc.getFieldValue("content");
      String id = (String) resultDoc.getFieldValue("id"); //id is the uniqueKey field

      if (queryResponse.getHighlighting().get(id) != null) {
        List<String> highlightSnippets =
            queryResponse.getHighlighting().get(id).get("content");
      }
    }

 Hope that gets you what you need.

 -Jay
 http://www.lucidimagination.com

 On Thu, Sep 10, 2009 at 3:19 PM, Paul Tomblin ptomb...@xcski.com wrote:

 Can somebody point me to some sample code for using highlighting in
 SolrJ?  I understand the highlighted versions of the field comes in a
 separate NamedList?  How does that work?

 --
 http://www.linkedin.com/in/paultomblin





-- 
http://www.linkedin.com/in/paultomblin


Can't delete with a fq?

2009-09-09 Thread Paul Tomblin
I'm trying to delete using SolrJ's deleteByQuery, but it doesn't like
it that I've added an fq parameter.  Here's what I see in the logs:

Sep 9, 2009 1:46:13 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.queryParser.ParseException: Cannot parse
'url:http\:\/\/xcski\.com\/pharma\/&fq=category:pharma': Encountered
":" at line 1, column 46.
Was expecting one of:
    <EOF>
    <AND> ...
    <OR> ...
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    "^" ...
    <QUOTED> ...
    <TERM> ...
    <FUZZY_SLOP> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...

at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:173)
at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:75)
at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:64)
...

Should I rewrite that query to be "url:http:... AND category:pharma"?


-- 
http://www.linkedin.com/in/paultomblin


Re: Can't delete with a fq?

2009-09-09 Thread Paul Tomblin
On Wed, Sep 9, 2009 at 2:07 PM, AHMET ARSLAN iori...@yahoo.com wrote:

 --- On Wed, 9/9/09, Paul Tomblin ptomb...@xcski.com wrote:

 SEVERE: org.apache.lucene.queryParser.ParseException:
 Cannot parse
 'url:http\:\/\/xcski\.com\/pharma\/&fq=category:pharma':

 Should I rewrite that query to be "url:http:... AND
 category:pharma"?
 Yes, because url:http\:\/\/xcski\.com\/pharma\/&fq=category:pharma is not a 
 valid query.


It works perfectly well as a query:

http://localhost:8080/solrChunk/nutch/select/?q=url:http\:\/\/xcski\.com\/pharma\/&fq=category:pharma

retrieved all the documents I wanted to delete.
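
The difference, as far as I can tell, is that in a select URL fq is a separate request parameter, while deleteByQuery takes a single query string, so the filter has to be folded into the query itself. A minimal sketch of building that combined string (DeleteQueryBuilder and and() are hypothetical names; both inputs are assumed to be already escaped):

```java
public class DeleteQueryBuilder {
    /**
     * Combines a main query and a filter into one query string,
     * parenthesised so that operators inside either part keep their scope.
     */
    public static String and(String mainQuery, String filterQuery) {
        return "(" + mainQuery + ") AND (" + filterQuery + ")";
    }

    public static void main(String[] args) {
        // The result is what would be handed to deleteByQuery(...).
        System.out.println(and("url:http\\:\\/\\/xcski.com\\/pharma\\/", "category:pharma"));
    }
}
```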

-- 
http://www.linkedin.com/in/paultomblin


Using scoring from another program

2009-09-03 Thread Paul Tomblin
Every document I put into Solr has a field origScore which is a
floating point number between 0 and 1 that represents a score assigned
by the program that generated the document.  I would like it that when
I do a query, it uses that origScore in the scoring, perhaps
multiplying the Solr score to find a weighted score and using that to
determine which are the highest scoring matches.  Can I do that?

-- 
http://www.linkedin.com/in/paultomblin


Viewing xml in Safari

2009-09-02 Thread Paul Tomblin
Slightly off topic, but I'm getting tired of hitting the 'view source' keyboard 
shortcut every time I do a solr query.  Is there a way to make Safari display 
xml as-is?

-- Sent from my Palm Prē



Re: Ok, why isn't this working?

2009-08-28 Thread Paul Tomblin
On Fri, Aug 28, 2009 at 6:42 AM, Shalin Shekhar
Mangarshalinman...@gmail.com wrote:
 Ok, I've spotted the problem - while SolrHome is in the right place,
 it's still looking for the data in
 /Users/ptomblin/apache-tomcat-6.0.20/solr/data/

 How can I change that?


 One easy way is to hard code that location into solrconfig.xml ?

The conf file says:
 <dataDir>${solr.data.dir:./solr/data}</dataDir>
That indicates to me that there is some way to override that default
./solr/data involving something called solr.data.dir, but I don't know
if that's an environment variable, or a system property, or what.


-- 
http://www.linkedin.com/in/paultomblin


Re: Ok, why isn't this working?

2009-08-28 Thread Paul Tomblin
On Fri, Aug 28, 2009 at 8:04 AM, Chantal
Ackermannchantal.ackerm...@btelligent.de wrote:
 Paul Tomblin schrieb:
 The conf file says:
  <dataDir>${solr.data.dir:./solr/data}</dataDir>
 That indicates to me that there is some way to override that default
 ./solr/data involving something called solr.data.dir, but I don't know
 if that's an environment variable, or a system property, or what.

 hi Paul,

 it can be specified in the many ways solr.home can.
 e.g. on startup add -Dsolr.data.dir=/path/to/your/data/dir

 or replace/extend the value in solrconfig.xml.

 the default is in solr.home/data, imho.

I discovered that I can set the environment variable in
conf/Catalina/localhost/solr.xml

<Context docBase="/Users/ptomblin/src/solr/example/webapps/solr.war"
debug="0" crossContext="true" >
   <Environment name="solr/home" type="java.lang.String"
value="/Users/ptomblin/src/lucidity/solr" override="true" />
   <Environment name="solr.data.dir" type="java.lang.String"
value="/Users/ptomblin/src/lucidity/solr/data" override="true" />
</Context>
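
For anyone else puzzled by the ${name:default} form in solrconfig.xml, here is a minimal sketch of how a placeholder like that can be resolved against Java system properties, which is what the -D option sets. This is not Solr's own code; PlaceholderResolver and resolve are made-up names, and the real lookup may consult other sources as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderResolver {
    // Matches ${name} or ${name:default}.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    /**
     * Replaces each placeholder with the system property of that name,
     * falling back to the default after the colon (or "" if there is none).
     */
    public static String resolve(String text) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fallback = m.group(2) != null ? m.group(2) : "";
            String value = System.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Run with -Dsolr.data.dir=/some/path to see the override take effect.
        System.out.println(resolve("<dataDir>${solr.data.dir:./solr/data}</dataDir>"));
    }
}
```

Read this way, ./solr/data is only the fallback, which would explain why the data ended up under Tomcat's working directory when nothing set solr.data.dir.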


-- 
http://www.linkedin.com/in/paultomblin


Re: Why isn't this working?

2009-08-28 Thread Paul Tomblin
On Thu, Aug 27, 2009 at 11:36 PM, Ryan McKinleyryan...@gmail.com wrote:
 Say you have an embedded solr server and an http solr server pointed to the
 same location.
 1.  make sure one is read only!  otherwise you can make a mess.
 2. calling commit on the embedded solr instance, will not have any effect on
 the http instance UNTIL you call commit (reload) on the http instance.

Well, that's kind of klugy.  I think what I'm going to do instead is
have the cron job detect if the web server is running, and use it if
it is, otherwise use the embedded server.


-- 
http://www.linkedin.com/in/paultomblin


Multiple cores

2009-08-28 Thread Paul Tomblin
I'm trying to instantiate multiple cores.  Since nothing is different
between the two cores except the schema and the data dir, I was hoping
to share the same instanceDir.  Solr seems to recognize that there are
two cores, and gives me two different admin pages.  But unfortunately
both the admin pages are pointing to the same data dir and same
schema.

My solr.xml file looks like:

<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="chunks" instanceDir=".">
      <property name="dataDir" value="./data"/>
      <property name="schemaName" value="schema.xml"/>
    </core>
    <core name="meta" instanceDir=".">
      <property name="dataDir" value="./meta.data/"/>
      <property name="schemaName" value="metaschema.xml"/>
    </core>
  </cores>
</solr>

As well as the property "dataDir", I've also tried "solr.data.dataDir"
and I've also tried putting it as an attribute in the core tag, like
<core name="meta" instanceDir="." dataDir="./meta.data/">

Any help?
-- 
http://www.linkedin.com/in/paultomblin


Re: Updating a solr record

2009-08-27 Thread Paul Tomblin
On Thu, Aug 27, 2009 at 1:27 PM, Eric
Pughep...@opensourceconnections.com wrote:
 You can just query Solr, find the records that you want (including all
 the website data).  Update them, and then send the entire record back.


Correct me if I'm wrong, but I think you'd end up losing the fields
that are indexed but not stored.


-- 
http://www.linkedin.com/in/paultomblin


Can solr do the equivalent of select distinct(field)?

2009-08-27 Thread Paul Tomblin
Can I get all the distinct values from the Solr database, or do I
have to select everything and aggregate it myself?

-- 
http://www.linkedin.com/in/paultomblin


Ok, why isn't this working?

2009-08-27 Thread Paul Tomblin
I've loaded some data into my solr using the embedded server, and I
can see the data using Luke.  I start up the web app, and it says

cwd=/Users/ptomblin/apache-tomcat-6.0.20 
SolrHome=/Users/ptomblin/src/lucidity/solr/

I hit the schema button and it shows the correct schema.  However,
if I type anything into the query window, it never returns anything.
I've tried things that I know for sure are in the default search
field, but all I get back is

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
 <lst name="params">
  <str name="indent">on</str>
  <str name="start">0</str>
  <str name="q">scientist</str>
  <str name="rows">10</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>

How can I figure out why I'm not getting any results back?  Any log
files I can look at?

-- 
http://www.linkedin.com/in/paultomblin


Re: Ok, why isn't this working?

2009-08-27 Thread Paul Tomblin
On Thu, Aug 27, 2009 at 9:24 PM, Paul Tomblinptomb...@xcski.com wrote:
cwd=/Users/ptomblin/apache-tomcat-6.0.20 
SolrHome=/Users/ptomblin/src/lucidity/solr/


Ok, I've spotted the problem - while SolrHome is in the right place,
it's still looking for the data in
/Users/ptomblin/apache-tomcat-6.0.20/solr/data/

How can I change that?


-- 
http://www.linkedin.com/in/paultomblin


Why isn't this working?

2009-08-27 Thread Paul Tomblin
Yesterday or the day before, I asked specifically if I would need to
restart the Solr server if somebody else loaded data into the Solr
index using the EmbeddedServer, and I was told confidently that no,
the Solr server would see the new data as soon as it was committed.
So today I fired up the Solr server (and after making
apache-tomcat-6.0.20/solr/data a symlink to where the Solr data really
lives and restarting the web server), and did some queries.  Then I
ran a program that loaded a bunch of data and committed it.  Then I
did the queries again.  And the new data is NOT showing.  Using Luke,
I can see 10022 documents in the index, but the Solr statistics page
(http://localhost:8080/solrChunk/admin/stats.jsp) is still showing
8677, which is how many there were before I reloaded the data.

So am I doing something wrong, or was the assurance I got yesterday
that this is possible wrong?

-- 
http://www.linkedin.com/in/paultomblin


SolrJ and Solr web simultaneously?

2009-08-26 Thread Paul Tomblin
Is Solr like an RDBMS in that I can have multiple programs querying and
updating the index at once, and everybody else will see the updates
after a commit, or do I have to do something explicit to see others'
updates?  Does it matter whether they're using the web interface,
SolrJ with a CommonsHttpSolrServer or SolrJ with an EmbeddedSolrServer?


-- 
http://www.linkedin.com/in/paultomblin


Re: Wildcard seaches?

2009-08-20 Thread Paul Tomblin
On Thu, Aug 20, 2009 at 10:51 AM, Andrew Cleggandrew.cl...@gmail.com wrote:
 Paul Tomblin wrote:

 Is there such a thing as a wildcard search?  If I have a simple
 solr.StrField with no analyzer defined, can I query for "foo*" or
 "foo.*" and get everything that starts with "foo" such as "foobar" and
 "foobaz"?


 Yes. foo* is fine even on a simple string field.

Ah, I discovered what was going wrong - I was passing the url to
ClientUtils.escapeQueryChars, and that was escaping the *.  I have to
pass the URL without the * to escapeQueryChars, then tack the * on the
end.
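
In code, the order of operations looks roughly like this. The escape method here is a simplified stand-in I wrote so the example runs on its own; it is not ClientUtils.escapeQueryChars, and its character list is an assumption. PrefixQuery and prefixQuery are hypothetical names.

```java
public class PrefixQuery {
    // Characters treated as query syntax in this sketch (an assumed list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;";

    /** Simplified stand-in for an escaping helper: backslash-escapes syntax characters. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Escapes the literal prefix first, then appends the unescaped wildcard. */
    public static String prefixQuery(String field, String prefix) {
        return field + ":" + escape(prefix) + "*";
    }

    public static void main(String[] args) {
        System.out.println(prefixQuery("url", "http://xcski.com/"));
    }
}
```

Escaping the whole string, wildcard included, would turn the * into a literal character, which is the mistake described above.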

Thanks.

-- 
http://www.linkedin.com/in/paultomblin


Re: Shutdown Solr

2009-08-19 Thread Paul Tomblin
On Wed, Aug 19, 2009 at 2:43 PM, Fuad Efendif...@efendi.ca wrote:
 Most probably Ctrl-C is graceful for Tomcat, and kill -9 too... Tomcat is
 smart... I prefer /etc/init.d/my_tomcat wrapper around catalina.sh (su
 tomcat, /var/lock etc...) - ok then, Graceful Shutdown depends on how you
 started Tomcat.

*No* application is graceful for kill -9.  The whole point of kill
-9 is that it's uncatchable.


-- 
http://www.linkedin.com/in/paultomblin


Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
I've got <defaultSearchField>text</defaultSearchField> and so if I
do an unqualified search it only looks in the field "text".  If I want
to search "title", I can do "title:foo", but what if I want to find out
whether the search term is in any field, or if it's in "text" or "title" or
"concept" or "keywords"?  I already tried "*:foo", but that throws an
exception:

Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.common.SolrException: undefined field *
 [java] at
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:161)
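
One way to cover a known list of fields without touching the schema is to spell the term out once per field and join the clauses with OR. A minimal sketch, assuming the term is a single word that is already escaped (AnyFieldQuery and anyOf are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;

public class AnyFieldQuery {
    /** Builds "f1:term OR f2:term OR ..." for the given fields. */
    public static String anyOf(String term, String... fields) {
        List<String> clauses = new ArrayList<>();
        for (String field : fields) {
            clauses.add(field + ":" + term);
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(anyOf("foo", "text", "title", "concept", "keywords"));
    }
}
```

This only works for fields named explicitly; it does not give a true "any field" search.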


-- 
http://www.linkedin.com/in/paultomblin


Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
So if I want to make it so that the default search always searches
three specific fields, I can make another field multi-valued that they
are all copied into?

On Tue, Aug 18, 2009 at 10:46 AM, Marco Westermannm...@intersales.de wrote:
 I would say, you should use the copyField tag in the schema. eg:

 <copyField source="sku" dest="text"/>

 the text-field has to be defined as multiValued="true". When you now do an
 unqualified search, it will search every field, which is copied to the
 text-field.



-- 
http://www.linkedin.com/in/paultomblin


Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Paul Tomblin
On Tue, Aug 18, 2009 at 11:04 AM, Marco Westermannm...@intersales.de wrote:
 exactly! for example you could create a field called all. And you copy
 your fields to it, which should be searched, when all fields are searched.


Awesome, that worked great.  I made my "all" field stored="false"
indexed="true" and I can search for a term that I know is in any of
the key fields and it finds it.

Thanks.


-- 
http://www.linkedin.com/in/paultomblin


SolrJ question

2009-08-17 Thread Paul Tomblin
If I put an object into a SolrInputDocument and store it, how do I
query for it back?  For instance, I stored a java.net.URI in a field
called "url", and I want to query for all the documents that match a
particular URI.  The query syntax only seems to allow Strings, and if
I just try query.setQuery("url:" + uri.toString()) I get an error
because of the colon after "http" in the URI.

I'm really new to Solr, so please let me know if I'm missing something
basic here.

-- 
http://www.linkedin.com/in/paultomblin


Re: SolrJ question

2009-08-17 Thread Paul Tomblin
On Mon, Aug 17, 2009 at 5:28 PM, Harsch, Timothy J. (ARC-SC)[PEROT
SYSTEMS]timothy.j.har...@nasa.gov wrote:
 Assuming you have written the SolrInputDocument to the server, you would next 
 query.

I'm sorry, I don't understand what you mean by "you would next query".
 There appear to be some words missing from that sentence.



-- 
http://www.linkedin.com/in/paultomblin


Re: SolrJ question

2009-08-17 Thread Paul Tomblin
On Mon, Aug 17, 2009 at 5:30 PM, Ensdorf Kenensd...@zoominfo.com wrote:
 You can escape the string with

 org.apache.lucene.queryParser.QueryParser.escape(String query)

 http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/queryParser/QueryParser.html#escape%28java.lang.String%29
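
As a rough illustration of what such an escape step does, here is a
minimal hand-rolled sketch.  The class name, the helper escape() and the
list of special characters are assumptions made for illustration; this is
not the Lucene implementation, and in real code the library method above
is the one to call.

```java
import java.net.URI;

class QueryEscapeSketch {
    // Assumed list of characters the Lucene query syntax treats as special.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Hypothetical helper: put a backslash in front of each special character.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The object has to be turned into a String before it can go in a query.
        URI uri = URI.create("http://xcski.com/pharma/");
        System.out.println("url:" + escape(uri.toString()));
        // prints url:http\://xcski.com/pharma/
    }
}
```

Only the colon inside the value is escaped; the colon that separates the
field name from the value is added afterwards and stays as it is.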


Does this mean I should have converted my objects to string before
writing them to the server?

-- 
http://www.linkedin.com/in/paultomblin


Re: SolrJ question

2009-08-17 Thread Paul Tomblin
On Mon, Aug 17, 2009 at 5:36 PM, Ensdorf Kenensd...@zoominfo.com wrote:
 Does this mean I should have converted my objects to string before
 writing them to the server?


 I believe SolrJ takes care of that for you by calling toString(), but you 
 would need to convert explicitly when you query (and then escape).


Hmmm.  It's not working right.  I've added 5 documents, 3 with the
URL set to "http://xcski.com/pharma/" and 2 with the URL set to
"http://xcski.com/nano/".  Doing other sorts of queries seems to be
pulling back the right data:

 [DEBUG] 34:20 (Solr.java:getForConcept:116)
 [java] search term = fribbet, concept = pharma
 [java]
 [java] Aug 17, 2009 5:34:20 PM org.apache.solr.core.SolrCore execute
 [java] INFO: [] webapp=null path=/select
params={q=fribbet&fq=concept%3Apharma} hits=1 status=0 QTime=9
 [java] [DEBUG] 34:20 (Solr.java:getForConcept:130)
 [java] got doc SolrDocument[{id=2:http://xcski.com/pharma/,
concept=pharma, text=this is a third big long chunk of text containing
the word fribbet, title=this is the third title, keywords=pills,drugs,
origDoctype=html, chunkNum=2, url=http://xcski.com/pharma/}]

 But if I want to restrict it to a specific URL, I use

    SolrQuery query = new SolrQuery();
    query.setQuery("url:" + ClientUtils.escapeQueryChars(url));

and it's not returning anything.  Log4j output looks like:

 [java] [DEBUG] 34:20 (Solr.java:getAllForURL:89)
 [java] getting for URL: http://xcski.com/nano/
 [java]
 [java] Aug 17, 2009 5:34:20 PM org.apache.solr.core.SolrCore execute
 [java] INFO: [] webapp=null path=/select
params={q=url%3Ahttp%5C%3A%5C%2F%5C%2Fxcski%5C.com%5C%2Fnano%5C%2F}
hits=0 status=0 QTime=16
 [java] [DEBUG] 34:20 (Solr.java:main:229)
 [java] found: 0

Actually, looking at that, it looks like it's escaped the URL twice,
converting ":" into "%3A", then converting that to "%5C%3A".  Could
that be?
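
For what it's worth, the parameter in the log is what one round of query
escaping followed by ordinary URL encoding would produce: "%5C" is simply
the URL-encoded backslash that the escaping put in front of each
character.  A minimal sketch using only the JDK follows.  The class and
method names are hypothetical, and the regex that escapes every non-word
character is an assumption inferred from the log line, not the library's
actual code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class EscapeThenEncodeSketch {
    // Assumed stand-in for the escaping step: backslash-escape every
    // non-word character, which is what the logged parameter suggests.
    static String escapeNonWord(String s) {
        return s.replaceAll("(\\W)", "\\\\$1");
    }

    // Build "field:escapedValue" and URL-encode it as an HTTP parameter value.
    static String encodedParam(String field, String value) {
        String query = field + ":" + escapeNonWord(value);
        return URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encodedParam("url", "http://xcski.com/nano/"));
        // prints url%3Ahttp%5C%3A%5C%2F%5C%2Fxcski%5C.com%5C%2Fnano%5C%2F
    }
}
```

The output matches the q parameter logged above, so the "%5C%3A"
sequences by themselves do not indicate a second round of escaping.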



-- 
http://www.linkedin.com/in/paultomblin


Re: SolrJ question

2009-08-17 Thread Paul Tomblin
On Mon, Aug 17, 2009 at 5:47 PM, Paul Tomblinptomb...@xcski.com wrote:

 Hmmm.  It's not working right.  I've added 5 documents, 3 with the
 URL set to "http://xcski.com/pharma/" and 2 with the URL set to
 "http://xcski.com/nano/".  Doing other sorts of queries seems to be
 pulling back the right data:


Of course, it doesn't help that my "url" field was set to
indexed="false" in the schema.  Changing it to "true" fixed it.

-- 
http://www.linkedin.com/in/paultomblin


Which versions?

2009-08-16 Thread Paul Tomblin
Which versions of Lucene, Nutch and Solr work together?  I've
discovered that the Nutch trunk and the Solr trunk use wildly
different versions of the Lucene jars, and it's causing me problems.

-- 
http://www.linkedin.com/in/paultomblin


I think this is a bug

2009-08-13 Thread Paul Tomblin
I don't want to join yet another mailing list or register for JIRA,
but I just noticed that the Javadoc for
SolrInputDocument.addField(String name, Object value, float boost) is
incredibly wrong - it looks like it was copied from a deleteAll
method.


-- 
http://www.linkedin.com/in/paultomblin