Re: sorlj search

2009-04-10 Thread RajuMaddy



Tevfik  Kiziloren wrote:
 
 Hi. I'm a newbie. I need to develop a JSF-based search application
 using Solr. I found nothing about the SolrJ implementation except the
 simple example on the Solr wiki. When I tried a console program similar
 to that example, I got the exception below. Where can I find extensive
 documentation about SolrJ?
 
 Thanks in advance.
 Tevfik Kızılören.
 
 try {
     String url = "http://localhost:8080/solr";
     SolrServer server = new CommonsHttpSolrServer(url);

     SolrQuery query = new SolrQuery();
     query.setQuery("solr");
     System.out.println(query.toString());
     QueryResponse rsp = server.query(query);
     System.out.println(rsp.getResults().toString());
 } catch (IOException ex) {
     Logger.getLogger(SolrclientView.class.getName()).log(Level.SEVERE, null, ex);
 } catch (SolrServerException ex) {
     Logger.getLogger(SolrclientView.class.getName()).log(Level.SEVERE, null, ex);
 }
 
 
 ---
 solrclient.SolrclientView jButton1ActionPerformed
 SEVERE: null
 org.apache.solr.client.solrj.SolrServerException: Error executing query
 at
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
 at
 org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:96)
 at
 solrclient.SolrclientView.jButton1ActionPerformed(SolrclientView.java:229)
 at solrclient.SolrclientView.access$800(SolrclientView.java:32)
 at
 solrclient.SolrclientView$4.actionPerformed(SolrclientView.java:135)
 at
 javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1995)
 at
 javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2318)
 at
 javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:387)
 at
 javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:242)
 at
 javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:236)
 at java.awt.Component.processMouseEvent(Component.java:6038)
 at javax.swing.JComponent.processMouseEvent(JComponent.java:3265)
 at java.awt.Component.processEvent(Component.java:5803)
 at java.awt.Container.processEvent(Container.java:2058)
 at java.awt.Component.dispatchEventImpl(Component.java:4410)
 at java.awt.Container.dispatchEventImpl(Container.java:2116)
 at java.awt.Component.dispatchEvent(Component.java:4240)
 at
 java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4322)
 at
 java.awt.LightweightDispatcher.processMouseEvent(Container.java:3986)
 at
 java.awt.LightweightDispatcher.dispatchEvent(Container.java:3916)
 at java.awt.Container.dispatchEventImpl(Container.java:2102)
 at java.awt.Window.dispatchEventImpl(Window.java:2429)
 at java.awt.Component.dispatchEvent(Component.java:4240)
 at java.awt.EventQueue.dispatchEvent(EventQueue.java:599)
 at
 java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:273)
 at
 java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:183)
 at
 java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:173)
 at
 java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:168)
 at
 java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:160)
 at java.awt.EventDispatchThread.run(EventDispatchThread.java:121)
 Caused by: org.apache.solr.common.SolrException: parsing error
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:138)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:99)
 at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:317)
 at
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:84)
 ... 29 more
 Caused by: java.lang.RuntimeException: this must be known type! not: int
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:217)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:235)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:123)
 

Hi

   Maybe your query string contains illegal values, or the problem may
be on your server... make sure that Solr is running at localhost:8080

-- 
View this message in context: 
http://www.nabble.com/sorlj-search-tp15305698p22983898.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Performance when indexing or cold cache

2009-04-10 Thread sunnyfr

Hi Walter,

Did you find a way to sort out your issue, I would be very interested.
Thanks a lot,


Walter Underwood wrote:
 
 We've had some performance problems while Solr is indexing and also when
 it
 starts with a cold cache. I'm still digging through our own logs, but I'd
 like to get more info about this, so any ideas or info are welcome.
 
 We have four Solr servers on dual CPU PowerPC machines, 2G of heap, about
 100-300 queries/second, 250K docs, Tomcat 6.0.10, not fronted by Apache.
 We don't use facets; we sort by score. In general use, there are six
 different request handlers called to build a page. Here is one; they
 are all very similar.
 
   <requestHandler name="movies_people" class="solr.DisMaxRequestHandler">
     <lst name="defaults">
       <float name="tie">0.01</float>
       <str name="qf">
         exact^8.0 exact_alt^6.0 exact_base^8.0 title^4.0 title_alt^3.0
         title_base^4.0 phonetic_hi^1.0
       </str>
       <str name="pf">
         exact^12.0 exact_alt^9.0 exact_base^12.0 title^6.0 title_alt^4.0
         title_base^6.0 phonetic_hi^1.5
       </str>
       <str name="bf">
         popularity^2.0
       </str>
       <str name="fl">
         id,type,movieid,personid,genreid,score
       </str>
       <str name="mm">1</str>
       <int name="ps">100</int>
     </lst>
     <lst name="appends">
       <str name="fq">(pushstatus:A AND (type:movie OR type:person))</str>
     </lst>
   </requestHandler>
 
 wunder
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Performance-when-indexing-or-cold-cache-tp13348420p22984912.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: multiple tokenizers needed

2009-04-10 Thread Grant Ingersoll
The only thing that comes to mind in a short-term way is writing two
TokenFilter implementations that wrap the second and third tokenizers.

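For illustration only, a minimal sketch of that idea against the Lucene
2.4-era TokenStream API (the class name and the whitespace-splitting
behavior are invented here; position increments and token types are
ignored):

import java.io.IOException;
import java.util.LinkedList;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Re-splits each incoming token on whitespace, approximating a second
// tokenization stage running after the first tokenizer.
public class WhitespaceSplitFilter extends TokenFilter {
    private final LinkedList<Token> pending = new LinkedList<Token>();

    public WhitespaceSplitFilter(TokenStream input) {
        super(input);
    }

    public Token next() throws IOException {
        if (!pending.isEmpty()) {
            return pending.removeFirst();
        }
        Token token = input.next();
        if (token == null) {
            return null;
        }
        String text = new String(token.termBuffer(), 0, token.termLength());
        if (text.indexOf(' ') < 0) {
            return token; // nothing to split
        }
        int base = token.startOffset();
        int pos = 0;
        for (String part : text.split("\\s+")) {
            if (part.length() == 0) {
                continue;
            }
            // locate the piece in the original text to compute offsets
            int start = text.indexOf(part, pos);
            pending.add(new Token(part, base + start, base + start + part.length()));
            pos = start + part.length();
        }
        return pending.isEmpty() ? null : pending.removeFirst();
    }
}

A matching TokenFilterFactory would still be needed to reference it from
schema.xml.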

On Apr 9, 2009, at 11:00 PM, Ashish P wrote:



I want to analyze text based on the pattern ";" and also separate on
whitespace, and it is Japanese text, so I want to use the CJKAnalyzer +
tokenizer as well. In short, I want to do:

<analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer">
    <tokenizer class="solr.PatternTokenizerFactory" pattern=";" />
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <tokenizer class="org.apache.lucene.analysis.cjk.CJKTokenizer" />
</analyzer>

Can anyone please tell me how to achieve this? Because the above
syntax is not at all possible.
--
View this message in context: 
http://www.nabble.com/multiple-tokenizers-needed-tp22982382p22982382.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: Question on Solr Distributed Search

2009-04-10 Thread Shalin Shekhar Mangar
On Fri, Apr 10, 2009 at 7:50 AM, vivek sar vivex...@gmail.com wrote:

 Just an update. I changed the schema to store the unique id field, but
 I still get the connection reset exception. I did notice that if there
 is no data in the core then it returns a 0 result (no exception),
 but if there is data and you search using the shards parameter I get the
 connection reset exception. Can anyone provide some tips on where I can
 look for this problem?


Did you re-index after changing the field to stored?
-- 
Regards,
Shalin Shekhar Mangar.


QueryElevationComponent : hot update of elevate.xml

2009-04-10 Thread Nicolas Pastorino

Hello !


Browsing the mailing-list's archives did not help me find the answer,  
hence the question asked directly here.


Some context first:
Integrating Solr with a CMS (eZ Publish), we chose to support
Elevation. The idea is to be able to 'elevate' any object from the
CMS. This can be achieved through eZ Publish's back office, with a
dedicated Elevate administration GUI; the configuration is stored in
the CMS temporarily, and then synchronized frequently and/or on
demand onto Solr. This synchronisation is currently done as follows:

1. Generate the elevate.xml based on the stored configuration
2. Replace elevate.xml in Solr's dataDir
3. Commit. It appears that when elevate.xml is in Solr's dataDir,
and solely in this case, committing triggers a reload of elevate.xml.
This does not happen when elevate.xml is stored in Solr's conf dir.



This method has one main issue though: eZ Publish needs to have
access to the same filesystem as the one on which Solr's dataDir is
stored. This is not always the case when the CMS is clustered, for
instance -- show stopper :(


Hence the following idea / RFC:
How about extending the Query Elevation system with the possibility
to push an updated elevate.xml file/XML through HTTP?
This would update the file where it is actually located, and trigger
a reload of the configuration.
Not being very knowledgeable about Solr's API (yet!), I cannot
figure out whether this would be possible, how it would be
achievable (which type of plugin, for instance), or even valid?


Thanks a lot in advance for your thoughts,
--
Nicolas





Re: Any tips for indexing large amounts of data?

2009-04-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
they don't usually turn off the slave, but it is not a bad idea if
you can take it offline. It is a logistical headache.

BTW, do you have a very good cache hit ratio? Then it makes sense to autowarm.
--Noble

On Fri, Apr 10, 2009 at 4:07 PM, sunnyfr johanna...@gmail.com wrote:

 ok, but how do people handle frequent updates for a large database with
 lots of queries on it?
 do they turn off the slave during the warmup??


 Noble Paul നോബിള്‍  नोब्ळ् wrote:

 On Thu, Apr 9, 2009 at 8:51 PM, sunnyfr johanna...@gmail.com wrote:

 Hi Otis,
 How did you manage that? I have an 8-core machine with 8GB of RAM, an
 11GB index
 for 14M docs, and 5 updates every 30mn, but my replication kills
 everything.
 My segments are merged too often, so the full index replicates and the
 caches are lost,
 and I've no idea what I can do now.
 Some help would be brilliant;
 btw I'm using Solr 1.4.


 sunnyfr, whether the replication is full or delta, the caches are
 lost completely.

 you can think of partitioning the index into separate Solrs and
 updating one partition at a time and perform distributed search.

 Thanks,


 Otis Gospodnetic wrote:

 Mike is right about the occasional slow-down, which appears as a pause
 and
 is due to large Lucene index segment merging.  This should go away with
 newer versions of Lucene where this is happening in the background.

 That said, we just indexed about 20MM documents on a single 8-core
 machine
 with 8 GB of RAM, resulting in nearly 20 GB index.  The whole process
 took
 a little less than 10 hours - that's over 550 docs/second.  The vanilla
 approach before some of our changes apparently required several days to
 index the same amount of data.

 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

 - Original Message 
 From: Mike Klaas mike.kl...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Monday, November 19, 2007 5:50:19 PM
 Subject: Re: Any tips for indexing large amounts of data?

 There should be some slowdown in larger indices as occasionally large
 segment merge operations must occur.  However, this shouldn't really
 affect overall speed too much.

 You haven't really given us enough data to tell you anything useful.
 I would recommend trying to do the indexing via a webapp to eliminate
 all your code as a possible factor.  Then, look for signs to what is
 happening when indexing slows.  For instance, is Solr high in cpu, is
 the computer thrashing, etc?

 -Mike

 On 19-Nov-07, at 2:44 PM, Brendan Grainger wrote:

 Hi,

 Thanks for answering this question a while back. I have made some
 of the suggestions you mentioned, i.e. not committing until I've
 finished indexing. What I am seeing though, is that as the index gets
 larger (around 1GB), indexing is taking a lot longer. In fact it
 slows down to a crawl. Have you got any pointers as to what I might
 be doing wrong?

 Also, I was looking at using MultiCore solr. Could this help in
 some way?

 Thank you
 Brendan

 On Oct 31, 2007, at 10:09 PM, Chris Hostetter wrote:


 : I would think you would see better performance by allowing auto
 commit
 : to handle the commit size instead of reopening the connection
 all the
 : time.

 if your goal is fast indexing, don't use autoCommit at all ...
  just
 index everything, and don't commit until you are completely done.

 autoCommitting will slow your indexing down (the benefit being
 that more
 results will be visible to searchers as you proceed)




 -Hoss









 --
 View this message in context:
 http://www.nabble.com/Any-tips-for-indexing-large-amounts-of-data--tp13510670p22973205.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 --Noble Paul



 --
 View this message in context: 
 http://www.nabble.com/Any-tips-for-indexing-large-amounts-of-data--tp13510670p22986152.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
--Noble Paul


Re: QueryElevationComponent : hot update of elevate.xml

2009-04-10 Thread Ryan McKinley


On Apr 10, 2009, at 7:48 AM, Nicolas Pastorino wrote:


Hello !


Browsing the mailing-list's archives did not help me find the
answer, hence the question asked directly here.


Some context first:
Integrating Solr with a CMS (eZ Publish), we chose to support
Elevation. The idea is to be able to 'elevate' any object from the
CMS. This can be achieved through eZ Publish's back office, with a
dedicated Elevate administration GUI; the configuration is stored in
the CMS temporarily, and then synchronized frequently and/or on
demand onto Solr. This synchronisation is currently done as follows:

1. Generate the elevate.xml based on the stored configuration
2. Replace elevate.xml in Solr's dataDir
3. Commit. It appears that when elevate.xml is in Solr's
dataDir, and solely in this case, committing triggers a reload of
elevate.xml. This does not happen when elevate.xml is stored in
Solr's conf dir.



This method has one main issue though: eZ Publish needs to have
access to the same filesystem as the one on which Solr's dataDir is
stored. This is not always the case when the CMS is clustered, for
instance -- show stopper :(


Hence the following idea / RFC:
How about extending the Query Elevation system with the possibility
to push an updated elevate.xml file/XML through HTTP?
This would update the file where it is actually located, and trigger
a reload of the configuration.
Not being very knowledgeable about Solr's API (yet!), I cannot
figure out whether this would be possible, how it would be
achievable (which type of plugin, for instance), or even valid?



Perhaps look at implementing a custom RequestHandler:
http://wiki.apache.org/solr/SolrRequestHandler

Maybe it could accept the new elevate.xml via POST, then save it to the
right place and call commit...


ryan
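For illustration, a minimal sketch of such a handler against the Solr
1.3/1.4 plugin API (the class name and response field are invented; error
handling and locking are omitted):

import java.io.File;
import java.io.FileWriter;
import java.io.Reader;
import java.io.Writer;

import org.apache.solr.common.SolrException;
import org.apache.solr.common.util.ContentStream;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.request.SolrQueryResponse;

// Accepts a POSTed elevate.xml body and writes it into the core's dataDir,
// where (per the behavior described above) a subsequent commit should
// trigger a reload by the QueryElevationComponent.
public class ElevateUploadHandler extends RequestHandlerBase {

    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
            throws Exception {
        Iterable<ContentStream> streams = req.getContentStreams();
        if (streams == null) {
            throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                    "missing POST body");
        }
        File target = new File(req.getCore().getDataDir(), "elevate.xml");
        for (ContentStream stream : streams) {
            Reader in = stream.getReader();
            Writer out = new FileWriter(target);
            try {
                char[] buf = new char[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            } finally {
                out.close();
                in.close();
            }
        }
        rsp.add("status", "elevate.xml written to " + target.getPath());
    }

    public String getDescription() { return "elevate.xml upload (sketch)"; }
    public String getSourceId() { return "$Id$"; }
    public String getSource() { return "$Source$"; }
    public String getVersion() { return "$Revision$"; }
}

The client would still need to issue a commit afterwards, and the handler
would be registered in solrconfig.xml like any other request handler.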






Re: Additive filter queries

2009-04-10 Thread Matthew Runo
That would work, but the other part of our problem comes in when we
then try to facet on the resulting set. If we filter by size 1, for
example, and then facet on width again, we get facet results that have
no size 1's, because we have not taught Solr what 1_W means, etc.


I think field collapsing might solve this for us, maybe..

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833

On Apr 9, 2009, at 5:23 PM, Chris Hostetter wrote:


: Right now a document looks like this:
:
: <doc>
:   <!-- style level -->
:   <productID>1598548</productID>
:   <styleID>12545</styleID>
:   <brand>Adidas</brand>
:   <size>1, 2, 3, 4, 5, 6, 7</size>
:   <width>AA, A, B, W, W, </width>
:   <color>Brown</color>
: </doc>
:
: If we went down a level, it could look like..
: <doc>
:   <!-- stock level -->
:   <productID>1598548</productID>
:   <styleID>12545</styleID>
:   <stockID>654641654684</stockID>
:   <brand>Adidas</brand>
:   <size>1</size>
:   <width>AA</width>
:   <color>Brown</color>
: </doc>

If you want results at the product level then you don't have to have one
*doc* per legal size+width pair ... you just need one *term* per
valid size+width pair

 <size>1, 2, 3, 4, 5, 6, 7</size>
 <width>AA, A, B, W, W, </width>
 <opts>1_W 2W 3_B 3_W 4_AA 4_A 4_B 4_W 4_WW 5_W 5_ 6_ 7_</opts>


a search for size 4 clogs would look like...

 q=clogs&fq=size:4&facet.field=opts&f.opts.facet.prefix=4_

...and the facet counts for opts would tell me what widths were
available (and how many).

for completeness you typically want to index the pairs in both directions
(1_W and W_1 ... typically in separate fields) so the user can filter by
either option first ... for something like size+color this makes sense,
but I'm guessing with shoes no one expects to narrow by width until
they've narrowed by size first.


-Hoss
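For illustration, a small self-contained sketch of generating such pair
terms at index time (class and method names are invented; the input is
assumed to be the list of in-stock (size, width) pairs):

import java.util.ArrayList;
import java.util.List;

// Builds one token per in-stock (size, width) pair, optionally reversed,
// for indexing into a whitespace-tokenized "opts"-style field.
public class OptsBuilder {

    public static List<String> pairTerms(List<String[]> sizeWidthPairs,
                                         boolean widthFirst) {
        List<String> terms = new ArrayList<String>();
        for (String[] pair : sizeWidthPairs) {
            String size = pair[0];
            String width = pair[1];
            terms.add(widthFirst ? width + "_" + size   // e.g. "AA_4"
                                 : size + "_" + width); // e.g. "4_AA"
        }
        return terms;
    }
}

The forward terms would go in one field and the reversed terms in another,
so either facet can drive the drill-down.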





Index Version Number

2009-04-10 Thread Richard Wiseman
Is it possible for a Solr client to determine if the index has changed 
since the last time it performed a query?  For example, is it possible 
to query the current Lucene indexVersion?


Thanks in advance for your help,
Richard


Re: Question on Solr Distributed Search

2009-04-10 Thread vivek sar
yes - it's all new indexes. I can search them individually, but adding
shards throws a Connection Reset error. Is there any way I can debug
this, or any other pointers?

-vivek

On Fri, Apr 10, 2009 at 4:49 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
 On Fri, Apr 10, 2009 at 7:50 AM, vivek sar vivex...@gmail.com wrote:

 Just an update. I changed the schema to store the unique id field, but
 I still get the connection reset exception. I did notice that if there
 is no data in the core then it returns a 0 result (no exception),
 but if there is data and you search using the shards parameter I get the
 connection reset exception. Can anyone provide some tips on where I can
 look for this problem?


 Did you re-index after changing the field to stored?
 --
 Regards,
 Shalin Shekhar Mangar.



Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Walter Underwood
If you don't see the attachments, you can get them here:

http://wunderwood.org/solr/

wunder

On 4/10/09 10:56 AM, Walter Underwood wunderw...@netflix.com wrote:

 We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and
 I would appreciate any ideas.
 
 Occasionally, a server will start returning results with really poor
 relevance. Single-term queries work fine, but multi-term queries are
 scored based on the most common term (lowest IDF).
 
 I don't see anything in the logs when this happens. We have a monitor
 doing a search for the 100 most popular movies once per minute to
 catch this, so we know when it was first detected.
 
 I'm attaching two explain outputs, one for the query "changeling" and
 one for "the changeling".
 
 We are running Solr 1.3 with Lucene 2.4.0, and have added a fuzzy query
 using JaroWinkler matching.
 
 I'd appreciate ideas about where to look, what debug output to try, etc.
 
 wunder




Help with relevance failure in Solr 1.3

2009-04-10 Thread Walter Underwood
We have a rare, hard-to-reproduce problem with our Solr 1.3 servers, and
I would appreciate any ideas.

Occasionally, a server will start returning results with really poor
relevance. Single-term queries work fine, but multi-term queries are
scored based on the most common term (lowest IDF).

I don't see anything in the logs when this happens. We have a monitor
doing a search for the 100 most popular movies once per minute to
catch this, so we know when it was first detected.

I'm attaching two explain outputs, one for the query "changeling" and
one for "the changeling".

We are running Solr 1.3 with Lucene 2.4.0, and have added a fuzzy query
using JaroWinkler matching.

I'd appreciate ideas about where to look, what debug output to try, etc.

wunder



Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Grant Ingersoll


On Apr 10, 2009, at 1:56 PM, Walter Underwood wrote:

We have a rare, hard-to-reproduce problem with our Solr 1.3 servers,
and I would appreciate any ideas.

Occasionally, a server will start returning results with really poor
relevance. Single-term queries work fine, but multi-term queries are
scored based on the most common term (lowest IDF).

I don't see anything in the logs when this happens. We have a monitor
doing a search for the 100 most popular movies once per minute to
catch this, so we know when it was first detected.

I'm attaching two explain outputs, one for the query "changeling" and
one for "the changeling".



I'm not sure exactly what you are asking, so bear with me...

Are you saying that "the changeling" normally returns results just
fine and then periodically it will go bad, or are you saying you
don't understand why "the changeling" scores differently from
"changeling"? Looking at the explains, it is weird that in the
"the changeling" case, the term "changeling" doesn't even show up as a
term.


Can you share your dismax configuration? That will be easier to parse
than trying to make sense of the debug query parsing.


-Grant


Re: Index Version Number

2009-04-10 Thread Grant Ingersoll
This info is available via the Luke handler, I believe:
http://localhost:8983/solr/admin/luke/
In there, I see version, current, and optimized information.


See also http://wiki.apache.org/solr/LukeRequestHandler
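For example, a client could poll (a hedged sketch; numTerms=0 just keeps
the response small, and the exact response layout depends on the Solr
version):

  http://localhost:8983/solr/admin/luke?numTerms=0

and compare the version value in the index section of the response
against the value it saw at its last query.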

HTH,
Grant

On Apr 10, 2009, at 11:58 AM, Richard Wiseman wrote:

Is it possible for a Solr client to determine if the index has  
changed since the last time it performed a query?  For example, is  
it possible to query the current Lucene indexVersion?


Thanks in advance for your help,
Richard


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Question on StreamingUpdateSolrServer

2009-04-10 Thread vivek sar
Hi,

 I was using CommonsHttpSolrServer for indexing, but having two
threads writing (10K batches) at the same time was throwing:

  ProtocolException: Unbuffered entity enclosing request can not be repeated.

I switched to StreamingUpdateSolrServer (using addBeans) and I don't
see the problem anymore. The speed is very fast - getting around
25k/sec (single thread) - but I'm facing another problem. While the
indexer using StreamingUpdateSolrServer is running, I'm not able to
send any URL request from a browser to the Solr web app. I just get a
blank page. I can't even get to the admin interface. I'm also not able
to shut down the Tomcat running the Solr webapp while the Indexer is
running. I have to stop the Indexer app first and then stop Tomcat.
I don't have this problem when using CommonsHttpSolrServer.

Here is how I'm creating it:

server = new StreamingUpdateSolrServer(url, 1000, 3);

I simply call server.addBeans(...) on it. Is there anything else I
need to do to make use of StreamingUpdateSolrServer? Why does Tomcat
become unresponsive when the Indexer using StreamingUpdateSolrServer is
running (though indexing happens fine)?

Thanks,
-vivek
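For context, a self-contained sketch of the pattern described above (the
bean class, field names, and URL are invented; Solr 1.4 SolrJ assumed):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.beans.Field;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

public class StreamingIndexer {

    // Hypothetical document bean; fields map to schema fields via @Field.
    public static class LogDoc {
        @Field public String id;
        @Field public String body;
    }

    public static void main(String[] args) throws Exception {
        // queue size 1000, 3 background threads draining the queue over HTTP
        SolrServer server =
            new StreamingUpdateSolrServer("http://localhost:8080/solr", 1000, 3);

        List<LogDoc> batch = new ArrayList<LogDoc>();
        for (int i = 0; i < 10000; i++) {
            LogDoc doc = new LogDoc();
            doc.id = "doc-" + i;
            doc.body = "payload " + i;
            batch.add(doc);
        }
        server.addBeans(batch); // returns quickly; updates stream in background
        server.commit();
    }
}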


Re: Help with relevance failure in Solr 1.3

2009-04-10 Thread Walter Underwood
Normally, both changeling and the changeling work fine. This one
server is misbehaving like this for all multi-term queries.

Yes, it is VERY weird that the term changeling does not show up in
the explain.

A server will occasionally go bad and stay in that state. In one case,
two servers went bad and both gave the same wrong results.

Here is the dismax config. groups means movies. The title* fields
are stemmed and stopped, the exact* fields are not.

  !-- groups and people  --

  requestHandler name=groups_people class=solr.SearchHandler
lst name=defaults
 str name=defTypedismax/str
 str name=echoParamsnone/str
 float name=tie0.01/float
 str name=qf
exact^6.0 exact_alt^6.0 exact_base~jw_0.7_1^8.0 exact_alias^8.0
title^3.0 title_alt^3.0 title_base^4.0
 /str

 str name=pf
exact^9.0 exact_alt^9.0 exact_base^12.0 exact_alias^12.0 title^3.0
title_alt^4.0 title_base^6.0
 /str
 str name=bf
search_popularity^100.0
 /str
 str name=mm1/str
 int name=ps100/int
 str name=flid,type,movieid,personid,genreid/str

/lst
lst name=appends
  str name=fqtype:group OR type:person/str
/lst
  /requestHandler


wunder

On 4/10/09 12:51 PM, Grant Ingersoll gsing...@apache.org wrote:

 
 On Apr 10, 2009, at 1:56 PM, Walter Underwood wrote:
 
 We have a rare, hard-to-reproduce problem with our Solr 1.3 servers,
 and
 I would appreciate any ideas.
 
 Ocassionally, a server will start returning results with really poor
 relevance. Single term queries work fine, but multi-term queries are
 scored based on the most common term (lowest IDF).
 
 I don't see anything in the logs when this happens. We have a monitor
 doing a search for the 100 most popular movies once per minute to
 catch this, so we know when it was first detected.
 
 I'm attaching two explain outputs, one for the query changeling and
 one for the changeling.
 
 
 I'm not sure what exactly  you are asking, so bear with me...
 
 Are you saying that the changeling normally returns results just
 fine and then periodically it will go bad or are you saying you
 don't understand why the changeling scores differently from
 changeling?  In looking at the explains, it is weird that in the
 the changeling case, the term changeling doesn't even show up as a
 term.
 
 Can you share your dismax configuration?  That will be easier to parse
 than trying to make sense of the debug query parsing.
 
 -Grant



Re: Question on StreamingUpdateSolrServer

2009-04-10 Thread vivek sar
I also noticed that the Solr app has over 6000 file handles open:

lsof | grep solr | wc -l   - shows 6455

I've 10 cores (using multi-core) managed by the same Solr instance. As
soon as I start up Tomcat the open file count goes up to 6400. A few
questions:

1) Why is Solr holding on to all the segments from all the cores - is
it because of the auto-warmer?
2) How can I reduce the open file count?
3) Is there a way to stop the auto-warmer?
4) Could this be related to Tomcat returning a blank page for every request?

Any ideas?

Thanks,
-vivek

On Fri, Apr 10, 2009 at 1:48 PM, vivek sar vivex...@gmail.com wrote:
 Hi,

  I was using CommonsHttpSolrServer for indexing, but having two
 threads writing (10K batches) at the same time was throwing:

  ProtocolException: Unbuffered entity enclosing request can not be repeated.


 I switched to StreamingUpdateSolrServer (using addBeans) and I don't
 see the problem anymore. The speed is very fast - getting around
 25k/sec (single thread) - but I'm facing another problem. While the
 indexer using StreamingUpdateSolrServer is running, I'm not able to
 send any URL request from a browser to the Solr web app. I just get a
 blank page. I can't even get to the admin interface. I'm also not able
 to shut down the Tomcat running the Solr webapp while the Indexer is
 running. I have to stop the Indexer app first and then stop Tomcat.
 I don't have this problem when using CommonsHttpSolrServer.

 Here is how I'm creating it:

 server = new StreamingUpdateSolrServer(url, 1000, 3);

 I simply call server.addBeans(...) on it. Is there anything else I
 need to do to make use of StreamingUpdateSolrServer? Why does Tomcat
 become unresponsive when the Indexer using StreamingUpdateSolrServer is
 running (though indexing happens fine)?

 Thanks,
 -vivek



Re: logging

2009-04-10 Thread Ryan McKinley
If you use the off-the-shelf .war, it *should* be the same. (If not,
we need to fix it.)


If you are building your own .war, how SLF4J behaves depends on what
implementation is in the runtime path. If you want to use log4j
logging, put slf4j-log4j12.jar in your classpath and you should
be all set.
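For reference, the log4j binding is published as the slf4j-log4j12
artifact; a hedged Maven snippet (the version is an assumption and must
match the slf4j-api version already on the classpath):

  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.5.5</version>
  </dependency>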



On Apr 9, 2009, at 4:56 PM, Kevin Osborn wrote:

We built our own webapp that uses the Solr JARs. We used Apache
Commons/log4j logging and just put log4j.properties in the Resin
conf directory. The commons-logging and log4j jars were put in the
Resin lib directory. Everything worked great and we got log files
for our code only.


So, I upgraded to Solr 1.4 and I no longer get my log file. I assume
it has something to do with Solr 1.4 using SLF4J instead of JDK
logging, but it seems like my code would be independent of that. Any
ideas?








Re: logging

2009-04-10 Thread Kevin Osborn
Or, for my quick-and-dirty purposes (this was just a test), I just removed the 
jcl-over-slf4j JAR, and it worked like normal.





From: Ryan McKinley ryan...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, April 10, 2009 3:16:30 PM
Subject: Re: logging

If you use the off-the-shelf .war, it *should* be the same. (If not, we need 
to fix it.)

If you are building your own .war, how SLF4J behaves depends on what 
implementation is in the runtime path. If you want to use log4j logging, put 
slf4j-log4j12.jar in your classpath and you should be all set.


On Apr 9, 2009, at 4:56 PM, Kevin Osborn wrote:

 We built our own webapp that uses the Solr JARs. We used Apache Commons/log4j 
 logging and just put log4j.properties in the Resin conf directory. The 
 commons-logging and log4j jars were put in the Resin lib directory. 
 Everything worked great and we got log files for our code only.
 
 So, I upgraded to Solr 1.4 and I no longer get my log file. I assume it has 
 something to do with Solr 1.4 using SLF4J instead of JDK logging, but it seems 
 like my code would be independent of that. Any ideas?
 
 
 


  

maxCodeLength in PhoneticFilterFactory

2009-04-10 Thread Brian Whitman
I have this version of Solr running:

Solr Implementation Version: 1.4-dev 747554M - bwhitman - 2009-02-24
16:37:49

and I am trying to update a schema to support a metaphone code length of 8
instead of 4, via this (committed) issue:

https://issues.apache.org/jira/browse/SOLR-813

So I changed the schema to this (knowing that I have to reindex):

<filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone"
        inject="true" maxCodeLength="8"/>

But when I do, queries fail with:

Error initializing DoubleMetaphone class: org.apache.commons.codec.language.DoubleMetaphone
  at org.apache.solr.analysis.PhoneticFilterFactory.init(PhoneticFilterFactory.java:90)
  at org.apache.solr.schema.IndexSchema$6.init(IndexSchema.java:821)
  at org.apache.solr.schema.IndexSchema$6.init(IndexSchema.java:817)
  at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:149)
  at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:831)
  at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)
  at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:425)
  at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:410)
  at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
  at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:452)
  at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:95)
  at org.apache.solr.core.SolrCore.init(SolrCore.java:501)
  at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121)


PHP Remove From Index/Search By Fields

2009-04-10 Thread Johnny X

Hey,


How could I write some code in PHP, to place behind a button, that removes a
returned item from the index?

In turn, is it possible to copy all of the XML elements from said item and
place them in a document somewhere locally once it's been removed?

Finally, there is one default search field. How do you search on multiple
different fields in PHP?

If I wanted to search by all of the fields indexed, is that easy to code?
What changes do I need to make in the XML schema?


Thanks so much for any help!
-- 
View this message in context: 
http://www.nabble.com/PHP-Remove-From-Index-Search-By-Fields-tp22996701p22996701.html
Sent from the Solr - User mailing list archive at Nabble.com.



special characters in Solr search query.

2009-04-10 Thread Sagar Khetkade

Hi,
 
There is a strange issue while querying the Solr indexes. If my query 
contains special characters like [ ] ! etc., it throws a query parse 
exception. From my application interface I am able to handle the special 
characters, but the issue is that when the document I am going to index 
contains any of these special characters, it throws a query parse 
exception. Can anyone give a pointer on this? 
Thanks in advance.
 
Regards,
Sagar Khetkade

_
The new Windows Live Messenger. You don’t want to miss this.
http://www.microsoft.com/india/windows/windowslive/messenger.aspx

Re: special characters in Solr search query.

2009-04-10 Thread Shalin Shekhar Mangar
On Sat, Apr 11, 2009 at 10:13 AM, Sagar Khetkade sagar.khetk...@hotmail.com
 wrote:


 There is a strange issue while querying the Solr indexes. If my query
 contains special characters like [ ] ! etc., it throws a query parse
 exception. From my application interface I am able to handle the
 special characters, but the issue is that when the document I am going to
 index contains any of these special characters, it throws a query parse
 exception. Can anyone give a pointer on this?
 Thanks in advance.


You need to escape those characters. Look at
http://lucene.apache.org/java/2_4_1/queryparsersyntax.html#Escaping%20Special%20Characters

If you are using Solrj, this should be done automatically. Solrj calls
ClientUtils.escapeQueryChars under the hood.
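For illustration, a minimal snippet (added for reference; the class name
and sample input are invented):

import org.apache.solr.client.solrj.util.ClientUtils;

public class EscapeExample {
    public static void main(String[] args) {
        String userInput = "foo [bar]!";
        // Special query characters are backslash-escaped so the query
        // parser treats them as literals.
        String escaped = ClientUtils.escapeQueryChars(userInput);
        System.out.println("title:(" + escaped + ")");
    }
}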
-- 
Regards,
Shalin Shekhar Mangar.