different indexes for multitenant approach

2011-06-03 Thread Naveen Gupta
Hi

I want to implement a different index strategy where we keep indexes
with respect to each tenant and we maintain the indexes separately ...

first level of category -- company name

second level of category -- company name + fields to be indexed

then further categories -- groups of different company names based on some
heuristic (hashing) (if it grows further)

i want to do this in the same solr instance. is it possible?

Thanks
Naveen


Re: how to make getJson parameter dynamic

2011-06-03 Thread Romi
lee carroll: Sorry for this. i did this because i was not getting any
response. anyway thanks for letting me know, and now i have found the solution
to the above problem :)
now i am facing a very strange problem related to jquery, can you please help
me out.

$(document).ready(function(){
    $("#c2").click(function(){
        var q = getquerystring();

        $.getJSON("http://192.168.1.9:8983/solr/db/select/?wt=json&q=" + q + "&json.wrf=?",
            function(result){
                $.each(result.response.docs, function(i, item){
                    alert(result.response.docs);
                    alert(item.UID_PK);
                });
            });
    });
});


when i use $("#c2").click(function() then it does not enter $.getJSON(),
and when i remove $("#c2").click(function() from the code it runs fine. why
is this so? please explain, because i want to get data from a text box on an
onclick event and then display the response.



-
Thanks & Regards
Romi
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-make-getJson-parameter-dynamic-tp3014941p3018732.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to display search results of solr in to other application.

2011-06-03 Thread Romi
$.getJSON(
   "http://[server]:[port]/solr/select/?jsoncallback=?",
   {"q": queryString,
    "version": "2.2",
    "start": "0",
    "rows": "10",
    "indent": "on",
    "json.wrf": "callbackFunctionToDoSomethingWithOurData",
    "wt": "json",
    "fl": "field1"}
   );

would you please explain what queryString and json.wrf:
callbackFunctionToDoSomethingWithOurData are? and what if i want to change my
query string each time?

-
Thanks & Regards
Romi
--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-display-search-results-of-solr-in-to-other-application-tp3014101p3018740.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: query routing with shards

2011-06-03 Thread Dmitry Kan
Hi Otis,

I merely followed gmail's suggestion to include other people in the
recipients list, and Yonik was the first one :) I won't do it next time.

Thanks for a rapid reply. The reason for doing this query routing is that we
abstract the distributed SOLR from the client code for security reasons
(that is, we don't want to expose the entire shard farm to the world, but
only the frontend SOLR) and for better decoupling.

Is it possible to implement a plugin to SOLR that would map queries to
shards?

We have other choices too, but they'll take quite some time; that's why I
decided to quickly ask whether I was missing something in the design and
configuration of the main SOLR components.

Dmitry

On Fri, Jun 3, 2011 at 8:25 AM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hi Dmitry (you may not want to additionally copy Yonik, he's subscribed to
 this list, too)


 It sounds like you have the knowledge of which query maps to which shard.
 If so, why not control/change the value of the shards param in the request
 to your front-end Solr (aka distributed request dispatcher) within your
 app, which is the one calling Solr?

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org; yo...@lucidimagination.com
  Sent: Thu, June 2, 2011 7:00:53 AM
  Subject: query routing with shards
 
  Hello all,
 
  We have currently several pretty fat logically isolated shards with the
  same schema / solrconfig (indices are separate). We currently have one
  single front end SOLR (1.4) for the client code calls. Since a client
  code query usually hits only one shard, we are considering making a smart
  routing of queries to the shards they map to. Can you please give some
  pointers as to what would be an optimal way to achieve such a routing
  inside the front end solr? Is there a way to configure the mapping inside
  the solrconfig?
 
  Thanks.
 
  --
  Regards,
 
  Dmitry Kan
 




-- 
Regards,

Dmitry Kan


Re: How to display search results of solr in to other application.

2011-06-03 Thread Naveen Gupta
Hi Romi

As per me, you need to understand how ajax with jquery works .. then go for
json and then jsonp (if you are fetching from a different domain)

query here is the dynamic query with which you will be hitting solr .. (it
could be simple text, or a more advanced query string)

http://wiki.apache.org/solr/CommonQueryParameters

Callback is the method name which you will define .. after getting the
response, this method will be called (the callback mechanism)

using the response from solr (json format), you need to show the response or
analyze it as per your business need.

Thanks
Naveen


On Fri, Jun 3, 2011 at 12:00 PM, Romi romijain3...@gmail.com wrote:

  [quoted message snipped]



Re: different indexes for multitenant approach

2011-06-03 Thread Chandan Tamrakar
Maybe you need the multicore feature of Solr; you can have a single Solr
instance with separate configurations and indexes:

http://wiki.apache.org/solr/CoreAdmin
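A minimal sketch of what that can look like in solr.xml (the core names here
are just placeholders for your tenants):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="tenant_acme" instanceDir="tenant_acme"/>
    <core name="tenant_globex" instanceDir="tenant_globex"/>
  </cores>
</solr>

New cores can also be created at runtime via the CoreAdmin CREATE command,
which fits the grow-as-needed scheme you describe.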



On Fri, Jun 3, 2011 at 12:04 PM, Naveen Gupta nkgiit...@gmail.com wrote:

  [quoted message snipped]




-- 
Chandan Tamrakar
*
*


Getting query fields in a custom SearchHandler

2011-06-03 Thread Marc SCHNEIDER
Hi all,

I wrote my own SearchHandler and therefore overrode the handleRequestBody
method.
This method takes two input parameters: SolrQueryRequest and
SolrQueryResponse objects.
The thing I'd like to do is to get the query fields that are used in my
request.
Of course I can use req.getParams().get("q") but it returns the complete
query (which can be very complicated). I'd like to have a simple map with
field:value pairs.
Is there a way to get it? Or do I have to write my own parser for the q
parameter?
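One possible approach, sketched under the assumption that the query parses
into plain TermQuery/BooleanQuery trees (other query types would need
branches of their own), is to let Solr parse q and then walk the resulting
Lucene Query instead of parsing the string yourself:

// inside handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp);
// imports: org.apache.lucene.index.Term, org.apache.lucene.search.*,
//          org.apache.solr.search.QParser, org.apache.solr.common.params.CommonParams
Query query = QParser.getParser(req.getParams().get(CommonParams.Q), null, req).getQuery();
Map<String, String> fieldValues = new HashMap<String, String>();
collect(query, fieldValues);

// recursively collect field:value pairs from term-level queries
private void collect(Query q, Map<String, String> out) {
    if (q instanceof TermQuery) {
        Term t = ((TermQuery) q).getTerm();
        out.put(t.field(), t.text());
    } else if (q instanceof BooleanQuery) {
        for (BooleanClause clause : ((BooleanQuery) q).getClauses()) {
            collect(clause.getQuery(), out);
        }
    }
    // PhraseQuery, range and wildcard queries would need their own branches
}

Note that a plain map collapses repeated fields; a multimap would be safer
for real queries.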

Thanks in advance,
Marc.


How to search camel case words using CJKTokenizer

2011-06-03 Thread tiffany
Hi all,

I'm using the CJKTokenizerFactory tokenizer to handle text which contains both
Japanese and alphabet words.  However, I noticed that CJKTokenizerFactory
converts the alphabet to lowercase, so I cannot use the
WordDelimiterFilterFactory filter with the splitOnCaseChange property for
camel case words.

I changed to NGramTokenizerFactory (2-gram), but it only parses the first 1024
characters. Because of that, I cannot use NGramTokenizerFactory either.

I tried the following two settings and both of them seem to work fine, but I
don't know if these are good or not, or if there are some other better
solutions.

1)
<tokenizer class="solr.CJKTokenizerFactory"/>
<filter class="solr.NGramFilterFactory" maxGramSize="2" minGramSize="2"/>

2)
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.NGramFilterFactory" maxGramSize="1" minGramSize="1"/>
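
For reference, setting 1 sits in my schema.xml inside a fieldType roughly
like this (the type name is just an example):

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>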

If anyone can give me any advice, it would be nice.

Thank you.

Tiffany

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-search-camel-case-words-using-CJKTokenizer-tp3018853p3018853.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query problem in Solr

2011-06-03 Thread Kurt Sultana
@ Pravesh: It's 2 separate cores, not 2 indexes. Sorry for that.

@ Erick: Yes, I've seen this suggestion and it seems to be the only possible
solution. I'll look into it.

Thanks for your answers guys!
Kurt

On Wed, Jun 1, 2011 at 4:24 PM, Erick Erickson erickerick...@gmail.comwrote:

 If I read this correctly, one approach is to specify an
 increment gap in a multiValued field, then search for phrases
 with a slop less than that increment gap, i.e.
 incrementGap=100 in your definition, and search for
 "apple orange"~99
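 In schema.xml terms that would be roughly (a sketch; field and type names
 are just examples, and the attribute is positionIncrementGap):

 <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
   ...
 </fieldtype>
 <field name="shop_keyword" type="text" indexed="true" stored="true" multiValued="true"/>

 and then query shop_keyword:"apple orange"~99, so the slop can never
 cross from one value's keywords into another's.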

 If this is gibberish, please post some examples and we'll
 try something else.

 Best
 Erick

 On Wed, Jun 1, 2011 at 4:21 AM, Kurt Sultana kurtanat...@gmail.com
 wrote:
   Hi all,
 
  We're using Solr to search on a Shop index and a Product index. Currently
  a Shop has a field `shop_keyword` which also contains the keywords of the
  products assigned to it. The shop keywords are separated by a space.
  Consequently, if there is a product which has a keyword "apple" and
  another which has "orange", a search for shops having `Apple AND Orange`
  would return the shop for these products.
 
  However, this is incorrect, since we want a search for shops having
  `Apple AND Orange` to return shop(s) having products with both "apple"
  and "orange" as keywords.
 
  We tried solving this problem by making shop keywords multi-valued and
  assigning the keywords of every product of the shop as a new value in
  shop keywords. However, as was confirmed in another post
 
  http://markmail.org/thread/xce4qyzs5367yplo#query:+page:1+mid:76eerw5yqev2aanu+state:results
  ,
  Solr does not support "all words must match in the same value of a
  multi-valued field".
 
  (Hope I explained myself well)
 
  How can we go about this? Ideally, we shouldn't change our search
  infrastructure dramatically.
 
  Thanks!
 
  Krt_Malta
 



Return stemmed word

2011-06-03 Thread Kurt Sultana
Hi,

We have stemming in our Solr search and we need to retrieve the word/phrase
after stemming. That is, if I search for "oranges", through stemming a search
for "orange" is carried out. If I turn on debugQuery I would be able to see
this; however, we'd like to access it through the result if possible.
Basically, we need this because we pass the searched word as a parameter to
a 3rd party application which highlights the word in an online PDF reader.
Currently, if a user searches for "oranges" and a document contains
"orange", then the PDF wouldn't highlight anything, since it tries to
highlight "oranges", not "orange".

Thanks all in advance,
Kurt


Re: Strategy -- Frequent updates in our application

2011-06-03 Thread pravesh
You can use DataImportHandler for your full/incremental indexing. Now NRT
indexing could vary as per business requirements (i mean the delay could be
5, 10, 15, or 30 mins). Then it also depends on how much volume will be
indexed incrementally.
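For example, a minimal data-config.xml for full plus delta imports might look
like this (a sketch; driver, table and column names are placeholders):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db" user="user" password="pass"/>
  <document>
    <entity name="item" pk="id"
            query="SELECT * FROM item"
            deltaQuery="SELECT id FROM item WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT * FROM item WHERE id='${dataimporter.delta.id}'"/>
  </document>
</dataConfig>

A cron job hitting /dataimport?command=delta-import at your chosen interval
then gives you the 5/10/15/30-min incremental updates.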
BTW, are you running a Master+Slave SOLR setup?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strategy-Frequent-updates-in-our-application-tp3018386p3019040.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Sorting

2011-06-03 Thread pravesh
BTW, why are you sorting on this field?
You could also index & store this field twice: first with its original value,
and second encoded to some unique code/hash; then index that and sort on it.
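
In schema.xml that could look like this (a sketch; field names invented, and
the encoding/hashing itself would happen in your indexing code or a custom
analyzer, since copyField only copies the raw value):

<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_sort" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort"/>

Then sort with sort=title_sort asc.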

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-tp3017285p3019055.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Sorting algorithm

2011-06-03 Thread Richard Hodsdon
Hi Tomás

Thanks, that makes a lot of sense, and your math is sound.

It is working well. An if() function would be great, and it seems it's coming
soon.

Richard

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-algorithm-tp3014549p3019077.html
Sent from the Solr - User mailing list archive at Nabble.com.


Nullpointer Exception in Solr 4.x in DebugComponent when using wildcard in facet value

2011-06-03 Thread Stefan Moises

Hi,

in Solr 4.x (trunk version of mid may) I have noticed a null pointer 
exception if I activate debugging (debug=true) and use a wildcard to 
filter by facet value, e.g.

if I have a price field

...&debug=true&facet.field=price&fq=price[500+TO+*]
I get

SEVERE: java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:538)
    at org.apache.solr.handler.component.DebugComponent.process(DebugComponent.java:77)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:239)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:465)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
    at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:402)
    at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:535)


This used to work in Solr 1.4 and I was wondering if it's a bug or a new 
feature and if there is a trick to get this working again?


Best regards,
Stefan




Re: Nullpointer Exception in Solr 4.x in DebugComponent when using wildcard in facet value

2011-06-03 Thread Stefan Matheis
Stefan,

i guess there is a colon missing? fq=price:[500+TO+*] should do the trick

Regards
Stefan

On Fri, Jun 3, 2011 at 11:42 AM, Stefan Moises moi...@shoptimax.de wrote:
  [quoted message snipped]





Re: Nullpointer Exception in Solr 4.x in DebugComponent when using wildcard in facet value

2011-06-03 Thread Stefan Moises

Hi Stefan,
sorry, actually there is a colon, I just forgot it in my example...
so the exception also appears for

fq=price:[500+TO+*]

But only if debug=true... and normal price values work, e.g.

fq=price:[500+TO+999]


Thanks,
Stefan

Am 03.06.2011 11:46, schrieb Stefan Matheis:

[quoted message snipped]



--
With best regards from Nürnberg,
Stefan Moises

***
Stefan Moises
Senior Software Developer

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***




php library for extractrequest handler

2011-06-03 Thread Naveen Gupta
Hi

We want to post some files (rtf, doc, etc.) to the solr server using
php .. one way is to post using curl

is there any client like the java client (solrcell)?

urls will also help

Thanks
Naveen


Re: Return stemmed word

2011-06-03 Thread lboutros
Hi Kurt,

I think this is a bit more tricky than that.

For example, if a user searches for "oranges", the stemmer may return
"orang", which is not an existing word.

So getting stemmed words might/will not work for your highlighting purpose.

Ludovic.

-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Return-stemmed-word-tp3018880p3019180.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: php library for extractrequest handler

2011-06-03 Thread Gora Mohanty
On Fri, Jun 3, 2011 at 3:55 PM, Naveen Gupta nkgiit...@gmail.com wrote:
 Hi

 We want to post to solr server with some of the files (rtf,doc,etc) using
 php .. one way is to post using curl

I do not normally use PHP, and have not tried it myself.
However, there is a PHP extension for Solr:
  http://wiki.apache.org/solr/SolPHP
  http://php.net/manual/en/book.solr.php
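
And if you do stick with plain curl, posting a file to the
ExtractingRequestHandler looks roughly like this (host, port and the
literal.id value are placeholders):

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@/path/to/file.rtf"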

Regards,
Gora


Re: how to update database record after indexing

2011-06-03 Thread vrpar...@gmail.com
Hey Erick,

i wrote a separate process as you suggested, and achieved the task.

Thanks a lot

Vishal Parekh

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-update-database-record-after-indexing-tp2874171p3019217.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: how to do offline adding/updating index

2011-06-03 Thread vrpar...@gmail.com
Thanks to all,

i did it by using multicore,

vishal parekh

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-do-offline-adding-updating-index-tp2923035p3019219.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to concatenate two nodes of xml with xpathentityprocessor

2011-06-03 Thread vrpar...@gmail.com
Thanks kbootz

your suggestion works fine, 

vishal parekh

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-concatenate-two-nodes-of-xml-with-xpathentityprocessor-tp2861260p3019223.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrJ and Range Faceting

2011-06-03 Thread Martijn v Groningen
Hi Jamie,

I don't know why range facets didn't make it into SolrJ. But I've recently
opened an issue for this:
https://issues.apache.org/jira/browse/SOLR-2523

I hope this will be committed soon. Check the patch out and see if you like
it.

Martijn

On 2 June 2011 18:22, Jamie Johnson jej2...@gmail.com wrote:

 Currently the range and date faceting in SolrJ acts a bit differently than
 I would expect.  Specifically, range facets aren't parsed at all and date
 facets end up generating filterQueries which don't have the range, just the
 lower bound.  Is there a reason why SolrJ doesn't support these?  I have
 written some things on my end to handle these and generate filterQueries
 for date ranges of the form dateTime:[start TO end] and I have a function
 (which I copied from the date faceting) which parses the range facets, but
 would prefer not to have to maintain these myself.  Is there a plan to
 implement these?  Also, is there a plan to update FacetField to not have
 end be a date, perhaps making it a String like start, so we can support
 date and range queries?




-- 
Met vriendelijke groet,

Martijn van Groningen


[Visualizations] from Query Results

2011-06-03 Thread Adam Estrada
Dear Solr experts,

I am curious to learn what visualization tools are out there to help me
visualize my query results. I am not talking about a language-specific
client per se but something more like Carrot2, which breaks clusters into
their knowledge tree and expandable pie chart. Sorry if those aren't the
correct names for those tools ;-) Anyway, what else is out there like
Carrot2 (http://project.carrot2.org/) to help me visualize Solr query results?

Thanks for your input,
Adam


Re: Strategy -- Frequent updates in our application

2011-06-03 Thread Naveen Gupta
Hi Pravesh

We don't have that setup right now .. we are thinking of doing that ...

for writes we are going to have one instance, and for reads we are going to
have another...

do you have another design in mind .. kindly share

Thanks
Naveen

On Fri, Jun 3, 2011 at 2:50 PM, pravesh suyalprav...@yahoo.com wrote:

  [quoted message snipped]



Re: php library for extractrequest handler

2011-06-03 Thread Naveen Gupta
Yes,

that one i used and it is working fine .. thanks to nabble ..

Thanks
Naveen

On Fri, Jun 3, 2011 at 4:02 PM, Gora Mohanty g...@mimirtech.com wrote:

  [quoted message snipped]



Re: Strategy -- Frequent updates in our application

2011-06-03 Thread Nagendra Nagarajayya

Hi Naveen:

Solr with RankingAlgorithm supports NRT. The performance is about 262 
docs / sec. You can get more information about the performance and NRT 
from here:

http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search

You can download Solr with RankingAlgorithm from here:
http://solr-ra.tgels.com

Regards,

- Nagendra Nagarajayya
http://solr-ra.tgels.com

On 6/2/2011 8:29 PM, Naveen Gupta wrote:

Hi

We have an application where every 10 mins we index each user's docs
repository, and, if a new thread is added to a particular discussion, we need
to index that thread again (please note we are not doing blind indexing each
time; we have various rules to filter out which threads are new and thus
candidates for indexing, plus the new ones which have arrived).

So we are doing updates for each user's docs repository .. the performance so
far is not looking very good. In the future we are going to get hits in
volume (1,000 to 10,000 hits per min), so we are looking for a strategy to
tune solr to index the data in real time.

And what about NRT -- is it fine to apply in this kind of scenario? i read
that solr NRT is not very good in performance, but i am not going to believe
it, since solr is one of the best open sources .. so this problem will get
sorted in the near future .. but if any benchmark is there, kindly share it
with me ... we would like to analyze it against our requirements.

Is there any way to add incremental indexes, as we generally find in other
search engines like endeca etc? i don't know much in detail about solr...
since i am a newbie, can you please tell me if there are settings which can
keep track of incremental indexing?


Thanks
Naveen





Solr Indexing Patterns

2011-06-03 Thread Judioo
What is the best practice method to index the following in Solr:

I'm attempting to use solr for a book store site.

Each book will have a price but on occasions this will be discounted. The
discounted price exists for a defined time period but there may be many
discount periods. Each discount will have a brief synopsis, start and end
time.

A subset of the desired output would be as follows:

...
"response":{"numFound":1,"start":0,"docs":[
  {
    "name":"The Book",
    "price":"$9.99",
    "discounts":[
        {
         "price":"$3.00",
         "synopsis":"thanksgiving special",
         "starts":"11-24-2011",
         "ends":"11-25-2011"
        },
        {
         "price":"$4.00",
         "synopsis":"Canadian thanksgiving special",
         "starts":"10-10-2011",
         "ends":"10-11-2011"
        }
     ]
  },
  .

A requirement is to be able to search for just discounted publications. I
think I could use date faceting for this ( return publications that are
within a discount window ). When a discount search is performed no
publications that are not currently discounted will be returned.

My questions are:

   - Does solr support this type of sub-document?

In the above example the discounts are the sub-documents. I know solr is not
a relational DB, but I would like to store and index the above representation
in a single document if possible.

   - what is the best method to approach the above

I can see in many examples that authors tend to denormalize to solve similar
problems. This suggests that for each discount I am required to duplicate the
book data or form a document association
(http://stackoverflow.com/questions/2689399/solr-associations).
Which method would you advise?
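
For the denormalized route, each discount window would become its own
document, roughly like this (a sketch; field names are invented):

<add>
  <doc>
    <field name="book_id">123</field>
    <field name="name">The Book</field>
    <field name="price">9.99</field>
    <field name="discount_price">3.00</field>
    <field name="discount_synopsis">thanksgiving special</field>
    <field name="discount_starts">2011-11-24T00:00:00Z</field>
    <field name="discount_ends">2011-11-25T00:00:00Z</field>
  </doc>
  <!-- one more doc per additional discount window -->
</add>

A "currently discounted" search is then just a filter like
fq=discount_starts:[* TO NOW]&fq=discount_ends:[NOW TO *], at the cost of
duplicating the book fields across documents.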

It would be nice if solr could return a response structured as above.

Much Thanks


Re: Strategy -- Frequent updates in our application

2011-06-03 Thread pravesh
You can go ahead with the Master/Slave setup provided by SOLR. It's trivial to
set up, and you also get SOLR's operational scripts for index synch'ing b/w
Master-to-Slave(s), OR the Java-based replication feature.
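
For example, the Java-based replication is just a requestHandler in
solrconfig.xml; a slave-side sketch (master URL and interval are
placeholders):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>

with a matching master section (e.g. replicateAfter=commit) on the master.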

There is no need to re-invent other architecture :)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Strategy-Frequent-updates-in-our-application-tp3018386p3019475.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr performance tuning - disk i/o?

2011-06-03 Thread Demian Katz
Hello,

I'm trying to move a VuFind installation from an ailing physical server into a 
virtualized environment, and I'm running into performance problems.  VuFind is 
a Solr 1.4.1-based application with fairly large and complex records (many 
stored fields, many words per record).  My particular installation contains 
about a million records in the index, with a total index size around 6GB.

The virtual environment has more RAM and better CPUs than the old physical box, 
and I am satisfied that my Java environment is well-tuned.  My index is 
optimized.  Searches that hit the cache respond very well.  The problem is that 
non-cached searches are very slow - the more keywords I add, the slower they 
get, to the point of taking 6-12 seconds to come back with results on a quiet 
box and well over a minute under stress testing.  (The old box still took a 
while for equivalent searches, but it was about twice as fast as the new one).

My gut feeling is that disk access reading the index is the bottleneck here, 
but I know little about the specifics of Solr's internals, so it's entirely 
possible that my gut is wrong.  Outside testing does show that the virtual 
environment's disk performance is not as good as the old physical server's, 
especially when multiple processes are trying to access the same file 
simultaneously.

So, two basic questions:


1.) Would you agree that I'm dealing with a disk bottleneck, or are there 
some other factors I should be considering?  Any good diagnostics I should be 
looking at?

2.) If the problem is disk access, is there anything I can tune on the Solr 
side to alleviate the problems?

Thanks,
Demian


Re: how to make getJson parameter dynamic

2011-06-03 Thread Erick Erickson
Romi:

Please review:
http://wiki.apache.org/solr/UsingMailingLists

This is the Solr forum. jQuery questions are best directed at a
jQuery-specific forum.

Best
Erick

On Fri, Jun 3, 2011 at 2:27 AM, Romi romijain3...@gmail.com wrote:
  [quoted message snipped]



Re: Nullpointer Exception in Solr 4.x in DebugComponent when using wildcard in facet value

2011-06-03 Thread Erick Erickson
Hmmm, I just tried it on a trunk from a couple of days ago and it
doesn't error out.
Could you re-try with a new build?

Thanks
Erick

On Fri, Jun 3, 2011 at 5:51 AM, Stefan Moises moi...@shoptimax.de wrote:
  [quoted message snipped]





Re: [Visualizations] from Query Results

2011-06-03 Thread Erick Erickson
I'm not quite sure what you mean by visualization here. Do you
want to see the query parse tree? The results list in something other
than XML (see the /browse functionality if so)? How documents are
ranked?

Visualization is another overloaded word <G>...

Best
Erick

On Fri, Jun 3, 2011 at 7:13 AM, Adam Estrada
estrada.adam.gro...@gmail.com wrote:
  [quoted message snipped]


Re: Nullpointer Exception in Solr 4.x in DebugComponent when using wildcard in facet value

2011-06-03 Thread Stefan Moises

Hi Erick

sure, thanks for looking into it! I'll let you know if it's working for 
me there, too...
(I'm using edismax btw., but I've also tested with standard and got the 
exception)


Stefan

Am 03.06.2011 15:22, schrieb Erick Erickson:

Hmmm, I just tried it on a trunk from a couple of days ago and it
doesn't error out.
Could you re-try with a new build?

Thanks
Erick

On Fri, Jun 3, 2011 at 5:51 AM, Stefan Moisesmoi...@shoptimax.de  wrote:

Hi Stefan,
sorry, actually there is a colon, I just forgot it in my example...
so the exception also appears for

fq=price:[500+TO+*]

But only if debug=true... and normal price values work, e.g.

fq=price:[500+TO+999]


Thanks,
Stefan

Am 03.06.2011 11:46, schrieb Stefan Matheis:

Stefan,

i guess there is a colon missing?fq=price:[500+TO+*] should do the trick

Regards
Stefan

On Fri, Jun 3, 2011 at 11:42 AM, Stefan Moisesmoi...@shoptimax.de
  wrote:

Hi,

in Solr 4.x (trunk version of mid may) I have noticed a null pointer
exception if I activate debugging (debug=true) and use a wildcard to
filter
by facet value, e.g.
if I have a price field

...debug=truefacet.field=pricefq=price[500+TO+*]
I get

SEVERE: java.lang.RuntimeException: java.lang.NullPointerException
at
org.apache.solr.search.QueryParsing.toString(QueryParsing.java:538)
at

org.apache.solr.handler.component.DebugComponent.process(DebugComponent.java:77)
at

org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:239)
at

org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298)
at

org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
at

org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
at

org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at

org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at

org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at

org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at

org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:465)
at

org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at

org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
at

org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at

org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at

org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
at

org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NullPointerException
at
org.apache.solr.search.QueryParsing.toString(QueryParsing.java:402)
at
org.apache.solr.search.QueryParsing.toString(QueryParsing.java:535)

This used to work in Solr 1.4 and I was wondering if it's a bug or a new
feature and if there is a trick to get this working again?

Best regards,
Stefan




.


--
Mit den besten Grüßen aus Nürnberg,
Stefan Moises

***
Stefan Moises
Senior Softwareentwickler

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***




.



--
With best regards from Nürnberg,
Stefan Moises

***
Stefan Moises
Senior Software Developer

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-25
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***




Re: Strategy -- Frequent updates in our application

2011-06-03 Thread Erick Erickson
Do be careful how often you pull down indexes on your slaves. A
too-short polling interval can lead to some problems. Start with, say,
5 minutes and ensure that your autowarm time (see your logs) is less
than your polling interval.

Best
Erick


On Fri, Jun 3, 2011 at 8:43 AM, pravesh suyalprav...@yahoo.com wrote:
  [quoted message snipped]



Re: Solr performance tuning - disk i/o?

2011-06-03 Thread Otis Gospodnetic
Demian,

* You can run iostat or vmstat and see if there is disk IO during your slow
queries and compare that to disk IO (if any) with your fast/cached queries
* You can make sure you warm up your index well after the first and any new
searcher, so that OS and Solr caches are warmed up (see the sketch below)
* You can look at the Solr Stats page to make sure your caches are utilized
well and adjust their settings if they are not.
* ...
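
For the warm-up point, a newSearcher listener in solrconfig.xml is the usual
mechanism; a sketch (the queries are placeholders -- use your own common
queries, sorts and facets):

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">some common query</str><str name="sort">price asc</str></lst>
  </arr>
</listener>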

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Demian Katz demian.k...@villanova.edu
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 8:44:33 AM
 Subject: Solr performance tuning - disk i/o?
 
 [quoted message snipped]


Re: Solr performance tuning - disk i/o?

2011-06-03 Thread Erick Erickson
This doesn't seem right. Here's a couple of things to try:
1> attach debugQuery=on to your long-running queries. The QTime returned
   is the time taken to search, NOT including the time to load the
   docs. That'll help pinpoint whether the problem is the search itself,
   or assembling the documents.
2> Are you autowarming? If so, be sure it's actually done before querying.
3> Measure queries after the first few, particularly if you're sorting or
   faceting.
4> What are your JVM settings? How much memory do you have?
5> Is enableLazyFieldLoading set to true in your solrconfig.xml?
6> How many docs are you returning?


There's more, but that'll do for a start. Let us know if you gather more data
and it's still slow.

Best
Erick

On Fri, Jun 3, 2011 at 8:44 AM, Demian Katz demian.k...@villanova.edu wrote:
  [quoted message snipped]



Re: [Visualizations] from Query Results

2011-06-03 Thread Otis Gospodnetic
Hi Adam,

Try this:
http://lmgtfy.com/?q=search%20results%20visualizations

In practice I find that visualizations are cool and attractive looking, but 
often text is more useful because it's more direct.  But there is room for 
graphical representation of search results, sure.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Adam Estrada estrada.adam.gro...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 7:13:39 AM
 Subject: [Visualizations] from Query Results
 
 [quoted message snipped]


Re: query routing with shards

2011-06-03 Thread Otis Gospodnetic
Hi Dmitry,

Yes, you could also implement your own custom SearchComponent.  In this 
component you could grab the query param, examine the query value, and based on 
that add the shards URL param with the appropriate value, so that when the 
regular QueryComponent grabs stuff from the request, it already has the correct 
shard in there.
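
A rough sketch of such a component (Solr 1.4-era API; the query-to-shard
lookup is a placeholder, and SolrInfoMBean boilerplate like getDescription()
and getVersion() is omitted):

public class ShardRoutingComponent extends SearchComponent {
    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        SolrParams params = rb.req.getParams();
        String q = params.get(CommonParams.Q);
        String shards = lookupShardsFor(q); // your query -> shard mapping (placeholder)
        ModifiableSolrParams modified = new ModifiableSolrParams(params);
        modified.set(ShardParams.SHARDS, shards);
        rb.req.setParams(modified); // QueryComponent now sees the routed shards param
    }

    @Override
    public void process(ResponseBuilder rb) {
        // nothing to do at process time; routing happened in prepare()
    }
}

Register it in solrconfig.xml as a first-components entry on your front-end
handler so it runs before QueryComponent.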

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Dmitry Kan dmitry@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 2:47:00 AM
 Subject: Re: query routing with shards
 
 [quoted message snipped]
 


Re: java.io.IOException: The specified network name is no longer available

2011-06-03 Thread Otis Gospodnetic
Hi,

I'm guessing your index is on some sort of network drive that got detached?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Gaurav Shingala gaurav.shing...@hotmail.com
 To: Apache SolrUser solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 1:52:42 AM
 Subject: java.io.IOException: The specified network name is no longer 
available
 
 
 Hi,
 
 I am using solr 1.4.1 and at the time of updating index getting  following 
error:
 
 2011-06-03 05:54:06,943 ERROR [org.apache.solr.core.SolrCore] (http-10.38.33.146-8080-4) java.io.IOException: The specified network name is no longer available
 at java.io.RandomAccessFile.readBytes(Native Method)
 at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
 at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:132)
 at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
 at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
 at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:78)
 at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
 at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
 at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:160)
 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
 at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
 at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java:1103)
 at org.apache.lucene.index.SegmentReader.termDocs(SegmentReader.java:981)
 at org.apache.solr.search.SolrIndexReader.termDocs(SolrIndexReader.java:320)
 at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:640)
 at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
 at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:581)
 at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:903)
 at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
 at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
 at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:274)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:242)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:181)
 at org.jboss.modcluster.catalina.CatalinaContext$RequestListenerValve.event(CatalinaContext.java:285)
 at org.jboss.modcluster.catalina.CatalinaContext$RequestListenerValve.invoke(CatalinaContext.java:261)
 at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:88)
 at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:100)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.jboss.web.tomcat.service.request.ActiveRequestResponseCacheValve.invoke(ActiveRequestResponseCacheValve.java:53)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:362)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:654)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:951)
 at java.lang.Thread.run(Thread.java:619)
 
 2011-06-03 05:54:06,943 INFO [org.apache.solr.core.SolrCore] (http-10.38.33.146-8080-4)

Ignore This Test Message

2011-06-03 Thread Jasneet Sabharwal

Hey Guys

Just a test mail, please ignore this.

--
Thanx & Regards

Jasneet Sabharwal
Software Developer
NextGen Invent Corporation



Re: Better to have lots of smaller cores or one really big core?

2011-06-03 Thread JohnRodey
Thanks Erick for the response.

So my data structure is the same, i.e. they all use the same schema.  Though
I think it makes sense for us to somehow break apart the data, for example
by the date it was indexed.  I'm just trying to get a feel for how large we
should aim to keep those (by day, by week, by month, etc...).

So it sounds like we should aim to keep them at a size that one solr server
can host to avoid serving multiple cores.

One question: there is no real difference (other than configuration) between a
server hosting its own index and hosting a single core, is there?

Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Better-to-have-lots-of-smaller-cores-or-one-really-big-core-tp3017973p3019686.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr performance tuning - disk i/o?

2011-06-03 Thread Demian Katz
Thanks to you and Otis for the suggestions!  Some more information:

- Based on the Solr stats page, my caches seem to be working pretty well (few 
or no evictions, hit rates in the 75-80% range).
- VuFind is actually doing two Solr queries per search (one initial search 
followed by a supplemental spell check search -- I believe this is necessary 
because VuFind has two separate spelling indexes, one for shingled terms and 
one for single words).  That is probably exaggerating the problem, though based 
on searches with debugQuery on, it looks like it's always the initial search 
(rather than the supplemental spelling search) that's consuming the bulk of the 
time.
- enableLazyFieldLoading is set to true.
- I'm retrieving 20 documents per page.
- My JVM settings: -server -Xloggc:/usr/local/vufind/solr/jetty/logs/gc.log 
-Xms4096m -Xmx4096m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:NewRatio=5

It appears that a large portion of my problem had to do with autowarming, a 
topic that I've never had a strong grasp on, though perhaps I'm finally 
learning (any recommended primer links would be welcome!).  I did have some 
autowarming settings in solrconfig.xml (an arbitrary search for a bunch of 
random keywords in the newSearcher and firstSearcher events, plus autowarmCount 
settings on all of my caches).  However, when I looked at the debugQuery 
output, I noticed that a huge amount of time was being wasted loading facets on 
the first search after restarting Solr, so I changed my newSearcher and 
firstSearcher events to this:

  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="start">0</str>
      <str name="rows">10</str>
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="facet.field">collection</str>
      <str name="facet.field">format</str>
      <str name="facet.field">publishDate</str>
      <str name="facet.field">callnumber-first</str>
      <str name="facet.field">topic_facet</str>
      <str name="facet.field">authorStr</str>
      <str name="facet.field">language</str>
      <str name="facet.field">genre_facet</str>
      <str name="facet.field">era_facet</str>
      <str name="facet.field">geographic_facet</str>
    </lst>
  </arr>

Overall performance has increased dramatically, and now the biggest 
bottleneck in the debug output seems to be the shingle spell checking!

Any other suggestions are welcome, since I suspect there's still room to 
squeeze more performance out of the system, and I'm still not sure I'm making 
the most of autowarming...  but this seems like a big step in the right 
direction.  Thanks again for the help!

- Demian

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Friday, June 03, 2011 9:41 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr performance tuning - disk i/o?
 
 This doesn't seem right. Here's a couple of things to try:
 1 attach debugQuery=on to your long-running queries. The QTime
 returned
  is the time taken to search, NOT including the time to load the
 docs. That'll
  help pinpoint whether the problem is the search itself, or
 assembling the
  documents.
 2 Are you autowarming? If so, be sure it's actually done before
 querying.
 3 Measure queries after the first few, particularly if you're sorting
 or
  faceting.
 4 What are your JVM settings? How much memory do you have?
 5 is enableLazyFieldLoading set to true in your solrconfig.xml?
 6 How many docs are you returning?
 
 
 There's more, but that'll do for a start Let us know if you gather
 more data
 and it's still slow.
 
 Best
 Erick
 
 On Fri, Jun 3, 2011 at 8:44 AM, Demian Katz demian.k...@villanova.edu
 wrote:
  Hello,
 
  I'm trying to move a VuFind installation from an ailing physical
 server into a virtualized environment, and I'm running into performance
 problems.  VuFind is a Solr 1.4.1-based application with fairly large
 and complex records (many stored fields, many words per record).  My
 particular installation contains about a million records in the index,
 with a total index size around 6GB.
 
  The virtual environment has more RAM and better CPUs than the old
 physical box, and I am satisfied that my Java environment is well-
 tuned.  My index is optimized.  Searches that hit the cache respond
 very well.  The problem is that non-cached searches are very slow - the
 more keywords I add, the slower they get, to the point of taking 6-12
 seconds to come back with results on a quiet box and well over a minute
 under stress testing.  (The old box still took a while for equivalent
 searches, but it was about twice as fast as the new one).
 
  My gut feeling is that disk access reading the index is the
 bottleneck here, but I know little about the specifics of Solr's
 internals, so it's entirely possible that my gut is wrong.  Outside
 testing does show that the the virtual environment's disk performance
 is not as good as the old physical server, especially when 

fq null pointer exception

2011-06-03 Thread dan whelan
I am noticing something strange with our recent upgrade to solr 3.1 and 
want to see if anyone has experienced anything similar.


I have a solr.StrField field named Status; the values are Enabled, 
Disabled, or ''.


When I facet on that field I get:

Enabled 4409565
Disabled 29185
 112


The issue is when I do a filter query.

This query works:

select/?q=*:*&fq=Status:Enabled

But when I run this query I get an NPE:

select/?q=*:*&fq=Status:Disabled


Here is part of the stack trace


Problem accessing /solr/global_accounts/select/. Reason:
null

java.lang.NullPointerException
at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
at org.apache.solr.schema.StrField.write(StrField.java:49)
at org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
at org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
at 
org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)

at org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
at org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
at 
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)

...


Thanks,

Dan



Solr Performance

2011-06-03 Thread Rohit
Hi,

 

We migrated to Solr a few days back, but after going live we have noticed a
performance drop, especially when we do a delta index, which we execute
every hour with around 100,000 records. We have a multi-core Solr server
running on a Linux machine with 4GB given to the JVM; it's not possible for
me to upgrade the RAM or give more memory to Solr currently.

So I was considering the option of running a master-slave config; I have
another Windows machine with 4GB RAM available on the same network. I have
a few questions regarding this:

- Is this the right path to take?

- How can I do this with minimum downtime, given the fact that our index is
huge?

- Can someone point me in the right direction for this?

Thanks and Regards,

Rohit



Re: Sorting

2011-06-03 Thread Clecio Varjao
Because when browsing through legislation, people want to browse in
the same order as it is actually printed in the hard copy volumes.
It did work by using a copyfield to a lowercase field.

On Fri, Jun 3, 2011 at 2:29 AM, pravesh suyalprav...@yahoo.com wrote:
 BTW, why r u sorting on this field?
 You could also index & store this field twice. First, in its original value,
 and then second, by encoding to some unique code/hash and index it and sort
 on that.

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Sorting-tp3017285p3019055.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread JohnRodey
So here's what I'm seeing: I'm running Solr 3.1.
I'm running a Java client that executes an HttpGet (I also tried HttpPost) with a
large shard list.  If I remove a few shards from my current list it returns
fine; when I use my full shard list I get a "HTTP/1.1 400 Bad Request".  If
I execute it in Firefox with a few shards removed it returns fine; with the
full shard list I get a blank screen returned immediately.

My URI works at around 7800 characters, but adding one more shard to it blows
up.

Any ideas?

I've tried using SolrJ rather than HttpGet before but ran into similar
issues with even fewer shards.
See
http://lucene.472066.n3.nabble.com/Long-list-of-shards-breaks-solrj-query-td2748556.html

My shards are added dynamically; every few hours I am adding new shards or
cores into the cluster, so I cannot have a shard list in the config files
unless I can somehow update them while the system is running.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020185.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread Ken Krugler
It sounds like you're hitting the max URL length (8K is a common default) for 
the HTTP web server that you're using to run Solr.

All of the web servers I know about let you bump this limit up via 
configuration settings.

-- Ken

On Jun 3, 2011, at 9:27am, JohnRodey wrote:

 So here's what I'm seeing: I'm running Solr 3.1
 I'm running a java client that executes a Httpget (I tried HttpPost) with a
 large shard list.  If I remove a few shards from my current list it returns
 fine, when I use my full shard list I get a HTTP/1.1 400 Bad Request.  If
 I execute it in firefox with a few shards removed it returns fine, with the
 full shard list I get a blank screen returned immediately.
 
 My URI works at around 7800 characters but adding one more shard to it blows
 up.
 
 Any ideas? 
 
 I've tried using SolrJ rather than httpget before but ran into similar
 issues but with even less shards.
 See 
 http://lucene.472066.n3.nabble.com/Long-list-of-shards-breaks-solrj-query-td2748556.html
  
 
 My shards are added dynamically, every few hours I am adding new shards or
 cores into the cluster.  so I cannot have a shard list in the config files
 unless I can somehow update them while the system is running.
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020185.html
 Sent from the Solr - User mailing list archive at Nabble.com.

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom data mining solutions








Re: Solr Performance

2011-06-03 Thread Otis Gospodnetic
Rohit:

Yes, run indexing on one machine (master) and searches on the other (slave), and
set up replication between them.  Don't optimize your index, and do warm up the
searcher and caches on the slaves.  No downtime.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Rohit ro...@in-rev.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 11:49:28 AM
 Subject: Solr Performance
 
 Hi,
 
 
 
 We migrated to Solr a few days back, but have now after  going live we have
 noticed a performance drop, especially when we do a delta  index, which we
 are executing every 1hours with around 100,000 records . We  have a multi
 core Solr server running on a Linux machine, with 4Gb given to  the JVM, its
 not possible for me to upgrade the ram or give more memory to  the Solr
 currently.
 
 
 
 So I was considering the option of  running a master-slave config, I have
 another window machine with 4gb ram  available on the same network. I have
 two questions regarding this,
 
 
 
 . Is this a right path to take  ?
 
 . How can I do this with minimum down time,  given the fact that our
 index is huge
 
 .  Can someone point me to the right direction for this?
 
 
 
 Thanks and  Regards,
 
 Rohit
 
 


Re: Solr performance tuning - disk i/o?

2011-06-03 Thread Otis Gospodnetic
Right, if you facet results, then your warmup queries should include those 
facets.  The same with sorting.  If you sort on fields A and B, then include 
warmup queries that sort on A and B.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Demian Katz demian.k...@villanova.edu
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 11:21:52 AM
 Subject: RE: Solr performance tuning - disk i/o?
 
 Thanks to you and Otis for the suggestions!  Some more  information:
 
 - Based on the Solr stats page, my caches seem to be working  pretty well 
 (few 
or no evictions, hit rates in the 75-80% range).
 - VuFind is  actually doing two Solr queries per search (one initial search 
followed by a  supplemental spell check search -- I believe this is necessary 
because VuFind  has two separate spelling indexes, one for shingled terms and 
one for single  words).  That is probably exaggerating the problem, though 
based 
on  searches with debugQuery on, it looks like it's always the initial search  
(rather than the supplemental spelling search) that's consuming the bulk of 
the  
time.
 - enableLazyFieldLoading is set to true.
 - I'm retrieving 20  documents per page.
 - My JVM settings: -server  -Xloggc:/usr/local/vufind/solr/jetty/logs/gc.log 
-Xms4096m -Xmx4096m  -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:NewRatio=5
 
 It appears that a  large portion of my problem had to do with autowarming, a 
topic that I've never  had a strong grasp on, though perhaps I'm finally 
learning (any recommended  primer links would be welcome!).  I did have some 
autowarming settings in  solrconfig.xml (an arbitrary search for a bunch of 
random keywords in the  newSearcher and firstSearcher events, plus 
autowarmCount 
settings on all of my  caches).  However, when I looked at the debugQuery 
output, I noticed that a  huge amount of time was being wasted loading facets 
on 
the first search after  restarting Solr, so I changed my newSearcher and 
firstSearcher events to  this:
 
   arr name=queries
  lst
   str  name=q*:*/str
   str  name=start0/str
   str  name=rows10/str
   str  name=facettrue/str
   str  name=facet.mincount1/str
str name=facet.fieldcollection/str
str name=facet.fieldformat/str
str  name=facet.fieldpublishDate/str
str name=facet.fieldcallnumber-first/str
str  name=facet.fieldtopic_facet/str
str name=facet.fieldauthorStr/str
str  name=facet.fieldlanguage/str
str name=facet.fieldgenre_facet/str
str name=facet.fieldera_facet/str
str  name=facet.fieldgeographic_facet/str
  /lst
   /arr
 
 Overall  performance has now increased dramatically, and now the biggest 
bottleneck in  the debug output seems to be the shingle spell checking!
 
 Any other  suggestions are welcome, since I suspect there's still room to 
squeeze more  performance out of the system, and I'm still not sure I'm making 
the most of  autowarming...  but this seems like a big step in the right  
direction.  Thanks again for the help!
 
 - Demian
 
   -Original Message-
  From: Erick Erickson [mailto:erickerick...@gmail.com]
  Sent:  Friday, June 03, 2011 9:41 AM
  To: solr-user@lucene.apache.org
   Subject: Re: Solr performance tuning - disk i/o?
  
  This doesn't  seem right. Here's a couple of things to try:
  1 attach  debugQuery=on to your long-running queries. The QTime
   returned
   is the time taken to search, NOT including  the time to load the
  docs. That'll
   help  pinpoint whether the problem is the search itself, or
  assembling  the
   documents.
  2 Are you autowarming? If  so, be sure it's actually done before
  querying.
  3 Measure  queries after the first few, particularly if you're sorting
   or
   faceting.
  4 What are your JVM  settings? How much memory do you have?
  5 is  enableLazyFieldLoading set to true in your solrconfig.xml?
  6  How many docs are you returning?
  
  
  There's more, but  that'll do for a start Let us know if you gather
  more data
   and it's still slow.
  
  Best
  Erick
  
  On  Fri, Jun 3, 2011 at 8:44 AM, Demian Katz demian.k...@villanova.edu
   wrote:
   Hello,
  
   I'm trying to move a  VuFind installation from an ailing physical
  server into a virtualized  environment, and I'm running into performance
  problems.  VuFind is a  Solr 1.4.1-based application with fairly large
  and complex records (many  stored fields, many words per record).  My
  particular installation  contains about a million records in the index,
  with a total index size  around 6GB.
  
   The virtual environment has more RAM and  better CPUs than the old
  physical box, and I am satisfied that my Java  environment is well-
  tuned.  My index is optimized.  Searches that hit  

Re: fq null pointer exception

2011-06-03 Thread Otis Gospodnetic
Dan, does the problem go away if you get rid of those 112 documents with empty 
Status or replace their empty status value with, say, Unknown?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: dan whelan d...@adicio.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 11:46:46 AM
 Subject: fq null pointer exception
 
 I am noticing something strange with our recent upgrade to solr 3.1 and want 
 to  
see if anyone has experienced anything similar.
 
 I have a solr.StrField  field named Status the values are Enabled, Disabled, 
 or 
''
 
 When I facet  on that field it I get
 
 Enabled 4409565
 Disabled 29185
   112
 
 
 The issue is when I do a filter query
 
 This query  works
 
  select/?q=*:*&fq=Status:Enabled
 
 But when I run this  query I get a NPE
 
  select/?q=*:*&fq=Status:Disabled
 
 
 Here  is part of the stack trace
 
 
 Problem accessing  /solr/global_accounts/select/. Reason:
  null
 
 java.lang.NullPointerException
 at  org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
  at  org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
  at org.apache.solr.schema.StrField.write(StrField.java:49)
 at  org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
  at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
  at  org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
  at  org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
  at  org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
  at  org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
  at  org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
  at  
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
 ...
 
 
 Thanks,
 
 Dan
 
 


Re: query routing with shards

2011-06-03 Thread Dmitry Kan
Hi Otis,

Thanks! This sounds promising. Will this custom implementation hurt the
stability of the front-end SOLR in any way? After implementing it, can I
run some tests to verify the stability / performance?

Dmitry
On Fri, Jun 3, 2011 at 4:49 PM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hi Dmitry,

 Yes, you could also implement your own custom SearchComponent.  In this
 component you could grab the query param, examine the query value, and
 based on
 that add the shards URL param with appropriate value, so that when the
 regular
 QueryComponent grabs stuff from the request, it has the correct shard in
 there
 already.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org
   Sent: Fri, June 3, 2011 2:47:00 AM
  Subject: Re: query routing with shards
 
  Hi Otis,
 
  I merely followed on the gmail's suggestion to include other  people into
 the
  recipients list, Yonik was the first one :) I won't do it  next time.
 
  Thanks for a rapid reply. The reason for doing this query  routing is
 that we
  abstract the distributed SOLR from the client code for  security reasons
  (that is, we don't want to expose the entire shard farm to  the world,
 but
  only the frontend SOLR) and for better decoupling.
 
  Is  it possible to implement a plugin to SOLR that would map queries  to
  shards?
 
  We have other choices too, they'll take quite some time,  that's why I
  decided to quickly ask, if I was missing something from the SOLR  main
  components design and configuration.
 
  Dmitry
 
  On Fri, Jun 3,  2011 at 8:25 AM, Otis Gospodnetic 
 otis_gospodne...@yahoo.com
wrote:
 
   Hi Dmitry (you may not want to additionally copy Yonik, he's
  subscribed to
   this
   list, too)
  
  
   It sounds  like you have the knowledge of which query maps to which
 shard.
 If
   so, why not control/change the value of shards param in the request
  to
   your
   front-end Solr (aka distributed request dispatcher)  within your app,
 which
   is
   the one calling Solr?
  
Otis
   
   Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
   Lucene  ecosystem search :: http://search-lucene.com/
  
  
  
   - Original  Message 
From: Dmitry Kan dmitry@gmail.com
To: solr-user@lucene.apache.org; yo...@lucidimagination.com
 Sent: Thu, June 2, 2011 7:00:53 AM
Subject: query routing with  shards
   
Hello all,
   
We have  currently several pretty fat logically isolated shards  with
 the
same
schema / solrconfig (indices are separate). We currently  have  one
 single
front end SOLR (1.4) for the client code  calls. Since a client  code
   query
usually hits only  one shard, we are considering making a smart
  routing
   of
 queries to the shards they map to. Can you please give some
  pointers  as
   to
what would be an optimal way to achieve such a  routing inside  the
 front
   end
solr? Is there a way to  configure mapping inside the  solrconfig?
   
 Thanks.
   
--
Regards,
   
 Dmitry Kan
   
  
 
 
 
  --
  Regards,
 
  Dmitry Kan
 




-- 
Regards,

Dmitry Kan


Re: query routing with shards

2011-06-03 Thread Otis Gospodnetic
Nah, if you can quickly figure out which shard a given query maps to, then all 
this component needs to do is stick the appropriate shards param value in the 
request and let the request pass through to the other SearchComponents in the 
chain,  including QueryComponent, which will know what to do with the shards 
param.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Dmitry Kan dmitry@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 12:56:15 PM
 Subject: Re: query routing with shards
 
 Hi Otis,
 
 Thanks! This sounds promising. This custom implementation, will  it hurt in
 any way the stability of the front end SOLR? After implementing  it, can I
 run some tests to verify the stability /  performance?
 
 Dmitry
 On Fri, Jun 3, 2011 at 4:49 PM, Otis Gospodnetic  otis_gospodne...@yahoo.com
   wrote:
 
  Hi Dmitry,
 
  Yes, you could also implement your  own custom SearchComponent.  In this
  component you could grab the  query param, examine the query value, and
  based on
  that add the  shards URL param with appropriate value, so that when the
   regular
  QueryComponent grabs stuff from the request, it has the correct  shard in
  there
  already.
 
  Otis
   
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene ecosystem  search :: http://search-lucene.com/
 
 
 
  - Original  Message 
   From: Dmitry Kan dmitry@gmail.com
   To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 2:47:00 AM
   Subject: Re: query routing  with shards
  
   Hi Otis,
  
   I  merely followed on the gmail's suggestion to include other  people  
into
  the
   recipients list, Yonik was the first one :) I  won't do it  next time.
  
   Thanks for a rapid reply.  The reason for doing this query  routing is
  that we
abstract the distributed SOLR from the client code for  security  reasons
   (that is, we don't want to expose the entire shard farm  to  the world,
  but
   only the frontend SOLR) and for  better decoupling.
  
   Is  it possible to implement a  plugin to SOLR that would map queries  to
   shards?
   
   We have other choices too, they'll take quite some time,   that's why I
   decided to quickly ask, if I was missing something  from the SOLR  main
   components design and  configuration.
  
   Dmitry
  
   On  Fri, Jun 3,  2011 at 8:25 AM, Otis Gospodnetic 
  otis_gospodne...@yahoo.com
  wrote:
  
Hi Dmitry (you may not  want to additionally copy Yonik, he's
   subscribed to
 this
list, too)
   

It sounds  like you have the knowledge of which  query maps to which
  shard.
  If
 so, why not control/change the value of shards param in the  request
   to
your
front-end Solr  (aka distributed request dispatcher)  within your app,
   which
is
the one calling Solr?

 Otis

 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene  ecosystem search :: http://search-lucene.com/
   

   
- Original  Message  
 From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org; yo...@lucidimagination.com
   Sent: Thu, June 2, 2011 7:00:53 AM
  Subject: query routing with  shards

  Hello all,

 We have   currently several pretty fat logically isolated shards  with
   the
 same
 schema / solrconfig  (indices are separate). We currently  have  one
  single
  front end SOLR (1.4) for the client code  calls. Since a  client  
code
query
 usually hits  only  one shard, we are considering making a smart
routing
of
  queries to the shards  they map to. Can you please give some
   pointers  as
 to
 what would be an optimal way to achieve such  a  routing inside  the
  front
end
  solr? Is there a way to  configure mapping inside the   solrconfig?

  Thanks.
 
 --
 Regards,
 
  Dmitry Kan
 
   
  
  
  
--
   Regards,
  
   Dmitry Kan
   
 
 
 
 
 -- 
 Regards,
 
 Dmitry Kan
 


Re: [Visualizations] from Query Results

2011-06-03 Thread Adam Estrada
Otis and Erick,

Believe it or not, I did Google this and didn't come up with anything all
that useful. I was at the Lucene Revolution conference last year and saw
some prezos that had some sort of graphical representation of the query
results. The one from Basic Tech especially caught my attention because it
simply showed a graph of hits over time. I can do that using jQuery or
Raphael as he suggested. I have also been playing with the Carrot2
visualization tools, which are pretty cool, which is why I pointed them
out in my original email. I was just curious to see if there were any
specialty-type projects out there like Carrot2 that folks in the Solr
community are using.

Adam

On Fri, Jun 3, 2011 at 9:42 AM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hi Adam,

 Try this:
 http://lmgtfy.com/?q=search%20results%20visualizations

 In practice I find that visualizations are cool and attractive looking, but
 often text is more useful because it's more direct.  But there is room for
 graphical representation of search results, sure.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Adam Estrada estrada.adam.gro...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Fri, June 3, 2011 7:13:39 AM
  Subject: [Visualizations] from Query Results
 
  Dear Solr experts,
 
  I am curious to learn what visualization tools are out  there to help me
  visualize my query results. I am not talking about a  language specific
  client per se but something more like Carrot2 which breaks  clusters in
 to
  their knowledge tree and expandable pie chart. Sorry if those  aren't the
  correct names for those tools ;-) Anyway, what else is out there  like
  Carrot2 http://project.carrot2.org/ to help me visualize Solr query
  results?
 
  Thanks for your input,
  Adam
 



Feature: skipping caches and info about cache use

2011-06-03 Thread Otis Gospodnetic
Hi,

Is it just me, or would others like things like:
* The ability to tell Solr (by passing some URL param?) to skip one or more of 
its caches and get data from the index
* An additional attrib in the Solr response that shows whether the query came 
from the cache or not

* Maybe something else along these lines?

Or maybe some of this is already there and I just don't know about it? :)

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



Re: query routing with shards

2011-06-03 Thread Dmitry Kan
Got it, I can quickly figure the shard out, thanks a lot Otis!

Dmitry

On Fri, Jun 3, 2011 at 8:00 PM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Nah, if you can quickly figure out which shard a given query maps to, then
 all
 this component needs to do is stick the appropriate shards param value in
 the
 request and let the request pass through to the other SearchComponents in
 the
 chain,  including QueryComponent, which will know what to do with the
 shards
 param.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Dmitry Kan dmitry@gmail.com
  To: solr-user@lucene.apache.org
Sent: Fri, June 3, 2011 12:56:15 PM
  Subject: Re: query routing with shards
 
  Hi Otis,
 
  Thanks! This sounds promising. This custom implementation, will  it hurt
 in
  any way the stability of the front end SOLR? After implementing  it, can
 I
  run some tests to verify the stability /  performance?
 
  Dmitry
  On Fri, Jun 3, 2011 at 4:49 PM, Otis Gospodnetic  
 otis_gospodne...@yahoo.com
wrote:
 
   Hi Dmitry,
  
   Yes, you could also implement your  own custom SearchComponent.  In
 this
   component you could grab the  query param, examine the query value, and
   based on
   that add the  shards URL param with appropriate value, so that when the
regular
   QueryComponent grabs stuff from the request, it has the correct  shard
 in
   there
   already.
  
   Otis

   Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
   Lucene ecosystem  search :: http://search-lucene.com/
  
  
  
   - Original  Message 
From: Dmitry Kan dmitry@gmail.com
To: solr-user@lucene.apache.org
  Sent: Fri, June 3, 2011 2:47:00 AM
Subject: Re: query routing  with shards
   
Hi Otis,
   
I  merely followed on the gmail's suggestion to include other  people
 into
   the
recipients list, Yonik was the first one :) I  won't do it  next
 time.
   
Thanks for a rapid reply.  The reason for doing this query  routing
 is
   that we
 abstract the distributed SOLR from the client code for  security
  reasons
(that is, we don't want to expose the entire shard farm  to  the
 world,
   but
only the frontend SOLR) and for  better decoupling.
   
Is  it possible to implement a  plugin to SOLR that would map queries
  to
shards?

We have other choices too, they'll take quite some time,   that's why
 I
decided to quickly ask, if I was missing something  from the SOLR
  main
components design and  configuration.
   
Dmitry
   
On  Fri, Jun 3,  2011 at 8:25 AM, Otis Gospodnetic 
   otis_gospodne...@yahoo.com
   wrote:
   
 Hi Dmitry (you may not  want to additionally copy Yonik, he's
subscribed to
  this
 list, too)

 
 It sounds  like you have the knowledge of which  query maps to
 which
   shard.
   If
  so, why not control/change the value of shards param in the
  request
to
 your
 front-end Solr  (aka distributed request dispatcher)  within your
 app,
which
 is
 the one calling Solr?
 
  Otis
 
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene  ecosystem search :: http://search-lucene.com/

 

 - Original  Message  
  From: Dmitry Kan dmitry@gmail.com
   To: solr-user@lucene.apache.org; yo...@lucidimagination.com
Sent: Thu, June 2, 2011 7:00:53 AM
   Subject: query routing with  shards
 
   Hello all,
 
  We have   currently several pretty fat logically isolated shards
  with
the
  same
  schema / solrconfig  (indices are separate). We currently  have
  one
   single
   front end SOLR (1.4) for the client code  calls. Since a  client
 code
 query
  usually hits  only  one shard, we are considering making a smart
 routing
 of
   queries to the shards  they map to. Can you please give some
pointers  as
  to
  what would be an optimal way to achieve such  a  routing inside
  the
   front
 end
   solr? Is there a way to  configure mapping inside the
 solrconfig?
 
   Thanks.
  
  --
  Regards,
  
   Dmitry Kan
  

   
   
   
 --
Regards,
   
Dmitry Kan

  
 
 
 
  --
  Regards,
 
  Dmitry Kan
 




-- 
Regards,

Dmitry Kan
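
For illustration, a rough sketch of the kind of routing component Otis
describes, written against the Solr 3.x SearchComponent API. The class name,
shard URLs, and the query-to-shard rule are made-up placeholders, not a
tested implementation:

import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.ShardParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class ShardRoutingComponent extends SearchComponent {

    @Override
    public void prepare(ResponseBuilder rb) {
        // Leave the request alone if there is no query, or if the caller
        // already set an explicit shards parameter.
        String q = rb.req.getParams().get(CommonParams.Q);
        if (q == null || rb.req.getParams().get(ShardParams.SHARDS) != null) {
            return;
        }
        // Made-up routing rule: inspect the query and pick the shard it maps to.
        String shard = q.contains("companyA")
            ? "shard-a.example.com:8983/solr"
            : "shard-b.example.com:8983/solr";
        // Rewrite the request params so QueryComponent sees the shards param.
        ModifiableSolrParams params = new ModifiableSolrParams(rb.req.getParams());
        params.set(ShardParams.SHARDS, shard);
        rb.req.setParams(params);
    }

    @Override
    public void process(ResponseBuilder rb) {
        // All the work happens in prepare(), before QueryComponent runs.
    }

    @Override
    public String getDescription() { return "Maps queries to shards"; }

    @Override
    public String getVersion() { return "1.0"; }

    @Override
    public String getSourceId() { return ""; }

    @Override
    public String getSource() { return ""; }
}

Registered ahead of the standard components on the front-end handler (for
example via first-components in solrconfig.xml), it runs before
QueryComponent, which then executes the distributed request as usual.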


RE: Hitting the URI limit, how to get around this?

2011-06-03 Thread Colin Bennett
It sounds like you need to increase the HTTP header size.

In Tomcat the default is 4096 bytes, and to change it you need to add
maxHttpHeaderSize=value to the Connector definition in server.xml.

Colin.

-Original Message-
From: Ken Krugler [mailto:kkrugler_li...@transpac.com] 
Sent: Friday, June 03, 2011 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: Hitting the URI limit, how to get around this?

It sounds like you're hitting the max URL length (8K is a common default)
for the HTTP web server that you're using to run Solr.

All of the web servers I know about let you bump this limit up via
configuration settings.

-- Ken

On Jun 3, 2011, at 9:27am, JohnRodey wrote:

 So here's what I'm seeing: I'm running Solr 3.1
 I'm running a java client that executes a Httpget (I tried HttpPost) with
a
 large shard list.  If I remove a few shards from my current list it
returns
 fine, when I use my full shard list I get a HTTP/1.1 400 Bad Request.
If
 I execute it in firefox with a few shards removed it returns fine, with
the
 full shard list I get a blank screen returned immediately.
 
 My URI works at around 7800 characters but adding one more shard to it
blows
 up.
 
 Any ideas? 
 
 I've tried using SolrJ rather than httpget before but ran into similar
 issues but with even less shards.
 See 

http://lucene.472066.n3.nabble.com/Long-list-of-shards-breaks-solrj-query-td2748556.html
 
 My shards are added dynamically, every few hours I am adding new shards or
 cores into the cluster.  so I cannot have a shard list in the config files
 unless I can somehow update them while the system is running.
 
 --
 View this message in context:
http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020185.html
 Sent from the Solr - User mailing list archive at Nabble.com.

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom data mining solutions











Re: fq null pointer exception

2011-06-03 Thread dan whelan

Otis, I just deleted the documents and committed and I still get that error.

Thanks,

Dan


On 6/3/11 9:43 AM, Otis Gospodnetic wrote:

Dan, does the problem go away if you get rid of those 112 documents with empty
Status or replace their empty status value with, say, Unknown?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 

From: dan wheland...@adicio.com
To: solr-user@lucene.apache.org
Sent: Fri, June 3, 2011 11:46:46 AM
Subject: fq null pointer exception

I am noticing something strange with our recent upgrade to solr 3.1 and want to
see if anyone has experienced anything similar.

I have a solr.StrField  field named Status the values are Enabled, Disabled, or
''

When I facet  on that field it I get

Enabled 4409565
Disabled 29185
  112


The issue is when I do a filter query

This query  works

select/?q=*:*&fq=Status:Enabled

But when I run this  query I get a NPE

select/?q=*:*&fq=Status:Disabled


Here  is part of the stack trace


Problem accessing  /solr/global_accounts/select/. Reason:
  null

java.lang.NullPointerException
 at  org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
  at  org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
  at org.apache.solr.schema.StrField.write(StrField.java:49)
 at  org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
  at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
  at  org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
  at  org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
  at  org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
  at  org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
  at  org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
  at
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
...


Thanks,

Dan






Re: Strategy -- Frequent updates in our application

2011-06-03 Thread Jack Repenning
On Jun 2, 2011, at 8:29 PM, Naveen Gupta wrote:

 and what about NRT, is it fine to apply in this case of scenario

Is NRT really what's wanted here? I'm asking the experts, as I have a situation 
 not too different from the b.p.

It appears to me (from the dox) that NRT makes a difference in the lag between 
a document being added and it being available in searches. But the BP really 
sounds to me like a concern over documents-added-per-second. Does the 
RankingAlgorithm form of NRT improve the docs-added-per-second performance?

My add-to-view limits aren't really threatened by Solr performance today; 
something like 30 seconds is just fine. But I am feeling close enough to the 
documents-per-second boundary that I'm pondering measures like master/slave. If 
NRT only improves add-to-view lag, I'm not overly interested, but if it can 
improve add throughput, I'm all over it ;-)

-==-
Jack Repenning
Technologist
Codesion Business Unit
CollabNet, Inc.
8000 Marina Boulevard, Suite 600
Brisbane, California 94005
office: +1 650.228.2562
twitter: http://twitter.com/jrep













Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread Dmitry Kan
Hi,

Why not use HTTP POST?

Dmitry

On Fri, Jun 3, 2011 at 8:27 PM, Colin Bennett cbenn...@job.com wrote:

 It sounds like you need to increase the HTTP header size.

 In tomcat the default is 4096 bytes, and to change it you need to add
 maxHttpHeaderSize=value to the connector definition in server.xml

 Colin.

 -Original Message-
 From: Ken Krugler [mailto:kkrugler_li...@transpac.com]
 Sent: Friday, June 03, 2011 12:39 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Hitting the URI limit, how to get around this?

 It sounds like you're hitting the max URL length (8K is a common default)
 for the HTTP web server that you're using to run Solr.

 All of the web servers I know about let you bump this limit up via
 configuration settings.

 -- Ken

 On Jun 3, 2011, at 9:27am, JohnRodey wrote:

  So here's what I'm seeing: I'm running Solr 3.1
  I'm running a java client that executes a Httpget (I tried HttpPost) with
 a
  large shard list.  If I remove a few shards from my current list it
 returns
  fine, when I use my full shard list I get a HTTP/1.1 400 Bad Request.
 If
  I execute it in firefox with a few shards removed it returns fine, with
 the
  full shard list I get a blank screen returned immediately.
 
  My URI works at around 7800 characters but adding one more shard to it
 blows
  up.
 
  Any ideas?
 
  I've tried using SolrJ rather than httpget before but ran into similar
  issues but with even less shards.
  See
 

 http://lucene.472066.n3.nabble.com/Long-list-of-shards-breaks-solrj-query-td2748556.html
 
  My shards are added dynamically, every few hours I am adding new shards
 or
  cores into the cluster.  so I cannot have a shard list in the config
 files
  unless I can somehow update them while the system is running.
 
  --
  View this message in context:

 http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020185.html
  Sent from the Solr - User mailing list archive at Nabble.com.

 --
 Ken Krugler
 +1 530-210-6378
 http://bixolabs.com
 custom data mining solutions












-- 
Regards,

Dmitry Kan
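
For reference, a minimal SolrJ sketch of the POST approach (the Solr URL and
shard list are placeholders). METHOD.POST sends the parameters in the request
body rather than in the URL, so the URL length limit no longer applies; a
hand-rolled HttpPost that keeps the parameters in the query string would not
help:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PostQuery {
    public static void main(String[] args) throws Exception {
        // Placeholder URL: the front-end Solr that fans out to the shards.
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("*:*");
        // The long, dynamically built shard list goes here (placeholders).
        query.set("shards", "host1:8983/solr,host2:8983/solr,host3:8983/solr");
        // POST the request so the parameters travel in the body.
        QueryResponse rsp = server.query(query, SolrRequest.METHOD.POST);
        System.out.println("numFound: " + rsp.getResults().getNumFound());
    }
}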


How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-03 Thread Gabriele Kahlout
$ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
$ xmlstarlet sel -t -v //@numFound resp.xml


-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x) < Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with X.
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: Hitting the URI limit, how to get around this?

2011-06-03 Thread JohnRodey
Yep, that was my issue.

And like Ken said, on Tomcat I set maxHttpHeaderSize=65536.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020774.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-03 Thread Ahmet Arslan
: How to know how many documents are indexed? Anything more elegant than
: parsing numFound?
 $ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
 $ xmlstarlet sel -t -v //@numFound resp.xml

solr/admin/stats.jsp is actually XML too and contains numDocs and maxDoc info.

I think you can get numDocs with jmx too. http://wiki.apache.org/solr/SolrJmx
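
If client code is an option, the count can also be pulled through SolrJ
without hand-parsing any XML. A small sketch (the URL is the one from the
question; rows=0 keeps the response tiny, and numFound for *:* equals the
number of indexed documents):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class NumDocs {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://192.168.34.51:8080/solr");
        SolrQuery q = new SolrQuery("*:*");
        q.setRows(0); // only the count is needed, not the documents
        long numFound = server.query(q).getResults().getNumFound();
        System.out.println(numFound);
    }
}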


Getting payloads in Highlighter

2011-06-03 Thread lboutros
Hi all,

I need to highlight searched words in the original text (xml) of a document. 

So I'm trying to develop a new Highlighter which uses the defaultHighlighter
to highlight some fields, then retrieves the original text file/document
(external or internal storage) and puts the highlighted parts back into it.

I'm using an additional field for the field offsets of each field in each
document.
To store the offsets (and perhaps other info) I'm using payloads. (I
cannot wait for the future DocValues.)

Now my question: what is the fastest way to retrieve payloads (TermPositions?)
for a given document, a given field, and a given term?

If other methods exist to do that, I'm open :)

Ludovic.



-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-payloads-in-Highlighter-tp3020885p3020885.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: fq null pointer exception

2011-06-03 Thread Otis Gospodnetic
And what happens if you add &fl=<your id field here>?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: dan whelan d...@adicio.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 1:38:33 PM
 Subject: Re: fq null pointer exception
 
 Otis, I just deleted the documents and committed and I still get that  error.
 
 Thanks,
 
 Dan
 
 
 On 6/3/11 9:43 AM, Otis Gospodnetic  wrote:
  Dan, does the problem go away if you get rid of those 112  documents with 
empty
  Status or replace their empty status value with,  say, Unknown?
 
  Otis
  
  Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
  Lucene ecosystem search :: http://search-lucene.com/
 
 
 
  - Original  Message 
  From: dan wheland...@adicio.com
  To: solr-user@lucene.apache.org
   Sent: Fri, June 3, 2011 11:46:46 AM
  Subject: fq null pointer  exception
 
  I am noticing something strange with our  recent upgrade to solr 3.1 and 
want to
  see if anyone has experienced  anything similar.
 
  I have a solr.StrField  field  named Status the values are Enabled, 
Disabled, or
   ''
 
  When I facet  on that field it I  get
 
  Enabled 4409565
  Disabled  29185
112
 
 
  The issue is  when I do a filter query
 
  This query   works
 
   select/?q=*:*&fq=Status:Enabled
 
  But when I run  this  query I get a NPE
 
   select/?q=*:*&fq=Status:Disabled
 
 
   Here  is part of the stack trace
 
 
   Problem accessing  /solr/global_accounts/select/. Reason:
 null
 
   java.lang.NullPointerException
   at   org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
 at   org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
 at  org.apache.solr.schema.StrField.write(StrField.java:49)
at   org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
 at  org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
 at   
org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
 at   
org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
 at   
org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
 at   org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
 at   
org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
 at
   
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
   ...
 
 
  Thanks,
 
   Dan
 
 
 
 


Re: Strategy -- Frequent updates in our application

2011-06-03 Thread Otis Gospodnetic
Yes, when people talk about NRT search they refer to 'add to view lag'.  In a 
typical Solr master-slave setup this is dominated by waiting for replication, 
doing the replication, and then warming up.

If your problem is indexing speed then that's a separate story; I think
you'll find answers on http://search-lucene.com/ or, if you can't find them,
we can repeat :)

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Jack Repenning jrepenn...@collab.net
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 2:10:27 PM
 Subject: Re: Strategy -- Frequent updates in our application
 
 On Jun 2, 2011, at 8:29 PM, Naveen Gupta wrote:
 
  and what about NRT,  is it fine to apply in this case of scenario
 
 Is NRT really what's wanted  here? I'm asking the experts, as I have a 
situation  not too different from  the b.p.
 
 It appears to me (from the dox) that NRT makes a difference in  the lag 
 between 
a document being added and it being available in searches. But  the BP really 
sounds to me like a concern over documents-added-per-second. Does  the 
RankingAlgorithm form of NRT improve the docs-added-per-second  performance?
 
 My add-to-view limits aren't really threatened by Solr  performance today; 
something like 30 seconds is just fine. But I am feeling  close enough to the 
documents-per-second boundary that I'm pondering measures  like master/slave. 
If 
NRT only improvs add-to-view lag, I'm not overly  interested, but if it can 
improve add throughput, I'm all over it  ;-)
 
 -==-
 Jack Repenning
 Technologist
 Codesion Business  Unit
 CollabNet, Inc.
 8000 Marina Boulevard, Suite 600
 Brisbane,  California 94005
 office: +1 650.228.2562
 twitter: http://twitter.com/jrep
 
 
 
 
 
 
 
 
 
 


Re: Getting payloads in Highlighter

2011-06-03 Thread lboutros
To clarify a bit more, I took a look at this function:

termPositions

public TermPositions termPositions()
throws IOException

Description copied from class: IndexReader
Returns an unpositioned TermPositions enumerator. 

But it returns an unpositioned enumerator; is there a way to get a
TermPositions directly positioned on a document, a field, and a term?

Ludovic.

-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-payloads-in-Highlighter-tp3020885p3020922.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Getting payloads in Highlighter

2011-06-03 Thread Ahmet Arslan
 I need to highlight searched words in the original text
 (xml) of a document. 

Why don't you remove xml tags in an analyzer? You can highlight xml by doing so.
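
On the TermPositions question above: the enumerator can be positioned by
seeking to the term and then skipping to the document. A rough Lucene 3.x
sketch (the reader, docId, field, and term values are placeholders):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermPositions;

public class PayloadLookup {
    // Reads the payloads of "term" in "field" for document "docId".
    public static void readPayloads(IndexReader reader, int docId,
                                    String field, String term) throws Exception {
        TermPositions tp = reader.termPositions(new Term(field, term));
        try {
            // skipTo() positions the enumerator on the first doc >= docId.
            if (tp.skipTo(docId) && tp.doc() == docId) {
                for (int i = 0; i < tp.freq(); i++) {
                    int position = tp.nextPosition(); // call before getPayload()
                    if (tp.isPayloadAvailable()) {
                        byte[] payload =
                            tp.getPayload(new byte[tp.getPayloadLength()], 0);
                        // decode the offsets (or other info) stored in the
                        // payload here, keyed by "position" if needed
                    }
                }
            }
        } finally {
            tp.close();
        }
    }
}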


Re: How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-03 Thread Gabriele Kahlout
$ curl --fail "http://192.168.34.51:8080/solr/admin/stats.jsp" > resp.xml
$ xmlstarlet sel -t -v //@numDocs resp.xml
*Extra content at the end of the document*

On Fri, Jun 3, 2011 at 8:56 PM, Ahmet Arslan iori...@yahoo.com wrote:

 : How to know how many documents are indexed? Anything more elegant than
 : parsing numFound?
  $ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
  $ xmlstarlet sel -t -v //@numFound resp.xml

 solr/admin/stats.jsp is actually an xml too and contains numDocs and maxDoc
 info.

 I think you can get numDocs with jmx too.
 http://wiki.apache.org/solr/SolrJmx




-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x) < Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with X.
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: fq null pointer exception

2011-06-03 Thread dan whelan

It returned results when I added the fl param.

Strange... wonder what is going on there

Thanks,

Dan



On 6/3/11 12:17 PM, Otis Gospodnetic wrote:

And what happens if you add &fl=<your id field here>?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 

From: dan wheland...@adicio.com
To: solr-user@lucene.apache.org
Sent: Fri, June 3, 2011 1:38:33 PM
Subject: Re: fq null pointer exception

Otis, I just deleted the documents and committed and I still get that  error.

Thanks,

Dan


On 6/3/11 9:43 AM, Otis Gospodnetic  wrote:

Dan, does the problem go away if you get rid of those 112  documents with

empty

Status or replace their empty status value with,  say, Unknown?

Otis

Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original  Message 

From: dan wheland...@adicio.com
To: solr-user@lucene.apache.org
  Sent: Fri, June 3, 2011 11:46:46 AM
Subject: fq null pointer  exception

I am noticing something strange with our  recent upgrade to solr 3.1 and

want to

see if anyone has experienced  anything similar.

I have a solr.StrField  field  named Status the values are Enabled,

Disabled, or

  ''

When I facet  on that field it I  get

Enabled 4409565
Disabled  29185
  112


The issue is  when I do a filter query

This query   works

  select/?q=*:*&fq=Status:Enabled

But when I run  this  query I get a NPE

  select/?q=*:*&fq=Status:Disabled


  Here  is part of the stack trace


  Problem accessing  /solr/global_accounts/select/. Reason:
null

  java.lang.NullPointerException
  at   org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
at   org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
at  org.apache.solr.schema.StrField.write(StrField.java:49)
   at   org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
at  org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
at

org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)

at

org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)

at

org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)

at   org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
at

org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)

at


org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)

  ...


Thanks,

  Dan








Re: Solr performance tuning - disk i/o?

2011-06-03 Thread Erick Erickson
Quick impressions:

The faceting is usually best done on fields that don't have lots of unique
values, for three reasons:
1> It's questionable how much use a gazillion facet values are to the user.
     In the case of a unique field per document, in fact, it's useless.
2> Resource requirements go up as a function of the number of unique
     terms. This is true for faceting and sorting.
3> Warmup times grow as more terms have to be read into memory.


Glancing at your warmup stuff, things like publishDate, authorStr and maybe
callnumber-first are questionable. publishDate depends on how coarse the
resolution is. If it's by day, that's not really much use. authorStr.. How many
authors have more than one publication? Would this be better served by some
kind of autosuggest rather than facets? callnumber-first... I don't
really know, but
if it's unique per document it's probably not something the user would
find useful
as a facet.

The admin page will help you determine the number of unique terms per field,
which may guide you whether or not to continue to facet on these fields.

As Otis said, doing a sort on the fields during warmup will also help.

Watch your polling interval for any slaves in relation to the warmup times.
If your polling interval is shorter than the warmup times, you run a risk of
runaway warmups.

As you've figured out, measuring responses to the first few queries doesn't
always measure what you really need <G>..

I don't have the pages handy, but autowarming is a good topic to understand,
so you might spend some time tracking it down.

Best
Erick

On Fri, Jun 3, 2011 at 11:21 AM, Demian Katz demian.k...@villanova.edu wrote:
 Thanks to you and Otis for the suggestions!  Some more information:

 - Based on the Solr stats page, my caches seem to be working pretty well (few 
 or no evictions, hit rates in the 75-80% range).
 - VuFind is actually doing two Solr queries per search (one initial search 
 followed by a supplemental spell check search -- I believe this is necessary 
 because VuFind has two separate spelling indexes, one for shingled terms and 
 one for single words).  That is probably exaggerating the problem, though 
 based on searches with debugQuery on, it looks like it's always the initial 
 search (rather than the supplemental spelling search) that's consuming the 
 bulk of the time.
 - enableLazyFieldLoading is set to true.
 - I'm retrieving 20 documents per page.
 - My JVM settings: -server -Xloggc:/usr/local/vufind/solr/jetty/logs/gc.log 
 -Xms4096m -Xmx4096m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:NewRatio=5

 It appears that a large portion of my problem had to do with autowarming, a 
 topic that I've never had a strong grasp on, though perhaps I'm finally 
 learning (any recommended primer links would be welcome!).  I did have some 
 autowarming settings in solrconfig.xml (an arbitrary search for a bunch of 
 random keywords in the newSearcher and firstSearcher events, plus 
 autowarmCount settings on all of my caches).  However, when I looked at the 
 debugQuery output, I noticed that a huge amount of time was being wasted 
 loading facets on the first search after restarting Solr, so I changed my 
 newSearcher and firstSearcher events to this:

       <arr name="queries">
         <lst>
           <str name="q">*:*</str>
           <str name="start">0</str>
           <str name="rows">10</str>
           <str name="facet">true</str>
           <str name="facet.mincount">1</str>
           <str name="facet.field">collection</str>
           <str name="facet.field">format</str>
           <str name="facet.field">publishDate</str>
           <str name="facet.field">callnumber-first</str>
           <str name="facet.field">topic_facet</str>
           <str name="facet.field">authorStr</str>
           <str name="facet.field">language</str>
           <str name="facet.field">genre_facet</str>
           <str name="facet.field">era_facet</str>
           <str name="facet.field">geographic_facet</str>
         </lst>
       </arr>

 Overall performance has increased dramatically, and now the biggest 
 bottleneck in the debug output seems to be the shingle spell checking!

 Any other suggestions are welcome, since I suspect there's still room to 
 squeeze more performance out of the system, and I'm still not sure I'm making 
 the most of autowarming...  but this seems like a big step in the right 
 direction.  Thanks again for the help!

 - Demian

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Friday, June 03, 2011 9:41 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr performance tuning - disk i/o?

 This doesn't seem right. Here are a couple of things to try:
 1) Attach debugQuery=on to your long-running queries. The QTime returned
    is the time taken to search, NOT including the time to load the docs.
    That'll help pinpoint whether the problem is the search itself, or
    assembling the documents.
 2) Are you autowarming? If so, be sure it's actually done before querying.
 3) Measure queries after the first few, particularly if 

Re: Better to have lots of smaller cores or one really big core?

2011-06-03 Thread Erick Erickson
Nope, a core is just a self-contained index, really.

What is the point of breaking them up? If you have some kind
of rolling currency (i.e. you only want to keep the last N days/weeks/months)
then you can always delete-by-query to age-out the relevant docs.
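For illustration, such an age-out delete is a single update request; a minimal
sketch, assuming a date field named indexed_date (the field name is an
assumption for the example) and a 30-day window:

    <delete><query>indexed_date:[* TO NOW-30DAYS]</query></delete>

posted to the update handler and followed by a commit, using Solr's date-math
syntax in the range query.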

You'll be able to fit more on one server if it's in a single core, but I'm
not sure what the ratio is.

My take would be to go for the simplest, which would be a single core (index),
for administrative purposes if for no other reason, but that may well just be
personal preference...

Best
Erick

On Fri, Jun 3, 2011 at 10:10 AM, JohnRodey timothydd...@yahoo.com wrote:
 Thanks Erick for the response.

 So my data structure is the same, i.e. they all use the same schema.  Though
 I think it makes sense for us to somehow break apart the data, for example
 by the date it was indexed.  I'm just trying to get a feel for how large we
 should aim to keep those (by day, by week, by month, etc...).

 So it sounds like we should aim to keep them at a size that one solr server
 can host to avoid serving multiple cores.

 One question, there is no real difference (other than configuration) from a
 server hosting its own index vs. it hosting one core, is there?

 Thanks!

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Better-to-have-lots-of-smaller-cores-or-one-really-big-core-tp3017973p3019686.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Getting payloads in Highlighter

2011-06-03 Thread lboutros
The original document is not indexed. Currently it is just stored, and it
could be stored in a filesystem or a database in the future.

The different parts of a document are indexed into multiple different fields
with different analyzers (stemming, multiple languages, regex, ...).

So I don't think your solution can be applied, but if I'm wrong, could you
please explain how?

Thanks,

Ludovic.


-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-payloads-in-Highlighter-tp3020885p3021383.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Feature: skipping caches and info about cache use

2011-06-03 Thread Robert Petersen
Why, I'm just wondering?

For a case where you know the next query can't already be in the cache
because it is so different from the norm?

Just for timing information, for instrumentation used in tuning (i.e. so
you can compare cached response times vs. non-cached response times)?


-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
Sent: Friday, June 03, 2011 10:02 AM
To: solr-user@lucene.apache.org
Subject: Feature: skipping caches and info about cache use

Hi,

Is it just me, or would others like things like:
* The ability to tell Solr (by passing some URL param?) to skip one or
more of 
its caches and get data from the index
* An additional attrib in the Solr response that shows whether the query
came 
from the cache or not

* Maybe something else along these lines?

Or maybe some of this is already there and I just don't know about it?
:)

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



Re: Feature: skipping caches and info about cache use

2011-06-03 Thread Yonik Seeley
On Fri, Jun 3, 2011 at 1:02 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
 Is it just me, or would others like things like:
 * The ability to tell Solr (by passing some URL param?) to skip one or more of
 its caches and get data from the index

Yeah, we've needed this for a long time, and I believe there's a JIRA
issue open for it.
It really needs to be on a per query basis though... so a localParam
that has cache=true/false
would be ideal.
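For illustration, once implemented such a per-query switch might be used like
this (hypothetical syntax, since at this point it's only an open JIRA issue):

    select/?q=*:*&fq={!cache=false}Status:Disabled

i.e. a localParam on an individual query or filter telling Solr not to consult
or populate the cache for just that clause, while the rest of the request is
cached normally.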


-Yonik
http://www.lucidimagination.com


Re: fq null pointer exception

2011-06-03 Thread Yonik Seeley
Dan, this doesn't really have anything to do with your filter on the
Status field except that it causes different documents to be selected.
The root cause is a schema mismatch with your index.
A string field (or so the schema is saying it's a string field) is
returning null for a value, which is impossible (null values aren't
stored... they are simply missing).
This can happen when the field is actually stored as binary (as is the
case for numeric fields).  So my guess is that a field that was
previously a numeric field is now declared to be of type string by the
current schema.
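For illustration, the kind of mismatch being described, with a hypothetical
field name: the index was built while schema.xml declared

    <field name="AccountId" type="tint" indexed="true" stored="true"/>

but the current schema.xml declares

    <field name="AccountId" type="string" indexed="true" stored="true"/>

so the stored binary (numeric) value can't be written back out as a string.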

You can try varying the fl parameter to see which field is causing
the issue, or try Luke or the Luke request handler for a lower-level
view of the index.
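For illustration, narrowing it down might look like adding stored fields back
one at a time (field lists here are assumptions):

    select/?q=*:*&fq=Status:Disabled&fl=id
    select/?q=*:*&fq=Status:Disabled&fl=id,Status

or asking the Luke request handler what is really in the index for a field:

    admin/luke?fl=Status&numTerms=10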

-Yonik
http://www.lucidimagination.com



On Fri, Jun 3, 2011 at 11:46 AM, dan whelan d...@adicio.com wrote:
 I am noticing something strange with our recent upgrade to solr 3.1 and want
 to see if anyone has experienced anything similar.

 I have a solr.StrField field named Status; the values are Enabled, Disabled,
 or ''

 When I facet on that field I get

 Enabled 4409565
 Disabled 29185
 '' 112


 The issue is when I do a filter query

 This query works

 select/?q=*:*&fq=Status:Enabled

 But when I run this query I get a NPE

 select/?q=*:*&fq=Status:Disabled


 Here is part of the stack trace


 Problem accessing /solr/global_accounts/select/. Reason:
    null

 java.lang.NullPointerException
    at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:828)
    at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:686)
    at org.apache.solr.schema.StrField.write(StrField.java:49)
    at org.apache.solr.schema.SchemaField.write(SchemaField.java:125)
    at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
    at org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
    at org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
    at org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
    at org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
    at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
    at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
 ...


 Thanks,

 Dan




Re: fq null pointer exception

2011-06-03 Thread Otis Gospodnetic
Right, so now try adding different fields and see which one breaks it again.  
Then you know which field is a problem and you can dig deeper around that field.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: dan whelan d...@adicio.com
 To: solr-user@lucene.apache.org
 Sent: Fri, June 3, 2011 4:34:40 PM
 Subject: Re: fq null pointer exception
 
 It returned results when I added the fl param.
 
 Strange... wonder what is going on there
 
 Thanks,
 
 Dan
 
 
 
 On 6/3/11 12:17 PM, Otis Gospodnetic wrote:
  And what happens if you add fl=<your id field> here?
 
  Otis
 
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene ecosystem search :: http://search-lucene.com/
 
 
 
  - Original Message 
  From: dan whelan d...@adicio.com
  To: solr-user@lucene.apache.org
  Sent: Fri, June 3, 2011 1:38:33 PM
  Subject: Re: fq null pointer exception
 
  Otis, I just deleted the documents and committed and I still get that error.
 
  Thanks,
 
  Dan
 
  On 6/3/11 9:43 AM, Otis Gospodnetic wrote:
   Dan, does the problem go away if you get rid of those 112 documents with
   empty Status or replace their empty status value with, say, Unknown?
 
   Otis
 
   Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
   Lucene ecosystem search :: http://search-lucene.com/
 
 
 


Re: Feature: skipping caches and info about cache use

2011-06-03 Thread Otis Gospodnetic
Robert,

Mainly so that you can tell how fast the search itself is when query or 
documents or filters are not cached.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/





Re: How to disable QueryElevationComponent

2011-06-03 Thread Otis Gospodnetic
Romi,

If you don't have a unique ID field, you can always create a UUID - see 
http://search-lucene.com/?q=uuid&fc_type=javadoc
If you don't want to use QEC, remove it from the list of components in 
solrconfig.xml
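For illustration, in the stock example solrconfig.xml the component is wired
up like this, and commenting the elevator entry out of last-components
disables it (the names match the example config; check your own):

    <searchComponent name="elevator" class="solr.QueryElevationComponent">
      <str name="queryFieldType">string</str>
      <str name="config-file">elevate.xml</str>
    </searchComponent>

    <arr name="last-components">
      <!-- <str>elevator</str> -->
    </arr>

And if you keep the component instead, a UUID unique key is a sketch along
these lines in schema.xml:

    <fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
    <field name="id" type="uuid" indexed="true" stored="true" default="NEW"/>
    <uniqueKey>id</uniqueKey>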

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Romi romijain3...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, May 27, 2011 5:36:22 AM
 Subject: How to disable QueryElevationComponent
 
 Hi, in my indexed document I do not want a uniqueKey field, but when I do not
 give any uniqueKey in schema.xml then it shows an exception:
 org.apache.solr.common.SolrException: QueryElevationComponent requires the
 schema to have a uniqueKeyField.
 It means QueryElevationComponent requires a uniqueKey field. Then how can I
 disable this QueryElevationComponent? Please reply.
 
 -
 Thanks & Regards
 Romi
 --
 View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-disable-QueryElevationComponent-tp2992195p2992195.html

 Sent  from the Solr - User mailing list archive at Nabble.com.
 


Re: Nutch Crawl error

2011-06-03 Thread Otis Gospodnetic
Roger, wrong list.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
 From: Roger Shah rs...@caci.com
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Thu, May 26, 2011 3:06:15 PM
 Subject: Nutch Crawl error
 
 I ran the command bin/nutch crawl urls -dir crawl -depth 3 > crawl.log
 
 When I viewed crawl.log I found some errors such as:
 
 Can't retrieve Tika parser for mime-type application/x-shockwave-flash, and 
some other similar messages for other types such as application/xml, etc.
 
 Do I need to download Tika for these errors to go away? Where can I download 
Tika so that it can work with Nutch? If there are instructions to install Tika 
to work with Nutch, please send them to me.
 
 Thanks,
 Roger
 


found a bug in query parser upgrading from 1.4.1 to 3.1

2011-06-03 Thread Jason Toy
Greeting all, I found a bug today while trying to upgrade from 1.4.1 to 3.1

In 1.4.1 I was able to insert this doc:

<?xml version="1.0" encoding="UTF-8"?><add><doc><field name="id">User
14914457</field><field name="type">User</field><field name="city_s">San
Francisco</field><field name="name_text">jtoy</field><field
name="login_text">jtoy</field><field name="description_text">life
hacker</field><field name="scores:rails_f">0.05</field></doc></add>


And then I can run the query:

http://localhost:8983/solr/select?q=life&qf=description_text&defType=dismax&sort=scores:rails_f+desc

and I will get results.

If I insert the same document into solr 3.1 and run the same query I get the
error:

Problem accessing /solr/select. Reason:

undefined field scores

For some reason, Solr has cut off the field name at the colon, so
scores:rails_f becomes scores.

I can see in the Lucene index that the data for scores:rails_f is in
the document. For that reason I believe the bug is in Solr and not in
Lucene.
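For what it's worth, the ambiguity is visible in the sort spec itself:

    sort=scores:rails_f+desc

uses ':' inside a field name, and ':' is also Solr's field/value separator in
query syntax, so a parser change between 1.4.1 and 3.1 that splits on the
colon would yield exactly the truncated name scores seen in the error. If
renaming is an option, a name like scores_rails_f would sidestep the
collision (a suggested workaround, not a confirmed fix).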



Jason Toy
socmetrics
http://socmetrics.com
@jtoy