RE: Memory problems when highlighting with a not very big index

2008-06-16 Thread r.nieto
Hi Yonik,

I've tried to change the documentCache to 60 as you told me, but the problem
persists.

If I set hl=off the memory never passes 65.000KB. But if I set it to on and
my first search uses a common word such as "a", the memory increases to
763000KB. If after this I search for another common word such as "web", the
memory grows up to 120KB.. and continues increasing with new searches.

I can't understand very well how Solr uses the RAM. Should I read about how
Lucene uses memory, or is the management not the same?

My solr configuration is:

###
<requestHandler name="standard" class="solr.DisMaxRequestHandler">
  <lst name="defaults">

    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">content^0.5 title^0.7</str>

    <int name="rows">10</int>
    <str name="fl">path,date,format,score,author,title</str>

    <str name="version">2.1</str>
    <str name="hl">off</str>
    <str name="hl.fl">content</str>
    <int name="hl.fragsize">50</int>
    <int name="hl.snippets">2</int>

    <str name="hl.simple.pre">
      <![CDATA[<span class="Highlight">]]>
    </str>
    <str name="hl.simple.post">
      <![CDATA[</span>]]>
    </str>

  </lst>
</requestHandler>

...

<filterCache
  class="solr.LRUCache"
  size="1000"
  initialSize="500"
  autowarmCount="0"/>

<queryResultCache
  class="solr.LRUCache"
  size="640"
  initialSize="320"
  autowarmCount="0"/>

<documentCache
  class="solr.LRUCache"
  size="60"
  initialSize="60"
  autowarmCount="0"/>
###
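
(For what it's worth, to build snippets the highlighter has to load the full
stored content field of every highlighted document, which may be what is
eating the memory here. On a 1.3-dev build, lazy field loading might reduce
this; a minimal sketch for the <query> section of solrconfig.xml, assuming
the option exists in your version:

<enableLazyFieldLoading>true</enableLazyFieldLoading>

With this enabled, stored fields that aren't requested are not loaded into
memory when a document is cached.)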

Any help will be very useful to me.

Thanks for your attention.

-Rober
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Friday, June 13, 2008 21:48
To: solr-user@lucene.apache.org
Subject: Re: Memory problems when highlighting with a not very big index

On Fri, Jun 13, 2008 at 3:30 PM, Roberto Nieto [EMAIL PROTECTED] wrote:
 The part that i can't understand very well is why, if i deactivate
 highlighting, the memory doesn't grow.
 Does it only use the doc cache if highlighting is used or if content
 retrieval is activated?

Perhaps you are highlighting some fields that you normally don't
return?  What is fl vs hl.fl?

-Yonik



Stylesheet

2008-06-16 Thread Mihails Agafonovs
Hi!

How can I apply a stylesheet to the search results? I mean, where can I
define which stylesheet to use?
 Regards, Mihails
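
One common approach (a sketch, assuming a stock Solr with the XSLT response
writer): put your stylesheet in conf/xslt/ and name it in the request with
the tr parameter, e.g.

http://localhost:8983/solr/select?q=solr&wt=xslt&tr=example.xsl

example.xsl ships with the Solr example configuration, so it is a good
starting point.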

Re: Language Analyser

2008-06-16 Thread Grant Ingersoll
Could you expand on what you want to do?  Do you mean you want
language detection?  Or do you just need different analyzers for
different languages?


Either way, probably the best thing to do is to search the archives
here for "multilingual search"
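
As a sketch of the per-language route (names here are illustrative; this
assumes the Snowball stemmer factory bundled with your Solr version):

<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>

You would declare one such field type per language in schema.xml and index
each document's text into a field of the matching type.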


-Grant

On Jun 15, 2008, at 11:48 PM, sherin wrote:



Hi All,

I need to develop a language analyzer to implement multilingual search. It
will be very useful if I can get a sample language analyzer and sample data
used to index with that analyzer.

thanks in advance..,


--
View this message in context: 
http://www.nabble.com/Language-Analyser-tp17857475p17857475.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


Dismax + Dynamic fields

2008-06-16 Thread Norberto Meijome
Hi everyone,

I just wanted to confirm that dynamic fields cannot be used with dismax.

By this I mean the following:

schema.xml
[...]
   <dynamicField name="dyn_1_*" type="text" indexed="true"
      stored="true" required="false" />
[..]

solrconfig.xml
[..]
  <requestHandler name="dismax1" class="solr.DisMaxRequestHandler">

    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <!-- Query fields -->
      <str name="qf">
        field1^10.0 dyn_1_*^5.0
      </str>
[...]

will never take dyn_1_* fields into consideration when searching. I've 
confirmed it with some tests, but maybe I'm missing something.

From what I've read in some emails, it seems to be the case, but I haven't
been able to find a direct reference to it.

TIA!
B

_
{Beto|Norberto|Numard} Meijome

Q. How do you make God laugh?
A. Tell him your plans.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Exposing admin through XML

2008-06-16 Thread McBride, John
Hello,

I have noticed that the solr/admin page pulls in XML status information
from add-on modules in Solr, e.g. DataImportHandler.

Is the core Solr statistical data exposed through an XML API, such that
I could collate all Solr slave status pages into one consolidated admin
panel?



Thanks,
John


Re: Exposing admin through XML

2008-06-16 Thread Shalin Shekhar Mangar
Hi John,

The output from the statistics page is in XML format on which an XSL
stylesheet is applied to make it more presentable. You can directly call the
statistics page from your programs and parse out all the data you need.
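
For example (host and port illustrative), each slave's statistics are
available as XML at

http://slave-host:8983/solr/admin/stats.jsp

so a consolidated panel can fetch that URL from every slave and parse the
results, ignoring the stylesheet reference.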

On Mon, Jun 16, 2008 at 8:19 PM, McBride, John [EMAIL PROTECTED]
wrote:

 Hello,

 I have noticed that the solr/admin page pulls in XML status information
 from add-on modules in Solr, e.g. DataImportHandler.

 Is the core Solr statistical data exposed through an XML API, such that
 I could collate all Solr slave status pages into one consolidated admin
 panel?



 Thanks,
 John




-- 
Regards,
Shalin Shekhar Mangar.


Stylesheet

2008-06-16 Thread Pallaka, Kesava

Hi!

How can I apply a stylesheet to the search results? I mean, where can I
define which stylesheet to use?

Thanks,
Kesava


Re: Dismax + Dynamic fields

2008-06-16 Thread Yonik Seeley
On Mon, Jun 16, 2008 at 10:46 AM, Norberto Meijome [EMAIL PROTECTED] wrote:
 I just wanted to confirm that dynamic fields cannot be used with dismax

There are two levels of dynamic field support.

Specific dynamic fields can be queried with dismax, but you can't
wildcard the qf or other field parameters.
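
For example (a sketch -- dyn_1_title here stands for any concrete field
matching a dyn_1_* pattern):

<str name="qf">field1^10.0 dyn_1_title^5.0</str>

i.e., each instantiated dynamic field must be named explicitly in qf; the
wildcard pattern itself won't be expanded.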

-Yonik

 By this I mean that the following :

 schema.xml
 [...]
   <dynamicField name="dyn_1_*" type="text" indexed="true"
      stored="true" required="false" />
 [..]

 solrconfig.xml
 [..]
  <requestHandler name="dismax1" class="solr.DisMaxRequestHandler">

    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <!-- Query fields -->
      <str name="qf">
        field1^10.0 dyn_1_*^5.0
      </str>
 [...]

 will never take dyn_1_* fields into consideration when searching. I've 
 confirmed it with some tests, but maybe I'm missing something.

 From what I've read in some emails, it seems to be the case, but I haven't 
 been able to find a direct reference to it.

 TIA!
 B

 _
 {Beto|Norberto|Numard} Meijome

 Q. How do you make God laugh?
 A. Tell him your plans.

 I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
 Reading disclaimers makes you go blind. Writing them is worse. You have been 
 Warned.



Adding records during a commit

2008-06-16 Thread dls1138

I've been sending data in batches to Solr with no errors reported, yet after
a commit, over 50% of the records I added (before the commit) do not show
up- even after several subsequent commits down the road.

Is it possible that Solr/Lucene could be disregarding or dropping my add
queries if those queries were executed while a commit was running?

For example, if I add 300 records, and then do a commit- during the 10-20
seconds for the commit to execute (on an index over 1.2M records), if I add
100 more records during that 10-20 second time period, are those adds lost?
I'm assuming they are not and will be visible after the next commit, but I
want to be sure as it seems that some are being dropped. I just need to know
if this can happen during commits or if I should be looking elsewhere to
resolve my dropped record problem.

Thanks.


-- 
View this message in context: 
http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17872257.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: get the fields of solr

2008-06-16 Thread wojtekpia

I'm able to get the fields specified in my schema with this query:
/solr/admin/luke?show=schema&numTerms=0

But it doesn't show me dynamic fields that I've created. Is there a way to
get dynamic fields as well?
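
One thing that may help (a sketch, based on how the Luke request handler
reports fields): dropping show=schema makes it list the fields actually
present in the index, which should include any dynamic fields instantiated
by indexed documents:

/solr/admin/luke?numTerms=0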



Yonik Seeley wrote:
 
 On Dec 20, 2007 8:47 PM, Edward Zhang [EMAIL PROTECTED] wrote:
  I tried it, but the QTime was beyond my tolerance. It costs me about
 53s
 on average to show=schema.
 
 That's probably because Luke tries to find the top terms for each
 field by default.
 Try passing in numTerms=0
 
 -Yonik
 
 
 The index contains 5456360 documents. The
 index was optimized. Is there any faster way? The information returned is as
 follows:
   <?xml version="1.0" encoding="UTF-8" ?>
   <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">50187</int>
   </lst>
   <str name="WARNING">This response format is experimental. It is likely
 to change in the future.</str>
   <lst name="index">
     <int name="numDocs">5456360</int>
     <int name="maxDoc">5456360</int>
     <int name="numTerms">25930032</int>
     <long name="version">1196480831539</long>
     <bool name="optimized">true</bool>
     <bool name="current">true</bool>
     <bool name="hasDeletions">false</bool>
     <str name="directory">
 org.apache.lucene.store.FSDirectory:[EMAIL PROTECTED]:\LabHome\solrxhome\data\index</str>
     <date name="lastModified">2007-12-02T11:26:54.625Z</date>
   </lst>




 On 12/20/07, Ryan McKinley [EMAIL PROTECTED] wrote:
 
  Check the LukeRequestHandler:
  http://wiki.apache.org/solr/LukeRequestHandler
 
 
  Edward Zhang wrote:
   I need to get all the fields of a remote solr instance. I try to parse
  the
   xmlstream returned by
 admin/get-file.jsp?file=schema.xml&core=core1. Is
   there any other way?
  
   BTW: The xmlstream contain 3 space lines in head and 2 in tail, which
   cause some trouble to parse.
  
   Every reply appreciated.
  
 
 

 
 

-- 
View this message in context: 
http://www.nabble.com/get-the-fields-of-solr-tp14431354p17873611.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Adding records during a commit

2008-06-16 Thread Yonik Seeley
No records should be dropped, regardless of if a commit or optimize is going on.
Are you checking the return codes (HTTP return codes for Solr 1.3)?
Some updates could be failing for some reason.
Also grep for Exception in the solr log file.

-Yonik

On Mon, Jun 16, 2008 at 4:02 PM, dls1138 [EMAIL PROTECTED] wrote:

 I've been sending data in batches to Solr with no errors reported, yet after
 a commit, over 50% of the records I added (before the commit) do not show
 up- even after several subsequent commits down the road.

 Is it possible that Solr/Lucene could be disregarding or dropping my add
 queries if those queries were executed while a commit was running?

 For example, if I add 300 records, and then do a commit- during the 10-20
 seconds for the commit to execute (on an index over 1.2M records), if I add
 100 more records during that 10-20 second time period, are those adds lost?
 I'm assuming they are not and will be visible after the next commit, but I
 want to be sure as it seems that some are being dropped. I just need to know
 if this can happen during commits or if I should be looking elsewhere to
 resolve my dropped record problem.

 Thanks.


 --
 View this message in context: 
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17872257.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Adding records during a commit

2008-06-16 Thread dls1138

I'm getting all 200 return codes from Solr on all of my batches. 

I skimmed the logs for errors, but I didn't try to grep for Exception. I
will take your advice and look there for some clues.

Incidentally I'm running solr 1.2 using Jetty. I'm not on 1.3 because I read
it wasn't released yet. Is there a (more stable than 1.2) branch of 1.3 I
should be using instead? 

I know 1.2 is obviously dated, and came packaged with an old version of
Lucene. Should I update either or both?





Yonik Seeley wrote:
 
 No records should be dropped, regardless of if a commit or optimize is
 going on.
 Are you checking the return codes (HTTP return codes for Solr 1.3)?
 Some updates could be failing for some reason.
 Also grep for Exception in the solr log file.
 
 -Yonik
 
 On Mon, Jun 16, 2008 at 4:02 PM, dls1138 [EMAIL PROTECTED] wrote:

 I've been sending data in batches to Solr with no errors reported, yet
 after
 a commit, over 50% of the records I added (before the commit) do not show
 up- even after several subsequent commits down the road.

 Is it possible that Solr/Lucene could be disregarding or dropping my add
 queries if those queries were executed while a commit was running?

 For example, if I add 300 records, and then do a commit- during the 10-20
 seconds for the commit to execute (on an index over 1.2M records), if I
 add
 100 more records during that 10-20 second time period, are those adds
 lost?
 I'm assuming they are not and will be visible after the next commit, but
 I
 want to be sure as it seems that some are being dropped. I just need to
 know
 if this can happen during commits or if I should be looking elsewhere to
 resolve my dropped record problem.

 Thanks.


 --
 View this message in context:
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17872257.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17874274.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Searching across many fields

2008-06-16 Thread Chris Hostetter

: This works well when the number of fields is small, but what are the
: performance ramifications when the number of fields is more than 1000? 
: Is this a serious performance killer? If yes, what would we need to
: counter act it, more RAM or faster CPU's? Or both?

the performance characteristics of having 1000 fields should be the same 
regardless of whether those fields are explicitly named in your schema, or 
created on the fly because of dynamic field declarations ... it might be 
more expensive to query 1000 fields than it is to query 10 fields, but the 
dynamic nature of it isn't going to matter -- it's the number of clauses 
that makes the difference (1000 clauses on one field is going to have about 
the same characteristics)

: Is it better to copy all fields to a content field and then always
: search there? This works, but then it is hard to boost specific field
: values. and that is what we want to do. 

you can always do both ... the schemas i work with tend to have several 
aggregation fields, many of which are populated using copyField with 
dynamic field patterns for the source ... you can still query on 
specific fields (with high boosts) in addition to querying on the 
aggregated fields.
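
for example (a sketch -- the field names are illustrative):

<copyField source="dyn_1_*" dest="text_all"/>

everything matching dyn_1_* gets copied into the single aggregated text_all 
field at index time, so you can query text_all cheaply while still boosting 
the specific source fields.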

FWIW: the main index i worry about has well over 1000 fields once you 
consider all the dynamic fields.  I think the last time i looked it was 
about 6000 ... the only thing i worry about is making sure i have 
omitNorms="true" on any dynamic field whose cardinality i can't guarantee 
will be small (ie: 2-10).

I use request handlers that execute 100-300 queries for each request 
against those dynamic fields .. but each individual query typically only 
has 1-10 clauses in it.


-Hoss



Re: Adding records during a commit

2008-06-16 Thread Yonik Seeley
On Mon, Jun 16, 2008 at 6:07 PM, dls1138 [EMAIL PROTECTED] wrote:
 I'm getting all 200 return codes from Solr on all of my batches.

IIRC, Solr 1.2 uses the update servlet and always returns 200 (you need
to look at the response body to see if there was an error or not).
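
For reference, the 1.2 update servlet body looks something like this (a
sketch -- treat the exact format as an assumption):

<result status="0"></result>

where a non-zero status (usually with a message) means the update failed
even though the HTTP code was 200.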

 I skimmed the logs for errors, but I didn't try to grep for Exception. I
 will take your advice look there for some clues.

 Incidentally I'm running solr 1.2 using Jetty. I'm not on 1.3 because I read
 it wasn't released yet. Is there a (more stable than 1.2) branch of 1.3 I
 should be using instead?

If you aren't going to go into production for another month or so, I'd
start using 1.3.
Start off with a new solrconfig.xml from 1.3 and re-make any
customizations to make sure you get the latest behavior.

 I know 1.2 is obviously dated, and came packaged with an old version of
 Lucene. Should I update either or both?

Solr takes care of updating Lucene for you... I wouldn't recommend
changing the version of Lucene independent of Solr unless you are
pretty experienced in Lucene.

-Yonik





 Yonik Seeley wrote:

 No records should be dropped, regardless of if a commit or optimize is
 going on.
 Are you checking the return codes (HTTP return codes for Solr 1.3)?
 Some updates could be failing for some reason.
 Also grep for Exception in the solr log file.

 -Yonik

 On Mon, Jun 16, 2008 at 4:02 PM, dls1138 [EMAIL PROTECTED] wrote:

 I've been sending data in batches to Solr with no errors reported, yet
 after
 a commit, over 50% of the records I added (before the commit) do not show
 up- even after several subsequent commits down the road.

 Is it possible that Solr/Lucene could be disregarding or dropping my add
 queries if those queries were executed while a commit was running?

 For example, if I add 300 records, and then do a commit- during the 10-20
 seconds for the commit to execute (on an index over 1.2M records), if I
 add
 100 more records during that 10-20 second time period, are those adds
 lost?
 I'm assuming they are not and will be visible after the next commit, but
 I
 want to be sure as it seems that some are being dropped. I just need to
 know
 if this can happen during commits or if I should be looking elsewhere to
 resolve my dropped record problem.

 Thanks.


 --
 View this message in context:
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17872257.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 View this message in context: 
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17874274.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Adding records during a commit

2008-06-16 Thread dls1138

The version of 1.2 I'm using does use the update servlet and would return 400
(or similar) if something went wrong, and 200 if OK, but, like you
suggested, perhaps a 200 does not entirely mean it completely worked.

It sounds like 1.3 is the way to go. I will start with the 1.3 config and
schema and work from there to create our fields and see what happens. 

Is there any problem using our existing 1.2 index in 1.3?




Yonik Seeley wrote:
 
 On Mon, Jun 16, 2008 at 6:07 PM, dls1138 [EMAIL PROTECTED] wrote:
 I'm getting all 200 return codes from Solr on all of my batches.
 
 IIRC, Solr1.2 uses the update servlet and always returns 200 (you need
 to look at the response body to see if there was an error or not).
 
 I skimmed the logs for errors, but I didn't try to grep for Exception.
 I
 will take your advice look there for some clues.

 Incidentally I'm running solr 1.2 using Jetty. I'm not on 1.3 because I
 read
 it wasn't released yet. Is there a (more stable than 1.2) branch of 1.3 I
 should be using instead?
 
 If you aren't going to go into production for another month or so, I'd
 start using 1.3
 Start off with a new solrconfig.xml from 1.3 and re-make any
 customizations to make sure you get the latest behavior.
 
 I know 1.2 is obviously dated, and came packaged with an old version of
 Lucene. Should I update either or both?
 
 Solr takes care of updating Lucene for you... I wouldn't recommend
 changing the version of Lucene independent of Solr unless you are
 pretty experienced in Lucene.
 
 -Yonik
 




 Yonik Seeley wrote:

 No records should be dropped, regardless of if a commit or optimize is
 going on.
 Are you checking the return codes (HTTP return codes for Solr 1.3)?
 Some updates could be failing for some reason.
 Also grep for Exception in the solr log file.

 -Yonik

 On Mon, Jun 16, 2008 at 4:02 PM, dls1138 [EMAIL PROTECTED] wrote:

 I've been sending data in batches to Solr with no errors reported, yet
 after
 a commit, over 50% of the records I added (before the commit) do not
 show
 up- even after several subsequent commits down the road.

 Is it possible that Solr/Lucene could be disregarding or dropping my
 add
 queries if those queries were executed while a commit was running?

 For example, if I add 300 records, and then do a commit- during the
 10-20
 seconds for the commit to execute (on an index over 1.2M records), if I
 add
 100 more records during that 10-20 second time period, are those adds
 lost?
 I'm assuming they are not and will be visible after the next commit,
 but
 I
 want to be sure as it seems that some are being dropped. I just need to
 know
 if this can happen during commits or if I should be looking elsewhere
 to
 resolve my dropped record problem.

 Thanks.


 --
 View this message in context:
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17872257.html
 Sent from the Solr - User mailing list archive at Nabble.com.





 --
 View this message in context:
 http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17874274.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17874662.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Adding records during a commit

2008-06-16 Thread Chris Hostetter

: want to be sure as it seems that some are being dropped. I just need to know
: if this can happen during commits or if I should be looking elsewhere to
: resolve my dropped record problem.

are you sure you aren't adding documents with identical uniqueKeys to 
existing documents?  what does docsDeleted say on your stats.jsp?



-Hoss



Re: Adding records during a commit

2008-06-16 Thread dls1138

Hoss, 

I'm sure the keys are unique. I'm generating them myself before adding. Only
a handful of items have gone in with duplicate keys.

Here is what the update handler is reporting (I assume since I last
restarted Solr on 6/13):

commits : 17429
autocommits : 0
optimizes : 0
docsPending : 3
deletesPending : 3
adds : 3
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 74639
cumulative_deletesById : 0
cumulative_deletesByQuery : 0
cumulative_errors : 0
docsDeleted : 0 

The thing is, I should have added over 200,000 records given the number of adds
I sent over since then. At this point I'm guessing that my adds are causing
errors that I haven't been able to detect, and per Yonik's suggestion,
upgrading to Solr 1.3 will help troubleshoot or possibly fix the issue. 



hossman wrote:
 
 
 : want to be sure as it seems that some are being dropped. I just need to
 know
 : if this can happen during commits or if I should be looking elsewhere to
 : resolve my dropped record problem.
 
 are you sure you aren't adding documents with identical uniqueKeys to 
 existing documents?  what does docsDeleted say on your stats.jsp?
 
 
 
 
 -Hoss
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Adding-records-during-a-commit-tp17872257p17874991.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr 1.3 - leading wildcard query feature?

2008-06-16 Thread Chris Hostetter

: Maybe I could contribute that, but I don't really know the code well. I
: found the Lucene switch that someone described in an earlier discussion
: and changed it, but that doesn't seem to be the way you would want to
: handle it.

I've updated SOLR-218 with some tips on how (I think) it could best be 
implemented.




-Hoss


RE: add new fields to existing index

2008-06-16 Thread Chris Hostetter

: I can certainly do: search for the unique key or combination of other 
: fields, then put rest fields of this document plus new fields back to 
: it.
: 
: I know this is not a too smart way, before I do that, is there any solr 
: guru out there who can think of a better way?

That is really the only way to update an existing document using solr 
out of the box.

(you can, for the record, add new fields to your schema at any time, but 
there is no way to add values for a field to an existing document)

If you know Java and you're willing to get really familiar with the low 
level lucene internals, you can write yourself a custom app to merge 
your existing index into a new (empty) index while using a 
ParallelReader that would let you associate new fields with each doc.  
Writing a utility like this is a *serious* exercise and has a lot of 
little nuances you would need to worry about to get it to work right. 

I personally have never attempted it ... even when i've needed to add new 
fields to my biggest indexes, I've always decided that the wall clock 
and CPU time needed to rebuild my index from scratch while i do 
something else was cheaper than my developer time of jumping through the 
hoops to get this to work. (particularly since there's still a non-trivial 
amount of cpu and wall clock time needed to run the thing once you write 
it)


-Hoss



Re: setAllowLeadingWildcard

2008-06-16 Thread Chris Hostetter

: Doesn't anyone know a revision number from svn that might be working and where
: setAllowLeadingWildcard is set-able?

it's not currently a user-settable option ... at the moment you need to 
modify code to do this.  

if you know Java and would like to help work on a general patch there is 
an open issue to make it (and other QueryParser options) configurable...

https://issues.apache.org/jira/browse/SOLR-218


-Hoss



Re: Wildcard on q.alt with Dismax

2008-06-16 Thread Chris Hostetter

: When I do the search vio*, I get the correct results, but no highlighting.

this is because for prefix queries solr uses a PrefixFilter 
that is guaranteed to work (no matter how many docs match) instead of a 
PrefixQuery (which might generate an exception, but can be highlighted)

see SOLR-195 for background and discussion, and SOLR-218 for discussion 
about ways to introduce config options for things like this (letting 
people who would rather use PrefixQuery instead of PrefixFilter so they 
can get highlighting at the possible risk of a TooManyClauses exception)

: When I search lu*n I get the correct results with highlighting.

because this uses a WildcardQuery which doesn't have the same safety net 
as PrefixFilter.

: When I search l*n, I get a 500 Internal Server Error.

...probably because it generates a TooManyClauses exception like i was 
describing (the full stack trace will tell you -- check your logs).  the 
limit is configurable in solrconfig.xml using <maxBooleanClauses> but the 
bigger you make it, the more RAM you'll need for it to work (and the more 
likely you'll trigger an OutOfMemory exception)
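
for example, in solrconfig.xml (1024 is the stock default):

<maxBooleanClauses>1024</maxBooleanClauses>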



-Hoss



Re: Wildcard on q.alt with Dismax

2008-06-16 Thread Rebecca Illowsky
On Mon, Jun 16, 2008 at 5:24 PM, Chris Hostetter [EMAIL PROTECTED]
wrote:


 : When I do the search vio*, I get the correct results, but no
 highlighting.

 this is because for prefix queries solr uses a PrefixFilter
 that is guaranteed to work (no matter how many docs match) instead of a
 PrefixQuery (which might generate an exception, but can be highlighted)

 see SOLR-195 for background and discussion, and SOLR-218 for discussion
 about ways to introduce config options for things like this (letting
 people who would rather use PrefixQuery instead of PrefixFilter so they
 can get highlighting at the possible risk of a TooManyClauses exception)

 : When I search lu*n I get the correct results with highlighting.

 because this uses a WildcardQuery which doesn't have the same safety net
 as PrefixFilter.

 : When I search l*n, I get a 500 Internal Server Error.

 ...probably because it generates a TooManyClauses exception like i was
 describing (the full stack trace will tell you -- check your logs).  the
 limit is configurable in solrconfig.xml using <maxBooleanClauses> but the
 bigger you make it, the more RAM you'll need for it to work (and the more
 likely you'll trigger an OutOfMemory exception)



 -Hoss




multicore vs. multiple instances

2008-06-16 Thread Jeremy Hinegardner
I'm sitting here looking over some ideas and one thing just occurred to
me: what would be the benefits of using a MultiCore approach for
sharding vs. multiple instances of solr?

That is, say I wanted to have 3 shards on a single piece of hardware,
what would be the advantages / disadvantages of using 1 instance of Solr
with 3 cores, vs 3 instances of Solr. In the latter case, with the
additional thought of each in its own servlet container, or each as a
different webapp inside a single servlet container.

A few comparisons I can think of right now are:

  * 1 JVM in multicore vs. 3 in multi-instances, with 3 servlet
containers
  * future possibility of dynamically allocating and loading of
additional cores in multicore 
  * snapshotting with MultiCore?  I'm assuming that will just work, but
I haven't tested it yet.
  * configuration may be the easiest and most maintainable with multicore ?

Any other thoughts people have on the subject?  What considerations
would you use to decide between multicore vs. multiple instances ? 

enjoy,

-jeremy

-- 

 Jeremy Hinegardner  [EMAIL PROTECTED] 



Re: multicore vs. multiple instances

2008-06-16 Thread Otis Gospodnetic
Short-circuit attempt.  Why put 3 shards on a single server in the first place? 
 If you are working with a large index and need to break it into smaller shards, 
break it into shards where each shard fully utilizes the server it is on.


Other than my thought above, I think you hit the main differences below.  The 
only other thing I'd add is that multi-core Solr is not one's only option - 
servlet containers with Solr homes defined via JNDI is another route.  I've 
used this with a lot of success with Jetty (piece of cake to configure and 
works beautifully).
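
As a sketch of that Jetty route (Jetty 6 syntax; the home path is
illustrative), each webapp context points at its own Solr home through a
JNDI env entry:

<Configure class="org.mortbay.jetty.webapp.WebAppContext">
  <New class="org.mortbay.jetty.plus.naming.EnvEntry">
    <Arg>solr/home</Arg>
    <Arg type="java.lang.String">/data/solr/shard1</Arg>
    <Arg type="boolean">true</Arg>
  </New>
</Configure>

One such context per shard gives several Solr webapps in one container,
each with its own index and config.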

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Jeremy Hinegardner [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Monday, June 16, 2008 10:42:25 PM
 Subject: multicore vs. multiple instances
 
 I'm sitting here looking over some ideas and one thing just occurred to
 me: what would be the benefits of using a MultiCore approach for
 sharding vs. multiple instances of solr?
 
 That is, say I wanted to have 3 shards on a single piece of hardware,
 what would be the advantages / disadvantages of using 1 instance of Solr
 with 3 cores, vs 3 instances of Solr. In the latter case, with the
 additional thought of each in its own servlet container, or each as a
 different webapp inside a single servlet container.
 
 A few comparisons I can think of right now are:
 
   * 1 JVM in multicore vs. 3 in multi-instances, with 3 servlet
 containers
   * future possibility of dynamically allocating and loading of
 additional cores in multicore 
   * snapshotting with MultiCore?  I'm assuming that will just work, but
 I haven't tested it yet.
   * configuration may be the easiest and most maintainable with multicore ?
 
 Any other thoughts people have on the subject?  What considerations
 would you use to decide between multicore vs. multiple instances ? 
 
 enjoy,
 
 -jeremy
 
 -- 
 
 Jeremy Hinegardner  [EMAIL PROTECTED] 



Re[2]: How to limit number of pages per domain

2008-06-16 Thread JLIST
Hello Otis,

https://issues.apache.org/jira/browse/SOLR-236 has links for
a lot of files. I figure this is what I need:
10. solr-236.patch (24 kb)

So I downloaded the patch file, and also downloaded the 2008/06/16
nightly build, then I ran this and got an error:

$ patch -p0 -i solr-236.patch --dry-run
patching file `src/test/org/apache/solr/search/TestDocSet.java'
patching file `src/java/org/apache/solr/search/CollapseFilter.java'
patching file `src/java/org/apache/solr/search/NegatedDocSet.java'
patching file `src/java/org/apache/solr/search/SolrIndexSearcher.java'
patching file `src/java/org/apache/solr/common/params/CollapseParams.java'
patching file 
`src/java/org/apache/solr/handler/component/CollapseComponent.java'
patch:  malformed patch at line 680:

Am I doing it wrong, or missing some other steps?

Thanks,
Jack

 I don't know yet, so I asked directly in that JIRA issue :)

 Applying patches is done something like this:
 
 Ah, just added it to the Solr FAQ on the Wiki for everyone:

 http://wiki.apache.org/solr/FAQ#head-bd01dc2c65240a36e7c0ee78eaef88912a0e4030

 Can you provide feedback about this particular patch once you try
 it?  I'd like to get it on Solr 1.3, actually, so any feedback would
 help.

 Thanks,
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


 - Original Message 
 From: Jack [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Thursday, May 22, 2008 12:35:28 PM
 Subject: Re: How to limit number of pages per domain
 
 I think I'll give it a try. I haven't done this before. Are there any
 instructions regarding how to apply the patch? I see 9 files, some
 displayed in gray links, some in blue links; some named .diff, some
 .patch; one has 1.3 in the file name, one has 1.3; I suppose the other
 files are for both versions. Should I apply all of them?
 https://issues.apache.org/jira/browse/SOLR-236
 
  Actually, the best documentation is really the comments in the JIRA issue
 itself.
  Is there anyone actually using Solr with this patch?