RE: range query failed if highlight is used

2008-02-26 Thread Xuesong Luo
Thanks Hoss, I created https://issues.apache.org/jira/browse/SOLR-491 to
track this bug.

The reason I need to highlight the numeric or date field is that I have to
loop through the search results to apply a role permission check on those
fields. If the searcher doesn't have permission to see the numeric/date
field of the user in the search result list, that field should be set to
null when returned. If the searcher doesn't have permission on any of the
matching fields, then the whole record should not be returned. How can I
find out which fields are the matching fields if the searcher is
searching on multiple fields? The only easy way I can think of is: if
the field is highlighted, it's a matching field. 
  
Does it make sense?
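For later readers, the check described above might look roughly like this (a Python sketch with hypothetical names; the per-result list of highlighted fields would come from Solr's highlighting section of the response):

```python
def filter_result(doc, highlighted_fields, can_view):
    """Null out matching fields the searcher may not see; drop the record
    (return None) when the searcher can see none of the matching fields."""
    visible_match = False
    out = dict(doc)
    for field in highlighted_fields:
        if can_view(field):
            visible_match = True
        else:
            out[field] = None  # hide the protected field value
    return out if visible_match else None
```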

Thanks
Xuesong

-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 26, 2008 6:06 PM
To: solr-user@lucene.apache.org
Subject: Re: range query failed if highlight is used


: I'm using solr1.3 nightly build. I defined a sint field bookCount. When
: I query on this field, it works fine without highlight. However if I
: turn on highlight (hl=true&hl.fl=bookCount), it failed due to the error

I'm not sure if I really understand what it would mean to highlight a
numeric field, and highlighting a range query probably won't ever work
because of the way range queries are implemented in Solr ... but at the
very least there should be a better error message in this case.  (and
the case of a simple single-value numeric lookup should probably work)

could you please file a bug for this?

: rows=10&start=0&hl.fl=bookCount&indent=on&q=bookCount:5&hl=true&version=2.2
: 
: 2008-02-25 16:54:53,524 ERROR [STDERR] Feb 25, 2008 4:54:53 PM
: org.apache.solr.common.SolrException log
: SEVERE: java.lang.NumberFormatException: For input string: "   "



-Hoss


Re: solr to work for my web application

2008-02-26 Thread newBea

Hey Thorsten, I need to know one more thing... currently solr is up
and running on port 8080. My understanding is that solr runs on whichever
port my tomcat is running on; I mean if tomcat is down, solr is
down, and vice versa. 

My requirement is that solr should run on port 80, and it should be brought
up through the tomcat configuration for solr. I am searching for how to meet
this requirement; please let me know if you have any clues.
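Since Solr here is just a webapp inside Tomcat, one way to meet this (a sketch, not from the thread) is to change the HTTP connector port in Tomcat's conf/server.xml from 8080 to 80; Solr then follows automatically:

```xml
<!-- conf/server.xml fragment (sketch): Tomcat 5.5 HTTP connector moved
     to port 80. On Unix, binding a port below 1024 requires root
     privileges or a port-forwarding rule (e.g. forwarding 80 to 8080). -->
<Connector port="80" maxThreads="150"
           connectionTimeout="20000" redirectPort="8443" />
```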

Thanks in advance...


Thorsten Scherler-3 wrote:
> 
> On Fri, 2008-02-22 at 04:11 -0800, newBea wrote:
>> Hi Thorsten,
>> 
>> Many thanks for your replies so far... finally I set up the correct
>> environment for Solr. It's working! :clap:
> 
> :)
> 
> Congrats, glad you got it running.
> 
>> 
>> Solr Rocks!
> 
> Indeed. :)
> 
> salu2
> 
>> 
>> Thorsten Scherler wrote:
>> > 
>> > On Thu, 2008-02-14 at 23:16 -0800, newBea wrote:
>> >> Hi Thorsten...
>> >> 
>> >> SOrry for giving u much trouble but I need some answer regarding
>> >> solr...plz
>> >> help...
>> >> 
>> >> Question1
>> >> I am using tomcat 5.5.23, so for the JNDI setup of solr I am adding
>> >> solr.xml with a context fragment as below in the
>> >> tomcat5.5/...catalina/localhost directory.
>> >> 
>> >> <Environment name="solr/home" type="java.lang.String"
>> >>  value="D:/Projects/csdb/solr" override="true" />
>> >> 
>> >> Is it the correct way of doing it? 
>> > 
>> > Yes as I understand the wiki page.
>> > 
>> >> Or do I need to add context fragment in
>> >> the server.xml of tomcat5.5?
>> >> 
>> >> Question2
>> >> I am starting solr server using start.jar from another location on C:
>> >> drive...whereas my home location indicated on D: drive. Is it the root
>> >> coz I
>> >> am not getting the search result?
>> > 
>> > Hmm, as I understand it you are starting two instances of solr! One in
>> > tomcat and the other in jetty. Why do you want that? If you have solr
>> > on tomcat you do not need the jetty anymore. It makes no sense under
>> > normal circumstances to do this.
>> > 
>> >> 
>> >> Question3
>> >> I have added the parameter <dataDir>C:\solr\data</dataDir> in
>> >> solrconfig.xml...
>> > 
>> > That seems to be wrong. It should read
>> > <dataDir>${solr.data.dir:C:\solr\data}</dataDir> but I am not using
>> > Windows so I am not sure whether you may need to escape the path.
>> > 
>> > salu2
>> > 
>> >> but the indexes are not getting stored there...indexes for
>> >> search are getting stored in the default dir of solr...any suggestions
>> >> 
>> >> Thanks in advance...
>> >> 
>> >> 
>> >> Thorsten Scherler wrote:
>> >> > 
>> >> > On Wed, 2008-02-13 at 05:04 -0800, newBea wrote:
>> >> >> I haven't used luke.xsl. But the link provided by you gives me "Solr
>> >> >> Luke Request Handler Response"...
>> >> >> 
>> >> >> My <uniqueKey> is a simple string: csid
>> >> > 
>> >> > So you have:
>> >> > <uniqueKey>csid</uniqueKey>
>> >> > 
>> >> > and
>> >> > <field name="csid" type="string" ... required="true" /> 
>> >> > 
>> >> > 
>> >> >> 
>> >> >> till now I am updating docs thru command prompt as : post.jar *.xml
>> >> >> http://localhost:8983/update
>> >> > 
>> >> > What do the docs look like? I mean, since you changed the sample
>> >> > config you sent changed documents as well, right? How do they look?
>> >> > 
>> >> >> 
>> >> >> I am not clear on how do I post xml docs
>> >> > 
>> >> > Well like you said, with the post.jar and then you will send your
>> >> > modified docs but there are many ways to trigger an add command to
>> >> solr.
>> >> > 
>> >> >> or would XML docs be posted while I
>> >> >> request solr through tomcat at the time of searching text...
>> >> > 
>> >> > To search text from tomcat you will need to have a servlet or
>> >> > something similar that contacts the solr server for the search result
>> >> > and then handles the response (e.g. applies custom xsl to the results).
>> >> > 
>> >> > 
>> >> > 
>> >> >> 
>> >> >> This manual procedure of updating the XML docs in the exampledocs
>> >> >> folder inside the distribution package restricts it to exampledocs
>> >> >> itself
>> >> > 
>> >> > No, either copy the jar to the folder where you have your documents
>> >> > or add it to the PATH.
>> >> > 
>> >> >> ...I am not
>> >> >> seeing a way for my site's text to get searched by Solr... Do I need
>> >> >> to copy start.jar and the relevant folders into my working directory
>> >> >> for the web application?
>> >> > 
>> >> > Hmm, it seems that you have not understood the second paragraph of 
>> >> > http://wiki.apache.org/solr/mySolr
>> >> > 
>> >> > "Typically it's not recommended to have your front end users/clients
>> >> > hitting Solr directly as part of an HTML form submit ... the more
>> >> > conventional way to think of it is that Solr is a backend service,
>> >> which
>> >> > your application can talk to over HTTP ..."
>> >> > 
>> >> > Meaning you have two different servers running. Alternatively you can
>> >> > run solr in the same tomcat as your application. If you follow
>> >> > SolrTomcat from the wiki it will be installed as a "solr" servlet.
>> >> > Your application will then communicate with this servlet.
>> >> > 
>> >> > salu2
>> 

Re: Boost the results for filter value in a single query

2008-02-26 Thread Vijay Khurana
Thanks for the response Yonik.
The content source field is a single-valued field. Sorting the results won't
work for me, as the content source values are arbitrary strings and there is
no set pattern, i.e. it can be axd or xbc or abc def. All I know at the time
of the query is that results for content source axd should appear before
the results for other content sources.
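The desired ordering can be approximated client-side; a sketch (hypothetical field names), relying on a stable sort so that relevancy order within each source group is preserved:

```python
def prefer_source(docs, preferred):
    # Python's sort is stable, so relevancy order within each group
    # survives; docs from the preferred content_source simply move first.
    return sorted(docs, key=lambda d: d.get("content_source") != preferred)
```

Within Solr itself, the DisMax handler's bq (boost query) parameter, e.g. bq=content_source:axd^10, can bias scores toward one source, though score boosting does not strictly guarantee ordering the way an explicit sort does.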

Thanks,
Vijay


On 2/26/08, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> If content_source is single valued (maximum of one value per
> document), then sort.
> sort=content_source asc
>
> -Yonik
>
> On Tue, Feb 26, 2008 at 4:15 AM, Vijay Khurana <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> >  Here is the problem I am facing.
> >  One of the fields in my index is content_source, which I use to filter my
> >  results.
> >  I want all the documents that have content_source value as ABC OR DEF,
> >  and all the documents having DEF
> >  should appear before the documents that have content_source value as
> >  ABC. I am using DisMaxRequestHandler and Solr 2.0.
> >
> >  Currently I am using below query:
> >
> >  q=test&qt=dismax&fq=content_source:("ABC" OR "DEF")
> >
> >  The above query doesn't guarantee the appearance of DEF documents
> >  before ABC.
> >  I want to do this in one query. Is it possible?
> >
> >  Appreciate your help.
> >
> >  Thanks,
> >  Vijay
> >
>


RE: solr to handle special charater

2008-02-26 Thread Chris Hostetter

: By the way, I used DisMaxRequestHandler in solrconfig.xml. I googled a 
: little about DisMaxRequestHandler, it says that '+' and '-' characters 
: prefixing nonwhitespace characters are treated as "mandatory" and 
: "prohibited" modifiers for the subsequent terms, but it doesn't say 
: anything about just '+' or '-' characters.

Hmmm... well if you add debugQuery=true to your requests, and look at the 
parsed query string, you can see that the "-" is getting applied to the 
DisjunctionMax query being built for the second clause.  Which means either 
the documentation in the wiki is wrong, or there is a bug.

I apparently wrote that documentation, and I wrote the original dismax 
code ... I thought someone else had at some point added in some special 
escaping for "-" or "+" followed by whitespace (which would explain why I 
wrote that in the documentation), but I can't see any evidence of it now.

So I'm going to change the wiki, and open a bug to add a feature like that 
... but in the meantime...

: Does anyone know a workaround, before I strip off '+'/'-' by myself?

...I would just do that. 
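That stripping can be done client-side before the query is sent to the dismax handler; a sketch, assuming only bare '+'/'-' tokens (surrounded by whitespace or at the string edges) should be removed, while modifiers attached to terms are kept:

```python
import re

def strip_bare_modifiers(q):
    # Remove '+'/'-' runs that stand alone; '+term'/'-term' modifiers
    # directly prefixing non-whitespace are left untouched.
    cleaned = re.sub(r'(?:(?<=\s)|^)[+-]+(?=\s|$)', '', q)
    return " ".join(cleaned.split())
```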


-Hoss



Re: range query failed if highlight is used

2008-02-26 Thread Chris Hostetter

: I'm using solr1.3 nightly build. I defined a sint field bookCount. When
: I query on this field, it works fine without highlight. However if I
: turn on highlight(hl=true&hl.fl=bookCount), it failed due to the error

I'm not sure if I really understand what it would mean to highlight a 
numeric field, and highlighting a range query probably won't ever work 
because of the way range queries are implemented in Solr ... but at the 
very least there should be a better error message in this case.  (and the 
case of a simple single-value numeric lookup should probably work)

could you please file a bug for this?

: rows=10&start=0&hl.fl=bookCount&indent=on&q=bookCount:5&hl=true&version=2.2
: 
: 2008-02-25 16:54:53,524 ERROR [STDERR] Feb 25, 2008 4:54:53 PM
: org.apache.solr.common.SolrException log
: SEVERE: java.lang.NumberFormatException: For input string: "   "



-Hoss



Re: similarity search with solr

2008-02-26 Thread Erik Hatcher


On Feb 26, 2008, at 6:11 PM, Michael Hess wrote:
Is it possible to pass a document id to solr and get back the
documents that are close to it?


Indeed:


Erik



Re: Why does highlight use the index analyzer (instead of query)?

2008-02-26 Thread Chris Hostetter

I'm not much of a highlighter expert, but this *seems* like it was probably 
intentional ... you are talking about the use case where you have a stored 
field, and no term positions, correct? ... so in order to highlight, the 
highlighter needs to analyze the stored text to find the word positions?

The "index" analyzer is the one that is intended to be used on the text 
stored in documents, while the "query" analyzer is the one intended to be 
used on (shorter) query strings ... so when highlighting you use the 
"query" analyzer to build up the query object and the terms to search for, 
and the "index" analyzer to parse the stored field ... those two 
analyzers have to be compatible/complementary for this to work, but they 
have to be compatible/complementary in the exact same way for the 
queries to match at all.

Also: this way you get the exact same behavior even if you switch from 
storing the field to using TermPositions.


...but like I said: this is just my assumption, I don't know that much 
about the highlighter.


: I am using Solr 1.2.0 with a custom compound word analyzer, which inserts the 
: decompositions into the token stream. Because I assume that when the user 
: queries for a compound word, he is interested only in whole-word matches, I 
: have it enabled only in my index analyzer chain.
: 
: However, due to a bug in the analyzer (entirely my fault), I came to realize 
: that when highlighting is enabled, the highlighter uses the index analyzer 
: chain to find the matches, instead of the query analyzer chain.
: 
: I find this curious, and I was wondering whether this is intentional, and if 
: so, what is the rationale for this?
: 
: Best regards
: - Christian
: 



-Hoss



Re: Search terms in the result

2008-02-26 Thread Chris Hostetter
: 
: Thank you for answering my question, but maybe I wasn't clear in
: explaining it. In fact, what I meant was that using the Query
: compilation in Lucene you can obtain an array of strings containing
: the terms used from the system to search the indices.

are you refering to Query.extractTerms() ?

There isn't anything like that built in right now ... I would imagine it 
would be pretty easy to add it as a component, however.

If you are using the standard request handler you could use 
debugQuery=true and then show them what the "parsedquery" looks like ... 
it will have the same basic information in string form.



-Hoss



Re: Offsets?

2008-02-26 Thread Chris Hostetter

(moved to solr-user)

analysis.jsp doesn't deal with the index, it just applies the appropriate 
analyzer to input text to tell you what would be indexed.

From what I gather, you want the "meat" of highlighting, but you don't 
want any actual highlighting to take place -- you want something that 
just figures out at what offsets in the original string the terms to be 
highlighted were, and then returns those offset numbers, correct?

I would imagine the easiest way to implement this would be to write a 
highlighter that doesn't mark up the original string at all ... just 
returns some easy-to-parse metadata about the term and the offset where it 
was found in the original string.

If it is implemented as an actual Highlighter, it can be used as a plugin 
(I think ... we have configs for that already, right?) that the existing 
Highlighting component would load, but you are a little limited in the 
structure of the metadata you can return.  Alternately you can implement 
your own component as ryan mentioned, and then put arbitrary data in the 
response (which will be rendered appropriately by whichever response writer 
you use -- json, xml, whatever)
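The offset-only idea can be illustrated outside Solr; a toy sketch (not a real Highlighter plugin) that returns (term, start, end) metadata instead of marked-up text — a real implementation would reuse the analyzer's token stream rather than a regex word scan:

```python
import re

def term_offsets(text, terms):
    # Scan the stored text for whole-word matches of the query terms and
    # return (term, start, end) tuples instead of marked-up text.
    wanted = {t.lower() for t in terms}
    return [(m.group(0), m.start(), m.end())
            for m in re.finditer(r"\w+", text)
            if m.group(0).lower() in wanted]
```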


: Subject: Re: Offsets?
: 
: I appreciate all the help - I think, for now, we'll try and leverage the
: analysis.jsp approach, as it appears that different approaches might be in the
: works, and I don't want to much with any of that just yet :)
: 
: If I get some time, maybe I'll have better news in the future.  Thanks again!
: 
: Steve
: 
: At 02:44 PM 2/26/2008, you wrote:
: 
: > > This is a possibility, but I was thinking if I could get SOLR to return
: > > that information in the initial JSON, then I could save a step and speed
: > > things up immensely.
: > 
: > nothing off the shelf to do it... you may want to look at implementing a
: > "search component" to augment the response with offset information.
: > 
: > ryan
: 



-Hoss



Re: Shared index base

2008-02-26 Thread Alok K. Dhir
Thanks for your response - I've been waiting for this very 
clarification.  So 'commit()' makes readers re-read the indexes?
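The notification step might look like this (a sketch with hypothetical host names): after the single writer finishes updating, POST an empty `<commit/>` to each read-only instance's update handler so it opens a new searcher on the shared index:

```python
from urllib.request import Request

def commit_request(base_url):
    # Build (but don't send) a POST of an empty <commit/> to a reader's
    # update handler; sending it makes that instance open a new searcher.
    return Request(base_url + "/update", data=b"<commit/>",
                   headers={"Content-Type": "text/xml"})

# Sending would be urllib.request.urlopen(commit_request(reader)) for each
# reader, e.g. "http://search1:8983/solr", "http://search2:8983/solr".
```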

On Feb 26, 2008, at 7:03 PM, Mike Klaas wrote:

There hasn't really been a concrete answer given in this thread,  
so:  It works to point multiple Solr's at a single data dir, but you  
can't have more than one writer.  If you try, the index could become  
corrupted or inconsistent (especially if you are using 'simple' lock  
type).  Also, the Solrs do not communicate with each other.  You  
have to tell the readers manually that the index is updated (via  
commit()--autoCommit will not work).


-Mike

On 26-Feb-08, at 9:39 AM, Alok Dhir wrote:

Are you saying all the servers will use the same 'data' dir?  Is  
that a supported config?


On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:

We're about to do the same thing here, but have not tried yet. We  
currently run Solr with replication across several servers. So  
long as only one server is doing updates to the index, I think it  
should work fine.



Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there were such discussions about the subject, but I want 
to ask again if somebody could share more information.
We are planning to have several separate servers for our search  
engine. One of them will be index/search server, and all others  
are search only.
We want to use SAN (BTW: should we consider something else?) and  
give access to it from all servers. So all servers will use the  
same index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems  
noticed? Or any suggestions, even about different configurations  
are highly appreciated.


Thanks,
Gene










boost ignored with wildcard queries

2008-02-26 Thread Head

Using the StandardRequestHandler, it appears that the index boost values are
ignored when the query has a wildcard in it.   For example, if I have 2
docs and one has a boost of 1.0 and another has a boost of 10.0, and I
do a search for "bob*", both records will be returned with the same score of
1.0.   If I just do a normal search, then the doc that has the higher boost
has the higher score, as expected.

Is this a bug?

~Tom

p.s. Here's what my debug looks like:


1.0 = (MATCH)
ConstantScoreQuery([EMAIL PROTECTED]), product of:
  1.0 = boost
  1.0 = queryNorm


1.0 = (MATCH)
ConstantScoreQuery([EMAIL PROTECTED]), product of:
  1.0 = boost
  1.0 = queryNorm

-- 
View this message in context: 
http://www.nabble.com/boost-ignored-with-wildcard-queries-tp15703334p15703334.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Shared index base

2008-02-26 Thread Mike Klaas
There hasn't really been a concrete answer given in this thread, so:   
It works to point multiple Solr's at a single data dir, but you can't  
have more than one writer.  If you try, the index could become  
corrupted or inconsistent (especially if you are using 'simple' lock  
type).  Also, the Solrs do not communicate with each other.  You have  
to tell the readers manually that the index is updated (via commit()-- 
autoCommit will not work).


-Mike

On 26-Feb-08, at 9:39 AM, Alok Dhir wrote:

Are you saying all the servers will use the same 'data' dir?  Is  
that a supported config?


On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:

We're about to do the same thing here, but have not tried yet. We  
currently run Solr with replication across several servers. So long  
as only one server is doing updates to the index, I think it should  
work fine.



Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there were such discussions about the subject, but I want to 
ask again if somebody could share more information.
We are planning to have several separate servers for our search  
engine. One of them will be index/search server, and all others  
are search only.
We want to use SAN (BTW: should we consider something else?) and  
give access to it from all servers. So all servers will use the  
same index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems  
noticed? Or any suggestions, even about different configurations  
are highly appreciated.


Thanks,
Gene








similarity search with solr

2008-02-26 Thread Michael Hess
Is it possible to pass a document id to solr and get back the documents that 
are close to it?

Thanks,
Michael 


Re: Start of solr 1.3 with patch collapse

2008-02-26 Thread David Pratt
Hi kordi. What was the issue and how did you solve it, for the benefit of
the list? Many thanks.


Regards,
David

kordi wrote:

I solved it now myself, sorry for the post.

kordi wrote:

I can't start solr trunk with the collapse patch; I got the following error:

SEVERE: Could not start SOLR. Check solr/home property
java.lang.NoSuchMethodError:
org.apache.lucene.analysis.Token.<init>(IILjava/lang/String;)V
at
org.apache.solr.analysis.SynonymMap.makeTokens(SynonymMap.java:103)
at
org.apache.solr.analysis.SynonymFilterFactory.parseRules(SynonymFilterFactory.java:92)
at
org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:49)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:256)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:84)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:74)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:314)

My start parameters are:

java -Dsolr.solr.home=/opt/solr-tomcat/solr -Xms250M -Xmx250M 
-verbose:gc -jar bootstrap.jar







-XX:+UseLargePages ?

2008-02-26 Thread Matthew Runo

Hello!

I was wondering if there is any impact on using the LargePages JVM  
setting with Solr. Has anyone used this? Does it help performance?  
Hurt it?


We have several 64-bit servers with 16G of RAM each, and were 
wondering if we should be using the -XX:+UseLargePages setting on the 
JVM for solr/tomcat.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833



Re: Start of solr 1.3 with patch collapse

2008-02-26 Thread kordi

I solved it now myself, sorry for the post.

kordi wrote:
> 
> I can't start solr trunk with the collapse patch; I got the following error:
> 
> SEVERE: Could not start SOLR. Check solr/home property
> java.lang.NoSuchMethodError:
> org.apache.lucene.analysis.Token.<init>(IILjava/lang/String;)V
> at
> org.apache.solr.analysis.SynonymMap.makeTokens(SynonymMap.java:103)
> at
> org.apache.solr.analysis.SynonymFilterFactory.parseRules(SynonymFilterFactory.java:92)
> at
> org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:49)
> at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:256)
> at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:84)
> at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:74)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:314)
> 
> My start parameters are:
> 
> java -Dsolr.solr.home=/opt/solr-tomcat/solr -Xms250M -Xmx250M 
> -verbose:gc -jar bootstrap.jar
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Start-of-solr-1.3-with-patch-collapse-tp15683379p15689102.html
Sent from the Solr - User mailing list archive at Nabble.com.



Start of solr 1.3 with patch collapse

2008-02-26 Thread kordi

I can't start solr trunk with the collapse patch; I got the following error:

SEVERE: Could not start SOLR. Check solr/home property
java.lang.NoSuchMethodError:
org.apache.lucene.analysis.Token.<init>(IILjava/lang/String;)V
at
org.apache.solr.analysis.SynonymMap.makeTokens(SynonymMap.java:103)
at
org.apache.solr.analysis.SynonymFilterFactory.parseRules(SynonymFilterFactory.java:92)
at
org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:49)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:256)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:84)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:74)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:314)

My start parameters are:

java -Dsolr.solr.home=/opt/solr-tomcat/solr -Xms250M -Xmx250M 
-verbose:gc -jar bootstrap.jar

-- 
View this message in context: 
http://www.nabble.com/Start-of-solr-1.3-with-patch-collapse-tp15683379p15683379.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Shared index base

2008-02-26 Thread Matthew Runo
That's true about the commit issue. With that in mind, it might be  
better to use replication - just keep an eye on it to ensure it's  
working, as my 1.2 install (3 servers) tends to stop every once in a  
blue moon.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 10:53 AM, Walter Underwood wrote:


SAN is not NFS. I would expect SAN to be fast.

wunder

On 2/26/08 10:47 AM, "Jae Joo" <[EMAIL PROTECTED]> wrote:



In my environment, there is NO big difference between local disk and SAN
based file systems.
A little slowdown, but not a problem (1 or 2%).
I do have 4 sets of Solr indices, each more than 10G, on 3 servers.
I think that it is not a good way to share a SINGLE index - disk is pretty
cheap and we can add more disk in the SAN pretty easily.
I have another server called "Master" with a local disk based Solr index
to update the index.
By some accident or timeout, the update is sometimes not done successfully,
so I need to do something manually.
If you have only one index, there is a risk of messing up the index.

Thanks,

Jae


-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Tue 2/26/2008 1:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Shared index base

I saw a 100X slowdown running with indexes on NFS.

I don't understand going through a lot of effort with unsupported
configurations just to share an index. Local disk is cheap, the
snapshot stuff works well, and local discs avoid a single point
of failure.

The testing time to make a shared index work with each new
release of Solr is almost certainly more expensive than buying
local disc.

The single point of failure is a real issue. I've seen two discs
fail on one RAID. When that happens, you've lost all of your
search for hours or days.

Finally, how do you tell Solr that the index has changed and
it needs a new Searcher? Normally, that is a commit, but you
don't want to commit from a read-only Solr.

wunder

On 2/26/08 10:17 AM, "Matthew Runo" <[EMAIL PROTECTED]> wrote:

I hope so. I've found that every once in a while Solr 1.2 replication
will die, from a temp-index file that seems to jam it up. Removing
that file on all the servers fixes the issue though.

We'd like to be able to point all the servers at an NFS location for
their index files, and use a single server to update it.

Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:


Are you saying all the servers will use the same 'data' dir?  Is
that a supported config?

On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:


We're about to do the same thing here, but have not tried yet. We
currently run Solr with replication across several servers. So  
long
as only one server is doing updates to the index, I think it  
should

work fine.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there were such discussions about the subject, but I want 
to ask again if somebody could share more information.

ask again if somebody could share more information.
We are planning to have several separate servers for our search
engine. One of them will be index/search server, and all others
are search only.
We want to use SAN (BTW: should we consider something else?) and
give access to it from all servers. So all servers will use the
same index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems
noticed? Or any suggestions, even about different configurations
are highly appreciated.

Thanks,
Gene

















Re: Shared index base

2008-02-26 Thread Walter Underwood
SAN is not NFS. I would expect SAN to be fast.

wunder

On 2/26/08 10:47 AM, "Jae Joo" <[EMAIL PROTECTED]> wrote:

> 
> In my environment, there is NO big difference between local disk and SAN
> based file systems.
> A little slowdown, but not a problem (1 or 2%).
> I do have 4 sets of Solr indices, each more than 10G, on 3 servers.
> I think that it is not a good way to share a SINGLE index - disk is pretty
> cheap and we can add more disk in the SAN pretty easily.
> I have another server called "Master" with a local disk based Solr index
> to update the index.
> By some accident or timeout, the update is sometimes not done successfully,
> so I need to do something manually.
> If you have only one index, there is a risk of messing up the index.
> 
> Thanks,
> 
> Jae
> 
> 
> -Original Message-
> From: Walter Underwood [mailto:[EMAIL PROTECTED]
> Sent: Tue 2/26/2008 1:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Shared index base
>  
> I saw a 100X slowdown running with indexes on NFS.
> 
> I don't understand going through a lot of effort with unsupported
> configurations just to share an index. Local disk is cheap, the
> snapshot stuff works well, and local discs avoid a single point
> of failure.
> 
> The testing time to make a shared index work with each new
> release of Solr is almost certainly more expensive than buying
> local disc.
> 
> The single point of failure is a real issue. I've seen two discs
> fail on one RAID. When that happens, you've lost all of your
> search for hours or days.
> 
> Finally, how do you tell Solr that the index has changed and
> it needs a new Searcher? Normally, that is a commit, but you
> don't want to commit from a read-only Solr.
> 
> wunder
> 
> On 2/26/08 10:17 AM, "Matthew Runo" <[EMAIL PROTECTED]> wrote:
> 
>> I hope so. I've found that every once in a while Solr 1.2 replication
>> will die, from a temp-index file that seems to jam it up. Removing
>> that file on all the servers fixes the issue though.
>> 
>> We'd like to be able to point all the servers at an NFS location for
>> their index files, and use a single server to update it.
>> 
>> Thanks!
>> 
>> Matthew Runo
>> Software Developer
>> Zappos.com
>> 702.943.7833
>> 
>> On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:
>> 
>>> Are you saying all the servers will use the same 'data' dir?  Is
>>> that a supported config?
>>> 
>>> On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:
>>> 
 We're about to do the same thing here, but have not tried yet. We
 currently run Solr with replication across several servers. So long
 as only one server is doing updates to the index, I think it should
 work fine.
 
 
 Thanks!
 
 Matthew Runo
 Software Developer
 Zappos.com
 702.943.7833
 
 On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
 
> I know there were such discussions about the subject, but I want to
> ask again if somebody could share more information.
> We are planning to have several separate servers for our search
> engine. One of them will be index/search server, and all others
> are search only.
> We want to use SAN (BTW: should we consider something else?) and
> give access to it from all servers. So all servers will use the
> same index base, without any replication, same files.
> Is this a good practice? Did somebody do the same? Any problems
> noticed? Or any suggestions, even about different configurations
> are highly appreciated.
> 
> Thanks,
> Gene
 
>>> 
>> 
> 
> 
> 
> 



RE: Shared index base

2008-02-26 Thread Jae Joo

In my environment, there is NO big difference between local disk and SAN based 
file systems.
A little slowdown, but not a problem (1 or 2%).
I do have 4 sets of Solr indices, each more than 10G, on 3 servers. 
I think that it is not a good way to share a SINGLE index - disk is pretty cheap 
and we can add more disk in the SAN pretty easily. 
I have another server called "Master" with a local disk based Solr index 
to update the index.
By some accident or timeout, the update is sometimes not done successfully, so I 
need to do something manually.
If you have only one index, there is a risk of messing up the index.

Thanks,

Jae


-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Tue 2/26/2008 1:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Shared index base
 
I saw a 100X slowdown running with indexes on NFS.

I don't understand going through a lot of effort with unsupported
configurations just to share an index. Local disk is cheap, the
snapshot stuff works well, and local discs avoid a single point
of failure.

The testing time to make a shared index work with each new
release of Solr is almost certainly more expensive than buying
local disc.

The single point of failure is a real issue. I've seen two discs
fail on one RAID. When that happens, you've lost all of your
search for hours or days.

Finally, how do you tell Solr that the index has changed and
it needs a new Searcher? Normally, that is a commit, but you
don't want to commit from a read-only Solr.

wunder

On 2/26/08 10:17 AM, "Matthew Runo" <[EMAIL PROTECTED]> wrote:

> I hope so. I've found that every once in a while Solr 1.2 replication
> will die, from a temp-index file that seems to ham it up. Removing
> that file on all the servers fixes the issue though.
> 
> We'd like to be able to point all the servers at an NFS location for
> their index files, and use a single server to update it.
> 
> Thanks!
> 
> Matthew Runo
> Software Developer
> Zappos.com
> 702.943.7833
> 
> On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:
> 
>> Are you saying all the servers will use the same 'data' dir?  Is
>> that a supported config?
>> 
>> On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:
>> 
>>> We're about to do the same thing here, but have not tried yet. We
>>> currently run Solr with replication across several servers. So long
>>> as only one server is doing updates to the index, I think it should
>>> work fine.
>>> 
>>> 
>>> Thanks!
>>> 
>>> Matthew Runo
>>> Software Developer
>>> Zappos.com
>>> 702.943.7833
>>> 
>>> On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
>>> 
 I know there was such discussions about the subject, but I want to
 ask again if somebody could share more information.
 We are planning to have several separate servers for our search
 engine. One of them will be index/search server, and all others
 are search only.
 We want to use SAN (BTW: should we consider something else?) and
 give access to it from all servers. So all servers will use the
 same index base, without any replication, same files.
 Is this a good practice? Did somebody do the same? Any problems
 noticed? Or any suggestions, even about different configurations
 are highly appreciated.
 
 Thanks,
 Gene
>>> 
>> 
> 






Re: Shared index base

2008-02-26 Thread Walter Underwood
I saw a 100X slowdown running with indexes on NFS.

I don't understand going through a lot of effort with unsupported
configurations just to share an index. Local disk is cheap, the
snapshot stuff works well, and local discs avoid a single point
of failure.

The testing time to make a shared index work with each new
release of Solr is almost certainly more expensive than buying
local disc.

The single point of failure is a real issue. I've seen two discs
fail on one RAID. When that happens, you've lost all of your
search for hours or days.

Finally, how do you tell Solr that the index has changed and
it needs a new Searcher? Normally, that is a commit, but you
don't want to commit from a read-only Solr.

wunder

On 2/26/08 10:17 AM, "Matthew Runo" <[EMAIL PROTECTED]> wrote:

> I hope so. I've found that every once in a while Solr 1.2 replication
> will die, from a temp-index file that seems to ham it up. Removing
> that file on all the servers fixes the issue though.
> 
> We'd like to be able to point all the servers at an NFS location for
> their index files, and use a single server to update it.
> 
> Thanks!
> 
> Matthew Runo
> Software Developer
> Zappos.com
> 702.943.7833
> 
> On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:
> 
>> Are you saying all the servers will use the same 'data' dir?  Is
>> that a supported config?
>> 
>> On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:
>> 
>>> We're about to do the same thing here, but have not tried yet. We
>>> currently run Solr with replication across several servers. So long
>>> as only one server is doing updates to the index, I think it should
>>> work fine.
>>> 
>>> 
>>> Thanks!
>>> 
>>> Matthew Runo
>>> Software Developer
>>> Zappos.com
>>> 702.943.7833
>>> 
>>> On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
>>> 
 I know there was such discussions about the subject, but I want to
 ask again if somebody could share more information.
 We are planning to have several separate servers for our search
 engine. One of them will be index/search server, and all others
 are search only.
 We want to use SAN (BTW: should we consider something else?) and
 give access to it from all servers. So all servers will use the
 same index base, without any replication, same files.
 Is this a good practice? Did somebody do the same? Any problems
 noticed? Or any suggestions, even about different configurations
 are highly appreciated.
 
 Thanks,
 Gene
>>> 
>> 
> 



RE: Shared index base

2008-02-26 Thread Charlie Jackson
How do you handle commits to the index? By that, I mean that Solr
recreates its searcher when you issue a commit, but only on the system
that does the commit. Wouldn't you be left with stale searchers on the
other machines? 

- Charlie
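
A workaround sometimes used for this (not from this thread, and whether it is safe on a reader depends on the Solr version and its lock settings) is to post an empty <commit/> to each read-only instance's standard /solr/update handler, which forces it to open a new searcher over the shared files. A minimal sketch; the host names are illustrative assumptions:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SearcherRefresher {
    // Build the update-handler URL for one instance (hypothetical hosts).
    static String updateUrl(String host) {
        return "http://" + host + "/solr/update";
    }

    // POST an empty <commit/> so the instance reopens its searcher.
    static void refresh(String host) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(updateUrl(host)).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/xml");
        try (OutputStream out = conn.getOutputStream()) {
            out.write("<commit/>".getBytes("UTF-8"));
        }
        if (conn.getResponseCode() != 200) {
            throw new IllegalStateException("commit failed on " + host);
        }
    }

    public static void main(String[] args) throws Exception {
        for (String host : args) {
            refresh(host);
        }
    }
}
```

The empty commit writes nothing new; it only makes each reader open a fresh IndexSearcher so the master's changes become visible.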


-Original Message-
From: Matthew Runo [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 26, 2008 12:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Shared index base

I hope so. I've found that every once in a while Solr 1.2 replication  
will die, from a temp-index file that seems to ham it up. Removing  
that file on all the servers fixes the issue though.

We'd like to be able to point all the servers at an NFS location for  
their index files, and use a single server to update it.

Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:

> Are you saying all the servers will use the same 'data' dir?  Is  
> that a supported config?
>
> On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:
>
>> We're about to do the same thing here, but have not tried yet. We  
>> currently run Solr with replication across several servers. So long  
>> as only one server is doing updates to the index, I think it should  
>> work fine.
>>
>>
>> Thanks!
>>
>> Matthew Runo
>> Software Developer
>> Zappos.com
>> 702.943.7833
>>
>> On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
>>
>>> I know there was such discussions about the subject, but I want to  
>>> ask again if somebody could share more information.
>>> We are planning to have several separate servers for our search  
>>> engine. One of them will be index/search server, and all others  
>>> are search only.
>>> We want to use SAN (BTW: should we consider something else?) and  
>>> give access to it from all servers. So all servers will use the  
>>> same index base, without any replication, same files.
>>> Is this a good practice? Did somebody do the same? Any problems  
>>> noticed? Or any suggestions, even about different configurations  
>>> are highly appreciated.
>>>
>>> Thanks,
>>> Gene
>>
>



Re: Shared index base

2008-02-26 Thread Matthew Runo
I hope so. I've found that every once in a while Solr 1.2 replication  
will die, from a temp-index file that seems to ham it up. Removing  
that file on all the servers fixes the issue though.


We'd like to be able to point all the servers at an NFS location for  
their index files, and use a single server to update it.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 9:39 AM, Alok Dhir wrote:

Are you saying all the servers will use the same 'data' dir?  Is  
that a supported config?


On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:

We're about to do the same thing here, but have not tried yet. We  
currently run Solr with replication across several servers. So long  
as only one server is doing updates to the index, I think it should  
work fine.



Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there was such discussions about the subject, but I want to  
ask again if somebody could share more information.
We are planning to have several separate servers for our search  
engine. One of them will be index/search server, and all others  
are search only.
We want to use SAN (BTW: should we consider something else?) and  
give access to it from all servers. So all servers will use the  
same index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems  
noticed? Or any suggestions, even about different configurations  
are highly appreciated.


Thanks,
Gene








RE: solr to handle special charater

2008-02-26 Thread Kevin Xiao
By the way, I used DisMaxRequestHandler in solrconfig.xml. I googled a little 
about DisMaxRequestHandler; it says that '+' and '-' characters prefixing 
non-whitespace characters are treated as "mandatory" and "prohibited" modifiers 
for the subsequent terms, but it doesn't say anything about bare '+' or '-' 
characters.

Does anyone know a workaround, before I strip off '+'/'-' myself?

Thanks,
- Kevin
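
Until the parser copes with bare '+'/'-', one client-side option is to backslash-escape the query metacharacters before the term is sent. This is a hypothetical helper, not a Solr API, and the character set below is an assumption:

```java
public class QueryEscaper {
    // Characters commonly treated as query syntax (assumed list, adjust
    // to your parser): backslash must be escaped first, so it leads.
    private static final String SPECIALS = "\\+-!():^[]\"{}~*?|&;";

    // Prefix each special character with a backslash.
    public static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIALS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this, "xyz - abc" becomes "xyz \- abc", so the lone '-' is no longer read as a prohibited-term modifier.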

-Original Message-
From: Kevin Xiao [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 26, 2008 1:14 AM
To: solr-user@lucene.apache.org
Subject: solr to handle special charater

Hi there,

I am new to Solr. I used the following analyzer - I tried both 
WhitespaceTokenizerFactory and StandardTokenizerFactory, but when I search "xyz 
- abc", it didn't return anything ("xyz abc" returns "xyz - abc", though). I 
used the tokenizer/filter at both index and query time. Is that a Solr bug? How 
do I make it work?

Thanks,
- Kevin



Re: Shared index base

2008-02-26 Thread Alok Dhir
Are you saying all the servers will use the same 'data' dir?  Is that  
a supported config?


On Feb 26, 2008, at 12:29 PM, Matthew Runo wrote:

We're about to do the same thing here, but have not tried yet. We  
currently run Solr with replication across several servers. So long  
as only one server is doing updates to the index, I think it should  
work fine.



Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there was such discussions about the subject, but I want to  
ask again if somebody could share more information.
We are planning to have several separate servers for our search  
engine. One of them will be index/search server, and all others are  
search only.
We want to use SAN (BTW: should we consider something else?) and  
give access to it from all servers. So all servers will use the  
same index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems  
noticed? Or any suggestions, even about different configurations  
are highly appreciated.


Thanks,
Gene






Re: Shared index base

2008-02-26 Thread Rachel McConnell
We tried this architecture for our initial rollout of Solr/Lucene to
our production application.  We ran into a problem with it, which may
or may not apply to you.  Our production software servers all are
monitored for uptime by a daemon which pings them periodically and
restarts them if a response is not received within a configurable
period of time.

We found that under some orderings of restarts, the Lucene appservers
would not come up correctly.  I don't recall the exact details, and I
don't think it ever corrupted the index.  As I recall, we had to
restart in a particular order to avoid freezes on the read-only
servers, and of course the automated monitor, separate for each
server, could not do that.

YMMV of course, but this would be something to test thoroughly in a
shared index situation.  We moved a while ago to each server (even on
the same machine) having its own index files, and using the snapshot
puller/shooter processes for replication.

Rachel

On 2/26/08, Matthew Runo <[EMAIL PROTECTED]> wrote:
> We're about to do the same thing here, but have not tried yet. We
>  currently run Solr with replication across several servers. So long as
>  only one server is doing updates to the index, I think it should work
>  fine.
>
>
>  Thanks!
>
>
>  Matthew Runo
>  Software Developer
>  Zappos.com
>  702.943.7833
>
>
>  On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:
>
>  > I know there was such discussions about the subject, but I want to
>  > ask again if somebody could share more information.
>  > We are planning to have several separate servers for our search
>  > engine. One of them will be index/search server, and all others are
>  > search only.
>  > We want to use SAN (BTW: should we consider something else?) and
>  > give access to it from all servers. So all servers will use the same
>  > index base, without any replication, same files.
>  > Is this a good practice? Did somebody do the same? Any problems
>  > noticed? Or any suggestions, even about different configurations are
>  > highly appreciated.
>  >
>  > Thanks,
>  > Gene
>
>


Re: Shared index base

2008-02-26 Thread Matthew Runo
We're about to do the same thing here, but have not tried yet. We  
currently run Solr with replication across several servers. So long as  
only one server is doing updates to the index, I think it should work  
fine.



Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 26, 2008, at 7:51 AM, Evgeniy Strokin wrote:

I know there was such discussions about the subject, but I want to  
ask again if somebody could share more information.
We are planning to have several separate servers for our search  
engine. One of them will be index/search server, and all others are  
search only.
We want to use SAN (BTW: should we consider something else?) and  
give access to it from all servers. So all servers will use the same  
index base, without any replication, same files.
Is this a good practice? Did somebody do the same? Any problems  
noticed? Or any suggestions, even about different configurations are  
highly appreciated.


Thanks,
Gene




RE: Threads in Solr

2008-02-26 Thread Hausherr, Jens
It has been some time since I last worked with the Lucene index directly, but 
AFAIK the Lucene index by default is not thread-safe, which means it is probably 
wrapped in some synchronization layer.

Concerning the bad performance I can only guess on some items to examine:

1) Every thread performs a complete query. 
2) Assuming each query takes time "t" to perform, "n" threads will run in (at 
most) "n*t".
3) If your threads hit some synchronized method, they are likely to queue at the 
synchronization barrier, which can lead to "n*t" execution time.
4) The join statement at the end of your code snippet ensures that your request 
handler continues only once all threads have completed.
5) Vectors are synchronized - it may not be necessary to use a Vector for 
storing your threads (as far as the code snippet is concerned, at least - I see 
no concurrent access to the threads here). 

Personally, I think that to profit from parallelization it would be necessary to 
segment the index and perform disjoint queries - I don't know whether Solr or 
Lucene already supports this feature...

/Jens
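
To illustrate the joining/queueing points above, here is a hedged sketch in plain java.util.concurrent - no Solr calls; the Callable body is a stand-in for the real per-value query - of fanning disjoint sub-queries out to a bounded pool and collecting results through Futures, rather than one raw Thread per facet value:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFacets {
    // One task per facet value, bounded by the CPU count; Future.get()
    // plays the role of Thread.join() but also carries the result back.
    public static Map<String, Integer> countAll(List<String> values)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            Map<String, Future<Integer>> futures =
                    new LinkedHashMap<String, Future<Integer>>();
            for (final String v : values) {
                futures.put(v, pool.submit(new Callable<Integer>() {
                    public Integer call() {
                        // placeholder for the per-value Solr query,
                        // e.g. s.getDocListAndSet(...) + facet counting
                        return v.length();
                    }
                }));
            }
            Map<String, Integer> counts =
                    new LinkedHashMap<String, Integer>();
            for (Map.Entry<String, Future<Integer>> e : futures.entrySet()) {
                counts.put(e.getKey(), e.getValue().get()); // join + result
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }
}
```

A fixed pool also avoids creating hundreds of threads when the facet has many values, which is one plausible source of the slowdown reported below.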

-Original Message-
From: Evgeniy Strokin [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 26, 2008 4:57 PM
To: solr-user@lucene.apache.org
Subject: Re: Threads in Solr

I'm running my tests on a server with 4 dual-core CPUs. I was expecting good 
improvements from the multithreaded solution, but the speed is 10 times worse. 
Here is how I run those threads. I think I'm doing something wrong; please 
advise:
 
--
. code truncated .
 
public class MultiFacetRequestHandler extends StandardRequestHandler {

protected NamedList getFacetInfo(SolrQueryRequest req,
 SolrQueryResponse rsp,
 DocSet mainSet) {
SimpleFacets f = new SimpleFacets(req.getSearcher(),
mainSet,
req.getParams());
NamedList facetInfo = f.getFacetCounts(); // This is 
custom code for multi facets
SolrParams p = req.getParams();
String fl = p.get(SolrParams.FL);
int flags = 0;
if (fl != null)
flags |= SolrPluginUtils.setReturnFields(fl, rsp);
Query query = QueryParsing.parseQuery(p.required().get(SolrParams.Q),
p.get(SolrParams.DF), p, req.getSchema());
try {
NamedList facetFields = (NamedList) 
facetInfo.get("facet_fields");
if (facetFields.size() == 2) {
String shortFldName = facetFields.getName(0);
NamedList shortFld = (NamedList) facetFields.getVal(0);
NamedList longFld = (NamedList) facetFields.getVal(1);
if (shortFld.size() > longFld.size()) {
shortFld = longFld;
shortFldName = facetFields.getName(1);
}
                List<Query> filters = SolrPluginUtils.parseFilterQueries(req);
                if (filters == null) filters = new LinkedList<Query>();
                SolrIndexSearcher s = req.getSearcher();
                Vector<Thread> threads = new Vector<Thread>();
Thread thread;
for (int i = 0; i < shortFld.size(); i++) {
SolrQueryParser qp = new SolrQueryParser(s.getSchema(), 
null);
Query q = qp.parse(shortFldName + ":\"" + 
shortFld.getName(i)+"\"");
                    List<Query> fltrs = new LinkedList<Query>();
fltrs.addAll(filters);
fltrs.add(q);
thread = new 
Thread(makeRunnable(s,query,fltrs,flags,p,shortFld.getName(i),facetFields));
threads.add(thread);
thread.start();
}
for (Thread thread1 : threads) {
thread1.join();
}
}
} catch (Exception e) {
SolrException.logOnce(SolrCore.log, "Exception in multi faceting", 
e);
}
///
return facetInfo;
}
 
public Runnable makeRunnable(final SolrIndexSearcher s, final Query query, 
final List<Query> filters, final int flags, final SolrParams p, final String 
shrtName, final NamedList facetFields) {
return new Runnable() {
public void run() {
try{
DocListAndSet matrixRes = s.getDocListAndSet(query, 
filters, null, 0, 0, flags);
NamedList matr = new 
SimpleFacets(s,matrixRes.docSet,p).getFacetCounts();
facetFields.add(shrtName, matr.get("facet_fields"));
}catch (Exception e){
 SolrException.logOnce(SolrCore.log, "Exception in multi 
faceting", e);
}
}
};
}
. code truncated .
}
 

 



-

Re: Threads in Solr

2008-02-26 Thread Evgeniy Strokin
I'm running my tests on a server with 4 dual-core CPUs. I was expecting good 
improvements from the multithreaded solution, but the speed is 10 times worse. 
Here is how I run those threads. I think I'm doing something wrong; please 
advise:
 
--
. code truncated .
 
public class MultiFacetRequestHandler extends StandardRequestHandler {

protected NamedList getFacetInfo(SolrQueryRequest req,
 SolrQueryResponse rsp,
 DocSet mainSet) {
SimpleFacets f = new SimpleFacets(req.getSearcher(),
mainSet,
req.getParams());
NamedList facetInfo = f.getFacetCounts();
// This is custom code for multi facets
SolrParams p = req.getParams();
String fl = p.get(SolrParams.FL);
int flags = 0;
if (fl != null)
flags |= SolrPluginUtils.setReturnFields(fl, rsp);
Query query = QueryParsing.parseQuery(p.required().get(SolrParams.Q),
p.get(SolrParams.DF), p, req.getSchema());
try {
NamedList facetFields = (NamedList) 
facetInfo.get("facet_fields");
if (facetFields.size() == 2) {
String shortFldName = facetFields.getName(0);
NamedList shortFld = (NamedList) facetFields.getVal(0);
NamedList longFld = (NamedList) facetFields.getVal(1);
if (shortFld.size() > longFld.size()) {
shortFld = longFld;
shortFldName = facetFields.getName(1);
}
                List<Query> filters = SolrPluginUtils.parseFilterQueries(req);
                if (filters == null) filters = new LinkedList<Query>();
                SolrIndexSearcher s = req.getSearcher();
                Vector<Thread> threads = new Vector<Thread>();
Thread thread;
for (int i = 0; i < shortFld.size(); i++) {
SolrQueryParser qp = new SolrQueryParser(s.getSchema(), 
null);
Query q = qp.parse(shortFldName + ":\"" + 
shortFld.getName(i)+"\"");
                    List<Query> fltrs = new LinkedList<Query>();
fltrs.addAll(filters);
fltrs.add(q);
thread = new 
Thread(makeRunnable(s,query,fltrs,flags,p,shortFld.getName(i),facetFields));
threads.add(thread);
thread.start();
}
for (Thread thread1 : threads) {
thread1.join();
}
}
} catch (Exception e) {
SolrException.logOnce(SolrCore.log, "Exception in multi faceting", 
e);
}
///
return facetInfo;
}
 
public Runnable makeRunnable(final SolrIndexSearcher s, final Query query, 
final List<Query> filters, final int flags, final SolrParams p, final String 
shrtName, final NamedList facetFields) {
return new Runnable() {
public void run() {
try{
DocListAndSet matrixRes = s.getDocListAndSet(query, 
filters, null, 0, 0, flags);
NamedList matr = new 
SimpleFacets(s,matrixRes.docSet,p).getFacetCounts();
facetFields.add(shrtName, matr.get("facet_fields"));
}catch (Exception e){
 SolrException.logOnce(SolrCore.log, "Exception in multi 
faceting", e);
}
}
};
}
. code truncated .
}
 

 



- Original Message 
From: Chris Hostetter <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, February 26, 2008 2:55:36 AM
Subject: Re: Threads in Solr

: Yes I do computing the same DocSet. Should it be the problem? Is any way to 
solve it?
: In general in each thread I ran the same query and add different Filter 
Query. 

it's not neccessarily a problem, it's just that you may not get much 
benefit from prallelization if all of the worker threads are doing the 
same work simulteneously.

but like i said:  without knowing exactly what your threading code looks 
like, it's hard to guess what might be wrong (and even if i was looking 
right at your multithreaded code, it wouldn't neccessarily be obvious to 
me, my multi-threading knowledge is mediocre) and it's still not clear if 
you are testing on hardware that can actually take advantage of 
parallelization.


-Hoss

Shared index base

2008-02-26 Thread Evgeniy Strokin
I know there have been discussions about this subject, but I want to ask again if 
somebody could share more information.
We are planning to have several separate servers for our search engine. One of 
them will be an index/search server, and all the others will be search-only.
We want to use a SAN (BTW: should we consider something else?) and give access to 
it from all servers. So all servers will use the same index base, without any 
replication - the same files.
Is this good practice? Has somebody done the same? Any problems noticed? Any 
suggestions, even about different configurations, are highly appreciated.
 
Thanks,
Gene

Re: protwords | synonyms | elevator conf files

2008-02-26 Thread Erik Hatcher
I've definitely thought about abstracting how various external data  
gets loaded by Solr - so you're not alone there. But I haven't  
fleshed anything out myself. I definitely strongly support such  
separation, though!


Erik

On Feb 25, 2008, at 11:18 AM, Matthew Runo wrote:


Hello!

All these configuration files seem like they could be stored in a  
database just as well as they are stored in the file structure.  
Specifically the new elevator handler (which looks to be exactly  
what I needed, thanks!!) would be more useful if it could get its  
configuration from a database.


Has anyone thought about linking these conf files into a database?  
Currently I'm dumping the DB out to the file structure and  
restarting solr to read in the changes - is there a better way? One  
that doesn't clear all the caches, perhaps?


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833




Re: Boost the results for filter value in a single query

2008-02-26 Thread Yonik Seeley
If content_source is single-valued (a maximum of one value per
document), then sort:
sort=content_source asc

-Yonik
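
Assembled as a full request, the suggestion would look something like the sketch below; the host, port, and core path are assumptions for illustration, not from the thread:

```java
import java.net.URLEncoder;

public class SortedQuery {
    // Build the dismax request with the fq filter and the ascending sort
    // on content_source; URLEncoder handles the spaces and quotes.
    public static String buildQuery() throws Exception {
        String fq = URLEncoder.encode(
                "content_source:(\"ABC\" OR \"DEF\")", "UTF-8");
        String sort = URLEncoder.encode("content_source asc", "UTF-8");
        return "http://localhost:8983/solr/select?q=test&qt=dismax&fq=" + fq
                + "&sort=" + sort;
    }
}
```

Because "ABC" sorts after "DEF" lexicographically is false (it sorts before), note this trick only orders the two groups correctly when the desired first value happens to sort first; otherwise a descending sort or a boost query would be needed.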

On Tue, Feb 26, 2008 at 4:15 AM, Vijay Khurana <[EMAIL PROTECTED]> wrote:
> Hi,
>  Here is the problem I am facing.
>  One of the field in my index is content_source which I use to filter my
>  results.
>  I want all the documents that have content_source value as ABC OR DEF
>  and all the documents having DEF
>  should appear before the documents that have content_source value as ABC. I
>  am using DisMaxRequestHandler and Solr 2.0.
>
>  Currently I am using below query:
>
>  q=test&qt=dismax&fq=content_source:("ABC" OR "DEF")
>
>  The above query doesn't guarantee the appearance of DEF documents before
>  ABC.
>  I want to do this in one query. is it possible?
>
>  Appreciate your help.
>
>  Thanks,
>  Vijay
>


Boost the results for filter value in a single query

2008-02-26 Thread Vijay Khurana
Hi,
Here is the problem I am facing.
One of the fields in my index is content_source, which I use to filter my
results.
I want all the documents that have a content_source value of ABC OR DEF,
and all the documents having DEF
should appear before the documents that have a content_source value of ABC. I
am using DisMaxRequestHandler and Solr 2.0.

Currently I am using the query below:

q=test&qt=dismax&fq=content_source:("ABC" OR "DEF")

The above query doesn't guarantee the appearance of DEF documents before
ABC.
I want to do this in one query. Is it possible?

Appreciate your help.

Thanks,
Vijay


solr to handle special charater

2008-02-26 Thread Kevin Xiao
Hi there,

I am new to Solr. I used the following analyzer - I tried both 
WhitespaceTokenizerFactory and StandardTokenizerFactory, but when I search "xyz 
- abc", it didn't return anything ("xyz abc" returns "xyz - abc", though). I 
used the tokenizer/filter at both index and query time. Is that a Solr bug? How 
do I make it work?

Thanks,
- Kevin