Help with query boosting syntax

2010-05-27 Thread kirsty

Hi there,
I am struggling with the syntax for boosting.
My scenario is that we have an algorithm that gives weightings to particular
keywords. 
When a person searches for keywords eg value1 value2 value3 we want to apply
boosting so that a document is boosted according to which of the keywords it
has.
eg of url : q=value1^4.0 OR value2^2.0 OR value3
However this is not working. I tried bq eg:
qt=KeywordSearch&bq=keyword:value1^4.0 OR value2^2.0 but I just don't seem
to be getting the results I am looking for. 
I have a requesthandler set up to just search on our Keyword column which is
a multivalued field. I am using a nightly build of SOLR 1.4 from September
2009.
Can anyone help me with the correct syntax?? Please?
Thanks in advance
Kirsty
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-query-boosting-syntax-tp847398p847398.html
Sent from the Solr - User mailing list archive at Nabble.com.


Help with PatternReplaceFilterFactory

2010-05-27 Thread kirsty

Hi,
I have a field that is a text field eg: R500,000-550,000 Per Annum,
R350,000-550,000 Per Annum Cost To Company etc. 
I would like to facet on the salary range. 
I have created a new field type








to remove all the letters. 

I have two fields like this:



And then the copy field:


But my index still has all the text. Am I misunderstanding? Where have I
gone wrong? Any help would be greatly appreciated!
Kirsty

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-PatternReplaceFilterFactory-tp847408p847408.html


Re: Help with query boosting syntax

2010-05-27 Thread kirsty


kirsty wrote:
> 
> Hi there,
> I am struggling with the syntax for boosting.
> My scenario is that we have an algorithm that gives weightings to
> particular keywords. 
> When a person searches for keywords eg value1 value2 value3 we want to
> apply boosting so that a document is boosted according to which of the
> keywords it has.
> eg of url : q=value1^4.0 OR value2^2.0 OR value3
> However this is not working. I tried bq eg:
> qt=KeywordSearch&bq=keyword:value1^4.0 OR value2^2.0 but I just don't seem
> to be getting the results I am looking for. 
> I have a requesthandler set up to just search on our Keyword column which
> is a multivalued field. I am using a nightly build of SOLR 1.4 from
> September 2009.
> Can anyone help me with the correct syntax?? Please?
> Thanks in advance
> Kirsty
> 

UPDATE: 
I have managed to get this syntax working
select/?qt=KeywordSearch&q=greenpoint&bq=Keyword:work^1.5&bq=Keyword:Pigalle^1.2
But this just seems quite cumbersome having to specify a bq each time. Also
I cannot get more than one q= to work.
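
The working syntax above can be generated rather than typed by hand. Below is a minimal Python sketch, assuming the keyword weights from the algorithm arrive as a dictionary. The function name and the sample weights are made up for illustration; only the qt, q and bq parameters and the Keyword field come from the URL above.

```python
from urllib.parse import urlencode


def boosted_query_params(user_query, weights, field="Keyword", handler="KeywordSearch"):
    """Build a query string with one bq parameter per weighted keyword.

    `weights` maps keyword -> boost (hypothetical input format).
    """
    params = [("qt", handler), ("q", user_query)]
    for term, boost in weights.items():
        # Each weighted keyword becomes its own boost query, e.g. Keyword:work^1.5
        params.append(("bq", f"{field}:{term}^{boost}"))
    # urlencode percent-encodes ':' and '^' so the string is safe to append to select/?
    return urlencode(params)


print(boosted_query_params("greenpoint", {"work": 1.5, "Pigalle": 1.2}))
```

The repeated bq parameters are still there, but the client code only has to maintain the weight table.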

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-query-boosting-syntax-tp847398p847444.html


Re: Machine utilization while indexing

2010-05-27 Thread Thijs
Sorry I missed it in the solrconfig.xml (my bad). I wasn't looking for 
it in the right place.


Thijs

On 27-5-2010 6:41, Chris Hostetter wrote:


: So now I wonder why BinaryRequestWriter (and BinaryUpdateRequestHandler)
: aren't turned on by default. (esp. considering some threads on the dev-list

I don't really understand this question -- the BinaryUpdateRequestHandler
is registered with the path /update/javabin in the example solrconfig.xml
-- that's about as close to turning something on by "default" as solr
supports.



-Hoss





RE: solr configuration for Subversion

2010-05-27 Thread Stefan Maric
Thanks. I'll take a look at this

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: 27 May 2010 01:54
To: solr-user@lucene.apache.org
Subject: Re: solr configuration for Subversion



: I've seen the info about SvnQuery & wondered if anyone has a Solr
: configuration / loader module

I've never heard of SvnQuery until your email, but it seems to be built
using Lucene.Net...

http://svnquery.tigris.org/

If you're looking for tools for indexing subversion repos with solr
specifically, you might want to check out ReposSearch...

http://repossearch.com/


-Hoss




solr.solr.home

2010-05-27 Thread Antonello Mangone
Hi everyone, I'm really sorry for the stupid question I'm asking, but I
didn't understand how to set the Java system property solr.solr.home to my
Solr home.
Can someone help me?
Thanks in advance


RE: solr.solr.home

2010-05-27 Thread Yuval Feinstein
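* Set the java system property solr.solr.home to your solr home.
(A system property is passed to the JVM as a -D option, e.g.
-Dsolr.solr.home=/my/solr/home. On linux, something like
export JAVA_OPTS="-Dsolr.solr.home=/my/solr/home" works for containers
that read JAVA_OPTS, such as Tomcat. A plain
export solr.solr.home=/my/solr/home does not work, since shell variable
names cannot contain dots. On Windows - see
http://vlaurie.com/computers2/Articles/environment.htm
for setting an environment variable such as JAVA_OPTS.)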

(You can also use the two other options from the wiki page:)
* Configure the servlet container such that a JNDI lookup of 
"java:comp/env/solr/home" by the solr webapp will point to the solr home.
* The default solr home is "solr" under the JVM's current working directory 
($CWD/solr), so start the servlet container in the directory containing ./solr

-----Original Message-----
From: Antonello Mangone [mailto:antonello.mang...@gmail.com] 
Sent: Thursday, May 27, 2010 11:30 AM
To: solr-user@lucene.apache.org
Subject: solr.solr.home

Hi everyone, I'm really sorry for the stupid question I'm asking, but I
didn't understand how to set the Java system property solr.solr.home to my
Solr home.
Can someone help me?
Thanks in advance


Re: solr.solr.home

2010-05-27 Thread Claudio Atzori

On 05/27/2010 10:30 AM, Antonello Mangone wrote:

> Hi everyone, I'm really sorry for the stupid question I'm asking, but I
> didn't understand how to set the Java system property solr.solr.home to my
> Solr home.
> Can someone help me?
> Thanks in advance

   

it should be something like

System.setProperty("solr.solr.home", 
"whateverpathyou'dliketosetonyourfilesystem");


Claudio


Re: solr.solr.home

2010-05-27 Thread Antonello Mangone
But where do I have to write this command?

> System.setProperty("solr.solr.home",
> "whateverpathyou'dliketosetonyourfilesystem");
>
> Claudio
>


Re: solr.solr.home

2010-05-27 Thread Marco Martinez
Hi,

When you start Tomcat, you can specify system properties as JVM options,
something like -Dsolr.solr.home=path/to/your/solr/home. For example, on
Linux: JAVA_OPTS="-Dsolr.solr.home=path/to/your/solr/home" ./startup.sh



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/27 Antonello Mangone 

> But where I have to write this command ???
>
> System.setProperty("solr.solr.home",
> > "whateverpathyou'dliketosetonyourfilesystem");
> >
> > Claudio
> >
>


Re: Highlighting questions

2010-05-27 Thread Erik Hatcher
Just set the pre and post tags to be empty strings and you'll get the  
result you want, I think.  No?


Erik

On May 26, 2010, at 8:36 PM, Blargy wrote:



> What are the correct settings to get highlighting excerpting working?
>
> Original Text: "The quick brown fox jumps over the lazy dog"
> Query: "jump"
> Result: " fox jumps over "
>
> Can you do something like the above with the highlighter or can it only
> surround matches with pre and post tags? Can someone explain what
> mergeContinuous does?
>
> Thanks.
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Highlighting-questions-tp846628p846628.html




Re: Any realtime indexing plugin available for SOLR

2010-05-27 Thread Erik Hatcher


On May 26, 2010, at 11:29 AM, Dennis Gearon wrote:

> I thought that if entries were COMMITed to the index, they were
> immediately visible?
>
> Is this true, or am I smoking Java coffee beans?

They're visible after a commit AND warming are complete, yes.  But
there could be a potentially substantial delay between a commit
message being sent and the new documents actually being searchable.


Erik



Re: Help with PatternReplaceFilterFactory

2010-05-27 Thread Koji Sekiguchi

(10/05/27 16:11), kirsty wrote:

Hi,
I have a field that is a text field eg: R500,000-550,000 Per Annum,
R350,000-550,000 Per Annum Cost To Company etc.
I would like to facet on the salary range.
I have created a new field type








to remove all the letters.

I have two fields like this:



And then the copy field:


But my index still has all the text. Am I misunderstanding? Where have I
gone wrong? Any help would be greatly appreciated!
Kirsty

   

What do you mean by "my index still has all the text "?
With your schema above, I think you can get a facet result eg:



1
1



when you request q=*:*&facet=on&facet.field=Remuneration_strip

Koji

--
http://www.rondhuit.com/en/



Re: Help with PatternReplaceFilterFactory

2010-05-27 Thread kirsty


Koji Sekiguchi wrote:
> 
> (10/05/27 16:11), kirsty wrote:
>> Hi,
>> I have a field that is a text field eg: R500,000-550,000 Per Annum,
>> R350,000-550,000 Per Annum Cost To Company etc.
>> I would like to facet on the salary range.
>> I have created a new field type
>> > sortMissingLast="true"
>> omitNorms="true">
>>  
>>  > class="solr.KeywordTokenizerFactory"/>
>>  
>>  
>>  > class="solr.PatternReplaceFilterFactory" pattern="([a-z])"
>> replacement="" replace="all"   />
>>  
>>  
>> to remove all the letters.
>>
>> I have two fields like this:
>> 
>> > stored="true"/>
>>
>> And then the copy field:
>> 
>>
>> But my index still has all the text. Am I misunderstanding? Where have I
>> gone wrong? Any help would be greatly appreciated!
>> Kirsty
>>
>>
> What do you mean by "my index still has all the text "?
> With your schema above, I think you can get a facet result eg:
> 
> 
> 
> 1
> 1
> 
> 
> 
> when you request q=*:*&facet=on&facet.field=Remuneration_strip
> 
> Koji
> 
> -- 
> http://www.rondhuit.com/en/
> 
> 
> 

Yes you are right, I get that type of result. I guess my wording was wrong. 
My field looks like this in the index:
R500,000-550,000 Per Annum
R500,000-550,000 Per Annum

How would I search for say salaries in the range of 500,000 - 550,000?
Trying fq=Rumeration_strip:500,000-550,00 doesn't bring back anything. I
must have something wrong.
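
One thing that may explain part of this: the pattern in the quoted field type is ([a-z]), which only covers lowercase letters, and as far as I know an analyzer only changes the indexed terms, while the stored value returned in results stays the original text. The Python sketch below only mimics the regex replacement, it is not Solr code, and the second pattern is a hypothetical alternative that does not come from the thread.

```python
import re


def strip_lowercase(value):
    # Mimics pattern="([a-z])" replacement="" replace="all" from the quoted schema
    return re.sub(r"([a-z])", "", value)


def strip_letters_and_spaces(value):
    # Hypothetical alternative pattern (an assumption, not from the thread):
    # removes upper- and lowercase letters plus whitespace
    return re.sub(r"[A-Za-z\s]", "", value)


sample = "R500,000-550,000 Per Annum"
print(strip_lowercase(sample))           # R500,000-550,000 P A
print(strip_letters_and_spaces(sample))  # 500,000-550,000
```

With the original pattern the single indexed term would still carry the leading "R" and the capitals, so a filter query on 500,000-550,000 would not match it exactly.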



-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-PatternReplaceFilterFactory-tp847408p848078.html


Re: Help with PatternReplaceFilterFactory

2010-05-27 Thread Koji Sekiguchi



> Yes you are right, I get that type of result. I guess my wording was wrong.
> My field looks like this in the index:
> R500,000-550,000 Per Annum
> R500,000-550,000 Per Annum
>
> How would I search for say salaries in the range of 500,000 - 550,000?
> Trying fq=Rumeration_strip:500,000-550,00 doesn't bring back anything. I
> must have something wrong.

   

So you are not asking about faceting...

For the query syntax of range searches, take a look at:

http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Range%20Searches

You also need to index the lower and upper salary into separate fields, i.e.

low:50
up:55

Then you can search both fields, e.g.

q=low:[50 TO *] AND up:[* TO 55]
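
The split into low and up could be done on the client before indexing. Here is a rough Python sketch, assuming the salary text always contains two comma-grouped numbers separated by a hyphen. The function names are made up, and the range query only orders values numerically if low and up are declared with a numeric field type.

```python
import re


def parse_salary_range(text):
    """Return (low, up) as integers from text like 'R500,000-550,000 Per Annum'."""
    match = re.search(r"(\d[\d,]*)\s*-\s*(\d[\d,]*)", text)
    if match is None:
        return None  # no recognisable range in the text
    low, up = (int(part.replace(",", "")) for part in match.groups())
    return low, up


def range_query(low, up):
    # Same shape as the query above: lower bound on `low`, upper bound on `up`
    return f"low:[{low} TO *] AND up:[* TO {up}]"


low, up = parse_salary_range("R500,000-550,000 Per Annum")
print(low, up)               # 500000 550000
print(range_query(low, up))  # low:[500000 TO *] AND up:[* TO 550000]
```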

Koji

--
http://www.rondhuit.com/en/



multicore Vs multiple solr webapps

2010-05-27 Thread Antonello Mangone
Hi all, I have a question for you.
Can someone explain the differences between a single multicore Solr
application and multiple Solr webapps?
Thank you all in advance


Re: Need guidance on schema type

2010-05-27 Thread Blargy

There will never be any need to search the actual HTML (tags, markup, etc) so
as far as functionality goes it seems like the DIH HTMLStripTransformer is
the way to go.

Are there any significant performance differences between the two?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Need-guidance-on-schema-type-tp846923p848874.html


Generic questions

2010-05-27 Thread Blargy

Can someone explain to me what the state of Solr/Lucene is... didn't they
recently combine?

I know I am running version 1.4 but I keep seeing version numbers out there
that are 3.0, 4.0??? Can someone explain what that means.

Also is the state of trunk (1.4 or 4.0??) "good enough" for production use?

Thanks!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Generic-questions-tp848917p848917.html


Re: Generic questions

2010-05-27 Thread Yonik Seeley
On Thu, May 27, 2010 at 12:48 PM, Blargy  wrote:
> Can someone explain to me what the state of Solr/Lucene is... didn't they
> recently combine?

Yes, it started in March.  Development is combined (committers, dev
list, etc), but separate downloads and user lists will remain.

> I know I am running version 1.4 but I keep seeing version numbers out there
> that are 3.0, 4.0??? Can someone explain what that means.

Lots of other stuff has changed.  For example, trunk is now always the
next *major* version number.
So the trunk of the combined lucene/solr is 4.0-dev

There is now a branch_3x that is like trunk for all future 3.x releases.

The next version of Solr will probably be 3.1, and it's unlikely there
will ever be a 1.5 released.

> Also is the state of trunk (1.4 or 4.0??) "good enough" for production use?

Yes, Lucene/Solr aims for a "stable" trunk (runtime stability, not API
stability).  Plenty of big companies use trunk since they need recent
features.  Just be sure to test well (good advice even if working off
of officially released versions).

-Yonik
http://www.lucidimagination.com


Generic question on Query Analyzers

2010-05-27 Thread iboppana

Hi to all,

I have a question on query analyzers.

How do we make sure that searches for terms like A&M do not match
docs which contain something like 5a.m etc.?

In the analysis admin page, it looks like WordDelimiterFilterFactory is
splitting on &. How can I make it work so that I can use the features of the
word delimiter and still make sure certain words like A&M, D&B etc. are not split?

Below is my field definition


  






  
  







  


Thanks in advance.

Indrani
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Generic-question-on-Query-Analyzers-tp849075p849075.html


Re: Generic questions

2010-05-27 Thread Blargy


Yonik Seeley-2-2 wrote:
> 
> Lots of other stuff has changed.  For example, trunk is now always the
> next *major* version number.
> So the trunk of the combined lucene/solr is 4.0-dev
> 
> There is now a branch_3x that is like trunk for all future 3.x releases.
> 
> The next version of Solr will probably be 3.1, and it's unlikely there
> will ever be a 1.5 released.
> 

Wait... what? Now I'm more confused.

What version is (http://svn.apache.org/repos/asf/lucene/dev/trunk/)? I'm
guessing it's 4.0-dev but then where does 3.1 fit in?

Say I am running 1.4 and want to upgrade, which version should I use? If I
want to use a patch that has a fix version of 1.5 which should I be using?
(https://issues.apache.org/jira/browse/SOLR-1316). 

Thanks again


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Generic-questions-tp848917p849161.html


Re: Generic questions

2010-05-27 Thread Yonik Seeley
On Thu, May 27, 2010 at 2:12 PM, Blargy  wrote:
> What version is (http://svn.apache.org/repos/asf/lucene/dev/trunk/)? Im
> guessing its 4.0-dev

Yes.

> but then where does 3.1 fit in?

http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/

> Say I am running 1.4 and want to upgrade, which version should I use? If I
> want to use a patch that has a fix version of 1.5 which should I be using?
> (https://issues.apache.org/jira/browse/SOLR-1316).

Patches in SVN are not official and not guaranteed to work (or even
patch correctly) to any version.
You need to look at the specific patch to see what svn path it is
relative to, and how old it is.

-Yonik
http://www.lucidimagination.com


Re: multicore Vs multiple solr webapps

2010-05-27 Thread David Stuart

Hi Antonello,

In multicore you get richer functionality including core discovery,
core config reload, alias, core swap and (soon to be) core create.
Under a single webapp you get control over memory allocation, threads,
etc. Personally I would choose multicore, and I believe in Solr 1.5 they
are going with a default multicore setup with a single core.


David Stuart

On 27 May 2010, at 15:44, Antonello Mangone wrote:

> Hi all, I have a question for you.
> Can someone explain the differences between a single multicore Solr
> application and multiple Solr webapps?
> Thank you all in advance


Re: multicore Vs multiple solr webapps

2010-05-27 Thread David Stuart
So, a correction: as per a different thread, the next version of Solr will
be 3.1, as per the merge with the Lucene TLP


David Stuart

On 27 May 2010, at 15:44, Antonello Mangone wrote:

> Hi all, I have a question for you.
> Can someone explain the differences between a single multicore Solr
> application and multiple Solr webapps?
> Thank you all in advance


Re: multicore Vs multiple solr webapps

2010-05-27 Thread Ryan McKinley
The two approaches solve different needs.  In 'multicore' you have a
single webapp with multiple indexes.  This means they are all running
in the same JVM.  This may be an advantage or a disadvantage depending
on what you are doing.

ryan



On Thu, May 27, 2010 at 10:44 AM, Antonello Mangone
 wrote:
> Hi to all, I have a question for you ...
> Can someone explain the differences between a single multicore Solr
> application and multiple Solr webapps?
> Thank you all in advance
>


Sites with Innovative Presentation of Tags and Facets

2010-05-27 Thread Mark Bennett
I'm a big fan of plain old text facets (or tags), displayed in some logical
order, perhaps with a bit of indenting to help convey context. But as you
may have noticed, I don't rule the world.  :-)

Suppose you took the opposite approach, rendering facets in non-traditional
ways that were still functional, and not ugly.

Are there any public sites that come to mind that are displaying facets,
tags, clusters, taxonomies or other navigators in really innovative ways?
 And what you liked / didn't like?

Right now I'm just looking for examples of what's been tried.  I suppose
even bad examples might be educational.

My future ideal wish list:
* Stays out of the way (of casual users)
* Looks "clean" and "cool" (to the power users)
  I'm thinking for example a light gray chevron ">>" that casual users
  don't notice, but when you click on it, cool things come up?
* Probably does not require Flash or Silverlight (just to avoid the
  whole platform wars)
  I guess that means Ajax or HTML5
* And since I'm doing pie in the sky, can be made to look good on desktops
  and mobile

Some examples to get the ball rolling:

StackOverflow, Flickr and YouTube, Clusty(now Yippy) are all nice, but a bit
pedestrian for my mission today.
(grokker was cool too)

Lucid has done a nice job with Facets and Solr:
http://www.lucidimagination.com/search/
And although I really like it, it's not a flashy enough specimen for what
I'm hunting today.
(and they should thread the actual results list)

I did some mockups of "2.0 style" search navigators a couple years back:
http://www.ideaeng.com/tabId/98/itemId/115/Search-20-in-the-Enterprise-Moving-Beyond-Singl.aspx
Though these were intentionally NOT derived from specific web sites.

Digg has done some cool stuff, for example:
http://labs.digg.com/365/
http://labs.digg.com/arc/
http://labs.digg.com/stack/
But for what I'm after, these are a bit too far off of the "searching for
something in particular" track.

Google Image Swirl and Similar Images are interesting, but for images.
Lots of other cool stuff at labs.google.com

Amazon, NewEgg, etc are all fine, but again text based.

TouchGraph has some cool stuff, though very non-linear (many others on this
theme)
http://www.touchgraph.com/TGGoogleBrowser.html
http://www.touchgraph.com/navigator.html


Cool articles on the subject: (some examples now offline)
http://www.cs.umd.edu/class/spring2005/cmsc838s/viz4all/viz4all_a.html



--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-27 Thread Geert-Jan Brits
Something like sliders perhaps?
Of course only numerical ranges can be put into sliders (or a concept that
may be logically presented as some sort of ordering, such as "bad, hmm,
good, great").

Use Solr's StatsComponent to show the min and max values

Have a look at tripadvisor.com for good uses/implementation of sliders
(price, and reviewscore are presented as sliders)
my 2c: try to make the possible input values discrete (like at tripadvisor)
which gives a better user experience and limits the potential nr of queries
(cache-wise advantage)

Cheers,
Geert-Jan



Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Hi all,

I have a query need that requires multiple OR conditions, and, there must be
a match in each condition for the query to provide a result.

The search would be * (A or B) AND (C or D)* and the only valid results it
could turn up are:

A B
A C
B C
B D

Can anyone provide guidance as to how to implement this in the query string?

thanks

Brad


Re: Sites with Innovative Presentation of Tags and Facets

2010-05-27 Thread Lukas Kahwe Smith

On 27.05.2010, at 23:32, Geert-Jan Brits wrote:

> Something like sliders perhaps?
> Of course only numerical ranges can be put into sliders. (or a concept that
> may be logically presented as some sort of ordening, such as "bad, hmm,
> good, great"
> 
> Use Solr's Statscomponent to show the min and max values
> 
> Have a look at tripadvisor.com for good uses/implementation of sliders
> (price, and reviewscore are presented as sliders)
> my 2c: try to make the possible input values discrete (like at tripadvisor)
> which gives a better user experience and limits the potential nr of queries
> (cache-wise advantage)


yeah i have been pondering something similar. but i now realized that this way 
the user doesn't get an overview of the distribution without actually applying 
the filter. that being said, it would be nice to display 3 numbers with the 
sliders: the count of items that were filtered out on the lower and upper 
boundaries as well as the number of items still left (*).
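
The three numbers are cheap to compute once the values are known. A small Python sketch of the idea follows, using made-up sample data and a made-up function name. With Solr, the same counts could presumably come from three facet.query range counts instead of client-side values.

```python
def slider_counts(values, low, high):
    """Return (below, within, above) for a slider set to the range [low, high]."""
    below = sum(1 for v in values if v < low)    # filtered out at the lower boundary
    above = sum(1 for v in values if v > high)   # filtered out at the upper boundary
    within = len(values) - below - above         # items still left
    return below, within, above


adoption_years = [1998, 2001, 2001, 2004, 2007, 2009]  # made-up sample data
print(slider_counts(adoption_years, 2001, 2007))  # (1, 4, 1)
```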

aside from this i just put a little tweak to my faceting online:
http://search.un-informed.org/search?q=malaria&tm=any&s=Search

if you deselect any of the checkboxes, it updates the counts. however i display 
both the count without and with those additional checkbox filters applied 
(actually i only display two numbers if they are not the same):
http://screencast.com/t/MWUzYWZkY2Yt

regards,
Lukas Kahwe Smith
m...@pooteeweet.org

(*) if anyone has a slider that can do the above i would love to integrate that 
and replace the adoption year checkboxes with that

Re: searching documents in solr

2010-05-27 Thread Lance Norskog
Leading wildcards don't work.

word* is supported
word? is supported
word*x or word?x should be supported, but something strange happens
involving boolean queries.

On Wed, May 26, 2010 at 11:31 PM, dotriz
 wrote:
>
> Here is my schema.xml file
>
> http://lucene.472066.n3.nabble.com/file/n847355/schema.xml schema.xml
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/searching-documents-in-solr-tp844800p847355.html
>



-- 
Lance Norskog
goks...@gmail.com


Solr trunk and Jetty threadpool implementation problem

2010-05-27 Thread Smiley, David W.
I'd like to warn people about the default configuration of Jetty in the Solr 
trunk release (not present in Solr 1.4 and prior).  There is a difference in 
the jetty configuration which is for the latest Solr to use the 
QueuedThreadPool (as seen in jetty.xml).  Previously, it had used a 
BoundedThreadPool implementation that I've heard is considered deprecated 
presently.  I have a multi-core setup where Jetty is serving up lots of Solr 
cores (9+), and when our client does a distributed search (3 of them at a time 
actually), it triggers a condition in which the query takes 50 plus seconds to 
respond.  During this time, the machine is effectively idle, seemingly waiting 
for something.  To fix this, go back to the former BoundedThreadPool 
implementation or don't use Jetty.  FWIW this has triggered us to switch to 
Tomcat.

Sorry but I have sunk so many resources into tracking down this nasty problem 
that I can't spend much more on further figuring out why QueuedThreadPool is 
failing us.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/






Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
On Thu, May 27, 2010 at 5:34 PM, efr...@gmail.com  wrote:

> Hi all,
>
> I have a query need that requires multiple OR conditions, and, there must
> be a match in each condition for the query to provide a result.
>
> The search would be * (A or B) AND (C or D)* and the only valid results it
> could turn up are:
>
> A B (sorry meant "A D")
> A C
> B C
> B D
>
> Can anyone provide guidance as to how to implement this in the query
> string?
>
> thanks
>
> Brad
>


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread Ahmet Arslan

> I have a query need that requires multiple OR conditions,
> and, there must be
> a match in each condition for the query to provide a
> result.
> 
> The search would be * (A or B) AND (C or D)* and the only
> valid results it
> could turn up are:
> 
> A B
> A C
> B C
> B D
> 
> Can anyone provide guidance as to how to implement this in
> the query string?

It should be something like : q=+(A B) +(C D)&q.op=OR
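
To see why this form gives the wanted behaviour: each +(...) group is required, and inside a group any one term is enough. The Python sketch below only simulates that logic on sets of terms, it does not call Solr, and the helper names are made up.

```python
from itertools import combinations


def to_query(groups):
    """Render groups such as [["A", "B"], ["C", "D"]] as '+(A B) +(C D)'."""
    return " ".join("+(" + " ".join(group) + ")" for group in groups)


def matches(doc_terms, groups):
    """True when the document has at least one term from every required group."""
    return all(any(term in doc_terms for term in group) for group in groups)


groups = [["A", "B"], ["C", "D"]]
print(to_query(groups))  # +(A B) +(C D)

# Try every two-term document built from A, B, C, D
valid = [pair for pair in combinations("ABCD", 2) if matches(set(pair), groups)]
print(valid)  # [('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
```

A document with only A and B, or only C and D, is rejected because one of the required groups has no match.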


  


Re: Any realtime indexing plugin available for SOLR

2010-05-27 Thread Antonio Lobato
Funny enough, I've been looking for my own solution too.  The Zoie plugin does 
not work on multi-core setups, so that's bust for me.  Once you commit 
something to index, you need to "warm" a new searcher (load all the data from 
disk into memory/cache) like Erik says.  On a smaller index, this is very very 
quick, however on a larger index, not so much.

Solr 1.5 will (hopefully) have a new feature that will allow for near real time 
searching.  Check this out:

http://wiki.apache.org/solr/NearRealtimeSearch


On May 27, 2010, at 6:00 AM, Erik Hatcher wrote:

> 
> On May 26, 2010, at 11:29 AM, Dennis Gearon wrote:
> 
>> I thought that if entries were COMMITed to the index, they were immediately 
>> visible?
>> 
>> Is this true, or am I smoking Java coffee beans?
> 
> They're visible after a commit AND warming are complete, yes.   But there 
> could be a potentially substantial delay between a commit message being sent 
> and the new documents actually searchable.
> 
>   Erik
> 

---
Antonio Lobato
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8101
alob...@symplicity.com



Re: Sites with Innovative Presentation of Tags and Facets

2010-05-27 Thread Geert-Jan Brits
Perhaps you could show the 'nr of items left' as a tooltip of sorts when the
user actually drags the slider.
If the user doesn't drag (or hovers over ) the slider 'nr of items left'
isn't shown.

Moreover, initially a slider doesn't limit the results so 'nr of items left'
shown for the slider would be the same as the overall number of items left
(thereby being redundant)

I must say I haven't seen this being implemented, but it would be rather easy
to adapt a slider implementation to show the nr on drag/hover (they exist
for jquery, scriptaculous and a bunch of other libs).

Geert-Jan

2010/5/27 Lukas Kahwe Smith 

>
> On 27.05.2010, at 23:32, Geert-Jan Brits wrote:
>
> > Something like sliders perhaps?
> > Of course only numerical ranges can be put into sliders. (or a concept
> that
> > may be logically presented as some sort of ordening, such as "bad, hmm,
> > good, great"
> >
> > Use Solr's Statscomponent to show the min and max values
> >
> > Have a look at tripadvisor.com for good uses/implementation of sliders
> > (price, and reviewscore are presented as sliders)
> > my 2c: try to make the possible input values discrete (like at
> tripadvisor)
> > which gives a better user experience and limits the potential nr of
> queries
> > (cache-wise advantage)
>
>
> yeah i have been pondering something similar. but i now realized that this
> way the user doesnt get an overview of the distribution without actually
> applying the filter. that being said, it would be nice to display 3 numbers
> with the silders, the count of items that were filtered out on the lower and
> upper boundaries as well as the number of items still left (*).
>
> aside from this i just put a little tweak to my facetting online:
> http://search.un-informed.org/search?q=malaria&tm=any&s=Search
>
> if you deselect any of the checkboxes, it updates the counts. however i
> display both the count without and with those additional checkbox filters
> applied (actually i only display two numbers if they are not the same):
> http://screencast.com/t/MWUzYWZkY2Yt
>
> regards,
> Lukas Kahwe Smith
> m...@pooteeweet.org
>
> (*) if anyone has a slider that can do the above i would love to integrate
> that and replace the adoption year checkboxes with that
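
The suggestion above to keep slider inputs discrete can be sketched roughly as follows. This is only an illustration under assumptions: the field name "price" and the helper names are hypothetical, and the min and max are assumed to have been fetched beforehand with the StatsComponent (for example with stats=true&stats.field=price).

```python
def slider_stops(min_value, max_value, steps):
    """Return steps + 1 evenly spaced integer stops from min_value to max_value."""
    span = max_value - min_value
    return [min_value + round(span * i / steps) for i in range(steps + 1)]


def range_filter(field, low, high):
    """Build a Solr range filter (fq value) for the selected slider interval."""
    return "{0}:[{1} TO {2}]".format(field, low, high)


# Hypothetical values: min=0 and max=500 as reported by the stats request.
stops = slider_stops(0, 500, 5)
selected = range_filter("price", stops[1], stops[3])
```

Because the slider can only land on a handful of stops, the number of distinct filter queries stays small, which is the cache-wise advantage mentioned in the thread.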


Re: Generic question on Query Analyzers

2010-05-27 Thread Ahmet Arslan
> How do we make sure that when searches for terms like
> A&M does not match
> docs which have some thing like 5a.m etc 
> 
> On analysis in admin page, it looks like
> WordDelimiterFilterFactory, is
> splitting on &, how can i make it work so that i can
> use features of word
> delimiter as well make sure certain words like A&M,
> D&B etc does not split.

With a custom FilterFactory http://search-lucene.com/m/q2RH1102fRb1/


  


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Thank you. That seems to be working well, except when I included a wild card
for any of the terms, the wildcard term isn't being found out.

My searches are actually:
q=+(A A*) +(C C*)&q.op=OR

When I do a regular search on "A*" or "C*" I get matches but not in the
context of the above query. The ability to use wildcards seems to get lost.

This is all for the purposes of a "live search" in which we return matches
as the user types, thus the wildcard.  A and C represent two different terms
a user has typed in the search box (where we are providing the live-search
results).

thanks

Brad



On Thu, May 27, 2010 at 5:47 PM, Ahmet Arslan  wrote:

>
> > I have a query need that requires multiple OR conditions,
> > and, there must be
> > a match in each condition for the query to provide a
> > result.
> >
> > The search would be * (A or B) AND (C or D)* and the only
> > valid results it
> > could turn up are:
> >
> > A B
> > A C
> > B C
> > B D
> >
> > Can anyone provide guidance as to how to implement this in
> > the query string?
>
> It should be something like : q=+(A B) +(C D)&q.op=OR
>
>
>
>
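
As a rough sketch of how an application might assemble the query form discussed above (a mandatory group per typed term, each group holding the exact term and its prefix form), assuming hypothetical helper names and terms that contain no query-syntax characters needing escaping:

```python
from urllib.parse import urlencode


def build_live_search_query(terms):
    """Require every typed term, matching it either exactly or as a prefix."""
    clauses = []
    for term in terms:
        # The leading '+' makes the whole parenthesised group mandatory;
        # inside the group the exact term and the prefix are alternatives.
        clauses.append("+({0} {0}*)".format(term))
    return " ".join(clauses)


def build_query_string(terms):
    """URL-encode the q and q.op parameters for the request."""
    # q.op=OR is what makes the terms inside each group optional alternatives.
    return urlencode({"q": build_live_search_query(terms), "q.op": "OR"})
```

For the two terms A and C this produces the same q value as the query quoted in the thread, +(A A*) +(C C*), before URL encoding.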


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread Ahmet Arslan
> Thank you. That seems to be working
> well, except when I included a wild card
> for any of the terms, the wildcard term isn't being found
> out.
> 
> My searches are actually:
> q=+(A A*) +(C C*)&q.op=OR
> 
> When I do a regular search on "A*" or "C*" I get matches
> but not in the
> context of the above query. The ability to use wildcards
> seems to get lost.
> 
> This is all for the purposes of a "live search" in which we
> return matches
> as the user types, thus the wildcard.  A and C
> represent two different terms
> a user has typed in the search box (where we are providing
> the live-search
> results).

It looks like you are looking for an auto-suggest/complete feature. As the user types
something there will be ajax suggestions, right?

Queries like A* or C* are not sorted by score/relevance. Can you explain in more
detail what you mean by "search on "A*" or "C*" I get matches
but not in the context of the above query"?






Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread Erick Erickson
You can get a lot of mileage out of the admin
analysis page and the "full interface" page, especially
by turning on the "debug" option on the admin
"full interface" page.

It takes a bit of practice to read the debug output, but
it's really, really, really worth it

Best
Erick

On Thu, May 27, 2010 at 6:37 PM, efr...@gmail.com  wrote:

> Thank you. That seems to be working well, except when I included a wild
> card
> for any of the terms, the wildcard term isn't being found out.
>
> My searches are actually:
> q=+(A A*) +(C C*)&q.op=OR
>
> When I do a regular search on "A*" or "C*" I get matches but not in the
> context of the above query. The ability to use wildcards seems to get lost.
>
> This is all for the purposes of a "live search" in which we return matches
> as the user types, thus the wildcard.  A and C represent two different
> terms
> a user has typed in the search box (where we are providing the live-search
> results).
>
> thanks
>
> Brad
>
>
>
> On Thu, May 27, 2010 at 5:47 PM, Ahmet Arslan  wrote:
>
> >
> > > I have a query need that requires multiple OR conditions,
> > > and, there must be
> > > a match in each condition for the query to provide a
> > > result.
> > >
> > > The search would be * (A or B) AND (C or D)* and the only
> > > valid results it
> > > could turn up are:
> > >
> > > A B
> > > A C
> > > B C
> > > B D
> > >
> > > Can anyone provide guidance as to how to implement this in
> > > the query string?
> >
> > It should be something like : q=+(A B) +(C D)&q.op=OR
> >
> >
> >
> >
>


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Hi Ahmet,

Thanks for the response again. The best way I could illustrate our live
search feature is an example implementation:

http://www.krop.com/

Notice when you search the word "senior" in the keywords field, the results
filter down to just the job postings with that word in it.

So it's not the same as an "autocomplete" type feature where as the user
types in the search input box, their input is completed. We are just
focusing on providing results with each key stroke.  If the user types "Ca",
we will return anything with "Cat" in it. Thus we need the wildcard. As of
now we send a query to solr of "Ca*".  However, solr can struggle with
wildcards where it won't return a match on a word if there is a wildcard at
the end of a fully-typed word. You have to leave off the last letter of that
word and add an asterisk to match it.

We're attempting to do an "OR" search of the "term OR term*" anytime a user
enters a term. We need to mix these "or" searches with an AND because, if a
user types two words, we require both words to be in the result for it to
match.  Your suggestion as to how to do this worked
beautifully, except for the fact that it didn't seem to be able to find
wildcarded terms when indeed it should have.

thanks

Brad


On Thu, May 27, 2010 at 6:57 PM, Ahmet Arslan  wrote:

> > Thank you. That seems to be working
> > well, except when I included a wild card
> > for any of the terms, the wildcard term isn't being found
> > out.
> >
> > My searches are actually:
> > q=+(A A*) +(C C*)&q.op=OR
> >
> > When I do a regular search on "A*" or "C*" I get matches
> > but not in the
> > context of the above query. The ability to use wildcards
> > seems to get lost.
> >
> > This is all for the purposes of a "live search" in which we
> > return matches
> > as the user types, thus the wildcard.  A and C
> > represent two different terms
> > a user has typed in the search box (where we are providing
> > the live-search
> > results).
>
> Looks like you are looking for auto-suggest/complete feature. As the user
> types something there will be ajax suggestions right?
>
> Queries  A* or C* are not sorted by score/relevance. Can you explain in
> more detail what do you mean by "search on "A*" or "C*" I get matches
> but not in the context of the above query"
>
>
>
>
>


highlighting broken for multivalued text fields?

2010-05-27 Thread Darren Govoni
Hi,
  I want to verify a bug if someone can help. I have a text field:

   

I use to store text that I highlight on. If the field contains more than
one text value, highlighting does not seem to work.
No highlights are returned, even though the text exists in one of the
field values returned from the query.
Am I missing a flag or is this a bug?

thanks,
Darren


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread Ahmet Arslan


> Thanks for the response again. The best way I could
> illustrate our live
> search feature is an example implementation:
> 
> http://www.krop.com/
> 
> Notice when you search the word "senior" in the keywords
> field, the results
> filter down to just the job postings with that word in it.
> 
> So it's not the same as an "autocomplete" type feature
> where as the user
> types in the search input box, their input is completed. We
> are just
> focusing on providing results with each key stroke. 
> If the user types "Ca",
> we will return anything with "Cat" in it. Thus we need the
> wildcard. As of
> now we send a query to solr of "Ca*".  However, solr
> can struggle with
> wildcards where it won't return a match on a word if there
> is a wildcard at
> the end of a fully-typed word. You have to leave off the
> last letter of that
> word and an asterisk to match it.

OK, I was referring to the same thing. Each keystroke will return results.
There are many ways to achieve this. Are you going to suggest from
index/documents or from coming queries? I mean, do you have a separate index to
capture the most popular searches?

> We're attempting to do an "OR" search of the "term OR
> term*" anytime a user
> enters a term. 

term* is super set of term so you need to include/OR term in your query.

> Our need mix these "or" searches with an AND
> command is
> because if a user types two words, we are requiring both
> words be in the
> result to have a match.  

Generally there are two ways to deal with two or more words:
using wildcards on shingles (ShingleFilterFactory)
or
using EdgeNGramFilterFactory.

Do you mind the order of the words the user types? Should suggestions come in the
order that the user types them?





Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Thanks, I found full interface :)

On Thu, May 27, 2010 at 7:12 PM, Erick Erickson wrote:

> You can get a lot of mileage out of the admin
> analysis page and the "full interface" page, especially
> by turning on the "debug" option on the admin
> "full interface" page.
>
> It takes a bit of practice to read the debug output, but
> it's really, really, really worth it
>
> Best
> Erick
>
> On Thu, May 27, 2010 at 6:37 PM, efr...@gmail.com 
> wrote:
>
> > Thank you. That seems to be working well, except when I included a wild
> > card
> > for any of the terms, the wildcard term isn't being found out.
> >
> > My searches are actually:
> > q=+(A A*) +(C C*)&q.op=OR
> >
> > When I do a regular search on "A*" or "C*" I get matches but not in the
> > context of the above query. The ability to use wildcards seems to get
> lost.
> >
> > This is all for the purposes of a "live search" in which we return
> matches
> > as the user types, thus the wildcard.  A and C represent two different
> > terms
> > a user has typed in the search box (where we are providing the
> live-search
> > results).
> >
> > thanks
> >
> > Brad
> >
> >
> >
> > On Thu, May 27, 2010 at 5:47 PM, Ahmet Arslan  wrote:
> >
> > >
> > > > I have a query need that requires multiple OR conditions,
> > > > and, there must be
> > > > a match in each condition for the query to provide a
> > > > result.
> > > >
> > > > The search would be * (A or B) AND (C or D)* and the only
> > > > valid results it
> > > > could turn up are:
> > > >
> > > > A B
> > > > A C
> > > > B C
> > > > B D
> > > >
> > > > Can anyone provide guidance as to how to implement this in
> > > > the query string?
> > >
> > > It should be something like : q=+(A B) +(C D)&q.op=OR
> > >
> > >
> > >
> > >
> >
>


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Responses in blue

On Thu, May 27, 2010 at 7:32 PM, Ahmet Arslan  wrote:

>
>
> > Thanks for the response again. The best way I could
> > illustrate our live
> > search feature is an example implementation:
> >
> > http://www.krop.com/
> >
> > Notice when you search the word "senior" in the keywords
> > field, the results
> > filter down to just the job postings with that word in it.
> >
> > So it's not the same as an "autocomplete" type feature
> > where as the user
> > types in the search input box, their input is completed. We
> > are just
> > focusing on providing results with each key stroke.
> > If the user types "Ca",
> > we will return anything with "Cat" in it. Thus we need the
> > wildcard. As of
> > now we send a query to solr of "Ca*".  However, solr
> > can struggle with
> > wildcards where it won't return a match on a word if there
> > is a wildcard at
> > the end of a fully-typed word. You have to leave off the
> > last letter of that
> > word and an asterisk to match it.
>
> Okey i was referring the same. Each keystroke will return results.
> There are many way to achieve this. Are you going to suggest from
> index/documents or from coming queries? I mean do you have a separate index
> to capture most popular searches?
>

We do not have a separate index to capture most popular searches (is that
coming queries?)



>
> > We're attempting to do an "OR" search of the "term OR
> > term*" anytime a user
> > enters a term.
>
> term* is super set of term so you need to include/OR term in your query.
>
Thanks...


>  > Our need mix these "or" searches with an AND
> > command is
> > because if a user types two words, we are requiring both
> > words be in the
> > result to have a match.
>
> generally two ways:
> using wildcards on shingles (ShingleFilterFactory)
> or
> using EdgeNGramFilterFactory
> can deal two or more words
>
> Do you mind the order of words use types? Suggestions should come in order
> that the user types?
>
We don't mind the order of terms. We basically are sorting by two variables
that are independent of relevancy.  So I would assume the order doesn't
matter... we just need to make sure any results we filter down to (as you
saw in the krop.com example) contain the words the user has typed.


Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread Ahmet Arslan
> > We don't mind the order of terms. We basically are
> sorting by two variables
> that are independent of relevency.  So I would assume
> the order doesn't
> matter... we just need to make sure any results we filter
> down to (as you
> saw in the krop.com example) contain the words the user has
> typed.
> 

Let's say you have a short title field and you are going to give
suggest/autocomplete using this field from the index, and order is not important.
But in this ca
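
[fieldType definitions stripped by the archive; only positionIncrementGap="1" and maxGramSize="20" survive in the quoted copy below]
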
You can use these two fields, populate them from your short title field




and use a normal query (not a wildcard one) as the user types words:
q=titlePrefix:(term1 te) titlePrefixFull:"term1 te"&defType=lucene&q.op=OR&fl=Title
This will return suggestions. Does this satisfy your needs?
In this case you are suggesting the whole title field.

Or do you want to use ShingleFilterFactory with wildcard query?
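
To illustrate why an edge n-gram field lets a normal (non-wildcard) query match partially typed words, here is a small simulation in plain Python. It is only a sketch, not Solr's implementation: the function names are hypothetical, the tokenizer is assumed to be a simple lowercase whitespace split, max_gram=20 mirrors the maxGramSize="20" visible in the quoted config, and min_gram=1 is an assumption.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of token, shortest first."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_title(title):
    """Lowercase, split on whitespace, and collect the edge n-grams of every word."""
    grams = set()
    for word in title.lower().split():
        grams.update(edge_ngrams(word))
    return grams


def matches(title, typed):
    """True if every typed word (complete or partial) is a prefix of some title word."""
    grams = index_title(title)
    return all(word in grams for word in typed.lower().split())
```

With this, matches("Senior Cat Groomer", "ca sen") is true even though neither typed word is complete, because "ca" and "sen" were stored as terms at index time. The typed order does not matter in this sketch.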





Re: highlighting broken for multivalued text fields?

2010-05-27 Thread Koji Sekiguchi

(10/05/28 8:16), Darren Govoni wrote:

Hi,
   I want to verify a bug if someone can help. I have a text field:



I use to store text that I highlight on. If the field contains more than
one text value, highlighting does not seem to work.
No highlights are returned, even though the text exists in one of the
field values returned from the query.
Am I missing a flag or is this a bug?
   

Judging from the setting above, no.

If you can post more information, such as how the fieldType of your text
field is set, what the sample data you want to highlight looks like, and
which query parameters you are using, we can help you.

Koji

--
http://www.rondhuit.com/en/



Re: highlighting broken for multivalued text fields?

2010-05-27 Thread Darren Govoni
Hi Koji,
   Well, it's quite simple. Here is the field returned from my query:
"fox"




 
The bird flies in the sky.


The quick brown fox jumped over the fence.









No highlighting.

If the field only has one value "The quick brown fox jumped over the
fence." It works.
Interestingly, the first field value has no candidate highlight. But the
second appears to not be checked.

Seems either a bug or my expectation of behavior is wrong.

Darren

On Fri, 2010-05-28 at 09:59 +0900, Koji Sekiguchi wrote:

> (10/05/28 8:16), Darren Govoni wrote:
> > Hi,
> >I want to verify a bug if someone can help. I have a text field:
> >
> >  > multiValued="true" termVectors="true" termPositions="true"
> > termOffsets="true"/>
> >
> > I use to store text that I highlight on. If the field contains more than
> > one text value, highlighting does not seem to work.
> > No highlights are returned, even though the text exists in one of the
> > field values returned from the query.
> > Am I missing a flag or is this a bug?
> >
> As long as I see your setting above, no.
> 
> If you can post more information such as how you set
> fieldType of your text field, how look like your sample
> data you want to highlight and query parameters your
> are using, we can help you.
> 
> Koji
> 




Re: Does SOLR Allow q= (A or B) AND (C or D)?

2010-05-27 Thread efr...@gmail.com
Hi Ahmet,

Thanks again for the feedback. We will be searching several fields of each
object in the index (title, description, tags). The matches on keywords need
to be in any of these fields and there will be no different weights.

Does this affect your solution?

I'm trying to understand it as best I can as I didn't set up our solr nor am
I directly managing its implementation.

thanks

Brad


On Thu, May 27, 2010 at 8:12 PM, Ahmet Arslan  wrote:

> > > We don't mind the order of terms. We basically are
> > sorting by two variables
> > that are independent of relevency.  So I would assume
> > the order doesn't
> > matter... we just need to make sure any results we filter
> > down to (as you
> > saw in the krop.com example) contain the words the user has
> > typed.
> >
>
> Lets say you have short title field and you are going to give
> suggest/autocomplete using this field from index and order is not important.
> But in this ca
>
>  positionIncrementGap="1">
> 
> 
> 
> 
>  maxGramSize="20"/>
> 
> 
> 
> 
> 
> 
>
>  positionIncrementGap="1">
> 
> 
> 
>  maxGramSize="20"/>
> 
>
> You can use these two fields, populate them from your short title field
>
> 
> 
>
> and use normal query, (not wildcard) as the user types words
> q=titlePrefix:(term1 te) titlePrefixFull:"term1
> te"&defType=lucene&q.op=OR&fl=Title
> will return you suggestions. Does this satisfy your needs?
> In this case you are suggesting whole title field.
>
> Or do you want to use ShingleFilterFactory with wildcard query?
>
>
>
>


NoSuchFieldError: submap

2010-05-27 Thread Mauricio Scheffer
Hi, I'm trying to build from source to apply the field collapsing patch.
'Ant dist' runs just fine, no errors, but at startup I get a
"NoSuchFieldError: submap" exception (stack trace:
http://pastebin.com/NXsf0KJS ). This is before sending any requests. I don't
have any 'submap' field defined anywhere.
Has anyone seen this? Any ideas?

Thanks,
Mauricio


Re: highlighting broken for multivalued text fields?

2010-05-27 Thread Chris Hostetter

: Hi Koji,
:Well, its quite simple. Here is the field returned from my query:
: "fox"

Actually what Koji was asking for was the <fieldType/> declaration for
"text" (you posted the <field/> but not the <fieldType/>, so we only have
half a picture of the settings involved)

That said: the subject of this thread caught my eye, because it sounds
very similar to a known bug in 1.4 that has been fixed in svn (which i
just happened to be looking at because i was cleaning up Jira) ...

https://issues.apache.org/jira/browse/SOLR-1624

-Hoss