LetterTokenizer + EdgeNGram + apostrophe in query = invalid result

2011-02-25 Thread Matt Weber
I have the following field defined in my schema:

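(The XML did not survive the list archive; a minimal sketch of that kind of definition, i.e. LetterTokenizer plus an EdgeNGram filter on the index side, with the type name and gram sizes chosen only for illustration, would be:)

    <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.LetterTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.LetterTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <field name="person" type="edgytext" indexed="true" stored="true"/>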
I have the default field set to "person" and have indexed the
following document:

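(The document XML was also stripped; it is essentially a single person field, along the lines of the following, where the id value is made up and the name is inferred from the queries below:)

    <add>
      <doc>
        <field name="id">1</field>
        <field name="person">Vincent M D'Onofrio</field>
      </doc>
    </add>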

The following queries return the result as expected using the standard
request handler:

vincent m d onofrio
d'o
onofrio
d onofrio

The following query fails:

d'onofrio

This is weird because "d'o" returns a result.  As soon as I type the
"n" I start getting no results.  I ran this through the field analysis
page and it shows that this query is being tokenized correctly and
should yield a result.

I am using a build of trunk Solr (r1073990) and the example
solrconfig.xml.  I am also using the example schema with the addition
of my ngram field.

Any ideas?  I have tried this with other words containing an
apostrophe and they all stop returning results after 4 characters.


Thanks,
Matt Weber


Re: Ramdirectory

2011-02-25 Thread Matt Weber
I have used this without issue.  In the example solrconfig.xml replace
this line:



with this one:



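(The archive stripped the XML from both lines above; assuming the stock example config, the change amounts to swapping the directoryFactory class, roughly:)

    <!-- default in the example solrconfig.xml -->
    <directoryFactory name="DirectoryFactory"
                      class="org.apache.solr.core.StandardDirectoryFactory"/>

    <!-- RAM-backed index, not persisted across restarts -->
    <directoryFactory name="DirectoryFactory"
                      class="org.apache.solr.core.RAMDirectoryFactory"/>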
Thanks,
Matt Weber

On Thu, Feb 24, 2011 at 7:47 PM, Bill Bell  wrote:
> Thanks - yeah that is why I asked how to use it. But I still don't know
> how to use it.
>
> https://hudson.apache.org/hudson/job/Solr-3.x/javadoc/org/apache/solr/core/RAMDirectoryFactory.html
>
>
> https://issues.apache.org/jira/browse/SOLR-465
>
> 
> 
> 
>
>
> Is that right? Examples? Options?
>
> Where do I put that in solrconfig.xml ? Do I put it in
> mainIndex/directoryProvider ?
>
> I know that SOLR-465 is more generic, but
> https://issues.apache.org/jira/browse/SOLR-480 seems easier to use.
>
>
>
> Thanks.
>
>
> On 2/24/11 6:21 PM, "Chris Hostetter"  wrote:
>
>>
>>: I could not figure out how to setup the ramdirectory option in
>>solrconfig.XML. Does anyone have an example for 1.4?
>>
>>it wasn't an option in 1.4.
>>
>>as Koji had already mentioned in the other thread where you chimed in
>>and asked about this, it was added in the 3x branch...
>>
>>http://lucene.472066.n3.nabble.com/Question-Solr-Index-main-in-RAM-td2567166.html
>>
>>
>>
>>-Hoss
>
>
>



-- 
Thanks,
Matt Weber


Re: field collapsing sums

2009-09-30 Thread Matt Weber
You might want to see how the stats component works with field  
collapsing.
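For reference, a plain StatsComponent request already produces that kind of per-group sum; something along these lines (num_in_stock is taken from Joe's mail below, the group field name is hypothetical), even though it does not hook into the collapsing component itself:

    /select?q=*:*&stats=true&stats.field=num_in_stock&stats.facet=group_field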


Thanks,

Matt Weber

On Sep 30, 2009, at 5:16 PM, Uri Boness wrote:


Hi,

At the moment I think the most appropriate place to put it is in the  
AbstractDocumentCollapser (in the getCollapseInfo method). Though,  
it might not be the most efficient.


Cheers,
Uri

Joe Calderon wrote:
Hello all, I have a question on the field collapsing patch. Say I have
an integer field called "num_in_stock" and I collapse by some other
column. Is it possible to sum up that integer field and return the
total in the output? If not, how would I go about extending the
collapsing component to support that?


thx much

--joe






Re: Showing few results for each category (facet)

2009-09-29 Thread Matt Weber
So, you want to display 5 results from each category and still know  
how many results are in each category.  This is a perfect situation  
for the field collapsing patch:


https://issues.apache.org/jira/browse/SOLR-236
http://wiki.apache.org/solr/FieldCollapsing

Here is how I would do it.

Add a field to your schema called category or whatever.  Then while  
indexing you populate that field with whatever category the document  
belongs in.  While executing a search, collapse the results on that  
field with a max collapse of 5.  This will give you at most 5 results  
per category.  Now, at the same time enable faceting on that field and  
DO NOT use the collapsing parameter to recount the facet values.  This
means that the facet counts will reflect the non-collapsed
results.  This facet should only be used to get the count for each  
category, not displayed to the user.  On your search results page that  
gets the collapsed results, you can put a link that says "Show all X  
results from this category" where X is the value you pull out of the  
facet.  When a user clicks that link you basically do the same search  
with field collapsing disabled, and a filter query on the specific  
category they want to see, for example:  &fq=category:people.
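With the SOLR-236 patch the two requests look roughly like this (parameter names have changed between patch revisions, so treat them as illustrative):

    # collapsed view: at most 5 results per category, plus uncollapsed facet counts
    /select?q=...&collapse.field=category&collapse.max=5&facet=true&facet.field=category

    # "Show all X results from this category": same query, collapsing off, filtered
    /select?q=...&fq=category:people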


Hope this helps.

Thanks,

Matt Weber

On Sep 29, 2009, at 4:55 AM, Marian Steinbach wrote:

On Tue, Sep 29, 2009 at 11:36 AM, Varun Gupta  
 wrote:

...

One way that I can think of doing this is by making as many queries  
as there
are categories and show these results under each category. But this  
will be

very inefficient. Is there any way I can do this ?



Hi Varun!

I think that doing multiple queries doesn't have to be inefficient,
since Solr caches subsequent queries for the same term and facets.

Imagine this as your first query:
- q: xyz
- facets: myfacet

and this as a second query:
- q:xyz
- fq: myfacet=a

Compared to the first query, the second query will be very fast, since
all the hard work has been done in query one and then cached.

At least that's my understanding. Please correct me if I'm wrong.

Marian




Re: Usage of Sort and fq

2009-09-29 Thread Matt Weber

A description and examples of both parameters can be found here:

http://wiki.apache.org/solr/CommonQueryParameters

Thanks,

Matt Weber

On Sep 29, 2009, at 4:10 AM, Avlesh Singh wrote:


/?q=*:*&fq=category:animal&sort=child_count%20asc

Search for all documents (of animals), and filter the ones that  
belong to
the category "animal" and sort ascending by a field called  
child_count that

contains number of children for each animal.

You can pass multiple fq's with more "&fq=..." parameters. Secondary,
tertiary sorts can be specified using comma (",") as the separator.  
i.e.

"sort=fieldA asc,fieldB desc, fieldC asc, ..."

Cheers
Avlesh

On Tue, Sep 29, 2009 at 3:51 PM, bhaskar chandrasekar
wrote:


Hi,

Can some one let me know how to use sort and fq parameters in Solr.
Any examples woould be appreciated.

Regards
Bhaskar







Re: Using two Solr documents to represent one logical document/file

2009-09-26 Thread Matt Weber

Check out the field collapsing patch:

http://wiki.apache.org/solr/FieldCollapsing
https://issues.apache.org/jira/browse/SOLR-236
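The usual trick with that patch is to give both Solr documents (content and metadata) a shared key for the logical document and collapse on it; a rough sketch with made-up names (collapse parameter names vary by patch revision):

    <!-- schema.xml: same value on the content doc and on the metadata doc -->
    <field name="logical_id" type="string" indexed="true" stored="true"/>

    <!-- query time: one row per logical document -->
    /select?q=...&collapse.field=logical_id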

Thanks,

Matt Weber

On Sep 25, 2009, at 3:15 AM, Peter Ledbrook wrote:



Hi,

I want to index both the contents of a document/file and metadata  
associated
with that document. Since I also want to update the content and  
metadata

indexes independently, I believe that I need to use two separate Solr
documents per real/logical document. The question I have is how do I  
merge
query results so that only one result is returned per real/logical  
document,
not per Solr document? In particular, I don't want to filter the  
results to

satisfy any "max results" constraint.

I have read that this can be achieved with a facet search. Is this  
the best

approach, or is there some alternative?

Thanks,

Peter
--
View this message in context: 
http://www.nabble.com/Using-two-Solr-documents-to-represent-one-logical-document-file-tp25609646p25609646.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Is it possible to query for "everything" ?

2009-09-14 Thread Matt Weber

Query for *:*
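For example, against the standard request handler (host and core are whatever you run locally):

    http://localhost:8983/solr/select?q=*:*&rows=10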

Thanks,

Matt Weber

On Sep 14, 2009, at 4:18 PM, Jonathan Vanasco wrote:


I'm using Solr for seach and faceted browsing

Is it possible to have solr search for 'everything' , at least as  
far as q is concerned ?


The request handlers I've found don't like it if I don't pass in a q  
parameter




Re: Searching for the '+' character

2009-09-14 Thread Matt Weber
Why don't you create a synonym for + that expands to your customer's
product name that includes the plus?  You can even have your FE do
this sort of replacement BEFORE submitting to Solr.
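A sketch of what that could look like (the product name is invented, and the exact behaviour depends on where the filter sits in your analyzer chain):

    # synonyms.txt
    powerplus+ => powerplusplus

    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>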


Thanks,

Matt Weber

On Sep 14, 2009, at 11:42 AM, AHMET ARSLAN wrote:


Thanks Ahmet,

Thats excellent, thanks :) I may have to increase the
gramsize to take into account other possible uses but i can
now read around these filters to make the adjustments.

With regard to WordDelimiterFilterFactory. Is there a way
to place a delimiter on this filter to still get most of its
functionality without it absorbing the + signs?


Yes, you are right, preserveOriginal="1" will cause the original
token to be indexed without modifications.



Will I lose a lot of 'good' functionality by removing it?


It depends on your input data. It is used to break one token into
subwords.

Like: "Wi-Fi" -> "Wi", "Fi" and "PowerShot" -> "Power", "Shot"
If your input data set contains such words, you may need it.

But I think using NGramFilter(s) just to make the last character
searchable is not an optimal solution. I don't know what type of
dataset you have, but I think using two separate fields (with
different types) for that is more suitable. One field will contain the
actual data itself. The other will hold only the last character(s).


You can achieve this with a copyField or programmatically during
indexing. The type of the lastCharsField field will use
EdgeNGramFilter so that only the last character of the token(s) will
pass that filter.
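A rough schema sketch of that second field, using a back-anchored EdgeNGram of size 1 so only the last character survives (the type name is illustrative; originalField and lastCharsField are the names used below):

    <fieldType name="last_char" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" side="back" minGramSize="1" maxGramSize="1"/>
      </analyzer>
    </fieldType>
    <field name="lastCharsField" type="last_char" indexed="true" stored="false"/>
    <copyField source="originalField" dest="lastCharsField"/>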


During searching you will search those two fields:
originalField:\+ OR lastCharsField:\+

The query lastCharsField:\+ will return you all the products ending  
with +.


Hope this helps.









Re: When to optimize?

2009-09-13 Thread Matt Weber
I would say once a day is a pretty good rule of thumb.  If you think
this is a bit much and you have few updates, you can probably back
that off to once every couple of days, or even once a week.  However, if you
have a large batch update or your query performance starts to degrade,
you will need to optimize your index.
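For reference, an optimize is just an update message posted to your update handler, something like:

    curl http://localhost:8983/solr/update --data-binary '<optimize/>' -H 'Content-Type: text/xml'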


Thanks,

Matt Weber

On Sep 13, 2009, at 6:21 PM, William Pierce wrote:


Folks:

Are there good rules of thumb for when to optimize?  We have a large  
index consisting of approx 7M documents and we currently have it set  
to optimize once a day.  But sometimes there are very few changes  
that have been committed during a day and it seems like a waste to  
optimize (esp. since our servers are pretty well loaded).


So I was looking to get some good rules of thumb for when it makes  
sense to optimize:   Optimize when x% of the documents have been  
changed since the last optimize or some such.


Any ideas would be greatly appreciated!

-- Bill 




Re: Solr Cell

2009-07-23 Thread Matt Weber
Found my own answer, use the literal parameter.  Should have dug  
around before asking.  Sorry.
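For anyone searching the archives later: the idea is to pass one literal.<fieldname> parameter per metadata field when posting the file to the extracting handler, e.g. (the author and source field names are just examples):

    curl "http://localhost:8983/solr/update/extract?literal.id=somefile.pdf&literal.author=Jane+Doe&literal.source=intranet&commit=true" \
         -F "file=@somefile.pdf"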


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On Jul 23, 2009, at 2:26 PM, Matt Weber wrote:

Is it possible to supply additional metadata along with the binary
file when using Solr Cell?


For example, I have a pdf called somefile.pdf and I have some  
external metadata related to that file.  Such metadata might be  
things like author, publisher, source, date published, etc.   I want  
to post the binary data for somefile.pdf to Solr Cell AND map my  
metadata into other fields in the same document that has the  
extracted text from the pdf.


I know I could do this using Tika and SolrJ directly, but it would  
be much easier if Solr Cell can do it.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com








Solr Cell

2009-07-23 Thread Matt Weber
Is it possible to supply additional metadata along with the binary file
when using Solr Cell?


For example, I have a pdf called somefile.pdf and I have some external  
metadata related to that file.  Such metadata might be things like  
author, publisher, source, date published, etc.   I want to post the  
binary data for somefile.pdf to Solr Cell AND map my metadata into  
other fields in the same document that has the extracted text from the  
pdf.


I know I could do this using Tika and SolrJ directly, but it would be  
much easier if Solr Cell can do it.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com






Re: Solr relevancy score - conversion

2009-06-09 Thread Matt Weber
Solr does not support this.  You can do it yourself by taking the  
highest score and using that as 100% and calculating other percentages  
from that number.  For example if the max score is 10 and the next  
result has a score of 5, you would do (5 / 10) * 100 = 50%.  Hope this  
helps.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On Jun 8, 2009, at 10:05 PM, Vijay_here wrote:



Hi,

I am using solr to index some of the legal documents, where I need the solr
search engine to return a relevancy ranking score for each search result. As
of now I am getting scores like 3.12, 1.23, 0.23 and so on.

I would need a more proportionate score, like rounded to 100% (95%
relevant, 80% relevant and so on). Is there a way to make solr return
such relevance scores? Any other approach to arrive at these scores
would also be appreciated.

thanks
vijay
--
View this message in context: 
http://www.nabble.com/Solr-relevancy-score---conversion-tp23936413p23936413.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Solr Home on Linux JBoss ignored

2009-06-05 Thread Matt Weber

Check the dataDir setting in solrconfig.xml.
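In the example solrconfig.xml it looks roughly like the first form below; a hard-coded Windows path there would explain what you are seeing.  The property-driven form is the usual fix:

    <!-- problem case: a hard-coded path -->
    <dataDir>C:\home\jboss\solr\data</dataDir>

    <!-- property-driven alternative -->
    <dataDir>${solr.data.dir:./solr/data}</dataDir>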

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On Jun 5, 2009, at 6:03 AM, Dean Pullen wrote:


I lied, it's actually saving data to:

/usr/local/jboss-portal-2.7.1.GA/bin/C:\home\jboss\solr\data

Which is a tad crazy! And I have no idea why!

Dean.

-Original Message-
From: Dean Pullen [mailto:dean.pul...@msp-uk.com]
Sent: 05 June 2009 09:47
To: solr-user@lucene.apache.org
Subject: Solr Home on Linux JBoss ignored

Hi all,

Have an odd problem on JBoss 4.2.3 running on Redhat.
It's odd, because the configuration works fine on Windows.

Our Solr home is defined in the Solr.war web.xml as:

[Linux]

  <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-type>java.lang.String</env-entry-type>
    <env-entry-value>/home/jboss/solr</env-entry-value>
  </env-entry>

[Windows]

  <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-type>java.lang.String</env-entry-type>
    <env-entry-value>c:/home/jboss/solr</env-entry-value>
  </env-entry>

However, on Linux Solr is still defaulting to JBoss Web's [Tomcat]  
work directory, i.e.
/usr/local/jboss-portal-2.7.1.GA/server/default/work/jboss.web/ 
localhost/solr


Instead of the defined /home/jboss/solr


Can anyone shed any light on this?

Thanks,

Dean.






Re: Facet counts limit

2009-05-20 Thread Matt Weber
1.  The limit parameter takes a signed integer, so the max value is  
2,147,483,647.
2.  I don't think there is a defined limit, which means you are
only limited by what your system can handle.
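For example, to get every value of a facet field back instead of the default 100, pass a negative limit (the field name here is just from the example schema):

    http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.limit=-1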


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 20, 2009, at 11:41 AM, sachin78 wrote:



Have two questions?

1) What is the limit on facet counts?  ex : test(10,0).Is this  
valid?


2) What is the limit on the no of facets? how many facets can a  
query get?


--Sachin
--
View this message in context: 
http://www.nabble.com/Facet-counts-limit-tp23641105p23641105.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Search Query Questions

2009-05-14 Thread Matt Weber
I think you will want to look at the Field Collapsing patch for this: http://issues.apache.org/jira/browse/SOLR-236.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 14, 2009, at 5:52 PM, Chris Miller wrote:


Oh, one more question

3) Is there a way to effectively do a GROUP BY? For example, if I  
have a document that has a photoID attached to it, is there a way to  
return a set of results that does not duplicate the photoID field?


Thanks,

Chris Miller
ServerMotion
www.servermotion.com



On May 14, 2009, at 7:46 PM, Chris Miller wrote:


I have two questions:

1) How do I search for ALL items? For example, I provide a sort  
query parameter of "updated" and a rows query parameter of 10 to  
limit the query results. I still have to provide a search query, of  
course. What if I want to provide a list of ALL results that match  
this? Or, in this case, the most recent 10 updated documents?


2) How do I search for all documents with a field that has data?  
For example, I have a field "foo" that is optional and multi-valued.
How do I search for documents that have this field set to anything?


Thanks,

Chris Miller
ServerMotion
www.servermotion.com









Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
Here is a good presentation on search security from the Infonortics  
Search Conference that was held a few weeks ago.


http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf

The approach you are using is called early-binding.  As Jay mentioned,  
one of the downsides is updating the documents each time you have an  
ACL change.  You could use the late-binding approach that checks each  
result after the query but before you display to the user.  I don't  
recommend this approach because it will strain your security
infrastructure: you will need to check whether the user can access
each individual result.
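With the three-field scheme Terence sketches below (ownerUid, grantedUid, deniedUid), the early-binding check at query time is just a filter query your application appends, roughly like this (the uid value is made up):

    &fq=(ownerUid:jdoe OR grantedUid:jdoe) -deniedUid:jdoe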


Good luck.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 1:21 PM, Jay Hill wrote:

The only downside would be that you would have to update a document  
anytime
a user was granted or denied access. You would have to query before  
the
update to get the current values for grantedUID and deniedUID,  
remove/add
values, and update the index. If you don't have a lot of changes in  
the
system that wouldn't be a big deal, but if a lot of changes are  
happening

throughout the day you might have to queue requests and batch them.

-Jay

On Tue, May 12, 2009 at 1:05 PM, Matt Weber   
wrote:


I also work with the FAST Enterprise Search engine and this is  
exactly how
their Security Access Module works.  They actually use a modified  
base-32
encoded value for indexing, but that is because they don't have the  
luxury

of untokenized/un-processed String fields like Solr.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com





On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it.  That's a very  
practical
approach, and is worth taking a closer look at.  Actually, taking  
your

idea
one step further, perhaps three fields; 1) ownerUid (uid of the  
document's
owner) 2) grantedUid (uid of users who have been granted access),  
and 3)

deniedUid (uid of users specifically denied access to the document).
These
fields, coupled with some business rules around how they were  
populated

should cover off all possibilities I think.

Access to the Solr instance would have to be tightly controlled, but
that's
something that should be done anyway.  You sure wouldn't want end  
users
preparing their own XML and throwing it at Solr -- it would be  
pretty easy
to figure out how to get around the access/denied fields and get  
at stuff

the owner didn't intend.

This approach mimics to some degree what is being done in the  
operating
system, but it's still elegant and provides the level of control  
required.
Anybody else have any thoughts in this regard?  Has anybody  
implemented
anything similar, and if so, how did it work?  Thanks, and best  
regards...


Terence








Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
I also work with the FAST Enterprise Search engine and this is exactly  
how their Security Access Module works.  They actually use a modified  
base-32 encoded value for indexing, but that is because they don't  
have the luxury of untokenized/un-processed String fields like Solr.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it.  That's a very  
practical
approach, and is worth taking a closer look at.  Actually, taking  
your idea
one step further, perhaps three fields; 1) ownerUid (uid of the  
document's
owner) 2) grantedUid (uid of users who have been granted access),  
and 3)
deniedUid (uid of users specifically denied access to the  
document).  These
fields, coupled with some business rules around how they were  
populated

should cover off all possibilities I think.

Access to the Solr instance would have to be tightly controlled, but  
that's
something that should be done anyway.  You sure wouldn't want end  
users
preparing their own XML and throwing it at Solr -- it would be  
pretty easy
to figure out how to get around the access/denied fields and get at  
stuff

the owner didn't intend.

This approach mimics to some degree what is being done in the  
operating
system, but it's still elegant and provides the level of control  
required.
Anybody else have any thoughts in this regard?  Has anybody  
implemented
anything similar, and if so, how did it work?  Thanks, and best  
regards...


Terence




Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
I mean you can sort the facet results by frequency, which happens to  
be the default behavior.


Here is an example field for your schema:

stored="true" multiValued="true" />


Here is an example query:

http://localhost:8983/solr/select?q=textfield:copper&facet=true&facet.field=textfieldfacet&facet.limit=5

This will give you the top 5 words in the textfieldfacet.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:57 AM, sachin78 wrote:



Thanks Matt for your reply.

What do you mean by frequency(the default)?

Can you please provide an example schema and query will look like.

--Sachin


Matt Weber-2 wrote:


You may have to take care of this at index time.  You can create a  
new

multivalued field that has minimal processing.  Then at index time,
index the full contents of textfield as normal, but then also split  
it

on whitespace and index each word in the new field you just created.
Now you will be able to facet on this new field and sort the facet by
frequency (the default) to get the most popular words.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:33 AM, sachin78 wrote:



Does anybody have answer to this post.I have a similar requirement.

Suppose I have free text field say
I index the field.If I search for textfield:copper.I have to get  
facet

counts for the most common words found in a textfield.
ie.

example:search for textfield:glass
should return facet counts for common words found textfield.
semiconductor(10),iron(20), silicon (25) material (8) thin(25) and
so on.
Can this be done using tagging or MLT.

Thanks,
Sachin


Raju444us wrote:


I have a requirement. If I search for text field let's say
"metal:glass"
what i want is to get the facet counts for all the terms related to
"glass" in my search results.

window(100)  since a window can be glass.
plastic(10)  plastic is a material just like glass
Iron(10)
Paper(15)

Can I use MLT to get this functionality.Please let me know how  
can I

achieve this.If possible an example query.

Thanks,
Raju



--
View this message in context:
http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23503794.html
Sent from the Solr - User mailing list archive at Nabble.com.







--
View this message in context: 
http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23504241.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Facet counts for common terms of the searched field

2009-05-12 Thread Matt Weber
You may have to take care of this at index time.  You can create a new  
multivalued field that has minimal processing.  Then at index time,  
index the full contents of textfield as normal, but then also split it  
on whitespace and index each word in the new field you just created.   
Now you will be able to facet on this new field and sort the facet by  
frequency (the default) to get the most popular words.
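A minimal sketch of that, if you go the copyField route rather than splitting in your own indexing code (the new field name and the whitespace-only text_ws type are my own choices):

    <field name="textfield_words" type="text_ws" indexed="true" stored="false" multiValued="true"/>
    <copyField source="textfield" dest="textfield_words"/>

Faceting on textfield_words then gives per-word counts, e.g.:

    /select?q=textfield:glass&facet=true&facet.field=textfield_words&facet.limit=10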


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 7:33 AM, sachin78 wrote:



Does anybody have answer to this post.I have a similar requirement.

Suppose I have free text field say
I index the field.If I search for textfield:copper.I have to get facet
counts for the most common words found in a textfield.
ie.

example:search for textfield:glass
should return facet counts for common words found textfield.
semiconductor(10),iron(20), silicon (25) material (8) thin(25) and  
so on.

Can this be done using tagging or MLT.

Thanks,
Sachin


Raju444us wrote:


I have a requirement. If I search for text field let's say  
"metal:glass"

what i want is to get the facet counts for all the terms related to
"glass" in my search results.

window(100)  since a window can be glass.
plastic(10)  plastic is a material just like glass
Iron(10)
Paper(15)

Can I use MLT to get this functionality.Please let me know how can I
achieve this.If possible an example query.

Thanks,
Raju



--
View this message in context: 
http://www.nabble.com/Facet-counts-for-common-terms-of-the-searched-field-tp23302410p23503794.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: solr + wordpress

2009-05-08 Thread Matt Weber

I actually wrote a plugin that integrates Solr with WordPress.

http://www.mattweber.org/2009/04/21/solr-for-wordpress/
http://wordpress.org/extend/plugins/solr-for-wordpress/
https://launchpad.net/solr4wordpress

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 8, 2009, at 10:10 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:



Somebody has written an article on integrating Solr with wordpress

http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/

--
-
Noble Paul | Principal Engineer| AOL | http://aol.com




Re: Solr autocompletion in rails

2009-05-07 Thread Matt Weber
First, your solrconfig.xml should have something similar to the
following:


  class="org.apache.solr.handler.component.TermsComponent"/>


  class="org.apache.solr.handler.component.SearchHandler">


  termsComp

  

This will give you a request handler called "/autoSuggest" that you  
will use for suggestions.


Then you need to write some rails code to access this.  I am not very  
familiar with ruby, but I believe you might want to try
http://wiki.apache.org/solr/solr-ruby.  Make sure you set your query
type to "/autoSuggest".  If that won't
work for you, then just use the standard http libraries to access the  
autoSuggest url directly and get json output.


With any of these methods make sure you set the following parameters:

terms=true
terms.fl=source_field
terms.lower=input_term
terms.prefix=input_term
terms.lower.incl=false

For direct access to the json output you will  want these as well:

indent=true
wt=json

The terms.fl parameter specifies the field(s) you want to use as the
source for suggestions.  Make sure this field has very little  
processing done on it, maybe lowercasing and tokenization only.


Here is an example url that should give you some output once things  
are working:


http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=spell&terms.lower=t&terms.prefix=t&terms.lower.incl=false&indent=true&wt=json

The next thing is to parse the json output and do whatever you want  
with the results.  In my example, I just printed out each suggestion  
on a single line of the response because this is what the jQuery  
autocomplete plugin wanted.  The easiest way to parse the json output  
is to use the json ruby library, http://json.rubyforge.org/.


After you have your rails controller working you can hook it into your  
FE with some javascript like I did in the example on my blog.  Hope  
this helps.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 7, 2009, at 7:37 AM, manisha_5 wrote:



Thanks a lot for the information. But I am still a bit confused  
about the use
of TermsComponents. Like where are we exactly going to put these  
codes in
Solr. For example I changed schema.xml to add the autocomplete feature. I read
your blog too, it's very helpful. But still a little confused. :-((
Can you explain it a bit?



Matt Weber-2 wrote:


You will probably want to use the new TermsComponent in Solr 1.4.   
See

http://wiki.apache.org/solr/TermsComponent
.  I just recently wrote a blog post about using autocompletion with
TermsComponent, a servlet, and jQuery.  You can probably follow these
instructions, but instead of writing a servlet you can write a rails
handler parsing the json output directly.

http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/
.

Thanks,

Matt Weber



On May 4, 2009, at 9:39 AM, manisha_5 wrote:



Hi,

I am new to solr. I am using solr server to index the data and make
search
in a Ruby on rails project.I want to add autocompletion feature. I
tried
with the xml patch in the schema.xml file of solr, but dont know how
to test
if the feature is working.also havent been able to integrate the
same in the
Rails project that is using Solr.Can anyone please provide some help
in this
regards??

the patch of codes in Schema.xml is :


  
  
  
  
  
  
  
  
  
  
  
  
 

--
View this message in context:
http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23372020.html
Sent from the Solr - User mailing list archive at Nabble.com.







--
View this message in context: 
http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23428267.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Conditional/Calculated Fields (is it possible?)

2009-05-06 Thread Matt Weber
I do not think this is possible.  You will probably want to handle  
this logic on your side during indexing.  Index the document with the  
fist price, then as that price expires, update the document with the  
new price.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 6, 2009, at 4:32 AM, Andrew Ingram wrote:


Hi everyone,

I'm working on the search schema for ecommerce products and I'm having
an issue with the prices.

Basically, a product has two price values and a date, the product
effectively has one price before the date and the other one after.
This poses no problem for the site itself since I can use conditional
logic, but I have no idea how to approach this with regards to solr
queries.

The price of a product is used for both faceting and sorting and
should use whichever price is active at the time of the query. Is
there any way to do define a field whose value is a simple algorithm
operating on the value of other fields?

I'm quite happy to use a custom field type if necessary, though I'm
not sure if what I want is even possible and I don't really know where
to begin.

Any help would be appreciated

Regards,
Andrew Ingram




Re: Multi-index Design

2009-05-05 Thread Matt Weber
1 - A field that is called "type" which is probably a string field  
that you index values such as "people", "organization", "product".


2 - Yes, for each document you are indexing, you will include it's  
type, ie. "person"


3, 4, 5 - You would have a core for each domain.  Each domain will  
then have it's own index that contains documents of all types.  See http://wiki.apache.org/solr/MultipleIndexes 
.


Thanks,

Matt Weber




On May 5, 2009, at 11:14 AM, Michael Ludwig wrote:


Chris Masters schrieb:


- flatten the searchable objects as much as I can - use a type field
  to distinguish - into a single index
- use multi-core approach to segregate domains of data


Some newbie questions:

(1) What is a "type field"? Is it to designate different types of
documents, e.g. product descriptions and forum postings?

(2) Would I include such a "type field" in the data I send to the  
update

facility and maybe configure Solr to take special action depending on
the value of the update field?

(3) Like, write the processing results to a domain dedicated to that
type of data that I could limit my search to, as per Otis' post?

(4) And is that what's called a "core" here?

(5) Or, failing (3), and lumping everything together in one search
domain (core?), would I use that "type field" to limit my search to
a particular type of data?

Michael Ludwig




Re: Solr autocompletion in rails

2009-05-04 Thread Matt Weber
You will probably want to use the new TermsComponent in Solr 1.4.  See http://wiki.apache.org/solr/TermsComponent 
.  I just recently wrote a blog post about using autocompletion with  
TermsComponent, a servlet, and jQuery.  You can probably follow these  
instructions, but instead of writing a servlet you can write a rails  
handler parsing the json output directly.


http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/ 
.


Thanks,

Matt Weber



On May 4, 2009, at 9:39 AM, manisha_5 wrote:



Hi,

I am new to solr. I am using solr server to index the data and make  
search
in a Ruby on rails project.I want to add autocompletion feature. I  
tried
with the xml patch in the schema.xml file of solr, but dont know how  
to test
if the feature is working.also havent been able to integrate the  
same in the
Rails project that is using Solr.Can anyone please provide some help  
in this

regards??

the patch of codes in Schema.xml is :


   
   minGramSize="3"

maxGramSize="15" />
   
   
   maxGramSize="100"

minGramSize="1" />
   
   
   
   
   
   
   
  

--
View this message in context: 
http://www.nabble.com/Solr-autocompletion-in-rails-tp23372020p23372020.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: autoSuggest

2009-05-04 Thread Matt Weber
I am not sure you can return the results in order of frequency, you  
will have to sort the results yourself.  Also, for autoSuggest you  
will want to add the terms.prefix=input term and  
terms.lower.incl=false so your example will be:


/autoSuggest?terms=true&indent=true&terms.fl=title&terms.rows=5&terms.lower=simp&terms.lower.incl=false&terms.prefix=simp&omitHeader=true


To get results for more multiple words such as "barack obama", you  
need to set the terms.fl parameter to an untokenized, un-processed  
field just as you would with a facet.  So in your schema.xml, add a  
new string field, then use a copyfield to copy the value of title into  
the new field and set terms.fl to the new field you just created after  
reindexing.
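Roughly, with a hypothetical title_exact field:

    <field name="title_exact" type="string" indexed="true" stored="false"/>
    <copyField source="title" dest="title_exact"/>

Then point terms.fl=title_exact at it, and a prefix such as terms.prefix=barack%20ob will match against the whole phrase.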


Thanks,

Matt Weber

On May 4, 2009, at 6:46 AM, sunnyfr wrote:



Hi,

I would like to know how work /autoSuggest.

I do have result when I hit :
/autoSuggest?terms=true&indent=true&terms.fl=title&terms.rows=5&terms.lower=simp&omitHeader=true


I've:



74
129
2
2
1




How can I ask it to suggest first the expressions which are more frequent
in the database?
How can I make it look for two words, i.e. I look for "bara" ... and have
it suggest "barack obama"???

thanks a lot,


--
View this message in context: 
http://www.nabble.com/autoSuggest-tp23367848p23367848.html
Sent from the Solr - User mailing list archive at Nabble.com.



Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com






Re: Highlight MoreLikeThis results?

2009-05-04 Thread Matt Weber
There was a thread about this last week and the verdict is that currently
you can't highlight MoreLikeThis results.


Thanks,

Matt Weber




On May 4, 2009, at 1:22 AM, jli...@gmail.com wrote:


My query returns a number of MoreLikeThis results for a given
document. I wonder if there is a way to highlight the terms
in the MoreLikeThis results? Thanks.





Re: Term highlighting with MoreLikeThisHandler?

2009-04-30 Thread Matt Weber
Yes, I understand you can't highlight a document within a document.
However, with MLT you are using the interesting terms from the source
document(s) to find similar results.  An obvious solution would be
highlighting the interesting terms that matched and thus made the
result similar.


Thanks,

Matt Weber


On Apr 29, 2009, at 9:27 PM, Walter Underwood wrote:


Think about this for a moment. When you use MoreLikeThis, the query
is a document. How do you highlight a document in another document?

wunder

On 4/29/09 9:21 PM, "Matt Weber"  wrote:


Any luck on this?  I am experiencing the same issue.  Highlighting
works fine on all other request handlers, but breaks when I use the
MoreLikeThisHandler.

Thanks,

Matt Weber




On Apr 28, 2009, at 5:29 AM, Eric Sabourin wrote:


Yes... at least I think so.  the highlighting works correctly for me
on
another request handler... see below the request handler for my
morelikethishandler query.
Thanks for your help... Eric



  

   
   score,id,timestamp,type,textualId,subject,url,server
  

   explicit
   true
   list
 subject,requirements,productName,justification,operation_exact
str>
 2
 1
 2

  true
  1
  
  0
  0
  regex 
  regex
  



On Mon, Apr 27, 2009 at 11:30 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:



Eric,

Have you tried using MLT with parameters described on
http://wiki.apache.org/solr/HighlightingParameters ?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Eric Sabourin 
To: solr-user@lucene.apache.org
Sent: Monday, April 27, 2009 10:31:38 AM
Subject: Term highlighting with MoreLikeThisHandler?

I submit a query to the MoreLikeThisHandler to find documents
similar to

a

specified document.  This works and I've configured my request
handler to
also return the interesting terms.

Is it possible to have MLT return to me highlight snippets in the
similar
documents it returns? I mean generate hl snippets of the  
interesting

terms?

If so how?

Thanks... Eric






--
Eric
Sent from Halifax, NS, Canada








Re: Term highlighting with MoreLikeThisHandler?

2009-04-29 Thread Matt Weber
Any luck on this?  I am experiencing the same issue.  Highlighting  
works fine on all other request handlers, but breaks when I use the  
MoreLikeThisHandler.


Thanks,

Matt Weber




On Apr 28, 2009, at 5:29 AM, Eric Sabourin wrote:

Yes... at least I think so.  the highlighting works correctly for me  
on

another request handler... see below the request handler for my
morelikethishandler query.
Thanks for your help... Eric


 
   


score,id,timestamp,type,textualId,subject,url,server
   

explicit
true
list
  name 
= 
"mlt 
.fl">subject,requirements,productName,justification,operation_exactstr>

  2
  1
  2

   true
   1
   
   0
   0
   regex 
   regex
   
 


On Mon, Apr 27, 2009 at 11:30 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:



Eric,

Have you tried using MLT with parameters described on
http://wiki.apache.org/solr/HighlightingParameters ?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Eric Sabourin 
To: solr-user@lucene.apache.org
Sent: Monday, April 27, 2009 10:31:38 AM
Subject: Term highlighting with MoreLikeThisHandler?

I submit a query to the MoreLikeThisHandler to find documents  
similar to

a
specified document.  This works and I've configured my request  
handler to

also return the interesting terms.

Is it possible to have MLT return to me highlight snippets in the  
similar

documents it returns? I mean generate hl snippets of the interesting

terms?

If so how?

Thanks... Eric






--
Eric
Sent from Halifax, NS, Canada