Re: Disseminate results from different sources

2012-03-21 Thread Emmanuel Espina
In general the algorithm ranks by relevance, so you should probably check why one kind of result always gets higher scores than the others. Are you using norms (that is, not setting omitNorms=true)? With debugQuery=true you can get a detailed breakdown of how the score is calculated. That would be

Re: alphanumeric buckets

2012-03-01 Thread Emmanuel Espina
Only one interval? In that case you could add a filter query and facet in the regular way. That is: facet.field=person&fq=person:[A TO C] But consider that you will get only the search results that include those persons. Thanks Emmanuel 2012/3/1 AlexR alexanderroessler1...@hotmail.com: Hi i
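
A minimal sketch of the request described above, with the parameters joined explicitly. The helper name, the q=*:* default and the facet=true switch are assumptions added for illustration; only facet.field and fq come from the message.

```python
from urllib.parse import urlencode

def facet_with_interval(field, low, high, query="*:*"):
    """Build a Solr query string that facets on `field` and restricts
    the result set to one interval of that field with a filter query."""
    params = [
        ("q", query),                           # assumed main query
        ("facet", "true"),                      # enables faceting
        ("facet.field", field),                 # regular field facet
        ("fq", f"{field}:[{low} TO {high}]"),   # the single interval
    ]
    return urlencode(params)

query_string = facet_with_interval("person", "A", "C")
print(query_string)
```

Because fq narrows the whole result set, both the documents and the facet counts reflect only persons inside the interval.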

Re: handling case insensitive and regex

2012-02-29 Thread Emmanuel Espina
What query parser are you using? It looks like the Lucene query parser or edismax. The cause is that wildcard queries do not get analyzed. So even if you have lowercase filters in the analysis chain, they are not applied when you search using * Thanks Emmanuel 2012/2/29 Neil Hart
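
One possible client-side workaround, sketched under the assumption that the index-time analysis only lowercases terms: lowercase the wildcard terms before sending the query. The function name is hypothetical, and the sketch handles only whitespace-separated tokens (no phrases or ranges).

```python
def lowercase_wildcard_terms(query):
    """Lowercase the term part of every token that contains a wildcard.

    Field names (the part before the last ':') and tokens without
    wildcards, such as the AND/OR operators, are left untouched.
    """
    fixed = []
    for token in query.split():
        if "*" in token or "?" in token:
            field, sep, term = token.rpartition(":")
            token = field + sep + term.lower()
        fixed.append(token)
    return " ".join(fixed)

print(lowercase_wildcard_terms("Title:Hel* AND World"))
```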

Re: searching top matches of each facet

2012-02-29 Thread Emmanuel Espina
I think that what you want is FieldCollapsing: http://wiki.apache.org/solr/FieldCollapsing For example q=my search&group=true&group.field=subject&group.limit=5 Test it to see if that is what you want. Thanks Emmanuel 2012/2/29 Paul p...@nines.org: Let's say that I have a facet named 'subject'
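
The same grouping request as an encoded query string, as a rough sketch. The parameter names and values are the ones from the message; the variable names are illustrative only.

```python
from urllib.parse import urlencode

# Grouping (field collapsing) parameters from the example above.
grouping_params = {
    "q": "my search",
    "group": "true",           # turn result grouping on
    "group.field": "subject",  # one group per distinct subject value
    "group.limit": 5,          # top 5 documents inside each group
}

grouping_query = urlencode(grouping_params)
print(grouping_query)
```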

Re: Too many values for UnInvertedField faceting on field topic

2012-02-29 Thread Emmanuel Espina
No. But probably we can find another way to do what you want. Please describe the problem and include some numbers to give us an idea of the sizes that you are handling. Number of documents, size of the index, etc. Thanks Emmanuel 2012/2/29 Michael Jakl jakl.mich...@gmail.com: Our Solr started

Re: Search for hashtags and mentions

2012-02-15 Thread Emmanuel Espina
Do you want to index the hashtags and usernames to different fields? Probably using http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory will solve your problem. However I don't fully understand the problem when you search Thanks Emmanuel 2012/2/15 Rohit

Re: update extracted docs

2012-02-15 Thread Emmanuel Espina
Neither Solr nor Lucene updates documents in place. It deletes the old one and replaces it with a new one when it has the same id. So if you create a document with the changed fields only, and the same id, and upload that one, the old one will be erased and replaced with the new one. So THAT behaviour is
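
A toy model of that replace-by-id behaviour, using a plain dictionary instead of a real index. The field names and the `add` helper are made up for illustration; this is not the Solr API.

```python
# Toy "index": one whole document per id, like the behaviour described above.
index = {}

def add(doc):
    """Adding a document with an existing id replaces the old one entirely."""
    index[doc["id"]] = dict(doc)

add({"id": "1", "title": "old title", "body": "extracted text"})
add({"id": "1", "title": "new title"})  # only the changed field is sent

# The body field is gone: the second add replaced the whole document.
print(index["1"])

# To keep the other fields, the client has to resend the full document.
full_doc = {"id": "1", "title": "new title", "body": "extracted text"}
add(full_doc)
print(index["1"])
```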

Re: solr to index php files

2012-02-02 Thread Emmanuel Espina
What do you mean by static php files? As far as I know PHP is used to make pages dynamic. If you want to index dynamic pages as if they were just HTML you will have to download them, and add them to Solr. Writing a small program in SolrJ and using some HTTP library

Re: SolrReplication configuration with frequent deletes and updates

2012-02-01 Thread Emmanuel Espina
2012/2/1 prasenjit mukherjee prasen@gmail.com: I have the following requirements : 1. Adds : 20 docs/sec 2. Searches : 100 searches/sec 3. Deletes : (20*3600*24*7 ~ 12 mill ) docs/week ( basically a cron job which deletes all documents more than 7 days old ) I am thinking of having 6

Re: Match raw query string

2012-01-09 Thread Emmanuel Espina
How are you building your query? For your case it appears that the edismax query parser should solve it. A good solution to this kind of problem involves: storing norms (omitNorms=false) in the fields to search; storing the position of the terms (omitTermFreqAndPositions=false) in the fields to

Re: Match raw query string

2012-01-09 Thread Emmanuel Espina
^5000.0 id^2000.0 content/str ? Robert McCarroll Systems Administration NYS Department of Civil Service -Original Message- From: Emmanuel Espina [mailto:espinaemman...@gmail.com] Sent: Monday, January 09, 2012 1:42 PM To: solr-user@lucene.apache.org Subject: Re: Match raw query

Re: querying all data

2012-01-09 Thread Emmanuel Espina
*:* is parsed as a MatchAllDocsQuery and * is a wildcard query on the default search field. The MatchAllDocsQuery does just that, and the * has to resolve the wildcard (that is, build an automaton query in newer versions of Lucene). Also if a document has the default field empty that document will

Lot of ORs in a query and De Morgan Law

2011-09-22 Thread Emmanuel Espina
I have queries with a very large number of OR terms. The AND terms are much more convenient to handle because they can be turned into several filter queries and cached. Thinking about innovative solutions I recalled the De Morgan Laws http://en.wikipedia.org/wiki/De_Morgan's_laws of Boolean Algebras,
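
A small sanity check of the identity being recalled, A OR B OR C = NOT (NOT A AND NOT B AND NOT C), using Python sets as stand-ins for posting lists. The document ids and term names are invented, and this only illustrates the Boolean identity, not how Solr would execute or cache such a query.

```python
# Hypothetical universe of document ids and per-term posting sets.
all_docs = set(range(10))
postings = {
    "a": {1, 2},
    "b": {2, 5},
    "c": {7},
}

# Direct evaluation of a OR b OR c.
union = set().union(*postings.values())

# De Morgan form: complement of the intersection of the complements.
complements = [all_docs - docs for docs in postings.values()]
via_de_morgan = all_docs - set.intersection(*complements)

print(sorted(union), sorted(via_de_morgan))
```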

Re: Using multivalued field in map function

2011-09-08 Thread Emmanuel Espina
Function queries don't work with multivalued fields. http://wiki.apache.org/solr/FunctionQuery#Vector_Functions You'll have to think of another way of doing that. What do you want to achieve with that map? Regards Emmanuel 2011/9/8 tkamphuis tom_m...@hotmail.com Hi, I'm working on

Rollback to old index stored with SolrDeletionPolicy

2011-09-06 Thread Emmanuel Espina
With SolrDeletionPolicy you can choose the number of versions of the index to store (maxCommitsToKeep, which defaults to 1). Well, how can you revert to an arbitrary version that you have stored? Is there anything in Solr or in Lucene to pick the version of the index to load? Thank you Emmanuel

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
If you can express what you want with a regular expression then the pattern filter should work! I'm thinking that maybe you tokenized the field and that invalidated the structure of the HTML. I would use a contents field analyzed with a

Re: Filter content upon indexing

2011-07-27 Thread Emmanuel Espina
was focused only on extraction of text for searching purposes. Thanks Emmanuel 2011/7/27 Emmanuel Espina espinaemman...@gmail.com If you can express what you want with a regular expression then the pattern Filter should work! I'm thinking that maybe you tokenized the field and that invalidated

Re: Cant get Synonym working

2011-07-26 Thread Emmanuel Espina
Well, it appears to be some issue with the analysis. You can check http://localhost:8983/solr/admin/analysis.jsp (the admin page of your instance, the analysis section) to see how the analysis is applied and see the end result for aaa. You should work with the index and the query analysis

Re: Multiple Solr servers and a shared index (again)

2011-07-26 Thread Emmanuel Espina
Regarding point 4, you will have to reload the indexes to keep them consistent with each other. When you perform a commit in Solr you have (for an instant) two versions of the index. The commit produces new segments (with new documents, new deletions, etc). After creating these new segments a new

Re: Exact match not the first result returned

2011-07-26 Thread Emmanuel Espina
That is caused by the size of the documents. The principle is pretty intuitive: if one of your documents is the entire three volumes of The Lord of the Rings and you search for tree, I know that The Lord of the Rings will be in the results, and I haven't memorized the entire text of that book :p It
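
A rough illustration of the document-size effect, assuming the classic TF-IDF similarity used by Lucene at the time, where the length norm of a field is 1/sqrt(number of terms). The term counts are invented, and the sketch ignores boosts and the precision lost when norms are encoded in the index.

```python
import math

def length_norm(num_terms):
    """Length normalization factor of the classic Lucene similarity."""
    return 1.0 / math.sqrt(num_terms)

short_title = 4         # hypothetical field with four terms
long_novel = 480_000    # hypothetical field holding a whole book

# A match in the short field is weighted far more heavily than the
# same match in the very long one.
print(length_norm(short_title), length_norm(long_novel))
```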