hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
I need to know when a document is committed in SOLR - i.e. is searchable.

Does anyone have a solution for this?

I'm aware of three methods to create hooks for knowing when a doc is added or 
a commit is performed, but the doc(id) does not seem to be included for the 
commit-hooks (naturally I guess):

A. subclass DirectUpdateHandler2 and override commit and/or addDoc
B. subclass UpdateRequestProcessor (and include it in the update-chain) and 
override processAdd and/or processCommit
C. implement SolrEventListener and implement postCommit and/or postSoftCommit

The use-case is to let other parts of a system know that a document is 
searchable without having to create a poller which has to have state on 
when/how it polls.

Any ideas or tricks out there?


Fredrik


--
Fredrik Rødland   Mail:fredrik.rodl...@finn.no
FINN.no   Cell:+47 99 21 98 17
  Twitter: @fredrikr
Oslo, NORWAY  Web: http://about.me/fmr



Re: hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote:

Hi Jack,

thanks for your answer.

> A poller really is the most sensible, practical, and easiest route to go. If 
> you add the versions=true parameter to your update request and have the 
> transaction log enabled, the update response will have the version numbers 
> for each document id, so the poller can also tell whether an update has been 
> committed.

The poller will still have to retry before advertising a doc as searchable - 
won't it?
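
The retry asked about here can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Solr client code: `fetch_searchable_version` is a hypothetical callable standing in for whatever query returns the `_version_` of the document as currently visible to searchers (or None if it is not visible yet), and `expected_version` is the version number taken from the update response when versions=true is set. The sketch also assumes that a later update always carries a higher version number.

```python
import time
from typing import Callable, Optional


def wait_until_searchable(
    doc_id: str,
    expected_version: int,
    fetch_searchable_version: Callable[[str], Optional[int]],
    attempts: int = 5,
    delay_seconds: float = 0.0,
) -> bool:
    """Poll until the searchable copy of doc_id has reached expected_version.

    fetch_searchable_version is a hypothetical lookup supplied by the caller;
    it should return None while the document is not yet visible.
    """
    for attempt in range(attempts):
        visible = fetch_searchable_version(doc_id)
        # Assumption: versions only grow, so ">=" also accepts a newer update.
        if visible is not None and visible >= expected_version:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

So yes, the poller has to keep asking until the expected version shows up or it gives up, and that state (which ids are outstanding, how often to retry) is exactly what has to be tracked per index and per replica.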

> Do you have some other, unmentioned requirement that you feel is biasing you 
> against a sensible poller? Clue us in as to the nature of such a requirement.

My plan was to link Solr with our already established high-volume 
messaging system, so that each time a document becomes searchable a message 
is broadcast on a given channel.

Our system consists of approximately 10 indexes with 8 replicas of each, so 
keeping track of all of these with pollers would require a whole bunch of 
logic. A push-based system would make it much easier to know where and when 
a document is searchable.
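
One way to picture the push-based idea is to combine an add hook with a commit hook: remember the ids seen at add time and publish them when the commit hook fires. The following is a plain-Python sketch of that bookkeeping only. `CommitNotifier`, `on_add`, `on_commit`, and `publish` are hypothetical names, not Solr or messaging-system APIs; in Solr the equivalent logic would have to live in Java hooks such as processAdd and postCommit, and the sketch ignores documents that arrive while a commit is in progress.

```python
import threading
from typing import Callable, List, Set


class CommitNotifier:
    """Collects document ids on add and publishes them on commit."""

    def __init__(self, publish: Callable[[List[str]], None]) -> None:
        # publish is a hypothetical callback, e.g. a send on a message channel.
        self._publish = publish
        self._pending: Set[str] = set()
        self._lock = threading.Lock()

    def on_add(self, doc_id: str) -> None:
        """Call from the add hook: remember the id until the next commit."""
        with self._lock:
            self._pending.add(doc_id)

    def on_commit(self) -> None:
        """Call from the commit hook: publish and forget all pending ids."""
        with self._lock:
            committed = sorted(self._pending)
            self._pending.clear()
        if committed:
            self._publish(committed)
```

Each replica would run its own notifier, so the message would also need to say which index and host it came from.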



regards,


Fredrik





Re: Trouble with phrase-queries on fields with omitTermFreqAndPositions (upgrading from 3.6.X to 4.1)

2013-03-06 Thread Fredrik Rødland
Thanks again for replying and giving insight into this, Jack.  Your two links 
were exactly the answer I was hoping for going forward.

On 5 March 2013, at 14:12, Jack Krupansky j...@basetechnology.com wrote:

> See:
> https://issues.apache.org/jira/browse/LUCENE-2370
>
> Maybe Uwe could comment on his change.
>
> I suppose you could say that it was a feature that the error was previously 
> silently ignored, but that would be a matter of debate. The simple fact is 
> that you are asking a module to do something that it cannot do for the 
> supplied index data.

I agree that this is a matter of debate, and have no trouble seeing the 
arguments for not silently ignoring this.  I feel, however, that this could 
potentially break stuff and should be signaled.  Having dug around on Nabble 
and Google without finding anything left me a bit worried.

> You might want to propose that the query parsers avoid the exception, 
> possibly as an option, if a phrase is attempted against a non-position field, 
> which can happen easily when fields are listed in qf, pf, et al. Years 
> ago when I stumbled into this change I simply generated a boolean query for 
> the phrase if the field did not have position info.

I'm curious - how did you do this?  By rewriting the query on the client side, 
or is there a neat trick I've missed for doing this automatically?
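
For illustration, the client-side variant could look roughly like this. It is a minimal sketch, not what Jack describes having done: `phrases_to_boolean` is a hypothetical helper that replaces every double-quoted phrase in the raw query string with an AND-group of its terms before the query is sent to Solr. It assumes simple queries without escaped quotes, and it gives up adjacency and word order, which is the price of not having positions.

```python
import re

# Matches a double-quoted phrase; escaped quotes are not handled.
_PHRASE = re.compile(r'"([^"]*)"')


def phrases_to_boolean(query: str) -> str:
    """Replace each double-quoted phrase with an AND-group of its terms."""

    def replace(match: re.Match) -> str:
        terms = match.group(1).split()
        if not terms:
            return ""
        if len(terms) == 1:
            return terms[0]
        return "(" + " AND ".join(terms) + ")"

    return _PHRASE.sub(replace, query)
```

With this, a user query of "audi oslo" in quotes would be sent as (audi AND oslo). Phrases generated inside the analysis chain are not touched by such a rewrite.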

> There does appear to be an open Solr issue that may cover this: SOLR-2660 - 
> omitPositions improvements
> https://issues.apache.org/jira/browse/SOLR-2660

I see Jan already commented on the issue linking to this discussion, so I guess 
we can continue any debate there.

> Maybe that's one difference between Lucene and Solr - Lucene is a precision 
> library for experts, while Solr is attempting to provide a search box for 
> casual users where getting some results is better even if the price is less 
> precision - or at least provide options to choose degrees of precision.

Spot on.

Thanks again.


Fredrik


--
Fredrik Rødland   Mail:fred...@rodland.no
 Cell:+47 99 21 98 17
 Twitter: @fredrikr
Maisen Pedersens vei 1Flickr:  http://www.flickr.com/fmmr/
NO-1363 Høvik, NORWAY Web: http://about.me/fmr



Re: dropping fields from input data

2013-03-06 Thread Fredrik Rødland
On 6 March 2013, at 02:24, varun srivastava varunmail...@gmail.com wrote:

> Thanks Hoss .. Is this available in 4.0 ?
>
> On Tue, Mar 5, 2013 at 5:14 PM, Chris Hostetter 
> hossman_luc...@fucit.org wrote:
>
>> : <dynamicField name="stamp_*" type="string" indexed="false"
>> : stored="false" multiValued="true"/>
>>
>> Take a look at IgnoreFieldUpdateProcessorFactory...
>>
>> https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html
>>
>> https://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html


Yes:
https://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html
https://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html


Fredrik





Trouble with phrase-queries on fields with omitTermFreqAndPositions (upgrading from 3.6.X to 4.1)

2013-03-04 Thread Fredrik Rødland
We've been trying to get our heads around this for some days now while 
upgrading from 3.6 (where we didn't see this error) to 4.1 (where this error 
is very prominent).

We have upgraded from SOLR 3.6.1 to 4.1 and get the following error:

INFO [2013.03.04 09:22:40] http-12200-2 org.apache.solr.core.SolrCore - [finn] webapp=/solr path=/select params={q=audi+a6&wt=json} status=500 QTime=14
ERROR [2013.03.04 09:22:40] http-12200-2 org.apache.solr.servlet.SolrDispatchFilter - null:java.lang.IllegalStateException: field "name" was indexed without position data; cannot run PhraseQuery (term=a)
    at org.apache.lucene.search.PhraseQuery$PhraseWeight.scorer(PhraseQuery.java:274)
    at org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight.scorer(DisjunctionMaxQuery.java:161)


"name" is a field which has omitTermFreqAndPositions="true" for several reasons.

We ran into several issues:

1. "name" was listed in the pf parameter of our request handler (edismax)
=> resolution: remove it from pf

2. "name" has WordDelimiterFilter, and fails because WDF creates a phrase 
behind the scenes.
=> resolution: update the schema version to >= 1.4 and add 
autoGeneratePhraseQueries="true" to all other fields or types but "name" (phew)

3. explicit phrase queries fail, e.g. a search for "audi oslo" fails with the 
error above.
=> no resolution - other than to stop using omitTermFreqAndPositions

from 
http://wiki.apache.org/solr/SchemaXml#Schema_version_attribute_in_the_root_node
omitTermFreqAndPositions=true|false ! Solr1.4
...Queries that rely on position that are issued on a field with this option 
will silently fail to find documents.

This doesn't seem very silent to me.  We get a 500 error from SOLR.

Does anyone out there have any resolutions or tips for this problem?  We 
really wish to keep the field defined with omitTermFreqAndPositions, keep it 
in our qf, and still support phrases for the end user.

On a side note: Jan Høydal's excellent auto-complete solution:
http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
https://github.com/cominvent/autocomplete/commits/master
also suffers from this - phrases are no longer supported, as noted in:
http://stackoverflow.com/questions/10063318/solr-autocomplete-based-on-edismax-type-error


Regards,


Fredrik







Re: facet count distinct and sum group by field

2012-12-14 Thread Fredrik Rødland
On 14 Dec 2012, at 06:16, cmd.ares wrote:

> i want to use solr like sql: 
> select type,count(distinct product_name)s1,sum(price)s2 group by type 
> how to do it with solr? 
> thanks

I think you should be able to do this using the StatsComponent, faceting on type:

http://wiki.apache.org/solr/StatsComponent 
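
As a rough illustration, the request parameters for the sum(price) part could be assembled like this. It is a sketch under assumptions: `stats_by_facet_params` is a hypothetical helper, the /select path is only an example, and the field names price and type are taken from the SQL in the question. The stats, stats.field, and stats.facet parameters are the ones described on the StatsComponent wiki page. The sketch does not cover count(distinct product_name).

```python
from urllib.parse import urlencode


def stats_by_facet_params(stats_field: str, facet_field: str) -> str:
    """Build a query string asking for stats on one field, split by another."""
    params = [
        ("q", "*:*"),                 # match all documents
        ("rows", "0"),                # only the stats section is of interest
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        ("wt", "json"),
    ]
    return urlencode(params)


# Example path only; adjust to the actual core and handler.
print("/select?" + stats_by_facet_params("price", "type"))
```

The response should then carry a stats block for price with a sub-block per value of type, each containing a sum.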

Regards,

Fredrik

replication of files when index is stable/static (SOLR-1304?)

2012-12-04 Thread Fredrik Rødland
I have a static index with config-files changing frequently.

Until now I've distributed these files manually to all Solr hosts in my 
current setup, but I'm wondering if I can get SOLR to do this using the 
config-replication.

Searching google I've come across 
https://issues.apache.org/jira/browse/SOLR-1304.

Does anyone know whether any work has been done on this issue, or whether 
there are other workarounds?

Adding a dummy document and then deleting it seems to trigger the 
replication, but this is hardly the best solution.

Regards,

Fredrik Rodland

