Re: Solr hangs on distributed updates

2014-12-16 Thread Peter Keegan
On Mon, Dec 15, 2014 at 8:11 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Mon, Dec 15, 2014 at 8:41 PM, Peter Keegan peterlkee...@gmail.com wrote: If a timeout occurs, does the distributed update then go to the next replica? A distributed update

Re: Solr hangs on distributed updates

2014-12-16 Thread Peter Keegan
As of 4.10, commits/optimize etc are executed in parallel. Excellent - thanks. On Tue, Dec 16, 2014 at 6:51 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Tue, Dec 16, 2014 at 11:34 AM, Peter Keegan peterlkee...@gmail.com wrote: A distributed update is streamed to all

Re: Solr hangs on distributed updates

2014-12-15 Thread Peter Keegan
I added distribUpdateConnTimeout and distribUpdateSoTimeout to solr.xml and the commit did time out. (Btw, is there any way to view solr.xml in the admin console?) Also, although we do have an init.d start/stop script for Solr, the 'stop' command was not executed during shutdown because there was

Re: Solr hangs on distributed updates

2014-12-15 Thread Peter Keegan
. The socket and connection timeout inside the shardHandlerFactory section apply for inter-shard search requests. On Fri, Dec 12, 2014 at 8:38 PM, Peter Keegan peterlkee...@gmail.com wrote: Btw, are the following timeouts still supported in solr.xml, and do they only apply to distributed search

Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
We are running SolrCloud in AWS and using their auto scaling groups to spin up new Solr replicas when CPU utilization exceeds a threshold for a period of time. All is well until the replicas are terminated when CPU utilization falls below another threshold. What happens is that index updates sent

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
for inter-shard update requests. On Fri, Dec 12, 2014 at 2:20 PM, Peter Keegan peterlkee...@gmail.com wrote: We are running SolrCloud in AWS and using their auto scaling groups to spin up new Solr replicas when CPU utilization exceeds a threshold for a period of time. All is well until

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
/shardHandlerFactory Thanks, Peter On Fri, Dec 12, 2014 at 3:14 PM, Peter Keegan peterlkee...@gmail.com wrote: No, I wasn't aware of these. I will give that a try. If I stop the Solr jetty service manually, things recover fine, but the hang occurs when I 'stop' or 'terminate' the EC2 instance

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
of these issues are because of the lack of timeouts. Just add them and if there are more problems, we can discuss more. On Fri, Dec 12, 2014 at 8:14 PM, Peter Keegan peterlkee...@gmail.com wrote: No, I wasn't aware of these. I will give that a try. If I stop the Solr jetty service manually, things

Re: Solr hangs on distributed updates

2014-12-12 Thread Peter Keegan
The AMIs are Red Hat (not Amazon's) and the instances are properly sized for the environment (t1.micro for ZK, m3.xlarge for Solr). I do plan to add hooks for a clean shutdown of Solr when the VM is shut down, but if Solr takes too long, AWS may clobber it anyway. One frustrating part of auto

Solr exceptions during batch indexing

2014-11-07 Thread Peter Keegan
How are folks handling Solr exceptions that occur during batch indexing? Solr stops parsing the docs stream when an error occurs (e.g. a doc with a missing mandatory field), and stops indexing the batch. The bad document is not identified, so it would be hard for the client to recover by skipping

Re: Solr exceptions during batch indexing

2014-11-07 Thread Peter Keegan
gotten there yet. Best, Erick On Fri, Nov 7, 2014 at 8:25 AM, Peter Keegan peterlkee...@gmail.com wrote: How are folks handling Solr exceptions that occur during batch indexing? Solr stops parsing the docs stream when an error occurs (e.g. a doc with a missing mandatory field

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Regarding batch indexing: When I send batches of 1000 docs to a standalone Solr server, the log file reports (1000 adds) in LogUpdateProcessor. But when I send them to the leader of a replicated index, the leader log file reports much smaller numbers, usually (12 adds). Why do the batches appear

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Erickson erickerick...@gmail.com wrote: Internally, the docs are batched up into smaller buckets (10 as I remember) and forwarded to the correct shard leader. I suspect that's what you're seeing. Erick On Fri, Oct 31, 2014 at 12:20 PM, Peter Keegan peterlkee...@gmail.com wrote: Regarding batch

Re: QParserPlugin question

2014-10-24 Thread Peter Keegan
Thanks for the advice. I've moved this query rewriting logic (not really business logic) to a SearchComponent and will leave the custom query parser to deal with the keyword (q=) related aspects of the query. In my case, the latter is mostly dealing with the presence of wildcard characters. Peter

QParserPlugin question

2014-10-22 Thread Peter Keegan
I have a custom query parser that modifies the filter query list based on the keyword query. This works, but the 'fq' list in the responseHeader contains the original filter list. The debugQuery output does display the modified filter list. Is there a way to change the responseHeader? I could

Re: QParserPlugin question

2014-10-22 Thread Peter Keegan
It's for an optimization. If the keyword is 'match all docs', I want to remove a custom PostFilter from the query and change the sort parameters (so the app doesn't have to do it). It looks like the responseHeader is displaying the 'originalParams', which are immutable. On Wed, Oct 22, 2014 at

Re: QParserPlugin question

2014-10-22 Thread Peter Keegan
I meant to say: If the keyword is *:* (MatchAllDocsQuery)... On Wed, Oct 22, 2014 at 2:17 PM, Peter Keegan peterlkee...@gmail.com wrote: It's for an optimization. If the keyword is 'match all docs', I want to remove a custom PostFilter from the query and change the sort parameters (so the app

Re: Does Solr support this?

2014-10-16 Thread Peter Keegan
I'm doing something similar with a custom search component. See SOLR-6502 https://issues.apache.org/jira/browse/SOLR-6502 On Thu, Oct 16, 2014 at 8:14 AM, Upayavira u...@odoko.co.uk wrote: Nope, not yet. Someone did propose a JavascriptRequestHandler or such, which would allow you to code

Question about filter cache size

2014-10-03 Thread Peter Keegan
Say I have a boolean field named 'hidden', and less than 1% of the documents in the index have hidden=true. Do both these filter queries use the same docset cache size? : fq=hidden:false fq=!hidden:true Peter

Re: Question about filter cache size

2014-10-03 Thread Peter Keegan
it will be cached as hidden:true and then inverted Inverted at query time, so for best query performance use fq=hidden:false, right? On Fri, Oct 3, 2014 at 3:57 PM, Yonik Seeley yo...@heliosearch.com wrote: On Fri, Oct 3, 2014 at 3:42 PM, Peter Keegan peterlkee...@gmail.com wrote: Say I

Re: MaxScore

2014-09-17 Thread Peter Keegan
See if SOLR-5831 https://issues.apache.org/jira/browse/SOLR-5831 helps. Peter On Tue, Sep 16, 2014 at 11:32 PM, William Bell billnb...@gmail.com wrote: What we need is a function like scale(field,min,max) but only operates on the results that come back from the search results. scale() takes

Re: Edismax mm and efficiency

2014-09-10 Thread Peter Keegan
I implemented a custom QueryComponent that issues the edismax query with mm=100%, and if no results are found, it reissues the query with mm=1. This doubled our query throughput (compared to mm=1 always), as we do some expensive RankQuery processing. For your very long student queries, mm=100%
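A minimal client-side sketch of the strict-then-relaxed fallback described above. It is not the custom QueryComponent itself, which runs inside Solr. The `run_query` callable is hypothetical and stands in for whatever sends the parameters to Solr and returns the matching documents; `defType`, `q`, and `mm` are the usual edismax request parameters.

```python
from typing import Callable, Dict, List


def search_with_mm_fallback(
    run_query: Callable[[Dict[str, str]], List[dict]],
    keywords: str,
) -> List[dict]:
    """Run an edismax query with mm=100%, retrying with mm=1 if nothing matches."""
    # First pass: every keyword must match, which keeps the candidate set small.
    params = {"defType": "edismax", "q": keywords, "mm": "100%"}
    docs = run_query(params)
    if docs:
        return docs
    # Second pass: only reached when the strict query found nothing.
    relaxed = dict(params, mm="1")
    return run_query(relaxed)
```

The second request is only paid for when the strict query comes back empty, so any expensive ranking step sees the large mm=1 result set only in that case.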

Re: Edismax mm and efficiency

2014-09-10 Thread Peter Keegan
, but on the client side with two requests. Would you consider contributing the QueryComponent? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Sep 10, 2014, at 3:47 AM, Peter Keegan peterlkee...@gmail.com wrote: I implemented a custom QueryComponent that issues

Re: ExternalFileFieldReloader and commit

2014-08-06 Thread Peter Keegan
://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html (2014/08/05 22:34), Peter Keegan wrote: When there are multiple 'external file field' files available, Solr will reload the last one (lexicographically) with a commit, but only if changes were made to the index

Re: ExternalFileFieldReloader and commit

2014-08-06 Thread Peter Keegan
org.apache.solr.search.function.FileFloatSource.ReloadCacheRequestHandler ? Let me know if you need help with it. As a workaround you can reload the core via REST or click a button at SolrAdmin, your questions are welcome. On Wed, Aug 6, 2014 at 4:02 PM, Peter Keegan peterlkee...@gmail.com wrote

ExternalFileFieldReloader and commit

2014-08-05 Thread Peter Keegan
When there are multiple 'external file field' files available, Solr will reload the last one (lexicographically) with a commit, but only if changes were made to the index. Otherwise, it skips the reload and logs: No uncommitted changes. Skipping IW.commit. Has anyone else noticed this? It seems

Question about ReRankQuery

2014-07-23 Thread Peter Keegan
I'm looking at how 'ReRankQuery' works. If the main query has a Sort criteria, it is only used to sort the first pass results. The QueryScorer used in the second pass only reorders the ScoreDocs based on score and docid, but doesn't use the original Sort fields. If the Sort criteria is 'score

Re: Question about ReRankQuery

2014-07-23 Thread Peter Keegan
with an AND and using the original sort? At the end, you have your original list in its original order, with (potentially) some documents removed that don't satisfy the secondary query. Or I'm missing the boat entirely. Best, Erick On Wed, Jul 23, 2014 at 6:31 AM, Peter Keegan peterlkee

Re: Question about ReRankQuery

2014-07-23 Thread Peter Keegan
at Heliosearch On Wed, Jul 23, 2014 at 11:37 AM, Peter Keegan peterlkee...@gmail.com wrote: See http://heliosearch.org/solrs-new-re-ranking-feature/ On Wed, Jul 23, 2014 at 11:27 AM, Erick Erickson erickerick...@gmail.com wrote: I'm having a little trouble

Question about solrcloud recovery process

2014-07-03 Thread Peter Keegan
I bring up a new Solr node with no index and watch the index being replicated from the leader. The index size is 12G and the replication takes about 6 minutes, according to the replica log (from 'Starting recovery process' to 'Finished recovery process'). However, shortly after the replication

Re: Question about solrcloud recovery process

2014-07-03 Thread Peter Keegan
think it is, it should only be an issue when you directly query recovery node. The CloudSolrServer client works around this issue as well. -- Mark Miller about.me/markrmiller On July 3, 2014 at 8:42:48 AM, Peter Keegan (peterlkee...@gmail.com) wrote: I bring up a new Solr node

Re: Question about solrcloud recovery process

2014-07-03 Thread Peter Keegan
Aha, you are right wrdrvf! The query is forwarded to any of the active shards (I saw the query alternate between both of mine). Nice feature. Also, looking at 'ClusterStateAwarePingRequestHandler' (which I downloaded from www.manning.com/SolrinAction), it is checking zookeeper to see if the

Custom QueryComponent to rewrite dismax query

2014-06-10 Thread Peter Keegan
We are using the 'edismax' query parser for its many benefits over the standard Lucene parser. For queries with more than 5 or 6 keywords (which is a lot for our typical user), the recall can be very high (sometimes matching 75% or more of the documents). This high recall, when coupled with some

Autoscaling Solr instances in AWS

2014-05-20 Thread Peter Keegan
We are running Solr 4.6.1 in AWS: - 2 Solr instances (1 shard, 1 leader, 1 replica) - 1 CloudSolrServer SolrJ client updating the index. - 3 Zookeepers The Solr instances are behind a load balancer and also in an auto scaling group. The ScaleUpPolicy will add up to 9 additional instances

Re: Distributed commits in CloudSolrServer

2014-04-16 Thread Peter Keegan
Are distributed commits also done in parallel across shards? Peter On Tue, Apr 15, 2014 at 3:50 PM, Mark Miller markrmil...@gmail.com wrote: Inline responses below. -- Mark Miller about.me/markrmiller On April 15, 2014 at 2:12:31 PM, Peter Keegan (peterlkee...@gmail.com) wrote: I have

Re: Distributed commits in CloudSolrServer

2014-04-16 Thread Peter Keegan
Are distributed commits also done in parallel across shards? I meant 'sequentially' across shards. On Wed, Apr 16, 2014 at 9:08 AM, Peter Keegan peterlkee...@gmail.com wrote: Are distributed commits also done in parallel across shards? Peter On Tue, Apr 15, 2014 at 3:50 PM, Mark Miller

Distributed commits in CloudSolrServer

2014-04-15 Thread Peter Keegan
I have a SolrCloud index, 1 shard, with a leader and one replica, and 3 ZKs. The Solr indexes are behind a load balancer. There is one CloudSolrServer client updating the indexes. The index schema includes 3 ExternalFileFields. When the CloudSolrServer client issues a hard commit, I observe that

Re: Configurable collectors for custom ranking

2014-03-07 Thread Peter Keegan
to the CollapsingQParserPlugin this week and it will have a similar interaction between a PostFilter and value source. So you may want to watch SOLR-5536 to see an example of this. Joel Joel Bernstein Search Engineer at Heliosearch On Mon, Dec 23, 2013 at 4:03 PM, Peter Keegan peterlkee

Getting index schema in SolrCloud mode

2014-02-03 Thread Peter Keegan
I'm indexing data with a SolrJ client via SolrServer. Currently, I parse the schema returned from a HttpGet on: localhost:8983/solr/collection/schema/fields What is the recommended way to read the schema with CloudSolrServer? Can it be done with a single HttpGet to a ZK server? Thanks, Peter

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
it a bit harder to detect the failure via the admin interface. Thanks, Peter On Tue, Jan 14, 2014 at 11:12 AM, Peter Keegan peterlkee...@gmail.com wrote: I have a custom data import handler that creates an ExternalFileField from a source that is different from the main index. If the import fails

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
source. Worth a try I guess. On Fri, Jan 17, 2014 at 7:20 PM, Peter Keegan peterlkee...@gmail.com wrote: Following up on this a bit - my main index is updated by a SolrJ client in another process. If the DIH fails, the SolrJ client is never informed of the index rollback, and any pending

Re: How to override rollback behavior in DIH

2014-01-17 Thread Peter Keegan
...@gmail.com [mailto:pkeegan01...@gmail.com] On Behalf Of Peter Keegan Sent: Friday, January 17, 2014 7:51 AM To: solr-user@lucene.apache.org Subject: Re: How to override rollback behavior in DIH Following up on this a bit - my main index is updated by a SolrJ client in another process

How to override rollback behavior in DIH

2014-01-14 Thread Peter Keegan
I have a custom data import handler that creates an ExternalFileField from a source that is different from the main index. If the import fails (in my case, a connection refused in URLDataSource), I don't want to roll back any uncommitted changes to the main index. However, this seems to be the

Re: leading wildcard characters

2014-01-14 Thread Peter Keegan
you want to open a new one? Ahmet On Friday, January 10, 2014 6:12 PM, Peter Keegan peterlkee...@gmail.com wrote: Removing ReversedWildcardFilterFactory had no effect. On Fri, Jan 10, 2014 at 10:48 AM, Ahmet Arslan iori...@yahoo.com wrote: Hi Peter, Can you remove any occurrence

leading wildcard characters

2014-01-10 Thread Peter Keegan
How do you disable leading wildcards in 4.X? The setAllowLeadingWildcard method is there in the parser, but nothing references the getter. Also, the Edismax parser always enables it and provides no way to override. Thanks, Peter

Re: leading wildcard characters

2014-01-10 Thread Peter Keegan
, Peter Keegan peterlkee...@gmail.com wrote: How do you disable leading wildcards in 4.X? The setAllowLeadingWildcard method is there in the parser, but nothing references the getter. Also, the Edismax parser always enables it and provides no way to override. Thanks, Peter

Re: Zookeeper as Service

2014-01-09 Thread Peter Keegan
There's also: http://www.tanukisoftware.com/ On Thu, Jan 9, 2014 at 11:18 AM, Nazik Huq nazik...@yahoo.com wrote: From your email I gather your main concern is starting zookeeper on server startups. You may want to look at these non-native service oriented options too: Create a script(

Re: Function query matching

2014-01-06 Thread Peter Keegan
: The bottom line for Peter is still the same: using scale() wrapped around : a function/query does involve computing the results for every document, : and that is going to scale linearly as the size of the index grows -- but : it is *only* because of the scale function. Another problem

Re: how to include result ordinal in response

2014-01-04 Thread Peter Keegan
, 2014, at 10:00 PM, Peter Keegan wrote: Is there a simple way to output the result number (ordinal) with each returned document using the 'fl' parameter? This would be useful when visually comparing the results from 2 queries. I'm not aware of a simple way. If you're competent in Java

how to include result ordinal in response

2014-01-03 Thread Peter Keegan
Is there a simple way to output the result number (ordinal) with each returned document using the 'fl' parameter? This would be useful when visually comparing the results from 2 queries. Thanks, Peter
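Until something server-side exists, one workaround is to number the documents on the client after the response arrives. The sketch below assumes the response has already been parsed into a list of dicts and that `start` is the same offset that was sent with the request; the `ordinal` key name is made up for illustration and is not a Solr field.

```python
from typing import Dict, List


def add_ordinals(docs: List[Dict], start: int = 0, key: str = "ordinal") -> List[Dict]:
    """Return copies of docs carrying a 1-based result number offset by 'start'."""
    # The input dicts are copied, not mutated, so the raw response stays intact.
    return [dict(doc, **{key: start + i + 1}) for i, doc in enumerate(docs)]
```

Applying this to the result pages of two queries gives each document a position that can be compared side by side.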

Re: Configurable collectors for custom ranking

2013-12-26 Thread Peter Keegan
On Mon, Dec 23, 2013 at 4:03 PM, Peter Keegan peterlkee...@gmail.com wrote: Hi Joel, Could you clarify what would be in the key,value Map added to the SearchRequest context? It seems that all the docId/score tuples need to be there, including the ones not in the 'top N ScoreDocs

Re: Configurable collectors for custom ranking

2013-12-23 Thread Peter Keegan
function call will look like this: sum(score(), field(x)) Joel On Thu, Dec 12, 2013 at 9:58 AM, Peter Keegan peterlkee...@gmail.com wrote: Regarding my original goal, which is to perform a math function using the scaled score and a field value, and sort on the result, how does

Re: Configurable collectors for custom ranking

2013-12-19 Thread Peter Keegan
, Peter Keegan peterlkee...@gmail.com wrote: This is pretty cool, and worthy of adding to Solr in Action (v2) and the other books. With function queries, flexible filter processing and caching, custom collectors, and post filters, there's a lot of flexibility here. Btw, the query times

Re: Configurable collectors for custom ranking

2013-12-19 Thread Peter Keegan
, Peter On Thu, Dec 19, 2013 at 9:51 AM, Peter Keegan peterlkee...@gmail.com wrote: In order to size the PriorityQueue, the result window size for the query is needed. This has been computed in the SolrIndexSearcher and available in: QueryCommand.getSupersetMaxDoc(), but doesn't seem to be available

Re: Configurable collectors for custom ranking

2013-12-12 Thread Peter Keegan
apply to 4.3. I think as long you have the finish method that's all you'll need. If you can get this working it would be excellent if you could donate back the Scale PostFilter. On Wed, Dec 11, 2013 at 3:36 PM, Peter Keegan peterlkee...@gmail.com wrote: This is what I was looking

Re: Configurable collectors for custom ranking

2013-12-12 Thread Peter Keegan
, 2013 at 9:58 AM, Peter Keegan peterlkee...@gmail.com wrote: Regarding my original goal, which is to perform a math function using the scaled score and a field value, and sort on the result, how does this fit in? Must I implement another custom PostFilter with a higher cost than the scale

Re: Configurable collectors for custom ranking

2013-12-11 Thread Peter Keegan
. Is there a reason why a PostFilter won't work for you? Joel On Tue, Dec 10, 2013 at 3:24 PM, Peter Keegan peterlkee...@gmail.com wrote: Quick question: In the context of a custom collector, how does one get the values of a field of type 'ExternalFileField'? Thanks, Peter On Tue, Dec 10

Re: Configurable collectors for custom ranking

2013-12-11 Thread Peter Keegan
)field.getType(); fieldValues = eff.getFileFloatSource(field, dataDir); And then read the values in 'setNextReader' Peter On Wed, Dec 11, 2013 at 2:05 PM, Peter Keegan peterlkee...@gmail.com wrote: Hi Joel, I thought about using a PostFilter, but the problem is that the 'scale' function must

Re: Configurable collectors for custom ranking

2013-12-11 Thread Peter Keegan
. If the document is not in the score map then send down 0. You'll have setup a dummy scorer to feed to lower collectors. The CollapsingQParserPlugin has an example of how to do this. On Wed, Dec 11, 2013 at 2:05 PM, Peter Keegan peterlkee...@gmail.com wrote: Hi Joel, I thought about

Re: Configurable collectors for custom ranking

2013-12-11 Thread Peter Keegan
the finish method that's all you'll need. If you can get this working it would be excellent if you could donate back the Scale PostFilter. On Wed, Dec 11, 2013 at 3:36 PM, Peter Keegan peterlkee...@gmail.com wrote: This is what I was looking for, but the DelegatingCollector 'finish' method

Re: Configurable collectors for custom ranking

2013-12-10 Thread Peter Keegan
...@gmail.com wrote: Hi Peter, I've been meaning to revisit configurable ranking collectors, but I haven't yet had a chance. It's on the shortlist of things I'd like to tackle though. On Fri, Dec 6, 2013 at 4:17 PM, Peter Keegan peterlkee...@gmail.com wrote: I looked at SOLR-4465 and SOLR-5045

Re: Configurable collectors for custom ranking

2013-12-10 Thread Peter Keegan
Quick question: In the context of a custom collector, how does one get the values of a field of type 'ExternalFileField'? Thanks, Peter On Tue, Dec 10, 2013 at 1:18 PM, Peter Keegan peterlkee...@gmail.com wrote: Hi Joel, This is related to another thread on function query matching ( http

Re: Function query matching

2013-12-07 Thread Peter Keegan
But for your specific goal Peter: Yes, if the whole point of a function you have is to wrap generated a scaled score of your base $qq, ... Thanks for the confirmation, Chris. So, to do this efficiently, I think I need to implement a custom Collector that performs the scaling (and other math)

Re: Function query matching

2013-12-06 Thread Peter Keegan
) != this (in QueryValueSource). This should be an easy fix. I'll create a JIRA ticket to use better key names in these functions and push up a patch. This will eliminate the need for the extra NoOp function. -Trey On Mon, Dec 2, 2013 at 12:41 PM, Peter Keegan peterlkee...@gmail.com wrote

Re: Function query matching

2013-12-06 Thread Peter Keegan
In my previous posting, I said: Subsequent calls to ScaleFloatFunction.getValues bypassed 'createScaleInfo' and added ~0 time. These subsequent calls are for the remaining segments in the index reader (21 segments). Peter On Fri, Dec 6, 2013 at 2:10 PM, Peter Keegan peterlkee...@gmail.com

Configurable collectors for custom ranking

2013-12-06 Thread Peter Keegan
I looked at SOLR-4465 and SOLR-5045, where it appears that there is a goal to be able to do custom sorting and ranking in a PostFilter. So far, it looks like only custom aggregation can be implemented in PostFilter (5045). Custom sorting/ranking can be done in a pluggable collector (4465), but

Re: Function query matching

2013-12-02 Thread Peter Keegan
, Peter On Fri, Nov 29, 2013 at 9:18 AM, Peter Keegan peterlkee...@gmail.com wrote: Instead of using a function query, could I use the edismax query (plus some low cost filters not shown in the example) and implement the scale/sum/product computation in a PostFilter? Is the query's maxScore

Re: Function query matching

2013-11-29 Thread Peter Keegan
Instead of using a function query, could I use the edismax query (plus some low cost filters not shown in the example) and implement the scale/sum/product computation in a PostFilter? Is the query's maxScore available there? Thanks, Peter On Wed, Nov 27, 2013 at 1:58 PM, Peter Keegan peterlkee

Re: Function query matching

2013-11-27 Thread Peter Keegan
} Is there any way to speed this up? Would writing a custom function query that compiled all the function queries together be any faster? Thanks, Peter On Mon, Nov 11, 2013 at 1:31 PM, Peter Keegan peterlkee...@gmail.com wrote: Thanks On Mon, Nov 11, 2013 at 11:46 AM, Yonik Seeley yo

Re: Function query matching

2013-11-27 Thread Peter Keegan
Although the 'scale' is a big part of it, here's a closer breakdown. Here are 4 queries with increasing functions, and their response times (caching turned off in solrconfig): 100 msec: select?q={!edismax v='news' qf='title^2 body'} 135 msec: select?qq={!edismax v='news' qf='title^2

Re: Function query matching

2013-11-11 Thread Peter Keegan
? Thanks, Peter On Thu, Nov 7, 2013 at 2:16 PM, Peter Keegan peterlkee...@gmail.com wrote: I'm trying to use a normalized score in a query as I described in a recent thread titled Re: How to get similarity score between 0 and 1 not relative score I'm using this query: select?qq={!edismax v='news

Re: Function query matching

2013-11-11 Thread Peter Keegan
Thanks On Mon, Nov 11, 2013 at 11:46 AM, Yonik Seeley yo...@heliosearch.com wrote: On Mon, Nov 11, 2013 at 11:39 AM, Peter Keegan peterlkee...@gmail.com wrote: fq=$qq What is the proper syntax? fq={!query v=$qq} -Yonik http://heliosearch.com -- making solr shine
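To show where the `fq={!query v=$qq}` syntax fits, here is a small sketch that assembles the request parameters from this thread into a URL-encoded query string. The `qq`, `q`, and `fq` values come from the messages; the helper name and the bare `select?` prefix are only illustrative, and host and collection are left out on purpose.

```python
from urllib.parse import urlencode


def build_filtered_func_query(qq: str) -> str:
    """Build a select query string that scores with a function query but filters on $qq."""
    params = [
        ("qq", qq),                         # the embedded query, referenced as $qq
        ("q", "{!func}sum(query($qq),0)"),  # function query: matches every document
        ("fq", "{!query v=$qq}"),           # filter: keep only docs that match $qq
    ]
    # urlencode escapes the braces, dollar signs and spaces so the URL is safe to send.
    return "select?" + urlencode(params)
```

Without the `fq`, the function query alone returns every document, which is the behaviour the original question ran into.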

Function query matching

2013-11-07 Thread Peter Keegan
Why does this function query return docs that don't match the embedded query? select?qq=text:news&q={!func}sum(query($qq),0)

Re: Function query matching

2013-11-07 Thread Peter Keegan
, but don't filter them. All documents effectively match a function query. Erik On Nov 7, 2013, at 1:48 PM, Peter Keegan peterlkee...@gmail.com wrote: Why does this function query return docs that don't match the embedded query? select?qq=text:news&q={!func}sum(query($qq),0)

Re: Data Import Handler

2013-11-06 Thread Peter Keegan
I've done this by adding an attribute to the entity element (e.g. myconfig=myconfig.xml), and reading it in the 'init' method with context.getResolvedEntityAttribute(myconfig). Peter On Wed, Nov 6, 2013 at 8:25 AM, Ramesh ramesh.po...@vensaiinc.com wrote: Hi Folks, Can anyone suggest me

Re: How to get similarity score between 0 and 1 not relative score

2013-11-01 Thread Peter Keegan
There's another use case for scaling the score. Suppose I want to compute a custom score based on the weighted sum of: - product(0.75, relevance score) - product(0.25, value from another field) For this to work, both fields must have values between 0-1, for example. Toby's example using the
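The arithmetic behind that weighted sum can be sketched outside Solr. The code below is an illustration only: it min-max scales the raw relevance scores to the 0-1 range (an assumption about how the scaling is done, and it returns 0.0 for every document when all scores are equal), assumes the other field is already in 0-1, and then applies the 0.75 / 0.25 weights from the message.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Scale values linearly into [0, 1]; all-equal input maps to 0.0 (a choice made here)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(
    relevance: Sequence[float],
    field_values: Sequence[float],
    w_rel: float = 0.75,
    w_field: float = 0.25,
) -> List[float]:
    """Combine scaled relevance with a 0-1 field value using fixed weights."""
    scaled = min_max_scale(relevance)
    return [w_rel * r + w_field * f for r, f in zip(scaled, field_values)]
```

Note that the scaling step needs the minimum and maximum over the whole result set before any single document can be scored, which is why scale() touches every matching document.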

How to reinitialize a solrcloud replica

2013-10-25 Thread Peter Keegan
I'm running 4.3 in solrcloud mode and trying to test index recovery, but it's failing. I have one shard, 2 replicas: Leader: 10.159.8.105 Replica: 10.159.6.73 To test, I stopped the replica, deleted the 'data' directory and restarted solr. Here is the replica's logging: INFO - 2013-10-25

Re: Solr timeout after reboot

2013-10-21 Thread Peter Keegan
Have you tried this old trick to warm the FS cache? cat .../core/data/index/* > /dev/null Peter On Mon, Oct 21, 2013 at 5:31 AM, michael.boom my_sky...@yahoo.com wrote: Thank you, Otis! I've integrated the SPM on my Solr instances and now I have access to monitoring data. Could you give me
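The same warming idea, reading every index file once and discarding the bytes so the OS page cache holds them, can be sketched in a few lines for cases where it needs to run from a startup script. The function name and the 1 MiB chunk size are arbitrary choices, and the index directory must be supplied by the caller.

```python
from pathlib import Path


def warm_fs_cache(index_dir: str, chunk_size: int = 1 << 20) -> int:
    """Read every regular file in index_dir, discard the data, return total bytes read."""
    total = 0
    for path in sorted(Path(index_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories and anything that is not a plain file
        with path.open("rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)  # the bytes are counted, never kept
    return total
```

Whether the files stay cached afterwards depends on how much free memory the machine has relative to the index size.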

Re: Solr timeout after reboot

2013-10-21 Thread Peter Keegan
I found this warming to be especially necessary after starting an instance of those m3.xlarge servers, else the response times for the first minutes were terrible. Peter On Mon, Oct 21, 2013 at 8:39 AM, François Schiettecatte fschietteca...@gmail.com wrote: To put the file data into file

Re: limiting deep pagination

2013-10-17 Thread Peter Keegan
msoko...@safaribooksonline.com wrote: On 10/8/13 6:51 PM, Peter Keegan wrote: Is there a way to configure Solr 'defaults/appends/invariants' such that the product of the 'start' and 'rows' parameters doesn't exceed a given value? This would be to prevent deep pagination

limiting deep pagination

2013-10-08 Thread Peter Keegan
Is there a way to configure Solr 'defaults/appends/invariants' such that the product of the 'start' and 'rows' parameters doesn't exceed a given value? This would be to prevent deep pagination. Or would this require a custom requestHandler? Peter

Re: How to get values of external file field(s) in Solr query?

2013-10-03 Thread Peter Keegan
In 4.3, frange query using an external file works for both q and fq. The Solr wiki and SIA both state that ExternalFileField does not support searching. Was the search/filter capability added recently, or is it not supported? Thanks, Peter On Wed, Jun 26, 2013 at 4:59 PM, Upayavira

Re: Cross index join query performance

2013-09-30 Thread Peter Keegan
returned by the inner query? As Joel mentions, those other joins are attempts to find other ways to work with this limitation. Upayavira On Fri, Sep 27, 2013, at 09:44 PM, Peter Keegan wrote: Hi Joel, I tried this patch and it is quite a bit faster. Using the same query

Re: Cross index join query performance

2013-09-27 Thread Peter Keegan
the fromIndex. If you have a small number of results in the fromIndex the standard join will be faster. On Wed, Sep 25, 2013 at 3:39 PM, Peter Keegan peterlkee...@gmail.com wrote: I forgot to mention - this is Solr 4.3 Peter On Wed, Sep 25, 2013 at 3:38 PM, Peter Keegan peterlkee

Cross index join query performance

2013-09-25 Thread Peter Keegan
I'm doing a cross-core join query and the join query is 30X slower than each of the 2 individual queries. Here are the queries: Main query: http://localhost:8983/solr/mainindex/select?q=title:java QTime: 5 msec hit count: 1000 Sub query: http://localhost:8983/solr/subindex/select?q=+fld1:[0.1 TO

Re: A question about attaching shards to load balancers

2013-01-30 Thread Peter Keegan
Aren't you concerned about having a single point of failure with this setup? On Wed, Jan 30, 2013 at 10:38 AM, Michael Ryan mr...@moreover.com wrote: From a performance point of view, I can't imagine it mattering. In our setup, we have a dedicated Solr server that is not a shard that takes

Re: Improving performance for use-case where large (200) number of phrase queries are used?

2012-10-25 Thread Peter Keegan
? On Wed, Oct 24, 2012 at 1:20 PM, Peter Keegan peterlkee...@gmail.com wrote: Could you index your 'phrase tags' as single tokens? Then your phrase queries become simple TermQuerys. 5) *This is my current favorite*: stop tokenizing/analyzing these terms and just use KeywordTokenizer. Most

Re: Improving performance for use-case where large (200) number of phrase queries are used?

2012-10-24 Thread Peter Keegan
Could you index your 'phrase tags' as single tokens? Then your phrase queries become simple TermQuerys. On Wed, Oct 24, 2012 at 12:26 PM, Robert Muir rcm...@gmail.com wrote: On Wed, Oct 24, 2012 at 11:09 AM, Aaron Daubman daub...@gmail.com wrote: Greetings, We have a solr instance in use

Re: Anyone using mmseg analyzer in solr multi core?

2012-10-09 Thread Peter Keegan
We're using MMSeg with Lucene, but not Solr. Since each SolrCore is independent, I'm not sure how you can avoid each having a copy of the dictionary, unless you modified MMSeg to use shared memory. Or, maybe I'm missing something. On Mon, Oct 8, 2012 at 3:37 AM, liyun liyun2...@corp.netease.com

Re: How to plug a new ANTLR grammar

2011-09-14 Thread Peter Keegan
into the tree, or before we start processing the query string? Thanks! Roman On Tue, Sep 13, 2011 at 10:14 PM, Peter Keegan peterlkee...@gmail.com wrote: Roman, I'm not familiar with the contrib, but you can write your own Java code to create Query objects from the tree produced

Re: How to plug a new ANTLR grammar

2011-09-13 Thread Peter Keegan
Roman, I'm not familiar with the contrib, but you can write your own Java code to create Query objects from the tree produced by your lexer and parser something like this: StandardLuceneGrammarLexer lexer = new ANTLRReaderStream(new StringReader(queryString)); CommonTokenStream tokens = new

Re: performance crossover between single index and sharding

2011-08-04 Thread Peter Keegan
We have 16 shards on 4 physical servers. Shard size was determined by measuring query response times as a function of doc count. Multiple shards per server provides parallelism. In a VM environment, I would lean towards 1 shard per VM (with 1/4 the RAM). We implemented our own distributed search

Re: Localized alphabetical order

2011-04-22 Thread Peter Keegan
On Fri, Apr 22, 2011 at 12:33 PM, Ben Preece preec...@umn.edu wrote: As someone who's new to Solr/Lucene, I'm having trouble finding information on sorting results in localized alphabetical order. I've ineffectively searched the wiki and the mail archives. I'm thinking for example about

Re: Info about Debugging SOLR in Eclipse

2011-03-17 Thread Peter Keegan
Can you use jetty? http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-eclipse On Thu, Mar 17, 2011 at 12:17 PM, Geeta Subramanian gsubraman...@commvault.com wrote: Hi, Can someone please let me know the steps on how I can debug the solr code in my eclipse? I tried

Re: Info about Debugging SOLR in Eclipse

2011-03-17 Thread Peter Keegan
The instructions refer to the 'Run configuration' menu. Did you try 'Debug configurations'? On Thu, Mar 17, 2011 at 3:27 PM, Peter Keegan peterlkee...@gmail.com wrote: Can you use jetty? http://www.lucidimagination.com/developers/articles/setting-up-apache-solr-in-eclipse On Thu, Mar 17

CapitalizationFilter

2010-12-29 Thread Peter Keegan
I was looking at 'CapitalizationFilter' and noticed that the 'incrementToken' method splits words at ' ' (space) and '.' (period). I'm curious as to why the period is treated as a word separator? This could cause unexpected results, for example: Hello There My Name Is Dr. Watson --- Hello There

Re: Does anyone notice this site?

2010-10-25 Thread Peter Keegan
FWIW, our proxy server has blocked this site for malicious content. Peter On Mon, Oct 25, 2010 at 1:25 PM, Grant Ingersoll gsing...@apache.org wrote: On Oct 25, 2010, at 12:54 PM, scott chu wrote: I happen to bump into this site: http://www.solr.biz/ They said they are also developing a

LuceneRevolution - NoSQL: A comparison

2010-10-11 Thread Peter Keegan
I listened with great interest to Grant's presentation of the NoSQL comparisons/alternatives to Solr/Lucene. It sounds like the jury is still out on much of this. Here's a use case that might favor using a NoSQL alternative for storing 'stored fields' outside of Lucene. When Solr does a

Re: Range queries

2009-06-16 Thread Peter Keegan
How about this: x:[5 TO 8] AND x:{0 TO 8} On Tue, Jun 16, 2009 at 1:16 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi, I think the square brackets/curly braces need to be balanced, so this is currently not doable with existing query parsers. Otis -- Sematext --

Re: new faceting algorithm

2008-12-05 Thread Peter Keegan
Hi Yonik, May I ask in which class(es) this improvement was made? I've been using the DocSet, DocList, BitDocSet, HashDocSet from Solr from a few years ago with a Lucene based app. to do faceting. Thanks, Peter On Mon, Nov 24, 2008 at 11:12 PM, Yonik Seeley [EMAIL PROTECTED] wrote: A new
